CGIAR Gender

Dealing with limitations of upscaling gender data

By Catherine Pfeifer, scientist and spatial analyst at the International Livestock Research Institute (ILRI).

Gender research often focuses on small study areas (e.g. villages or groups of villages) to understand in-depth power relations, intra-household decision making or asset ownership. Because these studies are in-depth and context-specific, it is difficult to compare them and draw lessons for other locations.

Significant efforts have been made in recent years to provide standardized individual-level data sets (i.e. collecting similar individual information across different countries) and include locational information.

Examples of these datasets include:

Most of these surveys contain household or individual level gender data.

I worked on mapping ‘gender contexts’ in Africa using Demographic and Health Surveys (DHS) datasets. With over half a million individual-level, geo-referenced gender observations, I could identify ‘spatial gender patterns’ across the African continent.

Micro-level socio-economic data, including gender data, is characterized by what geographers refer to as a ‘high short distance variability’. To illustrate this with a simple example: imagine two neighbors – i.e. two geographically close persons or households. They might be very different people. Think how unlikely it is that you share the same hobby as your neighbor! There is a lot of variability when comparing you and your neighbors. Yet, looking above the individual scale, patterns emerge. For example, there may be a general tendency for people in richer neighborhoods to play golf.

In order to identify this spatial pattern, we need to ‘overcome’ short distance variability. To do this, geographers usually spatially aggregate micro data, also known as ‘up-scaling’. It sounds simple, but it isn’t. Though your data might span a representative sample of the overall population, it might not be geographically representative.

Gender maps (image credit: C. Pfeifer)

Going back to the golf example: in a poor neighborhood there might be young, ambitious but low-paid professionals playing golf in order to meet senior, well-positioned executives. If, by coincidence, the timing of your interviews leads you to interview only young people in the poor neighborhood and only older people in the rich neighborhood, both neighborhoods will report playing golf, and you might conclude that there is no spatial pattern to golf playing. This may not be the case. The dataset is representative at the level of the city, i.e. it has the right number of young and old people, but you miss the spatial trend (rich neighborhoods playing golf). The missing pattern is an ‘artifact’ (i.e. a ‘glitch’) in the data: the sampling below the city level might be skewed even though it is representative at the city level. In other words, if you have representative city-level data, stick to this level, because at the level(s) below you might miss existing patterns or identify patterns that don’t exist in reality.
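To make the golf example concrete, here is a minimal sketch in Python. All numbers are hypothetical and chosen only so that the effect is easy to see: the four age-by-neighborhood groups are assumed to be equally large, and the golf-playing rates are invented. The sketch compares the true rates with what a skewed sample would report.

```python
# Hypothetical golf-playing rates by (neighborhood, age group); not real data.
true_rate = {
    ("poor", "young"): 0.5,
    ("poor", "old"): 0.2,
    ("rich", "young"): 0.8,
    ("rich", "old"): 0.5,
}

# Assumption: each of the four groups makes up a quarter of the city.
population_share = {group: 0.25 for group in true_rate}

# Skewed fieldwork: only young people are interviewed in the poor
# neighborhood and only older people in the rich one. Across the whole
# city the age mix of the sample is still 50/50.
sample_share = {("poor", "young"): 0.5, ("rich", "old"): 0.5}


def rate(shares, neighborhood=None):
    """Share-weighted golf rate, optionally restricted to one neighborhood."""
    groups = [g for g in shares if neighborhood is None or g[0] == neighborhood]
    total = sum(shares[g] for g in groups)
    return sum(shares[g] * true_rate[g] for g in groups) / total


true_poor = rate(population_share, "poor")
true_rich = rate(population_share, "rich")
sample_poor = rate(sample_share, "poor")
sample_rich = rate(sample_share, "rich")
true_city = rate(population_share)
sample_city = rate(sample_share)

print(f"true:   poor={true_poor:.2f} rich={true_rich:.2f} city={true_city:.2f}")
print(f"sample: poor={sample_poor:.2f} rich={sample_rich:.2f} city={sample_city:.2f}")
```

With these invented numbers the sample reproduces the city-wide rate but shows no difference between the two neighborhoods, although the true rates differ. The city-level match depends on how the hypothetical rates were chosen; the point is only that a sample can look right at one level and hide a pattern at the level below.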

There are two ways to address this problem:

  1. Aggregate the data at the level at which the sample is representative. In terms of the golf example, aggregate the data by city and compare across cities. You might still conclude that in richer cities more people play golf. With this approach, however, comparing different neighborhoods within a city would be flawed.


  2. Correct the geographical sample bias through a regression model. A regression model using geographic variables (so-called ‘full coverage’ data, i.e. often satellite images and maps) could help you predict a pattern. But while these spatial modelling methods work well for biophysical data, they are not suitable for up-scaling gender data.
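The first approach can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the records, the region names, the yes/no indicator and the use of sample weights are all hypothetical and are not taken from the actual survey files or from the method used for the maps.

```python
from collections import defaultdict

# Hypothetical individual-level records:
# (region, sample weight, 1 if the woman takes part in household decisions else 0)
records = [
    ("Region A", 1.0, 1),
    ("Region A", 2.0, 0),
    ("Region A", 1.0, 1),
    ("Region B", 1.5, 1),
    ("Region B", 0.5, 0),
]


def aggregate_by_region(rows):
    """Weighted share of 'yes' answers per region, i.e. the level at
    which the sample is assumed to be representative."""
    weighted_sum = defaultdict(float)
    weight_total = defaultdict(float)
    for region, weight, value in rows:
        weighted_sum[region] += weight * value
        weight_total[region] += weight
    return {region: weighted_sum[region] / weight_total[region]
            for region in weighted_sum}


regional_share = aggregate_by_region(records)
for region, share in sorted(regional_share.items()):
    print(f"{region}: {share:.2f}")
```

Each region ends up with a single value, so regions can be compared with each other, but nothing can be said about differences between locations inside a region.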


If we replace ‘playing golf’ with decision-making power of women within their household in Africa, the most reliable approach to mapping gender patterns is to aggregate DHS data at the geographical level at which the sample is representative (often the first administrative level – the subnational level). This is a coarse representation of gender contexts that does not allow comparison of villages within a subnational level. What it does provide is a bigger picture across units, which might suggest where to find similar contexts across the African continent. And the exciting element here is that this can help out-scale good practices to similar contexts.

Find out more about the gender context maps here :
