Engendering Data Blog Post

Answering the call for new approaches in phone surveys: reaching Indigenous-language speakers in Guatemala

UN Women Guatemala, Rural Women Diversify Incomes and Build Resilience. Photo: UN Women/Ryan Brown.

Recently, researchers working in Guatemala learned that it is possible to identify and interview women and men who are speakers of less common languages via phone surveys.


Guatemala, like many countries in Central and South America, is linguistically diverse, reflecting the diversity of its Indigenous populations. Almost half the population speaks one or more of 22 Mayan languages or two other Amerindian languages – sometimes in addition to Spanish.

This linguistic diversity presents a unique challenge when conducting phone surveys.

We were recently faced with this challenge when conducting a phone survey as part of the development of the Women’s Empowerment Metric for National Statistical Systems (WEMNS).

This metric is being developed by researchers from the International Food Policy Research Institute (IFPRI), Emory University, the Living Standards Measurement Study team at the World Bank, and Oxford University – it will be an empowerment metric for the 50x2030 Initiative to close the agricultural data gap.

In our efforts to validate WEMNS, it was important to determine whether survey questions were interpreted similarly by speakers of different languages, and whether the questions measured similar constructs across genders and language groups. Therefore, we aimed to sample 450 respondents from each of the following groups:

  • women who primarily speak Spanish
  • men who primarily speak Spanish
  • women who primarily speak either K’iche’ or Q’eqchi’
  • men who primarily speak either K’iche’ or Q’eqchi’

K’iche’ and Q’eqchi’ are the two most commonly spoken languages in Guatemala, after Spanish.

In Guatemala, there are no registries of phone numbers with basic characteristics of phone owners (e.g., gender, age or language) or SIM cards (e.g., when and where the card was sold); nor are there recent, nationally representative surveys with phone numbers that could provide a sampling frame.

In our case, we also needed to consider that only around nine percent of the population would speak K’iche’ or Q’eqchi’ as a first language, and that these people may have lower rates of phone ownership and less consistent access to mobile phone networks. Therefore, we could not rely on random digit-dialing approaches to identify our target sample of K’iche’ or Q’eqchi’ speakers.

Additionally, survey administration relies heavily on carefully reviewed translations and on interviewing respondents in their primary language – switching interviewers and the interview-software language on the fly, according to who picked up the phone, would not have been feasible.

We needed a new solution.

In this blog post, we outline our approach for identifying our phone survey sample. Notably, we were not successful with our first approach, but were with our second. We describe both attempts so that others can learn from our experiences. We also describe the number of potential respondents lost at each step, so that others can anticipate response rates for their own work.

Initial approach – what did not work

We designed our initial approach based on response rates observed in previous phone surveys and on the prevalence of local-language groups.

The typical approach to developing phone survey samples in Guatemala starts with a publicly available list of all known phone numbers released by the government (i.e., active numbers). Randomly selected active numbers are then called by automated software (‘robocalls’), and if a human voice picks up, the number is considered ‘live’.

From past experience, approximately 25 percent of all active numbers are live and, from live numbers, there is a further loss of up to 90 percent due to rejection, hang-ups or dropped calls. Based on this information, we drew a random sample of 80,000 active phone numbers to be verified.
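
As a rough sketch of that arithmetic (the exact safety margin the team used is not stated, so the padding below is an illustrative assumption), a few lines of Python show how the draw size follows from the assumed rates:

```python
# Back-of-the-envelope draw size for the automated screening round.
# The live and loss rates come from the rules of thumb cited above;
# the safety margin is an illustrative assumption.

target_respondents = 4 * 450     # four groups of 450 each
live_rate = 0.25                 # ~25% of active numbers are live
retention = 1 - 0.90             # up to 90% of live numbers are lost

yield_per_active_number = live_rate * retention            # ~2.5%
minimum_draw = target_respondents / yield_per_active_number

print(f"Minimum active numbers needed: {minimum_draw:,.0f}")  # 72,000
# Padding that minimum by roughly ten percent lands near the
# 80,000 numbers actually drawn.
```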

Respondents who picked up were asked a series of questions about their primary language, gender, age eligibility and willingness to participate, answering each by pressing a number on the keypad. Once the target language was identified, the remaining questions were asked in that language. The automated screening survey was piloted in advance in multiple languages.
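
To make the keypad flow concrete, here is a minimal sketch of that screening logic in Python; the question order follows our description, but the specific key codes and the data structure are hypothetical:

```python
# Minimal sketch of the automated (IVR) screening flow. Key codes are
# hypothetical; only the sequence (language first, then gender, age
# eligibility and willingness) follows the description above.

LANGUAGES = {1: "Spanish", 2: "K'iche'", 3: "Q'eqchi'"}

def screen(keypresses):
    """Build a screening record from keypad responses; return None
    if the respondent stopped answering midway through."""
    answers = iter(keypresses)
    record = {}
    try:
        record["language"] = LANGUAGES.get(next(answers), "other")
        # Remaining questions are asked in the identified language.
        record["gender"] = {1: "man", 2: "woman"}.get(next(answers))
        record["adult"] = next(answers) == 1    # 1 = aged 18 or older
        record["willing"] = next(answers) == 1  # 1 = willing to join
    except StopIteration:
        return None  # dropped out midway, as many respondents did
    return record

# The "1-1-1" pattern reported below decodes, on the first three
# questions, as a Spanish-speaking adult man:
print(screen([1, 1, 1, 1]))
```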

Of the 80,000 automated calls, 29 percent were answered by a person. Many respondents stopped answering midway through the automated questions and, of those who did respond, a notable majority reported being a man older than 18 who primarily spoke Spanish (a 1-1-1 response code).

Although we did not have specific information on phone ownership among speakers of Indigenous languages, phone ownership rates among women and men in Guatemala are relatively similar, so a sample this heavily skewed towards Spanish-speaking men seemed implausible.

We considered the possibility that women and local-language speakers were more cautious when responding to an outside call of this nature, but the more likely explanation seemed to be that respondents did not understand how to use the automated response system – perhaps simply pressing ‘1’ in answer to every question.

Enumerator conducting phone interview. Credit: Vox Latina.

Second attempt – what worked better

After the disappointing results of the automated approach to identifying respondent characteristics, we revised our strategy.

In our second attempt, we used the automated process only to identify live numbers; trained interviewers then screened the people who answered at those numbers for primary language, gender, age eligibility and willingness to participate in the survey.

By the time we finished, the software had identified 40,409 live phone numbers, and interviewers called 21,588 of these for the live screening survey.

Fortunately, these results were much more promising.

After several days, we determined that this approach would likely yield our target sample size. To avoid losing respondents as screened phone numbers expired, the team began conducting the primary interviews.

Ultimately, those who responded to the screening survey were much more similar to the population of Guatemala than the sample yielded by our first attempt. Of those identified as eligible, seven percent were K’iche’- or Q’eqchi’-speaking women; eight percent were K’iche’- or Q’eqchi’-speaking men; and 44 percent and 41 percent were Spanish-speaking women and men, respectively.

In the end, we interviewed 558 Spanish-speaking women (34 percent response rate), 554 Spanish-speaking men (36 percent response rate), 464 K’iche’- or Q’eqchi’-speaking women (53 percent response rate) and 455 K’iche’- or Q’eqchi’-speaking men (46 percent response rate).
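
Putting the reported figures together, a short script lays out the funnel and back-calculates the denominators implied by the response rates; these implied counts are approximations, not numbers we reported directly:

```python
# Screening and interview funnel assembled from the figures above.
# The implied contact counts are back-calculated from interviews and
# response rates, so treat them as rough approximations.

funnel = [
    ("Active numbers drawn", 80_000),
    ("Live numbers identified", 40_409),
    ("Called for live screening", 21_588),
]
for stage, n in funnel:
    print(f"{stage}: {n:,}")

groups = {
    # group: (completed interviews, reported response rate)
    "Spanish-speaking women":          (558, 0.34),
    "Spanish-speaking men":            (554, 0.36),
    "K'iche'/Q'eqchi'-speaking women": (464, 0.53),
    "K'iche'/Q'eqchi'-speaking men":   (455, 0.46),
}
for group, (interviews, rate) in groups.items():
    implied_contacts = interviews / rate
    print(f"{group}: {interviews} interviews "
          f"(~{implied_contacts:,.0f} implied contacts)")
```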

Conclusion: live screenings get better responses, but cost more

In Guatemala, we faced the challenge of conducting a phone survey with speakers of less commonly spoken languages, who we anticipated would be harder to identify.

Importantly, we learned that it is possible to identify and interview women and men who are speakers of less common languages via phone surveys.

Additionally, we learned that screening by a live interviewer was more effective than the automated screening protocol that we attempted. One downside of this approach, however, is that the time and cost of calling potential respondents were notably greater than if the work had been automated with software.

We hope that others attempting similar phone surveys can learn from this approach to better identify phone samples of subpopulations that are relatively small, have limited phone ownership, or both.