The Demographic and Health Surveys (DHS) Program strives to maintain the highest standards of data collection, processing, and analysis. This report is one of a series of DHS Methodological Reports on data quality. Earlier reports included descriptions of methods but focused on actual assessments of, for example, maternal mortality data or potential interview effects. This report focuses primarily on strategies and on methods that are either new or significant modifications of methods that have appeared previously. The report includes many illustrative examples, and the methods can be generalized to other substantive outcomes. For example, a chapter on the analysis of fertility that uses retrospective birth histories could be extended to under-5 mortality using the birth histories, or to adult and maternal mortality using the retrospective sibling histories.
The report has five main chapters. After the introductory chapter, Chapter 2 focuses on a type of displacement of birthdate, observed in many DHS surveys, that results from recording the year of birth as the year of interview minus years of age. The calculation is correct only if the respondent has already reached their birthday in the year of interview; otherwise, the recorded year of birth is one year too late. Displacement of month of birth has been noted in the past but without an explanation of the mechanism behind it. Chapter 2 describes a simple indicator to measure the resulting bias, as well as other indicators based on the stated month of birth and, for children, the day of birth.
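The displacement mechanism can be illustrated with a short sketch. This is not the indicator developed in Chapter 2. The function names are hypothetical, and the sketch assumes a simplification that is not in the report: the birthday counts as reached when the month of interview is the same as or later than the month of birth, ignoring the day of the month.

```python
def completed_age(birth_year, birth_month, int_year, int_month):
    """Age in completed years at interview (day of month ignored: an assumption)."""
    age = int_year - birth_year
    if int_month < birth_month:  # birthday not yet reached in the interview year
        age -= 1
    return age


def recorded_birth_year(int_year, age):
    """Year of birth as recorded by the shortcut: year of interview minus age."""
    return int_year - age


def displacement(birth_year, birth_month, int_year, int_month):
    """Recorded year of birth minus true year of birth (0 means no displacement)."""
    age = completed_age(birth_year, birth_month, int_year, int_month)
    return recorded_birth_year(int_year, age) - birth_year


# Born September 1990, interviewed March 2015: birthday not yet reached.
print(displacement(1990, 9, 2015, 3))  # 1 (recorded as 1991)

# Born February 1990, interviewed March 2015: birthday already reached.
print(displacement(1990, 2, 2015, 3))  # 0 (recorded as 1990)
```

In this sketch, the shortcut shifts the year of birth forward by one year for every respondent whose birth month falls after the month of interview, so the size of the bias depends on when in the calendar year the fieldwork takes place.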
Chapter 3 focuses on the quality of the birth histories. The main method is a comparison of two successive surveys and their estimates of fertility rates for the 5 calendar years before the first survey. Single-year rates, as well as 5-year rates, are compared using statistical models. Chapter 4 uses statistical models to describe variations in data quality indicators according to characteristics such as place of residence, household wealth, and level of education. Quality-related outcomes, such as nonresponse or age heaping, may vary for reasons that lie beyond the implementation of fieldwork. It could be argued that interviewer effects, for example, should be adjusted for the characteristics of the individuals being interviewed.
Chapter 5 discusses the interview process and, in particular, the duration of the household interview. It examines how duration relates to the interview's position within the overall fieldwork period, its position within the fieldwork in a specific cluster, and the time of day, as well as to the number of household members and the number of items in the questionnaire. Like age heaping, very short interviews may signal serious data quality problems. Chapter 6 provides examples of how to identify the interviewers, clusters, and days, in any combination, in which there was an irregularity that was both large in magnitude and statistically significant. It is possible to simulate or track the fieldwork in a variety of ways by using information from the data files.
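The idea of flagging units in which an irregularity is both large and statistically significant can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in Chapter 6: the irregularity is taken to be the share of very short interviews, each unit is compared with all other units combined using a one-sided two-proportion z-test, and the function names, the 10 percentage point threshold, and the 0.05 significance level are all hypothetical.

```python
import math


def upper_tail_p_value(x1, n1, x2, n2):
    """One-sided p-value that proportion x1/n1 exceeds x2/n2 (pooled z-test)."""
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    if se == 0:
        return 1.0  # no variation at all, so nothing to test
    z = (x1 / n1 - x2 / n2) / se
    return 0.5 * math.erfc(z / math.sqrt(2))  # P(Z > z) for a standard normal


def flag_units(counts, min_diff=0.10, alpha=0.05):
    """Return units whose share of short interviews exceeds the rest of the
    sample by at least min_diff AND is significant at level alpha.

    counts maps a unit label (an interviewer, cluster, or day) to a tuple
    (number of very short interviews, total number of interviews).
    """
    total_short = sum(short for short, _ in counts.values())
    total_n = sum(n for _, n in counts.values())
    flagged = []
    for unit, (short, n) in counts.items():
        rest_short, rest_n = total_short - short, total_n - n
        if n == 0 or rest_n == 0:
            continue  # cannot compare an empty group
        diff = short / n - rest_short / rest_n
        p = upper_tail_p_value(short, n, rest_short, rest_n)
        if diff >= min_diff and p < alpha:
            flagged.append(unit)
    return flagged


# Invented counts for three hypothetical interviewers.
example = {"A": (5, 100), "B": (6, 100), "C": (40, 100)}
print(flag_units(example))  # ['C']
```

Requiring both conditions avoids flagging units with a large but noisy difference based on few interviews, as well as units with a trivially small difference that is significant only because the unit is large. When many units are screened at once, some adjustment for multiple comparisons would normally also be needed.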
The methods described here, as well as those that appeared earlier, will be used in the future to prepare data quality profiles of all DHS surveys, and to better monitor long-term trends in data quality and identify potential problems in new surveys. Some methods can also be adapted to better monitor data quality in real time during fieldwork.