Modeling Excess Zeros and Heterogeneity in Count Data from a Complex Survey Design with Application to the Demographic Health Survey in Sub-Saharan Africa
Authors: Lin Dai, Michael D Sweat, and Mulugeta Gebregziabher
Source: Statistical Methods in Medical Research, 27(1): 208-220; DOI: 10.1177/0962280215626608
Topic(s): HIV/AIDS
Country: Africa
  Multiple African Countries
Published: JAN 2018
Abstract: Purpose: To show a novel application of a weighted zero-inflated negative binomial model in modeling count data with excess zeros and heterogeneity to quantify the regional variation in HIV-AIDS prevalence in sub-Saharan African countries. Methods: Data come from the latest round of the Demographic and Health Survey (DHS) conducted in three countries (Ethiopia-2011, Kenya-2009 and Rwanda-2010) using a two-stage cluster sampling design. The outcome is an aggregate count of HIV cases in each census enumeration area of each country. The outcome data are characterized by excess zeros and heterogeneity due to clustering. We compare scale-weighted zero-inflated negative binomial models with and without random effects to account for zero-inflation, complex survey design and clustering. Finally, we provide marginalized rate ratio estimates from the best zero-inflated negative binomial model. Results: The best-fitting zero-inflated negative binomial model is scale weighted and has a common random intercept for the three countries. Rate ratio estimates from the final model show that HIV prevalence is associated with age and gender distribution, HIV acceptance, and HIV knowledge, and that its regional variation is associated with divorce rate, burden of sexually transmitted diseases and rural residence. Conclusions: Scale-weighted zero-inflated negative binomial modeling with proper specification of random effects is shown to be the best model for count data from a complex survey design characterized by excess zeros and extra heterogeneity. In our data example, the final rate ratio estimates show significant regional variation in the factors associated with HIV prevalence, indicating that HIV intervention strategies should be tailored to the unique factors found in each country.
Keywords: HIV; multi-country survey data; negative binomial; regional variation; sub-Saharan Africa; zero-inflation.
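Note: For readers unfamiliar with the model class named in the abstract, the following is a minimal sketch of a plain zero-inflated negative binomial (ZINB) distribution. It is not the authors' fitted model: it omits the survey scale weights, the covariates and the random intercept, and it assumes the common NB2 parameterization (mean mu > 0, dispersion alpha > 0, variance mu + alpha * mu^2). The function names and the parameter pi (the probability of a structural zero) are chosen here for illustration only.

```python
import math


def nb_pmf(y, mu, alpha):
    """Negative binomial (NB2) probability of count y.

    Assumes mean mu > 0 and dispersion alpha > 0, so that
    Var(Y) = mu + alpha * mu**2.
    """
    r = 1.0 / alpha          # NB "size" parameter
    p = r / (r + mu)         # NB "success" probability
    log_pmf = (
        math.lgamma(y + r) - math.lgamma(r) - math.lgamma(y + 1)
        + r * math.log(p) + y * math.log1p(-p)
    )
    return math.exp(log_pmf)


def zinb_pmf(y, mu, alpha, pi):
    """ZINB probability of count y.

    With probability pi the observation is a structural zero;
    otherwise it is drawn from the negative binomial component.
    """
    base = (1.0 - pi) * nb_pmf(y, mu, alpha)
    return pi + base if y == 0 else base


def zinb_mean(mu, pi):
    """Marginal (population-averaged) mean of the ZINB mixture."""
    return (1.0 - pi) * mu
```

Under this sketch, the zero-inflation component raises the probability of a zero count above what the negative binomial alone would give, and the marginal mean is the negative binomial mean scaled down by (1 - pi). Marginalized rate ratios of the kind reported in the abstract compare such marginal means across covariate levels.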