|Health system measurement: Harnessing machine learning to advance global health
|Hannah H. Leslie, Xin Zhou, Donna Spiegelman, and Margaret E. Kruk
|PLOS ONE , 13(10): e0204958; DOI: 10.1371/journal.pone.0204958
Health care utilization
More than one region
Further improvements in population health in low- and middle-income countries demand high-quality care to address an increasingly complex burden of disease. Health facility surveys provide an important but costly source of information on readiness to provide care. To improve the efficiency of health system measurement, we applied unsupervised machine learning methods to assess the performance of the service readiness index (SRI) defined by the World Health Organization and compared it to empirically derived indices.
We drew data from nationally representative Service Provision Assessment surveys conducted in 10 countries between 2007 and 2015. We extracted 649 items in domains such as infrastructure, medication, and management to calculate an index using all available information and classified facilities into quintiles. We compared three approaches against the full item set: the SRI, a new index based on sequential backward selection, and an enriched SRI that added empirically selected items to the SRI. We evaluated index performance with a cross-validated kappa statistic comparing classification using the candidate index against the 649-item index.
9238 facilities were assessed. The 49-item SRI performed poorly against the index using all 649 items, with a kappa value of 0.35. New empirically derived indices with 50 and 100 items captured much more information, with cross-validated kappa statistics of 0.71 and 0.80, respectively. Items varied across the indices and in sensitivity analyses. A 100-item enriched SRI reliably captured the information from the full index: 83% of the facilities were classified into correct quintiles of service readiness based on the full index.
A facility readiness measure developed by global health experts performed poorly in capturing the totality of readiness information collected during facility surveys. Using a machine learning approach with sequential selection and cross-validation to identify the most informative items dramatically improved performance. Such approaches can make assessment of health facility readiness more efficient. Further improvements in measurement will require identification of external criteria—such as patient outcomes—to guide and validate measure development.