Application of machine learning methods for predicting infant mortality in Rwanda: analysis of Rwanda demographic health survey 2014–15 dataset |
Authors: |
Emmanuel Mfateneza, Pierre Claver Rutayisire, Emmanuel Biracyaza, Sanctus Musafiri and Willy Gasafari Mpabuka |
Source: |
BMC Pregnancy and Childbirth, Volume 22; issue 388; DOI: https://doi.org/10.1186/s12884-022-04699-8 |
Topic(s): |
Data use, Infant mortality, Modelling
|
Country: |
More than one region
Multiple Regions
|
Published: |
MAY 2022 |
Abstract: |
Background:
Extensive research on infant mortality (IM) exists in developing countries; however, most of the methods applied thus far have relied on conventional regression analyses with limited predictive capability. Advanced Machine Learning (AML) methods provide accurate prediction of IM; however, no study has applied ML methods in Rwanda. This study therefore applied ML methods to predict infant mortality in Rwanda.
Methods:
A cross-sectional study was conducted using the 2014–15 Rwanda Demographic and Health Survey. Python version 3.8 was used to test and apply four ML methods: Random Forest (RF), Decision Tree, Support Vector Machine and Logistic Regression. STATA version 13 was used for the conventional analyses. The predictive models were evaluated using the confusion matrix, accuracy, precision, recall, F1 score, and Area Under the Receiver Operating Characteristic curve (AUROC).
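The evaluation metrics named above can be illustrated with a minimal, standard-library Python sketch. It is not the study's code: the labels and scores are hypothetical toy values, the helper names (`confusion_counts`, `classification_metrics`, `auroc`) are invented for illustration, and the 0.5 decision threshold is an assumption.

```python
# Illustrative sketch only: toy labels and scores, not the RDHS 2014-15 data.

def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred):
    """Derive accuracy, precision, recall and F1 from the confusion matrix."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": 2 * precision * recall / denom if denom else 0.0,
    }

def auroc(y_true, y_score):
    """AUROC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical outcomes (1 = infant death) and model scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_score = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]  # assumed 0.5 threshold

metrics = classification_metrics(y_true, y_pred)
metrics["auroc"] = auroc(y_true, y_score)
print(metrics)
```

The pairwise AUROC is quadratic in the sample size, which is acceptable for a toy example; on survey-sized data a library routine would be the more practical choice.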
Results:
Prediction accuracy ranged from 61.5% (logistic regression) to 84.3% (RF) across the AML models. The RF model performed best and was preferred, with accuracy 84.3%, recall 91.3%, precision 80.3%, F1 score 85.5% and AUROC 84.2%. It was followed by the decision tree model (accuracy 83%, recall 91%, precision 79%, F1 score 84.67%, AUROC 82.9%) and then the support vector machine (accuracy 68.6%, recall 74.9%, precision 67%, F1 score 70.73%, AUROC 68.6%). Logistic regression ranked last, with the lowest accuracy (61.5%), recall 61.1%, precision 62.2%, F1 score 61.6% and AUROC 61.5%. The predictive models showed that marital status, children ever born, birth order and wealth index were the top four predictors of IM.
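As a quick consistency check on the figures reported above, the F1 score can be recomputed as the harmonic mean of precision and recall, F1 = 2PR / (P + R). The sketch below uses only the percentages stated in the abstract; the variable and function names are illustrative. The recomputed values agree with the reported F1 scores to within about 0.1 percentage point, which is consistent with rounding of the published precision and recall.

```python
# Reported (precision, recall, F1) in percent, taken from the abstract.
reported = {
    "Random Forest": (80.3, 91.3, 85.5),
    "Decision Tree": (79.0, 91.0, 84.67),
    "Support Vector Machine": (67.0, 74.9, 70.73),
    "Logistic Regression": (62.2, 61.1, 61.6),
}

def f1_from(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    return 2 * precision * recall / (precision + recall)

# Recompute F1 for each model from its reported precision and recall.
recomputed = {name: f1_from(p, r) for name, (p, r, _) in reported.items()}

for name, (p, r, f1) in reported.items():
    diff = recomputed[name] - f1
    print(f"{name}: reported F1 {f1:.2f}, recomputed {recomputed[name]:.2f}, "
          f"difference {diff:+.2f}")
```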
Conclusions:
In developing a predictive model, ML methods can uncover hidden information that traditional statistical methods could not detect. Random Forest was identified as the best classifier for predictive models of IM. |
Web: |
https://bmcpregnancychildbirth.biomedcentral.com/articles/10.1186/s12884-022-04699-8 |
|