Mortality Prediction based on Imbalanced New Born and Perinatal Period Data
Journal Title: International Journal of Advanced Computer Science & Applications - Year 2019, Vol 10, Issue 8
Abstract
This study was carried out by the New York State Department of Health between 2012 and 2016. The experiment covers six supervised machine learning methods, all used to predict infant mortality: Support Vector Machine (SVM), Logistic Regression (LR), Gradient Boosting (GB), Random Forest (RF), Deep Learning (DL) and an Ensemble Model. The ensemble model assigns different weights to different models per output class in order to obtain better predictive performance for infant mortality. The performance of each model was measured and the classifiers' accuracy compared. Several criteria, including the area under the ROC curve, were considered when comparing the ensemble model (GB, RF and DL) with the other five models (SVM, LR, DL, GB and RF). On these criteria, the ensemble model outperformed the others in predicting survival rates among infant patients. On the balanced dataset, the areas under the ROC curve for the minor, moderate, major and extreme classes were 98%, 95%, 92% and 97% respectively, with a total accuracy of 80.65%. On the imbalanced dataset, the areas under the ROC curve for the same classes were 98%, 98%, 99% and 99% respectively, and total accuracy increased to 97.44%. The results of the experiments in this dissertation showed that the ensemble model predicted infant mortality better than the other five models, based on the relative prediction accuracy of each model for each output class. The ensemble model is therefore an extremely promising classifier for predicting infant mortality.
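The per-class weighting described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the weights were chosen or how the model outputs were combined, so everything here is an assumption: the function name `weighted_class_ensemble` is hypothetical, the combination rule is a weighted sum of class probabilities, and the probabilities and weights are made-up toy values for the GB, RF and DL members.

```python
import numpy as np

def weighted_class_ensemble(probas, weights):
    """Combine model outputs using one weight per (model, class) pair.

    probas:  array of shape (n_models, n_samples, n_classes)
    weights: array of shape (n_models, n_classes)
    Returns the predicted class index per sample and the combined scores.
    """
    probas = np.asarray(probas, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Normalize so that, for each class, the weights across models sum to 1.
    weights = weights / weights.sum(axis=0, keepdims=True)
    # scores[s, c] = sum over models m of weights[m, c] * probas[m, s, c]
    scores = np.einsum("mc,msc->sc", weights, probas)
    return scores.argmax(axis=1), scores

# Toy probabilities for one sample over the four classes
# (minor, moderate, major, extreme); values are illustrative only.
toy_probas = np.array([
    [[0.6, 0.2, 0.1, 0.1]],  # GB
    [[0.3, 0.4, 0.2, 0.1]],  # RF
    [[0.1, 0.2, 0.3, 0.4]],  # DL
])

# Hypothetical weights: trust only DL for the "extreme" class,
# and weight all three models equally for the other classes.
toy_weights = np.array([
    [1.0, 1.0, 1.0, 0.0],  # GB
    [1.0, 1.0, 1.0, 0.0],  # RF
    [1.0, 1.0, 1.0, 1.0],  # DL
])

labels, scores = weighted_class_ensemble(toy_probas, toy_weights)
print(labels, scores)
```

With equal weights everywhere this reduces to plain soft voting; in practice the weights might be derived from each model's per-class validation accuracy, which is in the spirit of the abstract but is not confirmed by it.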
Authors and Affiliations
Wafa M. AlShwaish, Maali Ibr. Alabdulhafith