Decision Support System for Classification of Early Childhood Diseases Using Principal Component Analysis and K-Nearest Neighbors Classifier

Keywords: Toddler Diseases, Principal Component Analysis (PCA), K-Nearest Neighbor Classifier

Authors

April 25, 2019


Background: Data on early childhood diseases collected in clinics have accumulated into large datasets. These data can be used to classify early childhood diseases and to help medical staff diagnose the diseases that affect young children.

Objective: This study aims to apply Principal Component Analysis (PCA) and the K-Nearest Neighbor (K-NN) classifier to the classification of early childhood diseases.

Methods: PCA was used to identify the variables with the greatest influence on the classification of early childhood diseases. This was done by examining the correlations between variables and eliminating the variables that had little influence on classification. The early childhood disease data were then classified using the K-NN classifier.
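The abstract gives no implementation details, so the following is only a minimal sketch of how a PCA step followed by K-NN classification might look, assuming NumPy is available. The toy feature matrix, the choice of one retained component, and k = 3 are illustrative assumptions and are not taken from the study. The sketch also projects the data onto principal components, which is the common way to use PCA and may differ from the variable-elimination procedure the authors describe.

```python
import numpy as np

def pca_fit(X, n_components):
    """Fit PCA on the correlation matrix of X; return (mean, std, components)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # guard against constant columns
    Z = (X - mean) / std                     # standardize each variable
    corr = (Z.T @ Z) / len(Z)                # correlation matrix of the variables
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, std, eigvecs[:, order]

def pca_transform(X, mean, std, components):
    """Project X onto the retained principal components."""
    return ((X - mean) / std) @ components

def knn_predict(X_train, y_train, X_query, k=3):
    """Label each query row by majority vote among its k nearest training rows."""
    preds = []
    for q in X_query:
        dist = np.linalg.norm(X_train - q, axis=1)   # Euclidean distance
        nearest = y_train[np.argsort(dist)[:k]]
        labels, counts = np.unique(nearest, return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)

# Hypothetical toy data: 3 variables, 2 disease classes (not from the study).
X_train = np.array([
    [1.0, 1.1, 5.0], [1.2, 0.9, 5.1], [0.8, 1.0, 4.9],
    [3.0, 3.2, 5.0], [3.1, 2.9, 5.1], [2.9, 3.0, 4.9],
])
y_train = np.array([0, 0, 0, 1, 1, 1])
X_test = np.array([[1.1, 1.0, 5.0], [3.0, 3.1, 5.0]])

mean, std, components = pca_fit(X_train, n_components=1)
Z_train = pca_transform(X_train, mean, std, components)
Z_test = pca_transform(X_test, mean, std, components)
pred = knn_predict(Z_train, y_train, Z_test, k=3)
print(pred)
```

The PCA parameters are fitted on the training data only and then reused for the test data, so the test records do not influence which components are retained.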

Results: Evaluation on 150 test records showed that the classification system combining PCA and the K-NN classifier achieved an accuracy of 86%.
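The abstract does not define the accuracy measure. Assuming it is the fraction of test records classified correctly, 86% of 150 records would correspond to 129 correct predictions. The sketch below shows that calculation; the label lists are hypothetical placeholders, not the study's data.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical labels: 150 test records, 129 of them predicted correctly.
y_true = [1] * 150
y_pred = [1] * 129 + [0] * 21
acc = accuracy(y_true, y_pred)
print(f"{acc:.0%}")
```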

Conclusion: PCA can reduce the number of variables involved, which improves the efficiency of the system. In addition, applying PCA together with the K-NN classifier can improve accuracy in the classification of early childhood diseases.