Decision Support System for Classification of Early Childhood Diseases Using Principal Component Analysis and K-Nearest Neighbors Classifier

Damar Dananjaya, Indah Werdiningsih, Rini Semiati


Background: Data on early childhood diseases collected in clinics have accumulated into large datasets. These data can be used to classify early childhood diseases and so help medical staff diagnose the diseases that affect young children.

Objective: This study aims to apply Principal Component Analysis (PCA) and K-Nearest Neighbor (K-NN) Classifier for the classification of early childhood diseases.

Methods: Data analysis was performed using PCA to identify the variables with the greatest influence on the classification of early childhood diseases. PCA was carried out by examining the correlations between variables and eliminating the variables that have little influence on classification. The early childhood disease data were then classified using the K-Nearest Neighbor (K-NN) Classifier.
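
The two steps described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses NumPy, synthetic data, and hypothetical names (`pca_fit`, `pca_transform`, `knn_predict`), and it assumes the common form of PCA that projects standardized data onto the leading eigenvectors of the correlation matrix. The abstract describes eliminating low-influence variables, which may differ in detail from this projection.

```python
import numpy as np

def pca_fit(X, n_components):
    """Fit PCA on the correlation matrix of X (rows = records, columns = variables)."""
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0                      # avoid division by zero for constant columns
    Z = (X - mean) / std                     # standardize each variable
    corr = (Z.T @ Z) / len(Z)                # correlation matrix of the variables
    eigvals, eigvecs = np.linalg.eigh(corr)  # eigenvalues come back in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, std, eigvecs[:, order]      # keep the top components

def pca_transform(X, mean, std, components):
    """Project records onto the retained principal components."""
    return ((X - mean) / std) @ components

def knn_predict(X_train, y_train, X_test, k=3):
    """Label each test record by majority vote among its k nearest training records."""
    preds = []
    for x in X_test:
        dists = np.linalg.norm(X_train - x, axis=1)   # Euclidean distances
        nearest = np.argsort(dists)[:k]
        labels, counts = np.unique(y_train[nearest], return_counts=True)
        preds.append(labels[np.argmax(counts)])
    return np.array(preds)

# Synthetic stand-in data (assumption): two well-separated classes, five variables.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, size=(40, 5)),
               rng.normal(3.0, 0.5, size=(40, 5))])
y = np.array([0] * 40 + [1] * 40)

idx = rng.permutation(len(X))
train, test = idx[:60], idx[60:]

# Fit PCA on the training records only, then classify in the reduced space.
mean, std, comps = pca_fit(X[train], n_components=2)
Z_train = pca_transform(X[train], mean, std, comps)
Z_test = pca_transform(X[test], mean, std, comps)
y_pred = knn_predict(Z_train, y[train], Z_test, k=3)
accuracy = float((y_pred == y[test]).mean())
```

Fitting the PCA parameters on the training split alone keeps information from the test records out of the model. The values of `n_components` and `k` here are arbitrary; the abstract does not state the ones used in the study.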

Results: Evaluation on 150 test records showed that the classification system combining PCA and the K-NN Classifier achieved an accuracy of 86%.
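
For reference, accuracy is the share of test records whose predicted class matches the true class. The sketch below uses a hypothetical helper `accuracy` and made-up labels. The count of 129 correct predictions is an assumption: it is what 86% of 150 would be if the reported figure is exact, and the abstract does not report it.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if not y_true or len(y_true) != len(y_pred):
        raise ValueError("label lists must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

# Hypothetical labels (assumption): 150 test records, 129 of them classified correctly.
y_true = ["A"] * 150
y_pred = ["A"] * 129 + ["B"] * 21
acc = accuracy(y_true, y_pred)
```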

Conclusion: PCA can be used to reduce the number of variables involved, which improves the efficiency of the system. In addition, applying PCA together with K-NN can improve accuracy in the classification of early childhood diseases.


Keywords: Toddler Diseases, Principal Component Analysis (PCA), K-Nearest Neighbor Classifier




Copyright (c) 2019 Authors

This work is licensed under a Creative Commons Attribution 4.0 International License.

ISSN 2443-2555 (online) 2598-6333 (print). Published by Universitas Airlangga.
All articles published in JISEBI are open access under the CC BY license (