Predicting Students Graduate on Time Using C4.5 Algorithm
Background: Facilitating an effective learning process is the goal of higher education institutions. Despite improvements in curriculum and resources, many students do not graduate on time. In most cases, the number of students who graduate on time is lower than the number of new students enrolling in universities. This can reduce students' chances of learning effectively, as the ratio of faculty members to students becomes non-ideal.
Objective: This study aims to present a prediction model for students' on-time graduation using the C4.5 algorithm by considering four features, namely the department, GPA, English score, and age.
Methods: This research was completed in three stages: data pre-processing, data processing, and performance measurement. The prediction scheme classifies each student based on department of study, age, GPA, and English proficiency.
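As a rough illustration of the split criterion that C4.5 is generally described as using, the sketch below computes the gain ratio of a categorical feature. The record layout, the field names (`department`, `gpa_band`, `on_time`), and the toy values are assumptions made for this example only; they are not the study's data or its implementation.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total)
                for n in Counter(labels).values())


def gain_ratio(rows, feature, target):
    """Gain ratio of a categorical feature with respect to the target label."""
    total = len(rows)
    # Group the target labels by the value of the candidate feature.
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    # Information gain: entropy before the split minus weighted entropy after.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalizes features that fragment the data heavily.
    split_info = -sum(len(g) / total * math.log2(len(g) / total)
                      for g in groups.values())
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical, hand-made records (NOT taken from the study).
records = [
    {"department": "A", "gpa_band": "high", "on_time": "yes"},
    {"department": "A", "gpa_band": "low", "on_time": "no"},
    {"department": "B", "gpa_band": "high", "on_time": "yes"},
    {"department": "B", "gpa_band": "low", "on_time": "no"},
]

ratio_gpa = gain_ratio(records, "gpa_band", "on_time")
ratio_department = gain_ratio(records, "department", "on_time")
```

In this toy set the hypothetical `gpa_band` feature separates the classes perfectly while `department` carries no information, so a tree built on gain ratio would split on `gpa_band` first. Continuous features such as GPA or age would first need a threshold search or binning, which this sketch omits.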
Results: The model successfully predicted students' on-time graduation. The results are based on data from students who graduated in 2008-2014. The model achieved 90% accuracy on 300 test records.
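For readers unfamiliar with the metric, accuracy is the fraction of test records whose predicted label matches the actual label. The sketch below uses made-up labels, chosen only so that the arithmetic mirrors the reported figures; if the 90% figure is exact, it would correspond to 270 correct predictions out of 300. The function name and the label values are assumptions for illustration.

```python
def accuracy(actual, predicted):
    """Fraction of positions where the predicted label equals the actual label."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)


# Hypothetical labels (NOT the study's data): 300 records, 30 of them misclassified.
actual = ["on_time"] * 300
predicted = ["on_time"] * 270 + ["late"] * 30

score = accuracy(actual, predicted)
```

Accuracy alone can hide poor performance on the minority class when on-time and late graduates are unevenly represented, so a confusion matrix is a common companion to this number.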
Conclusion: The finding is expected to be useful for universities in administering their teaching and learning process.
Authors who publish with this journal agree to the following terms:
All accepted papers will be published under a Creative Commons Attribution 4.0 International (CC BY 4.0) License. Authors retain copyright and grant the journal right of first publication. The CC BY license lets others Share (copy and redistribute the material in any medium or format) and Adapt (remix, transform, and build upon the material for any purpose, even commercially).