Clustering and Mixture Modeling of Schooling Expectancy Trends in Papua Province: A Spatial Analysis Using the Mapping Toolbox
Background: Persistent educational inequality in Papua Province, particularly in remote highland districts, is driven by limited infrastructure and accessibility. Although Schooling Expectancy (Harapan Lama Sekolah, HLS) is widely recognized as a forward-looking educational metric, existing studies rarely combine probabilistic modeling with spatial analysis to examine regional disparities.
Objective: This study aimed to identify spatial and statistical patterns of schooling expectancy across 29 districts in Papua from 2010 to 2023 by combining probabilistic clustering with spatial visualization methods.
Methods: Districts were grouped by HLS trend using Gaussian Mixture Model (GMM) clustering, which was validated with the Silhouette Index and the Davies–Bouldin Index (DBI). Fourteen candidate probability distributions were evaluated using Kolmogorov–Smirnov and Anderson–Darling tests, and five model selection criteria (AIC, BIC, AICc, CAIC, HQC) were applied to refine the fit. A cluster-wise mixture model was then constructed, and spatial interpretation was supported by MATLAB’s Mapping Toolbox and wind rose diagrams.
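The five selection criteria named in the Methods can be illustrated with a minimal Python sketch. The abstract does not state which formulas were used, so the standard textbook definitions are assumed here, with CAIC taken in Bozdogan's form. The function names and the sample values are hypothetical and serve only as an illustration; they are not data from the study.

```python
import math


def information_criteria(log_lik: float, k: int, n: int) -> dict:
    """Return AIC, BIC, AICc, CAIC and HQC for one fitted distribution.

    log_lik: maximized log-likelihood of the fit
    k:       number of estimated parameters
    n:       sample size
    Lower values indicate a better trade-off between fit and complexity.
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc requires n > k + 1")
    aic = 2 * k - 2 * log_lik
    return {
        "AIC": aic,
        "BIC": k * math.log(n) - 2 * log_lik,
        "AICc": aic + 2 * k * (k + 1) / (n - k - 1),
        "CAIC": k * (math.log(n) + 1) - 2 * log_lik,
        "HQC": 2 * k * math.log(math.log(n)) - 2 * log_lik,
    }


def normal_log_likelihood(sample: list) -> float:
    """Log-likelihood of a normal distribution at its maximum-likelihood fit."""
    n = len(sample)
    mean = sum(sample) / n
    var = sum((x - mean) ** 2 for x in sample) / n  # MLE variance (divides by n)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1)


# Made-up values for illustration only (not taken from the study's data).
sample = [10.2, 11.5, 11.9, 12.4, 12.6, 13.1]
scores = information_criteria(normal_log_likelihood(sample), k=2, n=len(sample))
for name, value in scores.items():
    print(f"{name}: {value:.3f}")
```

In practice the same function would be called once per candidate distribution within a cluster, and the candidate with the lowest value on each criterion would be preferred.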
Results: The analysis identified four statistically distinct clusters. Cluster 3 (coastal districts) showed the highest and most stable HLS (12.1–14.0 years), while Cluster 4 (remote highlands) showed the lowest (2.4–5.6 years) with high dispersion. Right-skewed distributions (e.g., Weibull, Gamma) best modeled high-performing districts, whereas heavy-tailed, left-skewed distributions (e.g., Stable, Inverse Gaussian) best modeled marginalized regions. Spatial visualization confirmed a clear coastal–highland divide in educational attainment.
Conclusion: Integrating probabilistic modeling with spatial clustering provided a robust analytical tool for capturing intra-regional educational disparities. The framework offers empirical evidence to support geographically differentiated policy interventions in Papua and could be adapted to similar underserved regions in future studies.
Keywords: Schooling Expectancy, Gaussian Mixture Model, Probabilistic Modeling, Silhouette Index, Davies–Bouldin Index, Spatial Clustering, Education Inequality, Papua Province.
Copyright (c) 2025 The Authors. Published by Universitas Airlangga.

This work is licensed under a Creative Commons Attribution 4.0 International License.