Clustering and Mixture Modeling of Schooling Expectancy Trends in Papua Province: A Spatial Analysis Using the Mapping Toolbox

October 28, 2025

Background: Persistent educational inequality in Papua Province, particularly in remote highland districts, is driven by limited infrastructure and accessibility. Although Schooling Expectancy (Harapan Lama Sekolah, HLS) is widely recognized as a forward-looking educational metric, existing studies rarely combine probabilistic modeling with spatial analysis to examine regional disparities.

Objective: This study aimed to identify spatial and statistical patterns of schooling expectancy across 29 districts in Papua from 2010 to 2023 by combining probabilistic clustering with spatial visualization methods.

Methods: The analysis applied Gaussian Mixture Model (GMM) clustering, validated using the Silhouette Index and Davies–Bouldin Index (DBI), to group districts based on HLS trends. Fourteen candidate probability distributions were evaluated using Kolmogorov–Smirnov and Anderson–Darling goodness-of-fit tests. In addition, five model selection criteria (AIC, BIC, AICc, CAIC, HQC) were applied to refine the choice of distribution. A cluster-wise mixture model was then constructed, and spatial interpretation was supported by MATLAB’s Mapping Toolbox and by wind rose diagrams.
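As an illustration of the model selection step, the five criteria can be computed from a fitted distribution's maximized log-likelihood, its number of free parameters k, and the sample size n. The sketch below uses the standard textbook definitions, which may differ in detail from the variants used in the study; the function names, the candidate list, and all log-likelihood values are hypothetical and are not results from the paper.

```python
import math


def information_criteria(log_likelihood, k, n):
    """Return AIC, BIC, AICc, CAIC and HQC (standard definitions).

    log_likelihood: maximized log-likelihood of the fitted distribution
    k: number of free parameters, n: number of observations
    """
    if n - k - 1 <= 0:
        raise ValueError("AICc is undefined unless n > k + 1")
    aic = 2 * k - 2 * log_likelihood
    return {
        "AIC": aic,
        "BIC": k * math.log(n) - 2 * log_likelihood,
        "AICc": aic + (2 * k * (k + 1)) / (n - k - 1),
        "CAIC": k * (math.log(n) + 1) - 2 * log_likelihood,
        "HQC": 2 * k * math.log(math.log(n)) - 2 * log_likelihood,
    }


def best_by_criterion(fits, n):
    """Map each criterion to the candidate with the smallest value.

    fits: dict of name -> (log_likelihood, k)
    """
    scores = {name: information_criteria(ll, k, n) for name, (ll, k) in fits.items()}
    criteria = ["AIC", "BIC", "AICc", "CAIC", "HQC"]
    return {c: min(scores, key=lambda name: scores[name][c]) for c in criteria}


# Hypothetical log-likelihoods for one district's 14 annual HLS values (2010-2023).
# These numbers are invented for illustration only.
candidates = {
    "Weibull": (-20.0, 2),
    "Gamma": (-21.0, 2),
    "Stable": (-19.5, 4),
}
winners = best_by_criterion(candidates, n=14)
print(winners)
```

Lower values indicate a better trade-off between fit and complexity under every criterion here. With only 14 observations per series, the small-sample correction in AICc and the heavier penalties in BIC and CAIC count strongly against four-parameter candidates, which is one reason to consult several criteria rather than AIC alone.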

Results: The analysis identified four statistically distinct clusters. Cluster 3 (coastal districts) showed the highest and most stable HLS (12.1–14.0 years), while Cluster 4 (remote highlands) showed the lowest (2.4–5.6 years) with high dispersion. Right-skewed distributions (e.g., Weibull, Gamma) best modeled high-performing districts, whereas heavy-tailed, left-skewed ones (e.g., Stable, Inverse Gaussian) best modeled marginalized regions. Spatial visualization confirmed a clear coastal–highland divide in educational attainment.

Conclusion: The proposed integration of probabilistic modeling and spatial clustering offered a robust analytical tool for capturing intra-regional educational disparities. This framework provided empirical evidence to support geographically differentiated policy interventions in Papua and could be adapted to similar underserved regions in future studies.

Keywords: Schooling Expectancy, Gaussian Mixture Model, Probabilistic Modeling, Silhouette Index, Davies–Bouldin Index, Spatial Clustering, Education Inequality, Papua Province.