Clustering of Drug Sampling Data to Determine Drug Distribution Patterns with K-Means Method: Study on Central Kalimantan Province, Indonesia

Wahyuri Wahyuri, Umi Athiyah, Ira Puspitasari, Yunita Nita



Background: Drug sampling and testing as part of post-marketing control is an important component of ensuring drug safety in the supply chain. The Indonesian National Agency for Drug and Food Control (NA-FDC) uses the results to issue public warnings, evaluate the implementation of Good Manufacturing Practice (GMP) and Good Distribution Practice (GDP), and enforce the law against drug violations.

Objective: This study aimed to identify and analyze drug distribution patterns to provide an overview of drug sampling in the public sector.

Methods: Data were collected from the database of Balai Besar Pengawas Obat dan Makanan (BBPOM) Palangka Raya. The dataset comprised drug sampling records from the Integrated Information Reporting Systems (IIRS) application for 2014 to 2018. We then applied the CRISP-DM methodology to analyze the data and identify patterns, with K-means clustering selected as the modeling technique.

Results: The dataset contained five attributes: drug name, therapeutic class, district/city, sample category, and evaluation of drug surveillance. The drug distribution pattern formed three clusters. The first cluster contained 522 drug items in eight therapeutic classes spread over ten districts; the second contained 1542 drug items in five therapeutic classes spread over five districts; and the third contained 503 drug items in eleven therapeutic classes spread over nine districts.
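As an illustration of the clustering step, the following is a minimal sketch of K-means (Lloyd's algorithm) applied to categorical sampling records. It is not the authors' implementation: the records are hypothetical, the one-hot encoding of categorical attributes is an assumption (the abstract does not state how the attributes were encoded), and the drug name attribute is omitted for brevity. The names `RECORDS`, `one_hot_encode`, and `k_means` are illustrative only.

```python
import random

# Hypothetical sample records: (therapeutic class, district/city, sample category).
# These values are illustrative only and are not taken from the BBPOM dataset.
RECORDS = [
    ("antibiotic", "Palangka Raya", "routine"),
    ("antibiotic", "Palangka Raya", "routine"),
    ("analgesic", "Kapuas", "targeted"),
    ("analgesic", "Kapuas", "targeted"),
    ("antihypertensive", "Katingan", "routine"),
    ("antihypertensive", "Katingan", "targeted"),
]


def one_hot_encode(records):
    """Turn categorical tuples into 0/1 vectors, one slot per (column, value)."""
    slots = sorted({(col, val) for rec in records for col, val in enumerate(rec)})
    index = {slot: i for i, slot in enumerate(slots)}
    vectors = []
    for rec in records:
        vec = [0.0] * len(slots)
        for col, val in enumerate(rec):
            vec[index[(col, val)]] = 1.0
        vectors.append(vec)
    return vectors


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(vectors, k, max_iter=100, seed=0):
    """Lloyd's algorithm: alternate assignment and centroid update until stable."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k records drawn at random.
    centroids = [list(v) for v in rng.sample(vectors, k)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each record joins its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments no longer change
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster becomes empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


labels, centroids = k_means(one_hot_encode(RECORDS), k=3)
for record, label in zip(RECORDS, labels):
    print(label, record)
```

With `k=3` the sketch mirrors the number of clusters reported above, but the resulting groups depend on the hypothetical records and the random initialization, so they say nothing about the actual cluster contents.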

Conclusion: The applied data mining technique improved decision-making for drug sampling planning. It also provides in-depth information for improving the performance of drug post-marketing control in Central Kalimantan Province.


Keywords: Clustering, CRISP-DM, Data Mining, Drug distribution patterns, Drug quality control, Drug sampling







Copyright (c) 2019 The Authors. Published by Universitas Airlangga.

This work is licensed under a Creative Commons Attribution 4.0 International License.

ISSN 2443-2555 (online) 2598-6333 (print). Published by Universitas Airlangga.
All articles published in JISEBI are open access and under the CC BY license (