Business Intelligence Development in Distributed Information Systems to Visualized Predicting and Give Recommendation for Handling Dengue Hemorrhagic Fever

ISSN 2443-2555 (online) 2598-6333 (print) © 2020 The Authors. Published by Universitas Airlangga. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/) doi: http://dx.doi.org/10.20473/jisebi.6.1.55-69


I. INTRODUCTION
Indonesia records tens of thousands of dengue cases every year. In 2017 there were 68,407 cases with 493 deaths, and in 2018 there were 53,075 cases with 344 deaths. As of February 3, 2019, there were 16,692 cases with 169 deaths, and as of March 15, 2020, there were 25,693 cases with 164 fatalities [1]. This shows that there were more than 150 dengue cases every month, and more than one death every day, from 2017 to 2020.
One factor in the deaths of DHF patients is late handling in hospitals or clinics. The Health Office of Malang Regency recorded 1,114 cases of Dengue Hemorrhagic Fever (DHF) during 2016, with the largest numbers of patients in the eastern districts of Wajak and Tajinan [2], while the number of patient rooms available at each Community Health Center (puskesmas) is limited. Therefore, the Health Office of Malang Regency wants a system that predicts the number of cases in the next period and maps the distribution of dengue patients across Community Health Centers. Displaying both on a visual dashboard helps decisions to be made faster and medical facilities to be better prepared for DHF patients, which can reduce mortality. To address this problem, we built a dashboard that visualizes forecasts and a map of the distribution of DHF patients. The prediction is based on the total number of patients recorded in the existing SP2TP system (Reporting and Recording System of Community Health Center), an application for recording DHF patients in Malang district; on seasonal factors such as rain, temperature, and humidity; on the patient increment status; and on the results of Epidemiological Investigations (PE). This paper emphasizes how we developed the decision-making system; the prediction method is explained in our previous research [3].
Chen et al. assumed that the transmission of DHF lasts for two consecutive weeks. Their result is an online analytic system for front-line public health officers at health centers, so that the spread of dengue fever can be followed week by week [4]. The study shows only a visualization of the epidemic area and does not show how large or small the number of dengue patients is. Nirbhay Mathur et al. visualized the predicted weekly incidence of DHF in Selangor, Malaysia, but the visualization was not integrated with the data input system for dengue fever [5]. Jastini and Izwan predicted the number of DHF patients and visualized it as a histogram. Their predictions are annual rather than monthly and cover a wide region. Such aggregate predictions give mid-level decision-makers little insight for making preparations, and large areas do not show which localities need more attention [6]. Niels et al. (2018) visualized real-time patient flow for Emergency Department crowding using a Shiny dashboard in R. R is better suited to scientists than to novice or business users, and the dashboard cannot be used offline, so it must be connected to the internet [7].
Wilfred Bonney showed that Business Intelligence (BI) technology is beneficial and supports decision making when it is integrated with the Electronic Health Record (EHR). BI technology also gives healthcare providers a framework for assessing the information potential of the datasets contained in the EHR repository [8]. Ana Pereira et al. studied BI solutions with pervasive characteristics applied to pervasive healthcare and showed that enabling data access from anywhere and at any time can help reduce the number of medical errors and consequently improve the quality and safety of patient care [9].
Richardo (2018) built traffic dashboards to inform a city about route changes, flooded routes, traffic jams, and car accidents, but the dashboard did not perform well because of insufficient data quality, lack of understanding of the data, poor analysis, and wrong interpretation [10]. Amy Franklin et al. built a dashboard visualization that gives a real-time representation of the status of an entire emergency department, to support in-the-moment clinical decision-making and rapid intervention in improving ED flow. They found that the information still needs to be presented in a readily understandable format and that interfaces must accommodate a diverse set of users [11]. Adriaan Haasbroek et al. built monitoring dashboards that reduce downtime through early detection and identification of fault conditions followed by appropriate process recovery actions, but using the dashboard requires specific skills [12]. William A. Mattingly et al. built a visualization dashboard for managing multi-site clinical trial enrollment, but it uses only histograms, whereas line graphs are easier to read for data plotted against time [13]. Parama Fadli showed that using Business Intelligence to analyze social media information helps to calculate and summarize data efficiently. The application is useful for monitoring news posted on social media and can support providers in the decision-making process [14].
Across the existing research on visualization dashboards and decision-making systems, we found no BI system that is integrated with legacy applications, offers many graph types, and is easy to use with simple tools. We therefore built a dashboard that displays the predictions, visualizes the distribution of DHF patients in several graph types, and gives mitigation recommendations for handling DHF patients at the Malang Health Office, without requiring any special skills.

II. METHODS
This research uses the Business Intelligence development method, which broadly consists of two main phases: making Business Intelligence and using Business Intelligence. We apply the first phase, which has four stages: defining the Business Intelligence development strategy, identifying and preparing data sources (carried out during data collection), selecting Business Intelligence tools, and designing and implementing Business Intelligence [15]. Based on the problem explained in the previous section, namely the increase in dengue cases from year to year, Business Intelligence is needed to predict future dengue cases. The process of making Business Intelligence is shown in Fig. 1. The identification and preparation of the data sources needed to build Business Intelligence are explained in more detail in the Data Collection section. The data used in this study come from two primary sources: the Integrated Puskesmas Recording and Reporting System (SP2TP) owned by the Malang Regency Health Office, and the website of the Meteorology, Climatology, and Geophysical Agency (BMKG). Data from SP2TP are extracted, loaded, and transformed in the process of developing Business Intelligence.

A. Data Collection
As previously explained, the data used are taken from SP2TP and were obtained from the Malang District Health Office. The SP2TP data used are the records of DHF patients from January 2017 to March 2018. In addition, the monthly results of Epidemiological Investigations (PE) of DHF patients are used as the Decision Tree target variable, which helps recommend countermeasures in the form of fogging, abating, counseling, and eradicating mosquito nests. The data were previously validated by the Malang District Health Office. A recapitulation of the DHF patient register is shown in Table 1, and a recapitulation of the PE results can be seen in Table 2.
The data taken from the BMKG website are weather data for Malang Regency, namely the amount of rainfall per month from January 2017 to March 2018. The weather data obtained from BMKG are daily rainfall amounts, so they need to be aggregated into monthly rainfall totals. Each month is then labeled with a season based on its monthly rainfall. A recapitulation of the season and rainfall data can be seen in Table 3.
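The aggregation and labeling described above can be sketched in a few lines. This is a minimal illustration, not the authors' implementation (the paper uses queries and PHP shown in its tables); the function names and the sample readings are hypothetical, and the 150 mm monthly threshold is the one stated in the forecasting section of this paper.

```python
from collections import defaultdict
from datetime import date

# Monthly threshold stated in the paper: 50 mm per ten days, i.e. 150 mm per month.
RAINY_THRESHOLD_MM = 150.0


def monthly_rainfall(daily_records):
    """Sum daily rainfall (mm) into totals keyed by (year, month)."""
    totals = defaultdict(float)
    for day, rainfall_mm in daily_records:
        totals[(day.year, day.month)] += rainfall_mm
    return dict(totals)


def label_season(monthly_total_mm):
    """Return "rainy" when the monthly total reaches the threshold, else "dry"."""
    return "rainy" if monthly_total_mm >= RAINY_THRESHOLD_MM else "dry"


# Hypothetical daily readings for illustration only (not BMKG data).
sample = [
    (date(2017, 1, 1), 80.0),
    (date(2017, 1, 15), 90.0),
    (date(2017, 2, 3), 40.0),
]
totals = monthly_rainfall(sample)
seasons = {month: label_season(total) for month, total in totals.items()}
print(seasons)
```

Keeping the threshold in a single constant makes it easy to adjust if the health office adopts a different seasonal rule.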

B. Preprocessing
Before further data processing, the Business Intelligence tool is chosen. We decided to use existing BI software rather than build BI software from scratch. An existing BI tool saves time because no program or application has to be written first, and it can be applied directly to visualize data as a dashboard. This study uses Power BI to visualize the data. Power BI can connect to various DBMSs (Database Management Systems) such as MySQL, SQL Server, Microsoft Access, and others. Power BI also has a Hybrid Deployment Support feature, which supports connecting to various data sources in different formats, and a Customization feature, which helps developers customize the dashboard visualization. In addition, Power BI offers APIs for integration, which can be used to link dashboards made in Power BI to other products such as websites.
The advantage of Power BI over other BI tools is that it has excellent features for visualizing data in a dashboard that serves as the user interface for reports. Compared to QlikView, Power BI is easier to use and learn because it is user-friendly even for users who only have Excel knowledge. Compared to Tableau, the speed of Power BI is more promising because it has a smart recovery feature, while the speed of Tableau depends on RAM and the dataset. Power BI is also less expensive, because the desktop version is free, and it scales to fairly large projects.

C. Extract and Load
In this stage, the data are extracted from the previously created source database, the SP2TP database, to the destination database. The design view of the SP2TP database tables that are extracted to the new system database is shown in Fig. 2. Job_master is used to load data on villages, sub-districts, districts, provinces, and other master data. The files run by Job_master include Trans_desa, Trans_kecamatan, Trans_puskesmas, Trans_kelurahan, Trans_provincial, and Trans_coverage. The query used is shown in Table 4, and the results from Job_master are shown in Table 5. After the data have been extracted and loaded into the new system database, the next step is to transform them. The transformation is carried out in two stages, using Job DBD and Job forecast. The stages of the transformation are illustrated in the corresponding figure. Job DBD transforms DHF patient data into statistical data that are ready to be used for predicting future DHF patients. The files run in this transformation include Trans_desa and Trans_dbd_per_desa. Job DBD turns detailed per-patient data into the number of patients per month in each public health center and each village. The query used to perform the transformation can be seen in Table 6. The data taken from the SP2TP database are shown in Table 7, and the results of the transformation using Job DBD are shown in Table 8. Job forecast is used to compare predicted data with actual data. The prediction results are imported into the Power BI visualization. Job forecast also determines the DHF status of each Community Health Office so that further action can be taken.
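The aggregation performed by Job DBD, from per-patient rows to monthly counts per village and per health center, can be sketched as follows. This is only an illustration under assumptions: the actual transformation is the query referenced in Table 6, and the field names (`year`, `month`, `health_center`, `village`) and sample records here are hypothetical.

```python
from collections import Counter


def count_patients_per_village(records):
    """Count patient records per (year, month, health_center, village)."""
    counts = Counter()
    for rec in records:
        key = (rec["year"], rec["month"], rec["health_center"], rec["village"])
        counts[key] += 1
    return counts


def count_patients_per_health_center(village_counts):
    """Roll village-level counts up to (year, month, health_center)."""
    totals = Counter()
    for (year, month, health_center, _village), n in village_counts.items():
        totals[(year, month, health_center)] += n
    return totals


# Hypothetical patient records for illustration only.
records = [
    {"year": 2017, "month": 1, "health_center": "Tumpang", "village": "Village A"},
    {"year": 2017, "month": 1, "health_center": "Tumpang", "village": "Village A"},
    {"year": 2017, "month": 1, "health_center": "Tumpang", "village": "Village B"},
    {"year": 2017, "month": 2, "health_center": "Wajak", "village": "Village C"},
]
per_village = count_patients_per_village(records)
per_center = count_patients_per_health_center(per_village)
print(per_center[(2017, 1, "Tumpang")])
```

Counting at village level first and rolling up afterwards means both granularities used by the dashboard come from a single pass over the patient rows.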

2) Making Forecast Number of DHF
Forecasting produces the predicted total number of DHF patients for the following months. The forecast uses weather data and data on the increase in DHF patients. Weather data processing begins by taking data from BMKG and deriving the season variable. The season is classified as either dry or rainy. It is determined from the amount of rainfall per ten days with a threshold of 50 mm, which gives a threshold of 150 mm for one month. If rainfall is < 150 mm in a given month, the season is considered "dry", whereas if rainfall in that month is >= 150 mm, the season is considered "rainy". The equations for determining the season are shown in (1). The queries and PHP syntax shown in Table 10 are used to create the season variable. Next, the data on the increase in the number of DHF patients are processed. These data are derived from the number of patients per health center each month: the number of patients in a given month is compared to the previous month. If the number of patients has increased, the patient increment status variable takes the value "Increased". If there is no increase, the variable takes the value "Not increased". The equations for determining the patient increment status variable are shown in (3) and (4).

$$P_{i-1} < P_i \;\Rightarrow\; K_i = \text{Increased} \tag{3}$$
$$P_{i-1} \ge P_i \;\Rightarrow\; K_i = \text{Not increased} \tag{4}$$
where:
P : total number of DHF patients
i : the i-th period
K : patient increment status

Table 11 shows the query and PHP syntax used to create the patient increment status variable. Together, the weather data and the patient increment status are used to predict the number of DHF patients in the following months.
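The rule in (3) and (4) can be illustrated with a short sketch. It is not the query from Table 11; the function name and the sample counts are hypothetical, and the first period has no status because it has no predecessor.

```python
def increment_status(monthly_counts):
    """Return K_i for i >= 1: "Increased" if P_(i-1) < P_i, else "Not increased"."""
    statuses = []
    for previous, current in zip(monthly_counts, monthly_counts[1:]):
        statuses.append("Increased" if previous < current else "Not increased")
    return statuses


# Hypothetical monthly patient counts for one health center.
counts = [3, 5, 5, 2]
print(increment_status(counts))
```

Note that an unchanged count is labeled "Not increased", matching the non-strict inequality in (4).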

3) Making Mitigation Recommendations in Dengue Fever Prevention
To make recommendations, we use the epidemiological investigation (PE) data and the patient increment status variable. Based on the regent regulations of 2018, PE is divided into three categories: positive PE, negative PE, and no PE. If there are other DHF patients, or there are mosquito larvae and other fever patients, the case is classified as positive PE. A Community Health Center with positive PE status receives mitigation recommendations in the form of counseling, eradication of mosquito nests (PSN), fogging, and abating (recommendation 1). If there are no other DHF patients, or no mosquito larvae and other fever patients, the case is classified as negative PE. A Community Health Center with negative PE status receives mitigation recommendations in the form of counseling and eradication of mosquito nests (recommendation 2). If there is no patient, there is no PE. A Community Health Center with no PE status does not receive mitigation recommendations (recommendation 3). Table 12 shows the query and PHP syntax used to create the PE result variable.
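The three PE categories and their recommendations can be written as a small lookup. This is a sketch of one reading of the rules above, not the query in Table 12: the function and parameter names are hypothetical, and a case is treated as positive when there are other DHF patients, or when both larvae and other fever patients are found.

```python
# Recommendations 1-3 as listed in the text, keyed by PE status.
RECOMMENDATIONS = {
    "positive": ["counseling", "eradication of mosquito nests (PSN)", "fogging", "abating"],
    "negative": ["counseling", "eradication of mosquito nests (PSN)"],
    "none": [],
}


def pe_status(has_patient, other_dhf_patients, larvae_found, other_fever_patients):
    """Classify a PE result (assumed interpretation of the rules in the text)."""
    if not has_patient:
        return "none"
    if other_dhf_patients or (larvae_found and other_fever_patients):
        return "positive"
    return "negative"


def recommend(has_patient, other_dhf_patients, larvae_found, other_fever_patients):
    """Return the list of mitigation recommendations for a PE result."""
    status = pe_status(has_patient, other_dhf_patients, larvae_found, other_fever_patients)
    return RECOMMENDATIONS[status]


print(recommend(True, False, True, True))
```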

a) Model Implementation
The validated and tested decision tree model with the best performance is applied to the decision support system. In this study, the decision tree model is implemented using SQL queries in PHP files. The source code used to implement the selected decision tree model can be seen in Table 13.

E. Database Design
The data needed to create a system is illustrated in Fig. 4 as the Relational Model.

C. Dengue Fever Patient Trends Per Year and Predictions
The dashboard displays four pieces of information, including the territorial distribution of DHF patients, the comparison between the actual and predicted number of DHF patients, and the prediction of the number of DHF patients for the next two years. The comparison between actual and forecast values is displayed in histograms and tables. The prediction of the number of DHF patients for the next two years can be seen in Fig. 7. In addition, a drill-through displays the actual and predicted number of DHF patients by month, as illustrated in the corresponding figure.

D. Mitigation Recommendation
This dashboard displays the PE result, DHF prediction, and mitigation recommendation for the Tumpang Community Health Office. In January, February, March, and May, no prevention protocol is needed, but in April and June fogging is needed twice every week. The results can be seen in Fig. 9.

Fig. 9 Mitigation Recommendation for Each Community Health Office

Journal of Information Systems Engineering and Business Intelligence, 2020, 6 (1), 55-69

IV. DISCUSSION

A. Installation
The system is installed on an internal server. It is connected to the SP2TP system on the same network, which facilitates the ELT process. The system is also connected to the internet in order to scrape the BMKG website for the weather data it displays. Extraction from SP2TP runs periodically, at a set hour and minute every day. It is followed by data loading and the transformation for prediction, which together take about 30 minutes.
Extraction is restricted to once per day because of the heavy load on the SP2TP server. Fortunately, this is not a serious limitation, because the predictive process depends more on historical data than on data that is updated in real time. Nevertheless, real-time capability could improve the speed of the process. For example, applying Change Data Capture [16] would accelerate extraction, because it would no longer be necessary to read the full data set. Another idea is to implement full data virtualization [17], which would eliminate the extraction and loading process entirely. However, this would require significant changes to the SP2TP system database.
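To make the Change Data Capture idea concrete, the sketch below shows a simplified, timestamp-based variant: only rows modified since the last extraction are read. This is an assumption-laden illustration, not part of the built system; it presumes each source row carries a last-modified timestamp (the hypothetical `updated_at` field), which the SP2TP database may not provide, and log-based CDC tools work differently.

```python
from datetime import datetime


def extract_changes(rows, last_extracted_at):
    """Return rows modified after the watermark, plus the new watermark."""
    changed = [row for row in rows if row["updated_at"] > last_extracted_at]
    new_watermark = max(
        (row["updated_at"] for row in changed), default=last_extracted_at
    )
    return changed, new_watermark


# Hypothetical source rows for illustration only.
rows = [
    {"id": 1, "updated_at": datetime(2018, 3, 1, 8, 0)},
    {"id": 2, "updated_at": datetime(2018, 3, 2, 9, 30)},
    {"id": 3, "updated_at": datetime(2018, 3, 3, 7, 15)},
]
watermark = datetime(2018, 3, 1, 23, 59)
changed, watermark = extract_changes(rows, watermark)
print([row["id"] for row in changed], watermark)
```

Storing the returned watermark after each run lets the next extraction skip rows that were already loaded, which is what reduces the load on the source server.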

B. Self-Service BI in a Remote Area
One of the reasons we use Power BI for dashboard visualization is its support for self-service BI [18]. Self-service BI treats end-users as users who have the knowledge and independence to make changes to the front end of the dashboard. This reduces the dependency of management on the existing IT team. For us, self-service BI is also very useful in remote areas, where connection to the system often fails, especially in areas that are blank spots for wireless coverage. However, these benefits are currently available only to desktop users, not to mobile phone users, who always require an internet connection.
Despite these advantages, many aspects of the system can still be improved. Although it uses many aspects of self-service BI, it is still at the level of self-tailoring on the visualization side, because the user is exposed to only one prediction method. Increasing the number of predictive methods would give the user, as a decision-maker, more options to choose from and a better basis for decisions.

C. Cloud Computing Era
Despite many challenges and obstacles, such as the issue of data privacy, cloud computing can be considered when building a Business Intelligence system [19]. With cloud computing, resources can be more reliable, with the expectation of more precise results, and the IT team can focus on building solutions that improve outcomes. At the least, it reduces the many failures and errors that occur when using an on-premises solution.

D. Evaluation
Although the dashboard has technically been built successfully using a dashboard development methodology [15], this paper has not yet thoroughly evaluated the success of its implementation. Understanding the Critical Success Factors (CSFs) is one of the earliest phases of doing so. By specifying CSFs, we can combine the evaluation of infrastructure performance with the evaluation of the business processes that lead to benefits for the organization [20]. This is important because non-technical factors, such as organizational readiness and related business process factors, have a greater effect on the success rate of business intelligence implementation. For example, research on CSFs in Poland [21] found that for business intelligence to be implemented successfully, some crucial conditions must be met, such as making business intelligence part of the business strategy, having sponsors from the core of the organization's hierarchy, and ensuring that users are able to use business intelligence.

V. CONCLUSION
In this paper, we have built a decision-making system, in the form of a business intelligence dashboard, that visualizes predictions and provides recommendations for handling dengue fever. The system was developed using the business intelligence development method, which consists of (1) defining the business intelligence development strategy, (2) identifying data sources, (3) selecting business intelligence tools, and (4) designing and developing business intelligence. The development of this system has also explored the capabilities of self-service business intelligence tooling.
This research is still very open for improvement, including the utilization of cloud computing, change data capture, and data virtualization technology. Besides, further research needs to be done in evaluating the success rate of this technological implementation by looking at the Critical Success Factor aspects of business intelligence technology.