Improving Café Reputation: Machine Learning Analytics for Predicting Customer Engagement on Google Maps
Background: Online reviews are a powerful tool in shaping customer decisions, as they significantly influence a business’s reputation and its ability to attract new customers. Given the growing reliance on digital platforms, understanding engagement levels is crucial for businesses that want to enhance their online presence. By analyzing customer review activity, business owners can use Machine Learning (ML) analytics to predict engagement with Google Maps reviews.
Objective: This study aimed to identify the most suitable ML model for predicting customer engagement levels for café businesses on Google Maps, and to determine the online review features that have the greatest impact on engagement. It also aimed to provide actionable recommendations to help business owners improve their online reputation and engagement strategies.
Method: A total of 5,626 online reviews were collected using web scraping. The data were then preprocessed by extracting the major review features, calculating engagement levels, and addressing class imbalance with the SMOTE method. K-Means clustering was used to segment engagement levels, while sentiment analysis with the VADER lexicon was applied to measure sentiment content. Various ML models were trained and validated using 10-fold cross-validation. Finally, Spearman's correlation analysis was conducted to identify relationships among the features derived from the best-performing ML models.
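To illustrate the segmentation step, the following is a minimal sketch of K-Means (Lloyd's algorithm) applied to a single numeric engagement score per review. The abstract does not state how the engagement score is computed or how many clusters were used, so the scores, the choice of k=3, and the level names below are assumptions for illustration only. In practice a library implementation such as scikit-learn's `KMeans` would normally be used instead of this hand-written version.

```python
from statistics import mean


def assign(values, centroids):
    """Return, for each value, the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))
            for v in values]


def kmeans_1d(values, k=3, max_iter=100):
    """Cluster scalar values into k groups with Lloyd's algorithm."""
    distinct = sorted(set(values))
    if k < 2 or len(distinct) < k:
        raise ValueError("need k >= 2 and at least k distinct values")
    # Deterministic start: centroids spread evenly over the sorted distinct
    # values, so they begin in ascending order (label 0 = lowest scores).
    centroids = [float(distinct[i * (len(distinct) - 1) // (k - 1)])
                 for i in range(k)]
    for _ in range(max_iter):
        labels = assign(values, centroids)
        updated = []
        for j in range(k):
            members = [v for v, lab in zip(values, labels) if lab == j]
            # Keep the previous centroid if a cluster ends up empty.
            updated.append(float(mean(members)) if members else centroids[j])
        if updated == centroids:  # assignments are stable: converged
            break
        centroids = updated
    return assign(values, centroids), centroids


# Hypothetical engagement scores, one per review (not data from the study).
scores = [0, 1, 0, 2, 1, 9, 11, 10, 30, 28]
labels, centroids = kmeans_1d(scores, k=3)
level_names = ["low", "medium", "high"]  # assumed names for the clusters
levels = [level_names[lab] for lab in labels]
```

The resulting `levels` list can then serve as the class label that the classifiers are trained to predict.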
Results: The Random Forest model achieved the highest accuracy and PR AUC in predicting engagement levels. The four most influential factors were review length (16.23%), photos (15.57%), total rating (12.35%), and author review count (10.19%). Spearman's correlation analysis showed positive relationships among review length, photos, and author review count, indicating their combined impact on engagement levels.
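As a sketch of how such a relationship can be measured, Spearman's correlation is the Pearson correlation of the rank-transformed values, with tied observations given their average rank. The example below implements this with the standard library only. The feature values are hypothetical and are not taken from the study's dataset; a library routine such as `scipy.stats.spearmanr` would normally be preferred.

```python
from math import sqrt


def average_ranks(xs):
    """Return 1-based ranks of xs, giving tied values their average rank."""
    order = sorted(range(len(xs)), key=lambda i: xs[i])
    ranks = [0.0] * len(xs)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and xs[order[j + 1]] == xs[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(vx * vy)


# Hypothetical per-review features (not data from the study).
review_length = [12, 45, 80, 150, 300]  # characters
photo_count = [0, 1, 1, 3, 5]           # contains one tie
rho = spearman(review_length, photo_count)
```

A value of `rho` close to 1 indicates that longer reviews tend to carry more photos in this toy sample; it says nothing about causation.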
Conclusion: This study demonstrated the effectiveness of the Random Forest model in predicting engagement levels for Google Maps reviews. Specifically, the model identified review length, photos, total rating, and author review count as the key factors influencing engagement. These results provide valuable guidance for business owners who want to improve customer engagement and online reputation. Future studies should explore larger datasets, integrate additional features, and examine how engagement contributes to long-term customer retention.
Keywords: Online Reputation Management, Customer Engagement, Behavior, Machine Learning, Google Maps Review, Predictive Analytics
Copyright (c) 2025 The Authors. Published by Universitas Airlangga.

This work is licensed under a Creative Commons Attribution 4.0 International License.