Crypto-sentiment Detection in Malay Text Using Language Models with an Attention Mechanism
Background: With growing interest in cryptocurrencies, opinions on cryptocurrency-related topics are widely shared in news and on social media. The sheer volume of sentiment data being released makes it challenging to process and analyse. In addition, existing sentiment models in the cryptocurrency domain focus primarily on English, with minimal work on the Malay language, which compounds the problem.
Objective: This study examines the performance of a sentiment regression model in predicting sentiment scores for Malay news and tweets.
Methods: Malay news headlines and tweets on Bitcoin and Ethereum are used as input. The proposed sentiment regression model is a hybrid that combines the Generalized Autoregressive Pretraining for Language Understanding (XLNet) language model with a Bidirectional Gated Recurrent Unit (Bi-GRU) deep learning model. The effect of a multi-head self-attention mechanism on the proposed model is also investigated. Finally, a comparative analysis against Bidirectional Encoder Representations from Transformers (BERT) is carried out.
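To illustrate the multi-head self-attention step mentioned in the Methods, the following is a minimal sketch in plain Python. It is not the authors' implementation: the helper names (`matmul`, `softmax`, `random_matrix`, `multi_head_self_attention`), the random weights standing in for trained parameters, and the toy dimensions are all assumptions made for illustration. The input `states` plays the role of the per-token hidden states that a Bi-GRU layer might produce.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def random_matrix(rows, cols, rng):
    """Small random weights standing in for trained parameters (assumption)."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def multi_head_self_attention(states, num_heads, rng):
    """Apply multi-head self-attention to a sequence of hidden states.

    states: list of token vectors, each of size d_model.
    Returns (output, weights): output has the same shape as states, and
    weights[h][i][j] is how much token i attends to token j in head h.
    """
    d_model = len(states[0])
    if d_model % num_heads != 0:
        raise ValueError("d_model must be divisible by num_heads")
    d_head = d_model // num_heads

    head_outputs, weights = [], []
    for _ in range(num_heads):
        # Each head has its own query, key, and value projections.
        w_q = random_matrix(d_model, d_head, rng)
        w_k = random_matrix(d_model, d_head, rng)
        w_v = random_matrix(d_model, d_head, rng)
        q, k, v = matmul(states, w_q), matmul(states, w_k), matmul(states, w_v)

        # Scaled dot-product scores between every pair of tokens.
        k_t = [list(col) for col in zip(*k)]
        scores = matmul(q, k_t)
        attn = [softmax([s / math.sqrt(d_head) for s in row]) for row in scores]
        weights.append(attn)
        head_outputs.append(matmul(attn, v))

    # Concatenate the heads per token, then mix them with an output projection.
    concat = [sum((head[i] for head in head_outputs), []) for i in range(len(states))]
    w_o = random_matrix(d_model, d_model, rng)
    return matmul(concat, w_o), weights


# Toy usage: 5 tokens with 8-dimensional states and 4 attention heads.
rng = random.Random(0)
states = random_matrix(5, 8, rng)
output, weights = multi_head_self_attention(states, num_heads=4, rng=rng)
```

Varying `num_heads` changes how the state dimension is split across heads, which is the setting the study reports as influential. In a trained model the projection matrices would be learned rather than sampled.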
Results: The experimental results demonstrate that the number of attention heads is vital in improving the performance of the XLNet-GRU sentiment model. The adjusted R² values improve slightly, by 0.03, with an average mean absolute error (MAE) of 0.163 (Malay news) and 0.174 (Malay tweets). In addition, average root mean square error (RMSE) values of 0.267 and 0.255 were obtained for Malay news and tweets, respectively, showing that the proposed XLNet-GRU sentiment model outperforms the BERT sentiment model with lower prediction errors.
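For readers unfamiliar with the three evaluation metrics reported above, the sketch below shows how MAE, RMSE, and adjusted R² are conventionally computed for a regression model. The function names and the example scores are hypothetical and do not come from the study's data. The `num_predictors` argument is also an assumption, since the abstract does not state what value the authors used in the adjustment.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average size of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error: penalises large errors more than MAE does."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def adjusted_r2(y_true, y_pred, num_predictors):
    """R-squared corrected for the number of predictors in the model."""
    n = len(y_true)
    mean_true = sum(y_true) / n
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1 - ss_res / ss_tot
    return 1 - (1 - r2) * (n - 1) / (n - num_predictors - 1)


# Hypothetical annotated and predicted sentiment scores for four texts.
gold = [0.5, -0.5, 0.0, 1.0]
pred = [0.5, 0.0, 0.0, 0.5]

print(mae(gold, pred))             # 0.25
print(round(rmse(gold, pred), 4))  # 0.3536
print(adjusted_r2(gold, pred, 1))
```

Lower MAE and RMSE indicate smaller prediction errors, while a higher adjusted R² indicates that the model explains more of the variance in the annotated scores.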
Conclusion: The proposed model contributes to predicting sentiment on cryptocurrency. This study also introduces two carefully curated Malay corpora, CryptoSentiNews-Malay and CryptoSentiTweets-Malay, which are extracted from news and tweets, respectively. Future work will expand the Malay news and tweet corpora on cryptocurrency-related issues and apply the proposed XLNet Bi-GRU deep learning model to them for greater financial insight.
Keywords: Cryptocurrency, Deep learning model, Malay text, Sentiment analysis, Sentiment regression model
Copyright (c) 2023 The Authors. Published by Universitas Airlangga.
This work is licensed under a Creative Commons Attribution 4.0 International License.
Authors who publish with this journal agree to the following terms:
All accepted papers will be published under a Creative Commons Attribution 4.0 International (CC BY 4.0) License. Authors retain copyright and grant the journal the right of first publication. The CC BY licence lets others Share (copy and redistribute the material in any medium or format) and Adapt (remix, transform, and build upon the material for any purpose, even commercially).