Analysis of Emoticon and Sarcasm Effect on Sentiment Analysis of Indonesian Language on Twitter

Background: Indonesia ranks among the countries with the most active Twitter users in the world. Tweets written by Twitter users vary, from positive to negative responses. These responses can be used by the parties concerned for evaluation. Objective: Public comments contain emoticons and sarcasm, both of which influence the sentiment analysis process. Emoticons are considered to make it easier for people to express their feelings, but many opinion researchers ignore them on the grounds that they can interfere with the sentiment analysis process, while sarcasm is considered to be produced from the results of the sarcasm sentiment analysis in


I. INTRODUCTION
Twitter is a social media platform that anyone can use to express themselves freely, and its user base has grown significantly [1]. There are several reasons why Twitter is a preferred data source. First, many third-party tools are available for exploring Twitter. Second, Twitter has a user-friendly interface that makes it easy to use and adapt to. Furthermore, Twitter works as a real-time search engine, which makes it a reliable research tool and allows faster data analysis. One of the most frequently used forms of data analysis is sentiment analysis.
Sentiment analysis is the process of classifying sentiments expressed in documents to obtain information [2], with the purpose of converting that information into natural language [3]. Sentiment analysis is basically a classification task, but in practice it is not as easy as the usual classification process because it depends on language use [4], which involves ambiguity in the use of words, the absence of intonation in text, and the development of the language itself. In face-to-face communication, sentiments can be inferred from visual signals such as smiles, which in text can be represented with emoticons. Using emoticons in communication is very popular nowadays because they are considered to soften formal communication, make communication more open, and make it easier for people to express their feelings through messages or comments on social media [5]. In the past few years, many approaches to sentiment analysis have been proposed. However, many of them ignore emoticons as features and use only the text, since emoticons are seen as a possible source of interference in the sentiment analysis process [6]. Nevertheless, many comments on social media use emoticons to express positive or negative opinions. Other opinions take the form of innuendo aimed at political figures, product brands, companies, etc., and are often referred to as sarcastic sentences. Sarcasm is an expression whose meaning is the opposite of what is said and that is used to mock or show resentment. Almost all sarcastic sentences take the form of positive sentences that have negative meanings, although a sarcastic sentence with a positive meaning is not impossible [7].
In study [8], the use of emoticons in Dutch-language forum messages was analyzed with a lexicon-based approach, which increased the accuracy of the classification results. Meanwhile, research [9] used three features to detect sarcasm, namely unigrams, negativity, and the number of exclamations. The results showed that the negativity and number-of-exclamations features can increase accuracy by 6% compared with using the unigram feature alone. In this study, tests and analysis were conducted to find out how much emoticons and sarcasm influence the sentiment analysis process, with classification performed using the Naïve Bayes Classifier, Support Vector Machine, and Random Forest Classifier methods.

II. METHODS
This study used the Naïve Bayes Classifier and Support Vector Machine methods to classify the sentiment data into positive, negative, and neutral labels. Afterward, sarcasm detection was carried out using the Random Forest Classifier method, which changed positive data detected as sarcasm to negative labeled data. The sentiment analysis process consisted of four stages, namely data collection, preprocessing, classification, and sarcasm detection.
The test carried out in this study was k-fold cross-validation. It consisted of three testing processes, namely sentiment analysis without sarcasm detection, sarcasm detection on its own, and sentiment analysis with sarcasm detection. In the sentiment analysis process, two test settings were compared, namely test data that utilized emoticons and test data that did not.
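The fold construction behind such a test can be sketched as follows. This is a minimal illustration, not the study's code: the function name `k_fold_indices` is hypothetical, the folds are contiguous and unshuffled (an assumption, since the paper does not say how folds were drawn), and the sample count of 2,072 is the dataset size reported in the Data Collection section.

```python
from typing import Dict, List, Tuple


def k_fold_indices(n_samples: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n_samples-1 into k contiguous folds.

    Returns one (train_indices, test_indices) pair per fold.
    Fold sizes differ by at most one sample.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    base, extra = divmod(n_samples, k)
    folds = []
    start = 0
    for fold in range(k):
        # The first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        folds.append((train, test))
        start += size
    return folds


# Fold sizes for k = 2 .. 10, the range of k reported in the results.
fold_sizes: Dict[int, List[int]] = {
    k: [len(test) for _, test in k_fold_indices(2072, k)] for k in range(2, 11)
}
```

Each sample lands in exactly one test fold, so averaging a metric over the k folds uses every tweet for testing once. In practice the indices would normally be shuffled before splitting.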
The test results were analyzed to ascertain the extent of the effect of using emoticons and sarcasm detection on the sentiment analysis process. The data used were Indonesian-language comments obtained from Twitter. For each classification method, the values of accuracy, precision, recall, and F1 score were also obtained.
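The four quality measures can be computed as in the sketch below. The paper does not state how precision, recall, and F1 score were averaged over the three classes, so macro averaging (the unweighted mean of the per-class values) is an assumption here, and `macro_scores` is a hypothetical helper name.

```python
from typing import Dict, List, Sequence


def macro_scores(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Accuracy plus macro-averaged precision, recall, and F1 score."""
    labels = sorted(set(y_true) | set(y_pred))
    precisions: List[float] = []
    recalls: List[float] = []
    f1s: List[float] = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    n = len(labels)
    return {
        "accuracy": accuracy,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


scores = macro_scores(
    ["positive", "positive", "negative", "neutral"],
    ["positive", "negative", "negative", "neutral"],
)
```

With macro averaging a small class counts as much as a large one, which is one way accuracy can sit well above precision and recall on imbalanced data.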
The proposed process in the system consisted of tweet crawling, preprocessing, feature extraction, classification, and sarcasm detection. These steps can be seen in Fig. 1.

A. Data Collection
In the data collection process, the data were obtained from twitter.com by scraping the Twitter search page to retrieve tweets. Data were collected from January 2018 to March 2018 and consist of Indonesian-language tweets about "public services". The tweet data are grouped according to the group name created when searching for data, based on product brands, companies, and governments. The tweets obtained are then given a positive, neutral, or negative label for sentiment classification, and tweets with a positive label are additionally labeled sarcasm or not sarcasm for the sarcasm detection process.
The scraping process produced a total of 2,281 tweets. After preprocessing, the data were divided into two parts. The first part is the emoticon data, in which emoticons are converted, with 2,072 tweets. The second part is the non-emoticon data, in which emoticons are ignored, also with 2,072 tweets. The data cover three sentiment polarities, namely positive, negative, and neutral. In the emoticon data, 1,023 tweets are labeled positive, 587 negative, and 462 neutral, while in the non-emoticon data 557 are labeled positive, 406 negative, and 1,109 neutral.
The data were labeled manually by expert respondents. Examples of the results of the data retrieval stage can be seen in Table 1.

B. Preprocessing
In text mining, the text of the documents must be prepared before it can be used in the main process. The process of preparing the text of a document or raw dataset is called text preprocessing. Text preprocessing serves to convert unstructured or arbitrary text data into structured data [10].
The preprocessing stage prepares the tweet data so that classification is easier and more accurate; it is the initial stage carried out before data mining in order to obtain better data quality [11]. The preprocessing process has five stages, namely tokenizing, case folding, stopword removal, emoticon conversion, and slang word normalization.
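The five stages can be illustrated with the short sketch below. Everything in it is an assumption made for illustration: the tiny `STOPWORDS`, `SLANG`, and `EMOTICONS` tables stand in for the full Indonesian resources a real run would need, the placeholder tokens `emot_positif` and `emot_negatif` are invented, and the `use_emoticons` switch mimics the two datasets (emoticons converted versus emoticons ignored).

```python
from typing import Dict, List, Set

# Hypothetical miniature resources; real runs would load complete lists.
STOPWORDS: Set[str] = {"yang", "di", "dan", "ini"}
SLANG: Dict[str, str] = {"gak": "tidak", "bgt": "banget"}
EMOTICONS: Dict[str, str] = {
    ":)": "emot_positif",
    ":-)": "emot_positif",
    ":(": "emot_negatif",
}


def preprocess(tweet: str, use_emoticons: bool = True) -> List[str]:
    """Tokenize, fold case, map slang, drop stopwords, handle emoticons."""
    tokens: List[str] = []
    for raw in tweet.split():                    # tokenizing on whitespace
        if raw in EMOTICONS:
            if use_emoticons:
                tokens.append(EMOTICONS[raw])    # convert emoticon to a token
            continue                             # otherwise ignore it
        word = raw.lower().strip(".,!?\"'")      # case folding, edge punctuation
        word = SLANG.get(word, word)             # slang word normalization
        if word and word not in STOPWORDS:       # stopword removal
            tokens.append(word)
    return tokens


with_emoticons = preprocess("Pelayanan ini bagus bgt :)")
without_emoticons = preprocess("Pelayanan ini bagus bgt :)", use_emoticons=False)
```

Emoticons are checked before case folding so that forms such as `:D` are not altered by lowercasing.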

C. Emoticon
An emoticon is a symbol or a combination of symbols used to depict facial expressions and feelings in messages or writing. As a form of paralanguage, emoticons are used to express positive, negative, and neutral feelings on Twitter, Facebook, and other communication and social media [12]. Some examples of emoticons are :-) >:] :) :

D. Classification
The Naïve Bayes Classifier is a simple classification algorithm that looks for the highest probability value by applying Bayes' theorem, as used in machine learning [13]. Document classification has two stages. The first stage is training on documents whose categories are known. The second stage is classifying documents whose categories are not yet known [14]. During classification, the algorithm looks for the highest probability over all categories for the tested document (V_MAP), referred to as equation (1).
Support Vector Machine (SVM) has proven to be a useful learning machine, especially for multiclass data classification. Basically, there are two kinds of approaches to multiclass SVM. The first processes all data directly in one optimization formulation. The second decomposes the multiclass problem into a series of binary SVMs. The idea behind the latter is to build a multiclass classifier from binary classifiers with the one-against-one technique [15]. With this technique, k(k − 1)/2 binary classification models are built, where k is the number of classes. Each classification model is trained on data from two classes. There are several methods for carrying out testing once all k(k − 1)/2 classification models have been built. One of them is the voting method [16]. Training in this study used three binary SVMs, and their use in classifying new data can be seen in Fig. 2.

Fig. 2. Examples of classification with the SVM method

E. Sarcasm Detection
Sarcasm detection is performed on tweets that have a positive classification result in sentiment analysis. The features extracted for sarcasm detection consist of sentiment-related, punctuation-related, lexical and syntactic, and pattern-related features [17]. Sarcasm detection is classified using the Random Forest Classifier.
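For reference, the maximum a posteriori decision rule of the Naïve Bayes Classifier described under D. Classification is conventionally written as below. This is the textbook form, given here as an assumption about what equation (1) expresses; the symbols v_j (a category in the set V), a_i (the i-th word or feature of the document), and n (the number of features) are the usual notation and may differ from the authors' own.

```latex
V_{MAP} = \underset{v_j \in V}{\arg\max} \; P(v_j) \prod_{i=1}^{n} P(a_i \mid v_j)
```

Here P(v_j) is the prior probability of category v_j estimated from the training documents, and P(a_i | v_j) is the probability of feature a_i given that category.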
In the sentiment-related feature set, 10 features are extracted. The first feature is the number of opinion words, that is, words with a positive or negative tendency. The second, third, and fourth features are the numbers of positive, negative, and sarcasm emoticons. The fifth and sixth features concern the hashtags in a tweet: the numbers of positive and negative hashtags are counted and used as features. The seventh to tenth features describe the contrast between components, that is, the presence of both negative and positive components in one tweet: first, between words; second, between hashtags; third, between words and hashtags; and finally, between words and emoticons.
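A possible extraction of these 10 features is sketched below. The lexicons and emoticon sets are hypothetical miniatures, hashtag polarity is judged by looking the hashtag text up in the word lexicons (an assumption), and the first feature is taken to be the total count of opinion words; the paper's own resources and exact definitions may differ.

```python
from typing import List, Union

# Hypothetical miniature lexicons, for illustration only.
POSITIVE_WORDS = {"bagus", "cepat", "puas"}
NEGATIVE_WORDS = {"lambat", "buruk", "kecewa"}
POSITIVE_EMOTICONS = {":)", ":-)", ":D"}
NEGATIVE_EMOTICONS = {":(", ":-("}
SARCASM_EMOTICONS = {":P", ";)"}


def contrast(pos_a: int, neg_a: int, pos_b: int, neg_b: int) -> bool:
    """True when a positive item of one kind meets a negative item of the other."""
    return (pos_a > 0 and neg_b > 0) or (neg_a > 0 and pos_b > 0)


def sentiment_features(tokens: List[str]) -> List[Union[int, bool]]:
    words = [t.lower() for t in tokens if not t.startswith("#")]
    tags = [t[1:].lower() for t in tokens if t.startswith("#")]
    pos_w = sum(w in POSITIVE_WORDS for w in words)
    neg_w = sum(w in NEGATIVE_WORDS for w in words)
    pos_e = sum(t in POSITIVE_EMOTICONS for t in tokens)
    neg_e = sum(t in NEGATIVE_EMOTICONS for t in tokens)
    sar_e = sum(t in SARCASM_EMOTICONS for t in tokens)
    pos_h = sum(h in POSITIVE_WORDS for h in tags)
    neg_h = sum(h in NEGATIVE_WORDS for h in tags)
    return [
        pos_w + neg_w,                         # 1: opinion words
        pos_e, neg_e, sar_e,                   # 2-4: emoticon counts
        pos_h, neg_h,                          # 5-6: hashtag counts
        contrast(pos_w, neg_w, pos_w, neg_w),  # 7: word vs word
        contrast(pos_h, neg_h, pos_h, neg_h),  # 8: hashtag vs hashtag
        contrast(pos_w, neg_w, pos_h, neg_h),  # 9: word vs hashtag
        contrast(pos_w, neg_w, pos_e, neg_e),  # 10: word vs emoticon
    ]


example = sentiment_features(["pelayanan", "bagus", "#lambat", ":("])
```

In the example a positive word co-occurs with a negative hashtag and a negative emoticon, so the last two contrast features are set.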
In the punctuation-related feature set, 6 features are extracted. The first feature is the number of exclamation marks, the second the number of question marks, the third the number of periods, the fourth the number of letters, the fifth the number of words, and the last the number of repetitions that occur more than 2 times within a word.
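The six counts can be sketched as follows. The sixth feature is interpreted here as the number of words that contain a letter repeated more than twice in a row (for example "bagusss"); that reading, the ASCII-only letter match, and the function name `punctuation_features` are assumptions.

```python
import re
from typing import List

# A letter followed by at least two more copies of itself.
REPEATED_LETTER = re.compile(r"([A-Za-z])\1{2,}")


def punctuation_features(tweet: str) -> List[int]:
    words = tweet.split()
    return [
        tweet.count("!"),                                     # exclamation marks
        tweet.count("?"),                                     # question marks
        tweet.count("."),                                     # periods
        sum(ch.isalpha() for ch in tweet),                    # letters
        len(words),                                           # words
        sum(bool(REPEATED_LETTER.search(w)) for w in words),  # elongated words
    ]


example = punctuation_features("Bagusss banget!!! Kok lama???")
```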
In the lexical and syntactic feature set, 4 features are extracted [18]. The first is the count of laughter. Laughter is a word that expresses joy, pleasure, or amusement in the writing of a tweet. Examples of laughter often found in Indonesian-language tweets are hehe, haha, hihi, hoho, wkwk, wkowko, wkewke, and LOL. The second is the count of exclamations. Exclamations are words that express one's feelings and intentions; they are used to show admiration, wonder, disgust, or fear of something. Examples of exclamations that can be recognized are huh, uh, uh, oh, aw, iw, uw, ew, ow, waw, wow, wah, woh, idih, dih, aih, duh, huh, hm, um, ups, cie, loh, pft, and o. The third is the existence of rarely used words. Rarely used words are the words in the tweet data that appear only once. The fourth is the existence of a general pattern of sarcastic sentences. These general patterns are obtained by extracting all possible sentence patterns with lengths varying from 3 to 6 words. Words whose POS tag is CD, FW, NN, PRP, SYM, UH, MD, RB, WDT, or WP are replaced by their POS tag in the tweet. Patterns that appear at least 2 times are selected and checked manually to eliminate irrelevant ones. Patterns that also appear in non-sarcastic tweets are deleted as well.
The pattern-related feature set is extracted in the same way as the general pattern of sarcasm: the sarcasm patterns are listed and then counted in groups according to pattern length.
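The first three lexical features can be sketched as below. The word lists are taken from the examples in the text; the assumptions are that tokens are already lowercased, that a word is "rarely used" when its count over the whole corpus is exactly one, and that the third feature is a yes/no flag. The helper names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

LAUGHTER = {"hehe", "haha", "hihi", "hoho", "wkwk", "wkowko", "wkewke", "lol"}
EXCLAMATIONS = {
    "huh", "uh", "oh", "aw", "iw", "uw", "ew", "ow", "waw", "wow", "wah",
    "woh", "idih", "dih", "aih", "duh", "hm", "um", "ups", "cie", "loh",
    "pft", "o",
}


def corpus_counts(corpus: Sequence[Sequence[str]]) -> Counter:
    """Count how often each token occurs over all tweets."""
    return Counter(token for tweet in corpus for token in tweet)


def lexical_features(tokens: Sequence[str], counts: Counter) -> Tuple[int, int, bool]:
    laughter = sum(t in LAUGHTER for t in tokens)
    exclamations = sum(t in EXCLAMATIONS for t in tokens)
    has_rare_word = any(counts[t] == 1 for t in tokens)
    return laughter, exclamations, has_rare_word


corpus: List[List[str]] = [
    ["wah", "bagus", "haha"],
    ["bagus", "haha", "wah"],
    ["wow", "mantap", "haha"],
]
counts = corpus_counts(corpus)
first = lexical_features(corpus[0], counts)
third = lexical_features(corpus[2], counts)
```

The fourth feature, the general sarcasm pattern, depends on POS tags and is not covered by this sketch.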
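The pattern extraction can be sketched as follows. The input is assumed to be already POS-tagged, as (word, tag) pairs produced by some external tagger; a pattern is counted once per tweet, so "appears at least 2 times" is read as "appears in at least two sarcastic tweets"; and the manual relevance check is left out. All function names are hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Sequence, Set, Tuple

Tagged = Sequence[Tuple[str, str]]   # (word, POS tag) pairs
Pattern = Tuple[str, ...]

# Tags whose words are replaced by the tag itself, as listed in the text.
REPLACED_TAGS = {"CD", "FW", "NN", "PRP", "SYM", "UH", "MD", "RB", "WDT", "WP"}


def abstract(tweet: Tagged) -> List[str]:
    """Replace words carrying a listed tag by that tag; keep other words."""
    return [tag if tag in REPLACED_TAGS else word.lower() for word, tag in tweet]


def ngrams(seq: List[str], shortest: int = 3, longest: int = 6) -> Set[Pattern]:
    """All contiguous sub-sequences of length 3 to 6."""
    return {
        tuple(seq[i:i + n])
        for n in range(shortest, longest + 1)
        for i in range(len(seq) - n + 1)
    }


def sarcasm_patterns(
    sarcastic: Sequence[Tagged],
    non_sarcastic: Sequence[Tagged],
    min_count: int = 2,
) -> Dict[int, Set[Pattern]]:
    """Patterns frequent in sarcastic tweets and absent elsewhere, by length."""
    counts = Counter(p for t in sarcastic for p in ngrams(abstract(t)))
    elsewhere = {p for t in non_sarcastic for p in ngrams(abstract(t))}
    by_length: Dict[int, Set[Pattern]] = defaultdict(set)
    for pattern, count in counts.items():
        if count >= min_count and pattern not in elsewhere:
            by_length[len(pattern)].add(pattern)
    return dict(by_length)


sarcastic = [
    [("bagus", "JJ"), ("sekali", "RB"), ("pelayanan", "NN")],
    [("bagus", "JJ"), ("banget", "RB"), ("layanan", "NN")],
]
plain = [[("buruk", "JJ"), ("sekali", "RB"), ("pelayanan", "NN")]]
patterns = sarcasm_patterns(sarcastic, plain)
```

Grouping the surviving patterns by length gives one group per pattern length, which matches the idea of counting sarcasm patterns per length as pattern-related features.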

F. Sarcasm Classification
Sarcasm is classified using the Random Forest Classifier, Naïve Bayes Classifier, and Support Vector Machine by utilizing the 23 features extracted previously. Positive tweets classified as sarcasm are changed to negative. Examples of tweets and the extracted sarcasm detection features, with manually determined label values, can be seen in Table 2.
Example feature vector from Table 2 (sarcasm label: no): [-1, 0, 1, 0, 0, 0, False, False, False, False, 0, 0, 0, 0, 0, False, False, False, 0, 0, 1.03, 0.61, 0.66]
The decision trees are built from the extracted sarcasm detection features; the forest consists of several trees that use the 23 features. Each tree determines whether a tweet is a sarcastic sentence or not. Examples of vote collection can be seen in Table 3.
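The vote collection and the relabeling step can be sketched together. The trees themselves are not modeled here; their outputs are passed in as a list of booleans. Treating a tie as "not sarcasm" is an assumption, and the function names are hypothetical.

```python
from typing import List


def majority_vote(tree_votes: List[bool]) -> bool:
    """True (sarcasm) when more than half of the trees vote sarcasm."""
    return sum(tree_votes) > len(tree_votes) / 2


def final_label(sentiment: str, tree_votes: List[bool]) -> str:
    """Flip a positive tweet to negative when the forest detects sarcasm."""
    if sentiment == "positive" and majority_vote(tree_votes):
        return "negative"
    return sentiment


flipped = final_label("positive", [True, True, False])
kept = final_label("positive", [True, False, False])
```

Only positive tweets are affected, which matches the rule that sarcasm detection runs on tweets with a positive classification result.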

A. Results of testing sentiment analysis
The sentiment analysis test without sarcasm detection used two methods, namely the Naïve Bayes Classifier and Support Vector Machine. The test used 2-fold to 10-fold cross-validation and compared the effect of using emoticons against not using them. The resulting quality measures of the sentiment analysis test can be seen in Table 4.
From Table 4, it can be seen that the Naïve Bayes Classifier method with emoticons gives its best accuracy, precision, recall, and F1 score under 10-fold cross-validation, with an accuracy of 74.57%, a precision of 47.30%, a recall of 53.21%, and an F1 score of 50.00%.
The Naïve Bayes Classifier method without emoticons gives its highest accuracy under 6-fold cross-validation, while its best precision, recall, and F1 score are obtained under 10-fold cross-validation, with an accuracy of 52.91%, a precision of 50.32%, a recall of 46.91%, and an F1 score of 46.75%.
From Table 5, it can be seen that the Support Vector Machine method with emoticons gives its highest accuracy under 3-fold cross-validation, while its best precision, recall, and F1 score are obtained under 10-fold cross-validation, with an accuracy of 77.79%, a precision of 64.01%, a recall of 62.45%, and an F1 score of 62.32%.
The Support Vector Machine method without emoticons gives its highest accuracy under 9-fold cross-validation, while its best precision, recall, and F1 score are obtained under 7-fold cross-validation, with an accuracy of 49.81%, a precision of 49.86%, a recall of 46.68%, and an F1 score of 44.02%.

Fig. 3. A comparison chart of quality measures in sentiment analysis without detection of sarcasm
Fig. 3 shows that, without emoticons, the Naïve Bayes method has the higher average accuracy, 51.56%, compared with the SVM method, while with emoticons the highest average accuracy is achieved by the SVM method, at 77.22%.

B. Sarcasm detection test results
The results of the sarcasm detection tests can be seen in Table 6. Sarcasm detection was tested using the Random Forest Classifier, Naïve Bayes Classifier, and Support Vector Machine methods. Four feature sets were extracted, namely the sentiment-related, punctuation-related, lexical and syntactic, and pattern-related features, from which the quality measures of the sarcasm detection tests were obtained.
From Table 6, it can be seen that the sarcasm detection test with the Naïve Bayes Classifier method gives its best accuracy, precision, recall, and F1 score under 7-fold cross-validation, with an accuracy of 55.30%, a precision of 63.13%, a recall of 64.68%, and an F1 score of 54.97%.
The sarcasm detection test with the Support Vector Machine method gives its best accuracy, precision, recall, and F1 score under 7-fold cross-validation, with an accuracy of 61.13%, a precision of 60.45%, a recall of 62.69%, and an F1 score of 58.80%.
The sarcasm detection test with the Random Forest Classifier method gives its highest accuracy, 62.55%, and an F1 score of 58.80% under 3-fold cross-validation. Its highest precision and recall were obtained under 7-fold cross-validation, with a precision of 59.53% and a recall of 61.56%.
[...] method, with a precision of 62.82% and a recall of 64.23%, and for the F1 score a value of 57.78% with the Support Vector Machine method.

C. The results of testing sentiment analysis with the detection of sarcasm
In testing sentiment analysis with sarcasm detection, tweets that have been labeled positive, negative, or neutral are used. Tweets labeled positive are then further labeled as sarcasm or not sarcasm, after which the classification process is carried out. The quality measures of the sentiment analysis test with sarcasm detection can be seen in Table 7.
From Table 7, it can be seen that the Naïve Bayes Classifier method with emoticons gives its best accuracy, precision, recall, and F1 score under 10-fold cross-validation, with an accuracy of 61.85%, a precision of 41.50%, a recall of 46.20%, and an F1 score of 41.24%.
The Naïve Bayes Classifier method without emoticons gives its highest accuracy under 4-fold and 6-fold cross-validation, while its best precision, recall, and F1 score are obtained under 10-fold cross-validation, with an accuracy of 47.81%, a precision of 45.56%, a recall of 44.52%, and an F1 score of 43.74%.
From Table 8, it can be seen that the Support Vector Machine method with emoticons gives its highest accuracy under 2-fold cross-validation and its highest precision under 10-fold cross-validation, while its best recall and F1 score are obtained under 9-fold cross-validation, with an accuracy of 48.14%, a precision of 45.56%, a recall of 44.52%, and an F1 score of 43.74%.
The Support Vector Machine method without emoticons gives its highest accuracy under 7-fold cross-validation, while its best precision, recall, and F1 score are obtained under 10-fold cross-validation, with an accuracy of 47.49%, a precision of 48.94%, a recall of 44.74%, and an F1 score of 41.27%.
Fig. 5 shows that the SVM method with emoticons has the highest average accuracy, 62.25%, while the highest average accuracy without emoticons is obtained by the Naïve Bayes Classifier method, at 47.45%.
The sentiment classification stage uses the Naïve Bayes Classifier and Support Vector Machine methods. Tweets are classified into three classes: positive, negative, and neutral [19]. Before classification, a feature extraction process consisting of unigram, POS tag, TF, and TF-IDF is carried out. The final value of the classification takes a form in which the largest value determines the class of a tweet; before this final value is obtained, training is carried out on the training data, which is then used to determine the class of the test data. The SVM method uses votes over the results of the two-class classifiers to determine the sentiment class. Classification results with a positive label are then passed to the sarcasm detection process, because sarcasm in this study refers to sentences that are positive in form but negative in meaning, which are often used to insinuate about a person, product brand, company, etc.
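The one-against-one voting that the SVM stage relies on can be sketched as follows. The binary models are represented by plain callables that return the winning class of their pair, standing in for trained binary SVMs; the names `class_pairs` and `one_vs_one_predict` are hypothetical, and ties are broken in favor of the class that received its first vote earliest, which is an assumption.

```python
from collections import Counter
from itertools import combinations
from typing import Callable, Dict, List, Sequence, Tuple

Pair = Tuple[str, str]
BinaryModel = Callable[[object], str]


def class_pairs(classes: Sequence[str]) -> List[Pair]:
    """All k(k - 1)/2 unordered class pairs, one binary model each."""
    return list(combinations(classes, 2))


def one_vs_one_predict(x: object, models: Dict[Pair, BinaryModel]) -> str:
    """Let every binary model vote and return the class with the most votes."""
    votes = Counter(model(x) for model in models.values())
    return votes.most_common(1)[0][0]


classes = ["positive", "negative", "neutral"]
pairs = class_pairs(classes)

# Stub models with fixed answers, in place of trained binary SVMs.
stub_models: Dict[Pair, BinaryModel] = {
    ("positive", "negative"): lambda x: "positive",
    ("positive", "neutral"): lambda x: "positive",
    ("negative", "neutral"): lambda x: "neutral",
}
predicted = one_vs_one_predict("contoh tweet", stub_models)
```

With three sentiment classes this gives three binary models, which agrees with the three binary SVMs mentioned for training.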
The sarcasm detection stage uses the Random Forest Classifier, Support Vector Machine, and Naïve Bayes Classifier methods. The Random Forest Classifier contains a set of decision trees. Each tree uses different random features to cast a vote, and the class with the most votes is used as the result of the classification into sarcasm or not sarcasm [20].
Testing is done by measuring the accuracy of the sentiment results that the system produces for public opinion. From the test results, it can be seen which parameters produce the best accuracy.
This study produced a system to analyse the influence of emoticons and sarcasm on sentiment analysis. The test results using k-fold cross-validation indicate that the sentiment analysis system with emoticons produces better accuracy than the system without emoticons.
The use of emoticons without sarcasm detection increases the accuracy of the sentiment analysis process by a slightly higher value than the use of emoticons with sarcasm detection, which also increases accuracy. The difference between the amounts of sarcasm and non-sarcasm data is the cause of this difference in the accuracy gain, and the difference among the amounts of positive, negative, and neutral data is also one of the reasons the values are below 70%.

V. CONCLUSIONS
Based on the research that has been done, it can be concluded that the use of emoticons without sarcasm detection can increase the accuracy of the sentiment analysis process by 25.30%, and the use of emoticons with sarcasm detection can increase it by 14.80%. The best method for sentiment analysis with emoticons is the SVM method, with a value of 69.74%, and without emoticons it is the Naïve Bayes Classifier method, with a value of 49.51%.
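These figures can be related to the average accuracies quoted for Fig. 3 and Fig. 5. The sketch below rests on an assumption the paper does not state: that each gain is the difference between the best average accuracy with emoticons and the best without, and that each "best method" value is the mean over the two test settings. Under that reading, 14.80, 69.74, and 49.51 are reproduced, while the corresponding difference without sarcasm detection comes to 25.66 rather than 25.30, so that figure presumably derives from values not quoted in the text.

```python
from decimal import Decimal, ROUND_HALF_UP


def pct(value: Decimal) -> Decimal:
    """Round to two decimal places, halves rounded up."""
    return value.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Average accuracies quoted in the discussion of Fig. 3 and Fig. 5.
svm_emoticons_plain = Decimal("77.22")    # SVM, emoticons, no sarcasm detection
nb_no_emoticons_plain = Decimal("51.56")  # NB, no emoticons, no sarcasm detection
svm_emoticons_sarcasm = Decimal("62.25")  # SVM, emoticons, sarcasm detection
nb_no_emoticons_sarcasm = Decimal("47.45")  # NB, no emoticons, sarcasm detection

gain_with_sarcasm = pct(svm_emoticons_sarcasm - nb_no_emoticons_sarcasm)
gain_without_sarcasm = pct(svm_emoticons_plain - nb_no_emoticons_plain)
best_with_emoticons = pct((svm_emoticons_plain + svm_emoticons_sarcasm) / 2)
best_without_emoticons = pct((nb_no_emoticons_plain + nb_no_emoticons_sarcasm) / 2)
```

`Decimal` is used instead of floats so that values ending in 5 at the third decimal place round predictably.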

Fig. 4. Graph of comparison of quality measures in the detection of sarcasm

Fig. 5. Graph of comparison of quality measures in sentiment analysis with detection of sarcasm

TABLE 4. THE MEASURE OF THE QUALITY OF SENTIMENT ANALYSIS USING THE NAÏVE BAYES CLASSIFIER METHOD

TABLE 5. THE VALUE OF THE QUALITY OF SENTIMENT ANALYSIS USING THE SVM METHOD

TABLE 7. THE MEASURE OF THE QUALITY OF SENTIMENT ANALYSIS WITH THE DETECTION OF SARCASM USING THE NAÏVE BAYES CLASSIFIER METHOD

TABLE 8. THE MEASURE OF THE QUALITY OF SENTIMENT ANALYSIS WITH THE DETECTION OF SARCASM USING THE SVM METHOD