AU2021104828A4 - Sarcasm in Twitter -A C-RNN Approach - Google Patents

Sarcasm in Twitter -A C-RNN Approach

Info

Publication number
AU2021104828A4
Authority
AU
Australia
Prior art keywords
sarcastic
sarcasm
dataset
news
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Revoked
Application number
AU2021104828A
Inventor
Samir Kumar Bandyopadhyay
Amiya Bhaumik
Shawni Dutta
Sandeep Poddar
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Lincoln University College
Original Assignee
Lincoln University College
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Lincoln University College filed Critical Lincoln University College
Priority to AU2021104828A priority Critical patent/AU2021104828A4/en
Application granted granted Critical
Publication of AU2021104828A4 publication Critical patent/AU2021104828A4/en
Revoked legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q50/00 Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
    • G06Q50/01 Social networking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Computational Linguistics (AREA)
  • Business, Economics & Management (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Tourism & Hospitality (AREA)
  • Primary Health Care (AREA)
  • Human Resources & Organizations (AREA)
  • Marketing (AREA)
  • Economics (AREA)
  • Strategic Management (AREA)
  • General Business, Economics & Management (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Medical Informatics (AREA)
  • Machine Translation (AREA)

Abstract

Sarcasm is a form of speech which transforms the verbatim meaning of a sentence into its antonym. Sarcasm identification in social media is a crucial facet of the sentiment analysis process, since it deals with texts whose polarity is completely opposite to their utterance. Detecting sarcasm in online reviews helps in obtaining clarity about the opinions on a particular product, which in turn improves the efficiency of a recommender system. Sarcasm is a way of expressing negative feelings using positive words and phrases, or vice-versa. For example, "I just love to work in this romantic weather #sarcasm" utters negative feelings using positive words. The intention of the approach is to use a deep-learning-based framework in order to extract sarcastic clues automatically from text data. In this context, a Twitter news dataset is exploited to recognize sarcasm. A Convolutional-Recurrent Neural Network (C-RNN) based model is proposed to enable automatic discovery of sarcastic patterns. The present research targets sarcastic clue detection from a news corpus by discarding the need for a manual feature engineering task. The dataset consists of 26709 news headlines, among which 11724 posts are sarcastic and the remaining 14985 are non-sarcastic. The words present in the corpus are transformed into lower case during pre-processing. The selection of word frequency allows the vectorization of a text body, by turning each text either into a vector where the coefficient of each token can be binary, based on word count/tf-idf, or into a sequence of integers where each integer is the index of a token in a dictionary. Next, this tokenizer is fitted on the corpus of tweets and a feature vector is obtained. That feature vector is then fed into the proposed classifier model. The produced tokenized vector is partitioned into training and testing datasets. Table 1 defines the training and testing dataset sizes and the numbers of sarcastic and non-sarcastic posts. The classifier learns from the training dataset, which is given as input in terms of extracted feature vectors. Later, sarcastic pattern prediction results are retrieved using the testing dataset. This invention is environment- and user-friendly.

Description

[Figure 2: (a) model loss and (b) model accuracy plotted against the training epoch]
Title of Invention: Sarcasm in Twitter -A C-RNN Approach
Sarcasm is a sardonic comment coated in humour. Sarcasm has generally been used to create inconsistencies and uncertainties in the minds of listeners while being derisive of them or someone else. Sarcasm employs contradiction in order to keep the audience guessing about the true intentions of the speaker. Sarcasm is generally accompanied by a change in tone, body language and facial expressions while speaking, which makes it easier to detect in spoken communication. In text, however, these indicators are absent. Sarcasm detection in texts is done on the basis of contextual information, lexical structures, and use of grammar. This makes sarcasm detection in texts an interesting task, thereby explaining the immense research interest in it. Sarcasm is a way of expressing negative feelings using positive words and phrases, or vice-versa. For example, "You are a really smart boy, Sheldon #Sarcasm" utters negative feelings using positive words. Sarcasm is often used as a tool to make jokes, be humorous, or to criticize and make remarks about any product, individual or proceedings. Different authors have given different definitions of sarcasm. According to one definition, situational differences between the text and its context are often regarded as sarcasm. Another definition, on the other hand, describes sarcasm as a pointed and satirical or ironic exclamation designed to cut or give pain. Sarcasm has also been described as a mode of satirical wit depending for its effect on acrimonious, corrosive, and often tongue-in-cheek language that is usually directed against an individual.
The approach is to recognize sarcastic patterns from news headlines posted on the Twitter social media platform. It focuses on discovering sarcastic patterns from these data. Deep Learning (DL) techniques are utilized while analyzing and inferring sarcasm from tweets. DL techniques are beneficial since they simulate an automated feature extraction method, which reduces the burden of the manual processing step. A DL technique exemplifies the use of a neural network model which identifies underlying hidden patterns in the data. Deep neural networks are an improved version of traditional neural networks in the sense that a DNN allows stacking of multiple hidden layers between the input and output layers. The presence of multiple hidden layers allows learning of features in numerous ways. A Recurrent Neural Network (RNN) and a Convolutional Neural Network (CNN), both of which follow the deep neural network model, are employed here. Long short-term memory (LSTM) is a kind of RNN and it is exploited in this research. The proposed C-RNN method consists of a convolutional layer and LSTM layers: the convolutional layer and bi-directional LSTM layers are put into a single entity. This method is applied on a large corpus of Twitter data in order to obtain sarcastic patterns from the dataset.
Convolutional Neural Networks (CNNs) are inspired by the working of the human brain, mainly the visual cortex. CNNs are shown to require fewer parameters compared to their counterparts. The convolution layer in a CNN performs convolutions instead of matrix multiplication. Parameter sharing is achieved by tying the weights of different units. Sparse interaction is achieved by having the "kernel" size smaller than the input image. A CNN performs three steps: multiple convolutions that generate linear activations, application of a nonlinear function on the linear activations, and finally a pooling function that modifies the output of a particular location in the net based on its neighbouring values. Examples of pooling functions include max-pooling, min-pooling and average-pooling. The target is to detect whether a given post is sarcastic or not. For this purpose, news tweets are collected, among which 11724 posts are sarcastic and the remaining 14985 are non-sarcastic. Once the data collection is done, a tokenization method is applied as a pre-processing technique on the News Headlines Dataset for Sarcasm Detection. The dataset consists of 26709 headlines. The words present in the corpus are transformed into lower case during pre-processing; this is called tokenization of the sentence. Next, this tokenizer is fitted on the corpus of tweets and a feature vector is obtained. That feature vector is then fed into the proposed classifier model. The produced tokenized vector is partitioned into training and testing datasets. The classifier learns from the training dataset, which is given as input in terms of extracted feature vectors. Later, sarcastic pattern prediction results are retrieved using the testing dataset. After obtaining the pre-processed dataset, it needs to be analyzed for sarcastic pattern identification. To accomplish this objective, a classifier model needs to be employed. The classifier model associates the input dataset with a target class after discovering hidden relationships in a large corpus. The approach utilizes a deep learning framework for implementing the classifier model. An RNN and the convolutional layer of a CNN are the main components of the classifier model. The model consists of one embedding layer, a 1-dimensional convolutional layer, two bi-directional LSTM layers and finally two fully connected layers, respectively. The algorithm is shown below:
Algorithm
The presented research targets sarcastic clue detection from the news corpus by discarding the need for a manual feature engineering task. Our research focuses on automatic sarcastic clue detection using deep learning methodologies because of their self-adaptive nature. The proposed dissimilar neural network model can handle and analyze the news content by itself as well as provide insight into the sarcastic patterns present in the news contents.
Step 1: Collect the dataset.
Step 2: Apply a tokenization method on the collected dataset as pre-processing techniques.
Step 3: Tokenization method creates a vector of the corresponding text data.
Step 4: The produced tokenized vector is partitioned into training and testing dataset.
Step 5: The tokenizer is fitted on the corpus of tweets and a feature vector is obtained. This feature vector is fed into the proposed classifier model.
Step 6.a: Create a classifier model that hybridizes the convolutional layers and LSTM layers
Step 6.b: Compile these layers as a single entity.
Step 6.c: Train the model using the training dataset and obtain the final prediction results on the testing dataset.
Step 7: Evaluate the prediction performance in terms of acquired loss and accuracy.
Step 8: End
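The pre-processing of Steps 2 to 5 can be illustrated with a short, hedged sketch in Python using the Keras text utilities. The vocabulary size of 10,000, the sequence length of 40 and the 25,000/1,709 split are taken from the description above; the file path, field names and variable names are illustrative assumptions, not part of the invention.

```python
# Hedged sketch of Steps 2-5: tokenization, vectorization and train/test split.
# The dataset path and JSON field names below are assumptions for illustration.
import json
import numpy as np
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences

headlines, labels = [], []
with open("Sarcasm_Headlines_Dataset.json") as f:          # hypothetical path
    for line in f:
        record = json.loads(line)
        headlines.append(record["headline"].lower())       # lower-case the corpus
        labels.append(record["is_sarcastic"])

# Steps 2/3: fit a tokenizer on the corpus and turn each headline into a
# sequence of integer token indices (vocabulary limited to 10,000 words).
tokenizer = Tokenizer(num_words=10000, oov_token="<OOV>")
tokenizer.fit_on_texts(headlines)
sequences = tokenizer.texts_to_sequences(headlines)

# Pad/truncate every sequence to length 40 so it matches the embedding input.
features = pad_sequences(sequences, maxlen=40, padding="post", truncating="post")
labels = np.array(labels)

# Step 4: partition the tokenized vectors into training (25,000) and testing
# (1,709) sets; a simple partition is shown, a shuffled split may be preferable.
x_train, y_train = features[:25000], labels[:25000]
x_test, y_test = features[25000:], labels[25000:]
```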
Table 1. Distribution of Dataset
Number of Sarcastic tweets | Number of Non-Sarcastic tweets | Number of tweets in Training dataset | Number of tweets in Testing dataset
11724 | 14985 | 25,000 | 1,709
After obtaining the pre-processed dataset, it needs to be analyzed for sarcastic pattern identification. To accomplish this objective, a classifier model needs to be employed. The classifier model associates the input dataset with a target class after discovering hidden relationships in a large corpus. This work utilizes a deep learning framework for implementing the classifier model. An RNN and the convolutional layer of a CNN are the main components of the classifier model. The model consists of one embedding layer, a 1-dimensional convolutional layer, two bi-directional LSTM layers and finally two fully-connected layers, respectively.
• Embedding Layer: The size of the embedding layer is the same as the vocabulary size, i.e. 10,000. This layer receives an input shape of 40 and its output dimension is 2.
• Convolutional Layer: This layer is stacked next to the embedding layer. It is a 1-dimensional convolutional layer constructed with 32 filters and a kernel size of 3, and it uses relu [21] as the activation function.
• Bidirectional LSTM Layers: Following the 1-dimensional convolutional layer, two bi-directional LSTM layers are stacked into the model, with 64 and 32 learning nodes respectively. Both layers are activated using the relu [21] function.
• Flatten Layer: The last bi-directional LSTM layer produces a 3-dimensional output. The flatten layer accepts this 3-dimensional input and produces a 2-dimensional output, which is given as input to the subsequent fully-connected layers.
• Fully-connected Layers: Two fully-connected layers are added to the model, with 64 and 1 learning nodes respectively, and they are activated using the 'relu' [21] and 'sigmoid' [22] activation functions respectively. The last layer is the output layer of the entire model. For designing the fully connected layers, keras [23] dense layers are employed.
All these layers are compiled using the 'adam' optimizer [24] and binary cross-entropy is used as the training criterion. The model is trained for 5 epochs with a batch size of 32. Once training is completed, the testing dataset is used for obtaining the final prediction results. Table 2 shows the description of the model.
Table 2. Model Description
Layer No. | Layer Name (Layer Type) | Output Shape | # Parameters
1 | embedding (Embedding) | (None, 40, 2) | 20000
2 | conv1d (Conv1D) | (None, 38, 32) | 224
3 | bidirectional (Bidirectional LSTM) | (None, 38, 128) | 49664
4 | bidirectional_1 (Bidirectional LSTM) | (None, 38, 64) | 41216
5 | flatten (Flatten) | (None, 64) | 0
6 | dense (Dense) | (None, 64) | 4160
7 | dense_1 (Dense) | (None, 1) | 65
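A minimal Keras sketch of the layer stack in Table 2 is given below, assuming the hyper-parameters stated in the description (vocabulary size 10,000, input length 40, embedding dimension 2, 32 convolution filters with kernel size 3, bi-directional LSTM layers of 64 and 32 units, and dense layers of 64 and 1 units). Function and variable names are illustrative; exact output shapes and parameter counts can differ slightly depending on whether the second bidirectional LSTM returns its full sequence.

```python
# Hedged sketch of the C-RNN classifier described above (not the authoritative
# implementation): Embedding -> Conv1D -> 2x Bidirectional LSTM -> Flatten -> Dense.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Embedding, Conv1D, Bidirectional, LSTM,
                                     Flatten, Dense)

def build_crnn(vocab_size=10000, seq_len=40, embed_dim=2):
    model = Sequential([
        Embedding(vocab_size, embed_dim, input_length=seq_len),             # (None, 40, 2)
        Conv1D(filters=32, kernel_size=3, activation="relu"),               # (None, 38, 32)
        Bidirectional(LSTM(64, activation="relu", return_sequences=True)),  # (None, 38, 128)
        Bidirectional(LSTM(32, activation="relu", return_sequences=True)),  # (None, 38, 64)
        Flatten(),                                   # 2-D output for the dense layers
        Dense(64, activation="relu"),
        Dense(1, activation="sigmoid"),              # sarcastic / non-sarcastic output
    ])
    # 'adam' optimizer and binary cross-entropy, as stated in the description.
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model

model = build_crnn()
model.summary()
```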
The accuracy and loss obtained during each epoch are shown in Figure 2. As shown in Figure 2, as the number of epochs increases, the accuracy increases and the loss decreases. Finally, an accuracy of 0.9571 and a loss of 0.1189 are reached by the proposed model during the last epoch of the training process. Table 3 shows the exact proportion of loss and accuracy for each epoch. Once the training procedure is completed, the test dataset is fitted to the model; in other words, the testing accuracy and error rate are measured at the end of the 5th epoch. Table 4 shows the testing accuracy and loss acquired by the model. As discussed, the training results exhibited by the C-RNN model show good performance, as the accuracy reaches almost 1.0. This training outcome is justified by the testing results, which show good generalization. Automated sarcasm detection is an interesting field because it self-comprehends the differences between sarcastic and lying patterns, which would not be feasible with a manual recognition process. Detecting sarcasm in social media enables capturing insight into the trend of current public opinion. This work approaches an automated process that discovers unseen sarcastic sentiment in news Twitter posts. The use of a neural network is approached in this study in order to simulate human-brain-like operations, so a DL-based implementation is favoured for this sarcasm detection domain, which is indeed a complex event to identify. This work carries out a combined method that assembles a convolutional layer as well as Bi-LSTM layers into one entity for recognizing hidden sarcastic patterns in tweets. This combined model is adjusted using necessary parameter tuning; fine-tuning these parameters assists in obtaining the best performance. The proposed model is capable of identifying sarcastic tweets with an accuracy of 84.73%. In conclusion, a computerized sarcasm detection system is implemented that is proficient in inferring sarcasm from large databases with promising accuracy and an optimized error rate.
Figure 2. (a) Loss and (b) Accuracy obtained for each epoch during training.
Table 3. Training Time, Loss and Accuracy per Epoch
Epoch Number | # Samples | Time Taken | Loss | Training Set Accuracy
1 | 25000 | 53s | 0.6684 | 0.7604
2 | 25000 | 51s | 0.2643 | 0.8917
3 | 25000 | 52s | 0.1867 | 0.927
4 | 25000 | 54s | 0.141 | 0.9427
5 | 25000 | 54s | 0.1189 | 0.9571
Table 4. Error Rate and Testing Accuracy of the Proposed Method
Proposed Method | Loss | Accuracy
C-RNN | 0.5158 | 84.73%
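The training and evaluation step can be sketched as follows, assuming the `model`, `x_train`, `y_train`, `x_test` and `y_test` objects from the earlier sketches. The 5 epochs and batch size of 32 are taken from the description; the verbosity settings and print formatting are illustrative.

```python
# Hedged sketch of training for 5 epochs (batch size 32) and measuring the
# testing loss and accuracy afterwards, as reported in Tables 3 and 4.
history = model.fit(
    x_train, y_train,
    epochs=5,
    batch_size=32,
    verbose=1,               # prints per-epoch loss/accuracy, cf. Table 3
)

# Evaluate on the held-out testing dataset (loss and accuracy, cf. Table 4).
test_loss, test_accuracy = model.evaluate(x_test, y_test, verbose=0)
print(f"Testing loss: {test_loss:.4f}, testing accuracy: {test_accuracy:.2%}")
```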

Claims (3)

  1. The claims defining the invention are as follows: 1) A deep-learning-based framework is applied in this research in order to extract sarcastic clues automatically and accurately from text data.
  2. 2) A Twitter news dataset is exploited to recognize sarcasm.
  3. 3) A Convolutional-Recurrent Neural Network (C-RNN) based model is proposed. 4) The proposed model consists of two major layers, namely a convolutional layer and Long Short-Term Memory (LSTM) layers; LSTM is known to be a variant of the traditional RNN. 5) Experimental results confirmed sarcastic news detection with a promising accuracy of 84.73%.
AU2021104828A 2021-08-02 2021-08-02 Sarcasm in Twitter -A C-RNN Approach Revoked AU2021104828A4 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2021104828A AU2021104828A4 (en) 2021-08-02 2021-08-02 Sarcasm in Twitter -A C-RNN Approach

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2021104828A AU2021104828A4 (en) 2021-08-02 2021-08-02 Sarcasm in Twitter -A C-RNN Approach

Publications (1)

Publication Number Publication Date
AU2021104828A4 true AU2021104828A4 (en) 2022-05-19

Family

ID=81588696

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2021104828A Revoked AU2021104828A4 (en) 2021-08-02 2021-08-02 Sarcasm in Twitter -A C-RNN Approach

Country Status (1)

Country Link
AU (1) AU2021104828A4 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116882415A (en) * 2023-09-07 2023-10-13 湖南中周至尚信息技术有限公司 Text emotion analysis method and system based on natural language processing
CN116882415B (en) * 2023-09-07 2023-11-24 湖南中周至尚信息技术有限公司 Text emotion analysis method and system based on natural language processing

Similar Documents

Publication Publication Date Title
Du et al. Convolution-based neural attention with applications to sentiment classification
CN107025284A (en) The recognition methods of network comment text emotion tendency and convolutional neural networks model
CN108509411A (en) Semantic analysis and device
Wen et al. Dynamic interactive multiview memory network for emotion recognition in conversation
Xiao et al. Behavioral coding of therapist language in addiction counseling using recurrent neural networks.
CN113297364A (en) Natural language understanding method and device for dialog system
Manik et al. Out-of-Scope Intent Detection on A Knowledge-Based Chatbot.
Singh et al. EmoInHindi: A multi-label emotion and intensity annotated dataset in Hindi for emotion recognition in dialogues
Almagro et al. Speech gestural interpretation by applying word representations in robotics
AU2021104828A4 (en) Sarcasm in Twitter -A C-RNN Approach
Wankerl et al. f2tag—Can Tags be Predicted Using Formulas?
Schicchi et al. Machine learning models for measuring syntax complexity of english text
Anjum et al. Exploring Humor in Natural Language Processing: A Comprehensive Review of JOKER Tasks at CLEF Symposium 2023.
CN106503066A (en) Process Search Results method and apparatus based on artificial intelligence
Yuan [Retracted] A Classroom Emotion Recognition Model Based on a Convolutional Neural Network Speech Emotion Algorithm
Banerjee et al. Relation extraction using multi-encoder lstm network on a distant supervised dataset
Dutta et al. Unfolding sarcasm in twitter using c-rnn approach
Saikh et al. COVIDRead: A large-scale question answering dataset on COVID-19
Islam et al. Bengali social media post sentiment analysis using deep learning and BERT model
Gedela et al. Deep Contextualised Text Representation and Learning for Sarcasm Detection
Fernández-Martínez et al. An approach to intent detection and classification based on attentive recurrent neural networks
Pradhan et al. A multichannel embedding and arithmetic optimized stacked Bi-GRU model with semantic attention to detect emotion over text data
Alqahtani et al. Text-based Sarcasm Detection on Social Networks: A Systematic Review
Xu et al. Affective audio annotation of public speeches with convolutional clustering neural network
Raj Novel Method for Sentiment Analysis in Social Media Data Using Hybrid Deep Learning Model

Legal Events

Date Code Title Description
FGI Letters patent sealed or granted (innovation patent)
MAK Offer to surrender letters patent
MAL Surrender and revocation of letters patent

Effective date: 20221103