AU2019101147A4 - A sentimental analysis system for film review based on deep learning - Google Patents

A sentimental analysis system for film review based on deep learning

Info

Publication number
AU2019101147A4
Authority
AU
Australia
Prior art keywords
analysis system
data
deep learning
words
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
AU2019101147A
Inventor
Haoran Han
Yilin Hao
Yisiyuan Huang
Yufei Meng
Zixing Shen
Keyao Wu
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hao Yilin Miss
Meng Yufei Miss
Wu Keyao Miss
Original Assignee
Hao Yilin Miss
Meng Yufei Miss
Wu Keyao Miss
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hao Yilin Miss, Meng Yufei Miss, Wu Keyao Miss filed Critical Hao Yilin Miss
Priority to AU2019101147A priority Critical patent/AU2019101147A4/en
Application granted granted Critical
Publication of AU2019101147A4 publication Critical patent/AU2019101147A4/en
Ceased legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/082 Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V30/00 Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
    • G06V30/40 Document-oriented image-based pattern recognition
    • G06V30/41 Analysis of document content
    • G06V30/416 Extracting the logical structure, e.g. chapters, sections or page numbers; Identifying elements of the document, e.g. authors
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Machine Translation (AREA)

Abstract

This application introduces a sentimental analysis system for film criticism based on deep learning. The project contains four main processing sections: data preprocessing with one-hot encoding, bag-of-words feature construction, a convolutional neural network, and optimization. The first part, data processing, draws on some well-known models and theories from the area of information retrieval. In this sentimental analysis system, accuracy is the main criterion for measuring the degree of system optimization and the efficiency of target realization. Compared with other systems, our sentimental analysis system based on deep learning has many advantages, including a simple structure, high accuracy, and rapid encoding speed.

Description

This invention belongs to the field of information processing; it is a sentimental analysis system for film criticism based on deep learning.
BACKGROUND
It is widely acknowledged that, due to the rapid development of the Internet, a great many emerging social websites, well-known forums, and blog writers take advantage of users' sentimental comments, feelings, perspectives, and so on, producing a great deal of data about various events in society, products, brands, politics, and films. For films in particular, the feelings users express play a crucial role in attracting later viewers, shaping the films' public images, and informing their network service providers. For instance, Douban, a common website used to comment on books and films, contains a large number of users' positive and negative emotional reviews. Analyzing the sentimental tendencies of the comments on Douban lays a solid foundation for investors to make decisions, and can also be regarded as a means of helping creators improve the quality of their works. Consequently, since such decentralized and unstructured data need to be properly managed, emotional analysis has been attached great importance under this background.
With the explosive growth of this type of comment, the demand for sentimental analysis technology, a branch of natural language processing (NLP), is gradually increasing, as it can be employed to analyze and judge the emotional type of a text description so that machines can better comprehend the emotions and views expressed in the text. Nonetheless, due to both the complexity and the diversity of human languages, the application of sentimental analysis is considered a challenging task.
Previous research shows that basic machine learning techniques can accomplish some natural language processing tasks effectively, such as document subject classification. However, the same techniques cannot be applied directly to the field of emotional classification, since more effort is required to overcome the challenges emotional analysis faces and to deal with the diversity involved in it.
There are two main approaches applied to emotion analysis so far. The first is based on an emotion thesaurus: the emotional tendency of the text is calculated according to the constructed emotion thesaurus, quantifying the emotion of the text according to its semantics and dependencies, and the final classification effect depends on the integrity of the emotion thesaurus. Moreover, this particular method requires an excellent linguistic foundation: it is necessary to know when a sentence is usually expressed as positive or negative under different situations. Nevertheless, the emotions expressed by words are difficult to judge as absolutely positive or negative due to the complexity of modern language, so it is difficult to perfect the judgment of emotions by this means. The other common method, based on machine learning, is to select emotional words as feature words and then turn the text into a matrix. Among these methods, Logistic Regression, Naive Bayes (NB), and Support Vector Machine (SVM) are the most commonly used. The final classification effect always depends on the selection of the training text and the emotional labeling. Each of these approaches, however, has its own drawbacks. For example, in theory the Naive Bayes model has the minimum error compared with other classification methods; in fact, this is not always the case. The Naive Bayes model assumes that attributes are independent of each other, and this assumption is often not valid in practical applications. The classification effect is poor when the number of attributes is large or the correlation between attributes is strong; in other words, Naive Bayes performs best when attribute correlation is small. On this point, an emerging algorithm, semi-Naive Bayes, partly alleviates the correlation problem. In addition, a prior probability, which in many cases depends on hypotheses, needs to be known before using this model; but there are many kinds of hypothesis models, which makes a bad prediction effect more likely. Moreover, there is a certain error rate in classification decisions, and the model is sensitive to the expression form of the input data. At present, most of the relevant research uses emotional characteristics manually annotated with SVM or Naive Bayes to conduct emotional analysis on Weibo. However, as Weibo posts usually contain limited contextual information, it is challenging to conduct emotional analysis on them. Meanwhile, these methods require features to be extracted manually, which is almost impossible given the large sample size, so their applicability is limited.
Accordingly, we decided to adopt a convolutional neural network (CNN) and a fully connected neural network (FC) in our program of emotional analysis of film reviews. With its special structure of local weight sharing, the convolutional neural network has unique advantages in speech recognition and image processing, and its layout is closer to an actual biological neural network. In a CNN, weight sharing reduces the complexity of the network and further decreases the number of parameters. Moreover, a multi-dimensional input vector such as an image can be fed directly into the network, which avoids the complexity of data reconstruction during feature extraction and classification. In the pooling layer, the max-pooling method can be used to compress features to achieve dimensionality reduction and facilitate the extraction of the main features. In this invention, we take advantage of 4 convolution layers, 2 max-pooling layers, and 2 fully connected layers, which not only makes the overall framework of the project relatively simple and fast, but also achieves the optimizations that the original design set out to make.
SUMMARY
Our patent is a sentimental analysis system for film review based on deep learning, and the system includes six major steps:
1) Firstly, the original film review data would be preprocessed, which includes eliminating html tags, deleting non-character information, and utilizing the nltk stopwords corpus in Python to cast off the stop words.
2) Then our system would transform the preprocessed data into the form of the Bag-of-words Model, which serves to turn natural language information into numerical arrays. A review in the form of the Bag-of-words Model is totally independent of its grammar and word order; the choice of words as well as the words' frequencies are the decisive elements in the Bag-of-words Model. Our system utilizes sklearn in Python to achieve this goal (a code sketch of steps 2, 3, and 6 follows this list).
As to the specific process of the Bag-of-words Model transformation, our system would first transfer every review in our film review data into a list made up of each word in the review. It would then go over all the review data and construct a dictionary that reveals the vocabulary used in our review data. Yet since the dictionary at this stage is too redundant and includes much vocabulary with little analytical value, the system would rearrange the dictionary by the frequency of each word in descending order, and the dictionary would be reduced to the top m most frequent words.
Then the system would construct a matrix with the shape [1, m] from each review according to the new dictionary. The indexes in this matrix correspond to the indexes in the dictionary, while every element in the matrix reflects the frequency of the word with the same index in the dictionary. Thus the final processed review data is a matrix with the shape [n, m] (where n is the total number of reviews in the data).
3) The system would then transfer the sentiment labels in the data into the form of one-hot encoding. While the system is executing the Bag-of-words Model transformation, it labels the bag of words from each review with its corresponding sentiment. It then transfers every sentiment into a list with i elements (i represents the number of sentiment categories: for instance, if the sentiment ranges from 1 star to 5 stars, then i would be 5). The actual value of the sentiment is used as the index for one-hot encoding; the element at that index is given the value 1 while all other elements in the list have the value 0. After these steps, each review has a one-hot encoded label as a binary list of length i.
4) The data processed in this way, along with the sentiments in the form of one-hot encoding, would be imported as input samples and labels into the deep learning network in our system.
As indicated in Figure 1, the basic structure of this network is composed of four convolutional (CNN) layers and two max-pooling layers, followed by two fully-connected (FC) layers.
With the Rectified Linear Unit (ReLU) as the activation function, the CNN layers mainly use filters to go over the whole data and extract features from the review data, namely the choice of words and the frequency of words. Then, for the new matrix (of size a) the system gets from the CNN layers, max pooling extracts the maximum value from each portion and constructs another matrix (of size a/2) from the maximum values extracted from the convolutional layers. This process ensures that not too many features enter the fully-connected layers.
By executing the above process twice, the shape of the input review data is transformed from n×m×1×1 to n×(m/4)×1×1 before it enters the FC layers. In the FC layers, we use the Softmax function to categorize the features extracted from the CNN layers.
5) We then use three main methods to optimize the learning structure.
Firstly, we use regularization, which utilizes the L2 norm to calculate an additional loss for the weight tensors; but instead of taking the square root of the summed squares, we halve the value. We use this value as the new loss term to prevent over-fitting.
We also apply random dropout to the FC layers during the training phase of our learning network. By dropping nodes at a given probability p, the over-fitting situation is also effectively suppressed.
Lastly, we can update our parameters through gradient descent, Newton's method, momentum, or Adam. Moreover, we optimize the learning rate by decaying it with a given batch size.
6) Last but not least, we split the film review data into a training set and a testing set at a ratio of p:q. The system uses the training set as input data to train and refine our emotional analysis model, and uses the testing set to calculate the accuracy of the model.
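As a concrete illustration of steps 2), 3), and 6), the following Python sketch builds the [n, m] bag-of-words matrix, the one-hot labels, and the p:q split. It is a minimal sketch under our own naming, assuming the reviews have already been tokenised into lists of words; none of the function names below come from the patent.

```python
import numpy as np
from collections import Counter

def bag_of_words(reviews, m=5000):
    """Build an [n, m] word-frequency matrix from tokenised reviews."""
    freq = Counter(word for review in reviews for word in review)
    vocab = [word for word, _ in freq.most_common(m)]     # top-m dictionary
    index = {word: i for i, word in enumerate(vocab)}
    X = np.zeros((len(reviews), m), dtype=np.float32)
    for row, review in enumerate(reviews):
        for word in review:
            col = index.get(word)
            if col is not None:
                X[row, col] += 1                          # word frequency
    return X, vocab

def one_hot(sentiments, i=2):
    """Encode integer sentiment labels 0..i-1 as one-hot rows of length i."""
    Y = np.zeros((len(sentiments), i), dtype=np.float32)
    Y[np.arange(len(sentiments)), sentiments] = 1
    return Y

def split(X, Y, p=7, q=3):
    """Split samples and labels into train/test sets at a ratio of p:q."""
    n_train = len(X) * p // (p + q)
    return (X[:n_train], Y[:n_train]), (X[n_train:], Y[n_train:])
```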
DESCRIPTION OF DRAWINGS
Figure 1 shows the data flow of our convolutional neural network.
Figure 2 shows the data flow of our convolutional neural network, substituted with real data.
Figures 3-5 show the fully-connected simple neural network.
Figures 6-7 show the mechanism of the optimizing method, dropout.
Figure 8 shows the mechanism of the functional chain.
DESCRIPTION OF PREFERRED EMBODIMENT
Data Processing & One-hot Encoding
To accomplish our final goal, the first and inevitable step is data preprocessing. As our data is given in the form of a tsv file, we must import a library to open such a file, and our choice is without doubt "pandas". Like any other library, it can be installed by typing "pip install pandas" at the command line (for Windows users), but note that this method is valid only when the file suffix is set to tsv in the read parameters. In the next few steps, we use regular expressions, provided by the library "re" once imported. Normally, the basic means of data preprocessing cover, but are not limited to, removing html tags, converting the text to all lower-case letters, and removing non-characters and meaningless stop words. We used two "for" loops to make sure the process goes through the whole file and produces the expected outcome. Using re.compile(), we transform the string expression into a compiled pattern for the following steps. Then we replace every non-character with an empty string and use the .lower() function provided by Python to convert the whole file to lowercase letters. The last yet crucial move is to import the nltk library, which stands for Natural Language Toolkit, to clean the stop words from the text and split the text into words, a step academically known as tokenization. We disregard the possible connections among words (which most likely exist and in a way affect the accuracy, although we still reached an accuracy over eighty-five percent), as our work is based on an old model called the "Naive Bayes Classifier". When all of the above is done, we apply the .append method within the Python interpreter to put the preprocessed data into an empty list "a" that we defined beforehand.
The following task is turning the labels, in this case the "sentiment" column of the data, into one-hot encoded labels. One-hot encoding is a process that converts categorical values into a numerical form that suits machine learning algorithms for better predictions. We chose one-hot encoding instead of label encoding because label encoding implies that categories with higher numerical values are generally "better data", which is obviously not the case. The same problem does not arise with one-hot encoding, as it is more of a "binarization" process that is far more objective and efficient, since not all data provided for machine learning is sequential; in other words, much of it is categorical. To some extent, this even adds one more feature for extraction. We assigned these movie reviews two values, zero and one, with one meaning positive and zero representing negative (the opposite convention would work just as well). Through one-hot encoding, we successfully pivoted ones into [0, 1] vectors and zeros into [1, 0] vectors.
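The preprocessing just described can be condensed into a short Python sketch. This is a reconstruction rather than the patent's verbatim code: the file name and column names are hypothetical, and the html-stripping expression is one common choice.

```python
import re
import pandas as pd
from nltk.corpus import stopwords   # requires nltk.download('stopwords') once

data = pd.read_csv('reviews.tsv', sep='\t')        # hypothetical tsv file

pattern = re.compile('[^a-zA-Z]')                  # compiled non-character pattern
stops = set(stopwords.words('english'))

clean_reviews = []
for review in data['review']:
    text = re.sub(r'<[^>]+>', ' ', review)         # remove html tags
    text = pattern.sub(' ', text).lower()          # drop non-characters, lowercase
    words = [w for w in text.split() if w not in stops]   # tokenise, drop stop words
    clean_reviews.append(words)

# One-hot encode the sentiment column: 1 (positive) -> [0, 1], 0 (negative) -> [1, 0]
labels = [[0, 1] if s == 1 else [1, 0] for s in data['sentiment']]
```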
Bag of Words
After the success of data cleaning, we conducted data processing and feature construction. We use the bag-of-words model, originally developed in the field of information retrieval, to construct text features. For a document, it is assumed that the order and syntax of the words are disregarded and only the occurrence of each word is considered. Suppose there are five categories of topics and our task is to determine which topic a document belongs to. In the training set, we have several documents whose topic types are known. We pick out some documents, each containing some words, and use these words to build the word bag. The word bag can take this form: {watch, sports, phone, like, Roman, ...}, and each document can then be converted into a histogram with each word on the horizontal axis and its occurrence count on the vertical axis. After that, normalization is carried out and the frequency of each word is taken as a feature of the document. This model ignores the grammar and word order of the text and converts every comment into a vector. We took 996 reviews and broke them down into individual words, then culled the top 5,000 words to form a dictionary. This dictionary is built from the training data and is applied again during the later testing process. The next step is to create a document vector that converts each free-text document into a text vector we can use as input or output for the machine-learning model. The simplest way to encode a word is to mark its presence as a Boolean, with 0 for absent and 1 for present.
Using any fixed order of the words in our dictionary, we can convert reviews to binary vectors. Words outside the dictionary are discarded, and we can use this generic method to extract features from any document in our corpus, which can then be used for modeling.
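Since the summary names sklearn for this step, a minimal sketch of the dictionary and vector construction might look as follows; max_features=5000 matches the dictionary size described above, and the Boolean-presence variant mentioned in the text is available through the binary=True flag.

```python
from sklearn.feature_extraction.text import CountVectorizer

# clean_reviews holds the tokenised reviews from the preprocessing sketch;
# CountVectorizer expects whole strings, so the tokens are joined back up.
docs = [' '.join(words) for words in clean_reviews]

vectorizer = CountVectorizer(max_features=5000)    # top-5000-word dictionary
X = vectorizer.fit_transform(docs).toarray()       # shape [n_reviews, 5000]
vocabulary = vectorizer.get_feature_names_out()    # the dictionary itself
```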
Convolutional Neural Network
When the data preprocessing is done, we use a neural network, which comprises four convolutional layers, two pooling layers, and two fully connected layers, to train our model. Besides neural networks, random forest is also a commonly used machine-learning algorithm; we will discuss their differences and explain why our final option is the neural network. First of all, the random forest algorithm is rather independent and conventional: each of its decision trees stands alone, whereas a neural network is closely bonded, with all of its neurons depending on one another. Secondly, the random forest algorithm can only process data provided in tabular form, which would have caused us a lot of inconvenience had we used it. In comparison, a neural network can deal with a variety of data forms, including audio, pictures, text, and so on. Since we chose the bag-of-words model, our best choice is the neural network; theoretically the random forest algorithm could achieve the same goal, but it is clearly unnecessary here. Putting random forest aside, we split the original text into "train" and "test" sets with a proportion of seven to three, adequate for both. The next step is to use the "reshape" feature of the NumPy library to change the shapes into a [17500, 5000, 1, 1] matrix for "train" and a [7500, 5000, 1, 1] matrix for "test", the result of splitting 25,000 movie reviews in total. For the inputs, there are a few parameters we need to explain and specify. To start with, we defined the batch size to be 64, neither too large nor too small, as either extreme could result in lower accuracy. The bigger the batch size, the bigger the learning rate should be, to keep the standard deviation of the gradient constant; accordingly, we set the base learning rate to 0.001. Since a learning rate that is too big causes oscillation due to overly large steps, it is critically important to have a high decay rate to prevent the model from overfitting; the decay is usually fastest when the decay rate approaches 1, so here we set the value to 0.99. Unlike an ordinary neural network, the method we apply is the convolutional neural network. Its advantage shows most significantly in its ability to extract local features, thereby reducing the input feature variables and the number of parameters as well. The same filter goes through the input matrix several times; in this case, we set up four convolutional layers in total. After the convolutional layers, we continued to use the max-pooling method, with one max-pooling layer after every two layers of convolution, to be precise. We set the patch size, and in_depth and out_depth are both 32 for every layer except the first, which is assigned the value of feature_col. The activation yields a result equal to the input if the input is greater than zero, and the input times a coefficient if the input is less than or equal to zero. The pooling scale is two: the largest value is selected from each pooling group to form a new matrix, extracting the maximal features and thus improving accuracy. The final step is to add two fully connected layers, in which every neuron in one layer connects to every neuron in the next. The number of input nodes of our first fully connected layer equals the feature size divided by four (since we added two max-pooling layers), times feature_col, times thirty-two, which is the out_depth of the fourth convolutional layer. The number of output nodes is one hundred and twenty-eight, and the activation function is still ReLU as discussed before; this process can be simply described as "wx + b", where w is the weight, x is the input value, and b is the bias. The second fully connected layer has the same number of input nodes as the output of the first, one hundred and twenty-eight; its number of output nodes is 2, and this time, since it is the last layer, it has no activation function.
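The architecture walked through above can be sketched in Keras as follows. This is a reconstruction under stated assumptions, not the patent's actual code: the kernel size (5, 1) is assumed, since the patch size is not stated, while the filter depth of 32, the pooling scale of 2, and the 128-node and 2-node FC layers follow the description.

```python
import tensorflow as tf
from tensorflow.keras import layers, models

feature_size = 5000   # length of the bag-of-words vector (top 5000 words)

model = models.Sequential([
    layers.InputLayer(input_shape=(feature_size, 1, 1)),  # [n, 5000, 1, 1] input
    layers.Conv2D(32, kernel_size=(5, 1), padding='same', activation='relu'),
    layers.Conv2D(32, kernel_size=(5, 1), padding='same', activation='relu'),
    layers.MaxPooling2D(pool_size=(2, 1)),                # 5000 -> 2500 features
    layers.Conv2D(32, kernel_size=(5, 1), padding='same', activation='relu'),
    layers.Conv2D(32, kernel_size=(5, 1), padding='same', activation='relu'),
    layers.MaxPooling2D(pool_size=(2, 1)),                # 2500 -> 1250 features
    layers.Flatten(),                                     # (feature_size / 4) * 32 nodes
    layers.Dense(128, activation='relu'),                 # first FC layer
    layers.Dense(2),                                      # second FC layer: logits, no activation
])
model.compile(
    optimizer='adam',
    loss=tf.keras.losses.CategoricalCrossentropy(from_logits=True),
    metrics=['accuracy'],
)
```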
Optimize
The last thing we need to do is optimize the program. The optimization has two aspects: the first is the optimization of the code as a whole, and the second is the optimization of the design architecture. When we actually implemented the convolution, we found that each convolution layer was coded under a different name while the contents were the same. At this point, we chose to refactor the code as a whole to make it cleaner and faster. The goal of the design architecture optimization is to prevent our model from overfitting. Since our convolutional neural network is nonlinear, it overfits easily. Overfitting usually produces very few mistakes on the training set, but on new data in the test set the results are usually not accurate. Underfitting and overfitting are both harmful and need to be addressed. First, we use regularization, which usually takes one of two forms, the L1 and L2 norms. We adopted the L2 loss method because of its advantages: it is flexible and can be added to the loss function quickly and easily to achieve smooth control over the weights. Then we used dropout, applied during the training phase of the FC layers, not at the convolution layers. Dropout randomly drops some nodes at a given probability on the fully connected layers during the training stage to reduce overfitting, on which it has a strong inhibiting effect. The last thing we want to do is update the parameters; the ways to update parameters include gradient descent, Newton's method, momentum, Adam, etc. Here we use Adam, which has the best effect: comparing an animation of parameter updating methods, Adam is the fastest method with the least error.
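A minimal sketch of the three optimisations, in the same Keras style as the architecture sketch above; the regularisation factor, dropout rate, and decay interval are illustrative assumptions, as flagged in the comments.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers, optimizers

# 1) L2 regularisation on the FC weights. tf.nn.l2_loss(w) computes
#    sum(w**2) / 2 -- the squared norm halved rather than square-rooted,
#    matching the description; Keras regularizers.l2 applies
#    factor * sum(w**2), so the halving can be folded into the factor.
fc1 = layers.Dense(128, activation='relu',
                   kernel_regularizer=regularizers.l2(0.0005))

# 2) Dropout between the FC layers, active only during training. The table
#    below reports dropout rates around 0.90-0.95, which read as TF1-style
#    keep probabilities; in Keras, rate is the fraction dropped, so a keep
#    probability of 0.90 would correspond to rate=0.10.
drop = layers.Dropout(rate=0.10)

# 3) Adam with an exponentially decaying learning rate. The base rate 0.001
#    and decay rate 0.99 come from the description; decay_steps=100 is an
#    assumed batch-size-linked interval.
schedule = optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.001, decay_steps=100, decay_rate=0.99)
optimizer = optimizers.Adam(learning_rate=schedule)
```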
Table 1. The training data.

Run | Average Accuracy | Standard Deviation | Dropout Rate | Base Learning Rate | Decay Rate | Iteration Step
1   | 84.814 | 2.042 | 0.95 | 0.001 | 0.99 | 4000
2   | 84.771 | 2.185 | 0.95 | 0.001 | 0.99 | 3500
3   | 85.207 | 2.042 | 0.90 | 0.001 | 0.99 | 4000
4   | 49.514 | 1.718 | 0.90 | 0.005 | 0.99 | 4000
5   | 84.214 | 1.850 | 0.91 | 0.001 | 0.99 | 4000
6   | 84.542 | 2.163 | 0.92 | 0.001 | 0.99 | 4000
7   | 84.899 | 1.922 | 0.93 | 0.001 | 0.99 | 4000
8   | 83.642 | 2.294 | 0.94 | 0.001 | 0.99 | 4000
9   | 84.314 | 2.203 | 0.93 | 0.002 | 0.99 | 4000
10  | 82.685 | 2.259 | 0.93 | 0.003 | 0.99 | 4000
11  | 49.515 | 1.718 | 0.93 | 0.004 | 0.99 | 4000
12  | 84.671 | 2.215 | 0.90 | 0.001 | 0.98 | 4000
13  | 84.614 | 2.071 | 0.90 | 0.001 | 0.97 | 4000
14  | 84.786 | 2.018 | 0.90 | 0.001 | 0.96 | 4000

Claims (2)

1. A sentimental analysis system for film review based on deep learning, wherein the top 5000 high-frequency words are chosen and only 2 MaxPooling layers are needed, which makes the structure relatively simple and the processing speed higher.
2. The sentimental analysis system for film review based on deep learning according to claim 1, wherein a convolutional neural network, a structure with high accuracy, is used to build up our system; consequently, with appropriate parameters, our system can maintain a relatively high accuracy.
FIGURE 1
FIGURE 2
FIGURE 3
FIGURE 4
FIGURE 5 (fully-connected network: input layer, hidden layer, output layer)
AU2019101147A 2019-09-30 2019-09-30 A sentimental analysis system for film review based on deep learning Ceased AU2019101147A4 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2019101147A AU2019101147A4 (en) 2019-09-30 2019-09-30 A sentimental analysis system for film review based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2019101147A AU2019101147A4 (en) 2019-09-30 2019-09-30 A sentimental analysis system for film review based on deep learning

Publications (1)

Publication Number Publication Date
AU2019101147A4 true AU2019101147A4 (en) 2019-10-31

Family

ID=68341989

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2019101147A Ceased AU2019101147A4 (en) 2019-09-30 2019-09-30 A sentimental analysis system for film review based on deep learning

Country Status (1)

Country Link
AU (1) AU2019101147A4 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111144108A (en) * 2019-12-26 2020-05-12 北京百度网讯科技有限公司 Emotion tendency analysis model modeling method and device and electronic equipment
CN111144108B (en) * 2019-12-26 2023-06-27 北京百度网讯科技有限公司 Modeling method and device of emotion tendentiousness analysis model and electronic equipment
CN112580351A (en) * 2020-12-31 2021-03-30 成都信息工程大学 Machine-generated text detection method based on self-information loss compensation
CN112580351B (en) * 2020-12-31 2022-04-19 成都信息工程大学 Machine-generated text detection method based on self-information loss compensation

Similar Documents

Publication Publication Date Title
Onan Sentiment analysis on product reviews based on weighted word embeddings and deep neural networks
CN110210037B (en) Syndrome-oriented medical field category detection method
Yasen et al. Movies reviews sentiment analysis and classification
KR102155768B1 (en) Method for providing question and answer data set recommendation service using adpative learning from evoloving data stream for shopping mall
Beysolow Applied natural language processing with python
CN113591483A (en) Document-level event argument extraction method based on sequence labeling
Kulkarni et al. Deep learning for NLP
CN110580287A (en) Emotion classification method based ON transfer learning and ON-LSTM
Gangadharan et al. Paraphrase detection using deep neural network based word embedding techniques
Rodrigues et al. Machine & deep learning techniques for detection of fake reviews: A survey
Huang et al. Text classification with document embeddings
AU2019101147A4 (en) A sentimental analysis system for film review based on deep learning
CN112036189A (en) Method and system for recognizing gold semantic
Ribeiro et al. Acceptance decision prediction in peer-review through sentiment analysis
Villmow et al. Automatic keyphrase extraction using recurrent neural networks
Tian et al. Chinese short text multi-classification based on word and part-of-speech tagging embedding
Rezaei et al. Hierarchical three-module method of text classification in web big data
CN114510569A (en) Chemical emergency news classification method based on Chinesebert model and attention mechanism
Basarslan et al. Sentiment analysis with various deep learning models on movie reviews
CN114003773A (en) Dialogue tracking method based on self-construction multi-scene
M Alashqar A Classification of Quran Verses Using Deep Learning
Falzone et al. Measuring similarity for technical product descriptions with a character-level siamese neural network
Hameed User ticketing system with automatic resolution suggestions
Padia et al. Automating class/instance representational choices in knowledge bases
Zouari French AXA insurance word embeddings: Effects of fine-tuning bert and camembert on AXA france’s data

Legal Events

Date Code Title Description
FGI Letters patent sealed or granted (innovation patent)
MK22 Patent ceased section 143a(d), or expired - non payment of renewal fee or expiry