CN112434516B - Self-adaptive comment emotion analysis system and method for merging text information - Google Patents
- Publication number
- CN112434516B (application number CN202011506610.0A)
- Authority
- CN
- China
- Prior art keywords
- text
- comment
- vector
- data
- feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06Q—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
- G06Q50/00—Information and communication technology [ICT] specially adapted for implementation of business processes of specific business sectors, e.g. utilities or tourism
- G06Q50/01—Social networking
Abstract
The invention provides a self-adaptive comment emotion analysis system and method for merging body text information. The method comprises the following steps: step a, determining the source and scale of the data; step b, preprocessing the data; step c, extracting feature vectors from the preprocessed data; step d, performing association analysis on the extracted feature vectors to obtain a weighted text vector; and step e, performing a convolution operation on the weighted text vector and the comment feature compression vector to complete the final comment classification. The invention introduces body text information while avoiding the manual supervision that LDA requires, and it has some ability to discover and extract features from unseen text types. The model can discover new topics to a certain extent and can automatically match, for each comment on the same body text, the body text information most relevant to that comment, which addresses the inability of LDA to work at a fine granularity.
Description
Technical Field
The invention belongs to the field of data statistics analysis, and particularly relates to a self-adaptive comment emotion analysis system and method for merging text information.
Background
With the rapid development of social platforms such as microblogs and WeChat, people can follow events and news from around the world and leave comments online at any time and from any place. By analyzing and aggregating this comment data, one can learn the public's overall attitude toward a given event, such as support, opposition, or indifference.
How the comment data is processed is the key to obtaining accurate information. Because the volume of online comments is huge, manual review is impractical, and emotion analysis algorithms become the only feasible solution.
Existing emotion analysis algorithms are mature and include, but are not limited to, BiLSTM, fastText, and CLSTM. The general flow is as follows:
1. Data preprocessing, including word segmentation, stop word removal, and filtering of irrelevant characters.
2. Feature extraction from the segmented result using a CNN or another model.
3. Feeding the extracted features into a classifier (a fully connected layer or any other classifier) to complete the emotion classification of the comments.
In addition to the conventional steps above, the following approaches are commonly used to improve discrimination:
1. Using lexicons, such as an emotion lexicon or a semantic library, to help judge whether the emotion in the corpus is positive or negative.
2. Adding syntactic analysis to the basic model so that the model can better learn the semantic and grammatical information of the comments.
3. Using LDA to obtain the topic words of the commented body text and feeding them into the model to assist the judgment.
Extensive practice shows that the basic scheme discriminates well on conventional comment data, and that adding an emotion lexicon or LDA topic information further improves discrimination on easily confused data. These methods still have significant limitations, mainly the following:
1. The conventional scheme performs well but ignores the topic information of the body text. Introducing a semantic library or an emotion lexicon on top of it still does not solve this problem.
2. Introducing topic information with LDA can solve the above problem, but obtaining body text topics with LDA requires training an LDA model separately on the corresponding body texts, and the number of topics for that batch of texts must be set manually.
3. A trained LDA model can only extract the topics determined during training and lacks the ability to extract useful information for new topics.
4. Because the topics generated during LDA training do not depend on any single document, the topic information that LDA extracts deviates somewhat for some articles, and LDA cannot extract more precise topic information for a particular article.
5. If one body text contains several sub-topics and its comments address different sub-topics, then, because of the issue described in item 4, only part of the topic information extracted by LDA matches a given comment and the rest acts as interference, which affects the subsequent results to some extent.
In view of these defects, the invention aims to introduce body text information while avoiding the complex training process and the large amount of manual supervision that an LDA model requires. The model can discover new topics to a certain extent and can automatically match, for each comment on the same body text, the body text information most relevant to that comment, which addresses the inability of LDA to work at a fine granularity. LDA stands for Latent Dirichlet Allocation, a widely used topic model for mining and discovering the topic distributions in a large volume of text.
Disclosure of Invention
In view of the above problems, the invention provides a self-adaptive comment emotion analysis method for merging body text information, which comprises the following steps:
Step a, determining the source and scale of the data;
Step b, preprocessing the data;
Step c, extracting feature vectors from the preprocessed data;
Step d, performing association analysis on the extracted feature vectors to obtain a weighted text vector;
Step e, performing a convolution operation on the weighted text vector and the comment feature compression vector to complete the final comment classification.
Further, the data includes body text and comment text.
Further, the preprocessing of the data in step b specifically comprises stop word filtering of the body text and the comment text, and length compression of the body text.
Further, extracting feature vectors from the preprocessed data in step c includes extracting feature vectors of the body text and of the comment text respectively.
Further, extracting the feature vectors of the comment text and the body text specifically includes the following steps:
Step c1, vectorizing the preprocessed comment text to obtain the sentence vector representation corresponding to the comment text;
Step c2, further encoding the sentence vector representation corresponding to the comment text and extracting features from it to obtain the feature vector of the comment text;
Step c3, vectorizing the compressed body text to obtain the sentence vector representations corresponding to the body text;
Step c4, further encoding the sentence vector representations corresponding to the body text and extracting features from them to obtain the feature vectors of the body text.
Further, obtaining the weighted text vector by association analysis in step d specifically includes the following steps:
Step d1, calculating the relevance $r_{ij}$ between each sentence feature vector of the compressed body text and the comment feature vector: $r_{ij} = c_i \cdot s_j$, wherein $c_i$ represents the $i$-th comment feature vector and $s_j$ represents the feature vector of the $j$-th body text sentence;
Step d2, calculating the relevance $R_{ij}$ of the $i$-th comment feature vector to each sentence $j$ in the body text, as the softmax normalization of $r_{ij}$ over the body text sentences: $R_{ij} = \frac{\exp(r_{ij})}{\sum_{k} \exp(r_{ik})}$;
Step d3, calculating the weighted text vector $V_i$: $V_i = \sum_j R_{ij} \times s_j$.
Further, in step e, the comment feature compression vector is obtained by applying max_pooling and average_pooling in sequence to the comment feature vector, where max_pooling and average_pooling each represent a convolution kernel.
The invention also provides a self-adaptive comment emotion analysis system for merging body text information, which comprises:
A data source and scale determining unit for determining the source and scale of the data;
A data preprocessing unit for preprocessing the data;
A feature vector extraction unit for extracting feature vectors from the preprocessed data;
An association degree analysis unit for performing association analysis on the extracted feature vectors and obtaining a weighted text vector;
A decision unit for performing a convolution operation on the weighted text vector and the comment feature compression vector to complete the final comment classification.
Further, the feature vector extraction unit, in extracting feature vectors from the preprocessed data, performs the following:
Vectorizing the preprocessed comment text to obtain the sentence vector representation corresponding to the comment text; further encoding that representation and extracting features from it to obtain the feature vector of the comment text;
Vectorizing the compressed body text to obtain the sentence vector representations corresponding to the body text; further encoding those representations and extracting features from them to obtain the feature vectors of the body text.
Further, the association degree analysis unit, in performing association analysis on the extracted feature vectors and obtaining the weighted text vector, performs the following:
Calculating the relevance $r_{ij}$ between each sentence feature vector of the compressed body text and the comment feature vector: $r_{ij} = c_i \cdot s_j$, wherein $c_i$ represents the $i$-th comment feature vector and $s_j$ represents the feature vector of the $j$-th body text sentence;
Calculating the relevance $R_{ij}$ of the $i$-th comment feature vector to each sentence $j$ in the body text, as the softmax normalization of $r_{ij}$ over the body text sentences: $R_{ij} = \frac{\exp(r_{ij})}{\sum_{k} \exp(r_{ik})}$;
Calculating the weighted text vector $V_i$: $V_i = \sum_j R_{ij} \times s_j$.
The invention has the beneficial effects that:
1. The invention introduces body text information while avoiding the complex training process and the large amount of manual supervision that an LDA model requires. During comment emotion analysis, body text information is introduced by means of deep learning, which avoids the manual supervision needed when using LDA and provides some ability to discover and extract features from unseen text types. The model can discover new topics to a certain extent and can automatically match, for each comment on the same body text, the body text information most relevant to that comment, which addresses the inability of LDA to work at a fine granularity;
2. When body text information is introduced, the relevance-weighted feature vector of the body text is obtained by calculating the relevance between each body text sentence and the comment, so that the model can adaptively extract the body text features most relevant to each comment. The method therefore extracts features well from body text that contains multiple fine-grained topics.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention may be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 shows a schematic flow chart of an adaptive comment emotion analysis method for merging text information in an embodiment of the invention;
Fig. 2 shows a specific flow diagram of an adaptive comment emotion analysis method for merging text information in an embodiment of the present invention;
Fig. 3 shows a specific flow chart of association analysis in the embodiment of the invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are some embodiments of the present invention, but not all embodiments of the present invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Fig. 1 shows a schematic flow chart of an adaptive comment emotion analysis method for merging text information in an embodiment of the present invention, and in fig. 1, the method includes the following steps:
Step a, determining the source and scale of the data;
Step b, preprocessing the data;
Step c, extracting feature vectors from the preprocessed data;
Step d, performing association analysis on the extracted feature vectors to obtain a weighted text vector;
Step e, performing a convolution operation on the weighted text vector and the comment feature compression vector to complete the final comment classification.
Specifically, the data used by the invention are microblog comments and the corresponding body text. About 1,000,000 comments and their corresponding body texts were obtained with a web crawler. Part of the data (about 300,000 items) was labeled manually, and subsequent model training is completed using this labeled part. The final annotated data takes the form of the following triplet: (body text, comment text, comment classification), wherein the comment classification is obtained by analyzing the body text data and the comment text data.
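As an illustration only, one annotated sample could be held in a small record type. The source states that each sample is a triplet; the sketch assumes its three fields are the body text, the comment text, and the comment classification. The field names, the example strings, and the label value are hypothetical.

```python
from typing import NamedTuple


class AnnotatedSample(NamedTuple):
    """One manually annotated training sample (field names are hypothetical)."""
    body_text: str      # text of the microblog post
    comment_text: str   # one comment left under that post
    label: str          # comment classification; the label set is assumed


# Made-up example record, not taken from the patent's data set.
sample = AnnotatedSample(
    body_text="The city will open a new subway line next month.",
    comment_text="Great news, my commute will be much shorter.",
    label="support",
)
print(sample.label)
```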
Fig. 2 shows a specific flow diagram of the adaptive comment emotion analysis method for merging body text information in an embodiment of the present invention. As shown in Fig. 2, preprocessing the data in step b includes stop word filtering of the body text and the comment text, and length compression of the body text. The body text and the comment text are segmented into words with the jieba word segmenter, and stop words are filtered from the segmentation result using the Harbin Institute of Technology (HIT) stop word list. The TextRank algorithm module in TextRank4ZH is then called to select the key sentences of the body text (the top 30), which completes the length compression of the body text. This step mainly prevents training from becoming too slow when some microblog body texts are overly long. TextRank is a commonly used algorithm for extracting keywords and key sentences.
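The two preprocessing operations can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the embodiment's code: tokens are assumed to be already segmented, `scores` stands in for the sentence ranking that a key-sentence algorithm such as TextRank would produce, and keeping the selected sentences in their original order is a design choice of the sketch. The function names are hypothetical.

```python
from typing import Iterable, List, Sequence, Set


def filter_stop_words(tokens: Iterable[str], stop_words: Set[str]) -> List[str]:
    """Drop every token that appears in the stop word list."""
    return [tok for tok in tokens if tok not in stop_words]


def compress_body(sentences: Sequence[str], scores: Sequence[float],
                  top_k: int = 30) -> List[str]:
    """Keep the top_k highest-scoring sentences, in their original order.

    `scores` is a placeholder for the output of a key-sentence ranker.
    """
    if len(sentences) != len(scores):
        raise ValueError("sentences and scores must have the same length")
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    keep = sorted(ranked[:top_k])  # restore document order (assumption)
    return [sentences[i] for i in keep]


tokens = ["this", "phone", "is", "really", "good"]
print(filter_stop_words(tokens, {"this", "is"}))   # ['phone', 'really', 'good']

sentences = ["s0", "s1", "s2", "s3"]
print(compress_body(sentences, [0.1, 0.9, 0.3, 0.7], top_k=2))   # ['s1', 's3']
```

With `top_k=30`, a body text of 30 sentences or fewer passes through unchanged.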
In step c, extracting feature vectors from the preprocessed data includes extracting feature vectors of the body text and of the comment text respectively.
Specifically, when extracting the feature vector of the comment text: the preprocessed comment text data is vectorized (embedded) using Baidu's open-source ERNIE model to obtain the corresponding sentence vector representation; the sentence vector representation is further encoded and its features are extracted using BiLSTM, and the encoded output is recorded as the vector representation $V_i$ of the sentence (at this point the dimension of $V_i$ is seq_length × embedding_size); for the vector representation $V_i$ of each sentence, the output at the last time step is taken as the feature vector of the comment text, and the subsequent association analysis is performed on this vector (its dimension is 1 × embedding_size).
BiLSTM and time steps: a bidirectional Long Short-Term Memory network (BiLSTM) is a commonly used recurrent neural network for processing data with sequential dependencies, such as text. Each word in the text is one time step. Typically, the output at the last time step of the BiLSTM carries information about the entire sequence.
When extracting the feature vectors of the body text: the compressed body text is vectorized to obtain the sentence vector representations corresponding to the body text; these representations are further encoded and their features are extracted to obtain the feature vectors of the body text. The body text is usually much longer than the comment text (it is limited to at most 30 sentences after TextRank preprocessing), so the sentence vector features of these 30 sentences are extracted one by one to serve as the feature vectors of the body text, and the subsequent association analysis is performed on these vectors.
In step d, association analysis is performed on the extracted feature vectors to obtain the weighted text vector. This step finds the information shared by a comment and the body text and, according to the relevance between each body text sentence and the comment, extracts the body text features related to the comment to assist the final classification. Fig. 3 shows a specific flow chart of association analysis in the embodiment of the invention.
In Fig. 3, $c_i$ is defined as the sentence vector of the $i$-th comment under an article, and $s_j$ is the sentence vector of the $j$-th sentence of the article body, where $c_i$ and $s_j$ come from the vector representations of the comment and the body text extracted in the second module.
The relevance between the $i$-th comment and the $j$-th body text sentence is calculated as: $r_{ij} = c_i \cdot s_j$;
For the $i$-th comment, its relevance $R_{ij}$ to each sentence $j$ in the body text is defined as follows:
$R_{ij} = \frac{\exp(r_{ij})}{\sum_{k} \exp(r_{ik})}$
that is, the relevance between each sentence and the comment is calculated, softmax probability normalization is applied to the relevances, and the resulting weight vector is obtained;
Finally, the sentence vectors of the body text are weighted and summed according to the relevance $R_{ij}$, and the text vector for comment $i$ is obtained as follows: $V_i = \sum_j R_{ij} \times s_j$.
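The association analysis (dot-product relevance, softmax normalization, weighted sum) can be sketched in plain Python. This is an illustrative sketch rather than the embodiment's implementation: the function names are hypothetical, the vectors are toy values, and subtracting the maximum inside the softmax is a numerical-stability detail that the source does not mention.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def softmax(values: Sequence[float]) -> List[float]:
    """Normalize raw scores into weights that sum to 1."""
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_text_vector(comment: Vector, body_sentences: Sequence[Vector]) -> List[float]:
    """Weighted sum of body sentence vectors for one comment vector."""
    raw = [dot(comment, s) for s in body_sentences]   # r_ij = c_i . s_j
    weights = softmax(raw)                            # R_ij
    dim = len(comment)
    return [
        sum(weights[j] * body_sentences[j][k] for j in range(len(body_sentences)))
        for k in range(dim)
    ]


# Toy example: the comment matches the first body sentence more closely,
# so that sentence receives the larger weight.
comment = [1.0, 0.0]
body = [[1.0, 0.0], [0.0, 1.0]]
v = weighted_text_vector(comment, body)
print(v)
```

Because the weights are recomputed for every comment, two comments on the same body text generally receive different weighted text vectors.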
In step e, the comment feature compression vector is obtained by applying max_pooling and average_pooling in sequence to the comment text feature vector, where max_pooling and average_pooling each represent a convolution kernel that performs feature compression, extracting either the most salient features (max_pooling) or the more general features (average_pooling).
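The pooling operations can be illustrated on a small encoded comment of shape seq_length × embedding_size. The sketch pools over the time steps, which is one common reading of the description. The source does not spell out how the three rows of the comment feature compression vector are composed, so stacking the last time step, the max-pooled row, and the average-pooled row is purely an assumption of this sketch; all function names are hypothetical.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]  # shape: seq_length x embedding_size


def max_pooling(encoded: Matrix) -> List[float]:
    """Per-dimension maximum over all time steps (most salient feature)."""
    return [max(column) for column in zip(*encoded)]


def average_pooling(encoded: Matrix) -> List[float]:
    """Per-dimension mean over all time steps (more general feature)."""
    return [sum(column) / len(column) for column in zip(*encoded)]


def compress_comment(encoded: Matrix) -> List[List[float]]:
    """Stack three 1 x embedding_size rows (this composition is assumed)."""
    last_step = list(encoded[-1])
    return [last_step, max_pooling(encoded), average_pooling(encoded)]


# Toy encoder output: 3 time steps, embedding_size = 2 (made-up values).
encoded = [[1.0, 4.0], [3.0, 2.0], [2.0, 0.0]]
compressed = compress_comment(encoded)
print(compressed)   # [[2.0, 0.0], [3.0, 4.0], [2.0, 2.0]]
```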
Specifically, in step e, performing the convolution operation on the weighted text vector and the comment feature compression vector to complete the final comment classification includes the following steps: concatenating the comment feature compression vector (3 × embedding_size) and the weighted text vector (1 × embedding_size) into a vector of dimension 4 × embedding_size; performing convolution operations on the concatenated feature vector using different convolution kernels, and concatenating the convolution results to serve as the input of a fully connected layer; and completing the final classification with the fully connected layer, which receives that input.
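The decision step can be sketched end to end in plain Python: stack the four rows, slide kernels of different heights over them, concatenate the results, and apply a fully connected layer. Everything concrete here is an assumption made for illustration: the kernel shapes, the weights, the three-class output, and the final softmax are not specified in the source, and a trained model would learn the kernel and layer parameters.

```python
import math
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def conv_rows(features: Matrix, kernel: Matrix) -> List[float]:
    """Slide a full-width kernel down the rows of `features` (no padding).

    As in most deep learning libraries, this is a cross-correlation.
    """
    height = len(kernel)
    return [
        sum(kernel[i][k] * features[top + i][k]
            for i in range(height)
            for k in range(len(kernel[i])))
        for top in range(len(features) - height + 1)
    ]


def softmax(values: Sequence[float]) -> List[float]:
    peak = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def classify(comment_rows: Matrix, body_row: Sequence[float],
             kernels: Sequence[Matrix], fc_weights: Matrix,
             fc_bias: Sequence[float]) -> List[float]:
    """Concatenate, convolve with several kernels, then apply one dense layer."""
    stacked = [list(row) for row in comment_rows] + [list(body_row)]  # 4 x d
    conv_out: List[float] = []
    for kernel in kernels:
        conv_out.extend(conv_rows(stacked, kernel))   # splice the conv results
    logits = [sum(w * x for w, x in zip(row, conv_out)) + b
              for row, b in zip(fc_weights, fc_bias)]
    return softmax(logits)   # softmax output is an assumption


# Toy example with embedding_size = 2 and three classes (all values made up).
comment_rows = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
body_row = [7.0, 8.0]
kernels = [[[1.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]   # kernel heights 1 and 2
fc_weights = [[0.01] * 7, [0.0] * 7, [-0.01] * 7]     # 4 + 3 = 7 conv outputs
fc_bias = [0.0, 0.0, 0.0]
probs = classify(comment_rows, body_row, kernels, fc_weights, fc_bias)
print(probs)
```

Using kernels of several heights lets the layer combine information from one row at a time as well as from adjacent rows, such as a comment row and the weighted text row.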
CNN and convolution: the convolutional neural network is a common feature extractor, and the specific feature extraction purpose is mainly achieved through different convolutional kernels.
The invention also provides a self-adaptive comment emotion analysis system for merging text information, which comprises:
A data source and scale determining unit for determining a data source and scale;
the data preprocessing unit is used for preprocessing data;
The feature vector extraction unit is used for extracting feature vectors according to the preprocessed data;
The association degree analysis unit is used for carrying out association degree analysis on the extracted feature vectors and obtaining weighted text vectors;
and the decision unit is used for carrying out convolution operation on the weighted text vector and the comment feature compression vector to finish final comment classification.
Specifically, the feature vector extraction unit is configured to perform feature vector extraction according to the preprocessed data, and includes:
Vectorizing the preprocessed comment text to obtain the sentence vector representation corresponding to the comment text; further encoding that representation and extracting features from it to obtain the feature vector of the comment text;
Vectorizing the compressed body text to obtain the sentence vector representations corresponding to the body text; further encoding those representations and extracting features from them to obtain the feature vectors of the body text.
Specifically, the association degree analysis unit is configured to perform association analysis on the extracted feature vectors and obtain the weighted text vector, which includes:
Calculating the relevance $r_{ij}$ between each sentence feature vector of the compressed body text and the comment feature vector: $r_{ij} = c_i \cdot s_j$, wherein $c_i$ represents the $i$-th comment feature vector and $s_j$ represents the feature vector of the $j$-th body text sentence;
Calculating the relevance $R_{ij}$ of the $i$-th comment feature vector to each sentence $j$ in the body text, as the softmax normalization of $r_{ij}$ over the body text sentences: $R_{ij} = \frac{\exp(r_{ij})}{\sum_{k} \exp(r_{ik})}$;
Calculating the weighted text vector $V_i$: $V_i = \sum_j R_{ij} \times s_j$.
Specifically, the decision unit receives the text vector (1×embedding_size) weighted by the relevance analysis unit and the comment vector (3×embedding_size) in the second module, and splices the comment feature compression vector and the weighted text vector to form a vector with a dimension of 4×embedding_size.
During comment emotion analysis, the invention introduces body text information by means of deep learning, which avoids the manual supervision needed when using LDA and provides some ability to discover and extract features from unseen text types.
When body text information is introduced, the relevance-weighted feature vector of the body text is obtained by calculating the relevance between each body text sentence and the comment, so that the model can adaptively extract the body text features most relevant to each comment. The method therefore extracts features well from body text that contains multiple fine-grained topics.
Although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (7)
1. A self-adaptive comment emotion analysis method for merging body text information, characterized by comprising the following steps:
step a, determining the source and the scale of data, wherein the data comprises body text and comment text;
step b, preprocessing the data;
step c, extracting feature vectors according to the preprocessed data;
Step d, performing association analysis on the extracted feature vectors to obtain weighted text vectors;
The association analysis in the step d to obtain the weighted text vector specifically comprises the following steps:
Step d1, calculating the relevance $r_{ij}$ between each sentence feature vector of the compressed body text and the comment feature vector: $r_{ij} = c_i \cdot s_j$, wherein $c_i$ represents the $i$-th comment feature vector and $s_j$ represents the feature vector of the $j$-th body text sentence;
Step d2, calculating the relevance $R_{ij}$ of the $i$-th comment feature vector to each sentence $j$ in the body text: $R_{ij} = \frac{\exp(r_{ij})}{\sum_{k} \exp(r_{ik})}$;
Step d3, calculating the weighted text vector $V_i$: $V_i = \sum_j R_{ij} \times s_j$;
Step e, performing a convolution operation on the weighted text vector and the comment feature compression vector to complete the final comment classification.
2. The adaptive comment emotion analysis method for merging body text information according to claim 1, wherein the preprocessing of the data in the step b specifically includes the steps of body text and comment text stop word filtering and body text length compression.
3. The adaptive comment emotion analysis method for merging body text information according to claim 1, wherein the feature vector extraction in step c according to the preprocessed data includes a step of extracting feature vectors for body text and comment text, respectively.
4. The adaptive comment emotion analysis method for merging body text information according to claim 3, characterized in that extracting feature vectors of said comment text and said body text specifically comprises the steps of:
Step c1, carrying out data vectorization on the preprocessed comment text to obtain sentence vector characterization corresponding to the comment text;
Step c2, further coding and extracting features of sentence vector characterizations corresponding to the comment text to obtain feature vectors of the comment text;
Step c3, vectorizing the compressed body text to obtain the sentence vector representations corresponding to the body text;
Step c4, further encoding the sentence vector representations corresponding to the body text and extracting features from them to obtain the feature vectors of the body text.
5. The adaptive comment emotion analysis method for merging body text information according to claim 1, wherein in step e, the comment feature compression vector is obtained by applying max_pooling and average_pooling in sequence to the comment text feature vector, where max_pooling and average_pooling each represent a convolution kernel.
6. An adaptive comment emotion analysis system for merging text information of a body, the system comprising:
The data source and scale determining unit is used for determining the source and scale of data, wherein the data comprises a body text and a comment text;
the data preprocessing unit is used for preprocessing data;
The feature vector extraction unit is used for extracting feature vectors according to the preprocessed data;
The association degree analysis unit is used for carrying out association degree analysis on the extracted feature vectors and obtaining weighted text vectors;
The association degree analysis unit is used for performing association degree analysis on the extracted feature vectors and obtaining weighted text vectors, and comprises the following steps:
Calculating the relevance r_ij between each sentence feature vector of the compressed body text and the comment feature vector (the formula is not reproduced in this text), wherein c_i represents the i-th comment feature vector and s_j represents the feature vector of the j-th body sentence;
Calculating the relevance R_ij of the i-th comment feature vector to each sentence j in the body text (the formula is not reproduced in this text);
Calculating the weighted text vector V_i (the formula is not reproduced in this text);
The decision unit is used for performing a convolution operation on the weighted text vector and the comment feature compression vector to complete the final comment classification.
7. The adaptive comment emotion analysis system of claim 6, wherein the feature vector extraction unit is configured to perform feature vector extraction based on the preprocessed data, and includes:
Carrying out data vectorization on the preprocessed comment text to obtain sentence vector characterization corresponding to the comment text; further coding and extracting features of sentence vector characterizations corresponding to the comment text to obtain feature vectors of the comment text;
Carrying out data vectorization on the compressed body text to obtain the sentence vector characterization corresponding to the body text; and further encoding and extracting features of the sentence vector characterization corresponding to the body text to obtain the feature vectors of the body text.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011506610.0A CN112434516B (en) | 2020-12-18 | 2020-12-18 | Self-adaptive comment emotion analysis system and method for merging text information |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112434516A (en) | 2021-03-02 |
CN112434516B (en) | 2024-04-26 |
Family
ID=74696783
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011506610.0A (Active), published as CN112434516B (en) | 2020-12-18 | 2020-12-18 | Self-adaptive comment emotion analysis system and method for merging text information |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112434516B (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2017090051A1 (en) * | 2015-11-27 | 2017-06-01 | Giridhari Devanathan | A method for text classification and feature selection using class vectors and the system thereof |
CN108363753A (en) * | 2018-01-30 | 2018-08-03 | 南京邮电大学 | Comment text sentiment classification model training and sentiment classification method, device and equipment |
CN109033433A (en) * | 2018-08-13 | 2018-12-18 | 中国地质大学(武汉) | Comment data sentiment classification method and system based on convolutional neural networks |
CN109145112A (en) * | 2018-08-06 | 2019-01-04 | 北京航空航天大学 | Commodity comment classification method based on global information attention mechanism |
CN109284506A (en) * | 2018-11-29 | 2019-01-29 | 重庆邮电大学 | User comment sentiment analysis system and method based on attention convolutional neural networks |
CN109977413A (en) * | 2019-03-29 | 2019-07-05 | 南京邮电大学 | Sentiment analysis method based on improved CNN-LDA |
CN110008311A (en) * | 2019-04-04 | 2019-07-12 | 北京邮电大学 | Product information security risk monitoring method based on semantic analysis |
CN111177386A (en) * | 2019-12-27 | 2020-05-19 | 安徽商信政通信息技术股份有限公司 | Proposal classification method and system |
CN111259140A (en) * | 2020-01-13 | 2020-06-09 | 长沙理工大学 | False comment detection method based on LSTM multi-entity feature fusion |
CN111310476A (en) * | 2020-02-21 | 2020-06-19 | 山东大学 | Public opinion monitoring method and system using aspect-based emotion analysis method |
CN111914086A (en) * | 2020-07-07 | 2020-11-10 | 广西科技大学 | Method and system for analyzing mobile phone comments based on LSTM neural network |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11700420B2 (en) * | 2010-06-07 | 2023-07-11 | Affectiva, Inc. | Media manipulation using cognitive state metric analysis |
Non-Patent Citations (4)
Title |
---|
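A Multi-model Fusion Framework based on Deep Learning for Sentiment Classification; Fen Yang et al.; 《2018 IEEE 22nd International Conference on Computer Supported Cooperative Work in Design (CSCWD)》; 20180916; 1-5 * |
Aspect-level sentiment analysis of online comments based on WMAB and CNN; 沈远星; 《China Master's Theses Full-text Database, Information Science and Technology》; 20200615 (No. 6); I138-1319 * |
Text sentiment polarity classification based on dynamic pooling and attention; 杜梦豪 et al.; 《Computer Engineering and Design》; 20190416; Vol. 40 (No. 4); 1126-1132 * |
Short-text sentiment analysis based on a parallel hybrid neural network model; 陈洁 et al.; 《Journal of Computer Applications》; 20190810; Vol. 39 (No. 8); 2192-2197 * |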
Also Published As
Publication number | Publication date |
---|---|
CN112434516A (en) | 2021-03-02 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN110096570B (en) | Intention identification method and device applied to intelligent customer service robot | |
CN107609009B (en) | Text emotion analysis method and device, storage medium and computer equipment | |
US20220147836A1 (en) | Method and device for text-enhanced knowledge graph joint representation learning | |
CN111738003B (en) | Named entity recognition model training method, named entity recognition method and medium | |
CN111414461B (en) | Intelligent question-answering method and system fusing knowledge base and user modeling | |
CN108536754A (en) | Electronic health record entity relation extraction method based on BLSTM and attention mechanism | |
CN112395393B (en) | Remote supervision relation extraction method based on multitask and multiple examples | |
CN110362819B (en) | Text emotion analysis method based on convolutional neural network | |
CN110717324B (en) | Judgment document answer information extraction method, device, extractor, medium and equipment | |
CN111462752B (en) | Customer intention recognition method based on attention mechanism, feature embedding and Bi-LSTM | |
CN113657115B (en) | Multi-mode Mongolian emotion analysis method based on ironic recognition and fine granularity feature fusion | |
CN110851594A (en) | Text classification method and device based on multi-channel deep learning model | |
CN112036705A (en) | Quality inspection result data acquisition method, device and equipment | |
CN113593661A (en) | Clinical term standardization method, device, electronic equipment and storage medium | |
CN111653275A (en) | Method and device for constructing voice recognition model based on LSTM-CTC tail convolution and voice recognition method | |
CN115952292B (en) | Multi-label classification method, apparatus and computer readable medium | |
CN113051932A (en) | Method for detecting category of network media event of semantic and knowledge extension topic model | |
CN113886562A (en) | AI resume screening method, system, equipment and storage medium | |
CN117272142A (en) | Log abnormality detection method and system and electronic equipment | |
CN114722798A (en) | Ironic recognition model based on convolutional neural network and attention system | |
CN112015903B (en) | Question duplication judging method and device, storage medium and computer equipment | |
CN113535928A (en) | Service discovery method and system of long-term and short-term memory network based on attention mechanism | |
CN113486143A (en) | User portrait generation method based on multi-level text representation and model fusion | |
CN115905187B (en) | Intelligent proposition system oriented to cloud computing engineering technician authentication | |
CN113051886A (en) | Test question duplicate checking method and device, storage medium and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||