CN114911942A - Interpretability text emotion analysis method, system and equipment based on confidence coefficient - Google Patents
- Publication number
- CN114911942A (application CN202210607887.5A)
- Authority
- CN
- China
- Prior art keywords
- confidence
- data
- deep learning
- text
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F16/353 — Information retrieval of unstructured textual data; clustering; classification into predefined classes
- G06F18/2411 — Classification techniques based on the proximity to a decision surface, e.g. support vector machines
- G06F18/2415 — Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio
- G06F40/216 — Natural language analysis; parsing using statistical methods
- G06F40/289 — Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/30 — Semantic analysis
- G06N3/045 — Neural networks; combinations of networks
- G06N3/08 — Neural networks; learning methods
- Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a confidence-based interpretable text emotion analysis method, system and device, comprising the following steps: first, data preprocessing is performed on the text data to be analyzed; the processed data are then fed into a deep learning network for classification; next, a confidence segmenter is constructed, a confidence function is defined and a confidence threshold is set, dividing the deep learning network's classification results into a strong-confidence part and a weak-confidence part; according to the strength of confidence, the strong-confidence data keep the deep learning network's classification, while the weak-confidence data are reclassified by an enhanced network; finally, the classification results of the two networks are merged and the final classification result is output. The invention constructs a new network model framework, RTS-CF, which uses RAKE to extract longer keywords quickly, simply and efficiently; the confidence function divides the test set into a strong-confidence part and a weak-confidence part, and the weak-confidence data are reclassified by the enhanced network. This integration method of optimizing a neural network with an enhanced network is strongly interpretable and improves overall classification performance.
Description
Technical Field
The invention belongs to the technical field of text data mining and relates to a text emotion analysis method, system and device, in particular to a strongly interpretable, confidence-based text emotion analysis method, system and device.
Background
With the development of Internet technology and the rise of deep learning, research on text sentiment analysis has become increasingly popular. Such research has practical significance not only for scientific researchers but also for daily life: a government department can guide the development of public opinion by analyzing the sentiment tendency of online discourse, and an e-commerce merchant can learn user preferences by analyzing the sentiment tendency of user reviews. Through deep mining and analysis of texts in various fields, users' interests, hobbies and emotional biases can be better understood.
Currently, common text emotion analysis methods include dictionary-based emotion classification, emotion analysis based on traditional machine learning, and emotion analysis based on deep learning. Deep neural network models achieve remarkable results in emotion classification. Although classifiers based on traditional machine learning are slightly inferior to deep learning methods in classification accuracy, they have advantages in interpretability and time complexity. Integrating a deep learning method with a traditional machine learning method can improve overall classification performance while remaining strongly interpretable, enabling the grasp and understanding of personal emotional tendency; such an analysis and modeling method is rarely used at present and is worth exploring. Moreover, RAKE can quickly extract longer keywords such as technical terms; it is simple, efficient, and effective for text classification.
Disclosure of Invention
The invention aims to provide a strongly interpretable, confidence-based text sentiment analysis method, system and device, which use an integration method with an enhanced model to optimize a deep neural network and improve overall text classification performance.
The method adopts the technical scheme that: a text emotion analysis method based on interpretability of confidence coefficient comprises the following steps:
Step 1: preprocessing the text data to be analyzed;
Step 2: inputting the preprocessed data into a deep learning network for classification;
Step 3: constructing a confidence segmenter, defining a confidence function, setting a confidence threshold, and dividing the deep learning network's classification results into a strong-confidence part and a weak-confidence part;
the confidence function is defined in terms of a preset value d, the mean function mean(·), and the softmax-layer outputs y_1, y_2 of the deep learning network, which are taken respectively as the scores of the strong-confidence and weak-confidence parts, where 0 < y_i < 1 and Σ y_i = 1; each y_i = exp(z_i) / Σ_{j=1}^{n} exp(z_j), where z_i is the output value of the i-th node and the input to softmax, n is the number of output nodes, i.e. the number of classes, and the denominator is the sum over all prediction results;
Step 4: according to the strength of confidence, keeping the deep learning network's classification for strong-confidence data, and reclassifying weak-confidence data with the enhanced network;
Step 5: merging the results of the deep learning network and the enhanced network, and outputting the final classification result.
The technical scheme adopted by the system of the invention is as follows: a text sentiment analysis system based on interpretability of confidence coefficient comprises the following modules:
the module 1 is used for preprocessing data aiming at pre-analysis text data;
the module 2 is used for inputting the preprocessed data into a deep learning network for classification;
the module 3 is used for constructing a confidence segmenter, defining a confidence function, setting a confidence threshold, and dividing the deep learning network's classification results into a strong-confidence part and a weak-confidence part;
the confidence function is defined in terms of a preset value d, the mean function mean(·), and the softmax-layer outputs y_1, y_2 of the deep learning network, which can be taken respectively as the scores of the strong-confidence and weak-confidence parts, where 0 < y_i < 1 and Σ y_i = 1; each y_i = exp(z_i) / Σ_{j=1}^{n} exp(z_j), where z_i is the output value of the i-th node and the input to softmax, n is the number of output nodes, i.e. the number of classes, and the denominator is the sum over all prediction results;
the module 4 is used for keeping the deep learning network's classification for strong-confidence data and reclassifying weak-confidence data with the enhanced network, according to the strength of confidence;
and the module 5 is used for merging the results of the deep learning network and the enhanced network and outputting the final classification result.
The technical scheme adopted by the equipment of the invention is as follows: a text sentiment analysis device based on interpretability of confidence, comprising:
one or more processors;
a storage device storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the confidence-based interpretable text sentiment analysis method.
The invention comprises the following technical effects:
(1) The deep learning model R-TextCNN, trained on the whole training set, achieves a remarkable effect on emotion classification.
(2) Extracting keywords with RAKE can capture longer keywords such as technical terms and achieves good results.
(3) Through the confidence function, the test set can be divided into a strong-confidence part and a weak-confidence part, and the traditional machine learning model reclassifies the weak-confidence data.
(4) Parameters are tuned automatically with GridSearchCV to obtain optimized parameters.
(5) The integration method of optimizing the neural network with the enhanced network model is strongly interpretable and can improve overall classification performance.
Drawings
FIG. 1 is a flow chart of a method of an embodiment of the present invention;
FIG. 2 is a diagram of a deep learning network architecture in accordance with an embodiment of the present invention;
FIG. 3 is a diagram of the calculation of the softmax function of an example of the present invention;
FIG. 4 is a diagram of an enhanced network architecture of an embodiment of the present invention;
FIG. 5 is a hyperplane view of an enhanced network of an embodiment of the present invention;
fig. 6 is a diagram of an RTS-CF network architecture according to an example of the present invention.
Detailed Description
To facilitate understanding and implementation by those of ordinary skill in the art, the invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the embodiments described herein are merely illustrative and explanatory of the invention and do not restrict it.
Educational text mining is a non-negligible area of text mining. Mining a learner's latent feelings and emotional tendencies from simple text can provide reference and basis for personalized teaching, and can help a teacher quickly grasp a learner's situation, including learning attitude and overall progress, facilitating timely answers to questions and timely feedback. As a research hotspot in educational text mining, computing and analyzing learners' emotional tendencies from text not only helps to understand and analyze their latent psychological changes, but also greatly helps diversify and enrich teaching resources and modes. Many online platforms serve as important teaching aids that allow learners to freely publish personal views and subjective feelings and to interact socially with others; text is among the simplest and most common ways of interacting. From the viewpoints a learner publishes, their emotional tendency can be analyzed and their overall learning state known in time, making teacher feedback and intervention possible.
Referring to fig. 1, the text sentiment analysis method based on interpretability of confidence level provided by the present invention includes the following steps:
Step 1: preprocessing the text data to be analyzed;
in this embodiment, the specific implementation of step 1 includes the following substeps:
Step 1.1: organizing the collected text data into the required data type and storing it in a txt file;
Step 1.2: reading the content of the text file and removing spaces and other useless symbols for subsequent use;
In this embodiment, for subsequent classification, the data are processed into a txt file; the text content is read, symbols other than Chinese characters and designated punctuation marks are removed, and the result is stored in a new txt file.
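The cleaning step above can be sketched as follows. This is a minimal sketch, not the patent's implementation: the exact set of "designated punctuation marks" is an assumption (here the full-width period, question mark, exclamation mark and comma), as are the function and file names.

```python
import re

# Keep CJK characters plus an assumed set of designated punctuation marks
# (full-width 。？！，); everything else, including spaces, is removed.
KEEP = re.compile(r"[^\u4e00-\u9fff\u3002\uff1f\uff01\uff0c]")

def clean_text(raw: str) -> str:
    """Remove spaces and symbols other than Chinese text and designated punctuation."""
    return KEEP.sub("", raw)

def preprocess_file(src_path: str, dst_path: str) -> None:
    """Read the collected text, clean each line, and store it in a new txt file."""
    with open(src_path, encoding="utf-8") as f:
        lines = [clean_text(line) for line in f]
    with open(dst_path, "w", encoding="utf-8") as f:
        f.write("\n".join(line for line in lines if line))
```

A line such as "你好, world! 世界。" becomes "你好世界。" — ASCII symbols, Latin letters and spaces are stripped while Chinese text and the kept punctuation survive.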
Step 2: inputting the preprocessed data into a deep learning network for classification;
referring to fig. 2, the deep learning network R-TextCNN of the present embodiment includes a RAKE extraction keyword layer, a keyword embedding layer, a convolution layer, a maximum pooling layer, and a fully connected softmax layer;
the RAKE keyword-extraction layer of the present embodiment is a method for quickly and automatically extracting keywords. The text is divided into sentences at designated punctuation marks such as periods, question marks, exclamation marks and commas; within each clause, stop words act as separators dividing the sentence into phrases, which are the candidate words to be ranked; each phrase consists of several words, each word w is assigned a score score(w) = deg(w) / freq(w), and the score of a phrase is obtained by accumulating the scores of its words, where deg is the degree of each word, meaning the number of co-occurrences with all words of the text appearing in the candidate keywords, and freq is the word frequency of each word; the extracted candidate keywords are sorted from largest to smallest score; finally, the top-ranked phrases are output as keywords;
the keyword embedding layer of the embodiment converts the extracted keywords into an embedding representation. The n words mapped to word vectors are concatenated into a sentence. A sentence of length n is represented as x_{1:n} = x_1 ⊕ x_2 ⊕ ... ⊕ x_n, where x_i ∈ R^k is the k-dimensional word vector corresponding to the i-th word in the sentence, ⊕ is the concatenation operation, and x_{i:i+j} denotes the concatenation of the words x_i, x_{i+1}, ..., x_{i+j};
the convolution layer of this embodiment applies a convolution kernel w of width d and height h to x_{i:i+h-1} (h words) and obtains the corresponding feature c_i through the activation function, so the convolution operation is written c_i = f(w · x_{i:i+h-1} + b), where w is the initialized weight, b is the bias term, and h is the filter window length; after the convolution operation, an (n−h+1)-dimensional vector c = [c_1, c_2, ..., c_i, ..., c_{n−h+1}] is obtained, where n is the number of words in each sentence;
in the max-pooling layer of this embodiment, the maximum of each one-dimensional vector obtained after convolution is taken, and the maxima are spliced together as the layer's output: z = {z_1, z_2, z_3, ..., z_i, ..., z_m}, where z_i = max{c_i};
The fully connected softmax layer of this embodiment feeds z into the fully connected softmax layer and obtains the label probability distribution of a sentence, y_i = softmax(w_i · z), where y_i is the predicted score corresponding to label_i, w_i is the weight of the fully connected layer, and label_i is the i-th category label.
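The forward pass through the convolution, max-pooling and softmax layers can be sketched numerically. This is a one-filter toy sketch with random weights, assuming ReLU as the activation f — the patent does not name the activation function, and the real R-TextCNN uses many filters of several widths.

```python
import numpy as np

rng = np.random.default_rng(0)

def textcnn_forward(x, w, b, wf):
    """One-filter sketch of the layers above: convolution c_i = f(w·x_{i:i+h-1} + b),
    max-over-time pooling, then a fully connected softmax layer."""
    n, k = x.shape                 # n words, each a k-dim word vector
    h = w.shape[0]                 # filter window length
    c = np.array([max(0.0, float(np.sum(w * x[i:i + h]) + b))   # ReLU activation (assumed)
                  for i in range(n - h + 1)])                   # c has n-h+1 entries
    z = np.array([c.max()])        # max-pooling: keep the largest feature
    logits = wf @ z                # fully connected layer
    e = np.exp(logits - logits.max())   # numerically stable softmax
    return e / e.sum()

x = rng.normal(size=(7, 5))    # a 7-word sentence with 5-dim word vectors
w = rng.normal(size=(3, 5))    # one convolution filter with h = 3
wf = rng.normal(size=(2, 1))   # fully connected weights for 2 emotion classes
y = textcnn_forward(x, w, 0.1, wf)
```

The output y is a valid two-class probability distribution, matching the constraints 0 < y_i < 1 and Σ y_i = 1 stated for the softmax layer.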
The deep learning network adopted by the embodiment is a trained deep learning network; the training process comprises the following substeps:
(1) collecting a training text data set, and dividing the texts and labels into a training set and a test set according to a sample proportion;
This embodiment divides the data set into a training set and a test set through the train_test_split() function, setting the sample proportion test_size. For example, with 100 samples and test_size = 0.2, the training set holds 80% (80 samples) and the test set 20% (20 samples).
(2) creating an embedding matrix, obtaining embedding vectors through the embedding index, assigning them to the embedding matrix, and loading the pre-trained word embeddings into the embedding layer;
(3) training a deep learning network by using a training set;
(4) after training, saving the deep learning network for predicting and classifying the test set.
Step 3: constructing a confidence segmenter, defining a confidence function, setting a confidence threshold, and dividing the deep learning network's classification results into a strong-confidence part and a weak-confidence part;
the confidence function used in the present embodiment is defined in terms of d, the number of trainings of the deep learning network: a minimum interval of iteration counts is preset, and starting from the iteration count at which the loss function of the network's first training stabilizes, one model is trained for testing each time an iteration interval is added; for example, if the minimum interval is 5, the reference iteration count is 50 and the training count d is 3, the deep learning network is trained and tested at 55, 60 and 65 iterations respectively. mean(·) is the mean function; y_1, y_2 are the output values of the deep learning network's softmax layer, which can be taken respectively as the scores of the strong-confidence and weak-confidence parts, where 0 < y_i < 1 and Σ y_i = 1; each y_i = exp(z_i) / Σ_{j=1}^{n} exp(z_j), where z_i is the output value of the i-th node and the input to softmax, n is the number of output nodes, i.e. the number of classes, and the denominator is the sum over all prediction results;
referring to fig. 3, the softmax function adopted in the present embodiment, also called the normalized exponential function, is the generalization of the binary sigmoid function to multiple classes, intended to express the multi-class result in probability form. The calculation comprises the following sub-steps:
(1) converting the prediction results into non-negative numbers: the model's prediction results z = {z_1, z_2, ..., z_i, ..., z_n} are passed through the exponential function f(x) = exp(x), guaranteeing the non-negativity of the probabilities.
(2) making the probabilities of the prediction results sum to 1: to ensure the probabilities sum to 1, the converted results are normalized, dividing each converted result exp(z_i) by the sum of all converted results, Σ_{j=1}^{n} exp(z_j), to obtain the approximate probability y_i = exp(z_i) / Σ_{j=1}^{n} exp(z_j).
In the embodiment, after the softmax layer yields the two class scores, a confidence function is defined that divides the data into two kinds according to the strength of confidence: strong-confidence data, where the two class scores differ greatly and classification works well; and weak-confidence data, where the two class scores differ little and classification is poor.
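The splitting step can be sketched as follows. The exact confidence function is not reproduced legibly in the source, so taking the confidence score as mean(|y_1 − y_2|) over the d trained models is an assumed reading consistent with the description above (a large score gap means strong confidence); the shapes and threshold are illustrative.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))   # subtract max for numerical stability
    return e / e.sum()          # probabilities: 0 < y_i < 1, sum = 1

def split_by_confidence(logit_runs, threshold):
    """Split samples into strong- and weak-confidence index lists.

    `logit_runs` has shape (d, num_samples, 2): softmax inputs from the d
    trained models. The score mean(|y1 - y2|) over the d runs is an ASSUMED
    reading of the patent's confidence function.
    """
    probs = np.apply_along_axis(softmax, 2, np.asarray(logit_runs, dtype=float))
    gap = np.abs(probs[..., 0] - probs[..., 1]).mean(axis=0)   # mean over d runs
    strong = [i for i, g in enumerate(gap) if g >= threshold]
    weak = [i for i, g in enumerate(gap) if g < threshold]
    return strong, weak
```

A sample with logits (5, −5) has near-certain softmax scores and lands in the strong-confidence list, while logits (0.1, 0.0) give nearly equal scores and land in the weak-confidence list.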
Step 4: according to the strength of confidence, keeping the deep learning network's classification for strong-confidence data, and reclassifying weak-confidence data with the enhanced network;
referring to fig. 4, classification by the enhanced network in this embodiment comprises setting a tuning starting point, GridSearchCV parameter search, SVM training, and classification output;
for the tuning starting point, the penalty parameter C and the kernel parameter gamma are set between 0.1 and 100 and multiplied by 0.1 or 10 at each step, according to the performance of the enhanced network model; after the approximate range is determined, the search interval is refined;
GridSearchCV in this embodiment tries every possibility in the refined search interval by loop traversal, and the best-performing parameters are the final result. Since the final performance depends heavily on how the initial data are divided, cross-validation is adopted to reduce chance;
after parameter tuning, SVC in sklearn.svm is called to train the enhanced network model, with the previously tuned parameters set during training, finally yielding a trained enhanced network model;
the classification result of this embodiment is obtained by loading the trained enhanced network model and using the trained SVM to predict the classes of the weak-confidence data.
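The tuning-and-training loop above can be sketched with scikit-learn, which the embodiment names. A minimal sketch: the synthetic data stand in for the weak-confidence features, and only the coarse ×10 grid pass is shown (the refined second pass is omitted).

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Coarse grid on C and gamma over 0.1..100 in x10 steps, per the tuning scheme.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.1, 1, 10, 100]}

X, y = make_classification(n_samples=120, n_features=8, random_state=0)
search = GridSearchCV(SVC(), param_grid, cv=5)   # cross-validation reduces chance
search.fit(X, y)
best_svm = search.best_estimator_                # SVC trained with the best C, gamma
```

best_svm can then be saved and loaded to predict the classes of the weak-confidence data.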
Please refer to fig. 5, a hyperplane view of the enhanced network of the present embodiment;
in this embodiment, a maximum-margin hyperplane is sought in the feature space such that the distance from the samples to the plane is maximized (the distance from a sample set to the plane is the distance from the nearest sample point to the hyperplane). The learning objective is to solve for the parameters α, determine the hyperplane, and maximize this distance. The parameters α are solved with the SMO algorithm: two α values are selected for optimization in each loop; once a pair of α outside the interval boundary and not yet processed, or not on the boundary, is found, one is increased while the other is decreased, until all α_i satisfy the KKT conditions and the constraints of the optimization problem.
The classification implementation process is further explained below.
Given a sample set D = {(x_1, y_1), (x_2, y_2), ..., (x_m, y_m)}, y_i ∈ {−1, +1}, where x_i are the attributes and y_i is the class label, the purpose is to find an optimal hyperplane (with the strongest generalization ability) separating the samples of different classes.
The object to be trained is the hyperplane w_s^T x + b_s = 0, where w_s is the normal vector and b_s is the displacement term.
The sample points for which the equality in the margin constraints holds are called "support vectors", and the sum of the distances from two heterogeneous support vectors to the hyperplane is γ = 2 / ||w_s||, which is called the "margin".
Finding the hyperplane with "maximum margin" means maximizing 2 ||w_s||^{-1} subject to y_i (w_s^T x_i + b_s) ≥ 1, i = 1, ..., m; since maximizing ||w_s||^{-1} is equivalent to minimizing ||w_s||^2, this is rewritten as min_{w_s, b_s} (1/2) ||w_s||^2 subject to y_i (w_s^T x_i + b_s) ≥ 1 — the "basic model" of the SVM.
Solving this equation yields the model f(x) = w_s^T x + b_s.
Adding a Lagrange multiplier α_i (α_i ≥ 0) to each constraint gives L(w_s, b_s, α) = (1/2) ||w_s||^2 + Σ_{i=1}^{m} α_i (1 − y_i (w_s^T x_i + b_s)).
Setting the partial derivatives of L with respect to w_s and b_s to 0 yields w_s = Σ_{i=1}^{m} α_i y_i x_i and Σ_{i=1}^{m} α_i y_i = 0.
Substituting these back gives the dual problem of the SVM basic model: max_α Σ_{i=1}^{m} α_i − (1/2) Σ_{i=1}^{m} Σ_{j=1}^{m} α_i α_j y_i y_j x_i^T x_j, subject to Σ_{i=1}^{m} α_i y_i = 0 and α_i ≥ 0.
Computing w_s (i.e. solving for α) and b_s yields the model f(x) = Σ_{i=1}^{m} α_i y_i x_i^T x + b_s.
The above process must satisfy the KKT conditions.
The SMO algorithm is used to obtain α, and b_s is obtained using the property of the support vectors.
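The identities from the derivation above can be checked numerically with scikit-learn's SVC, whose libsvm backend solves the same dual with SMO; the toy points are illustrative assumptions.

```python
import numpy as np
from sklearn.svm import SVC

# A small linearly separable two-class set (illustrative toy data).
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0],
              [3.0, 3.0], [4.0, 2.0], [5.0, 5.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

clf = SVC(kernel="linear", C=10.0).fit(X, y)

# dual_coef_ holds alpha_i * y_i for each support vector, so the normal
# vector recovered from the dual, w_s = sum_i alpha_i y_i x_i, must match
# the fitted coef_, and the entries must sum to 0 (the constraint
# sum_i alpha_i y_i = 0 from setting dL/db_s = 0).
w_from_dual = clf.dual_coef_ @ clf.support_vectors_
```

Checking w_from_dual against clf.coef_ and the zero-sum constraint confirms the two formulas obtained from the partial derivatives of L.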
In this embodiment, the weak-confidence data are reclassified using a traditional machine learning method as the enhanced model; traditional machine learning methods are characterized by strong interpretability.
Step 5: merging the results of the deep learning network and the enhanced network, and outputting the final classification result.
Please refer to fig. 6, which shows a structure diagram of the RTS-CF network;
in this embodiment, first, the data type and content of the text data are processed; second, RAKE extracts keywords, which pass in turn through the keyword embedding layer, the convolution layer, the max-pooling layer and the fully connected softmax layer for classification; then the data enter the confidence segmenter, where the confidence function divides the results into a strong-confidence part and a weak-confidence part, and the corresponding texts and labels are found through indices, yielding a strong-confidence data list and a weak-confidence data list; next, the strong-confidence data are classified by the deep learning network and the weak-confidence data by the enhanced network; finally, the classification results of the two networks are merged through the concatenate() function to obtain the final prediction result.
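The final merge of the two networks' outputs can be sketched as follows. A minimal sketch under the assumption that each sample's index from the confidence split is retained, so the merged predictions can be restored to the original order; the function name is hypothetical.

```python
import numpy as np

def merge_predictions(n_samples, strong_idx, strong_pred, weak_idx, weak_pred):
    """Recombine the deep network's strong-confidence predictions with the
    enhanced network's reclassified weak-confidence predictions, restoring
    the original sample order."""
    merged = np.empty(n_samples, dtype=int)
    merged[list(strong_idx)] = strong_pred   # labels kept from the deep network
    merged[list(weak_idx)] = weak_pred       # labels reclassified by the SVM
    return merged
```

For example, with samples 0 and 2 classified by the deep network and samples 1 and 3 reclassified by the enhanced network, the merged vector interleaves the two result lists back into positions 0..3.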
The method of the invention performs emotion classification on texts published by individuals. First, the data are loaded and preprocessed; the deep learning network model (existing models such as TextCNN or RNN may also be adopted) is trained with the whole training data and classifies the test data; a confidence segmenter is constructed and a confidence function defined, dividing the deep learning network model's classification results into a strong-confidence part and a weak-confidence part; according to the strength of confidence, the strong-confidence data keep the deep learning network model's classification, while the weak-confidence data are reclassified by the enhanced network model (existing models such as naive Bayes, or an SVM with naive Bayes features, may also be adopted), the enhanced model being a traditional machine learning model; finally, the results of the deep learning network model and the enhanced network model are merged and the final classification result is output. The invention can obtain the emotional tendency of texts published by an individual and learn the individual's topics of interest. It adopts an integration of deep learning and traditional machine learning methods, aiming to improve overall classification performance and to grasp and understand personal emotional tendency. Using RAKE to extract keywords is simple and efficient, can extract longer keywords such as technical terms, and, being an unsupervised method, requires no large amount of labeled data. Future work may try other effective confidence functions and apply the framework to other models to study its effectiveness and applicability.
It should be understood that the above description of the preferred embodiments is given for clarity and not for any purpose of limitation, and that various changes, substitutions and alterations can be made herein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (7)
1. A text emotion analysis method based on interpretability of confidence coefficient is characterized by comprising the following steps:
step 1: performing data preprocessing on the text data to be analyzed;
step 2: inputting the preprocessed data into a deep learning network for classification;
step 3: constructing a confidence splitter, defining a confidence function, setting a confidence threshold, and dividing the deep learning network classification results into a high-confidence part and a low-confidence part;
the confidence function compares the softmax outputs, where d is a preset value and mean(·) is a mean function; y_1 and y_2 represent output values of the softmax layer of the deep learning network and are taken as the strong-confidence and weak-confidence scores respectively, where y_i = e^(z_i) / (e^(z_1) + … + e^(z_n)); z_i is the output value of the i-th node and the input value of softmax; n is the number of output nodes, i.e. the number of classes to be distinguished; the denominator e^(z_1) + … + e^(z_n) represents the sum over all predicted results;
step 4: according to the strength of the confidence scores, classifying the strong-confidence data by the deep learning network, and reclassifying the weak-confidence data by the enhancement network;
step 5: combining the results of the deep learning network and the enhancement network, and outputting the final classification result.
2. The method for text sentiment analysis based on confidence interpretability according to claim 1, wherein: in the data preprocessing of step 1, the acquired text data are first arranged into the required data type and stored in a txt file; the content of the text file is then read, and spaces and other useless symbols are removed for subsequent use.
3. The method for text sentiment analysis based on confidence interpretability according to claim 1, wherein: the deep learning network R-TextCNN in the step 2 comprises a RAKE extraction keyword layer, a keyword embedding layer, a convolution layer, a maximum pooling layer and a fully-connected softmax layer;
the RAKE keyword extraction layer divides the text into several sentences using the specified punctuation marks; each sentence is further divided into several phrases using stop words as separators, and these phrases are the candidate words to be ranked; each phrase is composed of several words, each word is assigned a score wordScore = deg/freq, and the score of each phrase is obtained by summing the scores of its words, where deg refers to the co-occurrence degree of a word with all words of the candidate keywords in the text and freq refers to the word frequency of each word; the extracted candidate keywords are sorted from largest to smallest; finally, the top-ranked phrases are output as keywords;
the keyword embedding layer converts the extracted keywords into an embedding representation; the n words mapped to word vectors are connected into a sentence; a sentence of length n is represented as x_{1:n} = x_1 ⊕ x_2 ⊕ … ⊕ x_n, where x_i ∈ R^k is the k-dimensional word vector corresponding to the i-th word in the sentence, ⊕ is the connection operation, and x_{i:i+j} represents the connection of the words x_i, x_{i+1}, …, x_{i+j};
the convolution layer applies a convolution kernel w of width d and height h to x_{i:i+h-1}; after the convolution operation, an activation function is applied to obtain the corresponding feature c_i, so the convolution operation is written c_i = f(w · x_{i:i+h-1} + b), where f is the activation function, w is the initialized weight, b is a bias term, and h is the filter window length; after the convolution operation, an (n-h+1)-dimensional vector c is obtained: c = [c_1, c_2, …, c_i, …, c_{n-h+1}], where n is the number of words in each sentence;
the max-pooling layer takes the maximum value of each one-dimensional vector obtained after convolution and splices them together as the output value of the layer: z = {z_1, z_2, z_3, …, z_i, …, z_m}, where z_i = max{c_i};
4. The method for text sentiment analysis based on confidence interpretability according to claim 1, wherein: the reclassification by the enhancement network in step 4 comprises setting a parameter-tuning starting point, GridSearchCV, training the SVM, and producing the classification result;
to set the tuning starting point, the penalty parameter C and the kernel parameter gamma are first set between 0.1 and 100, multiplying by 0.1 or 10 each time as the step according to the performance of the enhancement network model; after the approximate range is determined, the search interval is refined;
within the refined search interval, GridSearchCV tries every possible parameter combination by exhaustive traversal, and the best-performing parameters are taken as the final result;
after parameter tuning, SVC in sklearn.svm is called to train the enhancement network model, with the parameters obtained from the preceding tuning set during training, finally yielding a trained enhancement network model;
for the classification result, the trained enhancement network model is loaded, and the weak-confidence data are predicted and classified using the trained SVM to obtain the classification result.
5. The method for text emotion analysis based on interpretability of confidence according to any one of claims 1 to 4, wherein: in step 5, the classification results of the two networks are merged through a concatenate() function to obtain the final prediction result.
6. A text sentiment analysis system based on interpretability of confidence coefficient is characterized by comprising the following modules:
the module 1 is used for performing data preprocessing on the text data to be analyzed;
the module 2 is used for inputting the preprocessed data into a deep learning network for classification;
the module 3 is used for constructing a confidence splitter, defining a confidence function, setting a confidence threshold, and dividing the deep learning network classification results into a high-confidence part and a low-confidence part;
the confidence function compares the softmax outputs, where d is a preset value and mean(·) is a mean function; y_1 and y_2 represent output values of the softmax layer of the deep learning network and are taken as the strong-confidence and weak-confidence scores respectively, where y_i = e^(z_i) / (e^(z_1) + … + e^(z_n)); z_i is the output value of the i-th node and the input value of softmax; n is the number of output nodes, i.e. the number of classes to be distinguished; the denominator e^(z_1) + … + e^(z_n) represents the sum over all predicted results;
the module 4 is used for classifying the data with strong confidence coefficient by the deep learning network according to the strong and weak confidence coefficient, and reclassifying the data with weak confidence coefficient by the enhancement network;
and the module 5 is used for combining the results of the deep learning network and the enhancement network and outputting the final classification result.
7. A text sentiment analysis device based on interpretability of confidence, comprising:
one or more processors;
storage means for storing one or more programs that, when executed by the one or more processors, cause the one or more processors to implement the text sentiment analysis method based on confidence interpretability of the text according to any one of claims 1 to 5.
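The coarse-to-fine grid search described in claim 4 can be sketched in plain Python. In practice, GridSearchCV from sklearn.model_selection would wrap sklearn.svm.SVC as the claim states; the toy score surface below is only a stand-in for cross-validated accuracy, and its peak location is invented for illustration:

```python
import itertools

def log_grid(lo=0.1, hi=100.0, factor=10.0):
    """Coarse candidates for C and gamma: lo, lo*factor, ... up to hi,
    matching the multiply-by-10 step of the initial tuning stage."""
    vals, v = [], lo
    while v <= hi * 1.0000001:  # small tolerance for float drift
        vals.append(round(v, 6))
        v *= factor
    return vals

def grid_search(score_fn, cs, gammas):
    """Exhaustively try every (C, gamma) pair and keep the best,
    mirroring what GridSearchCV does internally."""
    best, best_score = None, float("-inf")
    for c, g in itertools.product(cs, gammas):
        s = score_fn(c, g)
        if s > best_score:
            best, best_score = (c, g), s
    return best, best_score

# toy score surface peaking at C=10, gamma=0.1 (illustrative only)
score = lambda c, g: -((c - 10) ** 2) / 100 - ((g - 0.1) ** 2)
best_params, best = grid_search(score, log_grid(), log_grid())
```

After the coarse pass locates the best region, the refined stage would re-run the same search with a finer step around best_params, as the claim describes.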
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210607887.5A CN114911942B (en) | 2022-05-31 | 2022-05-31 | Text emotion analysis method, system and equipment based on confidence level interpretability |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210607887.5A CN114911942B (en) | 2022-05-31 | 2022-05-31 | Text emotion analysis method, system and equipment based on confidence level interpretability |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114911942A true CN114911942A (en) | 2022-08-16 |
CN114911942B CN114911942B (en) | 2024-06-18 |
Family
ID=82770893
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210607887.5A Active CN114911942B (en) | 2022-05-31 | 2022-05-31 | Text emotion analysis method, system and equipment based on confidence level interpretability |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114911942B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20080249764A1 (en) * | 2007-03-01 | 2008-10-09 | Microsoft Corporation | Smart Sentiment Classifier for Product Reviews |
CN109597891A (en) * | 2018-11-26 | 2019-04-09 | 重庆邮电大学 | Text emotion analysis method based on two-way length Memory Neural Networks in short-term |
US20210216880A1 (en) * | 2019-01-02 | 2021-07-15 | Ping An Technology (Shenzhen) Co., Ltd. | Method, equipment, computing device and computer-readable storage medium for knowledge extraction based on textcnn |
CN113656548A (en) * | 2021-08-18 | 2021-11-16 | 福州大学 | Text classification model interpretation method and system based on data envelope analysis |
Non-Patent Citations (1)
Title |
---|
WANG Qinglin; LI Han; PANG Liangjian; XU Xinsheng: "Research on Text Sentiment Enhancement Method Based on Global Semantic Learning", Science Technology and Engineering, no. 21, 28 July 2020 (2020-07-28) |
Also Published As
Publication number | Publication date |
---|---|
CN114911942B (en) | 2024-06-18 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||