CN116756326B - Emotion and non-emotion text feature analysis and judgment method and device and electronic equipment - Google Patents

Info

Publication number
CN116756326B
CN116756326B (Application CN202311045524.8A)
Authority
CN
China
Prior art keywords: emotion, text, score, matrix, residual space
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311045524.8A
Other languages
Chinese (zh)
Other versions
CN116756326A (en)
Inventor
谭光华
宋旭龙
屠海龙
Current Assignee
Hangzhou Guangyun Technology Co ltd
Original Assignee
Hangzhou Guangyun Technology Co ltd
Application filed by Hangzhou Guangyun Technology Co ltd filed Critical Hangzhou Guangyun Technology Co ltd
Priority to CN202311045524.8A
Publication of CN116756326A
Application granted
Publication of CN116756326B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT]
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application discloses a method, a device, and electronic equipment for analyzing and judging emotion and non-emotion text features. A text to be analyzed undergoes emotion analysis and discrimination to obtain its vector representation; the vector is input into an emotion classifier to obtain the text's score for each category; the scores are converted into a probability P using a softmax function; the vector and the scores are input into a non-emotion discriminator, which calculates an OOD score; the maximum value of the probability P is taken as the classification score, and the category attaining this maximum is the text's emotion category. If the OOD score is larger than a first threshold or the classification score is smaller than a second threshold, the text to be analyzed is judged to be non-emotion text; otherwise it is emotion text. By converting the text into a better vector representation, which captures both the textual content and the emotion-category information of the current text, emotion analysis can be performed more accurately.

Description

Emotion and non-emotion text feature analysis and judgment method and device and electronic equipment
Technical Field
The application relates to the technical field of text data processing, in particular to an emotion and non-emotion text characteristic analysis and judgment method and device and electronic equipment.
Background
Text emotion analysis requires text feature extraction and emotion category judgment, of which feature extraction is the core. Two kinds of feature extractors are currently mainstream. The first kind uses small models with few parameters, such as LSTM, CNN, or Transformer; thanks to the low parameter count these models are very simple to deploy and very efficient to train, but their feature-extraction precision leaves room for improvement. The second kind uses pre-trained language models such as BERT, RoBERTa, or XLNet; their feature-extraction precision is high, but the large parameter count lowers training efficiency and increases deployment difficulty.
Current text emotion analysis methods have a shortcoming for the clothing domain: customer dialogue corpora contain a large number of non-emotion samples, for example "hello" or "when will it ship". These utterances carry no emotional color, so if such non-emotion samples are not excluded, any emotion classification decision on them is necessarily wrong. Existing removal methods such as MSP, Energy, and Maha perform poorly in real clothing-domain contexts.
Disclosure of Invention
The application provides an emotion and non-emotion text feature analysis and judgment method suited to the clothing domain, so that non-emotion texts can be excluded.
The technical scheme adopted for solving the technical problems is as follows: a method for analyzing and judging emotion and non-emotion text features includes:
s1: acquiring a text to be analyzed;
s2: inputting the text to be analyzed into a pre-trained language model for emotion analysis and discrimination to obtain a vector feature of the text, with dimension [1, d];
s3: inputting the vector feature into an emotion classifier to obtain the text's per-category score logit, with dimension [1, k];
s4: converting the score logit into a probability P using a softmax function, with dimension [1, k];
s5: inputting the vector feature and the score logit into a non-emotion discriminator and calculating an OOD score;
s6: taking the maximum value of the probability P as the classification score, where the category attaining this maximum is the text's emotion category;
s7: judging whether the OOD score is larger than a first threshold or the classification score is smaller than a second threshold; if yes, the text to be analyzed is non-emotion text, and if no, it is emotion text.
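The decision rule of steps S4 through S7 can be sketched as follows; this is a minimal illustration, and the threshold values `t_ood` and `t_cls` are placeholders (the patent leaves the first and second thresholds unspecified):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))       # subtract max for numerical stability
    return e / e.sum()

def judge(logits, ood_score, t_ood=0.5, t_cls=0.6):
    """Decision rule of steps S4-S7: the text is non-emotion if the
    OOD score exceeds the first threshold OR the maximum class
    probability (the classification score) falls below the second."""
    p = softmax(logits)             # S4: score logits -> probabilities P
    cls_score = p.max()             # S6: classification score
    if ood_score > t_ood or cls_score < t_cls:
        return "non-emotion"        # S7: excluded from emotion analysis
    return "emotion"
```

A confident in-distribution prediction with a low OOD score is kept as emotion text; either a high OOD score or a flat probability distribution rejects the text.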
According to this emotion and non-emotion text feature analysis and judgment method, the text to be analyzed is converted into a better vector representation, which captures both the textual content and the emotion-category information of the current text, so emotion analysis is performed better and the emotion-category judgment of the text is more accurate. The emotion classifier computes the score logit through a scoring function, and the non-emotion discriminator computes the OOD score through a scoring function, so that emotion samples and non-emotion samples can be clearly separated. Non-emotion samples can therefore be removed more reliably during emotion analysis, which ultimately raises the precision of the analysis; the non-emotion judgment is simple and effective, and removing the non-emotion samples further improves the emotion analysis result.
Preferably, the pre-trained language model is a BERT model pre-trained with MLM and NSP, with the loss function modified so that the MLM task is replaced by a character-replacement discriminator during pre-training. The motivation is twofold. First, MLM is not very efficient: only the 15% of tokens that are masked are useful for updating parameters, while the other 85% do not participate in the gradient update. Second, there is a pre-training/fine-tuning mismatch: no [MASK] token exists in the fine-tuning stage, yet the BERT model is used directly as the text feature extractor for emotion classification. Modifying the loss function to replace the MLM task with a character-replacement discriminator during pre-training therefore greatly improves training efficiency, and the resulting pre-trained language model performs better for emotion analysis and discrimination.
Preferably, the specific replacement method for modifying the loss function is as follows: part of the characters of an input text are masked; the model predicts the character at each masked position; and each prediction is compared with the original character at that position. If they are consistent, the position is judged not replaced; otherwise it is judged replaced. Pre-training thereby becomes a binary classification task, which simplifies the training objective and the training parameters of the pre-trained language model and improves training efficiency.
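The label construction described above can be sketched as follows; all names are illustrative, and the sketch only shows how the per-position binary targets of the replacement discriminator are derived from original and predicted characters:

```python
def replacement_labels(original_chars, predicted_chars):
    """For each position, compare the model's predicted character with
    the original: label 0 if they match ("not replaced"), 1 otherwise
    ("replaced"). Pre-training then reduces to binary classification
    over positions instead of a full-vocabulary MLM prediction."""
    return [0 if pred == orig else 1
            for orig, pred in zip(original_chars, predicted_chars)]
```

Because every position yields a label, all tokens contribute to the gradient, which is the efficiency gain the text claims over MLM's 15% masked positions.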
Preferably, to improve the recognition accuracy of emotion classification, the emotion classifier is trained jointly with a cross-entropy loss function and a class loss function. The class loss function records the class information of texts during training, so that features of same-class samples are distributed close together while features of different-class samples are pushed apart. The trained model therefore places similar samples as close as possible and different-class samples as far apart as possible, which better distinguishes the differences between classes for emotion-category judgment.
Preferably, the non-emotion discriminator computes a residual space matrix N and a residual space weight alpha from the features of the pre-trained language model and the emotion classifier, and uses N and alpha to compute the OOD score. Only the text feature representation of the current text and the residual matrix need to be computed, and whether the text is a non-emotion sample can be judged from the score of the current text alone. By designing the scoring function so that emotion and non-emotion samples are clearly separated, non-emotion samples can be removed more reliably during emotion analysis, ultimately improving the precision of the analysis.
Preferably, the residual space matrix N and the residual space weights alpha are calculated as follows,
s51: obtaining the feature vector matrix V of the samples using the pre-trained language model, and computing the sample covariance matrix Z from V;
s52: calculating the eigenvalues and eigenvectors of the sample covariance matrix Z;
s53: selecting the eigenvectors corresponding to the smallest d-k eigenvalues to construct the residual space matrix N, where k is the number of emotion categories and d is the vector dimension of the pre-trained language model;
s54: obtaining the matrix VL by mapping the feature vector matrix V onto the residual space matrix N;
s55: taking the square root of the sum of squares over the dimensions of each sample in the matrix VL (its L2 norm), then summing and averaging over samples to obtain a;
s56: inputting the samples into the emotion classifier to obtain each sample's scores for the k categories;
s57: taking the maximum classification score of each sample, then summing and averaging to obtain b;
s58: dividing b by a gives the residual space weight alpha.
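Steps S51 through S58 can be sketched with NumPy as follows. This is a minimal sketch under an assumed reading of the steps (centering and eigenvector selection details are not fully specified in the text), and all names are illustrative:

```python
import numpy as np

def residual_space(V, logits, k):
    """Sketch of steps S51-S58.
    V:      [m, d] feature-vector matrix of training samples (S51)
    logits: [m, k] emotion-classifier scores of the same samples (S56)
    Returns the residual space matrix N ([d, d-k]) and the weight alpha."""
    d = V.shape[1]
    Z = np.cov(V, rowvar=False)            # S51: sample covariance matrix Z
    eigvals, eigvecs = np.linalg.eigh(Z)   # S52: eigenvalues ascending
    N = eigvecs[:, :d - k]                 # S53: d-k smallest eigenvalues
    VL = V @ N                             # S54: map V onto residual space
    a = np.linalg.norm(VL, axis=1).mean()  # S55: mean per-sample L2 norm
    b = logits.max(axis=1).mean()          # S57: mean max class score
    return N, b / a                        # S58: alpha = b / a
```

`np.linalg.eigh` returns eigenvalues in ascending order, so the first d-k eigenvector columns correspond to the smallest eigenvalues as step S53 requires.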
Preferably, when the OOD score is calculated, the vector feature is multiplied by the residual space matrix N to obtain the residual space mapping NF; the square root of the sum of squares over the dimensions of NF is multiplied by the residual space weight alpha to obtain the VL value of the current text; the VL value and the score logit are merged to construct the matrix [VL, logit]; probabilities are computed with a softmax function; and the probability value at the position of the VL value is selected as the OOD score.
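The OOD-score computation just described can be sketched as follows (a minimal illustration; the function and variable names are ours, not the patent's):

```python
import numpy as np

def ood_score(feature, logits, N, alpha):
    """The alpha-scaled residual-space norm (the VL value) is merged
    with the class logits into [VL, logit], the joint vector is
    softmax-normalized, and the probability at the VL position is
    returned as the OOD score."""
    vl = alpha * np.linalg.norm(feature @ N)  # VL value of the current text
    z = np.concatenate(([vl], logits))        # matrix [VL, logit], [1, k+1]
    e = np.exp(z - z.max())                   # softmax over k+1 positions
    return float((e / e.sum())[0])            # probability at the VL position
```

The larger the component of the feature lying in the residual space (the space the emotion training data barely occupies), the larger the VL value dominates the softmax, and the higher the OOD score.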
The application also provides an emotion and non-emotion text feature analysis and judgment device, which comprises:
an acquisition unit: for acquiring a text to be analyzed;
an emotion analysis and discrimination unit: for inputting the text to be analyzed into a pre-trained language model for emotion analysis and discrimination to obtain a vector feature of the text, with dimension [1, d];
a scoring unit: for inputting the vector feature into an emotion classifier to obtain the text's per-category score logit, with dimension [1, k];
a probability conversion unit: for converting the score logit into a probability P using a softmax function, with dimension [1, k];
a score calculating unit: for inputting the vector feature and the score logit into a non-emotion discriminator and calculating an OOD score;
an emotion text definition unit: for taking the maximum value of the probability P as the classification score, where the category attaining this maximum is the text's emotion category;
a text feature judgment unit: for judging whether the OOD score is larger than a first threshold or the classification score is smaller than a second threshold; if yes, the text to be analyzed is non-emotion text, and if no, it is emotion text.
An electronic device comprising a memory and a processor, the memory for storing one or more computer instructions, wherein the one or more computer instructions are executed by the processor to implement the emotion and non-emotion text feature analysis determination method of any of the above.
The application has the following substantial effects:
the emotion and non-emotion text feature analysis judging method converts the text to be analyzed into better vector representation, the vector representation can better understand text information and emotion type information of the current text, emotion analysis can be better carried out, emotion type judgment of the text to be analyzed is more accurate, an emotion classifier calculates a score by a scoring function, a non-emotion discriminator calculates an OOD score by the scoring function, and the emotion sample and the non-emotion sample can be obviously distinguished, so that the non-emotion sample can be better removed when emotion analysis is carried out, the precision of the emotion analysis is finally improved, the non-emotion type sample can be better removed when the non-emotion type judgment is simple and effective, and the emotion analysis effect can be further improved;
the pre-trained language model used by the method replaces the original MLM training loss with a character-replacement discriminator, converting pre-training into a binary classification task; this simplifies the pre-trained model and reduces parameters, improving training efficiency and easing deployment while preserving the text feature extraction effect;
in the emotion classifier of the method, to improve recognition precision, a cross-entropy loss function and a class loss function are jointly adopted as the loss function; the added class loss function makes the trained model place similar samples as close as possible and different-class samples as far apart as possible, so emotion samples are distributed more compactly, the boundary between non-emotion and emotion samples becomes more obvious, the non-emotion discriminator scores of emotion and non-emotion samples show a clear distribution difference, and the differences between classes are better distinguished for emotion-category judgment;
in the method, the non-emotion sample discriminator only computes the text feature representation of the current text and the residual matrix, and judges whether the current text is a non-emotion sample from its score, so that non-emotion samples can be removed more reliably during emotion analysis, ultimately improving the precision of the analysis.
Drawings
Fig. 1 is a flowchart of the steps of the first embodiment.
Detailed Description
The technical scheme of the application is further specifically described by the following specific examples.
Example 1
As shown in FIG. 1, the emotion and non-emotion text feature analysis and judgment method comprises the following steps:
s1: acquiring a text to be analyzed;
s2: inputting the text to be analyzed into a pre-trained language model for emotion analysis and discrimination to obtain a vector feature of the text, with dimension [1, d];
s3: inputting the vector feature into an emotion classifier to obtain the text's per-category score logit, with dimension [1, k];
s4: converting the score logit into a probability P using a softmax function, with dimension [1, k];
s5: inputting the vector feature and the score logit into a non-emotion discriminator and calculating an OOD score;
s6: taking the maximum value of the probability P as the classification score, where the category attaining this maximum is the text's emotion category;
s7: judging whether the OOD score is larger than a first threshold or the classification score is smaller than a second threshold; if yes, the text to be analyzed is non-emotion text, and if no, it is emotion text.
According to this emotion and non-emotion text feature analysis and judgment method, the model converts the text to be analyzed into a better vector feature representation, which captures both the textual content and the emotion-category information of the current text, so emotion analysis is performed better and the emotion-category judgment is more accurate. The emotion classifier computes the score logit through a scoring function, and the non-emotion discriminator computes the OOD score from the vector feature and the score logit through a scoring function, so that emotion and non-emotion samples can be clearly separated; non-emotion samples are thus removed more reliably during emotion analysis, ultimately raising the precision of the analysis. The non-emotion judgment is simple and effective, and removing the non-emotion samples further improves the emotion analysis result.
Example two
An emotion and non-emotion text feature analysis and judgment method, specifically as follows: the text to be analyzed is input into a pre-trained language model for emotion analysis and discrimination to obtain a vector feature of the text, with dimension [1, d]. The pre-trained language model is a BERT model pre-trained with MLM and NSP. The implementation of MLM is not very efficient: only the 15% of tokens that are masked are useful for updating parameters, while the other 85% do not participate in the gradient update. There is also a pre-training/fine-tuning mismatch, because no [MASK] token exists in the fine-tuning stage; using the BERT model directly as a text feature extractor for emotion classification therefore suffers from this mismatch. Accordingly, the loss function of the pre-trained language model is modified to replace the MLM task with a character-replacement discriminator during pre-training: part of the characters of the input text are masked, the model predicts the character at each masked position, and the prediction is compared with the original character at that position; if they are consistent the position is judged not replaced, otherwise it is judged replaced. This simplifies the training objective and the training parameters of the pre-trained language model and improves training efficiency.
The vector feature is then input into the emotion classifier to obtain the text's per-category score logit, with dimension [1, k]. The emotion classifier is trained jointly with a cross-entropy loss function and a class loss function; the class loss function records the class information of texts during training, so that features of same-class samples are kept as close as possible while features of different-class samples are separated as far as possible. This improves both the accuracy of emotion-category judgment and the discrimination of non-emotion samples. The specific formulas are as follows,
$$L_{pos}=\frac{1}{|P(i)|}\sum_{p\in P(i)}\left\|h_i-h_p\right\|_2,\qquad L_{neg}=\frac{1}{|N(i)|}\sum_{n\in N(i)}\left[\xi-\left\|h_i-h_n\right\|_2\right]_+,$$
$$L_{cont}=\frac{1}{m}\sum_{i=1}^{m}\left(L_{pos}+L_{neg}\right),\qquad L=L_{ce}+\lambda L_{cont},$$
where $h_i$ is the vector representation of the i-th sample; $h_p$ is the vector representation of the p-th sample; $P(i)$ is the set of samples sharing the i-th sample's class; $\|h_i-h_p\|_2$ measures the similarity of the two vectors by Euclidean distance; $|P(i)|$ is the number of samples in $P(i)$; $L_{pos}$ is the same-class Euclidean distance sum for each sample; $h_n$ is the vector representation of the n-th sample; $N(i)$ is the set of samples whose class differs from the i-th sample's; $|N(i)|$ is the number of samples in $N(i)$; $[\xi-\|h_i-h_n\|_2]_+$ is the hinge function, equal to $\xi-\|h_i-h_n\|_2$ when that quantity is $\geq 0$ and to $0$ otherwise; $\xi$ is the margin, the maximum Euclidean distance enforced between samples of different classes; $L_{neg}$ is the hinged different-class distance sum for each sample; $d$ is the dimension of the vector representation; $m$ is the total number of samples; $L_{cont}$ (i.e., $L_{margin}$) is the class loss; $L_{ce}$ is the classification cross-entropy loss; $\lambda$ is the weight balancing $L_{ce}$ and $L_{cont}$; and $L$ is the total loss of model training.
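Under the assumed reading of the legend above (the set P(i) taken to exclude the sample itself), the class loss can be sketched as:

```python
import numpy as np

def class_loss(H, labels, xi=1.0):
    """Sketch of the class (margin) loss L_cont: pull same-class sample
    vectors together (L_pos) and push different-class vectors at least
    xi apart via a hinge (L_neg), averaged over the m samples."""
    m = len(H)
    total = 0.0
    for i in range(m):
        P = [p for p in range(m) if labels[p] == labels[i] and p != i]
        Nset = [n for n in range(m) if labels[n] != labels[i]]
        dist = lambda j: np.linalg.norm(H[i] - H[j])   # Euclidean distance
        l_pos = sum(dist(p) for p in P) / max(len(P), 1)
        l_neg = sum(max(xi - dist(n), 0.0) for n in Nset) / max(len(Nset), 1)
        total += l_pos + l_neg
    return total / m
```

The loss vanishes exactly when same-class vectors coincide and different-class vectors are at least xi apart, which matches the stated training goal of compact classes with obvious boundaries.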
The score logit is then converted into a probability P using a softmax function, with dimension [1, k]. Next, the vector feature and the score logit are input into the non-emotion discriminator to calculate the OOD score; this requires the residual space matrix N and the residual space weight alpha, which are computed as follows,
s51: obtaining the feature vector matrix V of the samples using the pre-trained language model, and computing the sample covariance matrix Z from V;
s52: calculating the eigenvalues and eigenvectors of the sample covariance matrix Z;
s53: selecting the eigenvectors corresponding to the smallest d-k eigenvalues to construct the residual space matrix N, where k is the number of emotion categories and d is the vector dimension of the pre-trained language model;
s54: obtaining the matrix VL by mapping the feature vector matrix V onto the residual space matrix N;
s55: taking the square root of the sum of squares over the dimensions of each sample in the matrix VL (its L2 norm), then summing and averaging over samples to obtain a;
s56: inputting the samples into the emotion classifier to obtain each sample's scores for the k categories;
s57: taking the maximum classification score of each sample, then summing and averaging to obtain b;
s58: dividing b by a gives the residual space weight alpha.
The vector feature is multiplied by the residual space matrix N to obtain the residual space mapping NF; the square root of the sum of squares over the dimensions of NF is multiplied by the residual space weight alpha to obtain the VL value of the current text; the VL value and the score logit are merged to construct the matrix [VL, logit]; probabilities are computed with a softmax function; and the probability value at the position of the VL value is selected as the OOD score.
At this point the maximum value of the probability P is taken as the classification score, and the category attaining this maximum is the text's emotion category, which defines emotion text versus non-emotion text. Finally, whether the OOD score is larger than the first threshold or the classification score is smaller than the second threshold is judged; if yes, the text to be analyzed is non-emotion text, and if no, it is emotion text.
As an illustration of the method in actual use, suppose a customer sends two texts. Text one: "The clothing you sell is too bad." Text two: "The garment is too bad; I want to return it."
Text one belongs to the negative category of emotion classification; text two, although negatively tinged, carries a concrete intent (the buyer wants to return the goods) and should be handled as such.
Parameter description: d represents a vector dimension; k represents the emotion category number.
The texts are input into the pre-trained language model to obtain the corresponding text vector features, with dimension [1, d]. Each vector feature is input into the emotion classifier to obtain the classification score logit of the current text, with dimension [1, k]; the score logit is converted into a probability P with a softmax function, and the maximum among the k class scores is selected as the emotion category of the current text and taken as the classification score. The text vector feature and the score logit are then input into the non-emotion discriminator and the OOD score is calculated as follows: using the residual space matrix N computed from the training set, with dimension [d, d-k], and the residual space weight alpha, the vector feature is matrix-multiplied with N to obtain f1, with dimension [1, d-k]; the L2 norm of f1 is computed along the second dimension and multiplied by alpha to obtain f2, a scalar; f2 and the score logit are combined into the matrix [f2, logit], with dimension [1, k+1]; probabilities are computed with softmax, and the probability value at the position of f2 is selected as the OOD score. If the classification score is smaller than one threshold, or the OOD score is larger than another, the current text does not belong to an emotion category.
The above is the computation the method performs. Next, the result of each step is shown for text one and text two, with d=4 and k=2, where k1 denotes dissatisfied and k2 denotes satisfied;
inputting the text I and the text II into a pre-training language model to obtain vector representations F1 and F2 corresponding to the two sentences;
obtained, F1: [0.25, 0.33, 0.5, 0.2]; f2: [0.1, 0.2, 0.6, 0.3];
f1 and F2 are input into an emotion classifier to obtain classification scores, L1: [0.7, 0.5], L2: [0.66, 0.2];
converting the classification score into probabilities, P1: [0.77, 0.23], P2: [0.78, 0.22];
from the classification score, the two sentences belong to emotion categories, the OOD score is calculated, calculation is completed by using a residual space matrix N and a residual space weight alpha, and F1 and F2 are respectively subjected to matrix multiplication with the residual space matrix N to obtain a residual space mapping vector NF1: [0.75, 0.43], NF2: [0.55, 0.47];
multiplying each square sum and open root of the residual space mapping vector by the residual space weight alpha to obtain a VL value, NF1A:0.3, NF2A:0.6;
The VL values of text one and text two are combined with their respective classification scores, LF1: [0.3, 0.7, 0.5]; LF2: [0.6, 0.66, 0.2], and converted into probabilities; the probability value at the first position is selected as the OOD score, giving LFP1: [0.3, 0.6, 0.1] and LFP2: [0.99, 0.005, 0.005]. That is, the OOD score of text one is 0.3 and the OOD score of text two is 0.99, while the classification score of text one is 0.77 and that of text two is 0.78; text two is therefore a non-emotion sample, completing the non-emotion judgment.
Example III
The application also provides an emotion and non-emotion text feature analysis and judgment device, comprising:
an acquisition unit: for acquiring a text to be analyzed;
an emotion analysis and discrimination unit: for inputting the text to be analyzed into a pre-trained language model for emotion analysis and discrimination to obtain a vector feature of the text, with dimension [1, d];
a scoring unit: for inputting the vector feature into an emotion classifier to obtain the text's per-category score logit, with dimension [1, k];
a probability conversion unit: for converting the score logit into a probability P using a softmax function, with dimension [1, k];
a score calculating unit: for inputting the vector feature and the score logit into a non-emotion discriminator and calculating an OOD score;
an emotion text definition unit: for taking the maximum value of the probability P as the classification score, where the category attaining this maximum is the text's emotion category;
a text feature judgment unit: for judging whether the OOD score is larger than a first threshold or the classification score is smaller than a second threshold; if yes, the text to be analyzed is non-emotion text, and if no, it is emotion text.
Example IV
The application also provides an electronic device comprising a memory and a processor, the memory storing one or more computer instructions which, when executed by the processor, implement the above emotion and non-emotion text feature analysis and judgment method.
The above embodiments are only preferred embodiments of the present application and do not limit it in any way; other variations and modifications may be made without departing from the technical scheme set forth in the claims.

Claims (6)

1. An emotion and non-emotion text feature analysis and judgment method, characterized by comprising the following steps:
S1: acquiring a text to be analyzed;
S2: inputting the text to be analyzed into a pre-trained language model for emotion analysis and discrimination to obtain a vector feature of the text to be analyzed, with dimension [1, d];
S3: inputting the vector feature into an emotion classifier to obtain the score logit of the text to be analyzed for each category, with dimension [1, k];
S4: converting the score logit into a probability P using a softmax function, with dimension [1, k];
S5: the non-emotion discriminator calculates a residual space matrix N and a residual space weight alpha from the features of the pre-trained language model and the emotion classifier; the vector feature and the score logit are input into the non-emotion discriminator, and the OOD score is calculated using the residual space matrix N and the residual space weight alpha, specifically as follows:
S51: obtaining a feature vector matrix V of the samples using the pre-trained language model, and obtaining a sample covariance matrix Z from the feature vector matrix V;
S52: calculating the eigenvalues and eigenvectors of the sample covariance matrix Z;
S53: selecting the eigenvectors corresponding to the smallest d-k eigenvalues to construct the residual space matrix N, where k is the number of emotion categories and d is the vector dimension of the pre-trained language model;
S54: obtaining a matrix VL by mapping the feature vector matrix V onto the residual space matrix N;
S55: taking the square root of the sum of squares over the dimensions of each sample in the matrix VL, then averaging over the samples to obtain a;
S56: inputting the samples into the emotion classifier to obtain the scores of each sample for the k categories;
S57: taking the maximum classification score of each sample, then summing and averaging to obtain b;
S58: dividing b by a to obtain the residual space weight alpha;
S59: multiplying the vector feature by the residual space matrix N to obtain a residual space map NF; taking the square root of the sum of squares over the dimensions of NF and multiplying it by the residual space weight alpha to obtain the VL value of the current text; merging the VL value and the score logit to construct a vector [VL, logit]; calculating probabilities using a softmax function; and selecting the probability value at the position of the VL value as the OOD score;
S6: taking the maximum value of the probability P as the classification score, the text category corresponding to this maximum being the emotion category of the text;
S7: judging whether the OOD score is greater than a first threshold or the classification score is less than a second threshold; if so, the text to be analyzed is judged to be a non-emotion text; if not, an emotion text.
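Steps S51-S59 can be sketched in NumPy as follows. This is one reading of the claimed procedure, not the patented implementation; the matrix shapes and the use of `eigh` (which returns eigenvalues in ascending order) are assumptions.

```python
import numpy as np

def build_residual_space(V, logits, k):
    """S51-S58: from the sample feature matrix V ([n, d]) and per-sample
    classification logits ([n, k]), build the residual space matrix N
    ([d, d-k]) and the residual space weight alpha = b / a."""
    Z = np.cov(V, rowvar=False)            # S51: sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(Z)   # S52: eigenvalues ascend
    d = V.shape[1]
    N = eigvecs[:, : d - k]                # S53: smallest d-k eigenvectors
    VL = V @ N                             # S54: map samples to residual space
    a = np.linalg.norm(VL, axis=1).mean()  # S55: mean per-sample L2 norm
    b = logits.max(axis=1).mean()          # S56-S57: mean max class score
    return N, b / a                        # S58: alpha = b / a

def ood_score(feature, logits, N, alpha):
    """S59: the weighted residual norm (VL value) is merged with the logits
    and softmaxed; the probability at the VL position is the OOD score."""
    vl = alpha * np.linalg.norm(feature @ N)
    merged = np.concatenate(([vl], logits))
    e = np.exp(merged - merged.max())
    return e[0] / e.sum()
```

A text whose feature carries a large component in the residual space (directions unused by the k emotion classes) gets a large VL value and hence a high OOD probability, which is the intuition behind step S7's first threshold.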
2. The emotion and non-emotion text feature analysis and judgment method of claim 1, wherein the pre-trained language model is a BERT pre-trained model trained with the MLM and NSP tasks, the MLM task being replaced by a character replacement discriminator during pre-training.
3. The emotion and non-emotion text feature analysis and judgment method of claim 2, wherein the specific method of replacing the MLM task with the character replacement discriminator during pre-training is as follows: masking part of the characters of the input text, predicting the character at each masked position with the model, and judging whether the predicted character is identical to the original character at that position; if identical, the position is marked as not replaced; if not, the position is marked as replaced.
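The replaced-character check of claim 3 (compare the generator's prediction with the original character at each masked position) amounts to label construction for the discriminator, in the style of ELECTRA-like pre-training. A sketch with function names of my own choosing:

```python
def replacement_labels(original, predicted, masked_positions):
    """For each masked position, 0 means the predicted character matches the
    original (not replaced); 1 means it differs (replaced). These labels
    supervise the character replacement discriminator."""
    return {pos: int(predicted[pos] != original[pos])
            for pos in masked_positions}

# Toy example: positions 1 and 3 were masked; the generator got one wrong.
labels = replacement_labels("情感分析", "情绪分析", masked_positions=[1, 3])
print(labels)
```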
4. The emotion and non-emotion text feature analysis and judgment method of claim 1, wherein the emotion classifier is trained with a cross-entropy loss function and a class loss function, the class loss function recording the category information of the texts during training so that features of samples of the same category are drawn close together and features of samples of different categories are pushed apart.
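Claim 4 names a "class loss function" that clusters same-class features without giving a formula; center loss is one standard choice with exactly this behavior and is assumed here purely as an illustration:

```python
import numpy as np

def class_loss(features, labels, centers):
    """Center-loss-style class loss: the mean squared distance from each
    sample's feature to the center of its own class. Minimizing it pulls
    same-class features together. An assumed form -- the patent does not
    specify the class loss formula."""
    diffs = features - centers[labels]     # distance to own class center
    return 0.5 * float(np.mean(np.sum(diffs ** 2, axis=1)))
```

In training, the total loss would be the cross entropy plus a weighted class loss, with the class centers updated alongside the model parameters.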
5. An emotion and non-emotion text feature analysis and judgment device, characterized by comprising:
an acquisition unit: for acquiring the text to be analyzed;
an emotion analysis and discrimination unit: for inputting the text to be analyzed into a pre-trained language model for emotion analysis and discrimination to obtain a vector feature of the text to be analyzed, with dimension [1, d];
a scoring unit: for inputting the vector feature into an emotion classifier to obtain the score logit of the text to be analyzed for each category, with dimension [1, k];
a probability conversion unit: for converting the score logit into a probability P using a softmax function, with dimension [1, k];
a score calculating unit: for the non-emotion discriminator to calculate a residual space matrix N and a residual space weight alpha from the features of the pre-trained language model and the emotion classifier, the vector feature and the score logit being input into the non-emotion discriminator and the OOD score calculated using the residual space matrix N and the residual space weight alpha, specifically as follows:
S51: obtaining a feature vector matrix V of the samples using the pre-trained language model, and obtaining a sample covariance matrix Z from the feature vector matrix V;
S52: calculating the eigenvalues and eigenvectors of the sample covariance matrix Z;
S53: selecting the eigenvectors corresponding to the smallest d-k eigenvalues to construct the residual space matrix N, where k is the number of emotion categories and d is the vector dimension of the pre-trained language model;
S54: obtaining a matrix VL by mapping the feature vector matrix V onto the residual space matrix N;
S55: taking the square root of the sum of squares over the dimensions of each sample in the matrix VL, then averaging over the samples to obtain a;
S56: inputting the samples into the emotion classifier to obtain the scores of each sample for the k categories;
S57: taking the maximum classification score of each sample, then summing and averaging to obtain b;
S58: dividing b by a to obtain the residual space weight alpha;
S59: multiplying the vector feature by the residual space matrix N to obtain a residual space map NF; taking the square root of the sum of squares over the dimensions of NF and multiplying it by the residual space weight alpha to obtain the VL value of the current text; merging the VL value and the score logit to construct a vector [VL, logit]; calculating probabilities using a softmax function; and selecting the probability value at the position of the VL value as the OOD score;
an emotion text definition unit: for taking the maximum value of the probability P as the classification score, the text category corresponding to this maximum being the emotion category of the text;
a text feature judgment unit: for judging whether the OOD score is greater than a first threshold or the classification score is less than a second threshold; if so, the text to be analyzed is a non-emotion text; if not, an emotion text.
6. An electronic device comprising a memory and a processor, the memory configured to store one or more computer instructions, wherein the one or more computer instructions are executed by the processor to implement the emotion and non-emotion text feature analysis and judgment method of any of claims 1-4.
CN202311045524.8A 2023-08-18 2023-08-18 Emotion and non-emotion text feature analysis and judgment method and device and electronic equipment Active CN116756326B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311045524.8A CN116756326B (en) 2023-08-18 2023-08-18 Emotion and non-emotion text feature analysis and judgment method and device and electronic equipment

Publications (2)

Publication Number Publication Date
CN116756326A CN116756326A (en) 2023-09-15
CN116756326B true CN116756326B (en) 2023-11-24

Family

ID=87961278

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311045524.8A Active CN116756326B (en) 2023-08-18 2023-08-18 Emotion and non-emotion text feature analysis and judgment method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN116756326B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109960791A (en) * 2017-12-25 2019-07-02 上海智臻智能网络科技股份有限公司 Judge the method and storage medium, terminal of text emotion
CN111324744A (en) * 2020-02-17 2020-06-23 中山大学 Data enhancement method based on target emotion analysis data set
CN114758676A (en) * 2022-04-18 2022-07-15 哈尔滨理工大学 Multi-modal emotion recognition method based on deep residual shrinkage network
CN115169361A (en) * 2022-08-03 2022-10-11 中国银行股份有限公司 Emotion analysis method and related equipment thereof
CN115309864A (en) * 2022-08-11 2022-11-08 平安科技(深圳)有限公司 Intelligent sentiment classification method and device for comment text, electronic equipment and medium
CN115795011A (en) * 2022-11-24 2023-03-14 北京工业大学 Emotional dialogue generation method based on improved generation of confrontation network
WO2023134083A1 (en) * 2022-01-11 2023-07-20 平安科技(深圳)有限公司 Text-based sentiment classification method and apparatus, and computer device and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111160297B (en) * 2019-12-31 2022-05-13 武汉大学 Pedestrian re-identification method and device based on residual attention mechanism space-time combined model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
CURIOUS: Efficient Neural Architecture Search Based on a Performance Predictor and Evolutionary Search; Shayan Hassantabar; IEEE; pp. 4975-4990 *
Spam Text Filtering Model Based on Feature Matrix Construction and BP Neural Network; Fang Rui, Yu Junyang, Dong Lifeng; Computer Engineering (Issue 08); pp. 1-7 *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant