CN112347787A - Method, device and equipment for classifying aspect level emotion and readable storage medium - Google Patents

Method, device and equipment for classifying aspect level emotion and readable storage medium Download PDF

Info

Publication number
CN112347787A
CN112347787A (application CN202011227233.7A)
Authority
CN
China
Prior art keywords
token
sequence
probability
emotion
text
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202011227233.7A
Other languages
Chinese (zh)
Inventor
刘剑
杨海钦
姚晓远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202011227233.7A priority Critical patent/CN112347787A/en
Publication of CN112347787A publication Critical patent/CN112347787A/en
Priority to PCT/CN2021/091198 priority patent/WO2022095376A1/en
Withdrawn legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3346Query execution using probabilistic model
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/353Clustering; Classification into predefined classes
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks

Abstract

The invention discloses a method, a device, equipment, and a readable storage medium for aspect-level emotion classification, wherein the method comprises the following steps: acquiring a text to be classified, and converting the keywords contained in the text to be classified into tokens to form a token sequence, wherein the token sequence comprises T tokens; inputting the token sequence into a preset token processing model to obtain a probability matrix, wherein the probability matrix is a matrix of T columns and (T+1) rows; setting the tokens whose probability of belonging to an aspect term is larger than a preset threshold as target tokens, and forming an emotion token probability sequence from the probability values of the second to (T+1)-th rows corresponding to the target tokens; and determining the emotion type corresponding to the text to be classified by using a preset Transformer model and a classifier based on the emotion token probability sequence. The invention can realize cross-domain aspect-level emotion analysis.

Description

Method, device and equipment for classifying aspect level emotion and readable storage medium
Technical Field
The invention relates to the technical field of speech and semantics, and in particular to a method, a device, and equipment for aspect-level emotion classification and a readable storage medium.
Background
With the rise of social media on the network, a large amount of user comment information is generated on the Internet. Such comments express various emotional colors and emotional tendencies, and emotion analysis of this comment information reveals public opinion on a given event or product.
Emotion analysis is the process of analyzing, processing, inducing, and reasoning over subjective texts that carry emotional color. In order to fully capture the emotional tendencies toward the various aspects mentioned in a text, aspect-level emotion analysis has been proposed, which refines the analysis granularity to the aspect level. In the prior art, emotion classifiers for aspect-level emotion analysis are mostly obtained by training neural network models in a supervised learning manner; inevitably, a large number of labeled training samples are needed in this process, and the lack of labeled training samples has become the main obstacle to obtaining such emotion classifiers.
Therefore, how to overcome the obstacle of insufficient labeled training samples for the emotion classifier, and how to exploit the powerful feature characterization capability of pre-trained models to realize cross-domain aspect-level emotion analysis, are problems to be solved by those skilled in the art.
Disclosure of Invention
The invention aims to provide an aspect level emotion classification method, device and equipment and a readable storage medium, which can realize cross-domain aspect level emotion analysis.
According to an aspect of the invention, an aspect level emotion classification method is provided, and the method comprises the following steps:
acquiring a text to be classified, and converting keywords contained in the text to be classified into tokens to form a token sequence; wherein the token sequence comprises T tokens;
inputting the token sequence into a preset token processing model to obtain a probability matrix; wherein the probability matrix is a matrix of T columns and (T+1) rows, each column of the probability matrix characterizing a token, the first row of the probability matrix characterizing a probability value of each token belonging to an aspect term, the second row to the (T+1)-th row characterizing a probability value of each token belonging to an emotion token mapped with the corresponding aspect term;
setting the tokens with the probability values belonging to the aspect terms larger than a preset threshold value as target tokens, and forming an emotion token probability sequence by the probability values of the second row to the (T+1)-th row corresponding to the target tokens;
and determining the emotion type corresponding to the text to be classified by utilizing a preset Transformer model and a classifier based on the emotion token probability sequence.
Optionally, the obtaining a text to be classified and converting the keywords contained in the text to be classified into tokens to form a token sequence specifically includes:
performing word segmentation processing on the text to be classified to obtain T keywords contained in the text to be classified;
coding each keyword respectively to obtain a token of each keyword;
all tokens are grouped into the token sequence.
Optionally, the inputting the token sequence into a preset token processing model to obtain a probability matrix specifically includes:
inputting the token sequence into the BERT model in the token processing model to obtain, for each token, a feature characterization that fuses the information of the preceding and following tokens, and forming all the feature characterizations into a feature characterization sequence;
inputting the feature characterization sequence into a full-connection layer in the token processing model, and performing normalization processing on the output of the full-connection layer through a softmax function to obtain the probability matrix.
Optionally, the determining, based on the emotion token probability sequence, an emotion type corresponding to the text to be classified by using a preset Transformer model and a classifier specifically includes:
multiplying the characteristic characterization sequence and the emotion token probability sequence by elements at a vector level to obtain a characterization result;
and sequentially inputting the representation result into the Transformer model and the classifier to obtain the emotion type corresponding to the text to be classified.
Optionally, the method further includes:
acquiring a sample text set; each sample text in the sample text set is marked with a corresponding aspect term and an emotion type;
training an initial neural network model based on the sample text set to correct each parameter in the initial neural network model to obtain an emotion classification model; wherein the emotion classification model comprises: a token processing model, a Transformer model, and a classifier.
In order to achieve the above object, the present invention also provides an aspect level emotion classification apparatus, including:
the system comprises an acquisition module, a storage module and a processing module, wherein the acquisition module is used for acquiring a text to be classified and converting keywords contained in the text to be classified into tokens to form a token sequence; wherein the token sequence comprises T tokens;
the input module is used for inputting the token sequence into a preset token processing model to obtain a probability matrix; wherein the probability matrix is a matrix of T columns and (T+1) rows, each column of the probability matrix characterizing a token, the first row of the probability matrix characterizing a probability value of each token belonging to an aspect term, the second row to the (T+1)-th row characterizing a probability value of each token belonging to an emotion token mapped with the corresponding aspect term;
the processing module is used for setting the tokens with the probability values belonging to the aspect terms larger than a preset threshold value as target tokens and forming an emotion token probability sequence by the probability values of the second row to the (T+1)-th row corresponding to the target tokens;
and the determining module is used for determining the emotion type corresponding to the text to be classified by utilizing a preset Transformer model and a classifier based on the emotion token probability sequence.
Optionally, the obtaining module is specifically configured to:
performing word segmentation processing on the text to be classified to obtain T keywords contained in the text to be classified;
coding each keyword respectively to obtain a token of each keyword;
all tokens are grouped into the token sequence.
Optionally, the input module is specifically configured to:
inputting the token sequence into the BERT model in the token processing model to obtain, for each token, a feature characterization that fuses the information of the preceding and following tokens, and forming all the feature characterizations into a feature characterization sequence;
inputting the feature characterization sequence into a full-connection layer in the token processing model, and performing normalization processing on the output of the full-connection layer through a softmax function to obtain the probability matrix.
In order to achieve the above object, the present invention further provides a computer device, which specifically includes: a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the aspect-level emotion classification method introduced above when executing the computer program.
To achieve the above object, the present invention also provides a computer-readable storage medium having stored thereon a computer program which, when being executed by a processor, realizes the steps of the above-described aspect-level emotion classification method.
According to the method, device, equipment, and readable storage medium for aspect-level emotion classification provided above, a cross-domain aspect-level emotion analysis method based on a pre-trained model is provided by exploiting the strong feature characterization capability of the pre-trained model; the model structure is simple and can be widely applied to other similar tasks. Moreover, by setting a threshold, only the aspect terms whose probability exceeds the threshold participate in the subsequent aspect-level emotion analysis task, which effectively reduces the amount of computation and can further improve the performance of the model.
Drawings
Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:
FIG. 1 is a schematic flow chart of an alternative aspect-level emotion classification method according to embodiment one;
FIG. 2 is a schematic diagram of an emotion classification model according to embodiment one;
FIG. 3 is a schematic diagram of an alternative structure of the aspect-level emotion classification apparatus according to embodiment three;
FIG. 4 is a schematic diagram of an alternative hardware architecture of the computer device according to embodiment four.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
The embodiment of the invention provides an aspect-level emotion classification method, which specifically comprises the following steps:
step S101: acquiring a text to be classified, and converting keywords contained in the text to be classified into tokens to form a token sequence; wherein the token sequence comprises T tokens.
Specifically, step S101 includes:
step A1: performing word segmentation processing on the text to be classified to obtain T keywords contained in the text to be classified;
step A2: coding each keyword respectively to obtain a token of each keyword;
step A3: all tokens are grouped into the token sequence.
For example, the text to be classified input by the user is obtained: "this restaurant served well and meals were also good". Word segmentation of the text to be classified yields the keywords: this, restaurant, service, good, and, meal, good. Each keyword is encoded into a corresponding token using one-hot coding, and the tokens of all the keywords form the token sequence X = {x1, x2, …, xT}.
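As a hedged illustration of steps A1 to A3, the Python sketch below uses jieba for word segmentation and integer ids as a stand-in for the one-hot codes; the function and variable names are illustrative assumptions, not the patent's implementation.

```python
# A minimal sketch of steps A1-A3, assuming jieba as the word segmenter
# and integer ids in place of one-hot codes (an illustrative simplification).
import jieba

def build_token_sequence(text, vocab):
    keywords = jieba.lcut(text)                            # A1: T keywords
    # A2: encode each keyword as a token (the id stands in for a one-hot code)
    tokens = [vocab.setdefault(w, len(vocab)) for w in keywords]
    return tokens                                          # A3: X = {x1, ..., xT}

vocab = {}
x = build_token_sequence("这家餐厅服务很好，饭菜也不错", vocab)
print(len(x))  # T, the number of tokens in the sequence
```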
Step S102: inputting the token sequence into a preset token processing model to obtain a probability matrix; wherein the probability matrix is a matrix of T columns and (T+1) rows, each column of the probability matrix characterizing a token, the first row of the probability matrix characterizing the probability value of the respective token belonging to an aspect term, the second row to the (T+1)-th row characterizing the probability value of each token belonging to an emotion token mapped with the corresponding aspect term.
Wherein the token processing model comprises: BERT model, full connectivity layer, and softmax function.
Specifically, step S102 includes:
step B1: inputting the token sequence into a BERT (bidirectional Encoder retrieval from transformations) model in the token processing model to obtain feature representations corresponding to each token and fusing the information of the front token and the rear token, and forming all the feature representations into a feature Representation sequence;
for example, let sequence X ═ { X ═ X1,x2,…,xTInputting the data into a pre-training word embedding layer formed by a BERT model, thereby outputting a feature table fused with contextSignature sequence
Figure BDA0002763956480000061
Wherein the content of the first and second substances,
Figure BDA0002763956480000062
is a token xiCorresponding feature characterization, i ∈ [1, T ]]. In this embodiment, a BERT module is introduced as a pre-training word embedding layer, and the BERT module is a pre-training model capable of context awareness and obtained by pre-training a large-scale data set.
Step B2: inputting the feature characterization sequence into a full connection layer in the token processing model, and performing normalization processing on the output of the full connection layer through a softmax function to obtain the probability matrix;
for example, the above features are characterized by sequence HLInputting to a task-specific neural architecture layer, preferably a fully-connected layer; in addition, the neural architecture layer further comprises a softmax function model for performing a probability normalization operation, i.e., outputting each token xiCorresponding T +1 dimensions, thereby forming a probability matrix of T × (T + 1); wherein the first row of the probability matrix represents each token xiProbability of belonging to an aspect term; for target token xiThe second row to the T +1 th row of the probability matrix represent the token x respectively1,x2,…,xTBelonging to and target token xiAspect (b) the probability of the corresponding sentiment token, such that each token x in the token sequence is paired with the above-mentioned T +1 dimensionsiProbability labeling is performed.
Step S103: setting the tokens with the probability values belonging to the aspect terms larger than a preset threshold value as target tokens, and forming the probability values of the second row to the (T+1)-th row corresponding to the target tokens into an emotion token probability sequence.
Target tokens whose probability of belonging to an aspect term is larger than the preset threshold are screened out of the token sequence; if the probability label of any token shows a probability of belonging to an aspect term that is larger than the preset threshold, the token is proven to be an aspect term. For example, the token sequence is {this, restaurant, service, good, and, meal, good} and the preset probability threshold is 0.5; in the current token sequence, only the tokens "service" and "meal" have a probability of belonging to an aspect term greater than 0.5, so the output target tokens correspond to the aspect terms "service" and "meal".
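A small sketch of this screening step follows, assuming the (T+1)-row, T-column probability matrix layout described above; the 0.5 threshold mirrors the example.

```python
import torch

def select_target_tokens(prob_matrix: torch.Tensor, threshold: float = 0.5):
    """Sketch of step S103: prob_matrix is (T+1, T); row 0 holds the
    aspect-term probabilities."""
    aspect_probs = prob_matrix[0]                          # first row, (T,)
    target_idx = (aspect_probs > threshold).nonzero(as_tuple=True)[0]
    # rows 2..T+1 of each target column form the emotion token probability sequence
    emotion_prob_seq = prob_matrix[1:, target_idx]         # (T, num_targets)
    return target_idx, emotion_prob_seq
```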
Step S104: and determining the emotion type corresponding to the text to be classified by utilizing a preset Transformer model and a classifier based on the emotion token probability sequence.
Specifically, step S104 includes:
step C1: multiplying the characteristic characterization sequence and the emotion token probability sequence by elements at a vector level to obtain a characterization result;
step C2: and sequentially inputting the representation result into the Transformer model and the classifier to obtain the emotion type corresponding to the text to be classified.
After the target token is determined from the token sequence, the emotion analysis task is performed according to the emotion token probability sequence of the target token. Specifically, the context-fused feature characterization sequence H^L is multiplied, element by element at the vector level, with the emotion token probability sequence to obtain the target representation; the target representation is then passed through the Transformer model and finally input to the classifier, and the emotion category corresponding to the token sequence is obtained by screening.
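As a hedged illustration of steps C1 and C2, the sketch below assumes PyTorch tensors and treats the Transformer model and classifier as given callables; the mean pooling before the classifier is an assumption, since the patent only states that the representation is integrated.

```python
import torch

def classify_for_target(h, p, transformer, classifier):
    """Sketch of steps C1-C2 for one target token. h is the (T, d) feature
    characterization sequence H^L; p is the (T,) emotion token probability
    column for that target."""
    fused = h * p.unsqueeze(-1)             # C1: vector-level element multiply
    out = transformer(fused.unsqueeze(0))   # C2: Transformer model, batch-first (1, T, d)
    pooled = out.mean(dim=1)                # pooling assumption: integrate tokens
    return classifier(pooled)               # emotion-type logits
```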
Further, the method further comprises:
step D1: acquiring a sample text set; each sample text in the sample text set is marked with a corresponding aspect term and an emotion type;
step D2: training an initial neural network model based on the sample text set to correct each parameter in the initial neural network model to obtain an emotion classification model; wherein the emotion classification model comprises: a token processing model, a Transformer model, and a classifier.
Preferably, FIG. 2 is a schematic diagram of the emotion classification model. As shown in FIG. 2, the token sequence X formed from the text to be classified is first input into the BERT model of the token processing model to obtain the feature characterization sequence H^L that fuses the information of the preceding and following tokens; the feature characterization sequence H^L is then input into the full-connection layer (FC Layer) and the softmax function (Softmax Layer) of the token processing model in sequence to obtain the probability matrix, from which the emotion token probability sequence P is obtained; next, the feature characterization sequence H^L is multiplied element by element at the vector level with the emotion token probability sequence P to obtain the characterization result; finally, the characterization result is input into the Transformer model (Transformer Block) and the classifier (Classifier Layer) in sequence to obtain the emotion type corresponding to the text to be classified. The BERT model comprises: Token Embedding, Segment Embedding, Position Embedding, and L Transformer Blocks.
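The following is a hedged end-to-end sketch of the FIG. 2 pipeline in PyTorch, assuming the HuggingFace bert-base-chinese checkpoint; the layer sizes, the three emotion types, and the selection of only the single strongest aspect token are illustrative assumptions, not values fixed by the patent.

```python
import torch
import torch.nn as nn
from transformers import BertModel

class AspectEmotionModel(nn.Module):
    """Hypothetical sketch of the FIG. 2 pipeline (assumed hyperparameters)."""
    def __init__(self, max_len: int = 128, num_emotions: int = 3):
        super().__init__()
        self.bert = BertModel.from_pretrained("bert-base-chinese")
        d = self.bert.config.hidden_size
        self.fc = nn.Linear(d, max_len + 1)                # FC + Softmax layer
        block = nn.TransformerEncoderLayer(d, nhead=8, batch_first=True)
        self.transformer = nn.TransformerEncoder(block, num_layers=1)
        self.classifier = nn.Linear(d, num_emotions)       # Classifier layer

    def forward(self, input_ids, attention_mask):
        h = self.bert(input_ids, attention_mask=attention_mask).last_hidden_state
        T = h.size(1)                                      # h: (B, T, d)
        probs = torch.softmax(self.fc(h)[..., : T + 1], dim=-1)  # (B, T, T+1)
        # simplification: keep only the strongest aspect-term token per sentence
        target = probs[..., 0].argmax(dim=1)               # (B,)
        batch = torch.arange(h.size(0), device=h.device)
        p = probs[batch, target, 1:]                       # (B, T) emotion probs
        fused = h * p.unsqueeze(-1)                        # element-wise multiply
        out = self.transformer(fused).mean(dim=1)          # fuse sentence semantics
        return self.classifier(out)                        # emotion-type logits
```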
Example two
The embodiment of the invention provides an aspect level emotion classification method, which specifically comprises the following steps:
and step S1, acquiring a sample text set, and training the neural network model by using the sample text set.
In this embodiment, the neural network model is composed of a pre-training word-embedding layer and a neural architecture layer. First, a sample text is split into a plurality of tokens, i.e., a plurality of words, and the tokens are combined into a token sequence X = {x1, x2, …, xT} that is input to the pre-training word-embedding layer. In this embodiment, a BERT module is introduced as the pre-training word-embedding layer; the BERT module is a context-aware pre-trained model obtained by pre-training on a large-scale data set.
Further, after the BERT module receives the token sequence X, it outputs the context-fused feature characterization sequence H^L = {h^L_1, h^L_2, …, h^L_T}, where h^L_i is the feature characterization corresponding to token xi, i ∈ [1, T].
Further, the feature characterization sequence H^L is input to a task-specific neural architecture layer, which includes a softmax function for probability normalization, i.e., it outputs T+1 dimensions for each token xi. The first dimension represents the probability that token xi belongs to an aspect term; the second to (T+1)-th dimensions represent the probability of each token in the token sequence X being the emotion token corresponding to that aspect term.
Further, once the fine-tuning of the neural network model for aspect-level emotion analysis is completed, all parameters of the neural network model are frozen, thereby obtaining the token processing model.
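A minimal sketch of this freezing step follows, assuming the PyTorch model from the FIG. 2 sketch in embodiment one.

```python
# Freeze after fine-tuning (sketch): all parameters are fixed so the
# token processing model stays unchanged for steps S2-S4.
for param in model.parameters():
    param.requires_grad = False
model.eval()  # inference mode; no further parameter updates
```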
Step S2, inputting the token sequence into the token processing model, and generating a probability mark corresponding to each token;
The token sequence X = {x1, x2, …, xT} is input into the token processing model fine-tuned in step S1, and the BERT model outputs the context-fused feature characterization sequence H^L = {h^L_1, h^L_2, …, h^L_T}, where h^L_i is the feature characterization corresponding to token xi, i ∈ [1, T]. The feature characterization sequence H^L is then input to the task-specific neural architecture layer, which includes a softmax function for probability normalization, i.e., it outputs T+1 dimensions for each token xi, forming a T × (T+1) matrix; the first row of the matrix represents the probability that each token xi belongs to an aspect term, and the second to (T+1)-th rows represent the probability of each token in the token sequence X being the emotion token corresponding to that aspect term. With these T+1 dimensions, each token xi in the token sequence is probability-labeled.
Step S3, screening out target tokens from the token sequence, carrying out emotion analysis on the target tokens, and finishing the training of the Transformer model and the classifier in the field corresponding to the token sequence.
A probability threshold is preset, and target tokens whose probability of belonging to an aspect term is larger than the preset probability threshold are screened out of the token sequence; if the probability label of any token shows a probability of belonging to an aspect term that is larger than the preset threshold, the token is proven to be an aspect term. For example, if the token sequence is {this, restaurant, service, good, and, meal, good}, its corresponding aspect terms are "service" and "meal", and naturally the probabilities output by the neural network model for those tokens belonging to aspect terms are greater than the preset probability threshold.
Further, after the target token is selected, the probabilities in the target token's probability label that each token xi is the emotion token corresponding to the aspect term are operated on. Specifically, the context-fused feature characterization sequence H^L obtained in step S2 is multiplied, element by element at the vector level, with those probabilities to obtain the final feature representation. The Transformer module then performs feature conversion on this sequence, converting it into a feature characterization sequence that fuses sentence semantics; a full-connection layer further integrates it into a single feature representation that can represent the token sequence; finally, the feature representation is input into the classifier, and the emotion classification corresponding to the token sequence is obtained by screening.
Further, according to the steps, the training of the emotion classification model aiming at the aspect level emotion analysis in the field corresponding to the token sequence is completed.
Step S4, acquiring a text to be classified, carrying out emotion classification by using the emotion classification model, and screening out the emotion classification category corresponding to the text to be classified.
A text to be classified input by the user is acquired and split into a token sequence X = {x1, x2, …, xT} formed by a plurality of tokens. For example, if the text to be classified entered by the client is "the restaurant is well served and the meal is also good", the text to be classified is split into the token sequence {this, restaurant, service, good, and, meal, good}.
Further, the token sequence is input into the token processing model formed by the BERT model, which outputs the context-fused feature characterization sequence H^L = {h^L_1, h^L_2, …, h^L_T}, where h^L_i is the feature characterization corresponding to token xi, i ∈ [1, T].
Further, the feature characterization sequence H^L is input to the task-specific neural architecture layer, which includes a softmax function for probability normalization, i.e., it outputs T+1 dimensions for each token xi, forming a T × (T+1) matrix; the first row of the matrix represents the probability that each token xi belongs to an aspect term, and the second to (T+1)-th rows represent the probability of each token in the token sequence X being the emotion token corresponding to that aspect term. With these T+1 dimensions, each token xi in the token sequence is probability-labeled.
Further, target tokens whose probability of belonging to an aspect term is larger than the preset probability threshold are screened out of the token sequence; if the probability label of a token shows a probability of belonging to an aspect term that is larger than the preset probability threshold, the token is proven to be an aspect term. For example, the token sequence is {this, restaurant, service, good, and, meal, good} and the preset probability threshold is 0.5; only the tokens "service" and "meal" in the current token sequence have a probability of belonging to an aspect term greater than 0.5, so the output aspect-term tokens are "service" and "meal".
Further, after the target token is selected, the probabilities in the target token's probability label that each token xi in the token sequence X is the corresponding emotion token are operated on. Specifically, the context-fused feature characterization sequence H^L is multiplied, element by element at the vector level, with those probabilities to obtain the final feature characterization; finally, the feature characterization is input into the classifier, and the emotion classification category corresponding to the token sequence is obtained by screening.
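Putting the pieces together, a hypothetical inference call for step S4 might look as follows, reusing the earlier sketches; the tokenizer and checkpoint names are assumptions tied to the assumed bert-base-chinese model.

```python
import torch
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-chinese")
model = AspectEmotionModel().eval()   # the sketch from embodiment one

enc = tokenizer("这家餐厅服务很好，饭菜也不错", return_tensors="pt")
with torch.no_grad():
    logits = model(enc["input_ids"], enc["attention_mask"])
emotion = logits.argmax(dim=-1)       # predicted emotion category index
```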
EXAMPLE III
The embodiment of the invention provides an aspect level emotion classification device, which specifically comprises the following components as shown in fig. 3:
an obtaining module 301, configured to obtain a text to be classified, and convert a keyword included in the text to be classified into a token to form a token sequence; wherein the token sequence comprises T tokens;
an input module 302, configured to input the token sequence into a preset token processing model to obtain a probability matrix; wherein the probability matrix is a matrix of T columns and (T+1) rows, each column of the probability matrix characterizing a token, the first row of the probability matrix characterizing a probability value of each token belonging to an aspect term, the second row to the (T+1)-th row characterizing a probability value of each token belonging to an emotion token mapped with the corresponding aspect term;
a processing module 303, configured to set the tokens with the probability values belonging to the aspect terms greater than a preset threshold as target tokens, and form an emotion token probability sequence by using the probability values of the second row to the (T+1)-th row corresponding to the target tokens;
and a determining module 304, configured to determine, based on the emotion token probability sequence, an emotion type corresponding to the text to be classified by using a preset Transformer model and a classifier.
Specifically, the obtaining module 301 is configured to:
performing word segmentation processing on the text to be classified to obtain T keywords contained in the text to be classified;
coding each keyword respectively to obtain a token of each keyword;
all tokens are grouped into the token sequence.
Further, the input module 302 is configured to:
inputting the token sequence into the BERT model in the token processing model to obtain, for each token, a feature characterization that fuses the information of the preceding and following tokens, and forming all the feature characterizations into a feature characterization sequence;
inputting the feature characterization sequence into a full-connection layer in the token processing model, and performing normalization processing on the output of the full-connection layer through a softmax function to obtain the probability matrix.
Further, the determining module 304 is configured to:
multiplying the characteristic characterization sequence and the emotion token probability sequence by elements at a vector level to obtain a characterization result;
and sequentially inputting the representation result into the Transformer model and the classifier to obtain the emotion type corresponding to the text to be classified.
Still further, the apparatus further comprises:
a training module, which is used for:
acquiring a sample text set, wherein each sample text in the sample text set is marked with a corresponding aspect term and an emotion type; and
training an initial neural network model based on the sample text set to correct each parameter in the initial neural network model to obtain an emotion classification model; wherein the emotion classification model comprises: a token processing model, a Transformer model, and a classifier.
Example four
The embodiment also provides a computer device capable of executing programs, such as a smart phone, a tablet computer, a notebook computer, a desktop computer, a rack server, a blade server, a tower server, or a cabinet server (including an independent server or a server cluster composed of a plurality of servers), and the like. As shown in fig. 4, the computer device 40 of this embodiment at least includes, but is not limited to: a memory 401 and a processor 402, which may be communicatively coupled to each other via a system bus. It is noted that fig. 4 only shows the computer device 40 with components 401 and 402, but it should be understood that not all of the shown components are required; more or fewer components may be implemented instead.
In this embodiment, the memory 401 (i.e., a readable storage medium) includes a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, an optical disk, and the like. In some embodiments, the memory 401 may be an internal storage unit of the computer device 40, such as a hard disk or a memory of the computer device 40. In other embodiments, the memory 401 may also be an external storage device of the computer device 40, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a Flash Card provided on the computer device 40. Of course, the memory 401 may also include both the internal and external storage devices of the computer device 40. In this embodiment, the memory 401 is generally used for storing the operating system and various types of application software installed in the computer device 40. Further, the memory 401 may also be used to temporarily store various types of data that have been output or are to be output.
Processor 402 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 402 is generally operative to control the overall operation of the computer device 40.
Specifically, in this embodiment, the processor 402 is configured to execute the program of the aspect-level emotion classification method stored in the memory 401, and the program of the aspect-level emotion classification method implements the following steps when executed:
acquiring a text to be classified, and converting keywords contained in the text to be classified into tokens to form a token sequence; wherein the token sequence comprises T tokens;
inputting the token sequence into a preset token processing model to obtain a probability matrix; wherein the probability matrix is a matrix of T columns and (T+1) rows, each column of the probability matrix characterizing a token, the first row of the probability matrix characterizing a probability value of each token belonging to an aspect term, the second row to the (T+1)-th row characterizing a probability value of each token belonging to an emotion token mapped with the corresponding aspect term;
setting the tokens with the probability values belonging to the aspect terms larger than a preset threshold value as target tokens, and forming an emotion token probability sequence by the probability values of the second row to the (T+1)-th row corresponding to the target tokens;
and determining the emotion type corresponding to the text to be classified by utilizing a preset Transformer model and a classifier based on the emotion token probability sequence.
For the specific implementation of the above method steps, reference may be made to embodiment one; the details are not repeated here.
EXAMPLE five
The present embodiments also provide a computer readable storage medium, such as a flash memory, a hard disk, a multimedia card, a card type memory (e.g., SD or DX memory, etc.), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read Only Memory (ROM), an Electrically Erasable Programmable Read Only Memory (EEPROM), a Programmable Read Only Memory (PROM), a magnetic memory, a magnetic disk, an optical disk, a server, an App application mall, etc., having stored thereon a computer program that when executed by a processor implements the method steps of:
acquiring a text to be classified, and converting keywords contained in the text to be classified into tokens to form a token sequence; wherein the token sequence comprises T tokens;
inputting the token sequence into a preset token processing model to obtain a probability matrix; wherein the probability matrix is a matrix of T columns and (T+1) rows, each column of the probability matrix characterizing a token, the first row of the probability matrix characterizing a probability value of each token belonging to an aspect term, the second row to the (T+1)-th row characterizing a probability value of each token belonging to an emotion token mapped with the corresponding aspect term;
setting the tokens with the probability values belonging to the aspect terms larger than a preset threshold value as target tokens, and forming an emotion token probability sequence by the probability values of the second row to the (T+1)-th row corresponding to the target tokens;
and determining the emotion type corresponding to the text to be classified by utilizing a preset Transformer model and a classifier based on the emotion token probability sequence.
For the specific implementation of the above method steps, reference may be made to embodiment one; the details are not repeated here.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. An aspect-level emotion classification method, the method comprising:
acquiring a text to be classified, and converting keywords contained in the text to be classified into tokens to form a token sequence; wherein the token sequence comprises T tokens;
inputting the token sequence into a preset token processing model to obtain a probability matrix; wherein the probability matrix is a matrix of T columns and (T+1) rows, each column of the probability matrix characterizing a token, the first row of the probability matrix characterizing a probability value of each token belonging to an aspect term, the second row to the (T+1)-th row characterizing a probability value of each token belonging to an emotion token mapped with the corresponding aspect term;
setting the tokens with the probability values belonging to the aspect terms larger than a preset threshold value as target tokens, and forming an emotion token probability sequence by the probability values of the second row to the (T+1)-th row corresponding to the target tokens;
and determining the emotion type corresponding to the text to be classified by utilizing a preset Transformer model and a classifier based on the emotion token probability sequence.
2. The aspect-level emotion classification method according to claim 1, wherein the obtaining a text to be classified and converting keywords contained in the text to be classified into tokens to form a token sequence specifically includes:
performing word segmentation processing on the text to be classified to obtain T keywords contained in the text to be classified;
coding each keyword respectively to obtain a token of each keyword;
all tokens are grouped into the token sequence.
3. The aspect-level emotion classification method according to claim 1, wherein the step of inputting the token sequence into a preset token processing model to obtain a probability matrix specifically includes:
inputting the token sequence into the BERT model in the token processing model to obtain, for each token, a feature characterization that fuses the information of the preceding and following tokens, and forming all the feature characterizations into a feature characterization sequence;
inputting the feature characterization sequence into a full-connection layer in the token processing model, and performing normalization processing on the output of the full-connection layer through a softmax function to obtain the probability matrix.
4. The aspect-level emotion classification method according to claim 3, wherein the determining, based on the emotion token probability sequence and by using a preset Transformer model and a classifier, an emotion type corresponding to the text to be classified specifically includes:
multiplying the characteristic characterization sequence and the emotion token probability sequence by elements at a vector level to obtain a characterization result;
and sequentially inputting the representation result into the Transformer model and the classifier to obtain the emotion type corresponding to the text to be classified.
5. The aspect level emotion classification method of claim 1, further comprising:
acquiring a sample text set; each sample text in the sample text set is marked with a corresponding aspect term and an emotion type;
training an initial neural network model based on the sample text set to correct each parameter in the initial neural network model to obtain an emotion classification model; wherein the emotion classification model comprises: a token processing model, a Transformer model, and a classifier.
6. An aspect-level emotion classification apparatus, the apparatus comprising:
an acquisition module, which is used for acquiring a text to be classified and converting the keywords contained in the text to be classified into tokens to form a token sequence; wherein the token sequence comprises T tokens;
the input module is used for inputting the token sequence into a preset token processing model to obtain a probability matrix; wherein the probability matrix is a matrix of T columns and (T+1) rows, each column of the probability matrix characterizing a token, the first row of the probability matrix characterizing a probability value of each token belonging to an aspect term, the second row to the (T+1)-th row characterizing a probability value of each token belonging to an emotion token mapped with the corresponding aspect term;
the processing module is used for setting the tokens with the probability values belonging to the aspect terms larger than a preset threshold value as target tokens and forming an emotion token probability sequence by the probability values of the second row to the (T+1)-th row corresponding to the target tokens;
and the determining module is used for determining the emotion type corresponding to the text to be classified by utilizing a preset Transformer model and a classifier based on the emotion token probability sequence.
7. The aspect-level emotion classification apparatus of claim 6, wherein the obtaining module is specifically configured to:
performing word segmentation processing on the text to be classified to obtain T keywords contained in the text to be classified;
coding each keyword respectively to obtain a token of each keyword;
all tokens are grouped into the token sequence.
8. The aspect-level emotion classification apparatus of claim 6, wherein the input module is specifically configured to:
inputting the token sequence into the BERT model in the token processing model to obtain, for each token, a feature characterization that fuses the information of the preceding and following tokens, and forming all the feature characterizations into a feature characterization sequence;
inputting the feature characterization sequence into a full-connection layer in the token processing model, and performing normalization processing on the output of the full-connection layer through a softmax function to obtain the probability matrix.
9. A computer device, the computer device comprising: a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor implements the steps of the method of any one of claims 1 to 5 when executing the computer program.
10. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 5.
CN202011227233.7A 2020-11-06 2020-11-06 Method, device and equipment for classifying aspect level emotion and readable storage medium Withdrawn CN112347787A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011227233.7A CN112347787A (en) 2020-11-06 2020-11-06 Method, device and equipment for classifying aspect level emotion and readable storage medium
PCT/CN2021/091198 WO2022095376A1 (en) 2020-11-06 2021-04-29 Aspect-based sentiment classification method and apparatus, device, and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011227233.7A CN112347787A (en) 2020-11-06 2020-11-06 Method, device and equipment for classifying aspect level emotion and readable storage medium

Publications (1)

Publication Number Publication Date
CN112347787A 2021-02-09

Family

ID=74428904

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011227233.7A Withdrawn CN112347787A (en) 2020-11-06 2020-11-06 Method, device and equipment for classifying aspect level emotion and readable storage medium

Country Status (2)

Country Link
CN (1) CN112347787A (en)
WO (1) WO2022095376A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022095376A1 (en) * 2020-11-06 2022-05-12 平安科技(深圳)有限公司 Aspect-based sentiment classification method and apparatus, device, and readable storage medium
CN114492387A (en) * 2022-04-18 2022-05-13 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Domain self-adaptive aspect term extraction method and system based on syntactic structure
CN116737938A (en) * 2023-07-19 2023-09-12 人民网股份有限公司 Fine granularity emotion detection method and device based on fine tuning large model online data network
CN117688611A (en) * 2024-01-30 2024-03-12 深圳昂楷科技有限公司 Electronic medical record desensitizing method and system, electronic equipment and storage medium

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115659995B (en) * 2022-12-30 2023-05-23 荣耀终端有限公司 Text emotion analysis method and device

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110851604A (en) * 2019-11-12 2020-02-28 中科鼎富(北京)科技发展有限公司 Text classification method and device, electronic equipment and storage medium
CN111444709A (en) * 2020-03-09 2020-07-24 腾讯科技(深圳)有限公司 Text classification method, device, storage medium and equipment

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7937263B2 (en) * 2004-12-01 2011-05-03 Dictaphone Corporation System and method for tokenization of text using classifier models
CN110633730B (en) * 2019-08-07 2023-05-23 中山大学 Deep learning machine reading understanding training method based on course learning
CN111143553B (en) * 2019-12-06 2023-04-07 国家计算机网络与信息安全管理中心 Method and system for identifying specific information of real-time text data stream
CN112632982A (en) * 2020-10-29 2021-04-09 国网浙江省电力有限公司湖州供电公司 Dialogue text emotion analysis method capable of being used for supplier evaluation
CN112347787A (en) * 2020-11-06 2021-02-09 平安科技(深圳)有限公司 Method, device and equipment for classifying aspect level emotion and readable storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110851604A (en) * 2019-11-12 2020-02-28 中科鼎富(北京)科技发展有限公司 Text classification method and device, electronic equipment and storage medium
CN111444709A (en) * 2020-03-09 2020-07-24 腾讯科技(深圳)有限公司 Text classification method, device, storage medium and equipment

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022095376A1 (en) * 2020-11-06 2022-05-12 平安科技(深圳)有限公司 Aspect-based sentiment classification method and apparatus, device, and readable storage medium
CN114492387A (en) * 2022-04-18 2022-05-13 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Domain self-adaptive aspect term extraction method and system based on syntactic structure
CN114492387B (en) * 2022-04-18 2022-07-19 哈尔滨工业大学(深圳)(哈尔滨工业大学深圳科技创新研究院) Domain self-adaptive aspect term extraction method and system based on syntactic structure
CN116737938A (en) * 2023-07-19 2023-09-12 人民网股份有限公司 Fine granularity emotion detection method and device based on fine tuning large model online data network
CN117688611A (en) * 2024-01-30 2024-03-12 深圳昂楷科技有限公司 Electronic medical record desensitizing method and system, electronic equipment and storage medium

Also Published As

Publication number Publication date
WO2022095376A1 (en) 2022-05-12

Similar Documents

Publication Publication Date Title
CN110347835B (en) Text clustering method, electronic device and storage medium
CN112347787A (en) Method, device and equipment for classifying aspect level emotion and readable storage medium
CN110188202B (en) Training method and device of semantic relation recognition model and terminal
CN111488931B (en) Article quality evaluation method, article recommendation method and corresponding devices
CN107291840B (en) User attribute prediction model construction method and device
CN110362723A (en) A kind of topic character representation method, apparatus and storage medium
CN111368075A (en) Article quality prediction method and device, electronic equipment and storage medium
US20210295109A1 (en) Method and device for generating painting display sequence, and computer storage medium
CN111985243B (en) Emotion model training method, emotion analysis device and storage medium
CN116737938A (en) Fine granularity emotion detection method and device based on fine tuning large model online data network
CN115730597A (en) Multi-level semantic intention recognition method and related equipment thereof
CN111709225A (en) Event cause and effect relationship judging method and device and computer readable storage medium
KR102410715B1 (en) Apparatus and method for analyzing sentiment of text data based on machine learning
CN116882414B (en) Automatic comment generation method and related device based on large-scale language model
CN113934834A (en) Question matching method, device, equipment and storage medium
CN111445545B (en) Text transfer mapping method and device, storage medium and electronic equipment
CN113486143A (en) User portrait generation method based on multi-level text representation and model fusion
CN117349402A (en) Emotion cause pair identification method and system based on machine reading understanding
CN116450819A (en) Multi-mode emotion recognition method and system based on self-adaptive fusion
CN115600605A (en) Method, system, equipment and storage medium for jointly extracting Chinese entity relationship
CN111566665B (en) Apparatus and method for applying image coding recognition in natural language processing
CN113591881A (en) Intention recognition method and device based on model fusion, electronic equipment and medium
CN112950261A (en) Method and system for determining user value
CN113704623A (en) Data recommendation method, device, equipment and storage medium
CN114547435A (en) Content quality identification method, device, equipment and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20210209