WO2021139107A1

WO2021139107A1 - Intelligent emotion recognition method and apparatus, electronic device, and storage medium

Info

Publication number: WO2021139107A1
Application number: PCT/CN2020/098962
Authority: WO
Inventors: 蒋江涛; 马骏; 王少军
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-01-10
Filing date: 2020-06-29
Publication date: 2021-07-15
Also published as: CN111241828A

Abstract

An intelligent emotion recognition method and apparatus, an electronic device, and a computer-readable storage medium, which relate to artificial intelligence technology. The method comprises: acquiring a text conversation set, and performing de-duplication and word segmentation operations on the text conversation set to obtain a word set; calculating an importance score of a word in the word set, selecting, according to a preset manner, words in the word set as a topic sequence set of the text conversation set, and converting the topic sequence set into a topic vector set; converting the word set into a word vector set, and then inputting the word vector set into a pre-trained sentence vector prediction model, and outputting a sentence vector set corresponding to the word vector set; and performing coding and decoding operations on the topic vector set and the sentence vector set by using a pre-trained intelligent conversation emotion recognition model to obtain a corresponding emotion expression, thereby completing emotion recognition of a text conversation. The method realizes more accurate emotion recognition for text conversations.

Description

Emotional intelligent recognition method, device, electronic equipment and storage medium

This application claims the priority of a Chinese patent application filed with the Chinese Patent Office on January 10, 2020, the application number is CN202010034202.3, and the invention title is "Emotional Intelligent Recognition Method, Device, and Computer-readable Storage Medium", and its entire contents Incorporated in this application by reference.

Technical field

This application relates to the field of artificial intelligence technology, and in particular to a method, device, electronic equipment, and computer-readable storage medium for emotional intelligent recognition in natural language processing.

Background technique

Emotion recognition is an important task of natural language processing and an important field in the field of artificial intelligence. It also has a wide range of applications in the industry, such as product review emotion recognition, text-based intelligent customer service emotion recognition, and user emotion recognition for discussion topics, etc. . Therefore, how to improve the accuracy of emotions is of great significance. The inventor realizes that the current common emotion recognition methods rarely consider the context of the user’s expression, resulting in the inability to correctly recognize the user’s true emotions in the emotion recognition process, especially in the context of text-based dialogue, the intelligent customer service and the user are often right Round dialogue, in the process of dialogue, the user's emotions are affected by many aspects such as the topic of discussion and the expression of intelligent customer service.

Summary of the invention

This application provides an emotional intelligent recognition method, device, electronic equipment, and computer-readable storage medium.

An emotional intelligent recognition method provided by this application includes:

Acquiring a text dialogue set, performing a deduplication operation on the text dialogue set to obtain a standard text dialogue set, and performing a word segmentation operation on the standard text dialogue set to obtain a word set;

Calculate the importance score of the words in the word set, select the words in the word set as the subject sequence set of the standard text dialogue set according to the importance score of the words, and select the subject sequence set in the standard text dialogue set in a preset manner, and set the subject sequence set Converted to theme vector set;

Inputting the word set into a pre-trained sentence vector prediction model after converting the word vector set, and outputting a sentence vector set corresponding to the word vector set;

A pre-trained intelligent dialogue emotion recognition model is used to encode and decode the topic vector set and sentence vector set to obtain the corresponding emotion expression, thereby completing the emotion recognition of the text dialogue.

The present application also provides an electronic device that includes a memory and a processor. The memory stores an emotional intelligence recognition program that can run on the processor. When the emotional intelligence recognition program is executed by the processor, To achieve the following steps:

The present application also provides a computer-readable storage medium having an emotional intelligence recognition program stored on the computer-readable storage medium, and the emotional intelligence recognition program can be executed by one or more processors to implement the following steps:

Calculate the importance score of the words in the word set, select the words in the word set as the subject sequence set of the standard text dialogue set according to the importance score of the words and in a preset manner, and convert the subject sequence set Is the theme vector set;

In addition, in order to achieve the above objective, this application also provides an emotional intelligent recognition device, including:

The deduplication word segmentation module is used to obtain a text dialogue set, after performing a deduplication operation on the text dialogue set, a standard text dialogue set is obtained, and a word segmentation operation is performed on the standard text dialogue set to obtain a word set.

The calculation conversion module is used to calculate the importance score of the words in the word set, select corresponding words as the subject sequence set of the standard text dialogue set according to the importance score of the words, and according to a preset method, and The subject sequence set is converted into a subject vector set.

The sentence vector prediction module is used to convert the word set into a word vector set and input it into a pre-trained sentence vector prediction model, and output a sentence vector set corresponding to the word vector set.

The emotion recognition module is used to encode and decode the topic vector set and sentence vector set by using a pre-trained intelligent dialogue emotion recognition model to obtain the corresponding emotion expression, thereby completing the emotion recognition of the text dialogue.

Description of the drawings

FIG. 1 is a schematic flowchart of an emotional intelligence recognition method provided by an embodiment of this application;

2 is a schematic diagram of the internal structure of an electronic device provided by an embodiment of the application;

FIG. 3 is a schematic diagram of modules of an emotional intelligent recognition device provided by an embodiment of the application.

The realization, functional characteristics, and advantages of the purpose of this application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed ways

It should be understood that the specific embodiments described here are only used to explain the present application, and are not used to limit the present application.

This application provides an emotional intelligent recognition method. Referring to FIG. 1, it is a schematic flowchart of an emotional intelligence recognition method provided by an embodiment of this application. The method can be executed by a device, and the device can be implemented by software and/or hardware.

In this embodiment, the emotional intelligence recognition method includes:

S1. Obtain a text dialogue set, perform a deduplication operation on the text dialogue set to obtain a standard text dialogue set, and perform a word segmentation operation on the standard text dialogue set to obtain a word set.

In a preferred embodiment of the present application, the text dialogue set is a text dialogue set formed by recording the dialogue between a customer service and a user, and the customer service may be a front desk customer service, sales customer service, after-sales customer service, etc. of Ping An of China.

Further, since there are usually repeated dialog texts in the dialog text set, a large amount of repeated data will affect the classification accuracy. Therefore, in the embodiment of the present application, the standard text dialog set is obtained by performing deduplication operations on the dialog text set first. Wherein, the deduplication operation includes:

Wherein, n represents the number of text dialogs in the text dialog set, w _1j and w _2j represent any two text dialogs in the text dialog set, and d represents the distance between any two text dialogs.

Preferably, in this application, if the distance between any two text dialogs is less than a preset threshold, then any one of the two text dialogs is deleted. Wherein, the preset threshold value is 0.1.

Preferably, the word segmentation operation in this application includes: matching the words in the text dialogue set with the words in the preset dictionary through a preset strategy to obtain multiple words, and using spaces for the words Symbols are separated to obtain the set of words.

In a preferred embodiment of the present application, the preset dictionary may include a statistical dictionary and a prefix dictionary.

The statistical dictionary is a dictionary constructed by all possible word segmentation obtained by statistical methods. The statistical dictionary counts the frequency of the contribution of adjacent characters in the corpus and calculates mutual information. When the mutual information of adjacent characters is greater than a preset threshold, it is recognized as a constituent word, wherein the threshold is 0.6.

The prefix dictionary includes the prefix of each word segment in the statistical dictionary. For example, the prefixes of the word "Peking University" in the statistical dictionary are "North", "Beijing", and "Beijing University"; The prefix is "big" and so on. This application uses the statistical dictionary to obtain possible word segmentation results of the text dialogue set, and obtains the final segmentation form according to the segmentation position of the word through the prefix dictionary, thereby obtaining the word set of the text dialogue set.

S2. Calculate the importance score of the words in the word set, select the words in the word set as the subject sequence set of the standard text dialogue set according to the importance score of the words, and select the words in the word set in a preset manner, and combine the subject The sequence set is converted to the subject vector set.

In a preferred embodiment of the present application, the calculation of the importance score of the words in the word set includes:

Calculate the dependency correlation degree of any two words W _i and W _{j in the word set:}

Among them, Dep(W _i , W _j ) represents the degree of dependency relationship between _{the words W i} and W _j _{, len(W i} , W _j ) represents the length of the dependency path between _{the words W i} and W _{j, and b is} Hyperparameter

Calculate the gravitational forces of _{the words W i} and W _j according to the dependency correlation degree:

Among them, f _grav (W _i , W _j ) represents the _{gravitational force of the words W i} and W _j , tfidf(W _i ) represents the TF-IDF value of the _{word W i} _{, and tfidf(W j} ) represents _{the TF-IDF of the word W j} IDF value, TF means word frequency, IDF means inverse document frequency index, d is the Euclidean distance between the word vectors _{of words W i} and W _j;

The correlation strength between the words W _i and W _{j is obtained as:}

weight(W _i ,W _j )=Dep(W _i ,W _j )*f _grav (W _i ,W _j )

The words W _i is calculated based on the strength of association importance score:

among them,

Is the set related to the vertex W _i , and η is the damping coefficient.

Preferably, the preset method in the present application is to select t words with the highest scores as the subject sequence set of the standard text dialogue set according to the importance score of the word.

Further, the preferred embodiment of the present application converts the topic sequence set into a topic vector set (sent_topic_vec_i) through the word2vec model, where i represents the index of the sentence that completes the conversation at one time. The word2vec model refers to an efficient algorithm model that represents words as real-valued vectors. It uses the idea of deep learning to simplify the processing of text content to vector operations in a K-dimensional vector space through training. The similarity in the vector space can be used to represent the semantic similarity of the text. In detail, the word2vec model training process includes: selecting a window of an appropriate size as the context, the input layer of the word2vec model reads the words in the window, and the vector of the words in the window (K-dimensional, initial random) Add them together to form K nodes in the hidden layer; the output layer of the word2vec model is a huge binary tree, and the leaf nodes represent all the words in the corpus (the corpus contains V independent words, then the binary tree has |V| leaves Node); for each word of the leaf node, there will be a globally unique code, like "010011", which can indicate that the left subtree is 1, and the right subtree is 0; because, in the hidden layer of the word2vec model Each node will have an edge with the inner node of the binary tree, so for each inner node of the binary tree there will be K connected edges, and each edge will also have a weight. Therefore, the word2vec model after training can convert the above topic sequence into a topic vector.

S3. The word set is converted into a word vector set and then input into a pre-trained sentence vector prediction model, and a sentence vector set corresponding to the word vector set is output.

In a preferred embodiment of the present application, the sentence vector prediction model is a two-way Transformer model. The Transformer model processes all word vectors in the input sequence in parallel, and at the same time uses a self-attention mechanism to combine context with distant word vectors to output corresponding sentence vectors. Furthermore, this application trains the two-way Transformer model through a fine-turning method and a random concealment method. The fine-turing method extracts the shallow features of the available neural network, modifies the parameters in the deep neural network, and builds a new neural network model to reduce the number of iterations. The random masking method does not predict each word as in the word2vec model, but randomly masks part of the input word vector from the input, and its goal is to predict the original vocabulary of the masked word vector based on its context.

S4. Use a pre-trained intelligent dialogue emotion recognition model to perform encoding and decoding operations on the topic vector set and sentence vector set to obtain corresponding emotion expressions, thereby completing the emotion recognition of the text dialogue.

In a preferred embodiment of the present application, the pre-built intelligent dialogue emotion recognition model includes an input layer, a hidden layer and an output layer.

Preferably, the training process of the intelligent dialogue emotion recognition model in this application is as follows:

a. Construct a loss function. According to the basic formula in deep learning, the input and output of each layer are

C _i = f(z _i ), where

Is the input of the _{i-th neuron in the l-th layer network, Ws i-1} is the link from the i-th neuron in the l-th layer network to the j-th neuron in the l+1-th layer network, and C _j is each unit of the output layer To construct the loss function

b. Use the gradient descent algorithm to update the parameter values of the loss function. The gradient descent algorithm is the most commonly used optimization algorithm for neural network model training. In the embodiment of this application, to find the loss function

This application updates the variable y along the direction opposite to the gradient vector -dL/dy, so that the gradient can be reduced the fastest until the loss converges to the minimum value. The parameter update formula is as follows: L=L-αdL/dy, α represents the learning rate, so that the trained intelligent dialogue emotion recognition model can be obtained for the analysis of the emotion state.

Preferably, the encoding and decoding operations include: receiving the topic vector set and the sentence vector set through the input layer, and using the hidden layer to perform an encoding operation on the topic vector set and the sentence vector set to obtain the The feature sequence set of the topic vector set and the sentence vector set, the dynamic programming algorithm is used to decode the probability set of the state sequence corresponding to the feature sequence set, and the state sequence corresponding to the maximum probability in the probability set of the state sequence is used as the corresponding emotion expression The output result.

In a preferred embodiment of the present application, the dynamic programming algorithm is used to find the Viterbi path-hidden state sequence that is most likely to generate the observed event sequence. Wherein, the dynamic programming algorithm includes:

V _1,k = p(y ₁ |k)·π _k

V _t,k = p(y _t |k)·max _x∈S (a _x,k ·V _t-1,x )

Among them, V _1,k represents the probability of the output state sequence corresponding to k for the first state sequence, p represents the probability value, y ₁ represents the output value of the first state sequence, k represents the state sequence, and π _k represents the initial state k. Probability dimension; V _t,k represents the probability of the state sequence corresponding to the output of _{the t-th state, y t} represents the output value of the t-th state sequence, S represents the state space, and a _x,k represents the state sequence from the state x to the state k The transition probability, V _t-1,x represents the probability of the state sequence corresponding to the output of the t-1 state as x.

The application also provides an electronic device. Referring to FIG. 2, it is a schematic diagram of the internal structure of an electronic device provided by an embodiment of this application.

In this embodiment, the electronic device 1 may be a PC (Personal Computer, personal computer), or a terminal device such as a smart phone, a tablet computer, or a portable computer, or a server. The electronic device 1 at least includes a memory 11, a processor 12, a communication bus 13, and a network interface 14.

The memory 11 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, and the like. The memory 11 may be an internal storage unit of the electronic device 1 in some embodiments, such as a hard disk of the electronic device 1. In other embodiments, the memory 11 may also be an external storage device of the electronic device 1, such as a plug-in hard disk equipped on the electronic device 1, a smart memory card (Smart Media Card, SMC), and a Secure Digital (SD) Card, Flash Card, etc. Further, the memory 11 may also include both an internal storage unit of the electronic device 1 and an external storage device. The memory 11 can be used not only to store application software and various data installed in the electronic device 1, such as the code of the emotional intelligence recognition program 01, etc., but also to temporarily store data that has been output or will be output.

In some embodiments, the processor 12 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, for running program codes or processing stored in the memory 11 Data, such as the execution of emotional intelligence recognition program 01 and so on.

The communication bus 13 is used to realize the connection and communication between these components.

The network interface 14 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is generally used to establish a communication connection between the electronic device 1 and other electronic devices.

Optionally, the electronic device 1 may also include a user interface. The user interface may include a display (Display) and an input unit such as a keyboard (Keyboard). The optional user interface may also include a standard wired interface and a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode, organic light-emitting diode) touch device, etc. Among them, the display can also be appropriately called a display screen or a display unit, which is used to display the information processed in the electronic device 1 and to display a visualized user interface.

Figure 2 only shows the electronic device 1 with the components 11-14 and the emotional intelligent recognition program 01. Those skilled in the art can understand that the structure shown in Figure 1 does not constitute a limitation on the electronic device 1, and may include ratios Fewer or more parts are shown, or some parts are combined, or different parts are arranged.

In the embodiment of the electronic device 1 shown in FIG. 2, the memory 11 stores the emotional intelligence recognition program 01; when the processor 12 executes the emotional intelligence recognition program 01 stored in the memory 11, the following steps are implemented:

Step 1: Obtain a text dialogue set, perform a deduplication operation on the text dialogue set to obtain a standard text dialogue set, and perform a word segmentation operation on the standard text dialogue set to obtain a word set.

Preferably, in this application, if the distance between any two text dialogs is less than a preset threshold, any one of the two text dialogs is deleted. Wherein, the preset threshold value is 0.1.

Step 2: Calculate the importance score of the words in the word set, select the words in the word set as the subject sequence set of the standard text dialogue set according to the importance score of the words, and select the words in the word set in a preset manner, and combine the The subject sequence set is converted to the subject vector set.

The correlation strength between the words W _i and W _{j is obtained as:}

weight(W _i ,W _j )=Dep(W _i ,W _j )*f _grav (W _i ,W _j )

among them,

Is the set related to the vertex W _i , and η is the damping coefficient.

Preferably, the preset method in this application is to select t words with the highest scores as the subject sequence set of the standard text dialogue set according to the importance score of the word.

Step 3: The word set is converted into a word vector set and then input into a pre-trained sentence vector prediction model, and a sentence vector set corresponding to the word vector set is output.

Step 4: Use a pre-trained intelligent dialogue emotion recognition model to perform encoding and decoding operations on the topic vector set and sentence vector set to obtain the corresponding emotion expression, thereby completing the emotion recognition of the text dialogue.

C _i = f(z _i ), where

V _1,k = p(y ₁ |k)·π _k

V _t,k = p(y _t |k)·max _x∈S (a _x,k ·V _t-1,x )

Referring to FIG. 3, this is a schematic diagram of the emotional intelligence recognition device 100 of this application. In this embodiment, the emotional intelligence recognition device 100 includes a deduplication word segmentation module 10, a calculation conversion module 20, a sentence vector prediction module 30, and an emotion recognition module 40. Illustratively:

The deduplication word segmentation module 10 is used to obtain a text dialogue set, perform a deduplication operation on the text dialogue set to obtain a standard text dialogue set, and perform a word segmentation operation on the standard text dialogue set to obtain a word set.

The calculation conversion module 20 is configured to calculate the importance scores of words in the word set, select corresponding words according to the importance scores of the words, and select corresponding words as the topic sequence set of the standard text dialogue set in a preset manner , And convert the subject sequence set into a subject vector set.

The sentence vector prediction module 30 is configured to: convert the word set into a word vector set and input it into a pre-trained sentence vector prediction model, and output a sentence vector set corresponding to the word vector set.

The emotion recognition module 40 is configured to use a pre-trained intelligent dialogue emotion recognition model to perform encoding and decoding operations on the topic vector set and sentence vector set to obtain corresponding emotion expressions, thereby completing the emotion recognition of the text dialogue.

The functions or operation steps implemented by the program modules such as the deduplication word segmentation module 10, the calculation conversion module 20, the sentence vector prediction module 30, and the emotion recognition module 40 when executed are substantially the same as those in the foregoing embodiment, and will not be repeated here.

In addition, the embodiment of the present application also proposes a computer-readable storage medium. The computer-readable storage medium may be non-volatile or volatile. The computer-readable storage medium stores an emotional intelligence recognition program. The emotional intelligence recognition program can be executed by one or more processors to achieve the following operations:

Calculate the importance scores of the words in the word set, select corresponding words according to the importance scores of the words, and select corresponding words as the subject sequence set of the standard text dialogue set in a preset manner, and convert the subject sequence set Is the theme vector set;

The specific implementation manners of the computer-readable storage medium of the present application are basically the same as the foregoing embodiments of the electronic device and method, and will not be repeated here.

It should be noted that the serial numbers of the above-mentioned embodiments of the present application are only for description, and do not represent the superiority or inferiority of the embodiments. And the terms "include", "include" or any other variants thereof in this article are intended to cover non-exclusive inclusion, so that a process, device, article or method including a series of elements not only includes those elements, but also includes those elements that are not explicitly included. The other elements listed may also include elements inherent to the process, device, article, or method. If there are no more restrictions, the element defined by the sentence "including a..." does not exclude the existence of other identical elements in the process, device, article, or method that includes the element.

Through the description of the above implementation manners, those skilled in the art can clearly understand that the above-mentioned embodiment method can be implemented by means of software plus the necessary general hardware platform, of course, it can also be implemented by hardware, but in many cases the former is better.的实施方式。 Based on this understanding, the technical solution of this application essentially or the part that contributes to the existing technology can be embodied in the form of a software product, and the computer software product is stored in a storage medium (such as ROM/RAM) as described above. , Magnetic disks, optical disks), including several instructions to make a terminal device (which can be a mobile phone, a computer, a server, or a network device, etc.) execute the method described in each embodiment of the present application.

The above are only the preferred embodiments of the application, and do not limit the scope of the patent for this application. Any equivalent structure or equivalent process transformation made using the content of the description and drawings of the application, or directly or indirectly applied to other related technical fields , The same reason is included in the scope of patent protection of this application.

Claims

An emotional intelligence recognition method, wherein the method includes:

Acquiring a text dialogue set, performing a deduplication operation on the text dialogue set to obtain a standard text dialogue set, and performing a word segmentation operation on the standard text dialogue set to obtain a word set;

Calculate the importance score of the words in the word set, select the words in the word set as the subject sequence set of the standard text dialogue set according to the importance score of the words and in a preset manner, and convert the subject sequence set Is the theme vector set;

Inputting the word set into a pre-trained sentence vector prediction model after converting the word vector set, and outputting a sentence vector set corresponding to the word vector set;

A pre-trained intelligent dialogue emotion recognition model is used to encode and decode the topic vector set and sentence vector set to obtain the corresponding emotion expression, thereby completing the emotion recognition of the text dialogue.
5. The emotional intelligence recognition method of claim 1, wherein the deduplication operation comprises:

Wherein, n represents the number of text dialogs in the text dialog set, w 1j and w 2j represent any two text dialogs in the text dialog set, and d represents the distance between any two text dialogs.
5. The emotional intelligence recognition method according to claim 1, wherein said calculating the importance score of words in said word set comprises:

Calculate the dependency correlation degree of any two words W i and W j in the word set:

Among them, Dep(W i , W j ) represents the degree of dependency relationship between the words W i and W j , len(W i , W j ) represents the length of the dependency path between the words W i and W j, and b is Hyperparameter

Calculate the gravitational forces of the words W i and W j according to the dependency correlation degree:

Among them, f grav (W i , W j ) represents the gravitational force of the words W i and W j , tfidf(W i ) represents the TF-IDF value of the word W i , and tfidf(W j ) represents the TF-IDF of the word W j IDF value, TF means word frequency, IDF means inverse document frequency index, d is the Euclidean distance between the word vectors of words W i and W j;

According to the dependency correlation degree and the gravity, the correlation strength between the words W i and W j is:

weight(W i ,W j )=Dep(W i ,W j )*f grav (W i ,W j )

The words W i is calculated based on the strength of association importance score:

among them,
Is the set related to the vertex W i , and η is the damping coefficient.
The emotional intelligence recognition method according to any one of claims 1 to 3, wherein the encoding and decoding operations include:

Performing an encoding operation on the topic vector set and the sentence vector set to obtain the feature sequence set of the topic vector set and the sentence vector set;

A dynamic programming algorithm is used to decode the probability set of the state sequence corresponding to the feature sequence set, and the state sequence corresponding to the maximum probability in the probability set of the state sequence is used as the output result of the corresponding emotional expression.
The emotional intelligence recognition method according to claim 4, wherein the dynamic programming algorithm comprises:

V 1,k = p(y 1 |k)·π k

V t,k = p(y t |k)·max x∈S (a x,k ·V t-1,x )

Among them, V 1,k represents the probability of the output state sequence corresponding to k for the first state sequence, p represents the probability value, y 1 represents the output value of the first state sequence, k represents the state sequence, and π k represents the initial state k. Probability dimension; V t,k represents the probability of the state sequence corresponding to the output of the t-th state, y t represents the output value of the t-th state sequence, S represents the state space, and a x,k represents the state sequence from the state x to the state k The transition probability, V t-1,x represents the probability of the state sequence corresponding to the output of the t-1 state as x.
The emotional intelligent recognition method according to any one of claims 1 to 3, wherein the word segmentation operation comprises: matching words in the text dialogue set with entries in a preset dictionary through a preset strategy To obtain a plurality of words, and separate the words with spaces to obtain the word set.
8. The method for emotional intelligent recognition of claim 6, wherein the preset dictionary includes a statistical dictionary and a prefix dictionary.
An electronic device, wherein the device includes a memory and a processor, the memory stores an emotional intelligence recognition program that can run on the processor, and the emotional intelligence recognition program is implemented when the processor is executed The following steps:

Acquiring a text dialogue set, performing a deduplication operation on the text dialogue set to obtain a standard text dialogue set, and performing a word segmentation operation on the standard text dialogue set to obtain a word set;

Calculate the importance score of the words in the word set, select the words in the word set as the subject sequence set of the standard text dialogue set according to the importance score of the words and in a preset manner, and convert the subject sequence set Is the theme vector set;

Inputting the word set into a pre-trained sentence vector prediction model after converting the word vector set, and outputting a sentence vector set corresponding to the word vector set;

A pre-trained intelligent dialogue emotion recognition model is used to encode and decode the topic vector set and sentence vector set to obtain the corresponding emotion expression, thereby completing the emotion recognition of the text dialogue.
The electronic device according to claim 8, wherein the deduplication operation comprises:

Wherein, n represents the number of text dialogs in the text dialog set, w 1j and w 2j represent any two text dialogs in the text dialog set, and d represents the distance between any two text dialogs.
8. The electronic device according to claim 8, wherein said calculating the importance score of the words in the word set comprises:

Calculate the dependency correlation degree of any two words W i and W j in the word set:

Among them, Dep(W i , W j ) represents the degree of dependency relationship between the words W i and W j , len(W i , W j ) represents the length of the dependency path between the words W i and W j, and b is Hyperparameter

Calculate the gravitational forces of the words W i and W j according to the dependency correlation degree:

Among them, f grav (W W , W j ) represents the gravitational force of the words W i and W j , tfidf(W i ) represents the TF-IDF value of the word W i , and tfidf(W j ) represents the TF-IDF value of the word W j IDF value, TF means word frequency, IDF means inverse document frequency index, d is the Euclidean distance between the word vectors of words W i and W j;

According to the dependency correlation degree and the gravity, the correlation strength between the words W i and W j is:

weight(W i ,W j )=Dep(W i ,W j )*f grav (W W ,W j )

The words W i is calculated based on the strength of association importance score:

among them,
Is the set related to the vertex W i , and η is the damping coefficient.
The electronic device according to any one of claims 8 to 10, wherein the encoding and decoding operations include:

Performing an encoding operation on the topic vector set and the sentence vector set to obtain the feature sequence set of the topic vector set and the sentence vector set;

A dynamic programming algorithm is used to decode the probability set of the state sequence corresponding to the feature sequence set, and the state sequence corresponding to the maximum probability in the probability set of the state sequence is used as the output result of the corresponding emotional expression.
The electronic device of claim 11, wherein the dynamic programming algorithm comprises:

V 1,k = p(y 1 |k)·π k

V t,k = p(y t |k)·max x∈S (a x,k ·V t-1,x )

Among them, V 1,k represents the probability of the output state sequence corresponding to k for the first state sequence, p represents the probability value, y 1 represents the output value of the first state sequence, k represents the state sequence, and π k represents the initial state k. Probability dimension; V t,k represents the probability of the state sequence corresponding to the output of the t-th state, y t represents the output value of the t-th state sequence, S represents the state space, and a x,k represents the state sequence from the state x to the state k The transition probability, V t-1,x represents the probability of the state sequence corresponding to the output of the t-1 state as x.
The electronic device according to any one of claims 8 to 10, wherein the word segmentation operation comprises: matching words in the text dialogue set with entries in a preset dictionary through a preset strategy to obtain Multiple words, and separate the words with spaces to obtain the word set.
The electronic device according to claim 13, wherein the preset dictionary includes a statistical dictionary and a prefix dictionary.
A computer-readable storage medium, wherein an emotional intelligence recognition program is stored on the computer-readable storage medium, and the emotional intelligence recognition program can be executed by one or more processors to implement the following steps:

Acquiring a text dialogue set, performing a deduplication operation on the text dialogue set to obtain a standard text dialogue set, and performing a word segmentation operation on the standard text dialogue set to obtain a word set;

Calculate the importance score of the words in the word set, select the words in the word set as the subject sequence set of the standard text dialogue set according to the importance score of the words and in a preset manner, and convert the subject sequence set Is the theme vector set;

Inputting the word set into a pre-trained sentence vector prediction model after converting the word vector set, and outputting a sentence vector set corresponding to the word vector set;

A pre-trained intelligent dialogue emotion recognition model is used to encode and decode the topic vector set and sentence vector set to obtain the corresponding emotion expression, thereby completing the emotion recognition of the text dialogue.
The computer-readable storage medium of claim 15, wherein the deduplication operation comprises:

Wherein, n represents the number of text dialogs in the text dialog set, w 1j and w 2j represent any two text dialogs in the text dialog set, and d represents the distance between any two text dialogs.
15. The computer-readable storage medium according to claim 15, wherein said calculating the importance score of the words in the word set comprises:

Calculate the dependency correlation degree of any two words W i and W j in the word set:

Among them, Dep(W i , W j ) represents the degree of dependency relationship between the words W i and W j , len(W i , W j ) represents the length of the dependency path between the words W i and W j, and b is Hyperparameter

Calculate the gravitational forces of the words W i and W j according to the dependency correlation degree:

Among them, f grav (W W , W j ) represents the gravitational force of the words W i and W j , tfidf(W i ) represents the TF-IDF value of the word W i , and tfidf(W j ) represents the TF-IDF value of the word W j IDF value, TF means word frequency, IDF means inverse document frequency index, d is the Euclidean distance between the word vectors of words W i and W j;

According to the dependency correlation degree and the gravity, the correlation strength between the words W i and W j is:

weight(W i ,W j )=Dep(W i ,W j )*f grav (W i ,W j )

The words W i is calculated based on the strength of association importance score:

among them,
Is the set related to the vertex W i , and η is the damping coefficient.
17. The computer-readable storage medium according to any one of claims 15 to 17, wherein the encoding and decoding operations include:

Performing an encoding operation on the topic vector set and the sentence vector set to obtain the feature sequence set of the topic vector set and the sentence vector set;

A dynamic programming algorithm is used to decode the probability set of the state sequence corresponding to the feature sequence set, and the state sequence corresponding to the maximum probability in the probability set of the state sequence is used as the output result of the corresponding emotional expression.
The computer-readable storage medium of claim 18, wherein the dynamic programming algorithm comprises:

V 1,k = p(y 1 |k)·π k

V t,k = p(y t |k)·max x∈S (a x,k ·V t-1,x )

Among them, V 1,k represents the probability of the output state sequence corresponding to k for the first state sequence, p represents the probability value, y 1 represents the output value of the first state sequence, k represents the state sequence, and π k represents the initial state k. Probability dimension; V t,k represents the probability of the state sequence corresponding to the output of the t-th state, y t represents the output value of the t-th state sequence, S represents the state space, and a x,k represents the state sequence from the state x to the state k The transition probability, V t-1,x represents the probability of the state sequence corresponding to the output of the t-1 state as x.
An emotional intelligent recognition device, which includes:

The deduplication word segmentation module is used to obtain a text dialog set, perform a deduplication operation on the text dialog set to obtain a standard text dialog set, and perform a word segmentation operation on the standard text dialog set to obtain a word set;

The calculation conversion module is used to calculate the importance score of the words in the word set, select corresponding words as the subject sequence set of the standard text dialogue set according to the importance score of the words, and in a preset manner, and The subject sequence set is converted into a subject vector set;

A sentence vector prediction module, configured to convert the word set into a word vector set and input it into a pre-trained sentence vector prediction model, and output a sentence vector set corresponding to the word vector set;

The emotion recognition module is used to encode and decode the topic vector set and sentence vector set by using a pre-trained intelligent dialogue emotion recognition model to obtain the corresponding emotion expression, thereby completing the emotion recognition of the text dialogue.