CN112948554B - Real-time multimodal dialogue emotion analysis method based on reinforcement learning and domain knowledge - Google Patents

Real-time multimodal dialogue emotion analysis method based on reinforcement learning and domain knowledge

Info

Publication number
CN112948554B
CN112948554B (application CN202110222049.1A)
Authority
CN
China
Prior art keywords
layer
dialogue
information
emotion
neural network
Prior art date
Legal status
Active
Application number
CN202110222049.1A
Other languages
Chinese (zh)
Other versions
CN112948554A (en)
Inventor
张科
李苑青
王靖宇
苏雨
谭明虎
Current Assignee
Northwestern Polytechnical University
Original Assignee
Northwestern Polytechnical University
Priority date
Filing date
Publication date
Application filed by Northwestern Polytechnical University
Priority to CN202110222049.1A
Publication of CN112948554A
Application granted
Publication of CN112948554B
Legal status: Active

Classifications

    • G06F16/3329 Natural language query formulation or dialogue systems
    • G06F16/3343 Query execution using phonetics
    • G06F16/3344 Query execution using natural language analysis
    • G06F16/35 Clustering; Classification
    • G06N3/045 Combinations of networks
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Mathematical Physics (AREA)
  • Databases & Information Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Acoustics & Sound (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a real-time multimodal dialogue emotion analysis method based on reinforcement learning and domain knowledge, and belongs to the technical field of user emotion tendency analysis. Because real-time multimodal emotion analysis cannot obtain information occurring after the target, a new model and network structure are designed by combining reinforcement learning with a recurrent neural network; the multimodal information in the target and the preceding sampling period is fully extracted, fused, and analyzed, and recognition efficiency and accuracy are further improved by incorporating domain knowledge.

Description

Real-time multimodal dialogue emotion analysis method based on reinforcement learning and domain knowledge
Technical Field
The invention belongs to the technical field of user emotion tendency analysis, and particularly relates to a real-time multimodal dialogue emotion analysis model and method based on reinforcement learning and domain knowledge.
Background
Multimodal user emotion analysis has been a very popular research field in recent years, with broad development potential and application prospects, for example: driver fatigue monitoring in autonomous driving systems, airport security monitoring for dangerous individuals in crowds, companion monitoring of autism patients in the medical field, and companion, alarm, and monitoring services for elderly people living alone and for children in the smart home field. In existing multimodal emotion analysis technology, the modalities used for analysis vary with the research direction; the four main ones are visual signals, acoustic signals, text information, and electroencephalogram (EEG) signals. EEG signals offer the relatively highest accuracy, but they require dedicated signal acquisition sensor equipment, making them difficult to popularize conveniently and widely in daily life. Vision, sound, and text are therefore the most common input modalities in multimodal user emotion analysis research. Prior techniques using these three modalities fall into two categories: the first analyzes sentence by sentence or segment by segment, i.e., without considering context information; the second considers context, judging the user's emotion at a certain point in time on the basis of the entire dialogue content. The former has strong real-time performance but poor accuracy because context is ignored; the latter greatly improves recognition accuracy but lacks real-time capability in practical applications, losing the ability to monitor in real time.
Recurrent neural networks have been a very active research direction in machine learning in recent years. Reinforcement learning, as a paradigm and methodology of machine learning, has increasingly been combined with recurrent neural networks, making algorithm design more flexible and greatly expanding the range of applications. Correspondingly, different application fields come with different domain knowledge, i.e., common-sense constraints and guidance for the problem under study, which can optimize an algorithm's results to a certain extent, for example by filtering out causal relationships that contradict common sense or actual conditions and by increasing the probability that more likely events are selected. By combining reinforcement learning and domain knowledge, recurrent neural networks have made breakthrough progress in image processing, text analysis, speech recognition, and other directions, with the advantages of short training time, few training parameters, and simple design.
Liu Qiyuan and Zhang Dong (Multi-modal sentiment analysis based on context-enhanced LSTM, Computer Science, 2019, 46(11): 181-185) proposed a context-enhanced LSTM method for multimodal emotion analysis that captures both the information within each single modality and the interaction information between modalities. LSTM is a type of recurrent neural network: for the utterances of each modality, they combine contextual features and encode each modality separately with an LSTM, capturing the information within that modality; the independent single-modality information is then fused, and another LSTM extracts the interaction information between the modalities, forming a multimodal feature representation; finally, a max-pooling strategy reduces the dimensionality of the multimodal representation, on which the emotion classifier is constructed. The algorithm achieves good recognition accuracy on public datasets and greatly improves training speed. However, this multimodal emotion analysis model takes all the context information related to the recognition target as input; it is a post-hoc analysis and cannot perform emotion analysis in real time.
Disclosure of Invention
Technical problem to be solved
Existing multimodal emotion analysis models perform post-hoc analysis of the analyzed target: they require not only the information before the target but also the information after it, which does not meet the requirements and practical conditions of real-time multimodal dialogue emotion analysis. To address this inability of the prior art to analyze in real time, the invention provides a real-time multimodal dialogue emotion analysis model and method based on reinforcement learning and domain knowledge.
Technical solution
The reinforcement learning model based on a recurrent neural network for emotion analysis is characterized by comprising 12 layers: the first layer is an input layer, the middle 10 layers are hidden layers (comprising 2 recurrent neural network layers, 2 normalization layers, 1 activation layer, and 5 fully connected layers), and the last layer is an output layer. The input is the three-modality information (image, text, and speech) of the current dialogue sampling segment, and single-modality feature processing is performed first: the image, text, and speech feature processing layers each comprise a normalization layer, a recurrent neural network layer, and a fully connected layer. The three modalities are then fused through a normalization layer, a recurrent neural network layer, an activation layer, and one fully connected layer, and finally three fully connected layers output the results. The network output is the probability of each emotion type for the last sentence of dialogue information in the sampling segment.
The technical scheme of the invention is as follows: the emotion types include 6 types: happiness, excitement, frustration, sadness, anger, and neutral.
The technical scheme of the invention is as follows: the sampling segment comprises 4 consecutive sentences of dialogue.
A real-time multimodal dialogue emotion analysis method based on reinforcement learning and domain knowledge, characterized by comprising the following steps:
step 1: acquiring a multimodal dialogue information database, and generating dialogue emotion domain knowledge from the database;
step 2: constructing the reinforcement learning model based on the recurrent neural network described above, and training the model;
step 3: collecting multimodal dialogue information in real time, sampling sequentially in the order in which the dialogue occurs, analyzing the emotion of the dialogue in real time with the reinforcement learning model based on the recurrent neural network trained in step 2, outputting the probability of each emotion type, and correcting the recognition result with the domain knowledge to obtain the final classification result.
The technical scheme of the invention is as follows: the construction of the real-time multimodal dialogue emotion analysis model in step 2 is specifically as follows:
1) Representing the input multimodal information as:
s(t)=[V(t),T(t),A(t)]
where t is the current sampling time, s(t) is the state information at the current sampling time, V(t) is the image information, T(t) is the text information, and A(t) is the speech information within the current sampling segment;
2) Training the model on the multimodal dialogue information database; the multimodal information at sampling time t is passed through the normalization layer, recurrent neural network layer, activation layer, and fully connected layers to obtain the output layer result, given by:
action(t)=Q(s(t))
where Q is the reinforcement learning algorithm model based on the recurrent neural network, and action(t) is the emotion type recognition result it outputs at the current sampling time; a reward function R is calculated by comparing action(t) with the true emotion type label(t); the loss function of the whole network is then obtained from the difference between the expected value and the estimated value, where the expected value eval is calculated as:
eval=Q(s(t+1))
The estimated value epet is calculated from the reward R and the network output, which yields the loss function loss:
loss=E[epet-eval]
where E is the expectation of epet - eval.
The technical scheme of the invention is as follows: and step 2, training a reinforcement learning model based on the cyclic neural network by adopting a gradient descent and back propagation algorithm.
Advantageous effects
Compared with existing multimodal dialogue emotion analysis models, the model provided by the invention focuses on the real-time performance of dialogue emotion analysis: the dialogue is divided, in order of occurrence, into coherent sampled segments containing the information related to the target; a recurrent neural network processes and fuses the multimodal information; and the recognition results are screened and corrected with reference to the domain knowledge, thereby realizing real-time dialogue emotion analysis.
The novel multimodal emotion analysis model combining reinforcement learning, a recurrent neural network, and domain knowledge can perform emotion analysis in real time during a dialogue; it guarantees real-time performance while still taking into account the multimodal information and domain knowledge related to the target sentence, improving recognition accuracy.
The real-time multimodal dialogue emotion analysis method based on reinforcement learning and domain knowledge addresses the fact that real-time multimodal emotion analysis cannot obtain information occurring after the target: a new model and network structure are designed by combining reinforcement learning with a recurrent neural network; the multimodal information in the target and the preceding sampling period is fully extracted, fused, and analyzed; and recognition efficiency and accuracy are further improved by incorporating domain knowledge.
Drawings
FIG. 1 is a block diagram of the real-time multimodal dialogue emotion analysis model based on reinforcement learning and domain knowledge;
FIG. 2 is a flow chart of the method of the present invention;
FIG. 3 is a graph of the test results of the present invention.
Detailed Description
The invention will now be further described with reference to the examples and the figures.
To realize real-time and rapid multimodal dialogue emotion analysis, the invention provides a novel multimodal emotion analysis model combining reinforcement learning with a recurrent neural network and domain knowledge. A dueling (competition) network structure is adopted as the iterative training algorithm for reinforcement learning, and the recurrent neural network serves as the network model. On the basis of a general public dialogue dataset, statistics are gathered over the 6 basic emotion types (happiness, excitement, frustration, sadness, anger, and neutral), the correlation among the 4 sentences within the sampling length is calculated, and the output of the model is corrected accordingly.
In a multimodal dialogue, every 4 sentences form a sampling segment (sampling length 4), taken in order of occurrence with a stride of 1. The multimodal dialogue information (image, text, and speech) in each sampling segment serves as a state in the reinforcement learning environment. The 4th sentence in the segment is the target of the multimodal emotion analysis, and the first 3 sentences provide the associated reference information for it. This information is the input to the recurrent neural network, which computes the likelihood of the target sentence belonging to each of the 6 candidate emotion types. The probabilities are then normalized and corrected with domain knowledge to produce the final ranking, and the emotion type with the highest probability is taken as the emotion type of the target, that is, as the action selected in the current state, and compared with the true emotion type to obtain the reward. Finally, the action completes the state transition: the next state is the multimodal dialogue information of the next sampling segment in the current dialogue, until the dialogue ends and recognition is complete.
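By way of illustration, the sliding-window sampling just described might look as follows in Python; the dialogue data structure and its field names are assumptions for the sketch, not part of the patent:

```python
# Illustrative sliding-window sampling over one dialogue: every 4
# consecutive utterances form a sampling segment (stride 1); the 4th
# utterance is the analysis target and the first 3 provide context.
WINDOW = 4  # sampling length used by the patent

def sample_segments(dialogue):
    """dialogue: utterances in order of occurrence, each e.g. a dict
    with 'video', 'text', 'audio' features (and 'label' in training)."""
    for end in range(WINDOW, len(dialogue) + 1):
        segment = dialogue[end - WINDOW:end]  # state s(t): 3 context + target
        target = segment[-1]                  # the 4th sentence is analyzed
        yield segment, target
```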
As shown in FIG. 1, the reinforcement learning algorithm based on a recurrent neural network has a 12-layer structure: an input layer, an output layer, and 10 hidden layers in between, comprising 2 recurrent neural network layers, 2 normalization layers, 1 activation layer, and 5 fully connected layers. The network takes the image, text, and speech information of the current dialogue sampling segment as input and first performs single-modality feature processing: the image, text, and speech feature processing paths each comprise a normalization layer, a recurrent neural network layer, and a fully connected layer. The three modalities are then fused through a normalization layer, a recurrent neural network layer, an activation layer, and one fully connected layer, and finally three fully connected layers produce the output. The network output is the probability of each of the 6 emotion types for the last sentence of dialogue in the sampling segment, i.e., a Q table. Finally, the computed probabilities are corrected with the domain knowledge corresponding to the current dialogue to obtain the corrected Q table, and the emotion type with the highest probability is selected as the recognition result.
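A minimal PyTorch sketch of this topology is given below; the choice of GRU cells, the hidden sizes, and the per-modality feature dimensions are assumptions, since the patent does not specify them:

```python
# Minimal sketch of the 12-layer network of FIG. 1 (sizes assumed).
import torch
import torch.nn as nn

class ModalityBranch(nn.Module):
    """Single-modality path: normalization -> recurrent layer -> fully connected."""
    def __init__(self, in_dim, hid):
        super().__init__()
        self.norm = nn.LayerNorm(in_dim)
        self.rnn = nn.GRU(in_dim, hid, batch_first=True)
        self.fc = nn.Linear(hid, hid)

    def forward(self, x):               # x: (batch, 4 utterances, in_dim)
        _, h = self.rnn(self.norm(x))   # h: (1, batch, hid)
        return self.fc(h.squeeze(0))

class EmotionQNet(nn.Module):
    """Fusion: normalization -> recurrent -> activation -> 1 FC, then 3 FC layers."""
    def __init__(self, dims=(128, 128, 128), hid=64, n_emotions=6):
        super().__init__()
        self.branches = nn.ModuleList([ModalityBranch(d, hid) for d in dims])
        self.fuse_norm = nn.LayerNorm(3 * hid)
        self.fuse_rnn = nn.GRU(3 * hid, hid, batch_first=True)
        self.head = nn.Sequential(
            nn.ReLU(), nn.Linear(hid, hid),            # activation + 1 FC
            nn.Linear(hid, hid), nn.Linear(hid, hid),  # 3 output FC layers...
            nn.Linear(hid, n_emotions))                # ...ending in 6 scores

    def forward(self, video, text, audio):
        feats = [b(x) for b, x in zip(self.branches, (video, text, audio))]
        z = self.fuse_norm(torch.cat(feats, dim=-1)).unsqueeze(1)
        _, h = self.fuse_rnn(z)
        return self.head(h.squeeze(0))  # uncorrected Q table over the 6 types
```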
As shown in FIG. 2, the embodiment of the invention provides a real-time multimodal dialogue emotion analysis method based on reinforcement learning and domain knowledge, comprising the following steps:
step one, acquiring a multi-mode dialogue information database and statistical dialogue emotion domain knowledge. The method specifically comprises the following steps: the multi-modal dialogue database with good diversity is constructed, and the multi-modal dialogue database needs to have the characteristics of average sex proportion of the talkers, approximately uniform distribution of talking contents and emotion types and the like. After the database is determined, sampling is sequentially completed by taking a complete dialogue as a unit and taking the occurrence sequence of the dialogue as a unit to form a sample library, and the probability of occurrence of six emotion types under different bases is calculated by taking the sampling length as a unit and taking the corresponding three emotion types of the previous three sentences as the basis to generate domain knowledge K of dialogue emotion analysis.
Step two, construct the reinforcement learning algorithm model based on a recurrent neural network and train it using gradient descent and the backpropagation algorithm, as follows:
(1) Construct the reinforcement learning algorithm model based on a recurrent neural network according to FIG. 1, and initialize all parameters and weights with random numbers. The input multimodal information is represented as:
s(t)=[V(t),T(t),A(t)]
where t is the current sampling time, s(t) is the state information at the current sampling time, V(t) is the image information, T(t) is the text information, and A(t) is the speech information within the current sampling segment.
(2) Train the model on the multimodal dialogue information database; the multimodal information at sampling time t is passed through the normalization layer, recurrent neural network layer, activation layer, and fully connected layers to obtain the output layer result, given by:
action(t)=Q(s(t))
where Q is the reinforcement learning algorithm model based on a recurrent neural network constructed according to FIG. 1, and action(t) is the emotion type recognition result it outputs at the current sampling time; a reward function R is calculated by comparing action(t) with the true emotion type label(t). The loss function of the whole network is then obtained from the difference between the expected value and the estimated value, where the expected value eval is calculated as:
eval=Q(s(t+1))
The estimated value epet is calculated from the reward R and the network output, which yields the loss function loss:
loss=E[epet-eval]
where E is the expectation of epet - eval. Training of the model is completed by backpropagating the loss function loss.
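The formulas for the reward R and the estimated value epet appear only as images in the original filing and are not reproduced in this text. The sketch below therefore assumes a standard DQN-style temporal-difference form, with a ±1 reward for a correct/incorrect prediction and epet = R + γ·max Q(s(t+1)), reusing the EmotionQNet sketched earlier; these are assumptions, not the patent's exact definitions:

```python
# Hypothetical training step under the assumptions stated above.
import torch

def train_step(q_net, optimizer, s_t, s_t1, label_t, gamma=0.9):
    q_values = q_net(*s_t)                  # Q(s(t)) over the 6 emotion types
    action_t = q_values.argmax(dim=-1)      # predicted emotion type action(t)

    # Assumed reward: +1 if the prediction matches label(t), -1 otherwise.
    reward = (action_t == label_t).float() * 2.0 - 1.0

    with torch.no_grad():                   # target side uses the next state
        epet = reward + gamma * q_net(*s_t1).max(dim=-1).values

    eval_ = q_values.gather(1, action_t.unsqueeze(1)).squeeze(1)
    # The patent writes loss = E[epet - eval]; the squared difference is
    # used here so the objective has a well-defined minimum.
    loss = (epet - eval_).pow(2).mean()

    optimizer.zero_grad()
    loss.backward()                         # backpropagation
    optimizer.step()                        # gradient descent update
    return loss.item()
```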
Step three, adopt the dialogues in the dataset that were not used for training as test examples, perform real-time dialogue emotion analysis with the reinforcement learning model based on the recurrent neural network, output the probability of each of the six emotion types, and correct the recognition result with the domain knowledge: the output probability values are added to the corresponding domain knowledge entries to obtain the final classification result. The specific process is as follows:
(1) Taking each dialogue as a unit, sample sequentially in the order in which the dialogue occurs, and recognize with the reinforcement learning model based on the recurrent neural network;
(2) Normalize the recognition result and correct it with the domain knowledge to obtain the final recognition result.
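A minimal sketch of the correction step is given below, reusing the EMOTIONS list and the domain knowledge K from the earlier sketches; using softmax for the normalization is an assumption, since the patent only states that the recognition result is normalized:

```python
# Sketch of the domain-knowledge correction for one sampling segment:
# normalize the Q table, add the matching entry of K, take the argmax.
import torch
import torch.nn.functional as F

def correct_with_domain_knowledge(q_values, context_emotions, K):
    """q_values: tensor of 6 Q-values; context_emotions: labels of the
    3 preceding sentences; K: dict from build_domain_knowledge."""
    probs = F.softmax(q_values, dim=-1)           # normalized recognition result
    prior = torch.tensor([K[tuple(context_emotions)][e] for e in EMOTIONS])
    corrected = probs + prior                     # add output probabilities and K
    return EMOTIONS[corrected.argmax().item()]    # final emotion type
```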
As shown in FIG. 3, the solid black line is the test result of the method of the invention, and the dashed lines are the results of other existing methods. The abscissa is the dialogue length, measured in whole sentences per speaker; the dialogue length grows as the dialogue proceeds, and the maximum length in the tested database is 50, i.e., at most 50 speaker turns. The ordinate is the recognition accuracy, in the range [0, 1]. The figure shows, first, that only the method of the invention can dynamically recognize the user's emotional tendency in real time as the dialogue proceeds, a capability the other methods lack; second, that the accuracy of the method exceeds that of existing methods for dialogue lengths up to 35. Beyond 35, the number of measurable dialogues of that length in the database drops sharply, so the results oscillate, but the average accuracy remains higher than that of existing methods, demonstrating the effectiveness of the method.
While the invention has been described with reference to certain preferred embodiments, it will be understood by those skilled in the art that various changes and substitutions of equivalents may be made without departing from the spirit and scope of the invention.

Claims (1)

1. A real-time multimodal dialogue emotion analysis method based on reinforcement learning and domain knowledge, characterized in that the adopted model comprises 12 layers: the first layer is an input layer, the middle 10 layers are hidden layers (comprising 2 recurrent neural network layers, 2 normalization layers, 1 activation layer, and 5 fully connected layers), and the last layer is an output layer; the three-modality information (image, text, and speech) of the current dialogue sampling segment is input, and single-modality feature processing is performed first; the image, text, and speech feature processing layers each comprise a normalization layer, a recurrent neural network layer, and a fully connected layer; the three modalities are then fused through a normalization layer, a recurrent neural network layer, an activation layer, and one fully connected layer, and finally three fully connected layers output the results; the network output is the probability of each emotion type for the last sentence of dialogue information in the sampling segment; the emotion types include 6 types: happiness, excitement, frustration, sadness, anger, and neutral; the sampling segment comprises 4 consecutive sentences of dialogue; the method comprises the following steps:
step 1: acquiring a multimodal dialogue information database, and generating dialogue emotion domain knowledge from the database;
step 2: building the reinforcement learning model based on a recurrent neural network and training the model;
1) Representing the input multimodal information as:
s(t)=[V(t),T(t),A(t)]
where t is the current sampling time, s(t) is the state information at the current sampling time, V(t) is the image information, T(t) is the text information, and A(t) is the speech information within the current sampling segment;
2) training the model on the multimodal dialogue information database; the multimodal information at sampling time t is passed through the normalization layer, recurrent neural network layer, activation layer, and fully connected layers to obtain the output layer result, given by:
action(t)=Q(s(t))
where Q is the reinforcement learning algorithm model based on the recurrent neural network, and action(t) is the emotion type recognition result it outputs at the current sampling time; a reward function R is calculated by comparing action(t) with the true emotion type label(t); the loss function of the whole network is then obtained from the difference between the expected value and the estimated value, where the expected value eval is calculated as:
eval=Q(s(t+1))
the estimated value epet is calculated from the reward R and the network output, which yields the loss function loss:
loss=E[epet-eval]
where E is the expectation of epet-eval;
training the reinforcement learning model based on the recurrent neural network by gradient descent and the backpropagation algorithm;
step 3: collecting multimodal dialogue information in real time, sampling sequentially in the order in which the dialogue occurs, analyzing the emotion of the dialogue in real time with the reinforcement learning model based on the recurrent neural network trained in step 2, outputting the probability of each emotion type, and correcting the recognition result with the domain knowledge to obtain the final classification result.
CN202110222049.1A 2021-02-28 2021-02-28 Real-time multi-mode dialogue emotion analysis method based on reinforcement learning and domain knowledge Active CN112948554B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110222049.1A CN112948554B (en) 2021-02-28 2021-02-28 Real-time multi-mode dialogue emotion analysis method based on reinforcement learning and domain knowledge


Publications (2)

Publication Number Publication Date
CN112948554A (en) 2021-06-11
CN112948554B 2024-03-08

Family

ID=76246708

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110222049.1A Active CN112948554B (en) 2021-02-28 2021-02-28 Real-time multi-mode dialogue emotion analysis method based on reinforcement learning and domain knowledge

Country Status (1)

Country Link
CN (1) CN112948554B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113592001B (en) * 2021-08-03 2024-02-02 西北工业大学 Multi-mode emotion recognition method based on deep canonical correlation analysis

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106844442A (en) * 2016-12-16 2017-06-13 广东顺德中山大学卡内基梅隆大学国际联合研究院 Multi-modal Recognition with Recurrent Neural Network Image Description Methods based on FCN feature extractions
CN108764268A (en) * 2018-04-02 2018-11-06 华南理工大学 A kind of multi-modal emotion identification method of picture and text based on deep learning
CN108804611A (en) * 2018-05-30 2018-11-13 浙江大学 A kind of dialogue reply generation method and system based on self comment Sequence Learning
CN110610138A (en) * 2019-08-22 2019-12-24 西安理工大学 Facial emotion analysis method based on convolutional neural network
WO2020173133A1 (en) * 2019-02-27 2020-09-03 平安科技(深圳)有限公司 Training method of emotion recognition model, emotion recognition method, device, apparatus, and storage medium
CN112163169A (en) * 2020-09-29 2021-01-01 海南大学 Multi-mode user emotion analysis method based on knowledge graph

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2019011824A1 (en) * 2017-07-11 2019-01-17 Koninklijke Philips N.V. Multi-modal dialogue agent


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
Thoma, M. et al. Learning Multimodal Transition Dynamics for Model-Based Reinforcement Learning. arXiv, 2017. *
He Jun, Zhang Caiqing, Li Xiaozhen, Zhang Dehai. A survey of multimodal fusion technology for deep learning. Computer Engineering, 2020(05). *
Liu Qiyuan et al. Multi-modal sentiment analysis based on context-enhanced LSTM. Computer Science, 2019, 46(11). *
Liu Jingjing, Wu Xiaofeng. Multimodal emotion recognition and spatial annotation based on long short-term memory networks. Journal of Fudan University (Natural Science), 2020(05). *
Lin Minhong et al. Multimodal sentiment analysis based on attention neural networks. Computer Science, 2020(S2). *

Also Published As

Publication number Publication date
CN112948554A (en) 2021-06-11

Similar Documents

Publication Publication Date Title
CN112348075B (en) Multi-mode emotion recognition method based on contextual attention neural network
Chang et al. Learning representations of emotional speech with deep convolutional generative adversarial networks
CN110136749A (en) The relevant end-to-end speech end-point detecting method of speaker and device
CN110956953B (en) Quarrel recognition method based on audio analysis and deep learning
Senthilkumar et al. Speech emotion recognition based on Bi-directional LSTM architecture and deep belief networks
Sahu et al. Multi-Modal Learning for Speech Emotion Recognition: An Analysis and Comparison of ASR Outputs with Ground Truth Transcription.
CN111798874A (en) Voice emotion recognition method and system
CN112329438B (en) Automatic lie detection method and system based on domain countermeasure training
Zhou et al. ICRC-HIT: A deep learning based comment sequence labeling system for answer selection challenge
CN111401105B (en) Video expression recognition method, device and equipment
Parthasarathy et al. Predicting speaker recognition reliability by considering emotional content
Lin et al. DeepEmoCluster: A semi-supervised framework for latent cluster representation of speech emotions
CN114898779A (en) Multi-mode fused speech emotion recognition method and system
CN112948554B (en) Real-time multi-mode dialogue emotion analysis method based on reinforcement learning and domain knowledge
Mu et al. Speech emotion recognition using convolutional-recurrent neural networks with attention model
Wu et al. The DKU-LENOVO Systems for the INTERSPEECH 2019 Computational Paralinguistic Challenge.
Atkar et al. Speech Emotion Recognition using Dialogue Emotion Decoder and CNN Classifier
CN113571095B (en) Speech emotion recognition method and system based on nested deep neural network
Whitehill et al. Whosecough: In-the-wild cougher verification using multitask learning
CN110348482A (en) A kind of speech emotion recognition system based on depth model integrated architecture
Parab et al. Stress and emotion analysis using IoT and deep learning
Maji et al. Multimodal emotion recognition based on deep temporal features using cross-modal transformer and self-attention
CN113707175A (en) Acoustic event detection system based on feature decomposition classifier and self-adaptive post-processing
CN113128284A (en) Multi-mode emotion recognition method and device
Febriansyah et al. SER: speech emotion recognition application based on extreme learning machine

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant