CN109215678B - Construction method of deep emotion interaction model based on emotion dimensionality - Google Patents
- Publication number: CN109215678B
- Application: CN201810867950.2A
- Authority
- CN
- China
- Prior art keywords: emotion, network, deep, RBM, emotional
- Prior art date
- Legal status: Active (the legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications (G — PHYSICS; G10 — MUSICAL INSTRUMENTS; ACOUSTICS; G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING)
- G10L25/03 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/30 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
- G10L25/63 — Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
Abstract
The invention belongs to the fields of emotion recognition and pattern recognition, and relates in particular to a method for constructing a deep emotion interaction model based on emotion dimensionality. It addresses the problem that traditional emotion models can only describe the probability that an emotional state occurs while ignoring the correlations between emotions. The method comprises the following steps: S100, extracting emotional features from an existing laboratory emotion EEG database; S200, constructing an optimal-depth RBM network; S300, characterizing the degree of association between emotional states by the spatial distance of their dimensional quantization values to obtain the weights between emotional states; S400, constructing an associated cognitive network; S500, constructing the deep emotion interaction model by combining the optimal-depth RBM network with the associated cognitive network, achieving accurate discrimination of continuous emotion.
Description
Technical Field
The invention belongs to the fields of emotion recognition, pattern recognition and the like, and particularly relates to a construction method of a deep emotion interaction model based on emotion dimensionality.
Background
Emotion is a comprehensive state that arises according to whether objective things satisfy a person's needs, and different emotional states influence learning, memory and decision-making. In recent years, with the development of artificial intelligence theory, emotion modeling has become a research hotspot that receives wide attention from researchers at home and abroad.
An emotion model simulates the human process of emotion processing. Research on emotion models has produced many representative results; commonly used shallow emotion classification models include the support vector machine (SVM), artificial neural network (ANN), hidden Markov model (HMM) and Gaussian mixture model (GMM). With the continuing growth of data volume, academia and industry have shown great enthusiasm for deep neural networks (DNNs), which have been applied to emotion models.
Existing emotion models describe only the probability that an emotional state occurs, or its spontaneous transition process, and neglect an essential characteristic of emotion: emotional states are systematically interrelated. They therefore cannot describe the fluctuation and transition of emotional states, and when such a model fits real emotion data poorly, emotion recognition accuracy drops markedly.
Disclosure of Invention
To solve the problem that traditional emotion models can only describe the probability that an emotional state occurs while neglecting the mutual correlation of emotions, the invention provides a method for constructing a deep emotion interaction model based on emotion dimensionality.
The invention adopts the following technical scheme: a method for constructing a deep emotion interaction model based on emotion dimensionality comprises the following steps:
s100, extracting emotional characteristics aiming at the existing emotional electroencephalogram database in a laboratory;
s200, constructing an optimal depth RBM network, and taking the emotional characteristics as the input of the depth RBM network to obtain the weight between the bottom layer emotional characteristics and the emotional state;
s300, representing the association degree between the emotional states through the spatial distance of the dimension quantization value to obtain the weight between the emotional states;
s400, constructing a correlation cognitive network, wherein a weight between the bottom layer emotional characteristics and the emotional states of the deep RBM network is used as a weight between input and output in the correlation cognitive network, and the weight between the emotional states is used as a weight between output and output of the correlation cognitive network;
s500, a deep emotion interaction model is constructed through combination of the optimal deep RBM network and the associated cognitive network, and accurate judgment of continuous emotion is achieved.
The specific process of S100 is as follows:
the emotion EEG database is used, and the electrodes at positions FC1, FC2, FC3, FC4, C1, C2, C3, C4, CP1, CP2, CP3 and CP4 (12 leads in total) are selected for analysis. Traditional features characterizing EEG signal energy (energy, power spectrum and power-spectrum entropy), nonlinear attribute features characterizing the EEG signal's nonlinear properties (approximate entropy, Hurst exponent and Lyapunov exponent) and nonlinear geometric features describing the geometric structure of the EEG signal (trajectory-based descriptor profiles) are extracted.
Nonlinear attribute features: in view of the nonlinear character of EEG signals, the approximate entropy, Hurst exponent and Lyapunov exponent are extracted under phase-space reconstruction. Approximate entropy is an effective measure of the rate at which new information appears in a time series: the higher the probability of generating a new pattern, the more complex the sequence. Approximate entropy thus expresses the irregularity of the information, and more complex signals have a higher approximate entropy.
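As a sketch of the complexity measure just described, the following hedged implementation computes approximate entropy ApEn(m, r); the function name and the defaults (m = 2, tolerance r = 0.2·σ) are common conventions assumed here, not values specified in the patent:

```python
import numpy as np

def approximate_entropy(x, m=2, r=0.2):
    """Approximate entropy ApEn(m, r) of a 1-D signal.

    Higher values mean a more irregular (complex) sequence, matching the
    text's use of ApEn: complex signals have higher approximate entropy.
    """
    x = np.asarray(x, dtype=float)
    r = r * np.std(x)  # tolerance scaled to the signal's spread (assumed convention)
    N = len(x)

    def phi(m):
        # all length-m template vectors
        emb = np.array([x[i:i + m] for i in range(N - m + 1)])
        # Chebyshev distance between every pair of templates
        dist = np.max(np.abs(emb[:, None, :] - emb[None, :, :]), axis=2)
        C = np.mean(dist <= r, axis=1)  # fraction of templates within r (self-match included)
        return np.mean(np.log(C))

    return phi(m) - phi(m + 1)
```

A random signal should score higher than a smooth periodic one, which is a quick sanity check on the implementation.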
Maximum Lyapunov exponent:
The Lyapunov exponent reflects the average rate of local convergence or divergence of neighbouring trajectories in phase space, and the maximum Lyapunov exponent (LLE) λ1 indicates how fast orbits converge or diverge: when λ1 > 0, a larger λ1 means a greater rate of trajectory divergence and a greater degree of chaos. The maximum Lyapunov exponent is obtained here with the Wolf method: take an initial point Xi in phase space and find its nearest neighbour Xi′ at distance L0; after time n, track the distance Li between the two points, and if it satisfies the preset threshold ε, retain the point and begin tracking the next instant. After M tracking-and-substitution iterations, the maximum Lyapunov exponent is obtained as shown in formula (1). Compared with other algorithms, the Wolf method is fast to compute and robust to the embedding dimension m, the delay time τ, and noise.
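The Wolf-style tracking procedure above can be sketched as follows. This is a minimal illustration under stated assumptions (fixed evolution time, no angular replacement criterion), not the patent's exact algorithm; the function name and defaults are hypothetical:

```python
import numpy as np

def wolf_lle(x, m=3, tau=4, evolve=5):
    """Largest Lyapunov exponent of a scalar series, simplified Wolf style:
    follow the nearest neighbour of the current point for `evolve` steps,
    accumulate the log stretching factor, then pick a fresh neighbour.
    m and tau default to the embedding parameters used in the patent."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (m - 1) * tau
    emb = np.column_stack([x[(m - 1 - k) * tau:(m - 1 - k) * tau + n]
                           for k in range(m)])
    total, count = 0.0, 0
    i = 0
    while i + evolve < n:
        d = np.linalg.norm(emb - emb[i], axis=1)
        lo, hi = max(0, i - evolve), min(n, i + evolve + 1)
        d[lo:hi] = np.inf                      # exclude temporally close points
        j = int(np.argmin(d))
        if np.isfinite(d[j]) and d[j] > 0 and j + evolve < n:
            d1 = np.linalg.norm(emb[i + evolve] - emb[j + evolve])
            if d1 > 0:
                total += np.log(d1 / d[j])     # log of the stretching factor
                count += 1
        i += evolve                            # renormalise: fresh neighbour next step
    return total / (count * evolve) if count else 0.0
```

A chaotic series (e.g. the logistic map at r = 4) should yield a positive estimate, consistent with λ1 > 0 indicating chaos.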
Hurst exponent:
The Hurst exponent (H) measures the long-term memory of a time series; H ranges from 0 to 1, and H > 0.5 indicates long-range autocorrelation, i.e. a stronger dependence between earlier and later parts of the series. H is computed by rescaled-range (R/S) analysis, a non-parametric statistical method that is unaffected by the distribution of the time series. The one-dimensional emotional speech signal [x(1), x(2), …, x(N)] is divided into M adjacent sub-sequences u of equal length. For each sub-sequence, the cumulative dispersion z_u and the standard deviation S_u are computed, giving the rescaled range R_u/S_u, where R_u = max z_u − min z_u. The Hurst exponent is then obtained from formula (2):
R_M / S_M = b · M^H (2)
Taking the logarithm of both sides yields H, the Hurst exponent, where b is a constant. When the emotional states of the signals differ, H follows different patterns of change; extracting the Hurst-exponent feature of emotional speech therefore reflects the contextual relevance of emotional change.
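The rescaled-range procedure just described can be sketched as follows, a minimal illustration assuming power-of-two window sizes (an implementation choice, not from the patent):

```python
import numpy as np

def hurst_rs(x, min_chunk=8):
    """Hurst exponent by rescaled-range (R/S) analysis: split the series
    into equal sub-series, compute the range of cumulative deviations R_u
    over the standard deviation S_u, then fit log(R/S) against the log of
    the window length. H ~ 0.5 for white noise, H > 0.5 for persistent
    (long-memory) series."""
    x = np.asarray(x, dtype=float)
    N = len(x)
    sizes, rs_vals = [], []
    size = min_chunk
    while size <= N // 2:
        rs = []
        for start in range(0, N - size + 1, size):
            chunk = x[start:start + size]
            z = np.cumsum(chunk - chunk.mean())   # cumulative dispersion z_u
            R = z.max() - z.min()                 # R_u = max z_u - min z_u
            S = chunk.std()
            if S > 0:
                rs.append(R / S)
        if rs:
            sizes.append(size)
            rs_vals.append(np.mean(rs))
        size *= 2
    # slope of log(R/S) vs log(window size) estimates H
    H, _ = np.polyfit(np.log(sizes), np.log(rs_vals), 1)
    return float(H)
```

A random walk (strong long-term dependence) should score well above white noise, matching the interpretation of H > 0.5 in the text.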
Non-linear geometric features: the one-dimensional EEG signal is mapped to a high-dimensional space by phase-space reconstruction and analysed there, and geometric features of the reconstruction under different emotional states are extracted: three trajectory-based descriptor profiles.
In the phase-space reconstruction, the embedding dimension m = 3 and the time delay τ = 4 are selected. When the two samples x(t−4) and x(t−8) that lag the original waveform x(t) differ only slightly, the identity of formula (3) holds:
x(t) = x(t−4) = x(t−8) (3)
This identity is defined as the mark line, and the differences among attractors are described by analysing the distances from the attractors to the mark line.
The attractor is defined accordingly in two-dimensional space and in three-dimensional space.
A third profile: the total length of the attractor continuous track is denoted as S:
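The delay embedding with m = 3 and τ = 4, the distance to the mark line x(t) = x(t−4) = x(t−8), and the total trajectory length S can be sketched as follows; this is a minimal illustration, and the patent's exact two- and three-dimensional attractor formulas are not reproduced here:

```python
import numpy as np

def delay_embed(x, m=3, tau=4):
    """Delay embedding with the patent's parameters m=3, tau=4:
    each attractor point is (x(t), x(t-tau), x(t-2*tau))."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (m - 1) * tau
    return np.column_stack([x[(m - 1 - k) * tau:(m - 1 - k) * tau + n]
                            for k in range(m)])

def distance_to_mark_line(points):
    """Distance of each attractor point to the line x = y = z, i.e. the
    'mark line' where x(t) = x(t-4) = x(t-8)."""
    u = np.ones(points.shape[1]) / np.sqrt(points.shape[1])  # unit vector along the line
    proj = points @ u
    return np.linalg.norm(points - np.outer(proj, u), axis=1)

def trajectory_length(points):
    """Total length S of the continuous attractor trajectory (third profile)."""
    return float(np.sum(np.linalg.norm(np.diff(points, axis=0), axis=1)))
```

As a sanity check, a constant signal embeds exactly onto the mark line (all distances zero) and its trajectory has zero length.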
S200 comprises the following steps.
S201, the extracted emotional EEG features are used as input data, and the RBM network parameters are computed with the contrastive divergence (CD) algorithm, as shown in formula (7):
(w, a, b) = S_CD(x0, m, η, T) (7)
where S_CD denotes the CD algorithm, x0 is a sample of the training data, m is the number of hidden-layer neurons, η is the learning rate, and T is the maximum number of training iterations; the connection weights between the visible layer and the hidden layer are denoted W, and a and b are the biases of the visible and hidden units, respectively. The resulting hidden-layer output is given by formula (8), with f_RBM the activation function:
h = f_RBM(v | W, b) (8)
S202, several RBM layers are stacked: the output of the upper RBM serves as the input of the lower RBM, neurons representing the categories are added to each RBM, and the number of output nodes of the bottom layer equals the number of emotion categories, yielding the network parameters of the deep RBM.
S203, steps S201 and S202 are repeated to find the number of hidden nodes that retains the largest amount of information of the original data, yielding the optimal deep RBM network.
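The steps above can be sketched as a single RBM trained with CD-1 and its hidden-layer output per formulas (7) and (8). This is a hedged sketch of standard CD-1 training, not the patent's exact S_CD routine; function names and defaults are assumptions:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_rbm_cd1(x0, m, eta=0.1, T=100, seed=0):
    """(w, a, b) = S_CD(x0, m, eta, T), formula (7): x0 is the training
    data, m the number of hidden neurons, eta the learning rate, T the
    number of training iterations. Returns the connection weights W and
    the visible/hidden biases a, b."""
    rng = np.random.default_rng(seed)
    n_vis = x0.shape[1]
    W = rng.normal(0.0, 0.01, size=(n_vis, m))
    a = np.zeros(n_vis)  # visible bias
    b = np.zeros(m)      # hidden bias
    for _ in range(T):
        # positive phase
        h_prob = sigmoid(x0 @ W + b)
        h = (rng.random(h_prob.shape) < h_prob).astype(float)
        # negative phase: one Gibbs step (CD-1)
        v_prob = sigmoid(h @ W.T + a)
        h_neg = sigmoid(v_prob @ W + b)
        # parameter updates
        W += eta * (x0.T @ h_prob - v_prob.T @ h_neg) / len(x0)
        a += eta * np.mean(x0 - v_prob, axis=0)
        b += eta * np.mean(h_prob - h_neg, axis=0)
    return W, a, b

def hidden_output(v, W, b):
    """h = f_RBM(v | W, b), formula (8): hidden-layer activation."""
    return sigmoid(v @ W + b)
```

Stacking layers as in S202 amounts to calling `train_rbm_cd1` on the previous layer's `hidden_output`.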
The specific process of S300 is as follows:
based on the PAD three-dimensional emotion model and the Chinese version of the PAD emotion scale, the PAD values of the basic emotions are evaluated, and a three-dimensional emotion space with P, A and D as coordinate axes is established. The relationships between classes are mapped by spatial distance, which finally determines the weights between emotions. The spatial distance between two emotions in the three-dimensional PAD model is calculated by formula (9):
d12 = √((x1 − x2)² + (y1 − y2)² + (z1 − z2)²) (9)
where d12 is the spatial distance between point 1 and point 2, and (x1, y1, z1) and (x2, y2, z2) are the coordinates of points 1 and 2 in the three-dimensional PAD emotion space. The relationship between any two emotions is obtained as the inverse of their spatial distance.
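Formula (9) and the inverse-distance weighting can be sketched directly; the PAD coordinates in the usage check below are illustrative placeholders, not values from the Chinese PAD scale:

```python
import numpy as np

def pad_distance(p1, p2):
    """Euclidean distance d12 between two emotions at PAD coordinates
    (x1, y1, z1) and (x2, y2, z2) — formula (9)."""
    return float(np.linalg.norm(np.asarray(p1, float) - np.asarray(p2, float)))

def emotion_weights(pad_points):
    """Inter-emotion weight matrix: the weight between two emotional
    states is the inverse of their PAD-space distance, as the text
    specifies. Diagonal entries (self-weights) are left at zero."""
    n = len(pad_points)
    W = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                W[i, j] = 1.0 / pad_distance(pad_points[i], pad_points[j])
    return W
```

Closer emotions in PAD space thus receive larger mutual weights, which is exactly the association-degree mapping the text describes.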
The specific process of S400 is as follows:
the bottom-layer recognition network of the deep RBM network is defined as the associated cognitive network; that is, the parameters of the bottom-layer RBM obtained by the contrastive divergence algorithm are used as the weights between input and output of the associated cognitive network, and the weights between emotional states, characterized by the spatial distance of the dimensional quantization values, are used as the weights between the outputs of the associated cognitive network.
Suppose F_i (i = 1, 2, …, n) denotes an emotional speech feature and C_j (j = 1, 2, …, m) denotes an emotion category. The weight matrix formed by the relationships between features and emotion categories is denoted W_i, the input threshold is denoted b, and the weight matrix formed by the relationships between classes is denoted W_o; the weight matrix of the system can then be reduced to an (n + m) × m matrix.
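The (n + m) × m structure can be made concrete by stacking the two matrices; the helper name below is an assumption for illustration:

```python
import numpy as np

def icn_weight_matrix(W_i, W_o):
    """Stack the input weight matrix W_i (n features x m emotion classes)
    on top of the inter-class matrix W_o (m x m) to form the single
    (n + m) x m weight matrix of the associated cognitive network."""
    W_i, W_o = np.asarray(W_i), np.asarray(W_o)
    assert W_i.shape[1] == W_o.shape[0] == W_o.shape[1]
    return np.vstack([W_i, W_o])
```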
During training of the ICN (associated cognitive network), the change of the node state values can be expressed by formula (11), where C_0 denotes the objective function.
The specific step of S500 is to combine the optimal-depth RBM network obtained in S200 with the associated cognitive network of S400; that is, the weights between input and output of the associated cognitive network are obtained through the optimal-depth RBM network, finally yielding the deep emotion interaction model and achieving accurate emotion discrimination.
Compared with the prior art, the invention extracts emotional features from the laboratory's existing emotion EEG database; takes the emotional features as the input of a deep RBM network and obtains, through training, the weights between the input and output of the underlying associated cognitive network; evaluates the PAD values of the basic emotions based on the PAD three-dimensional emotion model and the Chinese version of the PAD emotion scale, constructs a three-dimensional emotion space with P, A and D as coordinate axes, maps the relationships between classes by spatial distance, and finally determines the weights between emotions, used as the weights between outputs of the associated cognitive network; and constructs an effective emotion model, the deep emotion model, by combining the deep RBM network with the associated cognitive network, achieving accurate emotion discrimination.
Drawings
FIG. 1 is a 64 lead electrode signal profile and selected electrode profiles of the present invention;
FIG. 2 is a diagram of a cognitive network architecture;
FIG. 3 is a diagram of a deep emotion interaction model.
Detailed Description
The method for constructing the deep emotion interaction model based on the emotion dimensions is described in detail below with reference to the attached drawings.
The invention provides a method for constructing a deep emotion interaction model based on emotion dimensionality: an emotion interaction model is built from the mutual connections between emotions and, combined with deep learning theory, extended into a deep emotion interaction model so as to achieve accurate discrimination of emotion. The specific implementation is as follows.
Implementing a method for constructing a deep emotion interaction model based on emotion dimensionality, which comprises the following steps:
s100, extracting emotional characteristics aiming at an existing emotional electroencephalogram database in a laboratory;
s200, constructing an optimal depth RBM network, taking the emotional characteristics as the input of the depth RBM network, and obtaining the weight between the bottom layer emotional characteristics and the emotional state;
S300, representing the association degree between the emotional states through the spatial distance of the dimension quantization value to obtain a weight value between the emotional states;
s400, constructing a correlation cognitive network, taking a weight between bottom layer emotional characteristics and emotional states of the RBM as a weight between input and output in the correlation cognitive network, and taking the weight between the emotional states as a weight between output and output of the correlation cognitive network;
s500, a deep emotion interaction model is constructed through combination of the optimal deep RBM network and the associated cognitive network, and accurate judgment of continuous emotion is achieved.
In the above scheme, the specific process of extracting emotional features in S100 is as follows: the database is the laboratory's existing emotion EEG database, and the electrodes corresponding to the auditory functional area are mainly examined; as shown in Fig. 1, the electrodes at positions FC1, FC2, FC3, FC4, C1, C2, C3, C4, CP1, CP2, CP3 and CP4 (12 leads in total) are selected for analysis. Traditional features characterizing EEG signal energy (energy, power spectrum and power-spectrum entropy), nonlinear attribute features characterizing the EEG signal's nonlinear properties (approximate entropy, Hurst exponent and Lyapunov exponent) and nonlinear geometric features describing the geometric structure of the EEG signal (trajectory-based descriptor profiles) are extracted.
Nonlinear attribute features: in view of the nonlinear character of EEG signals, the approximate entropy, Hurst exponent and Lyapunov exponent are extracted under phase-space reconstruction. Approximate entropy is an effective measure of the rate at which new information appears in a time series: the higher the probability of generating a new pattern, the more complex the sequence. Approximate entropy thus expresses the irregularity of the information, and more complex signals have a higher approximate entropy.
Maximum Lyapunov exponent:
The Lyapunov exponent reflects the average rate of local convergence or divergence of neighbouring trajectories in phase space, and the maximum Lyapunov exponent (LLE) λ1 indicates how fast orbits converge or diverge: when λ1 > 0, a larger λ1 means a greater rate of trajectory divergence and a greater degree of chaos. The maximum Lyapunov exponent is obtained here with the Wolf method: take an initial point Xi in phase space and find its nearest neighbour Xi′ at distance L0; after time n, track the distance Li between the two points, and if it satisfies the preset threshold ε, retain the point and begin tracking the next instant. After M tracking-and-substitution iterations, the maximum Lyapunov exponent is obtained as shown in formula (1). Compared with other algorithms, the Wolf method is fast to compute and robust to the embedding dimension m, the delay time τ, and noise.
Hurst exponent:
The Hurst exponent (H) measures the long-term memory of a time series; H ranges from 0 to 1, and H > 0.5 indicates long-range autocorrelation, i.e. a stronger dependence between earlier and later parts of the series. H is computed by rescaled-range (R/S) analysis, a non-parametric statistical method that is unaffected by the distribution of the time series. The one-dimensional emotional speech signal [x(1), x(2), …, x(N)] is divided into M adjacent sub-sequences u of equal length. For each sub-sequence, the cumulative dispersion z_u and the standard deviation S_u are computed, giving the rescaled range R_u/S_u, where R_u = max z_u − min z_u. The Hurst exponent is then obtained from formula (2):
R_M / S_M = b · M^H (2)
Taking the logarithm of both sides yields H, the Hurst exponent, where b is a constant. When the emotional states of the signals differ, H follows different patterns of change; extracting the Hurst-exponent feature of emotional speech therefore reflects the contextual relevance of emotional change.
Non-linear geometric features: the one-dimensional EEG signal is mapped to a high-dimensional space by phase-space reconstruction and analysed there, and geometric features of the reconstruction under different emotional states are extracted: three trajectory-based descriptor profiles.
In the project's phase-space reconstruction, the embedding dimension m = 3 and the time delay τ = 4 are selected. When the two samples x(t−4) and x(t−8) that lag the original waveform x(t) differ only slightly, the identity of formula (3) holds:
x(t) = x(t−4) = x(t−8) (3)
This identity is defined as the mark line, and the differences among attractors are described by analysing the distances from the attractors to the mark line.
A third profile: the total length of the attractor continuous track is denoted as S:
In this scheme, S200 constructs the optimal-depth RBM network, takes the emotional features as the input of the deep RBM network, and obtains the weights between the underlying emotional features and the emotional states. The specific process is as follows:
S201, the extracted emotional EEG features are used as input data, and the RBM network parameters are computed with the contrastive divergence (CD) algorithm, as shown in formula (7):
(w, a, b) = S_CD(x0, m, η, T) (7)
where S_CD denotes the CD algorithm, x0 is a sample of the training data, m is the number of hidden-layer neurons, η is the learning rate, and T is the maximum number of training iterations; the connection weights between the visible layer and the hidden layer are denoted W, and a and b are the biases of the visible and hidden units, respectively. The resulting hidden-layer output is given by formula (8), with f_RBM the activation function:
h = f_RBM(v | W, b) (8)
S202, several RBM layers are stacked: the output of the upper RBM serves as the input of the lower RBM, neurons representing the categories are added to each RBM, and the number of output nodes of the bottom layer equals the number of emotion categories, yielding the network parameters of the deep RBM.
S203, steps S201 and S202 are repeated to find the number of hidden nodes that retains the largest amount of information of the original data, yielding the optimal deep RBM network.
In the above scheme, S300 characterizes the degree of association between emotional states by the spatial distance of their dimensional quantization values and obtains the weights between emotional states. The specific process is as follows: based on the PAD three-dimensional emotion model and the Chinese version of the PAD emotion scale, the PAD values of the basic emotions are evaluated, and a three-dimensional emotion space with P, A and D as coordinate axes is established. The relationships between classes are mapped by spatial distance, which finally determines the weights between emotions. The spatial distance between two emotions in the three-dimensional PAD model is calculated by formula (9):
d12 = √((x1 − x2)² + (y1 − y2)² + (z1 − z2)²) (9)
where d12 is the spatial distance between point 1 and point 2, and (x1, y1, z1) and (x2, y2, z2) are the coordinates of points 1 and 2 in the three-dimensional PAD emotion space. The relationship between any two emotions is obtained as the inverse of their spatial distance.
In the above scheme, S400 constructs the associated cognitive network, taking the weights between the underlying emotional features and the emotional states of the optimal-depth RBM network as the input-output weights of the associated cognitive network, and the weights between emotional states as its output-output weights. The specific process is as follows:
as shown in Fig. 2, the bottom-layer recognition network of the deep RBM network is defined as the associated cognitive network; that is, the parameters of the bottom-layer RBM obtained by the contrastive divergence algorithm serve as the weights between input and output of the associated cognitive network, and the weights between emotional states, characterized by the spatial distance of the dimensional quantization values, serve as the weights between its outputs.
Suppose F_i (i = 1, 2, …, n) denotes an emotional speech feature and C_j (j = 1, 2, …, m) denotes an emotion category. The weight matrix formed by the relationships between features and emotion categories is denoted W_i (the input weight matrix), the input threshold is denoted b, and the weight matrix formed by the relationships between classes is denoted W_o (the output weight matrix). The weight matrix of the system can then be reduced to an (n + m) × m matrix.
During training of the ICN, the change of the node state values can be expressed by formula (11), where C_0 denotes the objective function.
In this scheme, S500 constructs the deep emotion interaction model by combining the optimal-depth RBM network with the associated cognitive network, achieving accurate discrimination of continuous emotion. The specific process is as follows: as shown in Fig. 3, the optimal-depth RBM network obtained in S200 is combined with the associated cognitive network of S400; that is, the input-output weights of the associated cognitive network are obtained through the optimal-depth RBM network, finally yielding the deep emotion interaction model and achieving accurate emotion discrimination.
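The combined model can be sketched as a forward pass through the stacked RBM layers followed by an iterated class-interaction step. Since the patent's node-update formula (11) is not reproduced in the text, the recurrent sigmoid update below is an assumed placeholder, and all names are hypothetical:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def deep_emotion_interaction(features, rbm_layers, W_i, W_o, b, steps=10):
    """Sketch of the combined model: propagate the feature vector through
    the stacked RBM layers (each a (W, bias) pair), producing the input of
    the associated cognitive network; then iterate the class activations
    so the inter-class weights W_o let related emotions reinforce or
    inhibit each other."""
    h = np.asarray(features, dtype=float)
    for W, bias in rbm_layers:
        h = sigmoid(h @ W + bias)           # deep RBM forward pass
    c = sigmoid(h @ W_i + b)                # initial class activations
    for _ in range(steps):
        c = sigmoid(h @ W_i + c @ W_o + b)  # class interaction via W_o (assumed update)
    return c
```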
According to the method for constructing a deep emotion interaction model based on emotion dimensionality, emotional features are extracted from an existing laboratory emotional electroencephalogram database and a multilayer RBM network is constructed, with the emotional features as the input of the multilayer RBM. The parameters of the RBM network are obtained by CD-algorithm training, and the bottom-layer RBM network parameters are used as the weights between input and output of the associated cognitive network. The degree of association between emotional states is then obtained from the P, A, D emotion scale and used as the weights between outputs of the associated cognitive network. The deep emotion interaction model is constructed by combining the multilayer RBM network with the associated cognitive network, so that emotion discrimination can be realized accurately; the method has important theoretical significance and application value in the fields of affective computing and artificial intelligence.
Claims (7)
1. A method for constructing a deep emotion interaction model based on emotion dimensionality, characterized by comprising the following steps:
S100, extracting emotional features from the existing laboratory emotional electroencephalogram database;
S200, constructing an optimal-depth RBM network, and taking the emotional features as the input of the deep RBM network to obtain the weights between the bottom-layer emotional features and the emotional states;
S300, characterizing the degree of association between emotional states by the spatial distances of the dimension quantization values to obtain the weights between emotional states;
S400, constructing an associated cognitive network, wherein the weights between the bottom-layer emotional features and the emotional states of the deep RBM network are taken as the weights between input and output in the associated cognitive network, and the weights between emotional states are taken as the weights between outputs of the associated cognitive network;
S500, constructing a deep emotion interaction model by combining the optimal-depth RBM network with the associated cognitive network, realizing accurate discrimination of continuous emotion.
2. The method for constructing the deep emotion interaction model based on emotion dimensionality as claimed in claim 1, wherein the specific process of S100 is as follows: the database adopts an emotional electroencephalogram database; electrodes at positions FC1, FC2, FC3, FC4, C1, C2, C3, C4, CP1, CP2, CP3 and CP4, i.e. electroencephalogram data of 12 leads in total, are selected for analysis; the extracted features are the traditional features of energy, power spectrum and power-spectrum entropy, which characterize the energy of the electroencephalogram signal, the approximate entropy, Hurst exponent and Lyapunov exponent, which characterize its nonlinear attributes, and the nonlinear geometric features describing its geometric structure, namely the trajectory-based descriptor profiles of the electroencephalogram signal.
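A minimal sketch of the energy and power-spectrum-entropy features for a single lead (the nonlinear features named in the claim, i.e. approximate entropy and the Hurst and Lyapunov exponents, would need separate estimators; the signal below is synthetic, not EEG data):

```python
import numpy as np

def spectral_features(sig):
    """Energy, power spectrum, and power-spectrum entropy of one signal."""
    energy = float(np.sum(sig ** 2))
    psd = np.abs(np.fft.rfft(sig)) ** 2                # power spectrum
    p = psd / psd.sum()                                # normalized distribution
    entropy = float(-np.sum(p * np.log2(p + 1e-12)))   # power-spectrum entropy
    return energy, psd, entropy

# Synthetic stand-in for one of the 12 EEG leads: two sinusoidal components.
t = np.linspace(0, 1, 256, endpoint=False)
sig = np.sin(2 * np.pi * 10 * t) + 0.1 * np.sin(2 * np.pi * 25 * t)
energy, psd, entropy = spectral_features(sig)
```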
3. The method for constructing the deep emotion interaction model based on emotion dimensionality as claimed in claim 2, wherein the nonlinear geometric features map the one-dimensional electroencephalogram signal to a high-dimensional space through phase-space reconstruction, the electroencephalogram signal is analyzed in the high-dimensional space, and the geometric features of the phase-space reconstruction under different emotional states are extracted: three trajectory-based descriptor profiles;
in the phase-space reconstruction, an embedding dimension m = 3 and a time delay τ = 4 are selected; when the two samples x(t-4) and x(t-8) lagging behind the original waveform x(t) differ from it only slightly, an identity exists:
x(t)=x(t-4)=x(t-8) (1)
the identity is defined as a marker line, and the differences between attractors are described by analyzing the distances from the attractors to the marker line;
the third profile: the total length of the attractor's continuous trajectory, denoted S:
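A minimal sketch of the phase-space reconstruction with m = 3 and τ = 4, and of the total trajectory length S, under the assumption that S is the sum of distances between consecutive reconstructed points (the signal is synthetic; the marker-line distance analysis is not shown):

```python
import numpy as np

def delay_embed(x, m=3, tau=4):
    """Time-delay embedding: map a 1-D signal into m-dimensional phase space."""
    n = len(x) - (m - 1) * tau
    return np.column_stack([x[i * tau : i * tau + n] for i in range(m)])

def track_length(points):
    """Total length S of the attractor's continuous trajectory
    (sum of distances between consecutive points)."""
    return float(np.sum(np.linalg.norm(np.diff(points, axis=0), axis=1)))

x = np.sin(np.linspace(0, 8 * np.pi, 200))   # synthetic 1-D signal
attractor = delay_embed(x)                    # shape (192, 3)
S = track_length(attractor)
```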
4. The method for constructing the deep emotion interaction model based on emotion dimensionality as claimed in claim 3, wherein S200 comprises the following steps:
S201, taking the extracted emotional electroencephalogram features as input data, and calculating the RBM network parameters using the contrastive divergence algorithm, as shown in formula (5) below:
(w, a, b) = S_CD(x_0, m, η, T) (5)
in the above formula, S_CD stands for the CD algorithm, x_0 is a sample in the training data, m is the number of hidden-layer neurons, η is the learning rate, T is the maximum number of training iterations, W represents the connection weights between the visible layer and the hidden layer, and a and b are respectively the biases of the visible and hidden units; the output of the resulting hidden layer is shown in formula (6) below, where f_RBM is the activation function:
h = f_RBM(v | W, b) (6);
S202, setting multiple layers of RBMs, taking the output of the upper-layer RBM as the input of the next RBM, adding neurons representing the categories to each RBM, and taking the number of output nodes of the lowest layer as the number of emotion categories, so as to obtain the network parameters of the deep RBM;
S203, repeating steps S201 and S202 to obtain the number of hidden nodes corresponding to the dimension that retains the largest amount of original data information, thereby obtaining the optimal deep RBM network.
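A minimal sketch of one contrastive-divergence (CD-1) parameter update for a Bernoulli RBM, matching the roles of w, a, b and η in formula (5); the layer sizes and data below are hypothetical:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(x0, W, a, b, eta, rng):
    """One CD-1 step: positive phase, one Gibbs step, parameter update."""
    h0 = sigmoid(x0 @ W + b)                          # P(h = 1 | v = x0)
    h0_s = (rng.random(h0.shape) < h0).astype(float)  # sample hidden states
    v1 = sigmoid(h0_s @ W.T + a)                      # reconstructed visible layer
    h1 = sigmoid(v1 @ W + b)                          # hidden given reconstruction
    # Update from the difference of positive and negative statistics.
    W += eta * (np.outer(x0, h0) - np.outer(v1, h1))
    a += eta * (x0 - v1)
    b += eta * (h0 - h1)
    return W, a, b

rng = np.random.default_rng(2)
n_vis, n_hid = 6, 4                     # hypothetical layer sizes
W = 0.01 * rng.normal(size=(n_vis, n_hid))
a, b = np.zeros(n_vis), np.zeros(n_hid)
x0 = (rng.random(n_vis) < 0.5).astype(float)   # one binary training sample
W, a, b = cd1_update(x0, W, a, b, eta=0.1, rng=rng)
```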
5. The method for constructing the deep emotion interaction model based on emotion dimensionality as claimed in claim 4, wherein the specific process of S300 is as follows:
the PAD values of the basic emotions are evaluated based on the PAD three-dimensional emotion model and the Chinese PAD emotion scale; a three-dimensional emotion space is established with P, A and D as its coordinate axes; the relationships between classes are mapped using spatial distance, and the weights between emotions are finally determined; the spatial distance of two emotions in the three-dimensional PAD model can be calculated by formula (7) below:
wherein d_12 represents the spatial distance between point 1 and point 2, and (x_1, y_1, z_1) and (x_2, y_2, z_2) represent the coordinates of point 1 and point 2, respectively, in the three-dimensional PAD emotion space; the relationship between classes is obtained by calculating the inverse of the spatial distance between any two emotions.
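A minimal sketch of the Euclidean distance of formula (7) and the inverse-distance class weight (the PAD coordinates below are illustrative placeholders, not values from the Chinese PAD emotion scale):

```python
import math

def pad_distance(p1, p2):
    """Euclidean distance between two emotions in 3-D PAD space (formula (7))."""
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(p1, p2)))

# Illustrative (hypothetical) PAD coordinates for two emotions.
joy = (0.8, 0.5, 0.4)
sadness = (-0.6, -0.3, -0.3)

d = pad_distance(joy, sadness)
w = 1.0 / d   # inter-class weight: inverse of the spatial distance
```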
6. The method for constructing the deep emotion interaction model based on emotion dimensionality as claimed in claim 5, wherein the specific process of S400 is as follows:
the bottom-layer recognition network of the deep RBM network is defined as the associated cognitive network; that is, the network parameters of the bottom-layer RBM obtained by the contrastive divergence algorithm are taken as the weights between input and output of the associated cognitive network, and the weights between emotional states, characterized by the spatial distances of the dimension quantization values, are taken as the weights between outputs of the associated cognitive network;
suppose F_i (i = 1, 2, ..., n) represents an emotional speech feature and C_j (j = 1, 2, ..., m) represents an emotion category; the weight matrix formed by the relationships between features and emotion categories is denoted W_i, the input threshold is denoted b, and the weight matrix formed by the relationships between categories is denoted W_o; the weight matrix of the system can be simplified to an (n + m) × m matrix:
in the training process of the ICN network, the change of the node state values can be represented by formula (9) below, where C_0 represents the objective function,
7. The method for constructing the deep emotion interaction model based on emotion dimensionality according to claim 6, wherein the specific process of S500 is as follows: the optimal-depth RBM network obtained in S200 is combined with the associated cognitive network of S400 to finally obtain the deep emotion interaction model; that is, the weights between input and output of the associated cognitive network are obtained through the optimal-depth RBM network, the degree of association between emotional states is characterized by the spatial distances of the dimension quantization values of S300 to obtain the weights between emotional states, namely the weights between outputs, and the emotional features extracted from the signal to be detected are taken as the model input to realize accurate emotion discrimination.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810867950.2A CN109215678B (en) | 2018-08-01 | 2018-08-01 | Construction method of deep emotion interaction model based on emotion dimensionality |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109215678A CN109215678A (en) | 2019-01-15 |
CN109215678B true CN109215678B (en) | 2022-10-11 |
Family
ID=64987931
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810867950.2A Active CN109215678B (en) | 2018-08-01 | 2018-08-01 | Construction method of deep emotion interaction model based on emotion dimensionality |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109215678B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110033029A (en) * | 2019-03-22 | 2019-07-19 | 五邑大学 | A kind of emotion identification method and device based on multi-modal emotion model |
CN110781945A (en) * | 2019-10-22 | 2020-02-11 | 太原理工大学 | Electroencephalogram signal emotion recognition method and system integrating multiple features |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104200804A (en) * | 2014-09-19 | 2014-12-10 | 合肥工业大学 | Various-information coupling emotion recognition method for human-computer interaction |
CN106228977A (en) * | 2016-08-02 | 2016-12-14 | 合肥工业大学 | The song emotion identification method of multi-modal fusion based on degree of depth study |
CN106297825A (en) * | 2016-07-25 | 2017-01-04 | 华南理工大学 | A kind of speech-emotion recognition method based on integrated degree of depth belief network |
CN106503654A (en) * | 2016-10-24 | 2017-03-15 | 中国地质大学(武汉) | A kind of face emotion identification method based on the sparse autoencoder network of depth |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US10453479B2 (en) * | 2011-09-23 | 2019-10-22 | Lessac Technologies, Inc. | Methods for aligning expressive speech utterances with text and systems therefor |
US20160189730A1 (en) * | 2014-12-30 | 2016-06-30 | Iflytek Co., Ltd. | Speech separation method and system |
Non-Patent Citations (3)
Title |
---|
Detect the emotions of the public based on cascade neural network model; Xiao Sun; 2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS); 2016-08-25; full text *
Emotional speech recognition based on the PAD emotion model; Zhang Xueying; Microelectronics & Computer; 2016-09-30 (No. 9); full text *
Research on speech emotion modeling for emotional speech synthesis; Gao Yingying; China Doctoral Dissertations Full-text Database; 2016-12-15 (No. 12); full text *
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN111291678B (en) | Face image clustering method and device based on multi-feature fusion | |
CN110163258A (en) | A kind of zero sample learning method and system reassigning mechanism based on semantic attribute attention | |
CN111126263B (en) | Electroencephalogram emotion recognition method and device based on double-hemisphere difference model | |
CN101828921A (en) | Identity identification method based on visual evoked potential (VEP) | |
Bu | Human motion gesture recognition algorithm in video based on convolutional neural features of training images | |
JP7191443B2 (en) | Target object attribute prediction method based on machine learning, related equipment and computer program | |
CN109215678B (en) | Construction method of deep emotion interaction model based on emotion dimensionality | |
CN110619084B (en) | Method for recommending books according to borrowing behaviors of library readers | |
CN112800998A (en) | Multi-mode emotion recognition method and system integrating attention mechanism and DMCCA | |
CN101561881B (en) | Emotion identification method for human non-programmed motion | |
CN105654035A (en) | Three-dimensional face recognition method and data processing device applying three-dimensional face recognition method | |
CN108985455A (en) | A kind of computer application neural net prediction method and system | |
CN109686403A (en) | Based on key protein matter recognition methods in uncertain protein-protein interaction network | |
CN116522153B (en) | Lithium battery capacity prediction method, lithium battery capacity prediction device, computer equipment and storage medium | |
CN108364098B (en) | Method for measuring influence of weather characteristics on user sign-in | |
Han et al. | Emotion recognition in speech with latent discriminative representations learning | |
CN110163130B (en) | Feature pre-alignment random forest classification system and method for gesture recognition | |
CN105657653B (en) | Indoor positioning method based on fingerprint data compression | |
CN113378691B (en) | Intelligent home management system and method based on real-time user behavior analysis | |
CN106056167A (en) | Normalization possibilistic fuzzy entropy clustering method based on Gaussian kernel hybrid artificial bee colony algorithm | |
Saleh et al. | A OBTAINING UNIQUE BY ANALYZING DNA USING A NEURO-FUZZY ALGORITHM. | |
CN106789149A (en) | Using the intrusion detection method of modified self-organizing feature neural network clustering algorithm | |
CN113450562A (en) | Road network traffic state discrimination method based on clustering and graph convolution network | |
Park et al. | Enhanced machine learning algorithms: deep learning, reinforcement learning, and q-learning | |
CN111709441A (en) | Behavior recognition feature selection method based on improved feature subset discrimination |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||