CN109215678B - Construction method of deep emotion interaction model based on emotion dimensionality - Google Patents


Info

Publication number
CN109215678B
CN109215678B
Authority
CN
China
Prior art keywords
emotion
network
deep
rbm
emotional
Prior art date
Legal status
Active
Application number
CN201810867950.2A
Other languages
Chinese (zh)
Other versions
CN109215678A (en)
Inventor
孙颖
张雪英
马江河
王少玄
贾海蓉
段淑斐
Current Assignee
Taiyuan University of Technology
Original Assignee
Taiyuan University of Technology
Priority date
Filing date
Publication date
Application filed by Taiyuan University of Technology filed Critical Taiyuan University of Technology
Priority to CN201810867950.2A
Publication of CN109215678A
Application granted
Publication of CN109215678B

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03: characterised by the type of extracted parameters
    • G10L25/27: characterised by the analysis technique
    • G10L25/30: using neural networks
    • G10L25/48: specially adapted for particular use
    • G10L25/51: for comparison or discrimination
    • G10L25/63: for estimating an emotional state

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Signal Processing (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Hospice & Palliative Care (AREA)
  • General Health & Medical Sciences (AREA)
  • Psychiatry (AREA)
  • Child & Adolescent Psychology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)
  • User Interface Of Digital Computer (AREA)

Abstract

The invention belongs to the fields of emotion recognition and pattern recognition, and particularly relates to a method for constructing a deep emotion interaction model based on emotion dimensionality. It addresses the problem that traditional emotion models can only describe the probability that an emotional state occurs while ignoring the correlations between emotions. The method comprises the following steps: S100, extracting emotional features from an existing laboratory emotional EEG database; S200, constructing an RBM network of optimal depth; S300, characterizing the degree of association between emotional states by the spatial distance of their dimensional quantization values to obtain the weights between emotional states; S400, constructing an associative cognitive network; and S500, constructing the deep emotion interaction model by combining the optimal deep RBM network with the associative cognitive network, thereby achieving accurate discrimination of continuous emotion.

Description

Construction method of deep emotion interaction model based on emotion dimensionality
Technical Field
The invention belongs to the fields of emotion recognition, pattern recognition and the like, and particularly relates to a construction method of a deep emotion interaction model based on emotion dimensionality.
Background
Emotion is a comprehensive state produced by whether objective things satisfy a person's needs, and different emotional states influence learning, memory and decision-making. In recent years, with the development of artificial intelligence theory, emotion modeling has become a research hotspot that is widely followed by researchers at home and abroad.
An emotion model simulates the human emotion-processing process. Research on emotion models has produced many representative results; the more commonly used shallow emotion classification models include the Support Vector Machine (SVM), Artificial Neural Network (ANN), Hidden Markov Model (HMM) and Gaussian Mixture Model (GMM). With the continuing growth of data volume, academia and industry have turned enthusiastically to deep neural networks (DNNs), which have been applied to emotion models.
Existing emotion models can describe only the probability that an emotional state occurs, or its spontaneous transition process, and neglect an essential characteristic of emotion: emotional states are systematically correlated with one another. They therefore cannot describe the fluctuation and transition of emotional states, and when such a model fits real emotional data poorly, emotion recognition accuracy drops markedly.
Disclosure of Invention
The invention provides a method for constructing a deep emotion interaction model based on emotion dimensionality, aiming to solve the problem that traditional emotion models can only describe the probability that an emotional state occurs and neglect the mutual correlation of emotions.
The invention adopts the following technical scheme: a method for constructing a deep emotion interaction model based on emotion dimensionality comprises the following steps:
S100, extracting emotional features from an existing laboratory emotional EEG database;
S200, constructing a deep RBM network of optimal depth, and using the emotional features as its input to obtain the weights between the bottom-layer emotional features and the emotional states;
S300, characterizing the degree of association between emotional states by the spatial distance of their dimensional quantization values to obtain the weights between emotional states;
S400, constructing an associative cognitive network, in which the weights between the bottom-layer emotional features and the emotional states of the deep RBM network serve as the weights between its inputs and outputs, and the weights between emotional states serve as the weights between its outputs;
S500, constructing the deep emotion interaction model by combining the optimal deep RBM network with the associative cognitive network, thereby achieving accurate discrimination of continuous emotion.
The specific process of S100 is as follows:
The database is an emotional EEG database. Electrodes at positions FC1, FC2, FC3, FC4, C1, C2, C3, C4, CP1, CP2, CP3 and CP4 (12 leads of EEG data in total) are selected for analysis. The following features are extracted: the traditional features of energy, power spectrum and power-spectrum entropy, which characterize the energy of the EEG signal; the nonlinear attribute features of approximate entropy, the Hurst exponent and the Lyapunov exponent, which characterize its nonlinear properties; and nonlinear geometric features describing the geometric structure of the EEG signal, namely its trajectory-based descriptor contours.
Nonlinear attribute features: in view of the nonlinear nature of the EEG signal, nonlinear attribute features are extracted under phase-space reconstruction: approximate entropy, the Hurst exponent and the Lyapunov exponent. Approximate entropy is an effective measure of the rate at which new information appears in a time series: the higher the probability of generating a new pattern, the more complex the sequence. Approximate entropy thus expresses the irregularity of the information, and more complex signals have a higher approximate entropy.
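By way of illustration, the approximate entropy computation described above can be sketched in Python as follows; the embedding length m = 2 and the tolerance r = 0.2 times the signal's standard deviation are common defaults assumed here, not values specified by the patent:

```python
import numpy as np

def approximate_entropy(x, m=2, r_factor=0.2):
    """Approximate entropy ApEn(m, r) of a 1-D signal.

    A more regular signal yields a lower value and a more complex signal
    a higher one, matching the description above.  The defaults m = 2
    and r = 0.2 * std(x) are illustrative assumptions.
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    r = r_factor * np.std(x)  # tolerance scaled to the signal

    def phi(m):
        # All overlapping m-length templates of the signal.
        templates = np.array([x[i:i + m] for i in range(N - m + 1)])
        # Chebyshev distance between every pair of templates.
        dist = np.max(np.abs(templates[:, None, :] - templates[None, :, :]),
                      axis=2)
        # C_i: fraction of templates within tolerance r of template i.
        C = np.mean(dist <= r, axis=1)
        return np.mean(np.log(C))

    return phi(m) - phi(m + 1)

# A sine wave is more regular than white noise, so its ApEn is lower.
t = np.linspace(0, 4 * np.pi, 500)
print(approximate_entropy(np.sin(t)))             # low
print(approximate_entropy(np.random.randn(500)))  # high
```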
Maximum Lyapunov exponent:
The Lyapunov exponent reflects the average rate of local convergence or divergence of adjacent trajectories in phase space, and the maximum Lyapunov exponent (LLE) λ_1 indicates how fast trajectories converge or diverge. When λ_1 > 0, a larger λ_1 means a greater rate of trajectory divergence and a higher degree of chaos. Here the maximum Lyapunov exponent is computed with the Wolf method: take an initial point X_i in phase space and find its nearest neighbour X_i', at distance L_0; after time n, track the distance L_i between the two points, and retain the point if it satisfies the preset threshold ε; then start tracking the next step. After tracking and accumulating M times, the maximum Lyapunov exponent is obtained as in formula (1):

λ_1 = (1/(M·n)) Σ_{i=1}^{M} ln(L_i / L_0)   (1)

Compared with other algorithms, the Wolf method is fast to compute and robust to the embedding dimension m, the delay time τ and noise.
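A simplified sketch of the Wolf-style estimate described above is given below; it assumes formula (1) averages ln(L_i/L_0) over the M retained tracking steps and borrows the m = 3, τ = 4 reconstruction used later in the text. The step length n and threshold ε are illustrative choices:

```python
import numpy as np

def largest_lyapunov_wolf(x, m=3, tau=4, n=5, eps=None):
    """Simplified Wolf-style estimate of the largest Lyapunov exponent.

    Follows the procedure above: take a point X_i, find its nearest
    neighbour X_i' at distance L_0, evolve both for n steps to distance
    L_n, keep the pair if L_n stays below the threshold eps, and average
    ln(L_n / L_0) over the M retained steps (formula (1)).
    """
    x = np.asarray(x, dtype=float)
    # Phase-space reconstruction: embedding dimension m, delay tau.
    N = len(x) - (m - 1) * tau
    Y = np.column_stack([x[i * tau:i * tau + N] for i in range(m)])
    if eps is None:
        eps = 0.5 * np.std(x)  # illustrative threshold

    logs, M, i = [], 0, 0
    while i + n < len(Y):
        # Nearest neighbour of Y[i], excluding temporally close points.
        d = np.linalg.norm(Y - Y[i], axis=1)
        d[max(0, i - 2 * tau):i + 2 * tau + 1] = np.inf
        j = int(np.argmin(d))
        if j + n < len(Y):
            L0 = d[j]
            Ln = np.linalg.norm(Y[i + n] - Y[j + n])
            if L0 > 0 and Ln < eps:     # retain the pair, as in the text
                logs.append(np.log(Ln / L0))
                M += 1
        i += n                          # start tracking the next step
    return sum(logs) / (M * n) if M else 0.0
```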
Hurst exponent:
The Hurst exponent (abbreviated H) measures the long-term memory of a time series; its value ranges from 0 to 1. H > 0.5 indicates that the series has long-term autocorrelation, i.e. a stronger dependence between its earlier and later parts. H is computed with rescaled range (R/S) analysis, a non-parametric statistical method unaffected by the distribution of the time series. The one-dimensional signal [x(1), x(2), …, x(N)] is divided into M adjacent subsequences u of equal length. For each subsequence the cumulative deviation z_u and the standard deviation S_u are computed, giving the rescaled range R_u/S_u, where R_u = max z_u − min z_u; the Hurst exponent is then obtained from formula (2):

R_M / S_M = b · M^H   (2)

Taking the logarithm of both sides yields H, the Hurst exponent, where b is a constant. The variation of H differs across emotional states, so extracting the Hurst exponent feature can reflect the temporal context of emotional change.
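The rescaled range analysis can be sketched as follows; the set of subsequence lengths and the log-log fit are standard choices assumed here rather than taken from the patent:

```python
import numpy as np

def hurst_rs(x, num_scales=10):
    """Hurst exponent via rescaled range (R/S) analysis.

    The signal is split into adjacent equal-length subsequences; for
    each, the range R_u of the cumulative deviation and the standard
    deviation S_u give R_u / S_u.  Fitting log(R/S) against log(length)
    yields H as the slope (the log form of formula (2)).
    """
    x = np.asarray(x, dtype=float)
    N = len(x)
    sizes = np.unique(np.logspace(np.log10(10), np.log10(N // 2),
                                  num_scales).astype(int))
    rs_means = []
    for n in sizes:
        rs = []
        for start in range(0, N - n + 1, n):     # adjacent subsequences
            seg = x[start:start + n]
            z = np.cumsum(seg - seg.mean())      # cumulative deviation z_u
            R = z.max() - z.min()                # R_u = max z_u - min z_u
            S = seg.std()
            if S > 0:
                rs.append(R / S)
        rs_means.append(np.mean(rs))
    # Slope of the log-log fit is the Hurst exponent H.
    H, _ = np.polyfit(np.log(sizes), np.log(rs_means), 1)
    return H

# White noise has no long-term memory, so H should be near 0.5.
print(hurst_rs(np.random.randn(4000)))
```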
Nonlinear geometric features: the one-dimensional EEG signal is mapped into a high-dimensional space by phase-space reconstruction and analysed there, extracting the geometric features of the reconstructed phase space under different emotional states: three trajectory-based descriptor contours.
For the phase-space reconstruction an embedding dimension m = 3 and a time delay τ = 4 are selected. When the two samples x(t−4) and x(t−8), which lag the original waveform x(t), differ only slightly, the identity of formula (3) holds:

x(t) = x(t−4) = x(t−8)   (3)

This identity is defined as the marker line, and the differences between attractors are described by analysing the distance from the attractor to the marker line.
The first contour: the distance from the attractor point to the centre,

d_n(t) = ||p_n(t)||, n = 2, 3   (4)

where the attractor point in two-dimensional space is p_2(t) = [x(t), x(t−4)] and in three-dimensional space is p_3(t) = [x(t), x(t−4), x(t−8)].

The second contour: the distance from the attractor point to the marker line, i.e. the component of p_n(t) perpendicular to the marker-line direction u_n = (1, …, 1)/√n,

l_n(t) = √(||p_n(t)||² − (p_n(t)·u_n)²)   (5)

The third contour: the total length of the continuous attractor trajectory, denoted S,

S = Σ_t ||p_n(t+1) − p_n(t)||   (6)

The quantities d_n, l_n and S defined above are the extracted nonlinear geometric features.
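Since the patent gives the contour formulas only as images, the sketch below implements the three descriptors as reconstructed above (distance to the centre, distance to the marker line, total trajectory length) for the three-dimensional reconstruction with τ = 4:

```python
import numpy as np

def trajectory_descriptors(x, tau=4):
    """Three trajectory-based descriptor contours of a 1-D signal.

    Uses the 3-D phase-space reconstruction p(t) = [x(t), x(t-tau),
    x(t-2*tau)] with tau = 4 and the marker line x = y = z defined by
    the identity x(t) = x(t-4) = x(t-8).  The formulas follow the
    reconstruction given above, not a verbatim copy of the patent's
    equation images.
    """
    x = np.asarray(x, dtype=float)
    # Attractor points in 3-D space: row t is p(t).
    P = np.column_stack([x[2 * tau:], x[tau:-tau], x[:-2 * tau]])

    # Contour 1: distance from each attractor point to the centre.
    d_center = np.linalg.norm(P, axis=1)

    # Contour 2: distance to the marker line, direction u = (1,1,1)/sqrt(3).
    u = np.ones(3) / np.sqrt(3)
    proj = P @ u  # component of p(t) along the marker line
    d_line = np.sqrt(np.maximum(d_center ** 2 - proj ** 2, 0.0))

    # Contour 3: total length S of the continuous attractor trajectory.
    S = np.sum(np.linalg.norm(np.diff(P, axis=0), axis=1))

    return d_center, d_line, S
```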
S200 comprises the following steps:
S201, the extracted emotional EEG features are used as input data, and the RBM network parameters are computed with the contrastive divergence (CD) algorithm, as in formula (7):

(W, a, b) = S_CD(x_0, m, η, T)   (7)

In this formula S_CD denotes the CD algorithm, x_0 is a sample of the training data, m is the number of hidden-layer neurons, η is the learning rate, and T is the maximum number of training iterations; the connection weights between the visible and hidden layers are denoted W, and a and b are respectively the biases of the visible and hidden units. The output of the hidden layer is then given by formula (8), with f_RBM the activation function:

h = f_RBM(v | W, b)   (8)

S202, several layers of RBMs are stacked: the output of the previous RBM layer serves as the input of the next, neurons representing the categories are added to each RBM, and the number of output nodes of the last layer is set to the number of emotion categories, giving the network parameters of the deep RBM.
S203, steps S201 and S202 are repeated to find the number of hidden nodes corresponding to the dimension that retains the largest amount of information from the original data, yielding the deep RBM network of optimal depth.
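A minimal CD-1 sketch of S201 and S202 for Bernoulli RBMs follows; the layer sizes, learning rate and iteration count are illustrative assumptions, and the input features are assumed scaled to [0, 1]. Here train_rbm_cd plays the role of S_CD in formula (7), and the sigmoid activation that of f_RBM in formula (8):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_rbm_cd(X, m, eta=0.05, T=100, rng=np.random.default_rng(0)):
    """CD-1 training of one Bernoulli RBM: (W, a, b) = S_CD(x0, m, eta, T).

    X: training samples as rows, scaled to [0, 1]; m: hidden units;
    eta: learning rate; T: maximum training iterations (formula (7)).
    """
    n_vis = X.shape[1]
    W = 0.01 * rng.standard_normal((n_vis, m))
    a = np.zeros(n_vis)  # visible bias
    b = np.zeros(m)      # hidden bias
    for _ in range(T):
        v0 = X
        h0 = sigmoid(v0 @ W + b)  # formula (8): h = f_RBM(v | W, b)
        h_sample = (h0 > rng.random(h0.shape)).astype(float)
        v1 = sigmoid(h_sample @ W.T + a)  # reconstruction
        h1 = sigmoid(v1 @ W + b)
        # Contrastive-divergence parameter updates.
        W += eta * (v0.T @ h0 - v1.T @ h1) / len(X)
        a += eta * np.mean(v0 - v1, axis=0)
        b += eta * np.mean(h0 - h1, axis=0)
    return W, a, b

def stack_rbms(X, layer_sizes):
    """S202: stack RBMs, feeding each hidden output to the next layer.

    The last entry of layer_sizes would equal the number of emotion
    categories, e.g. [128, 64, n_emotions] (illustrative sizes).
    """
    params, inp = [], X
    for m in layer_sizes:
        W, a, b = train_rbm_cd(inp, m)
        params.append((W, a, b))
        inp = sigmoid(inp @ W + b)
    return params
```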
The specific process of S300 is as follows:
The PAD values of the basic emotions are rated using the PAD three-dimensional emotion model and the Chinese version of the PAD emotion scale, and a three-dimensional emotion space is established with P, A and D as its coordinate axes. The relationships between classes are mapped by spatial distance, which finally determines the weights between emotions. The spatial distance between two emotions in the three-dimensional PAD model is calculated with formula (9):

d_12 = √((x_1 − x_2)² + (y_1 − y_2)² + (z_1 − z_2)²)   (9)

where d_12 is the spatial distance between point 1 and point 2, and (x_1, y_1, z_1) and (x_2, y_2, z_2) are their respective coordinates in the three-dimensional PAD emotion space. The relationship between classes is obtained by taking the reciprocal of the spatial distance between any two emotions.
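A short sketch of S300 follows; the PAD coordinates below are placeholders, since the patent obtains them from ratings on the Chinese version of the PAD emotion scale rather than listing them:

```python
import numpy as np

# Hypothetical PAD coordinates (P, A, D) per emotion; placeholders only.
pad = {
    "happy":   ( 0.55,  0.35,  0.30),
    "sad":     (-0.50, -0.20, -0.30),
    "angry":   (-0.45,  0.50,  0.20),
    "neutral": ( 0.00,  0.00,  0.00),
}

def emotion_weights(pad):
    """Weight between two emotions = reciprocal of their PAD distance.

    The distance is the Euclidean distance of formula (9); emotions that
    lie closer in PAD space therefore receive larger mutual weights.
    """
    names = list(pad)
    n = len(names)
    Wo = np.zeros((n, n))
    for i in range(n):
        for j in range(n):
            if i != j:
                d12 = np.linalg.norm(np.subtract(pad[names[i]],
                                                 pad[names[j]]))
                Wo[i, j] = 1.0 / d12
    return names, Wo

names, Wo = emotion_weights(pad)
```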
The specific process of S400 is as follows:
The bottom-layer recognition network of the deep RBM network is defined as the associative cognitive network; that is, the parameters of the bottom-layer RBM obtained with the contrastive divergence algorithm serve as the weights between the inputs and outputs of the associative cognitive network, and the weights between emotional states, characterized by the spatial distance of the dimensional quantization values, serve as the weights between its outputs.
Suppose F_i (i = 1, 2, …, n) denotes an emotional feature and C_j (j = 1, 2, …, m) an emotion category. The weight matrix formed by the relations between features and emotion categories is denoted W_i, the input threshold is denoted b, and the weight matrix formed by the relations between categories is denoted W_o; the weight matrix of the whole system can then be reduced to the (n + m) × m matrix of formula (10):

W = [W_i; W_o]   (10)

During training of the ICN network, the change of the node state values is governed by formula (11), in which C_0 denotes the objective function.
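The assembly of the (n + m) × m system matrix of formula (10) from the two weight blocks can be sketched as follows; the shapes and random values are illustrative, with W_i in practice coming from the trained bottom-layer RBM and W_o from the PAD distances above:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 12, 4                       # n features, m emotion categories (illustrative)
W_i = rng.standard_normal((n, m))  # feature-to-class weights, from the bottom RBM
W_o = rng.random((m, m))           # class-to-class weights, from the PAD distances
np.fill_diagonal(W_o, 0.0)         # no self-interaction between a class and itself

# Formula (10): stack the two blocks into the (n + m) x m system matrix
# of the associative cognitive network.
W_system = np.vstack([W_i, W_o])
assert W_system.shape == (n + m, m)
```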
The specific step of S500 is to combine the optimal deep RBM network obtained in S200 with the associative cognitive network of S400; that is, the weights between the inputs and outputs of the associative cognitive network are obtained through the optimal deep RBM network, finally yielding the deep emotion interaction model and achieving accurate emotion discrimination.
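Finally, a sketch of how the combined model of S500 might score an input: the deep RBM layers propagate the features, and the class-to-class weights let the emotion outputs influence one another over a few settling iterations. The settling loop is an interpretation standing in for the ICN update of formula (11), which the patent gives only as an image:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def deep_emotion_interaction(x, rbm_params, W_o, settle=5):
    """Forward pass of the combined model of S500 (illustrative).

    x: feature vector; rbm_params: (W, a, b) triples such as those
    produced by the stack_rbms sketch above, whose last layer has one
    unit per emotion category; W_o: class-to-class weights from the PAD
    distances.  The settling loop, in which emotion scores influence
    one another through W_o, is an assumed stand-in for the ICN update
    of formula (11).
    """
    h = x
    for W, a, b in rbm_params:   # deep RBM propagation, as in formula (8)
        h = sigmoid(h @ W + b)
    c = h                        # initial per-emotion scores
    for _ in range(settle):
        c = sigmoid(h + c @ W_o) # correlated emotions reinforce or suppress
    return int(np.argmax(c)), c  # predicted emotion index and final scores
```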
Compared with the prior art, the invention extracts emotional features from an existing laboratory emotional EEG database; uses the emotional features as the input of a deep RBM network and obtains, through training, the weights between the inputs and outputs of the bottom-layer associative cognitive network; rates the PAD values of the basic emotions using the PAD three-dimensional emotion model and the Chinese version of the PAD emotion scale, constructs a three-dimensional emotion space with P, A and D as its coordinate axes, maps the relationships between classes by spatial distance, and finally determines the weights between emotions, which serve as the weights between the outputs of the associative cognitive network; and constructs an effective emotion model, the deep emotion interaction model, by combining the deep RBM network with the associative cognitive network to achieve accurate emotion discrimination.
Drawings
FIG. 1 shows the 64-lead electrode distribution and the electrodes selected by the present invention;
FIG. 2 is a diagram of the associative cognitive network architecture;
FIG. 3 is a diagram of a deep emotion interaction model.
Detailed Description
The method for constructing a deep emotion interaction model based on emotion dimensionality is described in detail below with reference to the drawings.
The invention provides a method for constructing a deep emotion interaction model based on emotion dimensionality: an emotion interaction model is built from the interconnections between emotions and, combined with deep learning theory, extended into a deep emotion interaction model so as to achieve accurate emotion discrimination. The specific implementation is as follows.
The method for constructing a deep emotion interaction model based on emotion dimensionality is implemented through the following steps:
S100, extracting emotional features from an existing laboratory emotional EEG database;
S200, constructing a deep RBM network of optimal depth, using the emotional features as its input, and obtaining the weights between the bottom-layer emotional features and the emotional states;
S300, characterizing the degree of association between emotional states by the spatial distance of their dimensional quantization values to obtain the weights between emotional states;
S400, constructing an associative cognitive network, using the weights between the bottom-layer emotional features and emotional states of the RBM as the weights between its inputs and outputs, and the weights between emotional states as the weights between its outputs;
S500, constructing the deep emotion interaction model by combining the optimal deep RBM network with the associative cognitive network, thereby achieving accurate discrimination of continuous emotion.
In the above scheme, the specific process of extracting emotional features in S100 is as follows: the database is the laboratory's existing emotional EEG database, and the electrodes corresponding to the auditory functional area are mainly examined; as shown in FIG. 1, electrodes at positions FC1, FC2, FC3, FC4, C1, C2, C3, C4, CP1, CP2, CP3 and CP4 (12 leads of EEG data in total) are selected for analysis. The traditional features of energy, power spectrum and power-spectrum entropy, which characterize the energy of the EEG signal, the nonlinear attribute features of approximate entropy, the Hurst exponent and the Lyapunov exponent, which characterize its nonlinear properties, and nonlinear geometric features describing the geometric structure of the EEG signal, namely its trajectory-based descriptor contours, are extracted.
Nonlinear attribute features: in view of the nonlinear nature of the EEG signal, nonlinear attribute features are extracted under phase-space reconstruction: approximate entropy, the Hurst exponent and the Lyapunov exponent. Approximate entropy is an effective measure of the rate at which new information appears in a time series: the higher the probability of generating a new pattern, the more complex the sequence. Approximate entropy thus expresses the irregularity of the information, and more complex signals have a higher approximate entropy.
Maximum Lyapunov exponent:
The Lyapunov exponent reflects the average rate of local convergence or divergence of adjacent trajectories in phase space, and the maximum Lyapunov exponent (LLE) λ_1 indicates how fast trajectories converge or diverge. When λ_1 > 0, a larger λ_1 means a greater rate of trajectory divergence and a higher degree of chaos. The maximum Lyapunov exponent is determined herein with the Wolf method: take an initial point X_i in phase space and find its nearest neighbour X_i', at distance L_0; after time n, track the distance L_i between the two points, and retain the point if it satisfies the preset threshold ε; then start tracking the next step. After tracking and accumulating M times, the maximum Lyapunov exponent is obtained as in formula (1):

λ_1 = (1/(M·n)) Σ_{i=1}^{M} ln(L_i / L_0)   (1)

Compared with other algorithms, the Wolf method is fast to compute and robust to the embedding dimension m, the delay time τ and noise.
Hurst exponent:
The Hurst exponent (abbreviated H) measures the long-term memory of a time series; its value ranges from 0 to 1. H > 0.5 indicates that the series has long-term autocorrelation, i.e. a stronger dependence between its earlier and later parts. H is computed with rescaled range (R/S) analysis, a non-parametric statistical method unaffected by the distribution of the time series. The one-dimensional signal [x(1), x(2), …, x(N)] is divided into M adjacent subsequences u of equal length. For each subsequence the cumulative deviation z_u and the standard deviation S_u are computed, giving the rescaled range R_u/S_u, where R_u = max z_u − min z_u; the Hurst exponent is then obtained from formula (2):

R_M / S_M = b · M^H   (2)

Taking the logarithm of both sides yields H, the Hurst exponent, where b is a constant. The variation of H differs across emotional states, so extracting the Hurst exponent feature can reflect the temporal context of emotional change.
Nonlinear geometric features: the one-dimensional EEG signal is mapped into a high-dimensional space by phase-space reconstruction and analysed there, extracting the geometric features of the reconstructed phase space under different emotional states: three trajectory-based descriptor contours.
In the phase-space reconstruction of this project, an embedding dimension m = 3 and a time delay τ = 4 are selected. When the two samples x(t−4) and x(t−8), which lag the original waveform x(t), differ only slightly, the identity of formula (3) holds:

x(t) = x(t−4) = x(t−8)   (3)

This identity is defined as the marker line, and the differences between attractors are described by analysing the distance from the attractor to the marker line.
The first contour: the distance from the attractor point to the centre,

d_n(t) = ||p_n(t)||, n = 2, 3   (4)

where the attractor point in two-dimensional space is p_2(t) = [x(t), x(t−4)] and in three-dimensional space is p_3(t) = [x(t), x(t−4), x(t−8)].

The second contour: the distance from the attractor point to the marker line, i.e. the component of p_n(t) perpendicular to the marker-line direction u_n = (1, …, 1)/√n,

l_n(t) = √(||p_n(t)||² − (p_n(t)·u_n)²)   (5)

The third contour: the total length of the continuous attractor trajectory, denoted S,

S = Σ_t ||p_n(t+1) − p_n(t)||   (6)

The quantities d_n, l_n and S defined above are the extracted nonlinear geometric features.
In the above scheme, S200 constructs a deep RBM network of optimal depth, uses the emotional features as its input, and obtains the weights between the bottom-layer emotional features and the emotional states. The specific process is as follows:
S201, the extracted emotional EEG features are used as input data, and the RBM network parameters are computed with the contrastive divergence (CD) algorithm, as in formula (7):

(W, a, b) = S_CD(x_0, m, η, T)   (7)

In this formula S_CD denotes the CD algorithm, x_0 is a sample of the training data, m is the number of hidden-layer neurons, η is the learning rate, and T is the maximum number of training iterations; the connection weights between the visible and hidden layers are denoted W, and a and b are respectively the biases of the visible and hidden units. The output of the hidden layer is then given by formula (8), with f_RBM the activation function:

h = f_RBM(v | W, b)   (8)

S202, several layers of RBMs are stacked: the output of the previous RBM layer serves as the input of the next, neurons representing the categories are added to each RBM, and the number of output nodes of the last layer is set to the number of emotion categories, giving the network parameters of the deep RBM.
S203, steps S201 and S202 are repeated to find the number of hidden nodes corresponding to the dimension that retains the largest amount of information from the original data, yielding the deep RBM network of optimal depth.
In the above scheme, S300 characterizes the degree of association between emotional states by the spatial distance of their dimensional quantization values and obtains the weights between emotional states. The specific process is as follows:
The PAD values of the basic emotions are rated using the PAD three-dimensional emotion model and the Chinese version of the PAD emotion scale, and a three-dimensional emotion space is established with P, A and D as its coordinate axes. The relationships between classes are mapped by spatial distance, which finally determines the weights between emotions. The spatial distance between two emotions in the three-dimensional PAD model is calculated with formula (9):

d_12 = √((x_1 − x_2)² + (y_1 − y_2)² + (z_1 − z_2)²)   (9)

where d_12 is the spatial distance between point 1 and point 2, and (x_1, y_1, z_1) and (x_2, y_2, z_2) are their respective coordinates in the three-dimensional PAD emotion space. The relationship between classes is obtained by taking the reciprocal of the spatial distance between any two emotions.
In the above scheme, S400 constructs the associative cognitive network: the weights between the bottom-layer emotional features and emotional states of the optimal deep RBM network serve as the weights between its inputs and outputs, and the weights between emotional states serve as the weights between its outputs. The specific process is as follows:
As shown in FIG. 2, the bottom-layer recognition network of the deep RBM network is defined as the associative cognitive network; that is, the parameters of the bottom-layer RBM obtained with the contrastive divergence algorithm serve as the weights between the inputs and outputs of the associative cognitive network, and the weights between emotional states, characterized by the spatial distance of the dimensional quantization values, serve as the weights between its outputs.
Suppose F_i (i = 1, 2, …, n) denotes an emotional feature and C_j (j = 1, 2, …, m) an emotion category. The weight matrix formed by the relations between features and emotion categories is denoted W_i (also called the input weight matrix), the input threshold is denoted b, and the weight matrix formed by the relations between categories is denoted W_o (the output weight matrix). The weight matrix of the whole system can then be reduced to the (n + m) × m matrix of formula (10):

W = [W_i; W_o]   (10)

During training of the ICN network, the change of the node state values is governed by formula (11), in which C_0 denotes the objective function.
In the above scheme, S500 constructs the deep emotion interaction model by combining the optimal deep RBM network with the associative cognitive network to achieve accurate discrimination of continuous emotion. The specific process is as follows: as shown in FIG. 3, the optimal deep RBM network obtained in S200 is combined with the associative cognitive network of S400; that is, the weights between the inputs and outputs of the associative cognitive network are obtained through the optimal deep RBM network, finally yielding the deep emotion interaction model and achieving accurate emotion discrimination.
According to the method for constructing a deep emotion interaction model based on emotion dimensionality, emotional features are extracted from the laboratory's existing emotional EEG database; a multi-layer RBM network is constructed with the emotional features as its input, and its parameters are obtained by CD-algorithm training; the bottom-layer RBM parameters serve as the weights between the inputs and outputs of the associative cognitive network; the degree of association between emotional states, obtained from the P, A, D emotion scale, serves as the weights between its outputs; and the deep emotion interaction model is constructed by combining the multi-layer RBM network with the associative cognitive network. Emotion can thus be discriminated accurately, which has important theoretical significance and application value in the fields of affective computing and artificial intelligence.

Claims (7)

1. A method for constructing a deep emotion interaction model based on emotion dimensionality, characterized by comprising the following steps:
S100, extracting emotional features from an existing laboratory emotional EEG database;
S200, constructing a deep RBM network of optimal depth, and using the emotional features as its input to obtain the weights between the bottom-layer emotional features and the emotional states;
S300, characterizing the degree of association between emotional states by the spatial distance of their dimensional quantization values to obtain the weights between emotional states;
S400, constructing an associative cognitive network, in which the weights between the bottom-layer emotional features and the emotional states of the deep RBM network serve as the weights between its inputs and outputs, and the weights between emotional states serve as the weights between its outputs;
S500, constructing the deep emotion interaction model by combining the optimal deep RBM network with the associative cognitive network, thereby achieving accurate discrimination of continuous emotion.
2. The method for constructing a deep emotion interaction model based on emotion dimensionality as claimed in claim 1, wherein the specific process of S100 is as follows: the database is an emotional EEG database; electrodes at positions FC1, FC2, FC3, FC4, C1, C2, C3, C4, CP1, CP2, CP3 and CP4 (12 leads of EEG data in total) are selected for analysis; and the traditional features of energy, power spectrum and power-spectrum entropy characterizing the energy of the EEG signal, the nonlinear attribute features of approximate entropy, the Hurst exponent and the Lyapunov exponent characterizing its nonlinear properties, and nonlinear geometric features describing the geometric structure of the EEG signal, namely its trajectory-based descriptor contours, are extracted.
3. The method for constructing a deep emotion interaction model based on emotion dimensionality as claimed in claim 2, wherein for the nonlinear geometric features the one-dimensional EEG signal is mapped into a high-dimensional space by phase-space reconstruction and analysed there, extracting the geometric features of the reconstructed phase space under different emotional states: three trajectory-based descriptor contours;
in the phase-space reconstruction an embedding dimension m = 3 and a time delay τ = 4 are selected, and when the two samples x(t−4) and x(t−8) lagging the original waveform x(t) differ only slightly, the identity of formula (1) holds:

x(t) = x(t−4) = x(t−8)   (1)

the identity is defined as the marker line, and the differences between attractors are described by analysing the distance from the attractor to the marker line;
the first contour: the distance from the attractor point to the centre,

d_n(t) = ||p_n(t)||, n = 2, 3   (2)

where the attractor point in two-dimensional space is p_2(t) = [x(t), x(t−4)] and in three-dimensional space is p_3(t) = [x(t), x(t−4), x(t−8)];
the second contour: the distance from the attractor point to the marker line, i.e. the component of p_n(t) perpendicular to the marker-line direction u_n = (1, …, 1)/√n,

l_n(t) = √(||p_n(t)||² − (p_n(t)·u_n)²)   (3)

the third contour: the total length of the continuous attractor trajectory, denoted S,

S = Σ_t ||p_n(t+1) − p_n(t)||   (4)

the three physical quantities d_n, l_n and S defined above are the extracted nonlinear geometric features.
4. The method for constructing a deep emotion interaction model based on emotion dimensionality as claimed in claim 3, wherein S200 comprises the following steps:
S201, the extracted emotional EEG features are used as input data, and the RBM network parameters are computed with the contrastive divergence algorithm, as in formula (5):

(W, a, b) = S_CD(x_0, m, η, T)   (5)

in this formula S_CD denotes the CD algorithm, x_0 is a sample of the training data, m is the number of hidden-layer neurons, η is the learning rate, and T is the maximum number of training iterations; the connection weights between the visible and hidden layers are denoted W, and a and b are respectively the biases of the visible and hidden units; the output of the hidden layer is then given by formula (6), with f_RBM the activation function:

h = f_RBM(v | W, b)   (6);

S202, several layers of RBMs are stacked: the output of the previous RBM layer serves as the input of the next, neurons representing the categories are added to each RBM, and the number of output nodes of the last layer is set to the number of emotion categories, giving the network parameters of the deep RBM;
S203, steps S201 and S202 are repeated to find the number of hidden nodes corresponding to the dimension that retains the largest amount of information from the original data, yielding the deep RBM network of optimal depth.
5. The method for constructing a deep emotion interaction model based on emotion dimensionality as claimed in claim 4, wherein the specific process of S300 is as follows:
the PAD values of the basic emotions are rated using the PAD three-dimensional emotion model and the Chinese version of the PAD emotion scale, a three-dimensional emotion space is established with P, A and D as its coordinate axes, the relationships between classes are mapped by spatial distance, and the weights between emotions are finally determined; the spatial distance between two emotions in the three-dimensional PAD model is calculated with formula (7):

d_12 = √((x_1 − x_2)² + (y_1 − y_2)² + (z_1 − z_2)²)   (7)

where d_12 is the spatial distance between point 1 and point 2, and (x_1, y_1, z_1) and (x_2, y_2, z_2) are their respective coordinates in the three-dimensional PAD emotion space; the relationship between classes is obtained by taking the reciprocal of the spatial distance between any two emotions.
6. The method for constructing a deep emotion interaction model based on emotion dimensionality as claimed in claim 5, wherein the specific process of S400 is as follows:
the bottom-layer recognition network of the deep RBM network is defined as the associative cognitive network; that is, the parameters of the bottom-layer RBM obtained with the contrastive divergence algorithm serve as the weights between the inputs and outputs of the associative cognitive network, and the weights between emotional states, characterized by the spatial distance of the dimensional quantization values, serve as the weights between its outputs;
suppose F_i (i = 1, 2, …, n) denotes an emotional feature and C_j (j = 1, 2, …, m) an emotion category; the weight matrix formed by the relations between features and emotion categories is denoted W_i, the input threshold is denoted b, and the weight matrix formed by the relations between categories is denoted W_o; the weight matrix of the whole system can then be reduced to the (n + m) × m matrix of formula (8):

W = [W_i; W_o]   (8)

during training of the ICN network, the change of the node state values is governed by formula (9), in which C_0 denotes the objective function.
7. The method for constructing a deep emotion interaction model based on emotion dimensionality as claimed in claim 6, wherein the specific step of S500 is to combine the optimal deep RBM network obtained in S200 with the associative cognitive network of S400 to finally obtain the deep emotion interaction model; that is, the weights between the inputs and outputs of the associative cognitive network are obtained through the optimal deep RBM network, the degree of association between emotional states is characterized by the spatial distance of the dimensional quantization values of S300 to obtain the weights between emotional states, i.e. the weights between outputs, and the emotional features extracted from the signal to be tested are used as the model input to achieve accurate emotion discrimination.
CN201810867950.2A 2018-08-01 2018-08-01 Construction method of deep emotion interaction model based on emotion dimensionality Active CN109215678B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810867950.2A CN109215678B (en) 2018-08-01 2018-08-01 Construction method of deep emotion interaction model based on emotion dimensionality

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810867950.2A CN109215678B (en) 2018-08-01 2018-08-01 Construction method of deep emotion interaction model based on emotion dimensionality

Publications (2)

Publication Number Publication Date
CN109215678A CN109215678A (en) 2019-01-15
CN109215678B 2022-10-11

Family

ID=64987931

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810867950.2A Active CN109215678B (en) 2018-08-01 2018-08-01 Construction method of deep emotion interaction model based on emotion dimensionality

Country Status (1)

Country Link
CN (1) CN109215678B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110033029A (en) * 2019-03-22 2019-07-19 五邑大学 A kind of emotion identification method and device based on multi-modal emotion model
CN110781945A (en) * 2019-10-22 2020-02-11 太原理工大学 Electroencephalogram signal emotion recognition method and system integrating multiple features


Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10453479B2 (en) * 2011-09-23 2019-10-22 Lessac Technologies, Inc. Methods for aligning expressive speech utterances with text and systems therefor
US20160189730A1 (en) * 2014-12-30 2016-06-30 Iflytek Co., Ltd. Speech separation method and system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104200804A (en) * 2014-09-19 2014-12-10 合肥工业大学 Various-information coupling emotion recognition method for human-computer interaction
CN106297825A (en) * 2016-07-25 2017-01-04 华南理工大学 A kind of speech-emotion recognition method based on integrated degree of depth belief network
CN106228977A (en) * 2016-08-02 2016-12-14 合肥工业大学 The song emotion identification method of multi-modal fusion based on degree of depth study
CN106503654A (en) * 2016-10-24 2017-03-15 中国地质大学(武汉) A kind of face emotion identification method based on the sparse autoencoder network of depth

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
Detect the emotions of the public based on cascade neural network model; Xiao Sun; 2016 IEEE/ACIS 15th International Conference on Computer and Information Science (ICIS); 2016-08-25; full text *
Emotional speech recognition based on the PAD emotion model (基于PAD情绪模型的情感语音识别); Zhang Xueying (张雪英); Microelectronics & Computer (微电子学与计算机); 2016-09-30 (No. 9); full text *
Research on speech emotion modeling for emotional speech synthesis (面向情感语音合成的言语情感建模研究); Gao Yingying (高莹莹); China Doctoral Dissertations Full-text Database (中国博士学位论文全文数据库); 2016-12-15 (No. 12); full text *

Also Published As

Publication number Publication date
CN109215678A (en) 2019-01-15

Similar Documents

Publication Publication Date Title
CN111291678B (en) Face image clustering method and device based on multi-feature fusion
CN110163258A (en) A kind of zero sample learning method and system reassigning mechanism based on semantic attribute attention
CN111126263B (en) Electroencephalogram emotion recognition method and device based on double-hemisphere difference model
CN101828921A (en) Identity identification method based on visual evoked potential (VEP)
Bu Human motion gesture recognition algorithm in video based on convolutional neural features of training images
JP7191443B2 (en) Target object attribute prediction method based on machine learning, related equipment and computer program
CN109215678B (en) Construction method of deep emotion interaction model based on emotion dimensionality
CN110619084B (en) Method for recommending books according to borrowing behaviors of library readers
CN112800998A (en) Multi-mode emotion recognition method and system integrating attention mechanism and DMCCA
CN101561881B (en) Emotion identification method for human non-programmed motion
CN105654035A (en) Three-dimensional face recognition method and data processing device applying three-dimensional face recognition method
CN108985455A (en) A kind of computer application neural net prediction method and system
CN109686403A (en) Based on key protein matter recognition methods in uncertain protein-protein interaction network
CN116522153B (en) Lithium battery capacity prediction method, lithium battery capacity prediction device, computer equipment and storage medium
CN108364098B (en) Method for measuring influence of weather characteristics on user sign-in
Han et al. Emotion recognition in speech with latent discriminative representations learning
CN110163130B (en) Feature pre-alignment random forest classification system and method for gesture recognition
CN105657653B (en) Indoor positioning method based on fingerprint data compression
CN113378691B (en) Intelligent home management system and method based on real-time user behavior analysis
CN106056167A (en) Normalization possibilistic fuzzy entropy clustering method based on Gaussian kernel hybrid artificial bee colony algorithm
Saleh et al. A OBTAINING UNIQUE BY ANALYZING DNA USING A NEURO-FUZZY ALGORITHM.
CN106789149A (en) Using the intrusion detection method of modified self-organizing feature neural network clustering algorithm
CN113450562A (en) Road network traffic state discrimination method based on clustering and graph convolution network
Park et al. Enhanced machine learning algorithms: deep learning, reinforcement learning, and q-learning
CN111709441A (en) Behavior recognition feature selection method based on improved feature subset discrimination

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant