CN113903362A - Speech emotion recognition method based on neural network - Google Patents
- Publication number
- CN113903362A (application CN202110990439.3A)
- Authority
- CN
- China
- Prior art keywords
- emotion
- neural network
- text
- speech
- features
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/63—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for estimating an emotional state
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/27—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/30—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique using neural networks
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a speech emotion recognition method based on a neural network. The target speech signal is classified into four emotions: happy, sad, neutral, and angry. Filter-bank-based features are extracted from the speech signal and fed separately into a convolutional neural network and a time delay neural network, each of which automatically extracts emotion features; a normalized exponential function (softmax) classifier then yields the probability of each emotion, and the emotion with the largest probability is taken as the speech emotion category. In parallel, the target speech signal is recognized as text, which is fed into a pre-trained bidirectional encoder model to obtain the text emotion category. The final emotion category is obtained by fusing the three models, which addresses the problems in the prior art that model fusion and multi-modal emotion recognition are difficult to train and yield little improvement in accuracy.
Description
Technical Field
The invention relates to the technical field of speech emotion recognition, in particular to a speech emotion recognition method based on a neural network.
Background
Many speech emotion recognition methods fuse several speech emotion classification models. However, because every model uses all of the speech information, the models are highly correlated and fusion brings little improvement. Another approach extracts features with different models and then fuses those models with identical weights, which likewise yields little improvement.
Multi-modal methods that combine text emotion recognition with speech emotion recognition also exist, but they rely on feature fusion; because different models learn at different speeds, feature fusion cannot fully exploit the complementary information of the different modalities.
Disclosure of Invention
The invention aims to provide a speech emotion recognition method based on a neural network, so as to solve the problems in the prior art that model fusion and multi-modal emotion recognition are difficult to train and yield little improvement in accuracy.
In order to achieve the above object, the present invention adopts a speech emotion recognition method based on a neural network, comprising the following steps:
extracting speech features and feeding them into a convolutional neural network to obtain a convolutional emotion category;
feeding the speech features into a time delay neural network to obtain a time-delay emotion category;
recognizing the speech as text and feeding the text into a pre-trained bidirectional encoder model to obtain a text emotion category;
performing model fusion to obtain the final emotion category.
Wherein the speech feature is a filter bank based feature of the target speech signal.
The target speech signal is divided into four categories of happiness, sadness, neutrality and anger, and the convolution emotion category, the time delay emotion category, the text emotion category and the final emotion category are any one of the four categories.
In the process of extracting the speech features and feeding them into the convolutional neural network to obtain the convolutional emotion category, the convolutional neural network automatically extracts the emotion features contained in the speech features, a normalized exponential function (softmax) classifier yields the probability that the utterance belongs to each emotion category, and the category with the largest probability is selected as the convolutional emotion category.
In the process of feeding the speech features into the time delay neural network to obtain the time-delay emotion category, the time delay neural network automatically extracts the emotion features contained in the speech features, a normalized exponential function classifier yields the probability of each emotion category, and the category with the largest probability is selected as the time-delay emotion category.
Recognizing the speech as text and feeding it into the pre-trained bidirectional encoder model to obtain the text emotion category comprises the following steps:
recognizing the text corresponding to the target speech signal by speech recognition technology to obtain the speech text;
mapping the characters of the speech text to their corresponding labels to form a label sequence;
feeding the label sequence into the pre-trained bidirectional encoder model and extracting the emotion features contained in the text;
obtaining the probability of each emotion category with a normalized exponential function classifier and selecting the category with the largest probability as the text emotion category.
In the process of obtaining the final emotion category through model fusion, the softmax probability values of the convolutional, time-delay and text models are linearly added, and the emotion corresponding to the maximum fused value is selected as the final emotion category.
In this linear addition, the weights of the different models may be set to be the same or different.
With the speech emotion recognition method based on a neural network of the invention, the target speech signal is first classified into four emotions: happy, sad, neutral, and angry. Filter-bank-based features are extracted from the speech signal and fed separately into a convolutional neural network and a time delay neural network, each of which automatically extracts emotion features; a softmax classifier then yields the probability of each emotion, and the emotion with the largest probability is taken as the speech emotion category. The target speech signal is also recognized as text, which is fed into a pre-trained bidirectional encoder model to obtain the text emotion category. The final emotion category is obtained by fusing the three models, which addresses the problems in the prior art that model fusion and multi-modal emotion recognition are difficult to train and yield little improvement in accuracy.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings in the following description are only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of a speech emotion recognition method based on a neural network according to the present invention.
FIG. 2 is a model architecture diagram of the convolutional neural network of the present invention.
FIG. 3 is a model architecture diagram of the time delay neural network of the present invention.
Fig. 4 is a block diagram of a single layer bi-directional encoder of the present invention.
FIG. 5 is a schematic diagram of the model fusion weighted procedure of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are illustrative and intended to be illustrative of the invention and are not to be construed as limiting the invention.
In this application, corresponding terms may also be referred to by other names: the filter-bank-based feature is the FBank feature, the convolutional neural network is CNN, the time delay neural network is ECAPA-TDNN, the pre-trained bidirectional encoder model is Bert, and the normalized exponential function is Softmax.
Referring to fig. 1, the present invention provides a speech emotion recognition method based on a neural network, including the following steps:
s1: extracting voice features and sending the voice features to a convolutional neural network to obtain convolutional emotion categories;
s2: the voice features are sent to a time delay neural network to obtain time delay emotion types;
s3: recognizing a voice text and sending the voice text into a pre-training model of a bidirectional encoder to obtain the emotion type of the text;
s4: and model fusion to obtain the final emotion classification.
The speech feature is a filterbank-based feature of the target speech signal.
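A minimal numpy sketch of how such filter-bank (FBank) features might be computed is given below; the frame length, hop, FFT size and mel-band count are illustrative assumptions, not values specified by the patent.

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def fbank(signal, sr=16000, frame_len=400, hop=160, n_fft=512, n_mels=40):
    """Log mel filter-bank (FBank) features from a 1-D waveform."""
    # Pre-emphasis boosts the high frequencies of the signal.
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Split into overlapping frames and apply a Hamming window.
    n_frames = 1 + max(0, (len(emphasized) - frame_len) // hop)
    frames = np.stack([emphasized[i * hop: i * hop + frame_len] for i in range(n_frames)])
    frames = frames * np.hamming(frame_len)
    # Power spectrum of each frame.
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular mel filter bank, equally spaced on the mel scale.
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fb = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        for k in range(l, c):
            fb[m - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[m - 1, k] = (r - k) / max(r - c, 1)
    # Log of filter-bank energies (floored to avoid log(0)).
    return np.log(np.maximum(power @ fb.T, 1e-10))

# One second of a synthetic 440 Hz tone stands in for real speech.
t = np.arange(16000) / 16000.0
feat = fbank(np.sin(2 * np.pi * 440.0 * t))
print(feat.shape)  # (98, 40)
```

The resulting (frames × mel bands) matrix is what would be fed to the CNN and to ECAPA-TDNN.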
The emotional characteristics of the target speech signal are divided into four categories of happiness, sadness, neutrality and anger, and the convolution emotion category, the time delay emotion category, the text emotion category and the final emotion category can be any one of the four categories.
In the process of extracting the speech features and feeding them into the convolutional neural network to obtain the convolutional emotion category, the convolutional neural network automatically extracts the emotion features contained in the speech features, a normalized exponential function (softmax) classifier then yields the probability that the utterance belongs to each emotion category, and the category with the largest probability is selected as the convolutional emotion category.
In the process of feeding the speech features into the time delay neural network to obtain the time-delay emotion category, the time delay neural network likewise automatically extracts the emotion features contained in the speech features, a softmax classifier yields the probability of each emotion category, and the category with the largest probability is selected as the time-delay emotion category.
Recognizing the speech as text and feeding it into the pre-trained bidirectional encoder model to obtain the text emotion category comprises the following steps:
recognizing the text corresponding to the target speech signal by speech recognition technology to obtain the speech text;
mapping the characters of the speech text to their corresponding labels to form a label sequence;
feeding the label sequence into the pre-trained bidirectional encoder model and extracting the emotion features contained in the text;
obtaining the probability of each emotion category with a normalized exponential function classifier and selecting the category with the largest probability as the text emotion category.
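The character-to-label mapping step above can be sketched as follows; the mini-vocabulary and the special tokens are hypothetical stand-ins for a real Bert dictionary, which has on the order of twenty thousand entries.

```python
# Hypothetical mini-dictionary mapping characters to labels (token ids).
vocab = {"[PAD]": 0, "[UNK]": 1, "[CLS]": 2, "[SEP]": 3,
         "我": 4, "今": 5, "天": 6, "很": 7, "开": 8, "心": 9}

def text_to_labels(text, vocab, max_len=16):
    """Map each character of the recognized text to its label, pad to max_len."""
    ids = [vocab["[CLS]"]] + [vocab.get(ch, vocab["[UNK]"]) for ch in text] + [vocab["[SEP]"]]
    ids = ids[:max_len]
    return ids + [vocab["[PAD]"]] * (max_len - len(ids))  # pad to fixed length

# "我今天很开心" ("I am very happy today") becomes a label sequence.
seq = text_to_labels("我今天很开心", vocab)
print(seq)  # [2, 4, 5, 6, 7, 8, 9, 3, 0, 0, 0, 0, 0, 0, 0, 0]
```

Characters absent from the dictionary fall back to the `[UNK]` label, so every recognized text yields a valid sequence for the encoder.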
In the process of obtaining the final emotion category through model fusion, the softmax probability values of the convolutional, time-delay and text models are linearly added, and the emotion corresponding to the maximum fused value is selected as the final emotion category.
In this linear addition, the weights of the different models may be set to be the same or different.
Further, referring to fig. 2, the model architecture of the convolutional neural network CNN is as follows:
the speech signal is used as the input of the convolutional neural network based on the characteristics of a filter bank, the model is composed of 5 layers of two-dimensional convolutional neural network blocks, each two-dimensional convolutional neural network block is composed of 3 parts, namely a two-dimensional convolutional neural network, a batch normalization layer and a maximum pooling layer. And then connecting a global average pooling layer. And then connecting the full connection layer, obtaining the probability value belonging to each type of emotion by activating the function to be the normalized index function softmax, and then selecting the emotion corresponding to the maximum probability value as the emotion category of the voice.
The architecture of the time-delay neural network ECAPA-TDNN model is shown in FIG. 3:
the method comprises the steps of using the filter bank-based features of a voice signal as the input of a model, connecting a time delay neural network to the rear of the model, connecting a modified linear unit activation function and a batch standardization network to the rear of the model, connecting a 3-layer feature compression and excitation module, inputting the output of the first and second feature compression and excitation modules and the output of the third feature compression and excitation module into the time delay neural network, connecting the modified linear unit activation function to the model, obtaining a statistical attention pooling vector based on the features of the filter bank through attention pooling calculation, carrying out batch standardization, sending the statistical attention pooling vector to a full-connection network layer, carrying out batch standardization, obtaining probability values belonging to each emotion through an additional angle margin normalization index function, and selecting the maximum class as the emotion class of the voice.
In the process of pre-training the model by Bert:
the text corresponding to the voice is recognized by utilizing a voice recognition technology, and then each word in the text is mapped into a corresponding label according to a dictionary, wherein different words correspond to different labels. The corresponding label sequence of the text is then input to a pre-trained model of a bi-directional encoder (Bert).
The Bert pre-trained model is a stack of multiple bidirectional encoder layers; the structure of a single layer is shown in fig. 4. The input text is mapped to input embeddings, position encoding is added, and the result is fed into the encoder. The features produced by the encoder, combined with the output of the previous layer, are then sent to a fully connected layer and a normalized exponential function softmax layer for classification, which gives the emotion category of the text.
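At the heart of each encoder layer is scaled dot-product self-attention. A minimal numpy sketch follows, using a single head with random weights; real Bert layers add multiple heads, residual connections and layer normalization on top of this.

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Scaled dot-product self-attention over token embeddings X (tokens, dim)."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])   # every token attends to every token
    return softmax(scores, axis=-1) @ V

d = 16                                       # toy embedding size
X = rng.standard_normal((8, d))              # embeddings for 8 tokens
# Sinusoidal positional encoding is added before attention.
pos = np.arange(8)[:, None] / (10000 ** (np.arange(d)[None, :] / d))
X = X + np.where(np.arange(d) % 2 == 0, np.sin(pos), np.cos(pos))
out = self_attention(X, *(rng.standard_normal((d, d)) * 0.1 for _ in range(3)))
print(out.shape)  # (8, 16)
```

Because attention mixes information from every position in both directions, the encoder is bidirectional, which is what lets the model use a word's full sentence context when extracting emotion features.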
Further, in the process of obtaining the final emotion classification through model fusion:
referring to fig. 5, the fusion method: the probability value after the softmax of the weight 1 × CNN + the probability value after the softmax of the weight 2 × ECAPA-TDNN + the probability value after the softmax of the weight 3 × Bert is a new probability value, and then the emotion corresponding to the maximum value is selected as the final emotion category.
Wherein: weight 1+ weight 2+ weight 3 ═ 1
The invention also provides a specific embodiment illustrating the improvement in recognition accuracy.
The related terms mean: accuracy = number of correctly predicted samples / total number of samples;
weighted accuracy (WA): the per-class accuracies weighted by the proportion of each emotion class in the data set;
unweighted accuracy (UA): the plain average of the per-class classification accuracies.
Model 1: filter-bank-based (FBank) features of the input speech with the convolutional neural network CNN; weighted accuracy WA 67%, unweighted accuracy UA 65%.
Model 2: filter-bank-based (FBank) features of the input speech with the time delay neural network ECAPA-TDNN; WA 67%, UA 66%.
Model 3: Bert pre-trained bidirectional encoder model on the recognized text; WA 62%, UA 61%.
With the weights of the different models set equal, the speech emotion recognition result is WA 76%, UA 74%, using:
(1 × softmax probability of model 1 + 1 × softmax probability of model 2 + 1 × softmax probability of model 3) / 3
When the weights are made unequal during fusion, performance improves markedly to WA 81%, UA 80%, using:
(0.5 × softmax probability of model 1 + 2.1 × softmax probability of model 2 + 0.4 × softmax probability of model 3) / 3
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and detail may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (8)
1. A speech emotion recognition method based on a neural network is characterized by comprising the following steps:
extracting voice features and sending the voice features to a convolutional neural network to obtain convolutional emotion categories;
the voice features are sent to a time delay neural network to obtain time delay emotion types;
recognizing a voice text and sending the voice text into a pre-training model of a bidirectional encoder to obtain the emotion type of the text;
performing model fusion to obtain the final emotion category.
2. The method of claim 1, wherein the speech features are filter bank based features of a target speech signal.
3. The neural network-based speech emotion recognition method of claim 2, wherein the emotion characteristics of the target speech signal are classified into four categories of happy, sad, neutral and angry, and the convolutional emotion category, the time-delayed emotion category, the text emotion category and the final emotion category are any one of the four categories.
4. The method for recognizing the speech emotion based on the neural network as claimed in claim 1, wherein in the process of extracting the speech features and sending the speech features to the convolutional neural network to obtain the convolutional emotion categories, the convolutional neural network automatically extracts the emotion features contained in the speech features, then uses a normalized exponential function classifier to obtain the probability value of the emotion features belonging to each category, and selects the emotion features corresponding to the maximum probability value as the convolutional emotion categories.
5. The method for recognizing the speech emotion based on the neural network as claimed in claim 1, wherein in the process of sending the speech features into the time-delay neural network to obtain the time-delay emotion categories, the time-delay neural network automatically extracts the emotion features included in the speech features, then uses a normalized exponential function classifier to obtain the probability value of the emotion features belonging to each category, and selects the emotion features corresponding to the maximum probability value as the time-delay emotion categories.
6. The method for speech emotion recognition based on neural network as claimed in claim 2, wherein the speech text is recognized and fed into the pre-training model of the bi-directional encoder to obtain the text emotion classification, comprising the following steps:
recognizing a text corresponding to the target voice signal by utilizing a voice recognition technology to obtain a voice text;
mapping the characters in the voice text into corresponding labels to form a label sequence;
sending the label sequence into a pre-training model of a bidirectional encoder, and extracting emotional characteristics contained in the text;
and obtaining the probability value of the emotional feature belonging to each type by using a normalized index function classifier, and selecting the emotional feature corresponding to the maximum probability value as the text emotional category.
7. The speech emotion recognition method based on neural network as claimed in claim 1, wherein in the process of obtaining the final emotion category through model fusion, the probability values after normalization index functions of the convolution emotion category, the time delay emotion category and the text emotion category are linearly added, and the emotion feature corresponding to the maximum value is selected as the final emotion category.
8. The method as claimed in claim 7, wherein the linear addition is performed by setting the weight values of different models to be the same or different.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110990439.3A CN113903362B (en) | 2021-08-26 | 2021-08-26 | Voice emotion recognition method based on neural network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110990439.3A CN113903362B (en) | 2021-08-26 | 2021-08-26 | Voice emotion recognition method based on neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113903362A true CN113903362A (en) | 2022-01-07 |
CN113903362B CN113903362B (en) | 2023-07-21 |
Family
ID=79188027
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110990439.3A Active CN113903362B (en) | 2021-08-26 | 2021-08-26 | Voice emotion recognition method based on neural network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113903362B (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106847309A (en) * | 2017-01-09 | 2017-06-13 | 华南理工大学 | A kind of speech-emotion recognition method |
CN107609572A (en) * | 2017-08-15 | 2018-01-19 | 中国科学院自动化研究所 | Multi-modal emotion identification method, system based on neutral net and transfer learning |
CN108564942A (en) * | 2018-04-04 | 2018-09-21 | 南京师范大学 | One kind being based on the adjustable speech-emotion recognition method of susceptibility and system |
CN110489521A (en) * | 2019-07-15 | 2019-11-22 | 北京三快在线科技有限公司 | Text categories detection method, device, electronic equipment and computer-readable medium |
CN110534132A (en) * | 2019-09-23 | 2019-12-03 | 河南工业大学 | A kind of speech-emotion recognition method of the parallel-convolution Recognition with Recurrent Neural Network based on chromatogram characteristic |
CN111081280A (en) * | 2019-12-30 | 2020-04-28 | 苏州思必驰信息科技有限公司 | Text-independent speech emotion recognition method and device and emotion recognition algorithm model generation method |
US20200192927A1 (en) * | 2018-12-18 | 2020-06-18 | Adobe Inc. | Detecting affective characteristics of text with gated convolutional encoder-decoder framework |
CN111583964A (en) * | 2020-04-14 | 2020-08-25 | 台州学院 | Natural speech emotion recognition method based on multi-mode deep feature learning |
CN112700796A (en) * | 2020-12-21 | 2021-04-23 | 北京工业大学 | Voice emotion recognition method based on interactive attention model |
Also Published As
Publication number | Publication date |
---|---|
CN113903362B (en) | 2023-07-21 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110825845B (en) | Hierarchical text classification method based on character and self-attention mechanism and Chinese text classification method | |
CN109241255B (en) | Intention identification method based on deep learning | |
CN112100383B (en) | Meta-knowledge fine tuning method and platform for multitask language model | |
CN112732916B (en) | BERT-based multi-feature fusion fuzzy text classification system | |
CN112307208A (en) | Long text classification method, terminal and computer storage medium | |
CN113223509B (en) | Fuzzy statement identification method and system applied to multi-person mixed scene | |
CN110263164A (en) | A kind of Sentiment orientation analysis method based on Model Fusion | |
CN113836992A (en) | Method for identifying label, method, device and equipment for training label identification model | |
CN111506700A (en) | Fine-grained emotion analysis method based on context perception embedding | |
CN111899766A (en) | Speech emotion recognition method based on optimization fusion of depth features and acoustic features | |
CN112989843B (en) | Intention recognition method, device, computing equipment and storage medium | |
CN113297374A (en) | Text classification method based on BERT and word feature fusion | |
CN112883167A (en) | Text emotion classification model based on hierarchical self-power-generation capsule network | |
CN112364636A (en) | User intention identification system based on dual target coding | |
CN113903362A (en) | Speech emotion recognition method based on neural network | |
CN115859989A (en) | Entity identification method and system based on remote supervision | |
CN114091469B (en) | Network public opinion analysis method based on sample expansion | |
CN116204643A (en) | Cascade label classification method based on multi-task learning knowledge enhancement | |
CN113257225B (en) | Emotional voice synthesis method and system fusing vocabulary and phoneme pronunciation characteristics | |
CN115169363A (en) | Knowledge-fused incremental coding dialogue emotion recognition method | |
CN111814468B (en) | Self-adaptive architecture semantic distribution text understanding method and system | |
CN114121018A (en) | Voice document classification method, system, device and storage medium | |
CN113255360A (en) | Document rating method and device based on hierarchical self-attention network | |
CN113705194A (en) | Extraction method and electronic equipment for short | |
CN113761106A (en) | Self-attention-enhanced bond transaction intention recognition system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||