CN109272993A - Speech category recognition method and apparatus, computer device, and storage medium - Google Patents
Speech category recognition method and apparatus, computer device, and storage medium
- Publication number
- CN109272993A CN109272993A CN201810956681.7A CN201810956681A CN109272993A CN 109272993 A CN109272993 A CN 109272993A CN 201810956681 A CN201810956681 A CN 201810956681A CN 109272993 A CN109272993 A CN 109272993A
- Authority
- CN
- China
- Prior art keywords
- classification
- speech information
- spectrogram
- speech
- speech classification model
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
- G10L15/16—Speech classification or search using artificial neural networks
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
Abstract
This application relates to the field of speech recognition and provides a speech category recognition method and apparatus, a computer device, and a storage medium. The method comprises: obtaining first speech information to be recognized, and converting the first speech information into a first spectrogram; inputting the first spectrogram into a preset speech classification model to obtain a classification result for the first spectrogram, and taking the classification result as the category of the first speech information; wherein the speech classification model is trained on a deep convolutional neural network using spectrograms of known emotion or personality categories. The speech category recognition method and apparatus, computer device, and storage medium provided herein help improve the classification of emotion and personality in speech information.
Description
Technical field
This application relates to the technical field of speech recognition, and in particular to a speech category recognition method and apparatus, a computer device, and a storage medium.
Background art
At present, research on speech emotion and personality recognition focuses mainly on acoustic feature extraction. In existing speech emotion recognition technology, the acoustic features used for recognition include prosodic features, voice quality features, spectrum-related features, and fused features selected from the above. These features are largely confined to either the time domain or the frequency domain alone; for speech signals whose time-domain and frequency-domain characteristics vary jointly, such features tend to lose part of the characteristic information, which degrades emotion and personality recognition. Moreover, during extraction these acoustic features are affected by factors unrelated to emotion or personality, such as the speech content, the speaker, and the environment. When such irrelevant factors are embedded in the extracted acoustic features, they also greatly degrade the emotion and personality classification results.
Summary of the invention
The main purpose of this application is to provide a speech category recognition method and apparatus, a computer device, and a storage medium that improve the emotion and personality classification of speech information.
To achieve the above object, this application provides a speech category recognition method, comprising the following steps:
obtaining first speech information to be recognized, and converting the first speech information into a first spectrogram;
inputting the first spectrogram into a preset speech classification model to obtain a classification result for the first spectrogram, and taking the classification result as the category of the first speech information; wherein the speech classification model is trained on a deep convolutional neural network using spectrograms of known emotion or personality categories, and the category of the first speech information is an emotion category or a personality category.
Further, the step of converting the first speech information into the first spectrogram includes:
converting the first speech information into the corresponding first spectrogram by Fourier analysis.
Further, before the step of obtaining the first speech information to be recognized and converting it into the first spectrogram, the method includes:
inputting the training spectrograms in a training set into the deep convolutional neural network for training, to obtain the speech classification model.
Further, after the step of inputting the training spectrograms in the training set into the deep convolutional neural network for training to obtain the speech classification model, the method includes:
inputting the test spectrograms in a test set into the speech classification model to output corresponding classification results, and verifying whether the classification results match the known categories of the test spectrograms in the test set.
Further, before the step of inputting the spectrograms in the training set into the deep convolutional neural network for training to obtain the speech classification model, the method includes:
converting the second speech information of each known category into corresponding second spectrograms, and allocating the second spectrograms, according to a set ratio, into training spectrograms in the training set and test spectrograms in the test set.
Further, after the step of inputting the first spectrogram into the preset speech classification model to obtain the classification result for the first spectrogram and taking the classification result as the category of the first speech information, the method includes:
matching, according to the category of the first speech information, a preset response message for that category, and pushing the preset response message to a customer service terminal.
Further, after the step of inputting the first spectrogram into the preset speech classification model to obtain the classification result for the first spectrogram and taking the classification result as the category of the first speech information, the method includes:
obtaining the identity information of the source user of the first speech information, establishing a binding relationship between the category of the first speech information and the identity information, and storing the binding in a database.
This application also provides a speech category recognition apparatus, comprising:
a converting unit, configured to obtain first speech information to be recognized and convert the first speech information into a first spectrogram;
a recognition unit, configured to input the first spectrogram into the speech classification model trained on a deep convolutional neural network, and to output the classification result for the first spectrogram as the category of the first speech information; the category of the first speech information is an emotion category or a personality category.
This application also provides a computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of any of the above methods when executing the computer program.
This application also provides a computer storage medium on which a computer program is stored, wherein the computer program implements the steps of any of the above methods when executed by a processor.
The speech category recognition method and apparatus, computer device, and storage medium provided herein have the following beneficial effects: first speech information to be recognized is obtained and converted into a first spectrogram; the first spectrogram is input into a preset speech classification model to obtain a classification result, which is taken as the category of the first speech information; the speech classification model is trained on a deep convolutional neural network using spectrograms of known emotion or personality categories. This helps improve the emotion and personality classification of speech information.
Brief description of the drawings
Fig. 1 is a schematic diagram of the steps of the speech category recognition method in one embodiment of the application;
Fig. 2 is a schematic diagram of the steps of the speech category recognition method in another embodiment of the application;
Fig. 3 is a structural block diagram of the speech category recognition apparatus in one embodiment of the application;
Fig. 4 is a structural block diagram of the speech category recognition apparatus in another embodiment of the application;
Fig. 5 is a structural block diagram of the speech category recognition apparatus in yet another embodiment of the application;
Fig. 6 is a schematic structural block diagram of the computer device of one embodiment of the application.
The realization, functional characteristics, and advantages of the purpose of this application will be further described with reference to the accompanying drawings and embodiments.
Detailed description of the embodiments
In order to make the objects, technical solutions, and advantages of this application clearer, the application is further elaborated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are intended only to explain the application, not to limit it.
Referring to Fig. 1, one embodiment of the application provides a speech category recognition method, comprising the following steps:
Step S1: obtain first speech information to be recognized, and convert the first speech information into a first spectrogram.
In this embodiment, the first spectrogram is a kind of spectral image (two- or three-dimensional) that represents how the frequency spectrum of speech changes over time. The first speech information may be a customer's voice captured in a customer service system, or any speech information in a database whose category needs to be recognized.
Converting the first speech information to be recognized into a spectrogram not only presents the time-domain and frequency-domain characteristics of the speech simultaneously, avoiding the loss of useful characteristic information, but also reflects the language characteristics of the speaker. In this embodiment, the corresponding first spectrogram is obtained by performing Fourier analysis on the first speech information to be recognized; the image produced by Fourier analysis is called a spectrogram.
Step S2: input the first spectrogram into a preset speech classification model to obtain a classification result for the first spectrogram, and take the classification result as the category of the first speech information; wherein the speech classification model is trained on a deep convolutional neural network using spectrograms of known emotion or personality categories, and the category of the first speech information is an emotion category or a personality category.
In this embodiment, the category of the first speech information refers to its emotion category or personality category; in this embodiment, the method is mainly used to classify the emotion of the first speech information. Depending on the training set used to train the deep convolutional neural network, the resulting speech classification model differs, and so do the classification results it outputs. Specifically, if the training set is labelled with emotion categories, the resulting speech classification model recognizes and classifies the emotion of the first speech information, and its output is the emotion category of the first speech information; the emotion categories include multiple emotion classes, such as impatient, irascible, and patient. If the training set is labelled with personality categories, the resulting speech classification model recognizes and classifies the personality of the first speech information, and its output is the personality category of the first speech information; the personality categories include multiple personality classes, such as optimistic and pessimistic.
In this embodiment, the first spectrogram is input into the speech classification model, which is trained on a deep convolutional neural network comprising multiple network layers. Each network layer produces a feature map, i.e. features of the image (which are the speech features of the first speech information). The network layers learn the first spectrogram level by level to extract its features; through the feature extraction of successive layers, the higher the layer, the more semantic, discriminative, and representative the extracted features become, highlighting features relevant to emotion and personality and thus accentuating the differences between different spectrograms. After learning through multiple network layers, classification is finally performed at the last layer (the softmax layer) of the deep convolutional neural network, yielding the category of the first speech information. Conventional speech recognition methods require suitable speech features to be defined or selected by hand; in this embodiment, features are extracted automatically by the deep convolutional neural network and then classified by its last layer. Compared with conventional classification methods, using a deep convolutional neural network as the classifier gives better recognition performance and improves the emotion and personality classification of speech information.
In one embodiment, the step of converting the first speech information into the first spectrogram includes:
converting the first speech information into the corresponding first spectrogram by Fourier analysis. For a segment of first speech information x(t), the signal is first framed to obtain x(m, n), where n is the frame length and m is the frame index; an FFT (Fourier transform) is then applied to obtain X(m, n); the periodogram is computed as Y(m, n) = X(m, n) · X(m, n)'; then 10·log10(Y(m, n)) is taken, m is mapped to a time scale M and n to a frequency scale N, and finally M, N, and 10·log10(Y(m, n)) are plotted as a two-dimensional image to obtain the first spectrogram (it may also be drawn as a three-dimensional figure).
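The framing, Fourier transform, and log-periodogram procedure above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patent's implementation: it uses a naive DFT for clarity (a real system would use an FFT routine), non-overlapping frames, and a small epsilon to avoid log(0); all function and variable names are hypothetical.

```python
import cmath
import math

def spectrogram(signal, frame_len):
    """Frame the signal x(t) into x(m, n), transform each frame, and take
    10*log10 of the periodogram Y(m, n) = X(m, n) * conj(X(m, n))."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    spec = []
    for frame in frames:
        row = []
        for k in range(frame_len // 2 + 1):  # keep non-negative frequencies
            X = sum(x * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n, x in enumerate(frame))
            Y = abs(X) ** 2                  # periodogram Y(m, n)
            row.append(10 * math.log10(Y + 1e-12))  # dB scale; eps avoids log(0)
        spec.append(row)
    return spec  # spec[m][k]: frame index m (time), bin k (frequency)

# Tiny demo: a pure tone concentrates its energy in one frequency bin.
tone = [math.sin(2 * math.pi * 4 * t / 32) for t in range(64)]
S = spectrogram(tone, 32)
peak_bin = max(range(len(S[0])), key=lambda k: S[0][k])
print(peak_bin)  # 4, matching the 4-cycles-per-frame tone
```

The resulting matrix is what would be rendered as the two-dimensional spectrogram image described above.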
Referring to Fig. 2, in one embodiment, before step S1 of obtaining the first speech information to be recognized and converting it into the first spectrogram, the method includes:
Step S101: input the training spectrograms in the training set into the deep convolutional neural network for training, to obtain the speech classification model.
In this step, the deep convolutional neural network is trained in advance to obtain the speech classification model. Specifically, a large number of training spectrograms from the training set, each of known emotion or personality category, are input into the deep convolutional neural network for training, and the network's output is driven to be essentially equal (identical) to the corresponding emotion or personality category, yielding the trained parameters; these parameters are loaded into the deep convolutional neural network to obtain the optimal speech classification model. Thereafter, an unknown piece of first speech information can be converted into a spectrogram and input into the speech classification model, which then outputs the corresponding category of the first speech information.
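As a hedged illustration of the training step, the sketch below trains only the softmax output layer (the patent's last network layer) with per-sample gradient descent, using toy pooled features that stand in for the CNN's feature maps; the convolutional layers themselves are omitted, and all data, labels, and names are invented for the example.

```python
import math
import random

def softmax(z):
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def train(samples, labels, n_classes, epochs=200, lr=0.5):
    """Fit per-class weights W[c][j] and bias b[c] of a softmax layer by
    minimizing cross-entropy, so outputs match the known categories."""
    dim = len(samples[0])
    W = [[0.0] * dim for _ in range(n_classes)]
    b = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            p = softmax([sum(w * v for w, v in zip(W[c], x)) + b[c]
                         for c in range(n_classes)])
            for c in range(n_classes):       # gradient of cross-entropy
                g = p[c] - (1.0 if c == y else 0.0)
                b[c] -= lr * g
                for j in range(dim):
                    W[c][j] -= lr * g * x[j]
    return W, b

def predict(W, b, x):
    scores = [sum(w * v for w, v in zip(W[c], x)) + b[c]
              for c in range(len(W))]
    return max(range(len(scores)), key=lambda c: scores[c])

# Toy pooled features: class 0 ("impatient") has high energy in feature 0,
# class 1 ("patient") in feature 1.
random.seed(0)
X = [[1.0 + random.random(), 0.1] for _ in range(20)] + \
    [[0.1, 1.0 + random.random()] for _ in range(20)]
y = [0] * 20 + [1] * 20
W, b = train(X, y, n_classes=2)
print(predict(W, b, [1.5, 0.0]), predict(W, b, [0.0, 1.5]))  # 0 1
```

In the patent's design these pooled features would instead be produced by the stacked convolutional layers learning the spectrogram level by level.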
In this embodiment, after step S101 of inputting the training spectrograms in the training set into the deep convolutional neural network for training to obtain the speech classification model, the method includes:
Step S102: input the test spectrograms in the test set into the speech classification model to output corresponding classification results, and verify whether the classification results match the known categories of the test spectrograms in the test set.
The test spectrograms in the test set are spectrograms of known category. To verify the classification accuracy of the speech classification model, a large number of test spectrograms from the test set are input into the speech classification model, and it is judged whether the corresponding output classification results match the known categories of the test spectrograms. Finally, to improve the reliability of the verification, the accuracy rate over all test spectrograms can also be computed; if the accuracy rate exceeds a set threshold, the speech classification model is considered to classify accurately.
In one embodiment, before step S101 of inputting the spectrograms in the training set into the deep convolutional neural network for training to obtain the speech classification model, the method includes:
Step S10: convert the second speech information of each known category into corresponding second spectrograms, and allocate the second spectrograms, according to a set ratio, into training spectrograms in the training set and test spectrograms in the test set.
In this embodiment, the step of converting the second speech information of known categories into second spectrograms is similar to step S1; the only difference is the speech information involved. The second speech information in this embodiment is speech information whose category is already known; after conversion into second spectrograms, the resulting spectrograms are likewise data of known emotion or personality category. The second spectrograms are allocated, according to a set ratio, into training spectrograms in the training set and test spectrograms in the test set; for example, allocating the second spectrograms at a 4:1 ratio makes the data volume ratio of the training set to the test set 4:1.
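The 4:1 allocation can be sketched as a simple shuffled split; the function name and the labelled data are assumptions for illustration only.

```python
import random

def split(spectrograms, ratio=4):
    """Shuffle labelled spectrograms and split them train:test = ratio:1."""
    data = list(spectrograms)
    random.shuffle(data)
    cut = len(data) * ratio // (ratio + 1)
    return data[:cut], data[cut:]

random.seed(42)
labelled = [(f"spec_{i}", i % 3) for i in range(100)]  # (spectrogram, category)
train_set, test_set = split(labelled)
print(len(train_set), len(test_set))  # 80 20
```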
In one embodiment, after step S2 of inputting the first spectrogram into the preset speech classification model to obtain the classification result for the first spectrogram and taking the classification result as the category of the first speech information, the method includes:
Step S3a: match, according to the category of the first speech information, the preset response message for that category, and push the preset response message to the customer service terminal.
In one embodiment, the above method is applied in an agent customer service call scenario, where the database stores preset response messages corresponding to different speech information categories. After the customer's first speech information is classified by the above speech category recognition method, the preset response message matching the recognized category is pushed to the customer service terminal, so that the agent can respond according to it. For example, when the customer's emotion appears impatient, the preset response message may prompt the agent to switch topics or end the call.
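A minimal sketch of step S3a's category-to-response matching; the response table, texts, and function names are invented for illustration, and a real system would push the message to the agent's terminal rather than simply return it.

```python
# Hypothetical mapping from recognized emotion category to a canned
# response pushed to the agent's terminal.
RESPONSES = {
    "impatient": "Suggest switching topics or wrapping up the call.",
    "irascible": "Stay calm, apologize, and offer to escalate.",
    "patient":   "Continue with the standard script.",
}

def push_to_agent(category, default="No guidance for this category."):
    """Match the category to its preset response; fall back when unknown."""
    return RESPONSES.get(category, default)

print(push_to_agent("impatient"))
```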
In another embodiment, the above method is applied in an insurance recommendation scenario. After the customer's first speech information is classified by personality category, a corresponding prompt message is sent to the customer service agent according to the customer's personality, and different insurance products are pushed to the customer service terminal according to the different customer personality categories, making it easier for the agent to recommend insurance products to the customer.
In another embodiment, after step S2 of inputting the first spectrogram into the preset speech classification model to obtain the classification result for the first spectrogram and taking the classification result as the category of the first speech information, the method includes:
Step S3b: obtain the identity information of the source user of the first speech information, establish a binding relationship between the category of the first speech information and the identity information, and store the binding in a database. For example, in an insurance recommendation scenario, an insurance agent can retrieve the personality of a customer from the database according to the customer's identity information, so that the agent can tailor the conversation plan to the customer's personality.
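Step S3b's identity-to-category binding can be sketched with an in-memory SQLite table; the schema, table name, and identifiers are assumptions for illustration, not the patent's database design.

```python
import sqlite3

# Bind a caller's identity to the recognized personality category and
# store it; an in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE user_category (
                    user_id  TEXT PRIMARY KEY,
                    category TEXT NOT NULL)""")

def bind(user_id, category):
    """Store (or overwrite) the user -> category binding relationship."""
    conn.execute("INSERT OR REPLACE INTO user_category VALUES (?, ?)",
                 (user_id, category))
    conn.commit()

bind("client_42", "optimistic")
row = conn.execute("SELECT category FROM user_category WHERE user_id = ?",
                   ("client_42",)).fetchone()
print(row[0])  # optimistic
```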
In another embodiment, the category of the first speech information is a personality category, and the above method is applied in a social platform. After step S2 of inputting the first spectrogram into the preset speech classification model to obtain the classification result for the first spectrogram and taking the classification result as the category of the first speech information, the method includes:
matching, in a social database and according to the personality category of the first speech information, a target user whose personality category matches that of the first speech information, and recommending the social information of the target user to the source user of the first speech information; alternatively, in other embodiments, the social information of the source user of the first speech information may be pushed to the target user. The first speech information is sent through the social platform and may carry the social information of the source user (ID information, gender, etc.); the social database stores a large amount of user information and the corresponding personality categories.
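A toy sketch of the personality matching in the social-platform embodiment, where "matching" is simplified to sharing the same personality category; the database contents and field names are invented for illustration.

```python
# Hypothetical social database: each record carries the user's social
# information and the personality category recognized from their speech.
SOCIAL_DB = [
    {"id": "u1", "gender": "F", "personality": "optimistic"},
    {"id": "u2", "gender": "M", "personality": "pessimistic"},
    {"id": "u3", "gender": "F", "personality": "optimistic"},
]

def recommend(source_personality, source_id=None):
    """Return ids of users whose personality category matches the source
    user's, excluding the source user themselves."""
    return [u["id"] for u in SOCIAL_DB
            if u["personality"] == source_personality and u["id"] != source_id]

print(recommend("optimistic", source_id="u1"))  # ['u3']
```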
In conclusion obtaining the first language to be identified for the recognition methods of the voice class provided in the embodiment of the present application
Message breath, and first voice messaging is converted into the first sound spectrograph;First sound spectrograph is input to preset voice
In disaggregated model, to obtain the classification results of first sound spectrograph, and believe the classification results as first voice
The classification of breath;Wherein, time domain, the frequency domain character of voice messaging can be not only presented in the first sound spectrograph simultaneously, and favorable characteristics is avoided to believe
The loss of breath, and can reflect the language feature of speaker in voice messaging;First sound spectrograph is input to Classification of Speech model
In, by the layer-by-layer feature extraction of multiple network layers, more past high level, the feature of extraction more has Semantic, more has and distinguishes
Property and representativeness, highlight feature relevant to emotion, personality, so as to the difference between the different sound spectrographs of protrusion, just
In the effect for promoting emotion in voice messaging, personality classification.
Referring to Fig. 3, one embodiment of the application further provides a speech category recognition apparatus, comprising:
a converting unit 10, configured to obtain first speech information to be recognized and convert the first speech information into a first spectrogram.
In this embodiment, the first spectrogram is a kind of spectral image (two- or three-dimensional) that represents how the frequency spectrum of speech changes over time. The first speech information may be a customer's voice captured in a customer service system, or any speech information in a database whose category needs to be recognized.
In this embodiment, the converting unit 10 converts the first speech information to be recognized into the first spectrogram, which not only presents the time-domain and frequency-domain characteristics of the speech simultaneously, avoiding the loss of useful characteristic information, but also reflects the language characteristics of the speaker. The corresponding first spectrogram is obtained by performing Fourier analysis on the first speech information to be recognized; the image produced by Fourier analysis is called a spectrogram.
a recognition unit 20, configured to input the first spectrogram into the preset speech classification model to obtain the classification result for the first spectrogram, and to take the classification result as the category of the first speech information; wherein the speech classification model is trained on a deep convolutional neural network using spectrograms of known emotion or personality categories, and the category of the first speech information is an emotion category or a personality category.
In this embodiment, the category of the first speech information refers to its emotion category or personality category; in this embodiment, the apparatus is mainly used to classify the emotion of the first speech information. Depending on the training set used to train the deep convolutional neural network, the resulting speech classification model differs, and so do the classification results it outputs. Specifically, if the training set is labelled with emotion categories, the resulting speech classification model recognizes and classifies the emotion of the first speech information, and its output is the emotion category of the first speech information; the emotion categories include multiple emotion classes, such as impatient, irascible, and patient. If the training set is labelled with personality categories, the resulting speech classification model recognizes and classifies the personality of the first speech information, and its output is the personality category of the first speech information; the personality categories include multiple personality classes, such as optimistic and pessimistic.
In this embodiment, recognition unit 20 inputs the first spectrogram into the speech classification model, which is obtained by training a deep convolutional neural network. The network comprises multiple layers, each of which produces feature maps, i.e. features of the image (here, speech features of the first voice information). The layers learn the first spectrogram level by level to extract its features; through this layer-by-layer feature extraction, features at higher layers become more semantic, more discriminative, and more representative, highlighting features related to emotion and personality and thus the differences between different spectrograms. After learning through multiple layers, classification is performed at the last layer (a softmax layer) of the deep convolutional neural network, yielding the classification of the first voice information. Conventional speech recognition methods require manually defining or selecting suitable speech features; in this embodiment, features are extracted automatically by the deep convolutional neural network and then classified by its last layer. Using a deep convolutional neural network as the classifier gives better recognition performance than conventional classification methods and improves the classification of emotion and personality in voice information.
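As a sketch of the pipeline just described (convolutional feature maps, ReLU, pooling, then a softmax output layer), the minimal NumPy forward pass below uses random filters, a random 32x32 array standing in for the first spectrogram, and three illustrative emotion classes; none of these values come from the patent's trained model.

```python
import numpy as np

def conv2d(img, kernel):
    """Valid 2-D cross-correlation of a single-channel image with one filter."""
    kh, kw = kernel.shape
    h, w = img.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    return np.maximum(x, 0)

def max_pool(x, k=2):
    h, w = x.shape
    return x[:h - h % k, :w - w % k].reshape(h // k, k, w // k, k).max(axis=(1, 3))

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(0)
spect = rng.standard_normal((32, 32))      # stand-in 32x32 first spectrogram
filters = rng.standard_normal((4, 3, 3))   # 4 filters (random here; learned during training)
# One convolutional layer: feature maps = conv -> ReLU -> 2x2 max pooling
fmaps = np.stack([max_pool(relu(conv2d(spect, f))) for f in filters])
# Flatten and classify with a softmax output layer over 3 illustrative emotion classes
W = rng.standard_normal((3, fmaps.size))
probs = softmax(W @ fmaps.ravel())
print(fmaps.shape, probs.shape)  # (4, 15, 15) (3,)
```

A trained network would stack several such layers, which is what gives the higher layers their more semantic, more discriminative features.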
In one embodiment, converting unit 10 is specifically configured to convert the first voice information into the corresponding first spectrogram by Fourier analysis. For a segment of first voice information x(t), it is first framed into x(m, n), where n is the frame length and m is the frame index; an FFT (Fourier transform) is then applied to obtain X(m, n); the periodogram is Y(m, n) = X(m, n) · X(m, n)′, the product with the conjugate; then 10·log10(Y(m, n)) is taken, m is mapped to the time scale M and n to the frequency scale N, and finally M, N, and 10·log10(Y(m, n)) are drawn as a two-dimensional image to obtain the first spectrogram (it can also be drawn as a three-dimensional figure).
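The steps just described (framing, FFT, periodogram, 10·log10) can be sketched in NumPy as follows; the sample rate, frame length, and hop size are illustrative choices, and the Hann window is a common refinement the text does not mention.

```python
import numpy as np

def sound_spectrogram_db(x, frame_len=256, hop=128):
    """Framing -> FFT -> periodogram Y(m,n) = X(m,n) * conj(X(m,n)) -> 10*log10."""
    n_frames = 1 + (len(x) - frame_len) // hop
    # x(m, n): m frames, each of length n = frame_len
    frames = np.stack([x[m * hop: m * hop + frame_len] for m in range(n_frames)])
    frames = frames * np.hanning(frame_len)  # windowing (assumption, not in the text)
    X = np.fft.rfft(frames, axis=1)          # X(m, n), non-negative frequency half
    Y = (X * np.conj(X)).real                # periodogram Y(m, n)
    return 10 * np.log10(Y + 1e-12)          # small floor avoids log(0)

sr = 8000                                    # illustrative sample rate
t = np.arange(sr) / sr
x = np.sin(2 * np.pi * 440 * t)              # one second of a 440 Hz tone
S = sound_spectrogram_db(x)
peak_bin = int(S.mean(axis=0).argmax())      # 440 Hz / (8000/256 Hz per bin) ≈ bin 14
print(S.shape, peak_bin)                     # (61, 129) 14
```

Plotting `S` with time frames on one axis and frequency bins on the other yields the two-dimensional spectrogram image described above.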
Referring to Fig. 4, in one embodiment, the identification device of the voice class further includes:
Training unit 101, configured to input the training spectrograms in the training set into the deep convolutional neural network for training, so as to obtain the speech classification model.
In this embodiment, training unit 101 trains the deep convolutional neural network in advance to obtain the speech classification model. Specifically, training unit 101 uses a large number of training spectrograms in the training set, each of a known emotional or personality category, and inputs them into the deep convolutional neural network for training so that its output is substantially equal (that is, identical) to the corresponding emotional or personality category, obtaining the corresponding training parameters; these training parameters are loaded into the deep convolutional neural network to obtain the optimal speech classification model. An unknown first voice information can then be converted to a spectrogram and input into the speech classification model, which outputs the classification corresponding to that voice information.
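A minimal stand-in for this training procedure, assuming a toy two-class dataset and a single softmax layer in place of the full deep convolutional network (a real system would learn all convolutional filters as well):

```python
import numpy as np

rng = np.random.default_rng(1)
# Toy stand-in training set: 16-dim flattened "training spectrograms" of two
# known emotional categories, drawn from well-separated distributions.
X = np.vstack([rng.normal(0.0, 1.0, (20, 16)), rng.normal(2.0, 1.0, (20, 16))])
y = np.array([0] * 20 + [1] * 20)

def softmax(z):
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

W = np.zeros((16, 2))                        # "training parameters" to be learned
for _ in range(200):                         # gradient descent on cross-entropy loss
    grad = X.T @ (softmax(X @ W) - np.eye(2)[y]) / len(y)
    W -= 0.5 * grad

# After training, the outputs should (substantially) equal the known categories
acc = (softmax(X @ W).argmax(axis=1) == y).mean()
print(acc)
```

The loop mirrors the text's goal: adjust parameters until the model's output matches the labeled category for each training spectrogram.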
In one embodiment, the identification device of the voice class further includes:
Test unit 102, configured to input the test spectrograms in the test set into the speech classification model to output the corresponding classification results, and to verify whether each classification result is identical to the known classification of the test spectrogram in the test set.
The test spectrograms in the test set are spectrograms of known classification. To verify the classification accuracy of the speech classification model, test unit 102 inputs a large number of test spectrograms from the test set into the model and judges whether each output classification result matches the known classification of the corresponding test spectrogram; finally, the accuracy rate over all test spectrograms can be counted, and when the accuracy rate exceeds a set threshold, the speech classification model is considered to classify accurately.
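This verification step can be sketched as a simple accuracy count; the test set, model, and threshold below are illustrative stand-ins, not values from the patent.

```python
def classification_accuracy(model, test_set):
    """model: spectrogram -> predicted class; test_set: (spectrogram, known class) pairs."""
    correct = sum(model(s) == c for s, c in test_set)
    return correct / len(test_set)

# Illustrative labeled test set and a deliberately crude constant model
test_set = [("spect_1", "impatient"), ("spect_2", "patient"), ("spect_3", "impatient")]
model = lambda s: "impatient"
acc = classification_accuracy(model, test_set)
threshold = 0.6                              # the text's "setting value"
print(acc, acc > threshold)
```

If the accuracy falls below the threshold, the model would be retrained rather than deployed.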
Referring to Fig. 5, in one embodiment, the identification device of the voice class further includes:
Allocation unit 103, configured to convert the second voice information of each known classification into a corresponding second spectrogram, and to assign the second spectrograms, according to a set ratio, into training spectrograms in the training set and test spectrograms in the test set.
In this embodiment, allocation unit 103 converts each second voice information of known classification one-to-one into a second spectrogram; the process is similar to the conversion performed by converting unit 10, differing only in the voice information it operates on. The second voice information in this embodiment is voice information of known classification, so the resulting second spectrograms are likewise data of known emotional or personality category. The second spectrograms are assigned, according to a set ratio, into training spectrograms in the training set and test spectrograms in the test set; for example, assigning them at a 4:1 ratio makes the data volume ratio of the training set to the test set 4:1.
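A minimal sketch of this allocation, assuming a deterministic shuffle and the 4:1 example ratio; the data items and seed are placeholders.

```python
import random

def assign_spectrograms(items, ratio=4, seed=42):
    """Assign known-class second spectrograms to training/test sets at ratio:1."""
    items = list(items)
    random.Random(seed).shuffle(items)       # deterministic shuffle for the sketch
    n_test = len(items) // (ratio + 1)
    return items[n_test:], items[:n_test]    # (training spectrograms, test spectrograms)

data = [(f"spect_{i}", "optimistic" if i % 2 else "pessimistic") for i in range(100)]
train_set, test_set = assign_spectrograms(data)
print(len(train_set), len(test_set))         # 80 20, i.e. a 4:1 data volume ratio
```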
In one embodiment, the identification device of the voice class further includes:
First matching unit, configured to match, according to the classification of the first voice information, the preset response information corresponding to that classification, and to push the preset response information to the customer service terminal.
In one embodiment, the identification device of the voice class is applied in an agent customer-service call scenario, and the database is preloaded with preset response information corresponding to different voice information classifications. After the client's first voice information is classified by the above recognition method, the preset response information corresponding to its classification is matched and pushed to the customer service terminal, so that the agent can respond accordingly; for example, if the client's emotion is impatience, the preset response information may prompt the agent to switch topics or end the call.
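A dictionary lookup is one minimal way to sketch this matching; the category names and response texts below are invented for illustration, not taken from the patent's database.

```python
# Hypothetical mapping from voice-information classifications to the preset
# response information stored in the database; all texts are illustrative.
PRESET_RESPONSES = {
    "impatient": "Prompt the agent to switch topics or politely end the call.",
    "irritable": "Prompt the agent to apologize and slow the pace.",
    "patient": "Prompt the agent to continue with the standard script.",
}

def push_preset_response(category):
    """Match the preset response for a classification; a real system would push
    it to the customer service terminal, here we simply return it."""
    return PRESET_RESPONSES.get(category, "No preset response for this classification.")

print(push_preset_response("impatient"))
```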
In another embodiment, the device is applied in an insurance recommendation scenario. After the client's first voice information is classified by personality category using the above method, a corresponding prompt is sent to customer service according to the client's personality, and different insurance products are pushed to the customer service terminal according to different client personality categories, helping customer service recommend insurance products to clients.
In another embodiment, the identification device of the voice class further includes:
Storage unit, configured to obtain the identity information of the source user of the first voice information, establish a binding relationship between the classification of the first voice information and the identity information, and store it in the database. For example, in the insurance recommendation scenario, an insurance agent can retrieve the corresponding client's personality from the database according to the client's identity information, so that the agent can tailor the conversation plan to the client's personality.
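A sketch of this binding-and-storage step using an in-memory SQLite database; the table name, columns, and identifiers are illustrative assumptions.

```python
import sqlite3

# Bind the first voice information's classification to the source user's
# identity information and store it; schema and values are illustrative.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE voice_class (user_id TEXT PRIMARY KEY, category TEXT)")
conn.execute("INSERT INTO voice_class VALUES (?, ?)", ("user_001", "optimistic"))
conn.commit()

# Later, e.g. an insurance agent looks up the client's personality by identity
row = conn.execute(
    "SELECT category FROM voice_class WHERE user_id = ?", ("user_001",)
).fetchone()
print(row[0])
```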
In another embodiment, the classification of the first voice information is a personality category, and the identification device of the voice class further includes:
Second matching unit, configured to match, in a social database according to the personality category of the first voice information, target users whose personality category matches that of the first voice information, and to recommend the social information of the target users to the source user of the first voice information. Alternatively, in other embodiments, the social information of the source user of the first voice information may be pushed to the target users. Here, the first voice information is sent through a social platform; the social information of the source user (ID information, gender, etc.) can be carried when the first voice information is sent, and the social database stores a large amount of user information together with the corresponding personality categories.
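A minimal sketch of this matching step, assuming the social database is a simple mapping from user ID to personality category; all names are invented.

```python
# Hypothetical social database mapping user IDs to known personality categories.
SOCIAL_DB = {"alice": "optimistic", "bob": "pessimistic", "carol": "optimistic"}

def match_target_users(personality, db=SOCIAL_DB):
    """Return users whose personality category matches the first voice
    information's personality category."""
    return sorted(user for user, p in db.items() if p == personality)

print(match_target_users("optimistic"))   # ['alice', 'carol']
```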
In conclusion for the identification device of the voice class provided in the embodiment of the present application, converting unit 10 is obtained wait know
Other first voice messaging, and first voice messaging is converted into the first sound spectrograph;Recognition unit 20 is by first language
Spectrogram is input in preset Classification of Speech model, to obtain the classification results of first sound spectrograph, and the classification is tied
Classification of the fruit as first voice messaging;Wherein, time domain, the frequency domain of voice messaging can be not only presented in the first sound spectrograph simultaneously
Feature, avoids the loss of favorable characteristics information, and can reflect the language feature of speaker in voice messaging;First sound spectrograph
It is input in Classification of Speech model, by the layer-by-layer feature extraction of multiple network layers, the feature of more past high level, extraction more has language
Justice more has distinction and representativeness, feature relevant to emotion, personality is highlighted, so as to the different languages of protrusion
Difference between spectrogram, convenient for promoting the effect of emotion in voice messaging, personality classification.
Referring to Fig. 6, an embodiment of the present application also provides a computer device, which may be a server and whose internal structure may be as shown in Fig. 6. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device provides computing and control capability. The memory of the computer device includes a non-volatile storage medium and an internal memory; the non-volatile storage medium stores an operating system, a computer program, and a database, while the internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The database of the computer device stores data such as the speech classification model. The network interface of the computer device communicates with external terminals through a network connection. The computer program, when executed by the processor, implements a recognition method of the voice class.
The processor executes the steps of the recognition method of the voice class:
obtaining the first voice information to be recognized, and converting the first voice information into a first spectrogram;
inputting the first spectrogram into a preset speech classification model to obtain the classification result of the first spectrogram, and taking the classification result as the classification of the first voice information; wherein the speech classification model is obtained by training a deep convolutional neural network on spectrograms of known emotional or personality categories, and the classification of the first voice information is an emotional category or a personality category.
In one embodiment, the processor's step of converting the first voice information into a first spectrogram includes:
converting the first voice information into the corresponding first spectrogram by Fourier analysis.
In one embodiment, before the processor's step of obtaining the first voice information to be recognized and converting it into a first spectrogram, the method includes:
inputting the training spectrograms in the training set into the deep convolutional neural network for training, to obtain the speech classification model.
In one embodiment, after the processor's step of inputting the training spectrograms in the training set into the deep convolutional neural network for training to obtain the speech classification model, the method includes:
inputting the test spectrograms in the test set into the speech classification model to output the corresponding classification results, and verifying whether the classification results are identical to the known classifications of the test spectrograms in the test set.
In one embodiment, before the processor's step of inputting the spectrograms in the training set into the deep convolutional neural network for training to obtain the speech classification model, the method includes:
converting the second voice information of each known classification into corresponding second spectrograms, and assigning the second spectrograms, according to a set ratio, into training spectrograms in the training set and test spectrograms in the test set.
In one embodiment, after the processor's step of inputting the first spectrogram into the preset speech classification model to obtain the classification result of the first spectrogram and taking the classification result as the classification of the first voice information, the method includes:
matching, according to the classification of the first voice information, the preset response information of the corresponding classification, and pushing the preset response information to the customer service terminal.
In one embodiment, after the processor's step of inputting the first spectrogram into the preset speech classification model to obtain the classification result of the first spectrogram and taking the classification result as the classification of the first voice information, the method includes:
obtaining the identity information of the source user of the first voice information, establishing a binding relationship between the classification of the first voice information and the identity information, and storing it in the database.
Those skilled in the art will understand that the structure shown in Fig. 6 is merely a block diagram of the part of the structure relevant to the solution of the present application and does not limit the computer devices to which the solution of the present application is applied.
An embodiment of the present application also provides a computer storage medium on which a computer program is stored; when executed by a processor, the computer program implements a recognition method of the voice class, specifically:
obtaining the first voice information to be recognized, and converting the first voice information into a first spectrogram;
inputting the first spectrogram into a preset speech classification model to obtain the classification result of the first spectrogram, and taking the classification result as the classification of the first voice information; wherein the speech classification model is obtained by training a deep convolutional neural network on spectrograms of known emotional or personality categories, and the classification of the first voice information is an emotional category or a personality category.
In one embodiment, the processor's step of converting the first voice information into a first spectrogram includes:
converting the first voice information into the corresponding first spectrogram by Fourier analysis.
In one embodiment, before the processor's step of obtaining the first voice information to be recognized and converting it into a first spectrogram, the method includes:
inputting the training spectrograms in the training set into the deep convolutional neural network for training, to obtain the speech classification model.
In one embodiment, after the processor's step of inputting the training spectrograms in the training set into the deep convolutional neural network for training to obtain the speech classification model, the method includes:
inputting the test spectrograms in the test set into the speech classification model to output the corresponding classification results, and verifying whether the classification results are identical to the known classifications of the test spectrograms in the test set.
In one embodiment, before the processor's step of inputting the spectrograms in the training set into the deep convolutional neural network for training to obtain the speech classification model, the method includes:
converting the second voice information of each known classification into corresponding second spectrograms, and assigning the second spectrograms, according to a set ratio, into training spectrograms in the training set and test spectrograms in the test set.
In one embodiment, after the processor's step of inputting the first spectrogram into the preset speech classification model to obtain the classification result of the first spectrogram and taking the classification result as the classification of the first voice information, the method includes:
matching, according to the classification of the first voice information, the preset response information of the corresponding classification, and pushing the preset response information to the customer service terminal.
In one embodiment, after the processor's step of inputting the first spectrogram into the preset speech classification model to obtain the classification result of the first spectrogram and taking the classification result as the classification of the first voice information, the method includes:
obtaining the identity information of the source user of the first voice information, establishing a binding relationship between the classification of the first voice information and the identity information, and storing it in the database.
In conclusion for the recognition methods of the voice class provided in the embodiment of the present application, device, computer equipment and depositing
Storage media obtains the first voice messaging to be identified, and first voice messaging is converted to the first sound spectrograph;By described
One sound spectrograph is input in preset Classification of Speech model, to obtain the classification results of first sound spectrograph, and will be described point
Classification of the class result as first voice messaging;Wherein, the first sound spectrograph can not only present simultaneously voice messaging time domain,
Frequency domain character, avoids the loss of favorable characteristics information, and can reflect the language feature of speaker in voice messaging;First language
Spectrogram is input in Classification of Speech model, and by the layer-by-layer feature extraction of multiple network layers, the feature of more past high level, extraction more has
There is Semantic, more there is distinction and representativeness, highlight feature relevant to emotion, personality, so as to protrude not
With the difference between sound spectrograph, convenient for promoting the effect of emotion in voice messaging, personality classification.
Those of ordinary skill in the art will understand that all or part of the processes in the above embodiment methods can be completed by a computer program instructing the relevant hardware; the computer program may be stored in a non-volatile computer-readable storage medium, and when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or other media used herein and in the embodiments may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It should be noted that, herein, the terms "include" and "comprise", or any other variant thereof, are intended to cover non-exclusive inclusion, so that a process, device, article, or method that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, device, article, or method. In the absence of further restrictions, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, device, article, or method that includes that element.
The above are merely preferred embodiments of the present application and are not intended to limit its patent scope; any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, applied directly or indirectly in other related technical fields, is likewise included in the patent protection scope of the present application.
Claims (10)
1. A recognition method of a voice class, characterized by comprising the following steps:
obtaining first voice information to be recognized, and converting the first voice information into a first spectrogram;
inputting the first spectrogram into a preset speech classification model to obtain a classification result of the first spectrogram, and taking the classification result as the classification of the first voice information; wherein the speech classification model is obtained by training a deep convolutional neural network on spectrograms of known emotional categories or personality categories, and the classification of the first voice information is an emotional category or a personality category.
2. The recognition method of the voice class according to claim 1, characterized in that the step of converting the first voice information into a first spectrogram comprises:
converting the first voice information into the corresponding first spectrogram by Fourier analysis.
3. The recognition method of the voice class according to claim 1, characterized in that, before the step of obtaining the first voice information to be recognized and converting the first voice information into a first spectrogram, the method comprises:
inputting training spectrograms in a training set into the deep convolutional neural network for training, to obtain the speech classification model.
4. The recognition method of the voice class according to claim 3, characterized in that, after the step of inputting the training spectrograms in the training set into the deep convolutional neural network for training to obtain the speech classification model, the method comprises:
inputting test spectrograms in a test set into the speech classification model to output corresponding classification results, and verifying whether the classification results are identical to the known classifications of the test spectrograms in the test set.
5. The recognition method of the voice class according to claim 3, characterized in that, before the step of inputting the spectrograms in the training set into the deep convolutional neural network for training to obtain the speech classification model, the method comprises:
converting second voice information of each known classification into corresponding second spectrograms, and assigning the second spectrograms, according to a set ratio, into training spectrograms in the training set and test spectrograms in the test set.
6. The recognition method of the voice class according to claim 1, characterized in that, after the step of inputting the first spectrogram into the preset speech classification model to obtain the classification result of the first spectrogram and taking the classification result as the classification of the first voice information, the method comprises:
matching, according to the classification of the first voice information, preset response information of the corresponding classification, and pushing the preset response information to a customer service terminal.
7. The recognition method of the voice class according to claim 1, characterized in that, after the step of inputting the first spectrogram into the preset speech classification model to obtain the classification result of the first spectrogram and taking the classification result as the classification of the first voice information, the method comprises:
obtaining identity information of the source user of the first voice information, establishing a binding relationship between the classification of the first voice information and the identity information, and storing it in a database.
8. An identification device of a voice class, characterized by comprising:
a converting unit, configured to obtain first voice information to be recognized and convert the first voice information into a first spectrogram;
a recognition unit, configured to input the first spectrogram into a preset speech classification model to obtain a classification result of the first spectrogram, and to take the classification result as the classification of the first voice information; wherein the speech classification model is obtained by training a deep convolutional neural network on spectrograms of known emotional categories or personality categories, and the classification of the first voice information is an emotional category or a personality category.
9. A computer device, comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 7.
10. A computer storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810956681.7A CN109272993A (en) | 2018-08-21 | 2018-08-21 | Recognition methods, device, computer equipment and the storage medium of voice class |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109272993A true CN109272993A (en) | 2019-01-25 |
Family
ID=65153984
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109272993A (en) |
2018-08-21: application CN201810956681.7A filed (CN); published as CN109272993A; legal status: Pending.
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105047194A (en) * | 2015-07-28 | 2015-11-11 | 东南大学 | Self-learning spectrogram feature extraction method for speech emotion recognition |
CN106847309A (en) * | 2017-01-09 | 2017-06-13 | 华南理工大学 | A kind of speech-emotion recognition method |
CN108364662A (en) * | 2017-12-29 | 2018-08-03 | 中国科学院自动化研究所 | Based on the pairs of speech-emotion recognition method and system for differentiating task |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110047516A (en) * | 2019-03-12 | 2019-07-23 | 天津大学 | A kind of speech-emotion recognition method based on gender perception |
CN110188235A (en) * | 2019-05-05 | 2019-08-30 | 平安科技(深圳)有限公司 | Music style classification method, device, computer equipment and storage medium |
CN110397131A (en) * | 2019-07-01 | 2019-11-01 | 厦门瑞尔特卫浴科技股份有限公司 | A kind of automatic control system and method for closet flushing amount |
CN110349564A (en) * | 2019-07-22 | 2019-10-18 | 苏州思必驰信息科技有限公司 | Across the language voice recognition methods of one kind and device |
CN110349564B (en) * | 2019-07-22 | 2021-09-24 | 思必驰科技股份有限公司 | Cross-language voice recognition method and device |
WO2021012495A1 (en) * | 2019-07-23 | 2021-01-28 | 平安科技(深圳)有限公司 | Method and device for verifying speech recognition result, computer apparatus, and medium |
CN110473566A (en) * | 2019-07-25 | 2019-11-19 | 深圳壹账通智能科技有限公司 | Audio separation method, device, electronic equipment and computer readable storage medium |
CN110570844A (en) * | 2019-08-15 | 2019-12-13 | 平安科技(深圳)有限公司 | Speech emotion recognition method and device and computer readable storage medium |
CN110570844B (en) * | 2019-08-15 | 2023-05-05 | 平安科技(深圳)有限公司 | Speech emotion recognition method, device and computer readable storage medium |
CN110600015A (en) * | 2019-09-18 | 2019-12-20 | 北京声智科技有限公司 | Voice dense classification method and related device |
CN110992941A (en) * | 2019-10-22 | 2020-04-10 | 国网天津静海供电有限公司 | Power grid dispatching voice recognition method and device based on spectrogram |
CN111048071A (en) * | 2019-11-11 | 2020-04-21 | 北京海益同展信息科技有限公司 | Voice data processing method and device, computer equipment and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109272993A (en) | Recognition methods, device, computer equipment and the storage medium of voice class | |
CN109451188B (en) | Method and device for differential self-help response, computer equipment and storage medium | |
CN110675288B (en) | Intelligent auxiliary judgment method, device, computer equipment and storage medium | |
CN110392281B (en) | Video synthesis method and device, computer equipment and storage medium | |
CN111883140B (en) | Authentication method, device, equipment and medium based on knowledge graph and voiceprint recognition | |
CN106503236A (en) | Question classification method and device based on artificial intelligence | |
CN109388701A (en) | Minutes generation method, device, equipment and computer storage medium | |
CN106407178A (en) | Session abstract generation method and device | |
CN109256136A (en) | A kind of audio recognition method and device | |
CN111182162B (en) | Telephone quality inspection method, device, equipment and storage medium based on artificial intelligence | |
CN110704571B (en) | Court trial auxiliary processing method, trial auxiliary processing device, equipment and medium | |
CN109190124B (en) | Method and apparatus for participle | |
CN108281139A (en) | Speech transcription method and apparatus, robot | |
CN107256428A (en) | Data processing method, data processing equipment, storage device and the network equipment | |
CN111724908A (en) | Epidemic situation investigation method and device based on robot process automation RPA | |
CN110246503A (en) | Blacklist vocal print base construction method, device, computer equipment and storage medium | |
CN113239147A (en) | Intelligent conversation method, system and medium based on graph neural network | |
CN109933671A (en) | Construct method, apparatus, computer equipment and the storage medium of personal knowledge map | |
CN110427455A (en) | A kind of customer service method, apparatus and storage medium | |
CN110209841A (en) | A kind of fraud analysis method and device based on swindle case merit | |
CN110556098B (en) | Voice recognition result testing method and device, computer equipment and medium | |
CN109800309A (en) | Classroom Discourse genre classification methods and device | |
CN113096634A (en) | Speech synthesis method, apparatus, server and storage medium | |
CN111724909A (en) | Epidemic situation investigation method and device combining RPA and AI | |
CN109473101A (en) | A kind of speech chip structures and methods of the random question and answer of differentiation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 2019-01-25