CN110322895A - Speech evaluating method and computer storage medium
- Publication number
- CN110322895A (application number CN201810259445.XA)
- Authority
- CN
- China
- Prior art keywords
- evaluated
- vector
- data
- similarity
- voice data
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/30: Semantic analysis
- G10L15/06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
- G10L15/22: Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L15/26: Speech to text systems
- G10L25/27: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
- G10L25/51: Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Multimedia (AREA)
- Human Computer Interaction (AREA)
- Acoustics & Sound (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- Signal Processing (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
Embodiments of the invention provide a speech evaluation method and a computer storage medium. The speech evaluation method includes: generating a vector to be evaluated for voice data to be evaluated, according to the text data corresponding to that voice data; calculating the similarity between a preset standard content vector and the vector to be evaluated according to the preset standard content vector, the vector to be evaluated, and a similarity calculation model, wherein the similarity calculation model calculates the similarity from the cosine of the preset standard content vector and the vector to be evaluated and makes the calculated similarity a value greater than 0; and generating and outputting evaluation result data for the voice data to be evaluated according to the similarity and a preset speech assessment rule. The speech evaluation method makes it possible to assess learning outcomes in expression training courses.
Description
Technical field
Embodiments of the present invention relate to the field of computer technology, and in particular to a speech evaluation method and a computer storage medium.
Background
With the development of computer and Internet technology, learning and teaching by means of computers and the Internet has become a trend. With computers and the Internet, students can learn anytime and anywhere, without being limited by environmental factors such as venue or class size. In the education of young children in particular, computer- and Internet-based teaching fills a gap in existing early childhood education.
Take expression training for children aged 3 to 8 conducted over computers and the Internet as an example. The existing expression training process is as follows: a computer or mobile terminal device presents a set of interesting pictures to the student, and the student practises expression by describing the content of the pictures.
The existing expression training process has no feedback or assessment mechanism. After completing training, students cannot tell how much they have improved and have little awareness of their own learning progress, which does little to motivate them to keep learning and improving.
Summary of the invention
In view of this, embodiments of the present invention provide a speech evaluation method and a computer storage medium, to solve the problem that students' learning outcomes cannot be assessed in existing expression training courses.
According to a first aspect of the embodiments of the present invention, a speech evaluation method is provided. The method includes: generating a vector to be evaluated for voice data to be evaluated, according to the text data corresponding to the voice data to be evaluated; calculating the similarity between a preset standard content vector and the vector to be evaluated according to the preset standard content vector, the vector to be evaluated, and a similarity calculation model, wherein the similarity calculation model calculates the similarity from the cosine of the preset standard content vector and the vector to be evaluated and makes the calculated similarity a value greater than 0; and generating and outputting evaluation result data for the voice data to be evaluated according to the similarity and a preset speech assessment rule.
According to a second aspect of the embodiments of the present invention, a computer storage medium is provided. The computer storage medium stores: instructions for generating a vector to be evaluated for voice data to be evaluated, according to the text data corresponding to the voice data to be evaluated; instructions for calculating the similarity between a preset standard content vector and the vector to be evaluated according to the preset standard content vector, the vector to be evaluated, and a similarity calculation model, wherein the similarity calculation model calculates the similarity from the cosine of the preset standard content vector and the vector to be evaluated and makes the calculated similarity a value greater than 0; and instructions for generating and outputting evaluation result data for the voice data to be evaluated according to the similarity and a preset speech assessment rule.
With the scheme provided by the embodiments of the present invention, the speech evaluation method can be applied in expression training courses. The voice input by the user is evaluated: for example, the text data corresponding to the voice data to be evaluated is converted into a vector to be evaluated, and the similarity between that vector and the preset standard content vector is assessed to obtain evaluation result data. The evaluation result data characterize the user's ability to express themselves, so that users can clearly and easily understand their own level of expression, which motivates them to keep learning and improving.
When evaluating voice data to be evaluated, the speech evaluation method converts the corresponding text data into a vector to be evaluated, uses the similarity calculation model to calculate the similarity from the cosine between the vector to be evaluated and the preset standard content vector, and makes the calculated similarity a value greater than 0. This solves the low evaluation accuracy of existing similarity calculation methods when they are used for speech evaluation. It also solves the problem of the similarity calculation producing negative values, so that the speech evaluation result is more accurate.
Brief description of the drawings
To explain the embodiments of the present invention or the technical solutions of the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show only some of the embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them.
Fig. 1 is a flowchart of the steps of a speech evaluation method according to Embodiment one of the present invention;
Fig. 2 is a flowchart of the steps of a speech evaluation method according to Embodiment two of the present invention;
Fig. 3 is a schematic structural diagram of the doc2vec model used by the speech evaluation method of the embodiment shown in Fig. 2.
Detailed description
To help those skilled in the art better understand the technical solutions in the embodiments of the present invention, these solutions are described clearly and completely below with reference to the drawings of the embodiments. Evidently, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of the embodiments of the present invention fall within the scope of protection of the embodiments of the present invention.
Embodiment one
Referring to Fig. 1, a flowchart of the steps of a speech evaluation method according to Embodiment one of the present invention is shown.
The speech evaluation method of this embodiment includes the following steps:
Step S101: generate the vector to be evaluated of the voice data to be evaluated according to the text data corresponding to the voice data to be evaluated.
The text data corresponding to the voice data to be evaluated can be obtained in any suitable way. For example, using speech recognition, the voice data to be evaluated is recognized by a speech recognition model or algorithm, and the corresponding text data is generated. This converts the speech to be evaluated into text automatically, with high conversion efficiency and low labor intensity.
Of course, in other embodiments, the voice data to be evaluated can also be converted into the corresponding text data by manual transcription.
Similarly, the vector to be evaluated corresponding to the voice data to be evaluated can be obtained in any suitable way. For example, a deep learning model generates the vector to be evaluated from the text data corresponding to the voice data to be evaluated. The deep learning model can be a Word2vec model, a doc2vec model, or the like. Word2vec and doc2vec models convert text data into a vector at the semantic level, so the vector reflects the semantics of the voice data to be evaluated well, which greatly benefits the accuracy of the subsequent semantic-level speech evaluation. In addition, converting the recognized text data into a vector to be evaluated and performing the subsequent evaluation on that vector helps ensure evaluation accuracy: it avoids the problem that, when voice data is recognized as text, words with similar or identical pronunciation are recognized inaccurately and degrade the accuracy of the speech evaluation.
Of course, in other embodiments, the text data can also be converted into the corresponding vector to be evaluated by a one-hot representation model, a latent semantic analysis (LSA) model, or the like.
Step S102: calculate the similarity between the preset standard content vector and the vector to be evaluated according to the preset standard content vector, the vector to be evaluated, and the similarity calculation model, wherein the similarity calculation model calculates the similarity from the cosine of the preset standard content vector and the vector to be evaluated and makes the calculated similarity a value greater than 0.
The preset standard content vector is generated from reference answer data. For example, a text vector model such as the aforementioned Word2vec model converts the reference answer data into the corresponding standard content vector, which is stored in advance on a computer device and/or a server.
Of course, in other embodiments, the reference answer data can instead be preset on the computer device and/or server and converted into the standard content vector when needed.
The similarity calculation model is used to calculate the similarity between the preset standard content vector and the vector to be evaluated. This similarity characterizes how similar the reference answer data (corresponding to the preset standard content vector) and the text data (corresponding to the vector to be evaluated) are, and the voice data to be evaluated is scored according to that degree of similarity. In this embodiment, the similarity calculation model calculates the similarity from the cosine between the preset standard content vector and the vector to be evaluated, and makes the calculated similarity a value greater than 0.
Because the similarity calculation model determines the similarity from the cosine of the two vectors and makes the calculated similarity a value greater than 0, it determines similarity accurately and efficiently. It avoids the effect of homophones (words that sound the same but are written differently) on similarity accuracy that arises when existing keyword-based text similarity methods are used for speech evaluation. It also solves the problem that the cosine can be negative, which would make the calculated similarity negative and the speech evaluation result inaccurate.
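The passage above does not say how the cosine is turned into a value greater than 0. As a minimal sketch, assume the cosine in [-1, 1] is shifted linearly to [0, 1] and floored at a small positive constant; the function names, the mapping, and `eps` are illustrative assumptions, not the patent's formula.

```python
import math


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # assumption: treat a zero vector as orthogonal
    return dot / (norm_u * norm_v)


def positive_similarity(standard_vec, evaluated_vec, eps=1e-6):
    """Assumed mapping: (1 + cos) / 2, floored at eps so the result is > 0."""
    shifted = (1.0 + cosine(standard_vec, evaluated_vec)) / 2.0
    return max(eps, shifted)
```

Under this assumed mapping, identical directions give 1.0, orthogonal vectors give 0.5, and opposite directions give the floor value `eps` rather than a negative number.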
Step S103: generate and output the evaluation result data of the voice data to be evaluated according to the similarity and the preset speech assessment rule.
The evaluation result data characterize the level of the user's ability to express themselves. For example, in an expression training course, the evaluation result data can characterize the user's level of expression. Depending on which aspects of expressive ability need to be characterized, the evaluation result data may include different evaluation parameters. For example, in the aforementioned expression training course, the evaluation result data include a semantic score, a dynamics score, a tone score, and so on.
In one feasible implementation, the preset speech assessment rule includes a rule that the score in the evaluation result data is positively correlated with the aforementioned similarity: the higher the similarity, the higher the score, and the better the expressive ability it indicates.
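One simple rule with this positive correlation is a linear mapping from similarity to a score. The sketch below assumes a similarity in [0, 1] and a 0 to 100 scale; both the scale and the name `semantic_score` are hypothetical choices for illustration.

```python
def semantic_score(similarity, max_score=100):
    """Map a similarity in [0, 1] to an integer score; higher similarity, higher score."""
    clipped = min(1.0, max(0.0, similarity))  # guard against out-of-range input
    return round(clipped * max_score)
```

Any monotonically increasing mapping would satisfy the rule described in the text; a linear one is just the easiest to read.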
The speech evaluation method can be applied in expression training courses. The voice input by the user is evaluated: for example, the voice data to be evaluated is converted into a vector to be evaluated, and the similarity between that vector and the preset standard content vector is assessed to obtain evaluation result data. The evaluation result data characterize the user's ability to express themselves, so that users can clearly and easily understand their own level of expression, which motivates them to keep learning and improving.
When evaluating voice data to be evaluated, the speech evaluation method converts the corresponding text data into a vector to be evaluated, uses the similarity calculation model to calculate the similarity from the cosine between the vector to be evaluated and the preset standard content vector, and makes the calculated similarity a value greater than 0. This solves the low evaluation accuracy of existing similarity calculation methods when they are used for speech evaluation. It also solves the problem of the similarity calculation producing negative values, so that the speech evaluation result is more accurate.
The speech evaluation method of this embodiment can be implemented by any suitable device with data processing capability, including various terminal devices and servers.
Embodiment two
Referring to Fig. 2, a flowchart of the steps of a speech evaluation method according to Embodiment two of the present invention is shown.
In this embodiment, the speech evaluation method is described as applied to an expression training course, specifically one for young children (for example, children aged 3 to 8). Of course, in other embodiments the method can be applied to any other suitable scenario, for example to evaluating the training of artificial intelligence devices; this embodiment places no restriction on this.
The speech evaluation method of this embodiment includes the following steps:
Step S201: generate the vector to be evaluated of the voice data to be evaluated according to the text data corresponding to the voice data to be evaluated.
In an expression training course, a set of pictures and/or videos can be shown to the user, who describes their content in words; this trains the user's ability to express themselves. To let users grasp their own learning progress more accurately, the voice data formed from what the user says can be evaluated, so that users understand their progress more intuitively and clearly, which encourages them to keep learning and increases their motivation.
When evaluating the user's voice data to reflect the user's expressive ability, suitable parameters can be used to characterize that ability. For example, a picture is shown to the user, and the similarity between the text data corresponding to the user's voice data and the reference answer data corresponding to the picture is calculated. This similarity indicates how well the semantics of the user's voice input match the content of the displayed picture, and thereby characterizes expressive ability.
In one feasible implementation, when speech evaluation is performed, step S201 includes the following sub-steps:
Sub-step 1: recognize the voice data to be evaluated using a speech recognition model, and generate the text data corresponding to the voice data to be evaluated.
Optionally, the voice data to be evaluated is first obtained; it is then transcoded to produce transcoded voice data; and the transcoded voice data is used as the input of the speech recognition model, which generates the text data corresponding to the voice data to be evaluated.
The voice data to be evaluated can be the user's voice captured by a recording device, a recording supplied by the user, or stored user voice data retrieved from a database.
After the voice data to be evaluated is obtained, if it already meets the format required by the speech recognition model, it can be fed directly into the speech recognition model, which generates the corresponding text data.
If the voice data to be evaluated does not meet the format required by the speech recognition model, it is transcoded to produce transcoded voice data. For example, if the voice data to be evaluated is in mp3 format with a sample rate of 44100 Hz and 2 audio channels, and this format does not match the input format the speech recognition model requires, the voice data is transcoded. Transcoding can be done in any suitable way, for example with ffmpeg. The format after conversion is: wav, sample rate 16000 Hz, 2 audio channels, 16 bit.
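The conversion described above could be driven from a script along the following lines. This is a sketch that assumes the `ffmpeg` command-line tool is installed and on the PATH; the helper names and file paths are hypothetical, and the target parameters (16000 Hz, 2 channels, 16-bit PCM wav) are taken from the example in the text.

```python
import subprocess


def build_transcode_command(src_path, dst_path, sample_rate=16000, channels=2):
    """Build an ffmpeg argument list producing 16-bit PCM wav output."""
    return [
        "ffmpeg", "-y",            # overwrite the output file if it exists
        "-i", src_path,            # input file, e.g. an mp3 at 44100 Hz
        "-ar", str(sample_rate),   # output sample rate in Hz
        "-ac", str(channels),      # number of output audio channels
        "-acodec", "pcm_s16le",    # 16-bit little-endian PCM
        dst_path,
    ]


def transcode(src_path, dst_path):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_transcode_command(src_path, dst_path), check=True)
```

Calling `transcode("answer.mp3", "answer.wav")` would then produce a file in the format expected by the recognizer in this example; a recognizer with different input requirements would need different `sample_rate` and `channels` values.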
The transcoded voice data is used as the input of the speech recognition model, which generates the text data corresponding to the voice data to be evaluated.
Those skilled in the art can use any suitable speech recognition model as needed to recognize the voice data to be evaluated; this embodiment places no restriction on this. For example, the speech recognition model can be based on an HMM (hidden Markov model) and an N-gram model, or an existing speech recognition tool can be called directly.
During speech recognition, identical or similar pronunciations are inevitably recognized as different words, for example " ", " ", "horse". After the voice data to be evaluated is recognized and the text data generated, the text data may therefore contain words whose pronunciation matches the voice data but which are the wrong words. For young children's voice data in particular, unclear articulation, pauses, and similar problems are unavoidable and reduce the accuracy of speech recognition.
If this text data and the reference answer data were fed directly into an existing text similarity model, the inaccurate recognized text would make the similarity calculation inaccurate and degrade the subsequent speech evaluation. This is because existing similarity models such as TF-IDF (term frequency-inverse document frequency), LSI (latent semantic indexing), and LDA (latent Dirichlet allocation, a document topic generation model) calculate similarity by keyword matching, and the accuracy of keyword-matching similarity depends on the accuracy of the recognized text.
To overcome this defect and improve recognition accuracy, a speech recognition model for young children can be trained with machine learning, and the trained model used to recognize the voice data to be evaluated, yielding more accurate text data.
Sub-step 2: vectorize the text data with a text vector computation model to generate the vector to be evaluated of the voice data to be evaluated.
The text vector computation model converts the text data into the corresponding vector to be evaluated, for the subsequent text similarity calculation.
Optionally, one way to implement this step is:
Preprocess the text data and generate result data from the preprocessing result, where the result data include segmentation data indicating the multiple segmented words in the text data; then use the segmentation data of each word as the input of the text vector computation model, which generates the vector to be evaluated of the voice data to be evaluated.
In one feasible implementation, the preprocessing includes dirty data removal, word segmentation, and stop word removal.
On this basis, preprocessing the text data and generating result data from the preprocessing result includes:
Preprocessing step 1: remove dirty data from the text data to obtain valid text data.
Dirty data can be text that is empty or that carries too little useful information to be used for training. For example, if the number of characters in a text is below a preset threshold and its sentences lack a subject or predicate, the text data can be regarded as dirty data with very little useful information.
For example, Text 1: ",." is empty data, and Text 2: "There is a little bear,." contains very little useful information. Both are dirty data. Dirty data can be removed in any suitable way, for example with an existing dirty data removal method.
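A minimal sketch of the length-based part of this filter follows. It covers only the character-count condition; the subject/predicate check mentioned in the text would need a syntactic parser and is left out. The function name and the threshold value are assumptions for illustration.

```python
import re


def is_dirty(text, min_chars=5):
    """Flag text whose count of word characters is below a preset threshold.

    Punctuation and whitespace are not counted, so a string such as ",."
    is treated as empty. min_chars is a hypothetical threshold.
    """
    useful_chars = re.findall(r"\w", text)  # letters, digits, CJK characters
    return len(useful_chars) < min_chars


def remove_dirty(texts, min_chars=5):
    """Keep only the texts that pass the length check."""
    return [t for t in texts if not is_dirty(t, min_chars)]
```

A text like Text 2 above may pass a pure length check, which is why the source pairs the threshold with a check on sentence structure.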
Except the method for dirty data.
Pre-treatment step 2: word segmentation processing is carried out to effective text data, and obtains multiple participles in effective text data
Participle data.
Word segmentation processing can in any suitable manner, for example, using hidden Markov model (Hidden is based on
Markov Model, HMM) machine learning participle model.
By taking " text 3: having a bear on picture " as an example, after word segmentation processing, data are segmented are as follows:/mono-, picture/go up/have/
Bear.
Preprocessing step 3: remove stop words from the segmentation data of the multiple words to obtain the result data.
Stop words are words removed from information and/or text during processing to save storage space and improve processing efficiency. Stop words can be chosen as needed. For example, they can be auxiliary words or modal particles, such as " ", "ground", " ", or other words such as "has" and "on".
Taking the segmentation data of Text 3 above as an example, after stop word removal they are: picture / one / bear.
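Stop word removal on already segmented text amounts to filtering tokens against a set. The sketch below uses English glosses of the Text 3 example; the token list and the stop word set are illustrative assumptions, since the original Chinese tokens are not preserved in this text.

```python
# Hypothetical stop word set, glossed in English for the Text 3 example.
STOP_WORDS = {"on", "has"}


def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Return the tokens that are not stop words, preserving their order."""
    return [tok for tok in tokens if tok not in stop_words]


segmented = ["picture", "on", "has", "one", "bear"]
result = remove_stop_words(segmented)
print(result)
```

In practice the stop word set would be chosen for the target language and application, as the text notes.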
After the result data are obtained, the segmentation data of each word are used as the input of the text vector computation model, which generates the vector to be evaluated of the voice data to be evaluated.
The text vector computation model can be a deep learning model such as a Word2vec model or a doc2vec model. This embodiment takes doc2vec as the example. The structure of the doc2vec model is shown in Fig. 3: it includes an input layer, a hidden layer, and an output layer. The input layer receives the training sample data; the hidden layer vectorizes the training sample data; the output layer outputs the result.
To better suit young children's voice data, the doc2vec model can be trained on young children's voice data, so that the trained model generates better vectors to be evaluated.
The doc2vec model can be trained with a conventional training method, for example the following training process:
First, each word in each piece of training text data is initialized as an N-dimensional vector, where N can be set to a suitable value as needed. Preferably, N is in the range 30 to 200 and can be adjusted as needed.
The vector of each word after initialization are as follows:
Picture x1:[1,0,0,0];Wherein, picture x1 is used to indicate word " picture " and inputs (i.e. in Fig. 3 as first
Shown in X1)。
One x2:[0,1,0,0];Wherein, an x2 is used to indicate word " one " and inputs (i.e. in Fig. 3 as second
Shown in X2)。
Bear x3:[0,0,1,0];Wherein, bear x3 is used to indicate word " bear " and inputs as third.
Play x4:[0,0,0,1];Wherein, the x4 that plays is used to indicate word " playing " as the 4th input.
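The initialization in this example is a one-hot encoding over the four words. A short sketch, with the function name chosen here for illustration:

```python
def one_hot_vectors(tokens):
    """Assign each token a one-hot vector whose dimension equals the token count."""
    size = len(tokens)
    return {
        tok: [1 if position == index else 0 for position in range(size)]
        for index, tok in enumerate(tokens)
    }


vectors = one_hot_vectors(["picture", "one", "bear", "plays"])
x1, x2, x3, x4 = (vectors[t] for t in ["picture", "one", "bear", "plays"])
```

Here `x3` is `[0, 0, 1, 0]`, matching the vector given for "bear" in the text.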
When training the doc2vec model, x1, x2 and x4 may be used as the inputs to the model being trained, that is, the input layer in Fig. 3 receives x1, x2 and x4, while x3 serves as the correction reference.
During training, a training parameter matrix w of the doc2vec model is set up and used to calculate the result of the hidden layer. In Fig. 3, the matrix w is shown as the box between the input layer and the hidden layer. In one feasible implementation, the matrix w is as follows:
The hidden layer result v2 is calculated from x2 and the matrix w as follows:
Similarly, the hidden layer results v1 and v4 are calculated from x1, x4 and the matrix w. The hidden layer is then determined by averaging the calculated v1, v2 and v4.
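The hidden-layer step can be illustrated with the following minimal sketch. The values of `w` below are invented purely for illustration and are not the matrix of this embodiment; the helper names `vec_mat` and `mean_vector` are likewise hypothetical. The sketch assumes that each hidden result is the product of a one-hot row vector and w, so that it simply selects one row of w.

```python
def vec_mat(vector, matrix):
    """Multiply a row vector (length n) by an n x m matrix, giving a length-m list."""
    columns = len(matrix[0])
    return [
        sum(vector[row] * matrix[row][col] for row in range(len(vector)))
        for col in range(columns)
    ]


def mean_vector(vectors):
    """Element-wise average of equally long vectors."""
    count = len(vectors)
    return [sum(values) / count for values in zip(*vectors)]


# Hypothetical 4 x 3 parameter matrix w; the numbers are made up for illustration.
w = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
    [7.0, 8.0, 9.0],
    [10.0, 11.0, 12.0],
]
x1, x2, x4 = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1]

v1, v2, v4 = (vec_mat(x, w) for x in (x1, x2, x4))
hidden = mean_vector([v1, v2, v4])
print(v2)      # the second row of w, because x2 is one-hot
print(hidden)  # the average of the first, second and fourth rows of w
```

Because the inputs are one-hot, the matrix product reduces to a row lookup, which is why w can be read as a table of word vectors.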
Next, a matrix o between the hidden layer and the output layer is set up. The matrix o is multiplied by the hidden layer, and softmax is then applied to obtain probabilities. For example, if the output layer result is [0.23, 0.03, 0.62, 0.12], the third value, 0.62, is the largest, so the result is close to the true expectation [0, 0, 1, 0]. The parameters are tuned according to the result output by the output layer and the true expectation of x3, for example by adjusting each of the aforementioned matrices, until an optimal model is obtained, which completes the training of the doc2vec model.
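As a hedged illustration of the output step, the sketch below implements softmax and compares the quoted output with the expectation for x3. The cross-entropy loss is an assumption: the text only states that parameters are tuned against the true expectation, and cross-entropy is merely one common way to measure that gap. The function names are hypothetical.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtracting the maximum keeps exp() numerically stable
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def cross_entropy(probabilities, target_index):
    """Negative log-probability assigned to the expected word (assumed loss)."""
    return -math.log(probabilities[target_index])


# Output-layer result quoted in the text, and the one-hot expectation for x3.
probabilities = [0.23, 0.03, 0.62, 0.12]
expected = [0, 0, 1, 0]

predicted_index = probabilities.index(max(probabilities))
target_index = expected.index(1)
loss = cross_entropy(probabilities, target_index)
print(predicted_index == target_index, round(loss, 3))
```

A smaller loss means the output is closer to the expectation, so parameter tuning would aim to reduce this value over the training data.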
After the doc2vec model has been trained, the segmentation data of each of the aforementioned segmented words is used as the input of the text vector computation model (the trained doc2vec model), and the text vector computation model generates the vector to be evaluated of the voice data to be evaluated. Because the text vector computation model has been trained to obtain parameters optimized for use by underage children, the vectors to be evaluated that it generates are more accurate.
Step S202: according to a preset standard content vector, the vector to be evaluated and a similarity calculation model, calculate the similarity between the preset standard content vector and the vector to be evaluated, wherein the similarity calculation model is used to calculate the similarity from the cosine value of the preset standard content vector and the vector to be evaluated, and to make the calculated similarity a value greater than 0.
The preset standard content vector is generated from reference answer data. For example, the reference answer data is converted into the corresponding standard content vector by a text vector model, such as the aforementioned Word2vec model, and the standard content vector is stored in advance on the computer device and/or on a server.
The similarity calculated by the similarity calculation model can be regarded as the degree of similarity between the reference answer data and the text data corresponding to the voice data to be evaluated, so that the voice data to be evaluated can be scored according to this similarity. In this embodiment, the similarity calculation model calculates the similarity from the cosine value between the preset standard content vector and the vector to be evaluated, and makes the calculated similarity a value greater than 0.
Because the similarity calculation model determines the similarity from the cosine value of two vectors and makes the calculated similarity a value greater than 0, it determines similarity accurately and efficiently. It avoids the effect that homophones written with different characters have on similarity accuracy when speech evaluation relies on the existing approach of determining similarity from text keywords. It also solves the problem that the cosine value can be negative, which makes the calculated similarity negative and the speech evaluation result inaccurate.
In one feasible implementation, the preset standard content vector and the vector to be evaluated are used as the input of the similarity calculation model, and the similarity is calculated by the similarity calculation model.
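For illustration only, the cosine value on which the similarity calculation model relies can be computed as in the following sketch. The function name `cosine` and the error handling for zero vectors are assumptions made for this example and are not described in the embodiment.

```python
import math


def cosine(a, b):
    """Cosine of the angle between two equally long, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Same direction, orthogonal, and opposite direction.
print(cosine([1.0, 0.0], [2.0, 0.0]))
print(cosine([1.0, 0.0], [0.0, 3.0]))
print(cosine([1.0, 0.0], [-1.0, 0.0]))
```

The third call returns a negative value, which is the case that the conversion to a similarity greater than 0 is meant to handle.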
Optionally, to improve evaluation accuracy, the similarity is a value greater than 0 and less than or equal to 1. This solves the problem that a similarity indicated directly by the cosine value can be negative, which makes it hard to reflect expressive ability accurately, and it also makes subsequent scoring based on the similarity simpler and more convenient.
Optionally, the similarity calculation model includes:
where $x_i$ denotes the vector to be evaluated, $x_j$ denotes the standard content vector corresponding to the vector to be evaluated, and score denotes the similarity. The similarity calculated by this similarity calculation model varies linearly with the cosine value: the larger the cosine value, the closer the converted similarity is to 1.0; the smaller the cosine value, the closer the converted similarity is to 0.0. The similarity therefore ranges from 0.0 to 1.0, which is convenient for subsequently generating evaluation result data such as an evaluation score.
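The formula itself is not reproduced in this text. As a non-limiting sketch, one linear conversion that is consistent with the behaviour described above (linear in the cosine value, approaching 1.0 for large cosine values and 0.0 for small ones) is score = 0.5 + 0.5 cos(x_i, x_j). This form is an assumption and may differ from the formula of the embodiment; the function name `linear_score` is hypothetical.

```python
def linear_score(cosine_value):
    """Map a cosine in [-1, 1] linearly onto [0, 1] (assumed form, see lead-in)."""
    # Clamp first so tiny floating-point overshoots such as 1.0000000002 stay in range.
    clamped = max(-1.0, min(1.0, cosine_value))
    return 0.5 + 0.5 * clamped


for value in (-1.0, 0.0, 0.5, 1.0):
    print(value, linear_score(value))
```

Under this assumed form the score reaches exactly 0.0 only when the two vectors point in opposite directions; for every other pair it is greater than 0.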
Of course, in other embodiments the similarity calculation model includes $\mathrm{score} = e^{\cos(x_i, x_j)}$, where $x_i$ denotes the vector to be evaluated, $x_j$ denotes the standard content vector corresponding to the vector to be evaluated, and score denotes the similarity.
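The exponential variant can be sketched as follows. `exp_score` applies the formula above directly. `normalized_exp_score` is an additional, assumed rescaling (dividing by e) that is not part of the text and is shown only because the raw exponential exceeds 1 whenever the cosine value is positive.

```python
import math


def exp_score(cosine_value):
    """score = e ** cos(x_i, x_j); always positive, between 1/e and e."""
    return math.exp(cosine_value)


def normalized_exp_score(cosine_value):
    """Assumed rescaling: e ** (cos - 1), which lies in (0, 1]."""
    return math.exp(cosine_value - 1.0)


for value in (-1.0, 0.0, 1.0):
    print(value, exp_score(value), normalized_exp_score(value))
```

Either function keeps the ordering of the cosine values, so a pair of vectors with a larger cosine value always receives a larger score.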
Step S203: according to the similarity and a preset speech assessment rule, generate and output the evaluation result data of the voice data to be evaluated.
The evaluation result data characterizes the level of the user's expressive ability. For example, in an expressive-ability training course, the evaluation result data can characterize how well the user expresses themselves. Depending on which aspects of expressive ability need to be characterized, the evaluation result data may include different evaluation parameters. For example, in the aforementioned expressive-ability training course, the evaluation result data includes a semantic score, a dynamics score, a tone score and so on.
In one feasible implementation, the preset speech assessment rule includes a rule that the score in the evaluation result data is positively correlated with the aforementioned similarity: the higher the similarity, the higher the score and the better the expressive ability that it characterizes.
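A positively correlated rule can take many forms. The sketch below shows one hypothetical rule, a proportional mapping of the similarity onto a 0 to 100 scale; the function name `evaluation_score`, the `full_marks` parameter and the scale are all assumptions made for illustration.

```python
def evaluation_score(similarity, full_marks=100):
    """Hypothetical rule: the score grows in proportion to the similarity."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity is expected in [0, 1]")
    return round(similarity * full_marks)


for similarity in (0.0, 0.25, 0.5, 1.0):
    print(similarity, evaluation_score(similarity))
```

Any monotonically increasing mapping would satisfy the positive-correlation requirement equally well; the proportional form is chosen here only because it is the simplest.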
The speech evaluating method can be applied in an expressive-ability training course to evaluate the speech that a user inputs. For example, the text data corresponding to the voice data to be evaluated is converted into the vector to be evaluated, and the evaluation is based on the similarity between the vector to be evaluated and the preset standard content vector. The resulting evaluation result data characterizes the user's expressive ability, so that users can easily understand how well they express themselves, which motivates them to keep learning and improving.
When evaluating the voice data to be evaluated, the speech evaluating method uses semantic understanding technology to convert the inconsistent text data produced by speech recognition into numeric vectors, which reduces error. Using the similarity calculation model, it calculates the similarity from the cosine value between the vector to be evaluated and the preset standard content vector and makes the calculated similarity a value greater than 0, which solves the problem that existing similarity calculation methods give low evaluation accuracy in speech evaluation. It also solves the problem of negative values in the similarity calculation, so that the speech evaluation result is more accurate.
The speech evaluating method of this embodiment can be implemented by any suitable device with data processing capability, including various terminal devices and servers.
Embodiment three
According to an embodiment of the present invention, a computer storage medium is provided. The computer storage medium stores: instructions for generating the vector to be evaluated of the voice data to be evaluated according to the text data corresponding to the voice data to be evaluated; instructions for calculating the similarity between a preset standard content vector and the vector to be evaluated according to the preset standard content vector, the vector to be evaluated and a similarity calculation model, wherein the similarity calculation model is used to calculate the similarity from the cosine value of the preset standard content vector and the vector to be evaluated, and to make the calculated similarity a value greater than 0; and instructions for generating and outputting the evaluation result data of the voice data to be evaluated according to the similarity and a preset speech assessment rule.
Optionally, the instructions for calculating the similarity between the preset standard content vector and the vector to be evaluated according to the preset standard content vector, the vector to be evaluated and the similarity calculation model include: instructions for using the preset standard content vector and the vector to be evaluated as the input of the similarity calculation model and calculating the similarity by the similarity calculation model.
Optionally, the similarity is a value greater than 0 and less than or equal to 1.
Optionally, the similarity calculation model includes:
where $x_i$ denotes the vector to be evaluated, $x_j$ denotes the standard content vector corresponding to the vector to be evaluated, and score denotes the similarity.
Optionally, the instructions for generating the vector to be evaluated of the voice data to be evaluated according to the text data corresponding to the voice data to be evaluated include: instructions for recognizing the voice data to be evaluated with a speech recognition model and generating the text data corresponding to the voice data to be evaluated; and instructions for vectorizing the text data with the text vector computation model and generating the vector to be evaluated of the voice data to be evaluated.
Optionally, the instructions for recognizing the voice data to be evaluated with the speech recognition model and generating the text data corresponding to the voice data to be evaluated include: instructions for obtaining the voice data to be evaluated; instructions for transcoding the voice data to be evaluated and generating the transcoded voice data to be evaluated; and instructions for using the transcoded voice data to be evaluated as the input of the speech recognition model and generating, by the speech recognition model, the text data corresponding to the voice data to be evaluated.
Optionally, the instructions for vectorizing the text data with the text vector computation model and generating the vector to be evaluated of the voice data to be evaluated include: instructions for preprocessing the text data and generating result data according to the preprocessing result, wherein the result data includes segmentation data indicating the multiple segmented words in the text data; and instructions for using the segmentation data of each segmented word as the input of the text vector computation model and generating, by the text vector computation model, the vector to be evaluated of the voice data to be evaluated.
Optionally, the preprocessing includes dirty data removal, word segmentation and stop word removal. The instructions for preprocessing the text data and generating result data according to the preprocessing result include: instructions for removing dirty data from the text data to obtain valid text data; instructions for performing word segmentation on the valid text data to obtain the segmentation data of the multiple segmented words in the valid text data; and instructions for removing stop words from the segmentation data of the multiple segmented words to obtain the result data.
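The three preprocessing stages can be sketched as follows. Everything specific in this sketch is an assumption made for illustration: the stop word list `STOP_WORDS` is a small hypothetical sample, dirty data is taken to mean punctuation and similar symbols, and segmentation is done by splitting on whitespace, which works for English glosses only. Chinese text would require a dedicated word segmenter.

```python
import re

# Hypothetical stop word list; a real system would load a much larger one.
STOP_WORDS = {"a", "an", "the", "um", "uh"}


def remove_dirty_data(text):
    """Replace characters that are neither word characters nor whitespace, then tidy spaces."""
    cleaned = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", cleaned).strip()


def segment(text):
    """Lower-case and split on whitespace (a stand-in for a real word segmenter)."""
    return text.lower().split()


def remove_stop_words(tokens):
    """Keep only the tokens that are not in the stop word list."""
    return [token for token in tokens if token not in STOP_WORDS]


def preprocess(text):
    """Dirty data removal, then segmentation, then stop word removal."""
    return remove_stop_words(segment(remove_dirty_data(text)))


print(preprocess("Um... the bear, it PLAYS!!"))
```

The stages run in the order given in the text, so stop words are matched against segmented, cleaned tokens rather than against the raw recognition output.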
When the voice data to be evaluated is evaluated, the instructions stored in the computer storage medium use semantic understanding technology to convert the inconsistent text data produced by speech recognition into numeric vectors, which reduces error. Using the similarity calculation model, they calculate the similarity from the cosine value between the vector to be evaluated and the preset standard content vector and make the calculated similarity a value greater than 0, which solves the problem that existing similarity calculation methods give low evaluation accuracy in speech evaluation. They also solve the problem of negative values in the similarity calculation, so that the speech evaluation result is more accurate.
From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus a necessary general-purpose hardware platform, or of course by hardware. Based on this understanding, the above technical solution, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, which includes any mechanism for storing or transmitting information in a form readable by a computer. For example, machine-readable media include read-only memory (ROM), random access memory (RAM), magnetic disk storage media, optical storage media, flash media, and electrical, optical, acoustic or other forms of propagated signals (for example, carrier waves, infrared signals and digital signals). The computer software product includes several instructions that cause a computer device (which may be a personal computer, a server, a network device or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.
Finally, it should be noted that the above embodiments only illustrate the technical solutions of the embodiments of the present invention and do not limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments can still be modified, or some of their technical features can be replaced by equivalents, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Those skilled in the art will understand that the embodiments of the present invention may be provided as a method, an apparatus (device) or a computer program product. Accordingly, the embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM and optical memory) that contain computer-usable program code.
The embodiments of the present invention are described with reference to flowcharts and/or block diagrams of the method, apparatus (device) and computer program product according to the embodiments of the present invention. It should be understood that each process and/or block in the flowcharts and/or block diagrams, and any combination of processes and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, and the instruction means implement the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is executed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thereby provide steps for implementing the functions specified in one or more processes of the flowcharts and/or one or more blocks of the block diagrams.
Claims (10)
1. A speech evaluating method, characterized by comprising:
generating a vector to be evaluated of voice data to be evaluated according to text data corresponding to the voice data to be evaluated;
calculating, according to a preset standard content vector, the vector to be evaluated and a similarity calculation model, the similarity between the preset standard content vector and the vector to be evaluated, wherein the similarity calculation model is used to calculate the similarity from the cosine value of the preset standard content vector and the vector to be evaluated, and to make the calculated similarity a value greater than 0;
generating and outputting evaluation result data of the voice data to be evaluated according to the similarity and a preset speech assessment rule.
2. The method according to claim 1, characterized in that calculating, according to the preset standard content vector, the vector to be evaluated and the similarity calculation model, the similarity between the preset standard content vector and the vector to be evaluated comprises:
using the preset standard content vector and the vector to be evaluated as the input of the similarity calculation model, and calculating the similarity by the similarity calculation model.
3. The method according to claim 1, characterized in that the similarity is a value greater than 0 and less than or equal to 1.
4. The method according to claim 1, characterized in that the similarity calculation model includes:
where $x_i$ denotes the vector to be evaluated, $x_j$ denotes the standard content vector corresponding to the vector to be evaluated, and score denotes the similarity.
5. The method according to any one of claims 1 to 4, characterized in that generating the vector to be evaluated of the voice data to be evaluated according to the text data corresponding to the voice data to be evaluated comprises:
recognizing the voice data to be evaluated with a speech recognition model, and generating the text data corresponding to the voice data to be evaluated;
vectorizing the text data with a text vector computation model, and generating the vector to be evaluated of the voice data to be evaluated.
6. The method according to claim 5, characterized in that recognizing the voice data to be evaluated with the speech recognition model and generating the text data corresponding to the voice data to be evaluated comprises:
obtaining the voice data to be evaluated;
transcoding the voice data to be evaluated, and generating the transcoded voice data to be evaluated;
using the transcoded voice data to be evaluated as the input of the speech recognition model, and generating, by the speech recognition model, the text data corresponding to the voice data to be evaluated.
7. The method according to claim 5, characterized in that vectorizing the text data with the text vector computation model and generating the vector to be evaluated of the voice data to be evaluated comprises:
preprocessing the text data, and generating result data according to the preprocessing result, wherein the result data includes segmentation data indicating the multiple segmented words in the text data;
using the segmentation data of each segmented word as the input of the text vector computation model, and generating, by the text vector computation model, the vector to be evaluated of the voice data to be evaluated.
8. The method according to claim 7, characterized in that the preprocessing includes dirty data removal, word segmentation and stop word removal;
preprocessing the text data and generating result data according to the preprocessing result comprises:
removing dirty data from the text data to obtain valid text data;
performing word segmentation on the valid text data to obtain the segmentation data of the multiple segmented words in the valid text data;
removing stop words from the segmentation data of the multiple segmented words to obtain the result data.
9. A computer storage medium, characterized in that the computer storage medium stores: instructions for generating a vector to be evaluated of voice data to be evaluated according to text data corresponding to the voice data to be evaluated; instructions for calculating, according to a preset standard content vector, the vector to be evaluated and a similarity calculation model, the similarity between the preset standard content vector and the vector to be evaluated, wherein the similarity calculation model is used to calculate the similarity from the cosine value of the preset standard content vector and the vector to be evaluated, and to make the calculated similarity a value greater than 0; and instructions for generating and outputting evaluation result data of the voice data to be evaluated according to the similarity and a preset speech assessment rule.
10. The computer storage medium according to claim 9, characterized in that the instructions for calculating, according to the preset standard content vector, the vector to be evaluated and the similarity calculation model, the similarity between the preset standard content vector and the vector to be evaluated include: instructions for using the preset standard content vector and the vector to be evaluated as the input of the similarity calculation model and calculating the similarity by the similarity calculation model.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810259445.XA CN110322895B (en) | 2018-03-27 | 2018-03-27 | Voice evaluation method and computer storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110322895A true CN110322895A (en) | 2019-10-11 |
CN110322895B CN110322895B (en) | 2021-07-09 |
Family
ID=68109770
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810259445.XA Active CN110322895B (en) | 2018-03-27 | 2018-03-27 | Voice evaluation method and computer storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110322895B (en) |
Also Published As
Publication number | Publication date |
---|---|
CN110322895B (en) | 2021-07-09 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110322895A (en) | Speech evaluating method and computer storage medium | |
CN107329949B (en) | Semantic matching method and system | |
CN110457432B (en) | Interview scoring method, interview scoring device, interview scoring equipment and interview scoring storage medium | |
CN107564511B (en) | Electronic device, phoneme synthesizing method and computer readable storage medium | |
WO2020114373A1 (en) | Method and apparatus for realizing element recognition in judicial document | |
CN111833853B (en) | Voice processing method and device, electronic equipment and computer readable storage medium | |
CN109034203A (en) | Training method for expression recommendation model, expression recommendation method, device, equipment and medium | |
US11087745B2 (en) | Speech recognition results re-ranking device, speech recognition results re-ranking method, and program | |
Swain et al. | Study of feature combination using HMM and SVM for multilingual Odiya speech emotion recognition | |
CN113297383B (en) | Speech emotion classification method based on knowledge distillation | |
Zhang et al. | Deep cross-corpus speech emotion recognition: Recent advances and perspectives | |
CN117711444B (en) | Interaction method, device, equipment and storage medium based on talent expression | |
Nguyen et al. | Meta-transfer learning for emotion recognition | |
Somogyi | The Application of Artificial Intelligence | |
CN104700831B (en) | Method and apparatus for analyzing the speech features of an audio file | |
Gumelar et al. | Forward feature selection for toxic speech classification using support vector machine and random forest | |
CN117252739B (en) | Method, system, electronic equipment and storage medium for evaluating paper | |
CN112116181B (en) | Classroom quality model training method, classroom quality evaluation method and classroom quality evaluation device | |
Lopez-de-Ipina et al. | Multilingual audio information management system based on semantic knowledge in complex environments | |
CN113095063A (en) | Two-stage emotion migration method and system based on masking language model | |
CN114239555A (en) | Training method of keyword extraction model and related device | |
Kiwelekar et al. | Automatic grading of student’s presentation skills based on powerpoint presentation and audio | |
WO2021098637A1 (en) | Voice transliteration method and apparatus, and related system and device | |
Level et al. | Introduction of semantic model to help speech recognition | |
Dong et al. | The application of big data to improve pronunciation and intonation evaluation in foreign language learning | |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||