CN110322895A - Speech evaluating method and computer storage medium - Google Patents

Info

Publication number
CN110322895A
Authority
CN
China
Prior art keywords
evaluated
vector
data
similarity
voice data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810259445.XA
Other languages
Chinese (zh)
Other versions
CN110322895B (en)
Inventor
吴介圣
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Yiduhuida Educational Technology (beijing) Co Ltd
Original Assignee
Yiduhuida Educational Technology (beijing) Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Yiduhuida Educational Technology (beijing) Co Ltd filed Critical Yiduhuida Educational Technology (beijing) Co Ltd
Priority to CN201810259445.XA priority Critical patent/CN110322895B/en
Publication of CN110322895A publication Critical patent/CN110322895A/en
Application granted granted Critical
Publication of CN110322895B publication Critical patent/CN110322895B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/06Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/22Procedures used during a speech recognition process, e.g. man-machine dialogue
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/26Speech to text systems
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/27Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the analysis technique
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Multimedia (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Theoretical Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Signal Processing (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An embodiment of the invention provides a speech evaluation method and a computer storage medium. The speech evaluation method includes: generating a vector to be evaluated for the voice data to be evaluated according to the text data corresponding to the voice data to be evaluated; calculating the similarity between a preset standard content vector and the vector to be evaluated according to the preset standard content vector, the vector to be evaluated, and a similarity calculation model, wherein the similarity calculation model calculates the similarity from the cosine of the preset standard content vector and the vector to be evaluated and makes the calculated similarity a value greater than 0; and generating and outputting evaluation result data for the voice data to be evaluated according to the similarity and a preset speech evaluation rule. The speech evaluation method can assess learning outcomes in expression training courses.

Description

Speech evaluating method and computer storage medium
Technical field
The embodiments of the present invention relate to the field of computer technology, and in particular to a speech evaluation method and a computer storage medium.
Background
With the development of computer and Internet technology, learning and teaching by means of computers and the Internet has become a trend. Computers and the Internet let students learn anytime and anywhere, without being limited by environmental factors such as venue or class size. In the education of young children in particular, using computers and the Internet fills a gap in existing education for this age group.
Taking expression training for children aged 3 to 8 carried out through computers and the Internet as an example, the existing training process is as follows: a computer or mobile terminal device presents a set of interesting pictures to the student, and the student practices expression by describing the content of these pictures.
The existing expression training process has no feedback or assessment mechanism. As a result, after completing training, students cannot readily tell how much they have improved and know little about their own learning progress, which does not help motivate them to keep learning and improving.
Summary of the invention
In view of this, embodiments of the present invention provide a speech evaluation method and a computer storage medium to solve the problem that students' learning outcomes cannot be assessed in existing expression training courses.
According to a first aspect of the embodiments of the present invention, a speech evaluation method is provided. The method includes: generating a vector to be evaluated for voice data to be evaluated according to text data corresponding to the voice data to be evaluated; calculating the similarity between a preset standard content vector and the vector to be evaluated according to the preset standard content vector, the vector to be evaluated, and a similarity calculation model, wherein the similarity calculation model is used to calculate the similarity from the cosine of the preset standard content vector and the vector to be evaluated and to make the calculated similarity a value greater than 0; and generating and outputting evaluation result data for the voice data to be evaluated according to the similarity and a preset speech evaluation rule.
According to a second aspect of the embodiments of the present invention, a computer storage medium is provided. The computer storage medium stores: instructions for generating a vector to be evaluated for voice data to be evaluated according to text data corresponding to the voice data to be evaluated; instructions for calculating the similarity between a preset standard content vector and the vector to be evaluated according to the preset standard content vector, the vector to be evaluated, and a similarity calculation model, wherein the similarity calculation model is used to calculate the similarity from the cosine of the preset standard content vector and the vector to be evaluated and to make the calculated similarity a value greater than 0; and instructions for generating and outputting evaluation result data for the voice data to be evaluated according to the similarity and a preset speech evaluation rule.
According to the solution provided by the embodiments of the present invention, the speech evaluation method can be applied in expression training courses. The voice input by the user is evaluated: for example, the text data corresponding to the voice data to be evaluated is converted into a vector to be evaluated, and the similarity between the vector to be evaluated and the preset standard content vector is assessed to obtain evaluation result data. The evaluation result data characterizes the user's ability to express themselves, so the user can clearly and easily understand their own expressive performance, which motivates the user to keep learning and improving.
When evaluating the voice data to be evaluated, the speech evaluation method converts the corresponding text data into a vector to be evaluated, uses the similarity calculation model to calculate the similarity from the cosine between the vector to be evaluated and the preset standard content vector, and makes the calculated similarity a value greater than 0. This solves the problem of low evaluation accuracy that existing similarity calculation methods exhibit in speech evaluation. It also solves the problem that the calculated similarity can be negative, so the speech evaluation result is more accurate.
Brief description of the drawings
To explain the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below show only some of the embodiments recorded in the embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them.
Fig. 1 is a flowchart of the steps of a speech evaluation method according to Embodiment one of the present invention;
Fig. 2 is a flowchart of the steps of a speech evaluation method according to Embodiment two of the present invention;
Fig. 3 is a schematic structural diagram of the doc2vec model used by the speech evaluation method in the embodiment shown in Fig. 2.
Detailed description
To help those skilled in the art better understand the technical solutions in the embodiments of the present invention, the technical solutions are described clearly and completely below with reference to the drawings in the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention shall fall within the scope of protection of the embodiments of the present invention.
Embodiment one
Referring to Fig. 1, a flowchart of the steps of a speech evaluation method according to Embodiment one of the present invention is shown.
The speech evaluation method of this embodiment includes the following steps:
Step S101: generate the vector to be evaluated for the voice data to be evaluated according to the text data corresponding to the voice data to be evaluated.
The text data corresponding to the voice data to be evaluated can be obtained in any suitable way. For example, using speech recognition, the voice data to be evaluated is recognized by a speech recognition model or algorithm, and the corresponding text data is generated. This approach converts the voice to be evaluated into the corresponding text data automatically, with high conversion efficiency and low labor cost.
Of course, in other embodiments, the voice data to be evaluated may also be converted into the corresponding text data by manual transcription.
Similarly, the vector to be evaluated corresponding to the voice data to be evaluated can also be obtained in any suitable way. For example, the vector to be evaluated is generated by a deep learning model from the text data corresponding to the voice data to be evaluated. The deep learning model can be a Word2vec model, a doc2vec model, and so on. The Word2vec and doc2vec models convert text data into the corresponding vector to be evaluated at the semantic level, so the vector reflects the semantics of the voice data to be evaluated well, which greatly benefits the accuracy of the subsequent speech evaluation at the semantic level. In addition, converting the recognized text data into a vector to be evaluated and basing the subsequent speech evaluation on that vector helps ensure the accuracy of the evaluation: it avoids the problem that, while the voice data to be evaluated is being recognized as text data, identical or similar pronunciations cause inaccurate recognized text and thereby reduce the accuracy of the speech evaluation.
Of course, in other embodiments, the text data may also be converted into the corresponding vector to be evaluated by a one-hot representation model, a latent semantic analysis (LSA) model, and the like.
Step S102: calculate the similarity between the preset standard content vector and the vector to be evaluated according to the preset standard content vector, the vector to be evaluated, and the similarity calculation model, wherein the similarity calculation model is used to calculate the similarity from the cosine of the preset standard content vector and the vector to be evaluated and to make the calculated similarity a value greater than 0.
The preset standard content vector is generated from reference answer data. For example, the reference answer data is converted into the corresponding standard content vector by a text vector model, such as the aforementioned Word2vec model, and the standard content vector is stored in advance in a computer device and/or a server.
Of course, in other embodiments, the reference answer data may instead be preset in the computer device and/or server and converted into the standard content vector when needed.
The similarity calculation model is used to calculate the similarity between the preset standard content vector and the vector to be evaluated. This similarity characterizes how similar the reference answer data (corresponding to the preset standard content vector) and the text data (corresponding to the vector to be evaluated) are, and the voice data to be evaluated is scored according to that degree of similarity. In this embodiment, the similarity calculation model calculates the similarity from the cosine between the preset standard content vector and the vector to be evaluated and makes the calculated similarity a value greater than 0.
The similarity calculation model determines the similarity from the cosine of the two vectors and makes the calculated similarity a value greater than 0. This determines the similarity accurately and efficiently, and it avoids the effect of homophones (characters with the same pronunciation but different written forms) on similarity accuracy that arises when speech evaluation determines similarity by text keywords, as existing approaches do. It also solves the problem that the cosine can be negative, which would make the calculated similarity negative and the speech evaluation result inaccurate.
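To make the idea concrete, here is a minimal sketch of a cosine-based similarity that does not go negative. The rescaling 0.5 + 0.5 * cos is an assumption chosen for illustration, not a formula stated in this part of the description, and the function names and example vectors are hypothetical. Note that this particular rescaling gives 0 for two exactly opposite vectors, so a strictly positive result would need a different mapping or a small offset.

```python
import math

def cosine(u, v):
    """Plain cosine of the angle between two equal-length vectors, in [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_u * norm_v)

def shifted_similarity(u, v):
    """Assumed mapping (not from the patent text): rescale the cosine to [0, 1]."""
    return 0.5 + 0.5 * cosine(u, v)

standard = [0.2, 0.7, 0.1]     # hypothetical standard content vector
candidate = [-0.1, 0.6, 0.3]   # hypothetical vector to be evaluated
similarity = shifted_similarity(standard, candidate)
```

With this mapping, two vectors whose plain cosine is negative still receive a small positive similarity, which keeps a downstream score from going below zero.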
Step S103: generate and output the evaluation result data for the voice data to be evaluated according to the similarity and the preset speech evaluation rule.
The evaluation result data characterizes the level of the user's ability to express themselves. For example, in an expression training course, the evaluation result data can characterize the user's expressive performance. Depending on which aspects of expressive ability need to be characterized, the evaluation result data may include different evaluation parameters. For example, in the aforementioned expression training course, the evaluation result data includes a semantic score, a dynamics score, a tone score, and so on.
In one feasible implementation, the preset speech evaluation rule includes a rule that the score in the evaluation result data is positively correlated with the aforementioned similarity: the higher the similarity, the higher the score and the better the expressive ability it indicates.
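One simple rule of this kind is a linear mapping, sketched below. The linear form, the 100-point scale, the assumption that the similarity has been normalized into [0, 1], and the field names are all illustrative choices, not details given in the description.

```python
def semantic_score(similarity, max_score=100):
    """Assumed rule: the score grows linearly with the similarity."""
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity is expected in [0, 1]")
    return round(similarity * max_score)

def evaluation_result(similarity):
    """Package the score as evaluation result data (hypothetical field names)."""
    return {"similarity": similarity, "semantic_score": semantic_score(similarity)}

result = evaluation_result(0.5)
```

Any monotonically increasing mapping would satisfy the positive-correlation requirement; a linear one is merely the easiest to explain to a learner.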
The speech evaluation method can be applied in expression training courses. The voice input by the user is evaluated: for example, the voice data to be evaluated is converted into a vector to be evaluated, and the similarity between the vector to be evaluated and the preset standard content vector is assessed to obtain evaluation result data. The evaluation result data characterizes the user's ability to express themselves, so the user can clearly and easily understand their own expressive performance, which motivates the user to keep learning and improving.
When evaluating the voice data to be evaluated, the speech evaluation method converts the corresponding text data into a vector to be evaluated, uses the similarity calculation model to calculate the similarity from the cosine between the vector to be evaluated and the preset standard content vector, and makes the calculated similarity a value greater than 0. This solves the problem of low evaluation accuracy that existing similarity calculation methods exhibit in speech evaluation. It also solves the problem that the calculated similarity can be negative, so the speech evaluation result is more accurate.
The speech evaluation method of this embodiment can be implemented by any suitable device with data processing capability, including various terminal devices, servers, and the like.
Embodiment two
Referring to Fig. 2, a flowchart of the steps of a speech evaluation method according to Embodiment two of the present invention is shown.
In this embodiment, the speech evaluation method is described as applied to an expression training course, in particular one for young children (for example, children aged 3 to 8). Of course, in other embodiments the method can be applied to any other appropriate scenario, for example a scenario for evaluating the training of an artificial intelligence device; this embodiment places no restriction on this.
The speech evaluation method of this embodiment includes the following steps:
Step S201: generate the vector to be evaluated for the voice data to be evaluated according to the text data corresponding to the voice data to be evaluated.
In an expression training course, a set of pictures and/or videos can be shown to the user, and the user describes their content in words, which trains the user's ability to express themselves. To help the user grasp their own learning progress more accurately, the voice data formed from the user's spoken description can be evaluated, giving the user a more intuitive and clearer view of their progress, encouraging them to keep learning and to take more initiative.
When the user's voice data is evaluated to reflect the user's expressive ability, an appropriate parameter can be used to characterize that ability. For example, a picture is shown to the user, and the similarity between the text data corresponding to the user's voice data and the reference answer data corresponding to the picture is calculated. This similarity indicates how well the semantics of the user's voice input match the content of the picture shown, and it is used to characterize expressive ability.
In one feasible implementation, when speech evaluation is performed, step S201 includes the following sub-steps:
Sub-step 1: recognize the voice data to be evaluated using a speech recognition model, and generate the text data corresponding to the voice data to be evaluated.
Optionally, the voice data to be evaluated may first be obtained; the voice data is then transcoded to generate transcoded voice data to be evaluated; and the transcoded voice data is used as the input of the speech recognition model, which generates the text data corresponding to the voice data to be evaluated.
The voice data to be evaluated may be the user's voice captured by a recording device, a recording supplied by the user, or user voice data retrieved from a database where it was saved.
After the voice data to be evaluated is obtained, if it already meets the format required by the speech recognition model, it can be used directly as the input of the speech recognition model, which generates the corresponding text data.
If the voice data to be evaluated does not meet the format required by the speech recognition model, it is transcoded to generate transcoded voice data to be evaluated. For example, suppose the voice data to be evaluated is in mp3 format with a sample rate of 44100 Hz and 2 audio channels. If this format does not match the input format the speech recognition model requires, the voice data is transcoded. Transcoding can be done in any suitable way, for example with ffmpeg; the converted format is: wav, sample rate 16000 Hz, 2 audio channels, 16 bit.
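As an illustration of the transcoding step, the sketch below builds an ffmpeg command line for the conversion described above. The function names and file names are hypothetical; the flags used (`-i`, `-ar`, `-ac`, `-c:a pcm_s16le`, `-y`) are standard ffmpeg options, but the description does not say which options the authors actually used.

```python
import subprocess

def build_transcode_command(src_path, dst_path, sample_rate=16000, channels=2):
    """Build an ffmpeg command line that converts audio to 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-y",                      # overwrite the output file if it exists
        "-i", src_path,            # input file, e.g. an mp3 at 44100 Hz
        "-ar", str(sample_rate),   # output sample rate in Hz
        "-ac", str(channels),      # number of output audio channels
        "-c:a", "pcm_s16le",       # 16-bit little-endian PCM samples
        dst_path,
    ]

def transcode(src_path, dst_path):
    """Run ffmpeg; requires the ffmpeg executable to be installed."""
    subprocess.run(build_transcode_command(src_path, dst_path), check=True)

# Hypothetical file names, used only to show the resulting command.
command = build_transcode_command("answer.mp3", "answer.wav")
```

Keeping command construction separate from execution makes the conversion parameters easy to inspect without ffmpeg being present.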
The transcoded voice data to be evaluated is used as the input of the speech recognition model, which generates the text data corresponding to the voice data to be evaluated.
Those skilled in the art can use any appropriate speech recognition model to recognize the voice data to be evaluated as needed; this embodiment places no restriction on this. For example, the speech recognition model can be one based on an HMM (Hidden Markov Model) and an N-gram model, or an existing speech recognition tool can be called directly.
During speech recognition, identical or similar pronunciations will inevitably be recognized as different text, for example the character for "horse" and other characters with the same or a similar pronunciation. After the voice data to be evaluated has been recognized and the text data generated, the text data may contain characters whose pronunciation matches the voice data but which are the wrong words. For the voice data of young children in particular, unclear articulation, pauses, and similar problems are unavoidable and reduce the accuracy of speech recognition.
If this text data and the reference answer data are compared directly using an existing text similarity calculation model, the inaccurate recognized text makes the similarity calculation inaccurate and reduces the accuracy of the subsequent speech evaluation. This is because existing similarity calculation models such as TF-IDF (term frequency-inverse document frequency), LSI (latent semantic indexing), and LDA (Latent Dirichlet Allocation, a document topic generation model) compute similarity by keyword matching, and the accuracy of a similarity computed by keyword matching depends on the accuracy of the recognized text.
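The weakness of keyword matching can be seen in a toy example. The sketch below uses Jaccard overlap between keyword sets as a simple stand-in for keyword-based similarity (it is not TF-IDF, LSI, or LDA), and it uses the English homophones "bear" and "bare" as a hypothetical substitute for the homophonous characters discussed above.

```python
def keyword_overlap(reference_words, recognized_words):
    """Jaccard overlap between two keyword sets (exact string matching)."""
    ref, rec = set(reference_words), set(recognized_words)
    if not ref and not rec:
        return 0.0
    return len(ref & rec) / len(ref | rec)

reference = ["picture", "bear"]
correct = ["picture", "bear"]
misrecognized = ["picture", "bare"]   # homophone produced by the recognizer

overlap_correct = keyword_overlap(reference, correct)
overlap_misrecognized = keyword_overlap(reference, misrecognized)
```

A single misrecognized homophone drops the overlap from 1 to 1/3 even though the child said exactly the right thing, which is the kind of error the vector-based comparison is meant to soften.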
To overcome the aforementioned defect and improve recognition accuracy, a speech recognition model for young children can be trained by machine learning, and the trained speech recognition model is used to recognize the voice data to be evaluated, to obtain more accurate text data.
Sub-step 2: vectorize the text data with a text vector calculation model, and generate the vector to be evaluated for the voice data to be evaluated.
The text vector calculation model converts the text data into the corresponding vector to be evaluated, so that the subsequent text similarity calculation can be carried out.
Optionally, one way to implement this step may include:
The text data is preprocessed, and result data is generated from the preprocessing result, wherein the result data includes word segmentation data indicating the multiple segmented words in the text data; the word segmentation data of each segmented word is used as the input of the text vector calculation model, which generates the vector to be evaluated for the voice data to be evaluated.
In one feasible implementation, preprocessing includes dirty data removal, word segmentation, and stop word removal.
Based on this, preprocessing the text data and generating the result data from the preprocessing result includes:
Preprocessing step 1: remove dirty data from the text data to obtain valid text data.
Dirty data may be text data that is empty or that carries too little useful information to be used for training. For example, if the number of words in the text is below a preset threshold and the sentences in the text lack a subject or predicate, the text data can be considered dirty data with little useful information.
For example, text 1: ",." Text 1 is empty data. Text 2: "There is a little bear,." Text 2 contains very little useful information. Both are dirty data. Dirty data can be removed in any suitable way, for example with an existing dirty data removal method.
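A minimal sketch of such a filter is shown below. It implements only the length check; detecting a missing subject or predicate would need a syntactic parser and is left out. The threshold of 20 characters is a hypothetical value tuned to the English rendering of the examples, and the function names are illustrative.

```python
import string

# Hypothetical threshold; the description only says "a preset threshold".
MIN_CHARS = 20
PUNCTUATION = set(string.punctuation) | set(",。!?、;:")

def is_dirty(text, min_chars=MIN_CHARS):
    """Return True if the text is empty or too short once punctuation is removed."""
    content = [ch for ch in text if not ch.isspace() and ch not in PUNCTUATION]
    return len(content) < min_chars

def remove_dirty(texts):
    """Keep only the texts that pass the length check."""
    return [t for t in texts if not is_dirty(t)]

samples = [",.", "There is a little bear,.", "There is a bear playing on the picture"]
valid = remove_dirty(samples)
```

Only the last sample survives: the first is empty once punctuation is stripped, and the second falls below the assumed length threshold.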
Preprocessing step 2: perform word segmentation on the valid text data to obtain the word segmentation data of the multiple segmented words in the valid text data.
Word segmentation can be done in any suitable way, for example with a machine learning segmentation model based on a hidden Markov model (Hidden Markov Model, HMM).
Taking "text 3: there is a bear on the picture" as an example, the word segmentation data after segmentation is: picture / on / has / one / bear.
Preprocessing step 3: remove stop words from the word segmentation data of the multiple segmented words to obtain the result data.
Stop words are words that are removed from information and/or text during processing to save storage space and improve processing efficiency. Stop words can be chosen as needed. For example, they can be auxiliary words or modal particles, such as the Chinese structural particles, or other words such as "has" and "on".
Taking the word segmentation data of the aforementioned text 3 as an example, the result after removing stop words is: picture, one, bear.
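Stop word removal on the segmented text 3 can be sketched as follows, using an English rendering of the tokens. The stop-word set here is a hypothetical two-word list chosen to reproduce the example; a real list would be much larger.

```python
STOP_WORDS = {"on", "has"}   # hypothetical stop-word list for the English rendering

def remove_stop_words(tokens, stop_words=STOP_WORDS):
    """Drop every token that appears in the stop-word list, keeping order."""
    return [tok for tok in tokens if tok not in stop_words]

segmented = ["picture", "on", "has", "one", "bear"]   # text 3 after segmentation
result = remove_stop_words(segmented)
```

The remaining tokens, picture, one, and bear, are what would be passed on to the text vector calculation model.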
After the result data is obtained, the word segmentation data of each segmented word in it is used as the input of the text vector calculation model, which generates the vector to be evaluated for the voice data to be evaluated.
The text vector calculation model can be a deep learning model, such as a Word2vec model or a doc2vec model. This embodiment takes doc2vec as the text vector calculation model. The structure of the doc2vec model is shown in Fig. 3: it includes an input layer, a hidden layer, and an output layer. The input layer obtains the training sample data, the hidden layer vectorizes the training sample data, and the output layer outputs the result.
To better suit the voice data of young children, the doc2vec model can be trained with voice data from young children, so that the vectors to be evaluated generated by the trained doc2vec model are of better quality.
The doc2vec model can be trained with a conventional training method, for example the following training process:
First, each word in each piece of training text data is initialized as an N-dimensional vector, where N is an appropriate value determined as needed. Preferably, N ranges from 30 to 200 and can be adjusted as needed.
For example, suppose the word segmentation data of a piece of text data is {picture, one, bear, plays} and the initial word vector dimension is 4. The vector of each word after initialization is then:
picture, x1: [1, 0, 0, 0], where x1 indicates that the word "picture" is the first input (X1 in Fig. 3).
one, x2: [0, 1, 0, 0], where x2 indicates that the word "one" is the second input (X2 in Fig. 3).
bear, x3: [0, 0, 1, 0], where x3 indicates that the word "bear" is the third input.
plays, x4: [0, 0, 0, 1], where x4 indicates that the word "plays" is the fourth input.
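The initialization above amounts to one-hot encoding over the vocabulary of the example sentence. A short sketch, with a hypothetical helper name and English renderings of the four words:

```python
def one_hot_vectors(words):
    """Map each distinct word to a one-hot vector, in order of first appearance."""
    vocab = list(dict.fromkeys(words))
    size = len(vocab)
    return {w: [1 if i == j else 0 for j in range(size)] for i, w in enumerate(vocab)}

vectors = one_hot_vectors(["picture", "one", "bear", "plays"])
x1, x2, x3, x4 = (vectors[w] for w in ["picture", "one", "bear", "plays"])
```

Each vector has a single 1 at the position of its word, matching x1 through x4 in the example.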
When training the doc2vec model, x1, x2, and x4 can be used as the input to the model being trained, that is, the input layer in Fig. 3 receives x1, x2, and x4, and x3 is used as the correction target.
During training, the result of the hidden layer is calculated using a training parameter matrix w of the doc2vec model. Matrix w is shown in Fig. 3 as the box between the input layer and the hidden layer. In one feasible implementation, matrix w is as follows:
The hidden layer result v2 is calculated from x2 and matrix w as follows:
Similarly, the hidden layer results v1 and v4 are calculated from x1 and x4 and matrix w. The hidden layer is determined by averaging the calculated v1, v2, and v4.
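The hidden layer computation can be sketched as follows. The matrix values below are hypothetical placeholders, not the matrix w from the patent, and the hidden size of 3 is likewise an assumption; the point is only to show the projection of each one-hot input and the averaging step.

```python
# Hypothetical 4x3 parameter matrix: one row per vocabulary word, three hidden units.
W = [
    [0.1, 0.2, 0.3],
    [0.4, 0.5, 0.6],
    [0.7, 0.8, 0.9],
    [1.0, 1.1, 1.2],
]

def project(x, w):
    """Multiply a row vector x by matrix w (for one-hot x this selects a row of w)."""
    return [sum(x[i] * w[i][j] for i in range(len(x))) for j in range(len(w[0]))]

def average(vectors):
    """Element-wise mean of equal-length vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]

x1, x2, x4 = [1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1]
v1, v2, v4 = project(x1, W), project(x2, W), project(x4, W)
hidden = average([v1, v2, v4])   # hidden layer: mean of the context projections
```

Because the inputs are one-hot, each projection simply picks out one row of the matrix, so the hidden layer is the mean of the rows for the context words.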
Next, a matrix o between the hidden layer and the output layer is set up. Matrix o is multiplied by the hidden layer, and softmax is then applied to obtain probabilities. For example, if the output layer result is [0.23, 0.03, 0.62, 0.12], the third value, 0.62, is the largest, which is close to the true expectation [0, 0, 1, 0]. The parameters are tuned according to the output layer result and the true expectation for x3, for example by adjusting the aforementioned matrices, to obtain an optimal model and complete the training of the doc2vec model.
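The output step can be illustrated with a small softmax helper and a comparison against the expected one-hot vector. Using cross-entropy as the measure of the gap between output and expectation is an assumption for illustration; the description only says that the parameters are tuned.

```python
import math

def softmax(scores):
    """Turn raw output-layer scores into probabilities that sum to 1."""
    peak = max(scores)                        # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target_index):
    """Loss against a one-hot expectation: negative log of the target probability."""
    return -math.log(probs[target_index])

output = [0.23, 0.03, 0.62, 0.12]             # output-layer result quoted in the text
predicted = output.index(max(output))         # position of the largest probability
loss = cross_entropy(output, 2)               # x3 ("bear") is the expected word
```

The largest probability sits at index 2, the position of x3, and the loss shrinks toward 0 as that probability approaches 1, which is the direction parameter tuning pushes the matrices.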
After doc2vec model is completed in training, using the participle data of each participle above-mentioned as text vector computation model The input of (the doc2vec model that training is completed), generates the to be evaluated of voice data to be evaluated by text vector computation model Vector.Since text vector computation model is by training, obtain for the optimized parameter in underage child usage scenario, because This is more preferable in the accuracy for generating vector to be evaluated using text vector computation model.
Step S202: according to a preset standard content vector, the vector to be evaluated and a similarity calculation model, calculate the similarity between the preset standard content vector and the vector to be evaluated, wherein the similarity calculation model is used to calculate the similarity according to the cosine value of the preset standard content vector and the vector to be evaluated, and to make the calculated similarity a numerical value greater than 0.
The preset standard content vector is generated from reference answer data. For example, the reference answer data is converted into the corresponding standard content vector by a text vector model, such as the aforementioned Word2vec model, and the standard content vector is stored in advance in the computer device and/or on the server.
The similarity calculated by the similarity calculation model can be regarded as the degree of similarity between the reference answer data and the text data corresponding to the voice data to be evaluated, so that the voice data to be evaluated can be scored according to this similarity. In this embodiment, the similarity calculation model is used to calculate the similarity according to the cosine value between the preset standard content vector and the vector to be evaluated, and to make the calculated similarity a numerical value greater than 0.
The similarity calculation model determines the similarity from the cosine value of the two vectors and makes the calculated similarity a value greater than 0. This not only determines the similarity accurately and efficiently, avoiding the effect that homophones written with different characters have on similarity accuracy when existing keyword-based text matching is used for speech evaluation, but also solves the problem that the cosine value can be negative, which would make the calculated similarity negative and the speech evaluation result inaccurate.
In one feasible approach, the preset standard content vector and the vector to be evaluated are used as the inputs of the similarity calculation model, and the similarity is calculated by the similarity calculation model.
Optionally, to improve the accuracy of the evaluation, the similarity is a value greater than 0 and less than or equal to 1. This solves the problem that a similarity indicated directly by the cosine value can be negative, which is unfavorable for accurately reflecting expressive ability, and it also makes subsequent scoring based on the similarity simpler and more convenient.
Optionally, the similarity calculation model includes:
where x_i denotes the vector to be evaluated, x_j denotes the standard content vector corresponding to the vector to be evaluated, and score denotes the similarity. The similarity calculated by this similarity calculation model varies linearly with the cosine value: the larger the cosine value, the closer the converted similarity is to 1.0, and the smaller the cosine value, the closer the converted similarity is to 0.0. The similarity therefore ranges from 0.0 to 1.0, which is convenient for subsequently generating evaluation result data such as an evaluation score.
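To make the linear conversion concrete, the following is a minimal sketch of one mapping that is consistent with the description above (linear in the cosine value, with a range of 0.0 to 1.0). The formula itself is not reproduced in this text, so the expression 0.5 + 0.5 * cos(x_i, x_j) and the function names `cosine` and `linear_score` are assumptions made for illustration only.

```python
import numpy as np

def cosine(x_i, x_j):
    """Cosine of the angle between two vectors, in [-1, 1]."""
    a = np.asarray(x_i, dtype=float)
    b = np.asarray(x_j, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    if denom == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return float(np.dot(a, b) / denom)

def linear_score(x_i, x_j):
    """ASSUMED linear mapping of the cosine value from [-1, 1] onto [0, 1]."""
    return 0.5 + 0.5 * cosine(x_i, x_j)

# Hypothetical 3-dimensional vectors, chosen only for the demonstration.
evaluated = [0.2, 0.7, 0.1]
standard = [0.25, 0.6, 0.15]
print(linear_score(evaluated, standard))
```

Under this assumed mapping, two vectors pointing in exactly opposite directions give a score of exactly 0.0, so the score is strictly greater than 0 only when the cosine value is greater than -1.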
Of course, in other embodiments, the similarity calculation model includes score = e^{cos(x_i, x_j)}, where x_i denotes the vector to be evaluated, x_j denotes the standard content vector corresponding to the vector to be evaluated, and score denotes the similarity.
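The exponential variant can be sketched in a few lines. The function names below are hypothetical. Because the cosine value lies in [-1, 1], the resulting score lies in [1/e, e]: it is always positive, but unlike the optional 0-to-1 range described earlier it can exceed 1.

```python
import math

def cosine(x_i, x_j):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(x_i, x_j))
    norm_i = math.sqrt(sum(a * a for a in x_i))
    norm_j = math.sqrt(sum(b * b for b in x_j))
    if norm_i == 0.0 or norm_j == 0.0:
        raise ValueError("cosine is undefined for a zero vector")
    return dot / (norm_i * norm_j)

def exp_score(x_i, x_j):
    """score = e raised to the power cos(x_i, x_j); always greater than 0."""
    return math.exp(cosine(x_i, x_j))

# Opposite vectors give the smallest possible score, which is still positive.
print(exp_score([1.0, 0.0], [-1.0, 0.0]))
```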
Step S203: according to the similarity and a preset speech evaluation rule, generate and output the evaluation result data of the voice data to be evaluated.
The evaluation result data is used to characterize the level of the user's expressive ability. For example, in an expression training course, the evaluation result data can characterize how well the user expresses themselves. Depending on which aspects of expressive ability need to be characterized, the evaluation result data may include different evaluation parameters. For example, in the aforementioned expression training course, the evaluation result data includes a semantic score, a dynamics score, a tone score, and so on.
In one feasible approach, the preset speech evaluation rule includes a rule that the score in the evaluation result data is positively correlated with the aforementioned similarity, i.e. the higher the similarity, the higher the score and the better the expressive ability it characterizes.
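A positively correlated rule of this kind can be as simple as a monotonic scaling. The sketch below is hypothetical: the 100-point scale, the one-decimal rounding, the function name `evaluation_result` and the `semantic_score` key are assumptions for illustration, since the text only requires that the score rise with the similarity.

```python
def evaluation_result(similarity, max_score=100.0):
    """Map a similarity in [0, 1] to evaluation result data.

    ASSUMPTION: a plain linear scaling, which is one of many rules that
    keep the score positively correlated with the similarity.
    """
    if not 0.0 <= similarity <= 1.0:
        raise ValueError("similarity is expected to lie in [0, 1]")
    return {"semantic_score": round(similarity * max_score, 1)}

# A higher similarity never produces a lower score.
for s in (0.2, 0.5, 0.9):
    print(s, evaluation_result(s))
```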
The speech evaluation method can be applied in an expression training course. The voice input by the user is evaluated: for example, the text data corresponding to the voice data to be evaluated is converted into a vector to be evaluated, and the evaluation is performed based on the similarity between the vector to be evaluated and the preset standard content vector to obtain evaluation result data. The evaluation result data characterizes the user's expressive ability, so that the user can easily understand how well they express themselves and is motivated to keep learning and improving.
When evaluating the voice data to be evaluated, the speech evaluation method uses semantic understanding technology to convert the inconsistent text data produced by speech recognition into numeric vectors, which reduces error. It then uses the similarity calculation model to calculate the similarity according to the cosine value between the vector to be evaluated and the preset standard content vector, and makes the calculated similarity a numerical value greater than 0. This solves the problem that existing similarity calculation methods give low evaluation accuracy in speech evaluation. It also solves the problem of negative values in the similarity calculation, so that the speech evaluation result is more accurate.
The speech evaluation method of this embodiment can be implemented by any suitable device with data processing capability, including various terminal devices, servers, and so on.
Embodiment three
According to an embodiment of the present invention, a computer storage medium is provided. The computer storage medium stores: an instruction for generating, according to text data corresponding to voice data to be evaluated, a vector to be evaluated of the voice data to be evaluated; an instruction for calculating, according to a preset standard content vector, the vector to be evaluated and a similarity calculation model, the similarity between the preset standard content vector and the vector to be evaluated, wherein the similarity calculation model is used to calculate the similarity according to the cosine value of the preset standard content vector and the vector to be evaluated, and to make the calculated similarity a numerical value greater than 0; and an instruction for generating and outputting evaluation result data of the voice data to be evaluated according to the similarity and a preset speech evaluation rule.
Optionally, the instruction for calculating, according to the preset standard content vector, the vector to be evaluated and the similarity calculation model, the similarity between the preset standard content vector and the vector to be evaluated includes: an instruction for using the preset standard content vector and the vector to be evaluated as the inputs of the similarity calculation model and calculating the similarity by the similarity calculation model.
Optionally, the similarity is a numerical value greater than 0 and less than or equal to 1.
Optionally, the similarity calculation model includes:
where x_i denotes the vector to be evaluated, x_j denotes the standard content vector corresponding to the vector to be evaluated, and score denotes the similarity.
Optionally, the instruction for generating, according to the text data corresponding to the voice data to be evaluated, the vector to be evaluated of the voice data to be evaluated includes: an instruction for recognizing the voice data to be evaluated using a speech recognition model and generating the text data corresponding to the voice data to be evaluated; and an instruction for performing vectorization processing on the text data by a text vector calculation model and generating the vector to be evaluated of the voice data to be evaluated.
Optionally, the instruction for recognizing the voice data to be evaluated using the speech recognition model and generating the text data corresponding to the voice data to be evaluated includes: an instruction for obtaining the voice data to be evaluated; an instruction for transcoding the voice data to be evaluated and generating the transcoded voice data to be evaluated; and an instruction for using the transcoded voice data to be evaluated as the input of the speech recognition model and generating, by the speech recognition model, the text data corresponding to the voice data to be evaluated.
Optionally, the instruction for performing vectorization processing on the text data by the text vector calculation model and generating the vector to be evaluated of the voice data to be evaluated includes: an instruction for preprocessing the text data and generating result data according to the preprocessing result, wherein the result data includes word segmentation data indicating multiple segmented words in the text data; and an instruction for using the word segmentation data of each segmented word as the input of the text vector calculation model and generating, by the text vector calculation model, the vector to be evaluated of the voice data to be evaluated.
Optionally, the preprocessing includes dirty data removal, word segmentation and stop word removal. The instruction for preprocessing the text data and generating result data according to the preprocessing result includes: an instruction for removing dirty data from the text data to obtain valid text data; an instruction for performing word segmentation on the valid text data to obtain the word segmentation data of multiple segmented words in the valid text data; and an instruction for removing stop words from the word segmentation data of the multiple segmented words to obtain the result data.
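The three preprocessing stages (dirty data removal, word segmentation, stop word removal) can be sketched as a small pipeline. This is a simplified illustration and not the patent's implementation: the stop word list is hypothetical, "dirty data" is assumed here to mean punctuation and other symbols, and whitespace splitting stands in for a real word segmenter, which would be needed for Chinese text.

```python
import re

# ASSUMPTION: a tiny illustrative stop word list, not taken from the patent.
STOP_WORDS = {"the", "a", "um", "uh"}

def remove_dirty_data(text):
    """Treat anything that is not a word character or whitespace as dirty data."""
    cleaned = re.sub(r"[^\w\s]", " ", text)
    return re.sub(r"\s+", " ", cleaned).strip()

def segment(text):
    """Placeholder segmentation: lowercase and split on whitespace."""
    return text.lower().split()

def remove_stop_words(tokens):
    """Drop tokens that appear in the stop word list."""
    return [t for t in tokens if t not in STOP_WORDS]

def preprocess(text):
    """Run the three stages in the order given in the description."""
    valid_text = remove_dirty_data(text)
    tokens = segment(valid_text)
    return remove_stop_words(tokens)

print(preprocess("Um, the little bear is playing!!"))
```

The resulting token list plays the role of the result data that is fed to the text vector calculation model.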
When evaluating the voice data to be evaluated, the instructions stored in the computer storage medium can use semantic understanding technology to convert the inconsistent text data produced by speech recognition into numeric vectors, which reduces error. The similarity calculation model is used to calculate the similarity according to the cosine value between the vector to be evaluated and the preset standard content vector, and to make the calculated similarity a numerical value greater than 0. This solves the problem that existing similarity calculation methods give low evaluation accuracy in speech evaluation. It also solves the problem of negative values in the similarity calculation, so that the speech evaluation result is more accurate.
From the above description of the embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware. Based on this understanding, the above technical solution in essence, or the part of it that contributes to the prior art, can be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, which includes any mechanism for storing or transmitting information in a form readable by a computer. For example, machine-readable media include read-only memory (ROM), random access memory (RAM), magnetic disk storage media, optical storage media, flash memory media, and electrical, optical, acoustic or other forms of propagated signals (for example, carrier waves, infrared signals, digital signals). The computer software product includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods described in the embodiments or in certain parts of the embodiments.
Finally, it should be noted that the above embodiments are only intended to illustrate the technical solutions of the embodiments of the present invention and not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements for some of the technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Those skilled in the art will understand that the embodiments of the present invention may be provided as a method, an apparatus (device) or a computer program product. Therefore, the embodiments of the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) that contain computer-usable program code.
The embodiments of the present invention are described with reference to flowcharts and/or block diagrams of the method, apparatus (device) and computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to work in a particular manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, and the instruction means implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operation steps are executed on the computer or other programmable device to produce computer-implemented processing, and the instructions executed on the computer or other programmable device thus provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.

Claims (10)

1. A speech evaluation method, characterized by comprising:
generating, according to text data corresponding to voice data to be evaluated, a vector to be evaluated of the voice data to be evaluated;
calculating, according to a preset standard content vector, the vector to be evaluated and a similarity calculation model, the similarity between the preset standard content vector and the vector to be evaluated, wherein the similarity calculation model is used to calculate the similarity according to the cosine value of the preset standard content vector and the vector to be evaluated, and to make the calculated similarity a numerical value greater than 0;
generating and outputting evaluation result data of the voice data to be evaluated according to the similarity and a preset speech evaluation rule.
2. The method according to claim 1, characterized in that calculating, according to the preset standard content vector, the vector to be evaluated and the similarity calculation model, the similarity between the preset standard content vector and the vector to be evaluated comprises:
using the preset standard content vector and the vector to be evaluated as the inputs of the similarity calculation model, and calculating the similarity by the similarity calculation model.
3. The method according to claim 1, characterized in that the similarity is a numerical value greater than 0 and less than or equal to 1.
4. The method according to claim 1, characterized in that the similarity calculation model comprises:
where x_i denotes the vector to be evaluated, x_j denotes the standard content vector corresponding to the vector to be evaluated, and score denotes the similarity.
5. The method according to any one of claims 1 to 4, characterized in that generating, according to the text data corresponding to the voice data to be evaluated, the vector to be evaluated of the voice data to be evaluated comprises:
recognizing the voice data to be evaluated using a speech recognition model, and generating the text data corresponding to the voice data to be evaluated;
performing vectorization processing on the text data by a text vector calculation model, and generating the vector to be evaluated of the voice data to be evaluated.
6. The method according to claim 5, characterized in that recognizing the voice data to be evaluated using the speech recognition model and generating the text data corresponding to the voice data to be evaluated comprises:
obtaining the voice data to be evaluated;
transcoding the voice data to be evaluated, and generating the transcoded voice data to be evaluated;
using the transcoded voice data to be evaluated as the input of the speech recognition model, and generating, by the speech recognition model, the text data corresponding to the voice data to be evaluated.
7. The method according to claim 5, characterized in that performing vectorization processing on the text data by the text vector calculation model and generating the vector to be evaluated of the voice data to be evaluated comprises:
preprocessing the text data, and generating result data according to the preprocessing result, wherein the result data includes word segmentation data indicating multiple segmented words in the text data;
using the word segmentation data of each segmented word as the input of the text vector calculation model, and generating, by the text vector calculation model, the vector to be evaluated of the voice data to be evaluated.
8. The method according to claim 7, characterized in that the preprocessing includes dirty data removal, word segmentation and stop word removal;
preprocessing the text data and generating result data according to the preprocessing result comprises:
removing dirty data from the text data to obtain valid text data;
performing word segmentation on the valid text data to obtain the word segmentation data of multiple segmented words in the valid text data;
removing stop words from the word segmentation data of the multiple segmented words to obtain the result data.
9. A computer storage medium, characterized in that the computer storage medium stores: an instruction for generating, according to text data corresponding to voice data to be evaluated, a vector to be evaluated of the voice data to be evaluated; an instruction for calculating, according to a preset standard content vector, the vector to be evaluated and a similarity calculation model, the similarity between the preset standard content vector and the vector to be evaluated, wherein the similarity calculation model is used to calculate the similarity according to the cosine value of the preset standard content vector and the vector to be evaluated, and to make the calculated similarity a numerical value greater than 0; and an instruction for generating and outputting evaluation result data of the voice data to be evaluated according to the similarity and a preset speech evaluation rule.
10. The computer storage medium according to claim 9, characterized in that the instruction for calculating, according to the preset standard content vector, the vector to be evaluated and the similarity calculation model, the similarity between the preset standard content vector and the vector to be evaluated comprises: an instruction for using the preset standard content vector and the vector to be evaluated as the inputs of the similarity calculation model and calculating the similarity by the similarity calculation model.
CN201810259445.XA 2018-03-27 2018-03-27 Voice evaluation method and computer storage medium Active CN110322895B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810259445.XA CN110322895B (en) 2018-03-27 2018-03-27 Voice evaluation method and computer storage medium

Publications (2)

Publication Number Publication Date
CN110322895A true CN110322895A (en) 2019-10-11
CN110322895B CN110322895B (en) 2021-07-09

Family

ID=68109770

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810259445.XA Active CN110322895B (en) 2018-03-27 2018-03-27 Voice evaluation method and computer storage medium

Country Status (1)

Country Link
CN (1) CN110322895B (en)

Also Published As

Publication number Publication date
CN110322895B (en) 2021-07-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant