CN110136721A - A scoring generation method, apparatus, storage medium and electronic device - Google Patents
A scoring generation method, apparatus, storage medium and electronic device
- Publication number
- CN110136721A CN110136721A CN201910280448.6A CN201910280448A CN110136721A CN 110136721 A CN110136721 A CN 110136721A CN 201910280448 A CN201910280448 A CN 201910280448A CN 110136721 A CN110136721 A CN 110136721A
- Authority
- CN
- China
- Prior art keywords
- voice
- assessment
- text
- standard pronunciation
- curve
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Abstract
One or more embodiments of the present application disclose a scoring generation method, apparatus, storage medium and server. The method includes: collecting an input assessment speech set; obtaining the assessment text corresponding to the assessment speech set; comparing the assessment text with a sample text to obtain the difference text information between the assessment text and the sample text; and scoring the assessment speech set based on the difference text information to generate the score corresponding to the assessment speech set. With the one or more embodiments of the present application, the accuracy of speech assessment can be improved.
Description
Technical field
This application relates to the field of computer technology, and in particular to a scoring generation method, apparatus, storage medium and electronic device.
Background technique
With the development of the times and the increasingly obvious trend of global integration, more and more people wish to learn and master one or more foreign languages fluently in order to communicate more easily. In foreign language learning, spoken proficiency is especially important.
At present, users mostly practice speaking with computer-assisted language learning systems, which include a spoken-language scoring unit that can determine whether the user's pronunciation is accurate. However, such a scoring unit can only score the standard degree of the words the user has actually pronounced; once the user omits words and/or adds redundant words while practicing a long dialogue or a long text, the dialogue or text can no longer be scored accurately, which reduces the accuracy of speech assessment.
Summary of the invention
The embodiments of the present application provide a scoring generation method, apparatus, storage medium and electronic device that can improve the accuracy of speech scoring. The technical solution is as follows:
In a first aspect, an embodiment of the present application provides a scoring generation method, the method comprising:
collecting an input assessment speech set, and obtaining the assessment text corresponding to the assessment speech set;
comparing the assessment text with a sample text to obtain the difference text information between the assessment text and the sample text;
scoring the assessment speech set based on the difference text information to generate the score corresponding to the assessment speech set.
In a second aspect, an embodiment of the present application provides a scoring generation apparatus, the apparatus comprising:
a text obtaining module, configured to collect an input assessment speech set and obtain the assessment text corresponding to the assessment speech set;
an information obtaining module, configured to compare the assessment text with a sample text and obtain the difference text information between the assessment text and the sample text;
a score generation module, configured to score the assessment speech set based on the difference text information and generate the score corresponding to the assessment speech set.
In a third aspect, an embodiment of the present application provides a computer storage medium storing a plurality of instructions adapted to be loaded by a processor to execute the above method steps.
In a fourth aspect, an embodiment of the present application provides an electronic device, which may include a processor and a memory, wherein the memory stores a computer program adapted to be loaded by the processor to execute the above method steps.
The beneficial effects brought by the technical solutions provided in some embodiments of the present application include at least the following:
In one or more embodiments of the present application, a user terminal collects an input assessment speech set, obtains the assessment text corresponding to the assessment speech set, compares the assessment text with a sample text, obtains the difference text information between the assessment text and the sample text, and finally scores the assessment speech set based on the difference text information to generate the corresponding score. By comparing the obtained assessment text with the sample text to obtain the difference text information and then scoring according to that information, the speech assessment set input by the user can be scored accurately, thereby improving the accuracy of speech assessment.
Brief description of the drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application or in the prior art, the accompanying drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description are only some embodiments of the present application; for those of ordinary skill in the art, other drawings can also be obtained from these drawings without creative effort.
Fig. 1 is a schematic diagram of the scene architecture of scoring generation provided by one or more embodiments;
Fig. 2 is a schematic flowchart of the scoring generation method provided by one or more embodiments;
Fig. 3 is a schematic diagram of an interface of the scoring generation method provided by one or more embodiments;
Fig. 4a is a schematic diagram of the user verification interface of the scoring generation method provided by one or more embodiments;
Fig. 4b is a schematic diagram of the test prompt interface of the scoring generation method provided by one or more embodiments;
Fig. 4c is a schematic diagram of the recording start interface of the scoring generation method provided by one or more embodiments;
Fig. 4d is a schematic diagram of the speech collection interface of the scoring generation method provided by one or more embodiments;
Fig. 4e is a schematic diagram of the assessment submission interface of the scoring generation method provided by one or more embodiments;
Fig. 4f is a schematic diagram of the successful submission interface of the scoring generation method provided by one or more embodiments;
Fig. 5 is a schematic flowchart of the scoring generation method provided by one or more embodiments;
Fig. 6 is a schematic diagram of an interface of the scoring generation method provided by one or more embodiments;
Fig. 7 is a schematic diagram of an interface of the scoring generation method provided by one or more embodiments;
Fig. 8 is a schematic structural diagram of the scoring generation apparatus provided by one or more embodiments;
Fig. 9 is a schematic structural diagram of the set acquisition module in the scoring generation apparatus provided by one or more embodiments;
Fig. 10 is a schematic structural diagram of the electronic device provided by one or more embodiments.
Detailed description of the embodiments
The technical solutions in the embodiments of the present application will be described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present application. Based on the embodiments in the present application, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present application.
In the description of the present application, it should be understood that the terms "first", "second", etc. are used for descriptive purposes only and cannot be interpreted as indicating or implying relative importance. It should also be noted that, unless otherwise specified and limited, "comprising" and "having" and any variations thereof are intended to cover a non-exclusive inclusion: a process, method, system, product or device containing a series of steps or units is not limited to the listed steps or units, but may optionally also include steps or units not listed, or other steps or units inherent to the process, method, product or device. For those of ordinary skill in the art, the specific meanings of the above terms in the present application can be understood according to the specific situation. In addition, unless otherwise indicated, "multiple" means two or more. "And/or" describes an association relationship between associated objects and indicates that three relationships may exist; for example, "A and/or B" can mean: A alone, both A and B, or B alone. The character "/" generally indicates an "or" relationship between the preceding and following objects.
The present application is described in detail below with reference to specific embodiments.
Referring to Fig. 1, which is an architecture diagram of a scoring generation system provided by an embodiment of the present application. As shown in Fig. 1, the scoring generation system may include a user 110 and a scoring generation apparatus 120. The scoring generation apparatus 120 may be an electronic device, including but not limited to: a personal computer, a tablet computer, a handheld device, an in-vehicle device, a wearable device, a computing device, or another processing device connected to a wireless modem. User terminals may be called different names in different networks, for example: user equipment, access terminal, subscriber unit, subscriber station, mobile station, remote station, remote terminal, mobile device, user terminal, terminal, wireless communication device, user agent or user apparatus, cellular phone, cordless phone, personal digital assistant (PDA), or a terminal device in a 5G network or a future evolved network. The scoring generation apparatus may also be a server with a scoring processing function.
For convenience, the scoring generation apparatus is described as a user terminal in the embodiments of the present application.
As shown in Fig. 1, the user 110 inputs an assessment speech instruction to the user terminal 120. After receiving the instruction, the user terminal 120 responds to it, loads the preset sample text, and displays the sample text on the display screen.
The user 110 then starts to input assessment speech according to the sample text on the display screen.
At this point, the user terminal 120 can collect the assessment speech input by the user 110 through a built-in or external audio collection device, which can be one or more microphones. When there are multiple microphones, they can be distributed at different positions to form a microphone array; the user terminal obtains the assessment speech collected by each microphone through the array and merges the speech collected on the multiple channels into a high-fidelity assessment speech set.
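The multi-channel merging step above can be pictured as follows. This is only a minimal sketch: the patent does not specify a fusion algorithm, so plain sample averaging stands in for real beamforming, and the channel data is invented for illustration.

```python
def merge_channels(channels):
    """Fuse sample-aligned microphone channels by averaging.

    A real high-fidelity pipeline would use beamforming; averaging
    only shows the shape of the merge step.
    """
    n = min(len(c) for c in channels)  # truncate to the shortest channel
    return [sum(c[i] for c in channels) / len(channels) for i in range(n)]


# Two hypothetical channels from a bottom and a top microphone
merged = merge_channels([[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]])
# merged == [2.0, 3.0, 4.0]
```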
Optionally, when the audio collection device is external, it can transmit the collected assessment speech to the user terminal 120 in real time through a preset interface (such as a 3.5mm headphone jack, a USB interface, or Bluetooth). The user terminal 120 saves the assessment speech into the assessment speech set. The user terminal 120 may collect the user's assessment speech set several times and then select a final assessment speech set from the multiple sets according to a selection instruction of the user 110.
For example: user Xiao Ming wants to test his oral foreign-language level, so he opens the speech assessment software on his mobile phone and issues an assessment speech request by clicking the assessment button in the assessment interface. In response to the request, the phone displays the sample text and corresponding prompt information on the screen. The phone has 2 built-in microphones, located at the bottom and the top of the phone respectively; it collects Xiao Ming's assessment speech set through the 2 microphones, then filters and denoises the speech collected on the two microphone channels to obtain a high-fidelity assessment speech set, and saves it.
The user terminal 120 obtains the difference information between the assessment text and the sample text.
Specifically, the user terminal 120 takes the current assessment speech in the assessment speech set, obtains the assessment speech curve corresponding to the current assessment speech, then obtains the similarity set between the assessment speech curve and each standard pronunciation curve in a standard pronunciation curve set, and then obtains the maximum similarity in the similarity set and the target standard pronunciation curve indicated by that maximum. The target standard pronunciation corresponding to the target standard pronunciation curve is then determined as the standard pronunciation corresponding to the current assessment speech.
After processing the current assessment speech, the user terminal 120 obtains the next assessment speech, determines it as the current assessment speech, and again executes the steps of obtaining the assessment speech curve corresponding to the current assessment speech and obtaining the similarity set between that curve and each standard pronunciation curve in the standard pronunciation curve set.
If the user terminal 120 detects that there is no next assessment speech, it generates the standard pronunciation set containing the determined standard pronunciations: the standard pronunciation corresponding to each assessment speech is combined into the standard pronunciation set in the order of the assessment speeches, and the assessment text corresponding to the standard pronunciation set is determined based on the correspondence between standard pronunciation information and text information.
Here, the standard pronunciation refers to preset speech stored in the standard pronunciation information bank; the speech information can be the pitch, intensity, duration, timbre, etc. of the speech. The text information refers to the text corresponding to the speech information. It should be emphasized that the text information here corresponds to the standard pronunciation information; it can be understood that what is stored in the standard pronunciation bank is standard pronunciation information together with the text information corresponding to each piece of standard pronunciation information, usually complete, meaningful words, sentences, paragraphs, etc. For example: the text information corresponding to the speech "Today" is the written word "Today", the text information corresponding to the speech "good" is the written word "good", the text information corresponding to the speech "good day" is the written phrase "good day", and the text information corresponding to the speech "Today is a good day" is the written sentence "Today is a good day", and so on.
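The correspondence between standard pronunciation information and text information can be pictured as a simple lookup table. A minimal sketch with hypothetical keys; a real bank would index acoustic features rather than string identifiers:

```python
# Hypothetical standard pronunciation bank: pronunciation id -> text information
STANDARD_BANK = {
    "pron_today": "Today",
    "pron_good": "good",
    "pron_good_day": "good day",
}


def assessment_text(standard_ids):
    """Concatenate the text information of each matched standard
    pronunciation, in the order of the assessment speeches."""
    return " ".join(STANDARD_BANK[i] for i in standard_ids)


text = assessment_text(["pron_today", "pron_good"])
# text == "Today good"
```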
The user terminal 120 then compares the assessment text with the sample text to obtain the difference text information between the assessment text and the sample text.
The difference text information includes the amount of difference text, the difference text content, and the attributes of the difference text content (such as the part of speech and importance of the difference word, sentence or paragraph).
The difference text content may include the text in which the assessment text is inconsistent with the words, sentences or paragraphs of the sample text; this text may include certain words, sentences or paragraphs of the sample text, and may also include words, sentences or paragraphs added by the user when inputting the assessment speech set.
The user terminal 120 inputs the difference text information into a scoring model for scoring according to preset scoring rules and outputs the score corresponding to the assessment speech set; the user terminal 120 then displays a scoring report containing this score to the user 110.
In one or more embodiments, the user terminal collects the input assessment speech set, obtains the assessment text corresponding to the assessment speech set, compares the assessment text with the sample text, obtains the difference text information between the assessment text and the sample text, and scores the assessment speech set based on the difference text information to generate the corresponding score. By comparing the obtained assessment text with the sample text to obtain the difference text information and then scoring according to that information, the speech assessment set input by the user can be scored accurately, thereby improving the accuracy of speech assessment.
The scoring generation method provided by the embodiments of the present application is described in detail below with reference to Fig. 2 to Fig. 7. The scoring generation apparatus in the embodiments of the present application can be the user terminal shown in Fig. 1.
Referring to Fig. 2, a schematic flowchart of a scoring generation method is provided for the embodiments of the present application. The method can be implemented by a computer program and run on a scoring generation apparatus based on the von Neumann architecture. The computer program can be integrated into an application or run as an independent tool application. The scoring generation apparatus in the embodiments of the present application can be the user terminal described in Fig. 1 or the above embodiments.
Specifically, the scoring generation method includes:
S101: collect the input assessment speech set, and obtain the assessment text corresponding to the assessment speech set.
Here, the sample text is a reference text provided by the user terminal; it can be a sentence, paragraph, article, etc. composed of characters/words. When inputting assessment speech, the user inputs the corresponding assessment speech set according to the sample text provided by the user terminal.
For example, the user terminal displays the sample text "Today is a good day" on the screen; the user inputs the assessment speech set corresponding to "Today is a good day" according to the displayed sample text, and the user terminal collects the input assessment speech set in real time.
Specifically, the assessment speech set refers to the assessment speech input by the user according to the sample text provided by the user terminal. The set may include at least one assessment speech, and each assessment speech can be understood as an assessment word or phrase. For example, for the assessment speech set "Today is a good day", the pronunciations corresponding to "Today", "is", "a", "good" and "day" are the assessment speeches in the set. Each assessment speech corresponds to an assessment word or phrase, and the assessment text includes multiple assessment words or phrases.
Specifically, in response to the user's assessment speech instruction, the user terminal displays the sample text on the screen and prompts the user to input assessment speech through the microphone; the user completes the input of the assessment speech with reference to the sample text.
Optionally, the user's assessment speech instruction can be completed through an external device: for example, the user can click the assessment button on the display interface with a mouse connected to the user terminal, input the corresponding command through a keyboard or touchpad connected to the user terminal, issue the assessment instruction by voice, complete the operation through gesture control instructions captured by a camera, or select the assessment button by touching the user terminal screen. It should be noted that there are many ways to complete the operation, which are not specifically limited here.
Optionally, the user terminal collects the assessment speech set input by the user through one or more built-in or external microphones. When there are multiple microphones, their placement positions can be designed according to actual needs, for example placed at different angles, so as to collect a better assessment speech set. After the assessment time ends or the user triggers a submit instruction, the user terminal saves the collected assessment speech set.
Optionally, when collecting the user's assessment speech set through a microphone, the user terminal has an effective voice collection distance range, within which the voice collected from the user can be recognized.
In a possible embodiment, the user terminal monitors the assessment speech input by the user in real time and judges whether the user is within the effective voice collection distance range. When the user is outside the effective voice collection range, the user terminal displays prompt information instructing the user to adjust the relative distance to the user terminal.
For example, the effective voice collection range of the user terminal's microphone is 0-30cm, and the user inputs the assessment speech set at a position 35cm away from the microphone. The user terminal monitors in real time that the user is too far away and not within the effective voice collection range, so it displays the text prompt "Too far away, please adjust the distance to the microphone" on the display screen as shown in Fig. 3; the user can then move closer to the microphone so that a better assessment speech set can be collected.
Optionally, the user terminal can collect the user's assessment speech set several times and then select a final assessment speech set from the multiple sets according to the user's selection instruction.
In a feasible implementation, the display interface of the user terminal can refer to Fig. 4a to Fig. 4f. The verification interface containing the user image shown in Fig. 4a is a graphical interface including a face-image preview area and prompt information; it can be used to prompt the speech assessment taker to input a face image using the front camera of the mobile phone and place the face image in the preview area. When the user terminal detects that the face image currently in the preview area matches the preset face image of the assessment taker, the user's identity is verified successfully, which triggers the next step of collecting the assessment speech set.
Further, the display interface of the user terminal can also include the pre-test prompt interface shown in Fig. 4b, which contains the relevant information of this speech test and a test confirmation button, such as the test duration, the test procedure, and specific points for attention. When the current page of the user terminal detects a click on the confirmation button, it displays, as shown in Fig. 4c, a graphical interface containing the sample text, a test-start prompt, and a speech assessment start button; the start button is a control on the graphical interface used to trigger the formal start of the assessment, after which the user inputs the corresponding assessment speech set according to the sample text.
Further, the user inputs the assessment speech set through the user terminal. During the input, the user terminal can display page content such as Fig. 4d on the display interface; the page content includes the total time of the current assessment input, and showing the total time makes it convenient for the user to reasonably control the time schedule.
In a possible embodiment, the graphical interface displayed by the user terminal includes a completion button for the speech assessment. When the user terminal detects a trigger action on the completion button, it saves the collected assessment speech set for processing by the user terminal's processor. To prevent the user from triggering the completion instruction by mistake, a trigger action of higher complexity can be set; for example, if the user wants to end the assessment in advance before the assessment time is over, the submit instruction is generated successfully only if the user clicks the submit button three times in a row within 3 seconds. A submission confirmation process can also be set to avoid misoperation: referring to Fig. 4d, the user clicks the completion button to submit the speech assessment set, which triggers the user terminal to display the submission confirmation information of Fig. 4e on the display interface; when the user wants to submit the current speech assessment set, clicking confirm submits it, and the user terminal then displays the assessment waiting interface shown in Fig. 4f, which includes information about this speech assessment set and prompt information, such as the assessment duration.
Optionally, while the user terminal is collecting the speech assessment set input by the user, the quality of the collected speech may be affected by interference factors such as ambient noise and echo. In actual implementation, the speech collected by the microphone array can be preprocessed; the preprocessing includes endpoint detection, noise reduction, and beamforming. The preprocessed speech is post-filtered to eliminate residual noise, and the energy of the collected speech is then adjusted by an automatic gain algorithm. Finally, the user terminal saves the processed speech set, and the processor of the user terminal performs speech recognition on the saved assessment speech set to convert it into the corresponding assessment text.
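Of the preprocessing steps named above, endpoint detection is the simplest to sketch. A crude energy-threshold version, assuming normalized float samples and an invented frame size and threshold (real systems use far more robust voice-activity detection):

```python
def trim_silence(samples, frame=160, threshold=0.01):
    """Crude endpoint detection: drop leading and trailing frames whose
    mean absolute amplitude falls below the threshold."""
    frames = [samples[i:i + frame] for i in range(0, len(samples), frame)]

    def active(fr):
        return sum(abs(x) for x in fr) / len(fr) >= threshold

    start = next((i for i, fr in enumerate(frames) if active(fr)), len(frames))
    end = next((i for i in range(len(frames) - 1, -1, -1) if active(frames[i])), -1)
    out = []
    for fr in frames[start:end + 1]:
        out.extend(fr)
    return out


# Silence, 0.5-amplitude speech, silence -> only the middle frame survives
trimmed = trim_silence([0.0] * 160 + [0.5] * 160 + [0.0] * 160)
# trimmed == [0.5] * 160
```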
S102: compare the assessment text with the sample text to obtain the difference text information between the assessment text and the sample text.
Here, the difference text refers to the text in which the words, sentences or paragraphs of the assessment text are inconsistent with those of the sample text; this text may include certain words, sentences or paragraphs of the sample text, and may also include words, sentences or paragraphs added by the user when inputting the assessment speech set. The difference text information includes the amount of difference text, the difference text content, and the attributes of the difference text content (such as the part of speech and importance of the difference word, sentence or paragraph).
Specifically, the user terminal compares the converted assessment text with the sample text and records the information of the inconsistent difference text during the comparison, so that the user terminal can score the assessment speech set based on the difference text information and generate the corresponding score.
For example, the user terminal obtains the assessment text by performing speech recognition on the collected assessment speech set.
The assessment text includes, for example, the following information:
it was the of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness.
The sample text includes, for example, the following information:
it was the best of times, it was the worst of times, it was the age of wisdom, it was the age of foolishness.
Through comparison, the difference text is obtained as: the omitted word "best" and the redundant word "the".
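The word-level comparison can be sketched with standard sequence alignment. The patent does not name an alignment algorithm, so Python's `difflib` is used as an assumption; the spoken text below is adjusted to contain an extra "the" so that both the omission and the redundancy cases are visible.

```python
import difflib


def diff_texts(spoken, sample):
    """Word-level comparison of the assessment text against the sample text.

    Returns (missed, redundant): sample words that were not spoken, and
    spoken words that are not in the sample.
    """
    s_words, a_words = sample.split(), spoken.split()
    missed, redundant = [], []
    matcher = difflib.SequenceMatcher(None, s_words, a_words)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("delete", "replace"):
            missed.extend(s_words[i1:i2])      # omitted words
        if tag in ("insert", "replace"):
            redundant.extend(a_words[j1:j2])   # redundant words
    return missed, redundant


sample = "it was the best of times it was the worst of times"
spoken = "it was the of times it was the the worst of times"
missed, redundant = diff_texts(spoken, sample)
# missed == ["best"], redundant == ["the"]
```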
Optionally, the user terminal can add an attribute identifier to the difference text. The attribute identifier is used to record the attributes of the difference text, such as the position of the difference text in the sample text, its part of speech, and the type of the difference. The attribute identifier can be a group of binary random numbers generated according to a predetermined rule, or a long integer character constant string generated according to a predetermined rule; this is not specifically limited here. For the convenience of illustration, the following uses a generated group of binary codes for representation.
In one feasible implementation, each difference text has an attribute identifier, and each attribute identifier is represented by a group of binary codes, here 10 bits. The first two bits indicate omission or redundancy, specifically: 01 (omission), 10 (redundancy); the 3rd and 4th bits indicate the difference type, specifically: 01 (word), 10 (sentence), 11 (paragraph); and the 5th to 10th bits indicate the specific position parameter of the difference text.
For example, the above difference text information can be recorded as shown in Table 1.
Table 1
Difference text | Attribute identifier
best            | 0101000111
the             | 1001001100
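Reading the identifiers as 10-bit codes (2 bits for omission/redundancy, 2 bits for the difference type, and, as an assumption for illustration, 6 bits for the position parameter), the values in Table 1 can be reproduced as follows:

```python
DIFF_KIND = {"omission": "01", "redundancy": "10"}
DIFF_TYPE = {"word": "01", "sentence": "10", "paragraph": "11"}


def attribute_id(kind, dtype, position):
    """Build a 10-bit attribute identifier: 2 bits kind, 2 bits type,
    6 bits position (the 6-bit position width is an assumption)."""
    return DIFF_KIND[kind] + DIFF_TYPE[dtype] + format(position, "06b")


# An omitted word at position 7, a redundant word at position 12
omitted = attribute_id("omission", "word", 7)      # "0101000111"
redundant = attribute_id("redundancy", "word", 12)  # "1001001100"
```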
S103: score the assessment speech set based on the difference text information to generate the score corresponding to the assessment speech set.
Specifically, the user terminal can input the difference text information into a scoring model for scoring according to preset scoring rules and output the score corresponding to the assessment speech set. The user terminal generates a scoring report containing this assessment score and displays it on the display interface. The assessment report includes the omitted or redundant words in the user's assessment speech and voice attribute information, which includes but is not limited to: speech rate information, prosody information, and timbre. Further, the speech assessment report can also include the standard pronunciation and an evaluation of this speech.
Optionally, the rating model can be trained with a large number of test samples. For example, the rating model can be implemented based on at least one of a convolutional neural network (Convolutional Neural Network, CNN) model, a deep neural network (Deep Neural Network, DNN) model, a recurrent neural network (Recurrent Neural Networks, RNN) model, an embedding model, a gradient boosting decision tree (Gradient Boosting Decision Tree, GBDT) model and a logistic regression (Logistic Regression, LR) model. By training the rating model on labeled sample data, a trained rating model can be obtained.
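The embodiment leaves the choice of rating model open. Purely as an illustration of the simplest option named above, a logistic-regression-style rating over hand-made difference features could look like the following sketch; the features, weights and bias are hypothetical stand-ins, not trained parameters from this application.

```python
import math

def rating(features, weights, bias):
    """Logistic-regression-style rating: squash a weighted sum of
    difference features into a score in the open interval (0, 100)."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 100.0 / (1.0 + math.exp(-z))

# Hypothetical features: [omission count, redundancy count, error count].
features = [2.0, 2.0, 1.0]
# Hypothetical parameters: every kind of difference pushes the score down.
weights = [-0.6, -0.4, -0.9]
bias = 4.0

print(round(rating(features, weights, bias), 1))  # 75.0
```

In a real system the weights and bias would come from training on the labeled sample data mentioned above rather than being set by hand.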
In one or more embodiments, the user terminal collects the input assessment voice set, obtains the assessment text corresponding to the assessment voice set, compares the assessment text with the sample text to obtain the difference text information between the assessment text and the sample text, performs scoring processing on the assessment voice set based on the difference text information, and generates the score corresponding to the assessment voice set. By comparing the acquired assessment text with the sample text to obtain the difference text information and then scoring according to the difference text information, the voice assessment set input by the user can be scored accurately, thereby improving the accuracy of speech assessment.
Refer to Fig. 5, which is a flow diagram of another embodiment of the scoring generation method proposed by this application. Specifically:
S201: collect the input assessment voice set, and obtain, in the standard voice information bank, the current assessment voice in the assessment voice set.
For the specific collection method, refer to step S101; it is not repeated here.
Specifically, voice refers to the sound of a language and is the carrier of the linguistic symbol system. The assessment voice collected by the user terminal is actually a signal wave. When starting to recognize the assessment voice set, the user terminal needs to preprocess the collected signal wave and divide it into frames, at which point the voice has become many small segments. The voice signal waveform is then transformed; a common method is to extract Mel-frequency cepstral coefficient (Mel Frequency Cepstral Coefficients, MFCC) feature information. According to the physiological characteristics of the human ear, each frame of the signal waveform becomes a multi-dimensional vector, which can simply be understood as containing the content information of that frame of voice, that is, the content information of the assessment voice. It should be noted that MFCC is not the only method for extracting feature information. After the above process, the collected assessment voice becomes a multi-dimensional vector matrix.
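The framing step described above can be sketched as follows; the frame and hop lengths are typical 25 ms / 10 ms values at a 16 kHz sampling rate, assumed here for the example rather than fixed by the embodiment. Each resulting frame would then be converted into an MFCC feature vector, yielding the multi-dimensional vector matrix described above.

```python
def frame_signal(samples, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping frames (400 samples = 25 ms and
    160 samples = 10 ms at 16 kHz). Trailing samples that cannot fill a
    whole frame are dropped."""
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames

# A 1-second signal of 16000 samples yields 98 overlapping frames:
signal = [0.0] * 16000
frames = frame_signal(signal)
print(len(frames), len(frames[0]))  # 98 400
```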
The assessment voice set is essentially a set composed of the pronunciations of multiple words, and the pronunciation of a word is composed of phonemes. Taking English as an example, a common phone set is composed of 39 phonemes, and a phoneme is usually divided into 3 states, where a state refers to a phonetic unit. After the signal wave is divided into frames, several frames correspond to one state, every 3 states constitute one phoneme, and several phonemes constitute one word.
The standard voice information bank is an information bank storing word pronunciations. The user terminal collects the assessment voice set, obtains, in the standard voice information bank, the standard voice corresponding to the first word pronunciation in the assessment voice set, and takes the acquired standard voice as the current assessment voice.
S202: obtain the assessment voice curve corresponding to the current assessment voice, and obtain the similarity set between the assessment voice curve and each standard voice curve in the standard voice curve set.
Specifically, the voice information includes the voice curve. Through the above standard voice information bank, an acoustic model and a language model can be connected. For English, for example, a sentence can be segmented into several connected words, and each word pronunciation corresponds to a phoneme sequence in the standard voice information bank. The phoneme sequence is obtained after the standard voice signal wave is divided into frames and transformed, and this voice signal wave is commonly referred to as the voice curve.
The user terminal collects the assessment voice set and obtains, in the standard voice information bank, the current assessment voice in the assessment voice set. Specifically, it obtains the assessment voice curve corresponding to the current assessment voice, then searches the standard voice curve set for standard voice curves having similarity with the assessment voice curve, and saves each similarity and its corresponding standard voice curve into the similarity set.
For example, after the above processing, the assessment voice set collected by the user terminal corresponds to 5 phoneme sequences, and each phoneme sequence corresponds to a segment of assessment voice curve. Suppose there are assessment voice curves A, B, C, D and E. The user terminal first obtains assessment voice curve A, and then searches the set storing the standard voice curves for standard voice curves similar to assessment voice curve A.
By searching, the user terminal finds 3 voice curves with similarity; the corresponding similarity relationships are shown in Table 2.
Table 2
Title | Similarity | State |
Standard voice curve 1 | 80 | Normal |
Standard voice curve 2 | 60 | Normal |
Standard voice curve 3 | 40 | Normal |
The state in the table indicates whether the voice curve with similarity can be acquired by the terminal.
S203: obtain the maximum similarity value in the similarity set, obtain the target standard voice curve indicated by the maximum similarity value, and determine the target standard voice corresponding to the target standard voice curve as the standard voice corresponding to the current assessment voice.
Specifically, the similarity set records the similarity values of multiple standard voice curves. For Table 2 above, there are 3 standard voice curves in the similarity set, with corresponding similarities 80, 60 and 40, and all states are normal, so the user terminal can acquire the standard voice curves in the table. The user terminal first traverses the similarity set and finds that the title corresponding to the maximum similarity value 80 is standard voice curve 1. It then finds standard voice curve 1 in the standard voice curve set and takes standard voice curve 1 as the target standard voice curve. At this point, the voice corresponding to the target standard voice curve is the standard voice corresponding to the current assessment voice.
Specifically, the current assessment voice often suffers from interference such as noise during collection; after the above steps are performed, the current assessment voice can be converted into the standard voice.
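Steps S202 and S203 can be sketched together as follows. Cosine similarity scaled to 0-100 stands in for whatever curve-similarity measure the terminal actually uses, and the curves are short hypothetical vectors.

```python
def similarity(a, b):
    """Cosine similarity between two curves, scaled to the 0-100 range."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(y * y for y in b) ** 0.5
    return 100.0 * dot / (na * nb) if na and nb else 0.0

def best_standard_curve(assessment_curve, standard_curves):
    """Build the similarity set, then return the name of the standard voice
    curve with the maximum similarity, together with the whole set."""
    sims = {name: similarity(assessment_curve, curve)
            for name, curve in standard_curves.items()}
    return max(sims, key=sims.get), sims

curve_a = [0.1, 0.8, 0.3, 0.5]
standard_curves = {
    "standard voice curve 1": [0.1, 0.8, 0.3, 0.5],  # closest to curve A
    "standard voice curve 2": [0.5, 0.5, 0.5, 0.5],
    "standard voice curve 3": [0.9, 0.1, 0.1, 0.1],
}
name, sims = best_standard_curve(curve_a, standard_curves)
print(name)  # standard voice curve 1
```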
S204: obtain the next assessment voice after the current assessment voice, determine the next assessment voice as the current assessment voice, and perform the step of obtaining the assessment voice curve corresponding to the current assessment voice and obtaining the similarity set between the assessment voice curve and each standard voice curve in the standard voice curve set.
Optionally, after finding the standard voice curve of the current assessment voice, the user terminal obtains the next assessment voice. For the example above, after the user terminal finds the standard voice corresponding to assessment voice curve A, it takes the next assessment voice as the current assessment voice: specifically, it searches for the standard voice curve similar to assessment voice curve B of the current assessment voice, and then acquires the standard voice corresponding to that standard voice curve. For the specific search steps, refer to S202; they are not repeated here.
S205: when it is detected that there is no next assessment voice, generate the standard voice set containing the standard voices.
Specifically, when the user terminal detects that all assessment voices have been searched, that is, there is no next assessment voice, the user terminal saves all the acquired standard voices into the standard voice set.
For the example above, after the user terminal has searched assessment voice curves A, B, C, D and E, it saves the 5 corresponding standard voices acquired into the standard voice set.
S206: combine, based on the sequence of the assessment voices, the standard voices corresponding to the assessment voices into the standard voice set.
Optionally, the user terminal assigns a priority parameter to each assessment voice curve; the priority parameter can run from large to small, from low to high, and so on. Suppose there are assessment voices A, B, C, D and E, and the user terminal sets the priority levels of the assessment voices as A > B > C > D > E. The standard voice corresponding to each assessment voice is acquired through the above steps, and the correspondence is shown in Table 3.
Table 3
Assessment voice | Standard voice |
A | 001 |
B | 002 |
C | 004 |
D | 003 |
E | 005 |
Optionally, the user terminal combines the standard voices 001, 002, 004, 003 and 005 respectively corresponding to the assessment voices into the standard voice set according to the priority levels of the assessment voices A > B > C > D > E, as shown in Table 4.
Table 4
Standard voice 001 | Standard voice 002 | Standard voice 004 | Standard voice 003 | Standard voice 005 |
Table 4 shows the combined standard voice set.
S207: determine, based on the correspondence between standard voice information and text information, the assessment text corresponding to the standard voice set.
Specifically, text information refers to the written representation of a language, usually a sentence or a combination of sentences with complete, systematic meaning. Taking English as an example, the text information can be a word, a sentence or a paragraph; it can be the written form of the language, and in specific implementations it usually refers to some written-language information. It should be emphasized that the text information here corresponds to the standard voice information: it can be understood that what is stored in the standard voice bank is standard voice information together with the text information corresponding to each piece of standard voice information, usually words, sentences, paragraphs and the like with complete, systematic meaning. For example, the text information corresponding to the voice "he" is the written word "he", the text information corresponding to the voice "likes" is the written word "likes", the text information corresponding to the voice "middle school" is the written phrase "middle school", and so on.
Specifically, the standard voice refers to voice set in advance, and these voices are stored in the standard voice information bank. The voice information can be the pitch, intensity, duration, timbre and so on of the voice: pitch refers to the frequency of the sound wave, that is, the number of vibrations per second; intensity refers to the amplitude of the sound wave; duration refers to the length of time the sound wave vibrates; timbre refers to the characteristic and essence of the sound, also called sound quality.
Optionally, the user terminal determines the corresponding assessment text from the standard voices based on the correspondence between standard voice information and text information; see Table 5, which shows one correspondence between standard voice information and text information.
Table 5
Standard voice information | Text information | Text type |
Standard voice 001 | he | Subject |
Standard voice 002 | likes | Verb |
Standard voice 003 | riding | Verb |
Standard voice 004 | a | Article |
Standard voice 005 | bike | Noun |
From the correspondence in the table above, the assessment text corresponding to the standard voice set can be determined as: He likes riding a bike.
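The lookup of S207 amounts to a mapping from standard voice to text information, joined in the spoken order of the assessment voices. A minimal sketch (the ordering below is simply the one that yields the example sentence):

```python
# Table 5 as a lookup from standard voice to text information.
text_of = {
    "standard voice 001": "he",
    "standard voice 002": "likes",
    "standard voice 003": "riding",
    "standard voice 004": "a",
    "standard voice 005": "bike",
}

def to_assessment_text(ordered_voices):
    """Join the text of each standard voice into a capitalized sentence."""
    sentence = " ".join(text_of[v] for v in ordered_voices)
    return sentence[0].upper() + sentence[1:] + "."

order = ["standard voice 001", "standard voice 002", "standard voice 003",
         "standard voice 004", "standard voice 005"]
print(to_assessment_text(order))  # He likes riding a bike.
```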
S208: compare the assessment text with the sample text to obtain the difference text information between the assessment text and the sample text, the difference text information including the number of difference texts, the difference text content and the attributes of the difference text content.
Specifically, the sample text is text preset by the terminal. When the user terminal receives an assessment voice request, the sample text can be generated at random by the user terminal from a preset sample text bank; alternatively, as shown in Fig. 6, the user can select a sample text by touching the user terminal screen. The sample text corresponds to a sample voice curve.
For example, the sample text selected by the user is:
I have a good friend. She is a pretty girl. She lives in Jiujiang. She is a middle school student. She has big eyes, a small mouth, a small nose and a round face. She is tall and thin. She likes watching TV and playing the basketball. On the weekend, she always plays basketball with her friends in the afternoon and watches TV in the evening.
The assessment text is:
I have a good good friend. She is a pretty girl. She lives in Jiujiang. She is a middle student. She has big eyes, a small mouth, a nose and a round face. She is tall and thin. She likes watching TV and playing the basket. On the weekend, she always always plays basketball with her friends in the afternoon and watches TV in the evening.
By comparison, the user terminal generates the difference text information; the display form of the difference text information can be the representation shown in Table 6.
Table 6
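The word-level comparison of S208 can be sketched with Python's standard difflib: words present only in the sample text are omissions, words present only in the assessment text are redundancies, and spans that differ on both sides are errors. The snippet runs on shortened versions of the two texts above.

```python
import difflib

sample = ("I have a good friend . She is a middle school student . "
          "She has a small nose .").split()
assessment = ("I have a good good friend . She is a middle student . "
              "She has a nose .").split()

differences = []
matcher = difflib.SequenceMatcher(a=sample, b=assessment)
for op, a0, a1, b0, b1 in matcher.get_opcodes():
    if op == "delete":      # in the sample only -> omission
        differences += [("omission", w) for w in sample[a0:a1]]
    elif op == "insert":    # in the assessment only -> redundancy
        differences += [("redundancy", w) for w in assessment[b0:b1]]
    elif op == "replace":   # differing spans on both sides -> error
        differences += [("error", w) for w in assessment[b0:b1]]

for kind, word in differences:
    print(kind, word)
```

On these shortened texts the sketch reports the redundant "good" and the omitted "school" and "small", which is the kind of record Table 6 is meant to hold.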
S209: perform scoring processing on the assessment voice set based on the difference text information, and generate the score corresponding to the assessment voice set.
Specifically, the user terminal can input the difference text information into the rating model for scoring processing according to the preset scoring rule and output the score corresponding to the assessment voice set. The terminal generates a score report containing this assessment score and displays it on the display interface. The assessment report includes the omitted, erroneous and redundant words in the user's assessment speech and voice attribute information; the voice attribute information includes, but is not limited to, speech-rate information, prosody information and timbre. Further, the voice assessment report can also include the pronunciation of the standard voice.
Optionally, the preset scoring rule of the user terminal can be: set the total score to 100 points, and set a deduction base, difference-type deduction coefficients and part-of-speech deduction coefficients for the difference text.
For example: the deduction base is set to 1 point; there are 3 difference deduction types, with the coefficient of the redundancy type set to 2.0, the coefficient of the omission type set to 3.0, and the coefficient of the error type set to 4.0; the part-of-speech deduction coefficients for the difference text can be set as: noun 1.0, verb 2.0, adjective 2.0, interjection 1.5, adverb 2.0, and so on. The final score can then be calculated from the difference text information:

Final score = 100 - 1*2.0*2.0 - 1*1.0*3.0 - 1*2.0*3.0 - 1*1.0*4.0 - 1*2.0*2.0 = 79
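The deduction rule of the example can be written out directly: each difference deducts base × part-of-speech coefficient × difference-type coefficient. The part of speech assigned to each word below follows the factors of the worked formula (for example, "good" as a redundant adjective); that assignment is a reading of the example rather than something the embodiment states explicitly.

```python
BASE = 1.0
TYPE_COEF = {"redundancy": 2.0, "omission": 3.0, "error": 4.0}
POS_COEF = {"noun": 1.0, "verb": 2.0, "adjective": 2.0,
            "interjection": 1.5, "adverb": 2.0}

def final_score(differences):
    """Score = 100 minus one deduction per difference."""
    deduction = sum(BASE * POS_COEF[pos] * TYPE_COEF[kind]
                    for kind, pos, _word in differences)
    return 100.0 - deduction

# The five differences of the worked example above.
diffs = [
    ("redundancy", "adjective", "good"),
    ("omission", "noun", "school"),
    ("omission", "adjective", "small"),
    ("error", "noun", "basket"),
    ("redundancy", "adverb", "always"),
]
print(final_score(diffs))  # 79.0
```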
Optionally, the user terminal can obtain a corresponding speech test evaluation according to this voice assessment score. The user terminal generates a score report containing this assessment score and evaluation and displays it on the display interface, as shown in Fig. 7. The assessment report can include the omitted or redundant words in the user's assessment speech and voice attribute information; the voice attribute information includes, but is not limited to, speech-rate information, prosody information and timbre.
In one or more embodiments, the user terminal collects the input assessment voice set, obtains, in the standard voice information bank, the current assessment voice in the assessment voice set, obtains the assessment voice curve corresponding to the current assessment voice, and compares the assessment voice curve with each standard voice curve in the standard voice curve set to obtain the similarity set. It then obtains the target standard voice curve indicated by the maximum similarity value in the similarity set, determines the target standard voice corresponding to the target standard voice curve as the standard voice corresponding to the current assessment voice, and generates the standard voices corresponding to the other assessment voices in the assessment voice set in the same way. The standard voices corresponding to the assessment voices are then converted into the assessment text, and the assessment voice set is scored by comparing the assessment text with the sample text. By comparing the acquired assessment text with the sample text to obtain the difference text information and then scoring according to the difference text information, the voice assessment set input by the user can be scored accurately, thereby improving the accuracy of speech assessment.
The following are apparatus embodiments of this application, which can be used to perform the method embodiments of this application. For details not disclosed in the apparatus embodiments of this application, please refer to the method embodiments of this application.
Refer to Fig. 8, which shows a structural schematic diagram of the scoring generation apparatus provided by an exemplary embodiment of this application. The scoring generation apparatus can be implemented as all or part of a terminal by software, hardware or a combination of both. The apparatus 1 includes a text acquisition module 11, an information acquisition module 12 and a score generation module 13.
The text acquisition module 11 is used to collect the input assessment voice set and obtain the assessment text corresponding to the assessment voice set;
the information acquisition module 12 is used to compare the assessment text with the sample text to obtain the difference text information between the assessment text and the sample text;
the score generation module 13 is used to perform scoring processing on the assessment voice set based on the difference text information and generate the score corresponding to the assessment voice set.
Optionally, as shown in Fig. 9, the text acquisition module 11 can specifically include:
a voice set acquisition unit 110 for obtaining the standard voice set corresponding to the assessment voice set; and
an assessment text determination unit 111 for determining, based on the correspondence between standard voice information and text information, the assessment text corresponding to the standard voice set.
Optionally, the voice set acquisition unit 110 is specifically used to:
obtain, in the standard voice information bank, the standard voice corresponding to each assessment voice in the assessment voice set; and
combine, based on the sequence of the assessment voices, the standard voices corresponding to the assessment voices into the standard voice set.
Optionally, the voice set acquisition unit 110 is specifically used to:
obtain, in the standard voice information bank, the current assessment voice in the assessment voice set;
obtain the assessment voice curve corresponding to the current assessment voice, and obtain the similarity set between the assessment voice curve and each standard voice curve in the standard voice curve set;
determine, based on the similarity set, the standard voice corresponding to the current assessment voice;
obtain the next assessment voice after the current assessment voice, determine the next assessment voice as the current assessment voice, and perform the step of obtaining the assessment voice curve corresponding to the current assessment voice and obtaining the similarity set between the assessment voice curve and each standard voice curve in the standard voice curve set; and
when it is detected that there is no next assessment voice, generate the standard voice set containing the standard voices.
Optionally, the voice set acquisition unit 110 is specifically used to:
obtain the maximum similarity value in the similarity set; and
obtain the target standard voice curve indicated by the maximum similarity value, and determine the target standard voice corresponding to the target standard voice curve as the standard voice corresponding to the current assessment voice.
Optionally, in the apparatus 1, the difference text information includes the number of difference texts, the difference text content and the attributes of the difference text content.
It should be noted that, when the scoring generation apparatus provided by the above embodiments performs the scoring generation method, the division into the above functional modules is only an example; in practical applications, the above functions can be assigned to different functional modules as needed, that is, the internal structure of the device can be divided into different functional modules to complete all or part of the functions described above. In addition, the scoring generation apparatus provided by the above embodiments and the scoring generation method embodiments belong to the same concept; for the specific implementation process, see the method embodiments, which are not repeated here.
The serial numbers of the above embodiments of this application are for description only and do not represent the merits of the embodiments.
In one or more embodiments, the user terminal collects the input assessment voice set, obtains, in the standard voice information bank, the current assessment voice in the assessment voice set, obtains the assessment voice curve corresponding to the current assessment voice, and compares the assessment voice curve with each standard voice curve in the standard voice curve set to obtain the similarity set. It then obtains the target standard voice curve indicated by the maximum similarity value in the similarity set, determines the target standard voice corresponding to the target standard voice curve as the standard voice corresponding to the current assessment voice, and generates the standard voices corresponding to the other assessment voices in the assessment voice set in the same way. The standard voices corresponding to the assessment voices are then converted into the assessment text, and the assessment voice set is scored by comparing the assessment text with the sample text. By comparing the acquired assessment text with the sample text to obtain the difference text information and then scoring according to the difference text information, the voice assessment set input by the user can be scored accurately, thereby improving the accuracy of speech assessment.
The embodiments of this application also provide a computer storage medium. The computer storage medium can store multiple instructions, and the instructions are suitable for being loaded by a processor to perform the method steps of the embodiments shown in Figs. 1 to 7 above; for the specific execution process, see the description of the embodiments shown in Figs. 1 to 7, which is not repeated here.
This application also provides a computer program product. The computer program product stores at least one instruction, and the at least one instruction is loaded and executed by the processor to implement the scoring generation method described in each of the above embodiments.
Refer to Fig. 10, which is a structural schematic diagram of an electronic device provided by an embodiment of this application. As shown in Fig. 10, the server 1000 can include: at least one processor 1001, at least one network interface 1004, a user interface 1003, a memory 1005 and at least one communication bus 1002.
The communication bus 1002 is used to realize connection and communication between these components.
The user interface 1003 can include a display screen (Display) and a camera (Camera); optionally, the user interface 1003 can also include a standard wired interface and a wireless interface.
The network interface 1004 can optionally include a standard wired interface and a wireless interface (such as a Wi-Fi interface).
The processor 1001 can include one or more processing cores. The processor 1001 connects the various parts of the entire server 1000 using various interfaces and lines, and executes the various functions of the server 1000 and processes data by running or executing the instructions, programs, code sets or instruction sets stored in the memory 1005 and calling the data stored in the memory 1005. Optionally, the processor 1001 can be implemented in at least one hardware form of digital signal processing (Digital Signal Processing, DSP), field-programmable gate array (Field-Programmable Gate Array, FPGA) and programmable logic array (Programmable Logic Array, PLA). The processor 1001 can integrate one or a combination of a central processing unit (Central Processing Unit, CPU), a graphics processing unit (Graphics Processing Unit, GPU), a modem and the like. The CPU mainly handles the operating system, the user interface, application programs and so on; the GPU is responsible for rendering and drawing the content to be displayed on the display screen; the modem is used to handle wireless communication. It can be understood that the above modem may also not be integrated into the processor 1001 and may instead be implemented separately by a single chip.
The memory 1005 can include a random access memory (Random Access Memory, RAM) or a read-only memory (Read-Only Memory, ROM). Optionally, the memory 1005 includes a non-transitory computer-readable storage medium. The memory 1005 can be used to store instructions, programs, code, code sets or instruction sets. The memory 1005 can include a program storage area and a data storage area, where the program storage area can store instructions for implementing the operating system, instructions for at least one function (such as a touch function, a sound playing function, an image playing function, etc.), instructions for implementing each of the above method embodiments, and so on; the data storage area can store the data involved in each of the above method embodiments. Optionally, the memory 1005 can also be at least one storage device located remotely from the aforementioned processor 1001. As shown in Fig. 10, as a computer storage medium, the memory 1005 can include an operating system, a network communication module, a user interface module and a scoring generation application program.
In the server 1000 shown in Fig. 10, the user interface 1003 is mainly used to provide an input interface for the user and acquire the data input by the user, and the processor 1001 can be used to call the scoring generation application program stored in the memory 1005 and specifically perform the following operations:
collect the input assessment voice set, and obtain the assessment text corresponding to the assessment voice set;
compare the assessment text with the sample text to obtain the difference text information between the assessment text and the sample text; and
perform scoring processing on the assessment voice set based on the difference text information to generate the score corresponding to the assessment voice set.
In one embodiment, when performing the obtaining of the assessment text corresponding to the assessment voice set, the processor 1001 specifically performs the following operations:
obtain the standard voice set corresponding to the assessment voice set; and
determine, based on the correspondence between standard voice information and text information, the assessment text corresponding to the standard voice set.
In one embodiment, when performing the obtaining of the standard voice set corresponding to the assessment voice set, the processor 1001 specifically performs the following operations:
obtain, in the standard voice information bank, the standard voice corresponding to each assessment voice in the assessment voice set; and
combine, based on the sequence of the assessment voices, the standard voices corresponding to the assessment voices into the standard voice set.
In one embodiment, when performing the obtaining, in the standard voice information bank, of the standard voice corresponding to each assessment voice in the assessment voice set, the processor 1001 specifically performs the following operations:
obtain, in the standard voice information bank, the current assessment voice in the assessment voice set;
obtain the assessment voice curve corresponding to the current assessment voice, and obtain the similarity set between the assessment voice curve and each standard voice curve in the standard voice curve set;
determine, based on the similarity set, the standard voice corresponding to the current assessment voice;
obtain the next assessment voice after the current assessment voice, determine the next assessment voice as the current assessment voice, and perform the step of obtaining the assessment voice curve corresponding to the current assessment voice and obtaining the similarity set between the assessment voice curve and each standard voice curve in the standard voice curve set; and
when it is detected that there is no next assessment voice, generate the standard voice set containing the standard voices.
In one embodiment, when performing the determining, based on the similarity set, of the standard voice corresponding to the current assessment voice, the processor 1001 specifically performs the following operations:
obtain the maximum similarity value in the similarity set; and
obtain the target standard voice curve indicated by the maximum similarity value, and determine the target standard voice corresponding to the target standard voice curve as the standard voice corresponding to the current assessment voice.
In one embodiment, the difference text information includes the number of text differences, the differing text content, and the attributes of the differing text content.
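The three components of the difference text information (the number of differences, the differing content, and each difference's attribute) can be illustrated with Python's standard `difflib`. Treating difflib's opcode tags (`"replace"`, `"delete"`, `"insert"`) as the attribute of a difference is an assumption for illustration only; the publication does not define the attribute values.

```python
import difflib

def difference_text_info(assessment_text, sample_text):
    """Compare the assessment text with the sample text and return the
    difference text information: difference count, differing content,
    and an attribute (here, the difflib opcode tag) for each difference."""
    matcher = difflib.SequenceMatcher(None, sample_text, assessment_text)
    diffs = [
        {"attribute": tag,                    # assumed attribute: difflib opcode
         "sample": sample_text[i1:i2],        # differing content in the sample text
         "assessment": assessment_text[j1:j2]}  # differing content in the assessment text
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"                     # keep only the spans that differ
    ]
    return {"count": len(diffs), "differences": diffs}
```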
In one or more embodiments, the user terminal collects an input assessment speech set, acquires the current assessment speech in the assessment speech set from the standard speech information library, and acquires the assessment speech curve corresponding to the current assessment speech. The assessment speech curve is then compared with each standard speech curve in the standard speech curve set to obtain a similarity set. The target standard speech curve indicated by the maximum similarity value in the similarity set is acquired, and the target standard speech corresponding to the target standard speech curve is determined as the standard speech corresponding to the current assessment speech. The standard speeches corresponding to the other assessment speeches in the assessment speech set are generated in the same way. The standard speeches corresponding to the assessment speeches are converted into text to obtain the assessment text, and the assessment speech set is scored by comparing the assessment text with the sample text.
By comparing the acquired assessment text with the sample text to obtain difference text information and then scoring according to the difference text information, the assessment speech set input by the user can be scored accurately, thereby improving the accuracy of speech assessment.
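As a minimal sketch of the scoring step, the score can be derived from the overall similarity between the assessment text and the sample text. The full-score baseline of 100 and the use of difflib's `ratio` are illustrative assumptions, since the publication does not fix a particular scoring formula.

```python
import difflib

def generate_score(assessment_text, sample_text, full_score=100):
    """Score the assessment text against the sample text:
    fewer textual differences yield a score closer to full_score."""
    ratio = difflib.SequenceMatcher(None, sample_text, assessment_text).ratio()
    return round(full_score * ratio, 1)
```

For example, an assessment text identical to the sample text scores the full 100, while each differing span lowers the match ratio and hence the generated score.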
Those of ordinary skill in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing relevant hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. The storage medium may be a magnetic disk, an optical disc, a read-only memory, a random access memory, or the like.
The above disclosure is only a preferred embodiment of the application and certainly cannot limit the scope of the claims of the application; equivalent variations made according to the claims of the application therefore still fall within the scope covered by the application.
Claims (14)
1. A scoring generation method, characterized in that the method comprises:
collecting an input assessment speech set, and acquiring an assessment text corresponding to the assessment speech set;
comparing the assessment text with a sample text to obtain difference text information between the assessment text and the sample text; and
scoring the assessment speech set based on the difference text information to generate a score corresponding to the assessment speech set.
2. The method according to claim 1, characterized in that the acquiring of the assessment text corresponding to the assessment speech set comprises:
acquiring a standard speech set corresponding to the assessment speech set; and
determining, based on a correspondence between standard speech information and text information, the assessment text corresponding to the standard speech set.
3. The method according to claim 2, characterized in that the acquiring of the standard speech set corresponding to the assessment speech set comprises:
acquiring, from a standard speech information library, the standard speech corresponding to each assessment speech in the assessment speech set; and
combining, based on the order of the assessment speeches, the standard speeches corresponding to the assessment speeches into the standard speech set.
4. The method according to claim 3, characterized in that the acquiring, from the standard speech information library, of the standard speech corresponding to each assessment speech in the assessment speech set comprises:
acquiring the current assessment speech in the assessment speech set from the standard speech information library;
acquiring the assessment speech curve corresponding to the current assessment speech, and obtaining a similarity set between the assessment speech curve and each standard speech curve in a standard speech curve set;
determining, based on the similarity set, the standard speech corresponding to the current assessment speech;
acquiring the next assessment speech after the current assessment speech, determining the next assessment speech as the current assessment speech, and repeating the steps of acquiring the assessment speech curve corresponding to the current assessment speech and obtaining the similarity set between the assessment speech curve and each standard speech curve in the standard speech curve set; and
when it is detected that no next assessment speech exists, generating the standard speech set containing the determined standard speeches.
5. The method according to claim 4, characterized in that the determining, based on the similarity set, of the standard speech corresponding to the current assessment speech comprises:
acquiring the maximum similarity value in the similarity set; and
acquiring the target standard speech curve indicated by the maximum similarity value, and determining the target standard speech corresponding to the target standard speech curve as the standard speech corresponding to the current assessment speech.
6. The method according to claim 1, characterized in that the difference text information comprises the number of text differences, the differing text content, and the attributes of the differing text content.
7. A scoring generation apparatus, characterized in that the apparatus comprises:
a text acquisition module, configured to collect an input assessment speech set and acquire an assessment text corresponding to the assessment speech set;
an information acquisition module, configured to compare the assessment text with a sample text to obtain difference text information between the assessment text and the sample text; and
a score generation module, configured to score the assessment speech set based on the difference text information to generate a score corresponding to the assessment speech set.
8. The apparatus according to claim 7, characterized in that the text acquisition module comprises:
a speech set acquisition unit, configured to acquire a standard speech set corresponding to the assessment speech set; and
an assessment text determination unit, configured to determine, based on a correspondence between standard speech information and text information, the assessment text corresponding to the standard speech set.
9. The apparatus according to claim 8, characterized in that the speech set acquisition unit is specifically configured to:
acquire, from a standard speech information library, the standard speech corresponding to each assessment speech in the assessment speech set; and
combine, based on the order of the assessment speeches, the standard speeches corresponding to the assessment speeches into the standard speech set.
10. The apparatus according to claim 9, characterized in that the speech set acquisition unit is specifically configured to:
acquire the current assessment speech in the assessment speech set from the standard speech information library;
acquire the assessment speech curve corresponding to the current assessment speech, and obtain a similarity set between the assessment speech curve and each standard speech curve in a standard speech curve set;
determine, based on the similarity set, the standard speech corresponding to the current assessment speech;
acquire the next assessment speech after the current assessment speech, determine the next assessment speech as the current assessment speech, and repeat the steps of acquiring the assessment speech curve corresponding to the current assessment speech and obtaining the similarity set between the assessment speech curve and each standard speech curve in the standard speech curve set; and
when it is detected that no next assessment speech exists, generate the standard speech set containing the determined standard speeches.
11. The apparatus according to claim 10, characterized in that the speech set acquisition unit is specifically configured to:
acquire the maximum similarity value in the similarity set; and
acquire the target standard speech curve indicated by the maximum similarity value, and determine the target standard speech corresponding to the target standard speech curve as the standard speech corresponding to the current assessment speech.
12. The apparatus according to claim 7, characterized in that the difference text information comprises the number of text differences, the differing text content, and the attributes of the differing text content.
13. A computer storage medium, characterized in that the computer storage medium stores a plurality of instructions, the instructions being adapted to be loaded by a processor to execute the method steps according to any one of claims 1 to 6.
14. An electronic device, characterized by comprising: a processor and a memory, wherein the memory stores a computer program, the computer program being adapted to be loaded by the processor to execute the method steps according to any one of claims 1 to 6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910280448.6A CN110136721A (en) | 2019-04-09 | 2019-04-09 | A kind of scoring generation method, device, storage medium and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN110136721A true CN110136721A (en) | 2019-08-16 |
Family
ID=67569299
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910280448.6A Pending CN110136721A (en) | 2019-04-09 | 2019-04-09 | A kind of scoring generation method, device, storage medium and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110136721A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110322895A (en) * | 2018-03-27 | 2019-10-11 | 亿度慧达教育科技(北京)有限公司 | Speech evaluating method and computer storage medium |
CN110600052A (en) * | 2019-08-19 | 2019-12-20 | 天闻数媒科技(北京)有限公司 | Voice evaluation method and device |
CN112597065A (en) * | 2021-03-03 | 2021-04-02 | 浙江口碑网络技术有限公司 | Page testing method and device |
CN112597066A (en) * | 2021-03-03 | 2021-04-02 | 浙江口碑网络技术有限公司 | Page testing method and device |
CN112686020A (en) * | 2020-12-29 | 2021-04-20 | 科大讯飞股份有限公司 | Composition scoring method and device, electronic equipment and storage medium |
CN112802456A (en) * | 2021-04-14 | 2021-05-14 | 北京世纪好未来教育科技有限公司 | Voice evaluation scoring method and device, electronic equipment and storage medium |
CN113053403A (en) * | 2021-03-19 | 2021-06-29 | 北京乐学帮网络技术有限公司 | Voice evaluation method and device |
CN113205729A (en) * | 2021-04-12 | 2021-08-03 | 华侨大学 | Foreign student-oriented speech evaluation method, device and system |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102568475A (en) * | 2011-12-31 | 2012-07-11 | 安徽科大讯飞信息科技股份有限公司 | System and method for assessing proficiency in Putonghua |
CN103730032A (en) * | 2012-10-12 | 2014-04-16 | 李志刚 | Method and system for controlling multimedia data |
CN105374356A (en) * | 2014-08-29 | 2016-03-02 | 株式会社理光 | Speech recognition method, speech assessment method, speech recognition system, and speech assessment system |
CN105741831A (en) * | 2016-01-27 | 2016-07-06 | 广东外语外贸大学 | Spoken language evaluation method based on grammatical analysis and spoken language evaluation system |
CN105845134A (en) * | 2016-06-14 | 2016-08-10 | 科大讯飞股份有限公司 | Spoken language evaluation method through freely read topics and spoken language evaluation system thereof |
CN106303187A (en) * | 2015-05-11 | 2017-01-04 | 小米科技有限责任公司 | The acquisition method of voice messaging, device and terminal |
CN106847260A (en) * | 2016-12-20 | 2017-06-13 | 山东山大鸥玛软件股份有限公司 | A kind of Oral English Practice automatic scoring method of feature based fusion |
CN107135247A (en) * | 2017-02-16 | 2017-09-05 | 江苏南大电子信息技术股份有限公司 | A kind of service system and method for the intelligent coordinated work of person to person's work |
CN107579883A (en) * | 2017-08-25 | 2018-01-12 | 上海肖克利信息科技股份有限公司 | Distributed pickup intelligent home furnishing control method |
CN107578778A (en) * | 2017-08-16 | 2018-01-12 | 南京高讯信息科技有限公司 | A kind of method of spoken scoring |
CN109272992A (en) * | 2018-11-27 | 2019-01-25 | 北京粉笔未来科技有限公司 | A kind of spoken language assessment method, device and a kind of device for generating spoken appraisal model |
CN109448730A (en) * | 2018-11-27 | 2019-03-08 | 广州广电运通金融电子股份有限公司 | A kind of automatic speech quality detecting method, system, device and storage medium |
CN109461459A (en) * | 2018-12-07 | 2019-03-12 | 平安科技(深圳)有限公司 | Speech assessment method, apparatus, computer equipment and storage medium |
CN109493852A (en) * | 2018-12-11 | 2019-03-19 | 北京搜狗科技发展有限公司 | A kind of evaluating method and device of speech recognition |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110136721A (en) | A kind of scoring generation method, device, storage medium and electronic equipment | |
US11830499B2 (en) | Providing answers to voice queries using user feedback | |
US10339166B1 (en) | Systems and methods for providing natural responses to commands | |
CN101309327B (en) | Sound chat system, information processing device, speech recognition and key words detection | |
US10446141B2 (en) | Automatic speech recognition based on user feedback | |
US8478592B2 (en) | Enhancing media playback with speech recognition | |
CN109686383B (en) | Voice analysis method, device and storage medium | |
US20050144013A1 (en) | Conversation control apparatus, conversation control method, and programs therefor | |
CN110570873B (en) | Voiceprint wake-up method and device, computer equipment and storage medium | |
CN111833853B (en) | Voice processing method and device, electronic equipment and computer readable storage medium | |
CN106782607A (en) | Determine hot word grade of fit | |
JP7230806B2 (en) | Information processing device and information processing method | |
CN109036395A (en) | Personalized speaker control method, system, intelligent sound box and storage medium | |
CN110782875B (en) | Voice rhythm processing method and device based on artificial intelligence | |
US20210217403A1 (en) | Speech synthesizer for evaluating quality of synthesized speech using artificial intelligence and method of operating the same | |
CN111261195A (en) | Audio testing method and device, storage medium and electronic equipment | |
CN110675866A (en) | Method, apparatus and computer-readable recording medium for improving at least one semantic unit set | |
US10417345B1 (en) | Providing customer service agents with customer-personalized result of spoken language intent | |
CN113129867A (en) | Training method of voice recognition model, voice recognition method, device and equipment | |
CN109119073A (en) | Audio recognition method, system, speaker and storage medium based on multi-source identification | |
CN110853669B (en) | Audio identification method, device and equipment | |
CN110781329A (en) | Image searching method and device, terminal equipment and storage medium | |
CN110809796B (en) | Speech recognition system and method with decoupled wake phrases | |
JP6347939B2 (en) | Utterance key word extraction device, key word extraction system using the device, method and program thereof | |
CN112116181A (en) | Classroom quality model training method, classroom quality evaluation method and classroom quality evaluation device |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190816 |