CN110148413A - Speech evaluating method and relevant apparatus - Google Patents
- Publication number
- CN110148413A (application CN201910422699.3A)
- Authority
- CN
- China
- Prior art keywords
- text
- voice
- translation
- unit
- confidence level
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/48—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
- G10L25/51—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
- G10L25/60—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for measuring the quality of voice signals
Abstract
The embodiments of the present application disclose a speech evaluation method and a related apparatus. The method includes: obtaining a first voice serving as the evaluation reference under a first evaluation mode, and obtaining a second voice to be evaluated; processing the first voice to obtain a first text, and processing the second voice to obtain a second text; obtaining a first text detection strategy corresponding to the first evaluation mode; and processing the first text and the second text according to the first text detection strategy to obtain a detection result for the second voice. The application helps improve the flexibility and comprehensiveness with which a device evaluates speech.
Description
Technical field
The present application relates to the technical field of electronic devices, and in particular to a speech evaluation method and a related apparatus.
Background technique
Simultaneous interpretation is a demanding form of language translation performed under strict time constraints. It requires the interpreter, while listening to and parsing the source-language speech, to draw on existing subject knowledge to rapidly predict, understand, memorize, and convert the source-language information, and to organize and express it in the target language; for this reason simultaneous interpretation is also called synchronous interpretation. Training a simultaneous interpretation student is a complex process that mainly involves building command of the source and target languages, broad general knowledge, and the skills specific to simultaneous interpretation. Of these, skill training is the core of current student development.

At present, basic skill training relies on retelling exercises to build short-term memory and on counting-while-listening exercises to build the ability to speak while listening under interference. Once the student has a sufficient foundation, training in simultaneous interpretation proper begins. For the exercises at each stage, timely and effective evaluation of and feedback on the results are crucial to rapid improvement of the student's ability.
Summary of the invention
The embodiments of the present application provide a speech evaluation method and a related apparatus, so as to improve the flexibility and comprehensiveness with which a device evaluates speech.
In a first aspect, an embodiment of the present application provides a speech evaluation method, including:
obtaining a first voice serving as the evaluation reference under a first evaluation mode, and obtaining a second voice to be evaluated;
processing the first voice to obtain a first text, and processing the second voice to obtain a second text;
obtaining a first text detection strategy corresponding to the first evaluation mode; and
processing the first text and the second text according to the first text detection strategy to obtain a detection result for the second voice.
In a second aspect, an embodiment of the present application provides a speech evaluation apparatus, including a processing unit and a communication unit, wherein the processing unit is configured to: obtain, through the communication unit, a first voice serving as the evaluation reference under a first evaluation mode, and obtain a second voice to be evaluated through the communication unit, the first evaluation mode including a retelling test mode or an interpreting test mode; process the first voice to obtain a first text, and process the second voice to obtain a second text; obtain a first text detection strategy corresponding to the first evaluation mode; and process the first text and the second text according to the first text detection strategy to obtain a detection result for the second voice.
In a third aspect, an embodiment of the present application provides an electronic device, including a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the programs including instructions for performing the steps of any method of the first aspect of the embodiments of the present application.
In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium storing a computer program for electronic data interchange, wherein the computer program causes a computer to perform some or all of the steps described in any method of the first aspect of the embodiments of the present application.
In a fifth aspect, an embodiment of the present application provides a computer program product, including a non-transitory computer-readable storage medium storing a computer program, the computer program being operable to cause a computer to perform some or all of the steps described in any method of the first aspect of the embodiments of the present application. The computer program product may be a software installation package.
It can be seen that in the embodiments of the present application, when performing speech detection under different test modes, the electronic device can dynamically select the text detection strategy specific to the current evaluation mode and process the texts corresponding to the voices according to that strategy to obtain a detection result, thereby detecting the voice to be evaluated. This avoids the situation where a single detection strategy cannot adapt to different evaluation modes, and helps improve the flexibility and comprehensiveness of the electronic device's speech evaluation.
Detailed description of the invention
In order to explain the technical solutions in the embodiments of the present application or in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present application; for those of ordinary skill in the art, other drawings can be obtained from these drawings without creative effort.
Fig. 1 is a schematic diagram of a speech evaluation system provided by an embodiment of the present application;
Fig. 2a is a schematic flowchart of a speech evaluation method provided by an embodiment of the present application;
Fig. 2b is an example detection-result interface for the retelling test mode provided by an embodiment of the present application;
Fig. 2c is an example detection-result interface for the interpreting test mode provided by an embodiment of the present application;
Fig. 3 is a schematic structural diagram of an electronic device provided by an embodiment of the present application;
Fig. 4 is a block diagram of the functional units of a speech evaluation apparatus provided by an embodiment of the present application.
Specific embodiment
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings in the embodiments of the present invention. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative effort shall fall within the protection scope of the present invention.
As shown in Fig. 1, Fig. 1 is a schematic diagram of a speech evaluation system 100. The system includes a voice acquisition apparatus 110 and a voice processing apparatus 120, with the voice acquisition apparatus 110 connected to the voice processing apparatus 120. The voice acquisition apparatus 110 is configured to obtain voice data and send it to the voice processing apparatus 120 for processing; the voice processing apparatus 120 is configured to process the voice data and output the processing result. The system may be an integrated single device or multiple devices; for convenience of description, the present application refers to the speech evaluation system 100 as the electronic device. The electronic device may include various handheld devices with wireless communication functions, in-vehicle devices, wearable devices, computing devices, or other processing devices connected to a wireless modem, as well as various forms of user equipment (User Equipment, UE), mobile stations (Mobile Station, MS), terminal devices, and the like.
At present there are two ways of evaluating and giving feedback on a simultaneous interpretation student's performance. The first is live exercises with feedback from a teacher or classmate, which helps the student find problems in the interpreted speech; the second is analysis and summary by the teacher or student through recording playback. The first requires the cooperation of others, so the practice settings are limited; the second is inefficient, since the playback must be reviewed pass by pass.

Based on this, the embodiments of the present application propose a speech evaluation method to solve the above problems. The embodiments of the present application are described in detail below.
Referring to Fig. 2a, Fig. 2a is a schematic flowchart of a speech evaluation method provided by an embodiment of the present application, applied to the electronic device shown in Fig. 1. As shown in the figure, the speech evaluation method includes the following steps.

S201: The electronic device obtains a first voice serving as the evaluation reference under a first evaluation mode, and obtains a second voice to be evaluated.
The first voice is the source voice and the second voice is the target voice. The source voice can be selected or specified by the user (e.g., a student or teacher), such as a BBC English broadcast; the target voice is the recorded file of the user's (e.g., the student's) interpretation.

In a specific implementation, the first voice may be a voice file prestored on the electronic device, or a voice file pushed by a server (e.g., the cloud) and obtained through real-time interaction with the server. The second voice may be obtained through the device's own voice acquisition apparatus, or through a dedicated recording system communicating with the device; no unique restriction is made here.
The evaluation modes supported by the electronic device include a retelling test mode and an interpreting test mode. In the retelling test mode, the user listens to the audio content of the source voice and retells the content of the audio (e.g., with a delay of 2-3 seconds). This mainly trains the student's ability to speak while listening and involves no translation between languages. The interpreting test mode, also called the simultaneous interpretation mode, has the user listen to the source-language audio while converting it into the target language and expressing it in speech, consistent with real simultaneous interpretation. Compared with the retelling test mode, the interpreting test mode adds the translation from the source language to the target language, so its difficulty is greater. Besides delivery of tone, the completeness, accuracy, and fluency of the translation are important indices of simultaneous interpretation ability.
S202: The electronic device processes the first voice to obtain a first text, and processes the second voice to obtain a second text.

The processing by which the electronic device turns the first and second voices into the first and second texts includes not only the speech-to-text step but also post-processing of the raw converted text, which may include at least one of the following: normalizing numbers and times, filtering out meaningless filler particles, and predicting sentence breaks and punctuation.
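As a concrete illustration, the post-processing pass might look like the following minimal Python sketch. The filler list and tokenization are hypothetical placeholders — the patent does not enumerate the particles it filters — and number/time normalization and punctuation prediction are noted only as further passes:

```python
import re

# Hypothetical filler list; the patent does not enumerate the
# "meaningless modal particles" it filters, so these are placeholders.
FILLERS = {"uh", "um", "er", "ah"}

def preprocess_transcript(text: str) -> str:
    """Post-process raw ASR output: lowercase, strip filler tokens, and
    collapse whitespace. Number/time normalization and punctuation
    prediction would plug in here as further passes."""
    tokens = [t for t in re.findall(r"[\w']+", text.lower())
              if t not in FILLERS]
    return " ".join(tokens)

print(preprocess_transcript("Um, the meeting er starts at 9 AM"))
# -> the meeting starts at 9 am
```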
In a specific implementation, the speech-to-text processing of the electronic device may cover the following two cases.

First, when the first evaluation mode is the retelling test mode, the language of the first voice is the same as that of the second voice, and the language of the first text is the same as that of the second text. The electronic device processes the first voice to obtain the first text and processes the second voice to obtain the second text as follows: the electronic device calls the first speech recognition system corresponding to the language of the first voice; processes the first voice through the first speech recognition system to obtain the first text; and processes the second voice through the first speech recognition system to obtain the second text. It can be seen that this case can be completed with a single speech recognition system.
Second, when the first evaluation mode is the interpreting test mode, the language of the first voice differs from that of the second voice, and the language of the first text differs from that of the second text. The electronic device processes the first voice to obtain the first text and processes the second voice to obtain the second text as follows: the electronic device calls the first speech recognition system corresponding to the language of the first voice and calls the second speech recognition system corresponding to the language of the second voice; processes the first voice through the first speech recognition system to obtain the first text; and processes the second voice through the second speech recognition system to obtain the second text.
S203: The electronic device obtains a first text detection strategy corresponding to the first evaluation mode.

When the first evaluation mode is the retelling test mode, the electronic device also supports a second evaluation mode, which is then the interpreting test mode; likewise, if the first evaluation mode is the interpreting test mode, the electronic device also supports a second evaluation mode, which is then the retelling test mode.

The electronic device may locally prestore the correspondence between the first evaluation mode and the first text detection strategy, and the correspondence between the second evaluation mode and the second text detection strategy. In a specific implementation, the electronic device then only needs to query the prestored set of mapping relations to quickly determine the specific content of the first text detection strategy corresponding to the current first evaluation mode, which is convenient and efficient.
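The prestored correspondence can be as simple as a dictionary keyed by evaluation mode. The strategy fields below are illustrative only; the patent does not fix their exact content:

```python
# Sketch of the prestored mode-to-strategy correspondence (S203).
# Field names and values are illustrative assumptions.
DETECTION_STRATEGIES = {
    "retell": {
        "same_language": True,
        "metrics": ["omission_rate", "extra_rate", "accuracy"],
    },
    "interpret": {
        "same_language": False,
        "metrics": ["fluency", "omission_rate", "extra_rate", "accuracy"],
    },
}

def lookup_strategy(mode: str) -> dict:
    """Resolve the current evaluation mode to its text detection strategy."""
    try:
        return DETECTION_STRATEGIES[mode]
    except KeyError:
        raise ValueError(f"unsupported evaluation mode: {mode}")
```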
Alternatively, the above correspondences may be configured in advance on the server side. In a specific implementation, the electronic device can interact with the server in real time, asking the server to look up the first text detection strategy corresponding to the current first evaluation mode and return it to the electronic device.

It should be noted that the specific implementations by which the electronic device obtains the first text detection strategy in the present application include but are not limited to the above examples; other ways are possible, and no unique restriction is made here.
S204: The electronic device processes the first text and the second text according to the first text detection strategy to obtain a detection result for the second voice.

If the first evaluation mode is the retelling test mode, the detection result includes any one of the following forms of presentation: a comprehensive retelling-quality score; scores on individual sub-indices (e.g., retelling-fidelity score, amount of omitted retelling, amount of extra retelling, retelling accuracy); display of erroneous text units (e.g., errors that occur during retelling displayed by aligning the two sentences); and so on.
Taking the text display unit as an example, Fig. 2b shows an example detection-result interface for the retelling test mode. The interface includes a first playback progress control for the first voice and a second playback progress control for the second voice, as well as, for each erroneous sentence, the corresponding original text unit and retold text unit, in which the inconsistent words or phrases are highlighted. In addition, a first target interval of the progress bar of the first playback progress control is marked to help the user quickly locate the position of the mis-retold original sentence (the first target interval includes the progress interval corresponding to the original text unit), and a second target interval of the progress bar of the second playback progress control is marked to help the user quickly locate the position of the erroneous retold sentence (the second target interval includes the progress interval corresponding to the retold text unit). Thus, when the user clicks the original text unit or selects the first target interval, the electronic device can play the voice containing the mis-retold original sentence, and when the user clicks the retold text unit or selects the second target interval, the electronic device can play the voice containing the erroneous retold sentence, improving the convenience of review.

In addition, a target interval of a playback progress control may include the contextual information of the erroneous retold sentence, for example the previous and the next sentence; it may also be organized hierarchically by paragraph, for example including only the paragraph to which the erroneous sentence belongs.
If the first evaluation mode is the interpreting test mode, the detection result includes any one of the following forms of presentation: a comprehensive interpreting-quality score; scores on individual sub-indices (e.g., interpreting-fidelity score, translation-fluency score, amount of omitted translation, amount of extra translation, and translation accuracy); display of erroneous text units (e.g., errors that occur during interpreting displayed by aligning the two sentences); and so on.
Taking the text display unit as an example, Fig. 2c shows an example detection-result interface for the interpreting test mode. The interface includes a first playback progress control for the first voice and a second playback progress control for the second voice, as well as, for each erroneous sentence, the corresponding original text unit and interpreted text unit, in which the inconsistent words or phrases are highlighted. In addition, a first target interval of the progress bar of the first playback progress control is marked to help the user quickly locate the position of the misinterpreted original sentence (the first target interval includes the progress interval corresponding to the original text unit), and a second target interval of the progress bar of the second playback progress control is marked to help the user quickly locate the position of the erroneous interpreted sentence (the second target interval includes the progress interval corresponding to the interpreted text unit). Thus, when the user clicks the original text unit or selects the first target interval, the electronic device can play the voice containing the misinterpreted original sentence, and when the user clicks the interpreted text unit or selects the second target interval, the electronic device can play the voice containing the erroneous interpreted sentence, improving the convenience of review.

In addition, a playback progress control may include the contextual information of the mistranslated sentence, for example the previous and the next sentence; it may also be organized hierarchically by paragraph, for example including only the paragraph to which the mistranslated sentence belongs.
It can be seen that in the embodiments of the present application, when performing speech detection under different test modes, the electronic device can dynamically select the text detection strategy specific to the current evaluation mode and process the texts corresponding to the voices according to that strategy to obtain a detection result, thereby detecting the voice to be evaluated. This avoids the situation where a single detection strategy cannot adapt to different evaluation modes, and helps improve the flexibility and comprehensiveness of the electronic device's speech evaluation.
In one possible example, the first evaluation mode includes the retelling test mode; the language of the first voice is the same as that of the second voice, and the language of the first text is the same as that of the second text.

It can be understood that the specific implementations by which the electronic device processes the first text and the second text according to the first text detection strategy to obtain the detection result for the second voice can be varied; the present application makes no unique restriction, and examples are given below.
In one possible example of the present application, the electronic device processing the first text and the second text according to the first text detection strategy to obtain the detection result for the second voice may be: the electronic device determines the matching degree of the second text relative to the first text, and generates the detection result according to the matching degree.

In a specific implementation, the electronic device determines the matching degree of the second text relative to the first text as follows: the electronic device decomposes the first text to obtain a first set of text units, decomposes the second text to obtain a second set of text units, and calculates the matching degree from the first set of word-level text units and the second set of word-level text units.

The granularity of the first and second text units may be word level, phrase level, sentence level, paragraph level, and so on; for example, sentence vectors may be computed to obtain a sentence-level matching degree. No unique restriction is made here.
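At word-level granularity, one way to score a second-text unit against a first-text unit is the fraction of the second unit's words that also occur in the first unit. This is an illustrative sketch under a whitespace-tokenization assumption, not the patent's exact formula:

```python
def unit_matching_degree(first_unit: str, second_unit: str) -> float:
    """Word-level matching degree of a second-text unit relative to a
    first-text unit: the count of the second unit's words that also
    occur in the first unit, divided by the second unit's word count."""
    first_words = set(first_unit.lower().split())
    second_words = second_unit.lower().split()
    if not second_words:
        return 0.0
    shared = sum(1 for w in second_words if w in first_words)
    return shared / len(second_words)
```

A perfect retelling of a unit scores 1.0; a unit sharing no words with the reference scores 0.0.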
In another possible example of the present application, the electronic device processing the first text and the second text according to the first text detection strategy to obtain the detection result for the second voice may also be: the electronic device determines the alignment information of the first text and the second text; determines the retelling accuracy of the second voice according to the alignment information; and generates the detection result according to the alignment information and/or the retelling accuracy.

In a specific implementation, the electronic device determines the alignment information of the first text and the second text as follows. The electronic device calculates the retelling matching degree of the second text relative to the first text to obtain a retelling matching-degree matrix, where the retelling matching degree indicates the matching degree of a second text unit relative to a first text unit and is calculated from a first quantity and a second quantity: the first quantity indicates the number of word-level text units the second text unit shares with the first text unit, and the second quantity indicates the number of word-level text units in the second text unit. The first text includes at least one first text unit, and the second text includes at least one second text unit. The electronic device then filters out the optimal retelling alignment path from the retelling matching-degree matrix and determines the alignment information according to the optimal retelling alignment path.

The granularity of the first and second text units may be word level, sentence level, phrase level, paragraph level, and so on; no unique restriction is made here. The optimal retelling alignment path is the alignment-relation path with the largest cumulative matching-degree score in the retelling matching-degree matrix, and can be found with common algorithms such as the Viterbi algorithm; no specific limitation is made here.
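The path search can be sketched as a small dynamic program. Since the patent only names Viterbi-like algorithms, the exact recurrence here is an assumption: the aligned first-text index never moves backward as the second text advances, which permits both omissions (skipped first-text units) and repeats (one first-text unit aligned to several second-text units):

```python
def best_alignment_path(A):
    """A: n_src x n_tgt matrix (list of lists) of matching degrees.
    Returns the maximum-cumulative-score path as 0-indexed (i, j)
    pairs, one per second-text unit j."""
    n_src, n_tgt = len(A), len(A[0])
    score = [[float("-inf")] * n_tgt for _ in range(n_src)]
    back = [[0] * n_tgt for _ in range(n_src)]
    for i in range(n_src):
        score[i][0] = A[i][0]
    for j in range(1, n_tgt):
        for i in range(n_src):
            # best predecessor among source indices k <= i (monotonic)
            k = max(range(i + 1), key=lambda k: score[k][j - 1])
            score[i][j] = score[k][j - 1] + A[i][j]
            back[i][j] = k
    # backtrack from the best final source index
    i = max(range(n_src), key=lambda i: score[i][n_tgt - 1])
    path = [(i, n_tgt - 1)]
    for j in range(n_tgt - 1, 0, -1):
        i = back[i][j]
        path.append((i, j - 1))
    return path[::-1]
```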
In a specific implementation, the electronic device determines the retelling accuracy of the second voice according to the alignment information as follows: the electronic device calculates the retelling accuracy of the second voice from the sentence-level retelling matching degrees on the optimal retelling alignment path, for example as a weighted average.

In one possible example of the present application, the electronic device generates the detection result according to the alignment information and/or the retelling accuracy as follows: the electronic device calculates reference retelling-quality parameters of the second text according to the alignment information, the reference retelling-quality parameters including an omission rate and/or an extra-retelling rate, and determines the detection result according to the reference retelling-quality parameters and/or the retelling accuracy.
The above calculation is illustrated below, taking sentence-level text units as an example.

Suppose the first text includes five first sentence-level text units A, B, C, D, E, and the second text includes six second sentence-level text units a, b, c, d, e, f. The sentence-level retelling matching degree A(i, j) can then be defined by the formula:

A(i, j) = E(i, j) / Q(j),

where i indexes the i-th first sentence-level text unit in the first text, j indexes the j-th second sentence-level text unit in the second text, E(i, j) denotes the number of word-level text units the j-th second sentence-level text unit shares with the i-th first sentence-level text unit, and Q(j) denotes the total number of word-level text units in the j-th second text unit.
Suppose the optimal repetition alignment path determined from the resulting sentence-level repetition matching degree matrix is:

A_{1,1} → A_{2,2} → A_{4,3} → A_{5,5} → A_{5,6}.
It can then be determined that, in this repetition test, the 3rd first sentence-level text unit of the first text is a missed (un-repeated) sentence-level text unit, while the 4th and 6th second sentence-level text units of the second text are over-repeated sentence-level text units. The missed-repetition rate, over-repetition rate, and repetition accuracy are therefore calculated as follows:
The missed-repetition rate is the number of missed sentence-level text units divided by the number of first sentence-level text units, i.e. 1/5 = 0.2;
the over-repetition rate is the number of over-repeated sentence-level text units divided by the number of second sentence-level text units, i.e. 2/6 ≈ 0.33;
the repetition accuracy is the weighted average of the sentence-level repetition matching degrees on the optimal repetition alignment path, for example: (A_{1,1} + A_{2,2} + A_{4,3} + A_{5,5} + A_{5,6}) / 6.
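The worked example above can be sketched in code. This is a minimal illustration that assumes the optimal alignment path and the per-pair matching degrees A_{i,j} are already given; the numeric A values and the function name below are illustrative, not from the patent.

```python
def repetition_metrics(path, scores, n_ref, n_test):
    """path: ordered 1-indexed (i, j) alignment pairs; scores: (i, j) -> A_{i,j}."""
    ref_on_path = {i for i, _ in path}
    test_on_path = {j for _, j in path}

    # reference units never reached by the path were not repeated at all
    leak_rate = (n_ref - len(ref_on_path)) / n_ref

    # test units off the path, plus path steps that reuse a reference unit,
    # count as over-repetitions (units 4 and 6 in the example above)
    seen, reused = set(), 0
    for i, _ in path:
        if i in seen:
            reused += 1
        seen.add(i)
    over_rate = ((n_test - len(test_on_path)) + reused) / n_test

    # repetition accuracy: average of path matching degrees over the test units
    accuracy = sum(scores[p] for p in path) / n_test
    return leak_rate, over_rate, accuracy

path = [(1, 1), (2, 2), (4, 3), (5, 5), (5, 6)]   # A1,1 -> ... -> A5,6
scores = {(1, 1): 0.9, (2, 2): 0.8, (4, 3): 0.7,
          (5, 5): 0.6, (5, 6): 0.5}               # illustrative A values
leak, over, acc = repetition_metrics(path, scores, n_ref=5, n_test=6)
```

With these made-up A values the sketch reproduces the rates from the example: a missed-repetition rate of 1/5 and an over-repetition rate of 2/6.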
It should be noted that the above way of computing the alignment information in the repetition test mode is merely illustrative; the alignment information between the first text and the second text may also be computed with other methods well known in the art, such as a cosine-similarity algorithm over space vectors, and no unique restriction is made here.
It can be seen that, in this example, for the repetition test mode the electronic device can compute the repetition accuracy based on the alignment information of the first text and the second text, and then generate the detection result based on the alignment information and/or the repetition accuracy. Since the alignment information and the repetition accuracy reflect the user's repetition quality more comprehensively, evaluation accuracy and comprehensiveness can be improved.
In a possible example, the first evaluation mode includes an interpreting test mode; the language of the first voice differs from the language of the second voice, and the language of the first text differs from the language of the second text.
In a possible example of the application, the electronic device processing the first text and the second text according to the first text detection strategy to obtain the detection result for the second voice includes: the electronic device determining the alignment information of the first text and the second text; determining the translation fluency of the second text; and generating the detection result according to the alignment information and/or the translation fluency.
In a specific implementation, the electronic device may determine the translation fluency of the second text in a variety of ways, and no unique restriction is made here. For example, the electronic device may process the second text based on a preset fluency prediction model to obtain a prediction of the translation fluency; the fluency prediction model may be a neural network language model, or a simple n-gram language model.
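As one possible instantiation of the n-gram option, a tiny add-alpha-smoothed bigram model can stand in for the fluency prediction model. The corpus, the smoothing constant, and the class name below are illustrative assumptions, not the patent's implementation.

```python
from collections import Counter
import math

class BigramLM:
    """Add-alpha-smoothed bigram language model used as a fluency scorer."""

    def __init__(self, corpus, alpha=1.0):
        self.alpha = alpha
        self.unigrams = Counter()   # counts of left-context tokens
        self.bigrams = Counter()    # counts of (left, right) token pairs
        self.vocab = set()
        for sent in corpus:
            toks = ["<s>"] + sent.split() + ["</s>"]
            self.vocab.update(toks)
            for a, b in zip(toks, toks[1:]):
                self.unigrams[a] += 1
                self.bigrams[(a, b)] += 1

    def fluency(self, sentence):
        # average log-probability per token: higher means more fluent
        toks = ["<s>"] + sentence.split() + ["</s>"]
        v = len(self.vocab)
        lp = 0.0
        for a, b in zip(toks, toks[1:]):
            p = (self.bigrams[(a, b)] + self.alpha) / (self.unigrams[a] + self.alpha * v)
            lp += math.log(p)
        return lp / (len(toks) - 1)

lm = BigramLM(["i love singing", "i love music"])
fluent = lm.fluency("i love singing")      # word order seen in training
disfluent = lm.fluency("singing love i")   # scrambled word order
```

A well-ordered sentence scores higher than its scrambled counterpart, which is the property the fluency score relies on.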
In a possible example of the application, the electronic device generating the detection result according to the alignment information and/or the translation fluency includes: the electronic device determining a reference interpreting quality parameter of the second voice according to the alignment information, the reference interpreting quality parameter including at least one of: a missed-translation rate, an over-translation rate, and a translation accuracy rate; and generating the detection result according to the reference interpreting quality parameter and/or the translation fluency.
In a possible example of the application, the electronic device determining the alignment information of the first text and the second text includes: the electronic device determining the intertranslation confidence matrix of the first text and the second text; screening out an optimal translation alignment path from the intertranslation confidence matrix; and determining the alignment information according to the optimal translation alignment path.
In a possible example of the application, the electronic device determining the intertranslation confidence matrix of the first text and the second text includes: the electronic device obtaining a forward translation model and a reverse translation model, the forward translation model being used to convert the language of the first text into the language of the second text, and the reverse translation model being used to convert the language of the second text into the language of the first text; and determining, through the forward translation model, the reverse translation model, the first text and the second text, the intertranslation confidence of each first text unit and each second text unit to obtain the intertranslation confidence matrix, the first text including multiple first text units and the second text including multiple second text units.
In a possible example of the application, the electronic device determining, through the forward translation model, the reverse translation model, the first text and the second text, the intertranslation confidence of each first text unit and each second text unit to obtain the intertranslation confidence matrix includes: the electronic device calculating the forward translation confidence of the second text through the forward translation model, the first text and the second text; calculating the reverse translation confidence of the second text through the reverse translation model, the first text and the second text; and determining the intertranslation confidence of each second text unit relative to each first text unit according to the forward translation confidence and the reverse translation confidence, to obtain the intertranslation confidence matrix.
The forward translation confidence is used to indicate a first translation confidence of a second text unit relative to a first text unit. The first translation confidence is obtained by weighted-averaging the multiple first translation sub-confidences of the multiple first text subunits in the first text unit; each first translation sub-confidence is the maximum value among the multiple first output probabilities in the first output probability set of the corresponding first text subunit, where each first output probability refers to the probability that, given the first text subunit as input, the forward translation model outputs a second text subunit in the second text unit. The reverse translation confidence is used to indicate a second translation confidence of a first text unit relative to a second text unit. The second translation confidence is obtained by weighted-averaging the multiple second translation sub-confidences of the multiple second text subunits in the second text unit; each second translation sub-confidence is the maximum value among the multiple second output probabilities in the second output probability set of the corresponding second text subunit, where each second output probability refers to the probability that, given the second text subunit as input, the reverse translation model outputs a first text subunit in the first text unit.
Taking sentence-level text units and word-level text subunits as an example: the forward translation confidence indicates a first sentence-level translation confidence of a second sentence-level text unit relative to a first sentence-level text unit, obtained by weighted-averaging the first word-level translation confidences of the multiple first word-level text units in the first sentence-level text unit. Each first word-level translation confidence is the maximum value among the first output probabilities in the first output probability set of the corresponding first word-level text unit, where each first output probability refers to the probability that, given the first word-level text unit as input, the forward translation model outputs a second word-level text unit in the second sentence-level text unit. The reverse translation confidence indicates a second sentence-level translation confidence of a first sentence-level text unit relative to a second sentence-level text unit, obtained by weighted-averaging the second word-level translation confidences of the multiple second word-level text units in the second sentence-level text unit. Each second word-level translation confidence is the maximum value among the second output probabilities in the second output probability set of the corresponding second word-level text unit, where each second output probability refers to the probability that, given the second word-level text unit as input, the reverse translation model outputs a first word-level text unit in the first sentence-level text unit.
The interpreting test mode is illustrated below in conjunction with a concrete example, taking sentence-level text units as an example.
(1) Obtain the original audio signal Wx for the student's practice and the audio signal Wy generated during the practice; Wx is the source-language speech used for practice, chosen by the student or specified by the teacher (e.g. an English BBC broadcast), and Wy is the recording file of the student's target-language interpretation, obtained through a recording system.
(2) Using a speech recognition system for the corresponding language, transcribe the audio signal Wx into a text representation Tx, and likewise transcribe the audio signal Wy into a text representation Ty.
(3) Post-process the recognized texts Tx and Ty, including normalizing numbers and times, filtering out meaningless filler words, and performing sentence segmentation and punctuation prediction, to obtain output results Tx = {S_{X,1}, …, S_{X,M}} and Ty = {S_{Y,1}, …, S_{Y,N}}, where S_{X,i} and S_{Y,j} are respectively the i-th sentence of the segmented source-language audio text and the j-th sentence of the text of the student's output audio, and M and N are respectively the number of sentences predicted from the transcription of the original audio content and the number of sentences predicted from the transcription of the student's output audio.
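Step (3) might look roughly as follows. The filler-word list, and splitting on final punctuation as a stand-in for a learned punctuation/segmentation predictor, are simplifying assumptions; a real pipeline would also normalize numbers and times.

```python
import re

FILLERS = {"um", "uh", "er"}  # illustrative English filler words

def postprocess(transcript):
    """Drop filler words, then split the cleaned text into sentences."""
    words = [w for w in transcript.split()
             if w.lower().strip(".,!?") not in FILLERS]
    text = " ".join(words)
    # split after sentence-final punctuation, keeping the punctuation
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    return sentences

sents = postprocess("um I love singing. uh it is fun.")
```

Each returned sentence then plays the role of one S_{X,i} or S_{Y,j} in the alignment steps below.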
(4) Train a pair of machine translation models between the source and target languages: a forward model E_f and a reverse model E_b, where forward translation refers to translating source-language text into target-language text, and reverse translation refers to translating target-language text into source-language text. The machine translation models may follow a statistical machine translation (SMT) scheme or a neural machine translation (NMT) scheme, without limitation.
(5) Based on the forward translation model E_f, calculate, for every sentence of the student's simultaneous interpretation, its translation confidence F_{i,j} with respect to every sentence of the source-language audio, defined as follows:

F_{i,j} = P(S_{Y,j} | S_{X,i}, E_f), i = 1, 2, …, M; j = 1, 2, …, N

F_{i,j} indicates the confidence score that, given the source-language sentence S_{X,i} and the forward machine translation model E_f, the target-language sentence S_{Y,j} is its translation. It is computed by first preprocessing (e.g. word-segmenting) the source sentence S_{X,i}, and then using the machine translation model E_f to calculate the scoring probability of the target sentence S_{Y,j} under that model. The concrete calculation depends on the machine translation modeling scheme used; with neural machine translation, for example, the target sentence S_{Y,j} is likewise word-segmented, the probability of the corresponding target word is computed at each decoding step, and finally the decoding probabilities of all target words are weighted-averaged as the confidence of the whole sentence.
As an example, assume the source-language sentence S_{X,i} means "I love singing" and segments into three source words glossed "I", "love" and "singing", while the target-language sentence S_{Y,j} is "I love singing", segmented into "I", "love", "singing". Through the forward translation model E_f, the probabilities that the first source word is translated as "I", "love", "singing" are 0.9, 0, 0 respectively; for the second source word they are 0, 0.8, 0; and for the third they are 0, 0, 0.7. The optimal word-level translation path is therefore: first source word → "I" (0.9), second source word → "love" (0.8), third source word → "singing" (0.7). The weighted average gives the target sentence S_{Y,j} "I love singing" a confidence score of 0.8 relative to the source sentence S_{X,i}.
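The per-word computation in this example can be sketched with the model's output probabilities stubbed out as a table (a real system would query E_f instead); the function name and the gloss comments are illustrative, and the weights are uniform as in the example.

```python
def forward_confidence(prob_table):
    """prob_table: per source word, a dict mapping target word -> output probability.
    Takes the best target-word probability per source word, then averages."""
    best = [max(row.values()) for row in prob_table]
    return sum(best) / len(best)

table = [
    {"I": 0.9, "love": 0.0, "singing": 0.0},   # source word glossed "I"
    {"I": 0.0, "love": 0.8, "singing": 0.0},   # source word glossed "love"
    {"I": 0.0, "love": 0.0, "singing": 0.7},   # source word glossed "singing"
]
conf = forward_confidence(table)  # (0.9 + 0.8 + 0.7) / 3
```

The same routine with E_b's output table in place of E_f's gives the reverse confidence of step (6).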
(6) Based on the reverse translation model, calculate, for every sentence of the student's simultaneous interpretation, the translation confidence B_{j,i} with respect to every sentence of the source-language audio, defined as follows:

B_{j,i} = P(S_{X,i} | S_{Y,j}, E_b), i = 1, 2, …, M; j = 1, 2, …, N

B_{j,i} indicates the confidence score that, given the target-language sentence S_{Y,j} and the reverse machine translation model E_b, the source-language sentence S_{X,i} is its translation. The calculation is the same as for the forward translation confidence, except that the input is the target sentence S_{Y,j} and the output is the source sentence S_{X,i}.
(7) Based on the forward and reverse translation confidence scores, calculate the intertranslation confidence C_{i,j} of each source sentence and target sentence:

C_{i,j} = (F_{i,j} + B_{j,i}) / 2, i = 1, 2, …, M; j = 1, 2, …, N

where C_{i,j} indicates the confidence that S_{X,i} and S_{Y,j} are translations of each other; a higher score indicates a more accurate simultaneous interpretation by the student.
(8) Based on the intertranslation confidence matrix C = {C_{i,j}}, use the Viterbi algorithm to compute the maximum-score alignment path between the source sentences and the target sentences the student produced. Let σ_j(i) denote the cumulative confidence score of the maximum alignment path in which the j-th target sentence is aligned to the i-th source sentence, and let î_j denote the source-sentence index achieving the maximum confidence score for the j-th target sentence on that path. The Viterbi recursion is then:

σ_j(i) = max over i′ with max(1, i − K) ≤ i′ ≤ i of [ σ_{j−1}(i′) ] + C_{i,j}

where σ_0(i) = 0, i = 1, 2, …, M, and M and N are the numbers of source sentences and target sentences respectively. Under normal circumstances one source sentence corresponds to one target sentence, so the index of a source sentence and the index of the student's corresponding translated sentence differ only within a limited range; in the Viterbi search, therefore, each extension only looks backward within a search-path width K, whose value is determined from experimental results. The concrete decoding process is the same as in the prior art and is not detailed here. Finally, backtracking yields the decoded optimal translation alignment path, i.e. each target sentence S_{Y,j} corresponds to the source sentence S_{X,î_j}, where î_j denotes the source-sentence index aligned to the j-th target sentence by the alignment algorithm. In this way, the alignment relation of every target-language sentence with the source-language sentences is obtained.
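A minimal version of the step-(8) search can be sketched as follows, under the assumptions that the path moves monotonically through the source sentences and may advance by at most K positions per step; the value of K, the function name, and the toy matrix are illustrative.

```python
def viterbi_align(C, K=2):
    """C: M x N matrix, C[i][j] = intertranslation confidence of source
    sentence i with target sentence j. Returns, for each target sentence,
    the 0-indexed source sentence it is aligned to."""
    M, N = len(C), len(C[0])
    sigma = [[float("-inf")] * M for _ in range(N)]  # cumulative scores
    back = [[0] * M for _ in range(N)]               # backpointers
    for i in range(M):
        sigma[0][i] = C[i][0]
    for j in range(1, N):
        for i in range(M):
            # predecessor may stay at i (over-translation) or trail by up to K
            lo = max(0, i - K)
            prev = max(range(lo, i + 1), key=lambda p: sigma[j - 1][p])
            sigma[j][i] = sigma[j - 1][prev] + C[i][j]
            back[j][i] = prev
    # backtrace from the best final state
    i = max(range(M), key=lambda q: sigma[N - 1][q])
    path = [i]
    for j in range(N - 1, 0, -1):
        i = back[j][i]
        path.append(i)
    return path[::-1]

# toy matrix where the diagonal pairs are clearly the best translations
C = [[0.9, 0.1, 0.1],
     [0.1, 0.8, 0.1],
     [0.1, 0.1, 0.7]]
alignment = viterbi_align(C)
```

Because a step may reuse or skip source indices, the recovered path naturally exposes the over-translated and missed sentences counted in step (9).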
(9) Translation error detection. Mistranslated sentences may occur in the student's translation. For example, with a missed translation, a source sentence cannot find any corresponding target sentence; the optimal decoded alignment path is therefore used to count the source sentences not on the path, which are the missed sentences. Let the set of missed sentences be A; the missed-translation score is then defined as:

Score1 = |A| / M

where |A| is the number of missed sentences and M is the total number of source sentences.
Besides missed translations, over-translations may also occur in the student's practice. If the j-th target sentence is an extra, over-translated sentence, the confidence score C_{î_j, j} of its aligned pair will be relatively low; a threshold T is therefore set, and if C_{î_j, j} < T the sentence is considered over-translated. Assuming the number of over-translated sentences is P, the over-translation score is defined as follows:

Score2 = P / N

where N is the total number of target sentences.
For the aligned sentence pairs (S_{X,î_j}, S_{Y,j}), a good translation yields a relatively high confidence score C_{î_j, j}; the translation accuracy score is therefore defined as:

Score3 = (1/L) · Σ_j C_{î_j, j}

where L is the number of aligned sentence pairs, serving as the normalization parameter.
A high-quality translation also requires remarkable fluency, so a fluency score is defined as:

Score4 = (1/N) · Σ_j P(S_{Y,j} | λ)

where λ is a language model trained on a large amount of target-language text, and P(S_{Y,j} | λ) is the fluency score of the practice sentence S_{Y,j} under that language model. The language model may be a neural network language model, or a simple n-gram language model.
Based on the above missed- and over-translation scores, translation accuracy score, and translation fluency score, the overall score of this practice session is defined as:

Score = α_1·(1 − Score1) + α_2·(1 − Score2) + α_3·Score3 + α_4·Score4

where α_i, i = 1, 2, 3, 4 are the weights of the respective scores, whose specific values can be set from experimental results or experience.
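Combining the component scores into the practice total might look like this; the component values and the equal weights α_i below are illustrative assumptions, since the text leaves both to experiment or experience.

```python
def total_score(s1, s2, s3, s4, alphas=(0.25, 0.25, 0.25, 0.25)):
    a1, a2, a3, a4 = alphas
    # Score1 and Score2 are error rates, so they enter as (1 - rate)
    return a1 * (1 - s1) + a2 * (1 - s2) + a3 * s3 + a4 * s4

M, N = 10, 9          # source / target sentence counts (assumed)
missed, extra = 1, 1  # sentences missed / over-translated (assumed)
score1 = missed / M   # missed-translation score
score2 = extra / N    # over-translation score
score3 = 0.8          # mean aligned-pair confidence (assumed)
score4 = 0.7          # language-model fluency score (assumed)
total = total_score(score1, score2, score3, score4)
```

Raising the weight on fluency versus accuracy simply shifts the α vector, which is why the text leaves the weights tunable.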
Consistent with the embodiment shown in Fig. 2a above, referring to Fig. 3, Fig. 3 is a structural schematic diagram of an electronic device 300 provided by an embodiment of the application. As shown, the electronic device 300 includes an application processor 310, a memory 320, a communication interface 330, and one or more programs 321, wherein the one or more programs 321 are stored in the memory 320 and configured to be executed by the application processor 310, the one or more programs 321 including instructions for performing the following steps:
obtaining a first voice serving as the evaluation reference in a first evaluation mode, and obtaining a second voice to be evaluated; processing the first voice to obtain a first text, and processing the second voice to obtain a second text; obtaining a first text detection strategy corresponding to the first evaluation mode; and processing the first text and the second text according to the first text detection strategy to obtain a detection result for the second voice.
As can be seen, in the embodiment of the application, when the electronic device performs speech detection under different test modes, it can dynamically select the text detection strategy specific to the current evaluation mode and process the texts corresponding to the voices according to that strategy to obtain the detection result, thereby realizing detection of the voice to be evaluated. This avoids the situation where a single detection strategy cannot be adapted to different evaluation modes, and helps improve the flexibility and comprehensiveness of speech evaluation by the electronic device.
In a possible example, the first evaluation mode includes a repetition test mode; the language of the first voice is the same as the language of the second voice, and the language of the first text is the same as the language of the second text.
In a possible example, in the aspect of processing the first text and the second text according to the first text detection strategy to obtain the detection result for the second voice, the instructions in the program are specifically for performing the following operations: determining the alignment information of the first text and the second text; determining the repetition accuracy of the second voice according to the alignment information; and generating the detection result according to the alignment information and/or the repetition accuracy.
In a possible example, in the aspect of generating the detection result according to the alignment information and/or the repetition accuracy, the instructions in the program are specifically for performing the following operations: calculating a reference repetition quality parameter of the second text according to the alignment information, the reference repetition quality parameter including a missed-repetition rate and/or an over-repetition rate; and determining the detection result according to the reference repetition quality parameter and/or the repetition accuracy.
In a possible example, the first evaluation mode includes an interpreting test mode; the language of the first voice differs from the language of the second voice, and the language of the first text differs from the language of the second text.
In a possible example, in the aspect of processing the first text and the second text according to the first text detection strategy to obtain the detection result for the second voice, the instructions in the program are specifically for performing the following operations: determining the alignment information of the first text and the second text; determining the translation fluency of the second text; and generating the detection result according to the alignment information and/or the translation fluency.
In a possible example, in the aspect of generating the detection result according to the alignment information and/or the translation fluency, the instructions in the program are specifically for performing the following operations: determining a reference interpreting quality parameter of the second voice according to the alignment information, the reference interpreting quality parameter including at least one of: a missed-translation rate, an over-translation rate, and a translation accuracy rate; and generating the detection result according to the reference interpreting quality parameter and/or the translation fluency.
In a possible example, in the aspect of determining the alignment information of the first text and the second text, the instructions in the program are specifically for performing the following operations: determining the intertranslation confidence matrix of the first text and the second text; screening out an optimal translation alignment path from the intertranslation confidence matrix; and determining the alignment information according to the optimal translation alignment path.
In a possible example, in the aspect of determining the intertranslation confidence matrix of the first text and the second text, the instructions in the program are specifically for performing the following operations: obtaining a forward translation model and a reverse translation model, the forward translation model being used to convert the language of the first text into the language of the second text, and the reverse translation model being used to convert the language of the second text into the language of the first text; and determining, through the forward translation model, the reverse translation model, the first text and the second text, the intertranslation confidence of each first text unit and each second text unit to obtain the intertranslation confidence matrix, the first text including multiple first text units and the second text including multiple second text units.
In a possible example, in the aspect of determining, through the forward translation model, the reverse translation model, the first text and the second text, the intertranslation confidence of each first text unit and each second text unit to obtain the intertranslation confidence matrix, the instructions in the program are specifically for performing the following operations: calculating the forward translation confidence of the second text through the forward translation model, the first text and the second text; calculating the reverse translation confidence of the second text through the reverse translation model, the first text and the second text; and determining the intertranslation confidence of each second text unit relative to each first text unit according to the forward translation confidence and the reverse translation confidence, to obtain the intertranslation confidence matrix.
The above mainly describes the solutions of the embodiments of the application from the perspective of the method-side execution process. It can be understood that, in order to realize the above functions, the electronic device includes corresponding hardware structures and/or software modules for performing each function. Those skilled in the art should readily appreciate that, in combination with the exemplary units and algorithm steps described in the embodiments presented herein, the application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a function is actually executed by hardware, or by computer software driving hardware, depends on the specific application and design constraints of the technical solution. Skilled professionals may use different methods to implement the described functions for each specific application, but such implementation should not be considered beyond the scope of the application.
The embodiments of the application may divide the electronic device into functional units according to the above method examples; for example, each functional unit may correspond to one function, or two or more functions may be integrated into one processing unit. The integrated unit may be implemented in the form of hardware, or in the form of a software functional unit. It should be noted that the division of units in the embodiments of the application is schematic and is merely a logical functional division; other division manners may exist in actual implementation.
Fig. 4 is a block diagram of the functional units of a speech evaluation apparatus 400 involved in an embodiment of the application. The speech evaluation apparatus 400 is applied to an electronic device and includes a processing unit 401 and a communication unit 402, wherein the processing unit 401 is configured to: obtain, through the communication unit 402, a first voice serving as the evaluation reference in a first evaluation mode, and obtain a second voice to be evaluated through the communication unit, the first evaluation mode including a repetition test mode or an interpreting test mode; process the first voice to obtain a first text, and process the second voice to obtain a second text; obtain a first text detection strategy corresponding to the first evaluation mode; and process the first text and the second text according to the first text detection strategy to obtain a detection result for the second voice.
The speech evaluation apparatus 400 may further include a storage unit 403 for storing program code and data of the electronic device. The processing unit 401 may be a processor, the communication unit 402 may be an internal communication interface, and the storage unit 403 may be a memory.
As can be seen, in the embodiment of the application, when the electronic device performs speech detection under different test modes, it can dynamically select the text detection strategy specific to the current evaluation mode and process the texts corresponding to the voices according to that strategy to obtain the detection result, thereby realizing detection of the voice to be evaluated. This avoids the situation where a single detection strategy cannot be adapted to different evaluation modes, and helps improve the flexibility and comprehensiveness of speech evaluation by the electronic device.
In a possible example, the first evaluation mode includes a repetition test mode; the language of the first voice is the same as the language of the second voice, and the language of the first text is the same as the language of the second text.
In a possible example, in the aspect of processing the first text and the second text according to the first text detection strategy to obtain the detection result for the second voice, the processing unit 401 is specifically configured to: determine the alignment information of the first text and the second text; determine the repetition accuracy of the second voice according to the alignment information; and generate the detection result according to the alignment information and/or the repetition accuracy.
In a possible example, in the aspect of generating the detection result according to the alignment information and/or the repetition accuracy, the processing unit 401 is specifically configured to: calculate a reference repetition quality parameter of the second text according to the alignment information, the reference repetition quality parameter including a missed-repetition rate and/or an over-repetition rate; and determine the detection result according to the reference repetition quality parameter and/or the repetition accuracy.
In a possible example, the first evaluation mode includes an interpreting test mode; the language of the first voice differs from the language of the second voice, and the language of the first text differs from the language of the second text.
In a possible example, in the aspect of processing the first text and the second text according to the first text detection strategy to obtain the detection result for the second voice, the processing unit 401 is specifically configured to: determine the alignment information of the first text and the second text; determine the translation fluency of the second text; and generate the detection result according to the alignment information and/or the translation fluency.
In a possible example, in terms of generating a detection result according to the alignment information and/or the translation fluency, the processing unit 401 is specifically configured to: determine a reference interpreting quality parameter of the second voice according to the alignment information, the reference interpreting quality parameter including at least one of the following: a missed-translation rate, an extra-translation rate, and a translation accuracy; and generate a detection result according to the reference interpreting quality parameter and/or the translation fluency.
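By analogy with the repeat-mode sketch, the interpreting-mode quality parameters can be illustrated from an alignment plus per-pair confidences. Everything below (the pair list, the confidence dictionary, and the 0.5 threshold) is a hypothetical illustration, not the patent's concrete computation.

```python
def interpreting_quality(n_src_units, n_tgt_units, aligned_pairs, conf, threshold=0.5):
    # aligned_pairs: (source_unit, target_unit) index pairs from the alignment.
    # conf[(i, j)]: mutual-translation confidence of an aligned pair.
    aligned_src = {i for i, _ in aligned_pairs}
    aligned_tgt = {j for _, j in aligned_pairs}
    missed_rate = 1 - len(aligned_src) / n_src_units  # source units never translated
    extra_rate = 1 - len(aligned_tgt) / n_tgt_units   # target units with no source
    correct = sum(1 for p in aligned_pairs if conf[p] >= threshold)
    accuracy = correct / len(aligned_pairs) if aligned_pairs else 0.0
    return missed_rate, extra_rate, accuracy

pairs = [(0, 0), (1, 1)]
conf = {(0, 0): 0.9, (1, 1): 0.3}
missed, extra, acc = interpreting_quality(3, 2, pairs, conf)
print(missed, extra, acc)  # one of three source units unaligned; one pair below threshold
```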
In a possible example, in terms of determining the alignment information between the first text and the second text, the processing unit 401 is specifically configured to: determine a mutual-translation confidence matrix of the first text and the second text; select an optimal translation alignment path from the mutual-translation confidence matrix; and determine alignment information according to the optimal translation alignment path.
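One plausible, purely hypothetical reading of "selecting an optimal translation alignment path from the confidence matrix" is a dynamic-programming search for a monotone path through the matrix that maximizes accumulated confidence, in the spirit of DTW-style alignment; the move set and scoring below are assumptions, not the patent's specified algorithm.

```python
def best_alignment_path(conf):
    # conf: m x n mutual-translation confidence matrix (list of lists).
    # Finds a monotone path from (0, 0) to (m-1, n-1) maximizing summed
    # confidence; allowed moves are down, right, or diagonal.
    m, n = len(conf), len(conf[0])
    score = [[float("-inf")] * n for _ in range(m)]
    back = [[None] * n for _ in range(m)]
    score[0][0] = conf[0][0]
    for i in range(m):
        for j in range(n):
            if i == 0 and j == 0:
                continue
            for pi, pj in ((i - 1, j), (i, j - 1), (i - 1, j - 1)):
                if pi >= 0 and pj >= 0 and score[pi][pj] + conf[i][j] > score[i][j]:
                    score[i][j] = score[pi][pj] + conf[i][j]
                    back[i][j] = (pi, pj)
    # Trace the optimal path back from the final cell.
    path, cell = [], (m - 1, n - 1)
    while cell is not None:
        path.append(cell)
        cell = back[cell[0]][cell[1]]
    return path[::-1]

path = best_alignment_path([[0.9, 0.1], [0.2, 0.8]])
print(path)  # [(0, 0), (1, 0), (1, 1)]
```

The cells on the returned path then give the aligned (first-text unit, second-text unit) pairs that constitute the alignment information.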
In a possible example, in terms of determining the mutual-translation confidence matrix of the first text and the second text, the processing unit 401 is specifically configured to: obtain a forward translation model and a reverse translation model, the forward translation model being used to convert the language of the first text into the language of the second text, and the reverse translation model being used to convert the language of the second text into the language of the first text; and determine, through the forward translation model, the reverse translation model, the first text, and the second text, a mutual-translation confidence of each first text unit and each second text unit to obtain a mutual-translation confidence matrix, wherein the first text includes multiple first text units and the second text includes multiple second text units.
In a possible example, in terms of determining, through the forward translation model, the reverse translation model, the first text, and the second text, the mutual-translation confidence of each first text unit and each second text unit to obtain the mutual-translation confidence matrix, the processing unit 401 is specifically configured to: calculate a forward translation confidence of the second text through the forward translation model, the first text, and the second text; calculate a reverse translation confidence of the second text through the reverse translation model, the first text, and the second text; and determine, according to the forward translation confidence and the reverse translation confidence, a mutual-translation confidence of each second text unit relative to each first text unit to obtain a mutual-translation confidence matrix.
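How the forward and reverse translation confidences are combined into a single per-unit-pair score is not spelled out here. As a hypothetical sketch under that assumption, a geometric mean of the two directions gives a symmetric score that is high only when both models agree; the matrices below are invented example values.

```python
import math

def mutual_confidence_matrix(fwd_conf, rev_conf):
    # fwd_conf[i][j]: confidence that second-text unit j translates first-text
    #                 unit i under the forward model (first -> second language).
    # rev_conf[i][j]: the same pair scored by the reverse model.
    # Geometric mean combines both directions into one mutual confidence.
    m, n = len(fwd_conf), len(fwd_conf[0])
    return [[math.sqrt(fwd_conf[i][j] * rev_conf[i][j]) for j in range(n)]
            for i in range(m)]

M = mutual_confidence_matrix([[0.64, 0.81]], [[0.25, 1.0]])
print(M)  # [[0.4, 0.9]]
```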
An embodiment of the present application further provides a computer storage medium, wherein the computer storage medium stores a computer program for electronic data interchange, and the computer program causes a computer to execute some or all of the steps of any method recorded in the above method embodiments. The above computer includes an electronic device.
An embodiment of the present application further provides a computer program product. The computer program product includes a non-transitory computer-readable storage medium storing a computer program, and the computer program is operable to cause a computer to execute some or all of the steps of any method recorded in the above method embodiments. The computer program product may be a software installation package, and the above computer includes an electronic device.
It should be noted that, for simplicity of description, the foregoing method embodiments are expressed as a series of action combinations. However, those skilled in the art should understand that the present application is not limited by the described order of actions, because according to the present application some steps may be performed in other orders or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in a certain embodiment, reference may be made to the related descriptions of other embodiments.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division of the units is only a logical function division, and other division manners may exist in actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not executed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical or in other forms.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solutions of the embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a computer-readable memory. Based on this understanding, the technical solution of the present application, in essence, or the part contributing to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a memory and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods in the embodiments of the present application. The aforementioned memory includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a removable hard disk, a magnetic disk, or an optical disk.
Those of ordinary skill in the art can understand that all or part of the steps in the methods of the above embodiments may be completed by a program instructing relevant hardware. The program may be stored in a computer-readable memory, and the memory may include a flash disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, an optical disk, or the like.
The embodiments of the present application are described in detail above, and specific examples are used herein to explain the principles and implementations of the present application. The description of the above embodiments is only intended to help understand the method and core ideas of the present application. Meanwhile, those skilled in the art may make changes to the specific implementations and application scope according to the ideas of the present application. In summary, the contents of this specification should not be construed as limiting the present application.
Claims (13)
1. A speech evaluation method, comprising:
obtaining a first voice serving as an evaluation reference in a first evaluation mode, and obtaining a second voice to be evaluated;
processing the first voice to obtain a first text, and processing the second voice to obtain a second text;
obtaining a first text detection strategy corresponding to the first evaluation mode; and
processing the first text and the second text according to the first text detection strategy to obtain a detection result for the second voice.
2. The method according to claim 1, wherein the first evaluation mode includes a repeat test mode; the language of the first voice is the same as the language of the second voice, and the language of the first text is the same as the language of the second text.
3. The method according to claim 2, wherein the processing the first text and the second text according to the first text detection strategy to obtain a detection result for the second voice comprises:
determining alignment information between the first text and the second text;
determining a repeat accuracy of the second voice according to the alignment information; and
generating a detection result according to the alignment information and/or the repeat accuracy.
4. The method according to claim 3, wherein the generating a detection result according to the alignment information and/or the repeat accuracy comprises:
calculating a reference repeat quality parameter of the second text according to the alignment information, the reference repeat quality parameter including a missed-repetition rate and/or an extra-repetition rate; and
determining a detection result according to the reference repeat quality parameter and/or the repeat accuracy.
5. The method according to claim 1, wherein the first evaluation mode includes an interpreting test mode; the language of the first voice is different from the language of the second voice, and the language of the first text is different from the language of the second text.
6. The method according to claim 5, wherein the processing the first text and the second text according to the first text detection strategy to obtain a detection result for the second voice comprises:
determining alignment information between the first text and the second text;
determining a translation fluency of the second text; and
generating a detection result according to the alignment information and/or the translation fluency.
7. The method according to claim 6, wherein the generating a detection result according to the alignment information and/or the translation fluency comprises:
determining a reference interpreting quality parameter of the second voice according to the alignment information, the reference interpreting quality parameter including at least one of the following: a missed-translation rate, an extra-translation rate, and a translation accuracy; and
generating a detection result according to the reference interpreting quality parameter and/or the translation fluency.
8. The method according to claim 6 or 7, wherein the determining alignment information between the first text and the second text comprises:
determining a mutual-translation confidence matrix of the first text and the second text;
selecting an optimal translation alignment path from the mutual-translation confidence matrix; and
determining alignment information according to the optimal translation alignment path.
9. The method according to claim 8, wherein the determining a mutual-translation confidence matrix of the first text and the second text comprises:
obtaining a forward translation model and a reverse translation model, the forward translation model being used to convert the language of the first text into the language of the second text, and the reverse translation model being used to convert the language of the second text into the language of the first text; and
determining, through the forward translation model, the reverse translation model, the first text, and the second text, a mutual-translation confidence of each first text unit and each second text unit to obtain a mutual-translation confidence matrix, wherein the first text includes multiple first text units and the second text includes multiple second text units.
10. The method according to claim 9, wherein the determining, through the forward translation model, the reverse translation model, the first text, and the second text, a mutual-translation confidence of each first text unit and each second text unit to obtain a mutual-translation confidence matrix comprises:
calculating a forward translation confidence of the second text through the forward translation model, the first text, and the second text;
calculating a reverse translation confidence of the second text through the reverse translation model, the first text, and the second text; and
determining, according to the forward translation confidence and the reverse translation confidence, a mutual-translation confidence of each second text unit relative to each first text unit to obtain a mutual-translation confidence matrix.
11. A speech evaluation apparatus, comprising a processing unit and a communication unit, wherein
the processing unit is configured to: obtain, through the communication unit, a first voice serving as an evaluation reference in a first evaluation mode, and obtain a second voice to be evaluated through the communication unit, the first evaluation mode including a repeat test mode or an interpreting test mode; process the first voice to obtain a first text, and process the second voice to obtain a second text; obtain a first text detection strategy corresponding to the first evaluation mode; and process the first text and the second text according to the first text detection strategy to obtain a detection result for the second voice.
12. An electronic device, comprising a processor, a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, and the programs include instructions for executing the steps in the method according to any one of claims 1-10.
13. A computer-readable storage medium storing a computer program for electronic data interchange, wherein the computer program causes a computer to execute the method according to any one of claims 1-10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910422699.3A CN110148413B (en) | 2019-05-21 | 2019-05-21 | Voice evaluation method and related device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110148413A true CN110148413A (en) | 2019-08-20 |
CN110148413B CN110148413B (en) | 2021-10-08 |
Family
ID=67592304
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910422699.3A Active CN110148413B (en) | 2019-05-21 | 2019-05-21 | Voice evaluation method and related device |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110148413B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110134952A (en) * | 2019-04-29 | 2019-08-16 | 华南师范大学 | A kind of Error Text rejection method for identifying, device and storage medium |
CN111402924A (en) * | 2020-02-28 | 2020-07-10 | 联想(北京)有限公司 | Spoken language evaluation method and device and computer readable storage medium |
CN112562737A (en) * | 2021-02-25 | 2021-03-26 | 北京映客芝士网络科技有限公司 | Method, device, medium and electronic equipment for evaluating audio processing quality |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2000222406A (en) * | 1999-01-27 | 2000-08-11 | Sony Corp | Voice recognition and translation device and its method |
JP5986883B2 (en) * | 2012-10-23 | 2016-09-06 | 日本電信電話株式会社 | Language model evaluation method, apparatus and program |
CN107316638A (en) * | 2017-06-28 | 2017-11-03 | 北京粉笔未来科技有限公司 | A kind of poem recites evaluating method and system, a kind of terminal and storage medium |
CN107578778A (en) * | 2017-08-16 | 2018-01-12 | 南京高讯信息科技有限公司 | A kind of method of spoken scoring |
Non-Patent Citations (2)
Title |
---|
YU JINGSONG ET AL.: "Research on a High-Accuracy Bilingual Chunk Alignment Algorithm", JOURNAL OF CHINESE INFORMATION PROCESSING *
CHENG WEI ET AL.: "A Bilingual Chunk Processing Method for Chinese-English Spoken Language Translation", JOURNAL OF CHINESE INFORMATION PROCESSING *
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110134952A (en) * | 2019-04-29 | 2019-08-16 | 华南师范大学 | A kind of Error Text rejection method for identifying, device and storage medium |
CN110134952B (en) * | 2019-04-29 | 2020-03-31 | 华南师范大学 | Error text rejection method, device and storage medium |
CN111402924A (en) * | 2020-02-28 | 2020-07-10 | 联想(北京)有限公司 | Spoken language evaluation method and device and computer readable storage medium |
CN111402924B (en) * | 2020-02-28 | 2024-04-19 | 联想(北京)有限公司 | Spoken language evaluation method, device and computer readable storage medium |
CN112562737A (en) * | 2021-02-25 | 2021-03-26 | 北京映客芝士网络科技有限公司 | Method, device, medium and electronic equipment for evaluating audio processing quality |
CN112562737B (en) * | 2021-02-25 | 2021-06-22 | 北京映客芝士网络科技有限公司 | Method, device, medium and electronic equipment for evaluating audio processing quality |
Also Published As
Publication number | Publication date |
---|---|
CN110148413B (en) | 2021-10-08 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109918680B (en) | Entity identification method and device and computer equipment | |
CN108288468B (en) | Audio recognition method and device | |
JP2019102063A (en) | Method and apparatus for controlling page | |
US7412389B2 (en) | Document animation system | |
CN108711420A (en) | Multilingual hybrid model foundation, data capture method and device, electronic equipment | |
CN103677729B (en) | Voice input method and system | |
WO2022078146A1 (en) | Speech recognition method and apparatus, device, and storage medium | |
CN108877782A (en) | Audio recognition method and device | |
CN110147451B (en) | Dialogue command understanding method based on knowledge graph | |
CN110310619A (en) | Polyphone prediction technique, device, equipment and computer readable storage medium | |
CN110148413A (en) | Speech evaluating method and relevant apparatus | |
CN108228576B (en) | Text translation method and device | |
CN109754783A (en) | Method and apparatus for determining the boundary of audio sentence | |
CN106202288B (en) | A kind of optimization method and system of man-machine interactive system knowledge base | |
CN111694937A (en) | Interviewing method and device based on artificial intelligence, computer equipment and storage medium | |
CN112487139A (en) | Text-based automatic question setting method and device and computer equipment | |
CN109461459A (en) | Speech assessment method, apparatus, computer equipment and storage medium | |
CN110457661A (en) | Spatial term method, apparatus, equipment and storage medium | |
CN109741641A (en) | Langue leaning system based on new word detection | |
CN110223365A (en) | A kind of notes generation method, system, device and computer readable storage medium | |
CN112463942A (en) | Text processing method and device, electronic equipment and computer readable storage medium | |
CN104536677A (en) | Three-dimensional digital portrait with intelligent voice interaction function | |
CN109325178A (en) | Method and apparatus for handling information | |
CN107945802A (en) | Voice recognition result processing method and processing device | |
CN109448717A (en) | A kind of phonetic word spelling recognition methods, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||