CN110176225A - Method and apparatus for assessing the performance of prosody prediction - Google Patents
Method and apparatus for assessing the performance of prosody prediction
- Publication number: CN110176225A
- Application number: CN201910461506.5A
- Authority: CN (China)
- Prior art keywords: artificial, test case, result, weight, prosodic labeling
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G — Physics
- G10 — Musical instruments; acoustics
- G10L — Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
- G10L13/00 — Speech synthesis; text-to-speech systems
- G10L13/02 — Methods for producing synthetic speech; speech synthesisers
- G10L13/08 — Text analysis or generation of parameters for speech synthesis out of text, e.g. grapheme-to-phoneme translation, prosody generation or stress or intonation determination
- G10L13/10 — Prosody rules derived from text; stress or intonation
Abstract
This application provides a method and apparatus for assessing the performance of prosody prediction. The method includes: obtaining the prosody prediction result corresponding to each of multiple test cases in a test case set, each result being produced by the prosody prediction engine under assessment; determining the weight of each test case's prosody prediction result based on the previously obtained weight of each manual prosodic labeling result in the test cases' manual prosodic labeling result sets, where the weight of a manual prosodic labeling result characterizes how reasonable that labeling is; and determining the assessment result for the engine's prosody prediction performance according to the test cases' prosody prediction results and the weights of those results. The assessment method provided by this application can evaluate the prediction performance of a prosody prediction engine automatically, efficiently, and objectively.
Description
Technical field
This application relates to the field of speech synthesis, and in particular to a method and apparatus for assessing the performance of prosody prediction.
Background art
Prosody prediction is an indispensable part of a speech synthesis system. It belongs to the front end of the system and predicts the positions of prosodic boundaries in text data; the back end of the system then inserts audio pauses according to those boundary positions.
The quality of a prosody prediction engine's output directly affects the overall quality of the synthesized speech. To obtain higher synthesis quality, the prediction performance of the prosody prediction engine usually needs to be assessed.
At present, the prosody prediction performance of an engine is assessed manually: human assessors evaluate the engine's prosody prediction results. However, manual assessment is vulnerable to subjective factors (such as an assessor's experience and state), which lowers the credibility of the assessment results, and its labor and time costs are high.
Summary of the invention
In view of this, this application provides a method and apparatus for assessing the performance of prosody prediction, to solve the problems that manual assessment is vulnerable to subjective factors, which lowers the credibility of assessment results, and that its labor and time costs are high. The technical solution is as follows:
A method for assessing the performance of prosody prediction, comprising:
obtaining the prosody prediction result corresponding to each of multiple test cases in a test case set, wherein each test case's prosody prediction result is produced by the prosody prediction engine under assessment;
determining the weight of each test case's prosody prediction result based on the previously obtained weight of each manual prosodic labeling result in the test case's manual prosodic labeling result set, wherein the manual prosodic labeling result set of a test case includes at least one manual prosodic labeling result for that test case, and the weight of a manual prosodic labeling result characterizes how reasonable that labeling is;
determining the assessment result for the prosody prediction performance of the engine under assessment according to the test cases' prosody prediction results and the weights of those results.
Optionally, obtaining the weight of each manual prosodic labeling result in the manual prosodic labeling result sets of the multiple test cases comprises:
obtaining the manual prosodic labeling result set corresponding to each of the multiple test cases;
synthesizing audio from each manual prosodic labeling result in those sets, obtaining a synthesized-audio set for each test case;
determining the weight of each manual prosodic labeling result according to the best manual prosodic labeling result that each of multiple listeners chooses for each test case, wherein a listener's best manual prosodic labeling result for a test case is the labeling the listener selects from the test case's manual prosodic labeling result set after listening to each audio clip in the test case's synthesized-audio set.
Optionally, determining the weight of each manual prosodic labeling result in the manual prosodic labeling result sets of the multiple test cases according to the best manual prosodic labeling result that each of the multiple listeners chooses for each test case comprises:
dividing the test cases in the test case set into multiple groups, each group forming a test case subset, to obtain multiple test case subsets;
taking one not-yet-processed test case subset from the multiple test case subsets as the target test case subset;
determining a target weight for each of the multiple listeners based on the listeners' initial weights and the best manual prosodic labeling result each listener chooses for each test case in the target test case subset;
determining, from the listeners' target weights, the weight of each manual prosodic labeling result in the manual prosodic labeling result set of each test case in the target test case subset;
taking the listeners' target weights as their new initial weights and returning to the step of taking one not-yet-processed test case subset as the target test case subset, until no unprocessed test case subset remains, so as to obtain the weight of each manual prosodic labeling result in the manual prosodic labeling result sets of the multiple test cases.
Optionally, determining the target weight of each of the multiple listeners based on the listeners' initial weights and the best manual prosodic labeling result each listener chooses for each test case in the target test case subset comprises:
determining, from the best manual prosodic labeling results the listeners choose, the test case quantity ratio corresponding to each pair of listeners, wherein the test case quantity ratio of two listeners is: the total number of test cases in the target test case subset for which the two listeners choose the same best manual prosodic labeling result, divided by the total number of test cases in the target test case subset;
determining the target weight of each of the multiple listeners based on the listeners' initial weights and the pairwise test case quantity ratios.
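The claim above defines the pairwise "test case quantity ratio" but leaves open exactly how initial weights and those ratios combine into target weights. One plausible reading, used here purely as an illustrative assumption (the function names and the normalization step are not from the patent), is to score each listener by the initial-weight-weighted agreement of the rest of the panel with that listener:

```python
def agreement_ratio(choices_a, choices_b, subset):
    """Test case quantity ratio of two listeners: the fraction of cases in
    the subset for which both picked the same best manual labeling."""
    same = sum(1 for case in subset if choices_a[case] == choices_b[case])
    return same / len(subset)

def target_weights(init_weights, choices, subset):
    """Assumed update rule: each listener's target weight is the sum of the
    other listeners' initial weights times their agreement with him or her,
    normalized so the weights sum to 1 across the panel."""
    listeners = list(init_weights)
    raw = {}
    for p in listeners:
        raw[p] = sum(init_weights[q] * agreement_ratio(choices[p], choices[q], subset)
                     for q in listeners if q != p)
    total = sum(raw.values())
    return {p: v / total for p, v in raw.items()} if total else dict(init_weights)

# toy panel: three listeners, three test cases; p2 agrees most with the others
choices = {
    "p1": {"c1": "A", "c2": "B", "c3": "A"},
    "p2": {"c1": "A", "c2": "B", "c3": "B"},
    "p3": {"c1": "C", "c2": "B", "c3": "B"},
}
w = target_weights({"p1": 1.0, "p2": 1.0, "p3": 1.0}, choices, ["c1", "c2", "c3"])
# p2 ends up with the largest target weight
```

In the iterative scheme of the preceding claim, `w` would replace the initial weights before the next test case subset is processed.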
Optionally, determining, from the listeners' target weights, the weight of each manual prosodic labeling result in the manual prosodic labeling result set of each test case in the target test case subset comprises:
for any manual prosodic labeling result in the manual prosodic labeling result set of any test case in the target test case subset, determining the weight of that labeling result from the target weights of the listeners who chose it as the best manual prosodic labeling result, so as to obtain the weight of each manual prosodic labeling result in the manual prosodic labeling result set of each test case in the target test case subset.
Optionally, determining the weight of the manual prosodic labeling result from the target weights of the listeners who chose it as the best manual prosodic labeling result comprises: summing the target weights of those listeners, and taking the sum as the weight of the manual prosodic labeling result.
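The summation rule just stated can be sketched directly; a minimal illustration (the names are hypothetical, not from the patent):

```python
def labeling_weight_by_vote(target_weights, best_choices):
    """Weight of each manual prosodic labeling result for one test case:
    the sum of the target weights of the listeners who chose it as best."""
    weights = {}
    for listener, labeling in best_choices.items():
        weights[labeling] = weights.get(labeling, 0.0) + target_weights[listener]
    return weights

# listeners p1 and p2 (weights 0.3 and 0.4) both pick labeling "A"; p3 picks "B"
w = labeling_weight_by_vote(
    {"p1": 0.3, "p2": 0.4, "p3": 0.3},
    {"p1": "A", "p2": "A", "p3": "B"},
)
# labeling "A" accumulates 0.3 + 0.4, labeling "B" gets 0.3
```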
Optionally, determining the weight of each test case's prosody prediction result based on the previously obtained weight of each manual prosodic labeling result in the test cases' manual prosodic labeling result sets comprises:
for the prosody prediction result of any test case, finding, in the test case's manual prosodic labeling result set, the manual prosodic labeling result that is consistent with the prosody prediction result, and taking the weight of that labeling result as the weight of the test case's prosody prediction result, so as to obtain the weight of each test case's prosody prediction result.
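Under this claim, a prediction's weight is looked up from the manual labeling that matches it. A sketch, under the assumption (not stated in the patent) that a prediction matching no manual labeling receives weight zero:

```python
def prediction_weight(prediction, labeling_weights):
    """Weight of the engine's prediction for one test case: the weight of
    the consistent manual labeling, or 0.0 when no labeling is consistent
    (the zero fallback is an assumption, not from the patent text)."""
    return labeling_weights.get(prediction, 0.0)

# the prediction matches labeling "A", whose weight is 0.7
assert prediction_weight("A", {"A": 0.7, "B": 0.3}) == 0.7
# a prediction matching no labeling contributes no weight
assert prediction_weight("C", {"A": 0.7, "B": 0.3}) == 0.0
```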
Optionally, determining the assessment result for the prosody prediction performance of the engine under assessment according to the test cases' prosody prediction results and the weights of those results comprises:
determining a score for the engine's prosody prediction performance from the test cases' prosody prediction results and their weights;
determining the ratio of that score to the highest manual score, and taking the ratio as the assessment result for the engine's prosody prediction performance;
wherein the highest manual score is obtained by summing the maximum weight of each test case, and the maximum weight of a test case is the largest weight among the manual prosodic labeling results in the test case's manual prosodic labeling result set.
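The score-to-highest-manual-score ratio can be sketched as follows. Combining the per-case weights into the score by a plain sum is an assumption (the claim only says the score is determined from the results and their weights); all names are illustrative:

```python
def assessment_ratio(prediction_weights, labeling_weight_sets):
    """Engine score = sum of per-test-case prediction weights; the
    assessment result is that score divided by the highest manual score,
    i.e. the sum over test cases of the largest labeling weight."""
    score = sum(prediction_weights.values())
    highest_manual = sum(max(ws.values()) for ws in labeling_weight_sets.values())
    return score / highest_manual

ratio = assessment_ratio(
    {"c1": 0.7, "c2": 0.3},  # engine matched labeling "A" on c1, "B" on c2
    {"c1": {"A": 0.7, "B": 0.3}, "c2": {"A": 0.7, "B": 0.3}},
)
# the engine picked the top labeling on c1 but the weaker one on c2
```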
An apparatus for assessing the performance of prosody prediction, comprising: a prosody prediction result obtaining module, a prosody prediction result weight determining module, and a prosody prediction performance assessment module;
the prosody prediction result obtaining module is configured to obtain the prosody prediction result corresponding to each of multiple test cases in a test case set, wherein each test case's prosody prediction result is produced by the prosody prediction engine under assessment;
the prosody prediction result weight determining module is configured to determine the weight of each test case's prosody prediction result based on the previously obtained weight of each manual prosodic labeling result in the test case's manual prosodic labeling result set, wherein a test case's manual prosodic labeling result set includes at least one manual prosodic labeling result for that test case, and the weight of a manual prosodic labeling result characterizes how reasonable that labeling is;
the prosody prediction performance assessment module is configured to determine the assessment result for the prosody prediction performance of the engine under assessment according to the test cases' prosody prediction results and the weights of those results.
The apparatus for assessing the performance of prosody prediction further comprises: a manual prosodic labeling result set obtaining module, an audio synthesis module, and a manual prosodic labeling result weight determining module;
the manual prosodic labeling result set obtaining module is configured to obtain the manual prosodic labeling result set corresponding to each of the multiple test cases;
the audio synthesis module is configured to synthesize audio from each manual prosodic labeling result in those sets, obtaining a synthesized-audio set for each test case;
the manual prosodic labeling result weight determining module is configured to determine the weight of each manual prosodic labeling result according to the best manual prosodic labeling result each of multiple listeners chooses for each test case, wherein a listener's best manual prosodic labeling result for a test case is the labeling the listener selects from the test case's manual prosodic labeling result set after listening to each audio clip in the test case's synthesized-audio set.
Optionally, the manual prosodic labeling result weight determining module comprises: a grouping submodule, an obtaining submodule, a first weight determining submodule, a second weight determining submodule, and an initial weight determining submodule;
the grouping submodule is configured to divide the test cases in the test case set into multiple groups, each group forming a test case subset, to obtain multiple test case subsets;
the obtaining submodule is configured to take one not-yet-processed test case subset from the multiple test case subsets as the target test case subset;
the first weight determining submodule is configured to determine a target weight for each of the multiple listeners based on the listeners' initial weights and the best manual prosodic labeling result each listener chooses for each test case in the target test case subset;
the second weight determining submodule is configured to determine, from the listeners' target weights, the weight of each manual prosodic labeling result in the manual prosodic labeling result set of each test case in the target test case subset;
the initial weight determining submodule is configured to take the listeners' target weights as their new initial weights and then trigger the obtaining submodule to take another not-yet-processed test case subset as the target test case subset, until no unprocessed test case subset remains, so as to obtain the weight of each manual prosodic labeling result in the manual prosodic labeling result sets of the multiple test cases.
Optionally, the first weight determining submodule is specifically configured to: determine the test case quantity ratio corresponding to each pair of listeners from the best manual prosodic labeling result each listener chooses for each test case in the target test case subset, and determine each listener's target weight based on the listeners' initial weights and the pairwise test case quantity ratios;
wherein the test case quantity ratio of two listeners is: the total number of test cases in the target test case subset for which the two listeners choose the same best manual prosodic labeling result, divided by the total number of test cases in the target test case subset.
Optionally, the second weight determining submodule is specifically configured to: for any manual prosodic labeling result in the manual prosodic labeling result set of any test case in the target test case subset, determine the weight of that labeling result from the target weights of the listeners who chose it as the best manual prosodic labeling result, so as to obtain the weight of each manual prosodic labeling result in the manual prosodic labeling result set of each test case in the target test case subset.
Optionally, the prosody prediction result weight determining module is specifically configured to: for the prosody prediction result of any test case, find, in the test case's manual prosodic labeling result set, the manual prosodic labeling result that is consistent with the prosody prediction result, and take the weight of that labeling result as the weight of the test case's prosody prediction result, so as to obtain the weight of each test case's prosody prediction result.
Optionally, the prosody prediction performance assessment module comprises: a prosody prediction performance score determining submodule and an assessment result determining submodule;
the prosody prediction performance score determining submodule is configured to determine a score for the engine's prosody prediction performance from the test cases' prosody prediction results and their weights;
the assessment result determining submodule is configured to determine the ratio of that score to the highest manual score, and take the ratio as the assessment result for the engine's prosody prediction performance; wherein the highest manual score is obtained by summing the maximum weight of each test case, and the maximum weight of a test case is the largest weight among the manual prosodic labeling results in the test case's manual prosodic labeling result set.
A device for assessing the performance of prosody prediction, comprising: a memory and a processor;
the memory is configured to store a program;
the processor is configured to execute the program to implement each step of the above method for assessing the performance of prosody prediction.
A readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements each step of the above method for assessing the performance of prosody prediction.
As can be seen from the above scheme, the method and apparatus for assessing the performance of prosody prediction provided by this application first obtain the prosody prediction result corresponding to each of multiple test cases in a test case set, then determine the weight of each test case's prosody prediction result based on the previously obtained weights of the manual prosodic labeling results in the test cases' manual prosodic labeling result sets, and finally determine the assessment result for the prosody prediction performance of the engine under assessment according to the prosody prediction results and their weights. The method can therefore assess the engine's prediction performance automatically, based on the weights of the manual prosodic labeling results. Compared with existing manual assessment, it not only avoids the influence of subjective factors on the assessment result but also saves labor and reduces assessment time.
Brief description of the drawings
To explain the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a flow diagram of the method for assessing the performance of prosody prediction provided by the embodiments of this application;
Fig. 2 is a flow diagram of obtaining the weight of each manual prosodic labeling result in the manual prosodic labeling result sets of multiple test cases, provided by the embodiments of this application;
Fig. 3 is a flow diagram of determining the weight of each manual prosodic labeling result in the manual prosodic labeling result sets of multiple test cases according to the best manual prosodic labeling result each of multiple listeners chooses for each test case, provided by the embodiments of this application;
Fig. 4 is a flow diagram of determining the assessment result for the prosody prediction performance of the engine under assessment according to the weights of the test cases' prosody prediction results, provided by the embodiments of this application;
Fig. 5 is a structural diagram of the apparatus for assessing the performance of prosody prediction provided by the embodiments of this application;
Fig. 6 is a structural diagram of the device for assessing the performance of prosody prediction provided by the embodiments of this application.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention will be described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
In the existing manual assessment method, multiple assessors evaluate the prosody prediction result of each test case in a test case set (that is, the results the prosody prediction engine under assessment produces for the test case set). Specifically, the assessors proofread the prosodic labels of the prediction results, or of the audio synthesized from them, pick out the unacceptable prosody prediction results, and then compute an unacceptability rate.
The inventors found that this manual assessment is highly subjective: when different assessors listen to audio synthesized from the same prosody prediction results, their results can differ by up to 25%, and even the same assessor's results for the same audio can differ considerably across time periods. This makes the assessment results less credible. Moreover, manual assessment needs considerable manpower and takes a long time, i.e., its labor and time costs are high.
In addition, in the manual assessment method, the manually labeled data cannot be reused. That is, both before and after the prosody prediction engine is optimized, assessors must again proofread the prosodic labels of the engine's prediction results, or of the audio synthesized from them, to pick out the unacceptable results; the labeled data produced before optimization has no value for assessing the engine's prediction results after optimization.
In view of these problems with manual assessment, the inventors conducted in-depth research and finally proposed an effective method for assessing the prediction performance of a prosody prediction engine. The method is applicable to scenarios in which the prediction performance of a prosody prediction engine needs to be assessed; it can assess the engine automatically, efficiently, and objectively, and it can be applied to a terminal or to a server.
The method for assessing the performance of prosody prediction provided by this application is introduced through the following embodiments.
Referring to Fig. 1, which shows a flow diagram of the method for assessing the performance of prosody prediction provided by the embodiments of this application, the method may include:
Step S101: obtain the prosody prediction result corresponding to each of multiple test cases in a test case set.
The test case set includes multiple test cases chosen, in preset proportions, from different domains of user-scenario big data; each test case's prosody prediction result is produced by the prosody prediction engine under assessment.
Step S102: determining the weights of the prosody prediction results corresponding to the multiple test cases, based on the pre-obtained weight of each artificial prosodic labeling result in the artificial prosodic labeling result set corresponding to each of the multiple test cases.

Here, the artificial prosodic labeling result set corresponding to any test case includes at least one artificial prosodic labeling result of that test case.
Considering that the same test case may have multiple reasonable prosodic labeling results, for any test case the present application may obtain multiple reasonable artificial prosodic labeling results, which form the artificial prosodic labeling result set corresponding to that test case; these multiple reasonable artificial prosodic labeling results may be obtained by having multiple labeling personnel mark the prosodic boundary positions of the test case.

Here, the weight of any prosodic labeling result characterizes how reasonable that labeling result is.

The process of obtaining, in advance, the weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases in the test case set is explained in a subsequent embodiment.
Step S103: determining the assessment result of the prosody prediction effect of the prosody prediction engine to be assessed, according to the prosody prediction results corresponding to the multiple test cases and the weights of those prosody prediction results.

Specifically, a score for the prosody prediction effect of the engine to be assessed may be determined from the prosody prediction results corresponding to the multiple test cases in the test case set and the weights of those prediction results, and the assessment result of the engine's prosody prediction effect is then determined based on that score.
In the method for assessing prosody prediction effect provided by the embodiments of the present application, the prosody prediction results corresponding to the multiple test cases in the test case set are first obtained; then, based on the pre-obtained weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases, the weights of the prosody prediction results corresponding to the multiple test cases are determined; finally, the assessment result of the prosody prediction effect of the engine to be assessed is determined according to those weights. It can thus be seen that the method can automatically assess the prosody prediction effect of the engine to be assessed based on the weights of the artificial prosodic labeling results; compared with the existing manual evaluation approach, it not only avoids the influence of subjective human factors on the assessment result, but also saves manpower and reduces the time consumed by the assessment.
In addition, in the present embodiment, the artificial prosodic labeling result sets corresponding to the multiple test cases in the test case set, and the weights of the artificial prosodic labeling results in those sets, only need to be obtained once: they can then be used to assess the prediction effect of every version of the prosody prediction engine. In other words, the artificial prosodic labeling result sets and the weights of the artificial prosodic labeling results in them are reusable.

Next, the process of obtaining, in advance, the weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases in the test case set is introduced.
Referring to Fig. 2, which shows a flow diagram of obtaining the weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases in the test case set, the process may include:

Step S201: obtaining the artificial prosodic labeling result sets corresponding to the multiple test cases in the test case set.

Here, the artificial prosodic labeling results in the result set corresponding to any test case are obtained by having multiple labeling personnel mark the prosodic phrase boundary positions of that test case.
Step S202: performing audio synthesis on each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases, to obtain the synthesized audio sets corresponding to the multiple test cases.

Specifically, for any test case, audio synthesis is performed separately on each artificial prosodic labeling result in the result set corresponding to that test case, and the set composed of the synthesized audio serves as the synthesized audio set corresponding to that test case; in this way the synthesized audio sets corresponding to the multiple test cases are obtained.
Step S203: determining the weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases, according to the best artificial prosodic labeling result chosen for each test case by each of multiple audiometry personnel.

Here, the best artificial prosodic labeling result chosen by any one of the audiometry personnel for any test case is the result that that person selects from the test case's artificial prosodic labeling result set after listening to each synthesized audio in the test case's synthesized audio set. It should be noted that when one of the audiometry personnel listens to the synthesized audio set of a test case, they pick out the best audio from that set, and the artificial prosodic labeling result corresponding to the best audio is the best artificial prosodic labeling result.

Step S203 — determining the weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases, according to the best artificial prosodic labeling result chosen for each test case by each of the multiple audiometry personnel — is introduced below.
Referring to Fig. 3, which shows a flow diagram of determining the weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases, according to the best artificial prosodic labeling result chosen for each test case by each of the multiple audiometry personnel, the process may include:

Step S301: dividing the test cases in the test case set into multiple groups, each group of test cases forming one test case subset, thereby obtaining multiple test case subsets.

Illustratively, if the test case set includes 100 test cases, the 100 test cases may be divided into 5 groups of 20 test cases each, yielding 5 test case subsets.
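The grouping of step S301 can be sketched as follows. This is a minimal illustration under stated assumptions: the function name split_into_subsets and the string test-case identifiers are ours, not from the embodiment, and the set is assumed to divide evenly into the chosen number of groups.

```python
# Step S301 sketch: split the test case set into equal groups,
# each group forming one test case subset.

def split_into_subsets(test_cases, num_groups):
    """Divide the test-case set into num_groups subsets of equal size."""
    group_size = len(test_cases) // num_groups
    return [test_cases[i * group_size:(i + 1) * group_size]
            for i in range(num_groups)]

cases = [f"case_{i}" for i in range(100)]   # the 100 test cases of the example
subsets = split_into_subsets(cases, 5)      # 5 subsets of 20 test cases each
```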
Step S302: obtaining, from the multiple test case subsets, one test case subset that has not yet been obtained, as the target test case subset.

Step S303: determining the target weights respectively corresponding to the multiple audiometry personnel, based on the initial weights respectively corresponding to those personnel and the best artificial prosodic labeling result chosen by each of them for each test case in the target test case subset.
Specifically, the implementation of step S303 may include:

Step S3031: determining the test case quantity ratio corresponding to any two of the audiometry personnel, from the best artificial prosodic labeling results chosen by each of the audiometry personnel for each test case in the target test case subset.

Here, the test case quantity ratio corresponding to any two of the audiometry personnel is the ratio of the total number of test cases in the target test case subset for which the two chose the same best artificial prosodic labeling result to the total number of test cases in the target test case subset.
Illustratively, suppose the target test case subset includes three test cases A, B and C, whose artificial prosodic labeling result sets are {a1, a2, a3}, {b1, b2, b3} and {c1, c2, c3} respectively. Suppose audiometry person x chooses a2 as the best artificial prosodic labeling result for test case A, b1 for test case B and c3 for test case C, while audiometry person y chooses a1 for test case A, b1 for test case B and c3 for test case C. Then the total number of test cases among A, B and C for which x and y chose the same best artificial prosodic labeling result is 2, the total number of test cases in the target test case subset is 3, and the test case quantity ratio corresponding to x and y is therefore 2/3.
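The computation of step S3031 can be sketched as below, reusing the A/B/C example from the text. The function name and the dict representation of each person's best picks are assumptions made for illustration.

```python
# Step S3031 sketch: test case quantity ratio between two audiometry
# personnel ("listeners") over one target test case subset.

def agreement_ratio(choices_x, choices_y):
    """Fraction of test cases for which the two listeners chose the same
    best artificial prosodic labeling result."""
    same = sum(1 for case in choices_x if choices_x[case] == choices_y[case])
    return same / len(choices_x)

x = {"A": "a2", "B": "b1", "C": "c3"}   # listener x's best picks
y = {"A": "a1", "B": "b1", "C": "c3"}   # listener y's best picks
ratio = agreement_ratio(x, y)           # agree on B and C only -> 2/3
```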
Step S3032: determining the target weights corresponding to the multiple audiometry personnel, based on their corresponding initial weights and the test case quantity ratios corresponding to every two of them.

Here, the higher the target weight corresponding to one of the audiometry personnel, the richer that person's experience in listening tests.

Suppose the number of audiometry personnel is Np. In the present embodiment, the Np-dimensional vector formed by the initial weights of the Np audiometry personnel may serve as the initial probability distribution corresponding to the target test case subset, and the Np×Np matrix formed by the test case quantity ratios of every two audiometry personnel may serve as the transfer matrix corresponding to the target test case subset. It should be noted that, for the first target test case subset, the value of each element in the corresponding initial probability distribution is 1/Np.

For a target test case subset Si, after its corresponding initial probability distribution V(i-1) and transfer matrix Mi are obtained, several iterations may be performed according to lim_{n→∞} Mi^n · V(i-1) until the calculated result tends to be steady; the final calculated result is the probability distribution Vi corresponding to the target test case subset Si, and the value of each element of Vi is the target weight corresponding to one of the multiple audiometry personnel, where i takes values from 1 to Q and Q is the number of test case subsets.
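A minimal sketch of the iteration lim_{n→∞} Mi^n · V(i-1) described above. The embodiment only says to iterate until the result tends to be steady; renormalizing the vector after each multiplication (standard power iteration, so it stays a probability distribution) is our assumption, as are the example numbers.

```python
import numpy as np

# Step S3032 sketch: power iteration on the agreement ("transfer") matrix
# of the audiometry personnel to obtain their target weights.

def listener_weights(M, v0, tol=1e-10, max_iter=1000):
    """Iterate v <- M v (renormalized to sum 1) until steady; the fixed
    vector's elements are the listeners' target weights."""
    v = np.asarray(v0, dtype=float)
    for _ in range(max_iter):
        v_next = M @ v
        v_next = v_next / v_next.sum()      # keep v a probability vector
        if np.abs(v_next - v).max() < tol:  # "tends to be steady"
            return v_next
        v = v_next
    return v

# 3 listeners; M[i, j] = test case quantity ratio between listeners i and j
M = np.array([[1.0, 2/3, 1/3],
              [2/3, 1.0, 1/3],
              [1/3, 1/3, 1.0]])
v0 = np.full(3, 1/3)            # first subset: uniform 1/Np initial weights
w = listener_weights(M, v0)     # listeners who agree more get higher weight
```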
Step S304: determining, from the target weights corresponding to the audiometry personnel, the weight of each artificial prosodic labeling result in the artificial prosodic labeling result set corresponding to each test case in the target test case subset.

Specifically, for any artificial prosodic labeling result in the result set corresponding to any test case in the target test case subset: the weight of that artificial prosodic labeling result is determined from the target weights of the audiometry personnel who chose it as the best artificial prosodic labeling result; in this way the weight of each artificial prosodic labeling result in the result set corresponding to each test case in the target test case subset is obtained.

Further, for any such artificial prosodic labeling result, the target weights of the audiometry personnel who chose it as the best artificial prosodic labeling result may be summed, and the summed value serves as the weight of that artificial prosodic labeling result.
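The summation of step S304 can be sketched as follows for a single test case; the dict names and the example listener weights are illustrative assumptions, not values from the embodiment.

```python
# Step S304 sketch: the weight of an artificial prosodic labeling result
# is the sum of the target weights of the listeners who chose it as best.

def labeling_weights(best_picks, listener_weight):
    """best_picks maps listener -> chosen labeling result for one test case;
    returns each labeling result's weight."""
    weights = {}
    for listener, result in best_picks.items():
        weights[result] = weights.get(result, 0.0) + listener_weight[listener]
    return weights

picks = {"x": "a2", "y": "a1", "z": "a2"}   # best picks for test case A
lw = {"x": 0.4, "y": 0.25, "z": 0.35}       # listener target weights
wa = labeling_weights(picks, lw)            # a2 -> 0.75, a1 -> 0.25
```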
Step S305: judging whether any test case subset among the multiple test case subsets has not yet been obtained; if there is a test case subset that has not yet been obtained, executing step S306; if not, ending the weight determination process.

Step S306: taking the target weights corresponding to the multiple audiometry personnel as their corresponding initial weights, and then executing step S302.
In the present embodiment, when calculating the weight of each artificial prosodic labeling result for the first target test case subset, the vector [1/Np 1/Np … 1/Np]^T serves as the initial probability distribution V0 corresponding to the first target test case subset, and the probability distribution V1 corresponding to the first target test case subset is determined from V0 and the transfer matrix M1 of the first target test case subset. When calculating the weights of the artificial prosodic labeling results for the second target test case subset, V1 serves as the initial probability distribution corresponding to the second target test case subset, and the probability distribution V2 of the second target test case subset is determined from V1 and the transfer matrix M2 of the second target test case subset, and so on.

Via the above process, the weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases in the test case set can be obtained.
It should be noted that the present embodiment groups the test cases in the test case set and determines weights separately for each group of test cases. Since each computation involves relatively little data, the computation is highly efficient; in addition, using the probability distribution of the previous target test case subset as the initial probability distribution of the next target test case subset reduces the number of iterations, making the calculated result tend to be steady more quickly, which further improves computational efficiency.
It should further be noted that the above steps S301–S306 are a preferred implementation of step S203; the present embodiment does not limit step S203 to being realized only by steps S301–S306, and it may also be realized in other ways. For example, the test case set need not be grouped: a single initial probability distribution V0 (an Np-dimensional column vector in which each element is 1/Np) may be set for the entire test case set, the test case quantity ratios of every two audiometry personnel then form the transfer matrix M (Np×Np) corresponding to the entire test case set, and several iterations are performed according to lim_{n→∞} M^n · V0 until the calculated result tends to be steady. The final calculated result is the probability distribution V corresponding to the entire test case set; the value of each element of V is the target weight of one of the multiple audiometry personnel, and from the target weights of the audiometry personnel the weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases in the entire test case set is determined.
The foregoing describes the process of obtaining, in advance, the weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases in the test case set. Next, step S102 of the above embodiment — determining the weights of the prosody prediction results corresponding to the multiple test cases, based on the pre-obtained weight of each artificial prosodic labeling result in the corresponding artificial prosodic labeling result sets — is introduced.
The process of determining the weights of the prosody prediction results corresponding to the multiple test cases, based on the pre-obtained weight of each artificial prosodic labeling result in the corresponding artificial prosodic labeling result sets, may include: for the prosody prediction result corresponding to any test case, determining, from the test case's artificial prosodic labeling result set, the artificial prosodic labeling result that is consistent with that prosody prediction result, and taking the weight of the determined artificial prosodic labeling result as the weight of the test case's prosody prediction result; in this way the weights of the prosody prediction results corresponding to the multiple test cases in the test case set are obtained.
Illustratively, suppose the prosody prediction result corresponding to a test case is x and the test case's artificial prosodic labeling result set is {a1, a2, a3}; if prosody prediction result x is consistent with artificial prosodic labeling result a2, the weight of a2 serves as the weight of x. It should be noted that prosody prediction result x being consistent with artificial prosodic labeling result a2 may mean that x has the same prosodic phrase pause positions as a2.

It should also be noted that, for the prosody prediction result corresponding to any test case, if no artificial prosodic labeling result consistent with it exists in the test case's artificial prosodic labeling result set, the weight of that prosody prediction result is set to 0.
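The weight assignment just described — the matching labeling result's weight when the prediction is consistent with one of the artificial prosodic labeling results, and 0 otherwise — can be sketched as follows; the names and numbers are assumptions for illustration.

```python
# Step S102 sketch: a prediction's weight is the weight of the consistent
# artificial prosodic labeling result (same pause positions), else 0.

def prediction_weight(prediction, labeling_weights):
    """labeling_weights maps each artificial labeling result to its weight."""
    return labeling_weights.get(prediction, 0.0)

weights_A = {"a1": 0.25, "a2": 0.75, "a3": 0.0}  # weights for test case A
w_match = prediction_weight("a2", weights_A)     # consistent with a2 -> 0.75
w_miss = prediction_weight("a4", weights_A)      # nothing consistent -> 0
```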
After the weights of the prosody prediction results corresponding to the multiple test cases in the test case set are obtained, the assessment result of the prosody prediction effect of the prosody prediction engine to be assessed can be determined according to those weights.

Referring to Fig. 4, which shows a flow diagram of determining the assessment result of the prosody prediction effect of the engine to be assessed according to the prosody prediction results corresponding to the multiple test cases and the weights of those prosody prediction results, the process may include:
Step S401: determining a score for the prosody prediction effect of the prosody prediction engine to be assessed, from the prosody prediction results corresponding to the multiple test cases and the weights of those prosody prediction results.

Specifically, the weights of the prosody prediction results corresponding to the multiple test cases may be summed, and the normalized summed value serves as the score of the prosody prediction effect of the engine to be assessed.
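Step S401, together with the ratio to the artificial top score described in step S402 below, can be sketched as one computation. In this minimal illustration the normalization of the score is folded into the division by the artificial top score; the function name and all numbers are assumptions rather than values from the embodiment.

```python
# Steps S401-S402 sketch: the engine's score is the sum of the prediction
# weights; the assessment result is that score as a fraction of the
# "artificial top score" (sum of each case's maximum labeling weight).

def assessment_result(pred_weights, labeling_sets):
    """pred_weights: weight of the engine's prediction per test case;
    labeling_sets: per-case dict of labeling result -> weight."""
    score = sum(pred_weights)
    top = sum(max(ws.values()) for ws in labeling_sets)
    return score / top

preds = [0.75, 0.25, 0.6]                  # prediction weights for 3 cases
sets_ = [{"a1": 0.25, "a2": 0.75},
         {"b1": 0.25, "b2": 0.75},
         {"c1": 0.6, "c2": 0.4}]
ratio = assessment_result(preds, sets_)    # 1.6 / 2.1
```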
Step S402: determining the ratio of the score of the prosody prediction effect of the engine to be assessed to the artificial top score, as the assessment result of the engine's prosody prediction effect.

Here, the artificial top score is obtained by summing the maximum weights corresponding to the multiple test cases, where the maximum weight corresponding to any test case is the largest of the weights of the artificial prosodic labeling results in that test case's artificial prosodic labeling result set.

In the present embodiment, the ratio of the score of the engine's prosody prediction effect to the artificial top score reflects the degree to which most users approve of the prosody prediction effect of the engine to be assessed; it can be understood that the larger this ratio, the higher the degree of approval of the prosody prediction effect of the engine to be assessed.
The method for assessing prosody prediction effect provided by the embodiments of the present application can automatically assess the prosody prediction effect of the engine to be assessed; compared with the existing manual evaluation approach, it not only avoids the influence of subjective human factors on the assessment result, but also saves manpower and reduces the time consumed by the assessment. That is, the method can assess the prosody prediction effect of the engine to be assessed automatically, efficiently, and objectively. In addition, in the present embodiment, the artificial prosodic labeling result sets corresponding to the multiple test cases in the test case set, and the weights of the artificial prosodic labeling results in those sets, are reusable.
The embodiments of the present application also provide an apparatus for assessing prosody prediction effect. The apparatus is described below; the apparatus described below and the method described above may be referred to in correspondence with each other.

Referring to Fig. 5, which shows a structural diagram of an apparatus for assessing prosody prediction effect provided by the embodiments of the present application, the apparatus may include: a prosody prediction result obtaining module 501, a prosody prediction result weight determining module 502, and a prosody prediction effect evaluation module 503.
The prosody prediction result obtaining module 501 is used for obtaining the prosody prediction results corresponding to the multiple test cases in the test case set.

Here, the prosody prediction result corresponding to each test case is obtained by prediction with the prosody prediction engine to be assessed.

The prosody prediction result weight determining module 502 is used for determining the weights of the prosody prediction results corresponding to the multiple test cases, based on the pre-obtained weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases.

Here, the artificial prosodic labeling result set corresponding to any test case includes at least one artificial prosodic labeling result of that test case, and the weight of any artificial prosodic labeling result characterizes how reasonable that result is.

The prosody prediction effect evaluation module 503 is used for determining the assessment result of the prosody prediction effect of the prosody prediction engine to be assessed, according to the prosody prediction results corresponding to the multiple test cases and the weights of those prosody prediction results.
In the apparatus for assessing prosody prediction effect provided by the embodiments of the present application, the prosody prediction results corresponding to the multiple test cases in the test case set are first obtained; then, based on the pre-obtained weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases, the weights of the prosody prediction results corresponding to the multiple test cases are determined; finally, the assessment result of the prosody prediction effect of the engine to be assessed is determined according to those weights. It can thus be seen that the apparatus can automatically assess the prosody prediction effect of the engine to be assessed based on the weights of the artificial prosodic labeling results; compared with the existing manual evaluation approach, it not only avoids the influence of subjective human factors on the assessment result, but also saves manpower and reduces the time consumed by the assessment.
In one possible implementation, the apparatus for assessing prosody prediction effect provided by the above embodiment further includes: an artificial prosodic labeling result set obtaining module, an audio synthesis module, and an artificial prosodic labeling result weight determining module.

The artificial prosodic labeling result set obtaining module is used for obtaining the artificial prosodic labeling result sets corresponding to the multiple test cases.

The audio synthesis module is used for performing audio synthesis on each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases, to obtain the synthesized audio sets corresponding to the multiple test cases.

The artificial prosodic labeling result weight determining module is used for determining the weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases, according to the best artificial prosodic labeling result chosen for each test case by each of multiple audiometry personnel.

Here, the best artificial prosodic labeling result chosen by any one of the audiometry personnel for any test case is the result that that person selects from the test case's artificial prosodic labeling result set after listening to each synthesized audio in the test case's synthesized audio set.
In one possible implementation, the artificial prosodic labeling result weight determining module includes: a grouping submodule, an obtaining submodule, a first weight determining submodule, a second weight determining submodule, and an initial weight determining submodule.

The grouping submodule is used for dividing the test cases in the test case set into multiple groups, each group of test cases forming one test case subset, to obtain multiple test case subsets.

The obtaining submodule is used for obtaining, from the multiple test case subsets, one test case subset that has not yet been obtained, as the target test case subset.

The first weight determining submodule is used for determining the target weights corresponding to the multiple audiometry personnel, based on their corresponding initial weights and the best artificial prosodic labeling result chosen by each of them for each test case in the target test case subset.

The second weight determining submodule is used for determining, from the target weights corresponding to the audiometry personnel, the weight of each artificial prosodic labeling result in the artificial prosodic labeling result set corresponding to each test case in the target test case subset.

The initial weight determining submodule is used for determining the target weights corresponding to the multiple audiometry personnel as their corresponding initial weights, and then triggering the obtaining submodule to obtain, from the multiple test case subsets, another test case subset that has not yet been obtained as the target test case subset, until no test case subset remains unobtained, so as to obtain the weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases in the test case set.
In one possible implementation, the first weight determining submodule is specifically used for: determining the test case quantity ratio corresponding to any two of the audiometry personnel, from the best artificial prosodic labeling results chosen by each of the audiometry personnel for each test case in the target test case subset; and determining the target weights corresponding to the multiple audiometry personnel, based on their corresponding initial weights and the test case quantity ratios corresponding to every two of them.

Here, the test case quantity ratio of any two of the audiometry personnel is the ratio of the total number of test cases in the target test case subset for which the two chose the same best artificial prosodic labeling result to the total number of test cases in the target test case subset.
In one possible implementation, the second weight determining submodule is specifically used for: for any artificial prosodic labeling result in the artificial prosodic labeling result set corresponding to any test case in the target test case subset, determining the weight of that artificial prosodic labeling result from the target weights of the audiometry personnel who chose it as the best artificial prosodic labeling result, so as to obtain the weight of each artificial prosodic labeling result in the result sets corresponding to the test cases in the target test case subset.

When determining the weight of an artificial prosodic labeling result from the target weights of the audiometry personnel who chose it as the best artificial prosodic labeling result, the second weight determining submodule is specifically used for summing the target weights of those audiometry personnel, the summed value serving as the weight of that artificial prosodic labeling result.
In one possible implementation, in the evaluation device for prosody prediction effect provided by the above embodiment, the prosody prediction result weight determining module 502 is specifically configured to, for the prosody prediction result corresponding to any test case: determine, from the artificial prosodic labeling result set corresponding to the test case, the artificial prosodic labeling result consistent with the prosody prediction result corresponding to the test case, and use the weight of the determined artificial prosodic labeling result as the weight of the prosody prediction result corresponding to the test case, thereby obtaining the weights of the prosody prediction results corresponding to the multiple test cases.
In one possible implementation, in the evaluation device for prosody prediction effect provided by the above embodiment, the prosody prediction effect evaluation module 503 may include a prosody prediction effect score determining submodule and an evaluation result determining submodule.
The prosody prediction effect score determining submodule is configured to determine the score of the prosody prediction effect of the prosody prediction engine to be evaluated from the prosody prediction results corresponding to the multiple test cases and the weights of those prosody prediction results.
The evaluation result determining submodule is configured to determine the ratio of the score of the prosody prediction effect of the prosody prediction engine to be evaluated to an artificial top score, as the evaluation result of the prosody prediction effect of the prosody prediction engine to be evaluated.
The artificial top score is obtained by summing the maximum weights corresponding to the multiple test cases, where the maximum weight corresponding to any test case is the largest weight among the weights of the artificial prosodic labeling results in the artificial prosodic labeling result set corresponding to that test case.
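The score-and-ratio computation described above can be sketched roughly as follows. This is a minimal illustration only; the function name, the input layout, and the treatment of an unmatched prediction as weight 0.0 are assumptions, not details taken from the embodiment.

```python
def evaluate_prosody_engine(prediction_weights, annotation_weight_sets):
    """Evaluation result: the engine's score as a share of the
    artificial top score.

    prediction_weights: for each test case, the weight of the engine's
      prosody prediction result (assumed 0.0 when it matches no
      artificial prosodic labeling result).
    annotation_weight_sets: for each test case, the weights of all
      artificial prosodic labeling results in its result set.
    """
    # Score of the engine to be evaluated: sum of prediction weights.
    score = sum(prediction_weights)
    # Artificial top score: sum over test cases of the maximum weight
    # in each artificial prosodic labeling result set.
    top_score = sum(max(weights) for weights in annotation_weight_sets)
    return score / top_score

# Hypothetical numbers for three test cases.
ratio = evaluate_prosody_engine(
    [0.6, 0.0, 0.9],
    [[0.6, 0.4], [0.7, 0.3], [0.9, 0.1]],
)  # 1.5 / 2.2
```

A ratio close to 1 means the engine's predictions coincide with the labelings the weighted audiometry personnel trust most.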
The embodiment of the present application also provides an evaluation apparatus for prosody prediction effect. Referring to Fig. 6, which shows a schematic structural diagram of the apparatus, the apparatus may include: at least one processor 601, at least one communication interface 602, at least one memory 603, and at least one communication bus 604.
In the embodiment of the present application, there is at least one of each of the processor 601, the communication interface 602, the memory 603, and the communication bus 604, and the processor 601, the communication interface 602, and the memory 603 communicate with one another through the communication bus 604.
The processor 601 may be a central processing unit (CPU), an application-specific integrated circuit (ASIC), or one or more integrated circuits configured to implement the embodiments of the present invention.
The memory 603 may include a high-speed RAM memory, and may also include a non-volatile memory, for example at least one magnetic disk storage.
The memory stores a program, the processor may invoke the program stored in the memory, and the program is configured to:
obtain prosody prediction results corresponding to multiple test cases in a test case set, wherein the prosody prediction result corresponding to each test case is obtained through prediction by a prosody prediction engine to be evaluated;
determine weights of the prosody prediction results corresponding to the multiple test cases based on pre-obtained weights of the artificial prosodic labeling results in the artificial prosodic labeling result sets corresponding to the multiple test cases, wherein the artificial prosodic labeling result set corresponding to any test case includes at least one artificial prosodic labeling result corresponding to the test case, and the weight of any artificial prosodic labeling result characterizes the reasonableness of that artificial prosodic labeling result;
determine the evaluation result of the prosody prediction effect of the prosody prediction engine to be evaluated according to the prosody prediction results corresponding to the multiple test cases and the weights of those prosody prediction results.
Optionally, for refinements and extensions of the functions of the program, reference may be made to the description above.
The embodiment of the present application also provides a readable storage medium. The readable storage medium may store a program suitable for execution by a processor, and the program is configured to:
obtain prosody prediction results corresponding to multiple test cases in a test case set, wherein the prosody prediction result corresponding to each test case is obtained through prediction by a prosody prediction engine to be evaluated;
determine weights of the prosody prediction results corresponding to the multiple test cases based on pre-obtained weights of the artificial prosodic labeling results in the artificial prosodic labeling result sets corresponding to the multiple test cases, wherein the artificial prosodic labeling result set corresponding to any test case includes at least one artificial prosodic labeling result corresponding to the test case, and the weight of any artificial prosodic labeling result characterizes the reasonableness of that artificial prosodic labeling result;
determine the evaluation result of the prosody prediction effect of the prosody prediction engine to be evaluated according to the prosody prediction results corresponding to the multiple test cases and the weights of those prosody prediction results.
Finally, it should be noted that, herein, relational terms such as "first" and "second" are used merely to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise", or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device. In the absence of further limitations, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a progressive manner; each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments may refer to one another.
The foregoing description of the disclosed embodiments enables those skilled in the art to implement or use the present invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the general principles defined herein may be realized in other embodiments without departing from the spirit or scope of the present invention. Therefore, the present invention is not intended to be limited to the embodiments shown herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
Claims (15)
1. A method for evaluating prosody prediction effect, characterized by comprising:
obtaining prosody prediction results corresponding to multiple test cases in a test case set, wherein the prosody prediction result corresponding to each test case is obtained through prediction by a prosody prediction engine to be evaluated;
determining weights of the prosody prediction results corresponding to the multiple test cases based on pre-obtained weights of the artificial prosodic labeling results in the artificial prosodic labeling result sets corresponding to the multiple test cases, wherein the artificial prosodic labeling result set corresponding to any test case includes at least one artificial prosodic labeling result corresponding to the test case, and the weight of any artificial prosodic labeling result characterizes the reasonableness of that artificial prosodic labeling result;
determining an evaluation result of the prosody prediction effect of the prosody prediction engine to be evaluated according to the prosody prediction results corresponding to the multiple test cases and the weights of those prosody prediction results.
2. The method for evaluating prosody prediction effect according to claim 1, characterized in that obtaining the weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases comprises:
obtaining the artificial prosodic labeling result sets corresponding to the multiple test cases;
performing audio synthesis on each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases, to obtain synthesized audio sets corresponding to the multiple test cases;
determining the weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases according to the best artificial prosodic labeling result chosen by each of multiple audiometry personnel for each test case;
wherein the best artificial prosodic labeling result chosen by any audiometry personnel for any test case is the best artificial prosodic labeling result that the audiometry personnel selects from the artificial prosodic labeling result set corresponding to the test case by listening to each synthesized audio in the synthesized audio set corresponding to the test case.
3. The method for evaluating prosody prediction effect according to claim 2, characterized in that determining the weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases according to the best artificial prosodic labeling result chosen by each of the multiple audiometry personnel for each test case comprises:
dividing the test cases in the test case set into multiple groups, each group of test cases forming a test case subset, to obtain multiple test case subsets;
obtaining, from the multiple test case subsets, a test case subset that has not yet been obtained, as a target test case subset;
determining target weights corresponding to the multiple audiometry personnel based on initial weights corresponding to the multiple audiometry personnel and the best artificial prosodic labeling result chosen by each audiometry personnel for each test case in the target test case subset;
determining, from the target weights corresponding to the audiometry personnel, the weight of each artificial prosodic labeling result in the artificial prosodic labeling result set corresponding to each test case in the target test case subset;
using the target weights corresponding to the multiple audiometry personnel as the initial weights corresponding to the multiple audiometry personnel, and then returning to the step of obtaining, from the multiple test case subsets, a test case subset that has not yet been obtained as a target test case subset, until no test case subset remains unobtained, to obtain the weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases.
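The iterative procedure of claim 3 can be sketched roughly as follows. This is a hedged illustration only: the claim does not fix the exact update rule, so the agreement-weighted update, the uniform initial weights, and the normalization below are assumptions, and all names are hypothetical.

```python
def agreement(a, b, subset, choices):
    """Fraction of test cases in the subset for which audiometry
    personnel a and b chose the same best labeling result."""
    same = sum(1 for case in subset if choices[a][case] == choices[b][case])
    return same / len(subset)

def run_rounds(subsets, choices, listeners):
    """One pass over the test case subsets, refining listener weights.

    choices[p][case] is the index of the artificial prosodic labeling
    result that audiometry personnel p picked as best for that case.
    """
    weights = {p: 1.0 / len(listeners) for p in listeners}  # assumed uniform start
    annot = {}  # (case, result index) -> accumulated labeling weight
    for subset in subsets:
        # Target weight of each listener: agreement with the others,
        # weighted by how trusted those others currently are (one
        # assumed concrete form of the update in the claim).
        target = {p: sum(weights[q] * agreement(p, q, subset, choices)
                         for q in listeners if q != p)
                  for p in listeners}
        total = sum(target.values())
        target = {p: w / total for p, w in target.items()}
        # Weight of each labeling result: sum of the target weights of
        # the listeners who picked it as best (as in claims 5 and 6).
        for case in subset:
            for p in listeners:
                key = (case, choices[p][case])
                annot[key] = annot.get(key, 0.0) + target[p]
        weights = target  # target weights become the next initial weights
    return weights, annot

# Hypothetical data: listeners A and B always agree; C often disagrees.
choices = {
    "A": {0: 0, 1: 0, 2: 1, 3: 1},
    "B": {0: 0, 1: 0, 2: 1, 3: 1},
    "C": {0: 1, 1: 0, 2: 0, 3: 1},
}
final_weights, labeling_weights = run_rounds(
    [[0, 1], [2, 3]], choices, ["A", "B", "C"])
```

Because the target weights are normalized, the labeling weights accumulated for any single test case sum to 1, and consistently agreeing listeners end up with the larger weights.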
4. The method for evaluating prosody prediction effect according to claim 3, characterized in that determining the target weights corresponding to the multiple audiometry personnel based on the initial weights corresponding to the multiple audiometry personnel and the best artificial prosodic labeling result chosen by each audiometry personnel for each test case in the target test case subset comprises:
determining, from the best artificial prosodic labeling results chosen by each audiometry personnel for each test case in the target test case subset, a test case quantity ratio corresponding to any two audiometry personnel, wherein the test case quantity ratio corresponding to any two audiometry personnel is: the ratio of the total number of test cases in the target test case subset for which the two audiometry personnel chose the same best artificial prosodic labeling result to the total number of test cases in the target test case subset;
determining the target weights corresponding to the multiple audiometry personnel based on the initial weights corresponding to the multiple audiometry personnel and the test case quantity ratio corresponding to any two audiometry personnel.
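The test case quantity ratio of claim 4 is plain pairwise agreement over the subset. A minimal sketch; the function and variable names, and the example picks, are hypothetical:

```python
def case_quantity_ratio(a, b, subset, choices):
    """Claim 4's ratio for audiometry personnel a and b: the number of
    test cases in the subset for which both chose the same best
    artificial prosodic labeling result, over the subset's size."""
    same = sum(1 for case in subset if choices[a][case] == choices[b][case])
    return same / len(subset)

# Hypothetical picks: case id -> index of the chosen labeling result.
choices = {
    "A": {0: 0, 1: 2, 2: 1},
    "B": {0: 0, 1: 2, 2: 0},
}
ratio = case_quantity_ratio("A", "B", [0, 1, 2], choices)  # agree on 2 of 3
```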
5. The method for evaluating prosody prediction effect according to claim 3, characterized in that determining, from the target weights corresponding to the audiometry personnel, the weight of each artificial prosodic labeling result in the artificial prosodic labeling result set corresponding to each test case in the target test case subset comprises:
for any artificial prosodic labeling result in the artificial prosodic labeling result set corresponding to any test case in the target test case subset:
determining the weight of the artificial prosodic labeling result from the target weights corresponding to the audiometry personnel who chose it as the best artificial prosodic labeling result;
to obtain the weight of each artificial prosodic labeling result in the artificial prosodic labeling result set corresponding to each test case in the target test case subset.
6. The method for evaluating prosody prediction effect according to claim 5, characterized in that determining the weight of the artificial prosodic labeling result from the target weights corresponding to the audiometry personnel who chose it as the best artificial prosodic labeling result comprises:
summing the target weights corresponding to the audiometry personnel who chose the artificial prosodic labeling result as the best artificial prosodic labeling result, and using the summed value as the weight of the artificial prosodic labeling result.
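Claims 5 and 6 together reduce to one sum per labeling result. A small sketch under assumed data structures (the names and numbers are hypothetical):

```python
def labeling_result_weight(case, result_index, choices, target_weights):
    """Weight of one artificial prosodic labeling result: the sum of
    the target weights of the audiometry personnel who chose it as the
    best result for this test case (claims 5 and 6)."""
    return sum(weight for person, weight in target_weights.items()
               if choices[person][case] == result_index)

target_weights = {"A": 0.4, "B": 0.35, "C": 0.25}     # hypothetical
choices = {"A": {7: 1}, "B": {7: 1}, "C": {7: 0}}     # picks for case 7
w1 = labeling_result_weight(7, 1, choices, target_weights)  # A and B chose it
```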
7. The method for evaluating prosody prediction effect according to any one of claims 1 to 6, characterized in that determining the weights of the prosody prediction results corresponding to the multiple test cases based on the pre-obtained weights of the artificial prosodic labeling results in the artificial prosodic labeling result sets corresponding to the multiple test cases comprises:
for the prosody prediction result corresponding to any test case:
determining, from the artificial prosodic labeling result set corresponding to the test case, the artificial prosodic labeling result consistent with the prosody prediction result corresponding to the test case, and using the weight of the determined artificial prosodic labeling result as the weight of the prosody prediction result corresponding to the test case;
to obtain the weights of the prosody prediction results corresponding to the multiple test cases.
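Claim 7's weight assignment is a lookup of the matching labeling result. A sketch; the boundary-marker notation in the example strings and the zero weight for an unmatched prediction are assumptions, not details from the claim:

```python
def prediction_weight(prediction, labeling_results, labeling_weights):
    """Weight of an engine's prosody prediction for one test case: the
    weight of the consistent artificial prosodic labeling result.
    Returning 0.0 when nothing matches is an assumption; the claim
    does not spell out that case."""
    for result, weight in zip(labeling_results, labeling_weights):
        if result == prediction:
            return weight
    return 0.0

# Hypothetical prosodic-boundary labelings of one sentence.
results = ["wo3 men2 #2 hui2 jia1", "wo3 men2 #1 hui2 jia1"]
weights = [0.7, 0.3]
w = prediction_weight("wo3 men2 #1 hui2 jia1", results, weights)
```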
8. The method for evaluating prosody prediction effect according to any one of claims 1 to 6, characterized in that determining the evaluation result of the prosody prediction effect of the prosody prediction engine to be evaluated according to the prosody prediction results corresponding to the multiple test cases and the weights of those prosody prediction results comprises:
determining a score of the prosody prediction effect of the prosody prediction engine to be evaluated from the prosody prediction results corresponding to the multiple test cases and the weights of those prosody prediction results;
determining the ratio of the score of the prosody prediction effect of the prosody prediction engine to be evaluated to an artificial top score, as the evaluation result of the prosody prediction effect of the prosody prediction engine to be evaluated;
wherein the artificial top score is obtained by summing the maximum weights corresponding to the test cases, and the maximum weight corresponding to any test case is the largest weight among the weights of the artificial prosodic labeling results in the artificial prosodic labeling result set corresponding to the test case.
9. A device for evaluating prosody prediction effect, characterized by comprising: a prosody prediction result obtaining module, a prosody prediction result weight determining module, and a prosody prediction effect evaluation module;
the prosody prediction result obtaining module is configured to obtain prosody prediction results corresponding to multiple test cases in a test case set, wherein the prosody prediction result corresponding to each test case is obtained through prediction by a prosody prediction engine to be evaluated;
the prosody prediction result weight determining module is configured to determine weights of the prosody prediction results corresponding to the multiple test cases based on pre-obtained weights of the artificial prosodic labeling results in the artificial prosodic labeling result sets corresponding to the multiple test cases, wherein the artificial prosodic labeling result set corresponding to any test case includes at least one artificial prosodic labeling result corresponding to the test case, and the weight of any artificial prosodic labeling result characterizes the reasonableness of that artificial prosodic labeling result;
the prosody prediction effect evaluation module is configured to determine an evaluation result of the prosody prediction effect of the prosody prediction engine to be evaluated according to the prosody prediction results corresponding to the multiple test cases and the weights of those prosody prediction results.
10. The device for evaluating prosody prediction effect according to claim 9, characterized by further comprising: an artificial prosodic labeling result set obtaining module, an audio synthesis module, and an artificial prosodic labeling result weight determining module;
the artificial prosodic labeling result set obtaining module is configured to obtain the artificial prosodic labeling result sets corresponding to the multiple test cases;
the audio synthesis module is configured to perform audio synthesis on each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases, to obtain synthesized audio sets corresponding to the multiple test cases;
the artificial prosodic labeling result weight determining module is configured to determine the weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases according to the best artificial prosodic labeling result chosen by each of multiple audiometry personnel for each test case; wherein the best artificial prosodic labeling result chosen by any audiometry personnel for any test case is the best artificial prosodic labeling result that the audiometry personnel selects from the artificial prosodic labeling result set corresponding to the test case by listening to each synthesized audio in the synthesized audio set corresponding to the test case.
11. The device for evaluating prosody prediction effect according to claim 9, characterized in that the artificial prosodic labeling result weight determining module comprises: a grouping submodule, an obtaining submodule, a first weight determining submodule, a second weight determining submodule, and an initial weight determining submodule;
the grouping submodule is configured to divide the test cases in the test case set into multiple groups, each group of test cases forming a test case subset, to obtain multiple test case subsets;
the obtaining submodule is configured to obtain, from the multiple test case subsets, a test case subset that has not yet been obtained, as a target test case subset;
the first weight determining submodule is configured to determine target weights corresponding to the multiple audiometry personnel based on initial weights corresponding to the multiple audiometry personnel and the best artificial prosodic labeling result chosen by each audiometry personnel for each test case in the target test case subset;
the second weight determining submodule is configured to determine, from the target weights corresponding to the audiometry personnel, the weight of each artificial prosodic labeling result in the artificial prosodic labeling result set corresponding to each test case in the target test case subset;
the initial weight determining submodule is configured to determine the target weights corresponding to the multiple audiometry personnel as the initial weights corresponding to the multiple audiometry personnel, and then trigger the obtaining submodule to obtain, from the multiple test case subsets, a test case subset that has not yet been obtained as a target test case subset, until no test case subset remains unobtained, to obtain the weight of each artificial prosodic labeling result in the artificial prosodic labeling result sets corresponding to the multiple test cases.
12. The device for evaluating prosody prediction effect according to claim 11, characterized in that the first weight determining submodule is specifically configured to: determine, from the best artificial prosodic labeling results chosen by each audiometry personnel for each test case in the target test case subset, a test case quantity ratio corresponding to any two audiometry personnel; and determine the target weights corresponding to the multiple audiometry personnel based on the initial weights corresponding to the multiple audiometry personnel and the test case quantity ratio corresponding to any two audiometry personnel;
wherein the test case quantity ratio of any two audiometry personnel is: the ratio of the total number of test cases in the target test case subset for which the two audiometry personnel chose the same best artificial prosodic labeling result to the total number of test cases in the target test case subset.
13. The device for evaluating prosody prediction effect according to claim 11, characterized in that the second weight determining submodule is specifically configured to: for any artificial prosodic labeling result in the artificial prosodic labeling result set corresponding to any test case in the target test case subset, determine the weight of the artificial prosodic labeling result from the target weights corresponding to the audiometry personnel who chose it as the best artificial prosodic labeling result, to obtain the weight of each artificial prosodic labeling result in the artificial prosodic labeling result set corresponding to each test case in the target test case subset.
14. The device for evaluating prosody prediction effect according to any one of claims 9 to 13, characterized in that the prosody prediction result weight determining module is specifically configured to, for the prosody prediction result corresponding to any test case: determine, from the artificial prosodic labeling result set corresponding to the test case, the artificial prosodic labeling result consistent with the prosody prediction result corresponding to the test case, and use the weight of the determined artificial prosodic labeling result as the weight of the prosody prediction result corresponding to the test case; to obtain the weights of the prosody prediction results corresponding to the multiple test cases.
15. The device for evaluating prosody prediction effect according to any one of claims 9 to 13, characterized in that the prosody prediction effect evaluation module comprises: a prosody prediction effect score determining submodule and an evaluation result determining submodule;
the prosody prediction effect score determining submodule is configured to determine a score of the prosody prediction effect of the prosody prediction engine to be evaluated from the prosody prediction results corresponding to the multiple test cases and the weights of those prosody prediction results;
the evaluation result determining submodule is configured to determine the ratio of the score of the prosody prediction effect of the prosody prediction engine to be evaluated to an artificial top score, as the evaluation result of the prosody prediction effect of the prosody prediction engine to be evaluated; wherein the artificial top score is obtained by summing the maximum weights corresponding to the test cases, and the maximum weight corresponding to any test case is the largest weight among the weights of the artificial prosodic labeling results in the artificial prosodic labeling result set corresponding to the test case.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910461506.5A CN110176225B (en) | 2019-05-30 | 2019-05-30 | Method and device for evaluating rhythm prediction effect |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110176225A true CN110176225A (en) | 2019-08-27 |
CN110176225B CN110176225B (en) | 2021-08-13 |
Family
ID=67696566
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110797005A (en) * | 2019-11-05 | 2020-02-14 | 百度在线网络技术(北京)有限公司 | Prosody prediction method, apparatus, device, and medium |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101000765A (en) * | 2007-01-09 | 2007-07-18 | 黑龙江大学 | Speech synthetic method based on rhythm character |
CN101051458A (en) * | 2006-04-04 | 2007-10-10 | 中国科学院自动化研究所 | Rhythm phrase predicting method based on module analysis |
CN101131818A (en) * | 2006-07-31 | 2008-02-27 | 株式会社东芝 | Speech synthesis apparatus and method |
US20100075806A1 (en) * | 2008-03-24 | 2010-03-25 | Michael Montgomery | Biorhythm feedback system and method |
CN104485115A (en) * | 2014-12-04 | 2015-04-01 | 上海流利说信息技术有限公司 | Pronunciation evaluation equipment, method and system |
CN105244020A (en) * | 2015-09-24 | 2016-01-13 | 百度在线网络技术(北京)有限公司 | Prosodic hierarchy model training method, text-to-speech method and text-to-speech device |
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |