CN108536668A - Wake-up word assessment method and device, storage medium, and electronic device - Google Patents
- Publication number
- CN108536668A, CN201810159653.2A
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/02—Feature extraction for speech recognition; Selection of recognition unit
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/26—Speech to text systems
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Acoustics & Sound (AREA)
- Human Computer Interaction (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Machine Translation (AREA)
Abstract
The present disclosure provides a wake-up word assessment method and device, a storage medium, and an electronic device. The method includes: obtaining a word to be assessed that is input by a user; extracting assessment features of the word to be assessed, the assessment features indicating the distinguishability of the word to be assessed at the acoustic level and/or the semantic level; and taking the assessment features of the word to be assessed as input and, after processing by a pre-built wake-up word assessment model, determining whether the word to be assessed is suitable as a wake-up word. This scheme helps improve the accuracy of the wake-up word assessment result and, in turn, the wake-up performance of the wake-up word set by the user.
Description
Technical field
The present disclosure relates to the field of speech processing, and in particular to a wake-up word assessment method and device, a storage medium, and an electronic device.
Background
Voice wake-up is an important branch of speech processing, with major applications in smart homes, intelligent robots, in-vehicle systems, smartphones, and similar devices.
In practice, an intelligent terminal captures the voice data input by the user, and a pre-built wake-up model performs wake-up word recognition on it. If the voice data is recognized as the wake-up word, the wake-up succeeds; otherwise it fails.
To improve the user experience, users may set personalized wake-up words according to their needs. To ensure wake-up performance, the wake-up word must first be assessed when the user sets it, to judge whether it is suitable.
Current wake-up word assessment relies mainly on experience or rules. Specifically, the word to be assessed set by the user is obtained and checked against preset assessment conditions; if the conditions are met, the word to be assessed is deemed suitable as a wake-up word. For example, the preset assessment conditions may include: the length of the word exceeds a preset length; and/or the difference between the syllables the word contains exceeds a preset difference. Here, the length of the word may be expressed as the number of characters it contains and/or the audio duration of the corresponding voice data. The difference between syllables may be expressed by checking whether adjacent syllables are identical, counting the number of adjacent syllable pairs that differ, and comparing that count with the preset difference.
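The rule-based check described above might be sketched as follows. This is only an illustration of the prior-art idea: the function name, the use of syllable count as the length measure, and the default thresholds are assumptions, not values given in the text.

```python
def rule_based_check(syllables, preset_length=3, preset_difference=2):
    """Hypothetical rule-based assessment of a candidate wake-up word.

    syllables: list of syllable strings for the word to be assessed.
    Both thresholds are illustrative assumptions.
    """
    # Length condition: the word must be longer than the preset length.
    length_ok = len(syllables) > preset_length
    # Difference condition: count adjacent syllable pairs that differ.
    differing_pairs = sum(1 for a, b in zip(syllables, syllables[1:]) if a != b)
    difference_ok = differing_pairs > preset_difference
    return length_ok and difference_ok


print(rule_based_check(["ni", "hao", "xun", "fei"]))
print(rule_based_check(["ma", "ma", "ma", "ma"]))
```

Because both thresholds are chosen by hand, two reasonable engineers could pick different values and reach different verdicts for the same word.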
Because rule setting is to some degree subjective, a wake-up word assessment process based on experience or rules yields relatively inaccurate results, which in turn degrades the wake-up performance of the wake-up word set by the user.
Summary of the invention
The general object of the present disclosure is to provide a wake-up word assessment method and device, a storage medium, and an electronic device that help improve the accuracy of wake-up word assessment results and, in turn, the wake-up performance of the wake-up word set by the user.
To achieve the above object, the present disclosure provides a wake-up word assessment method, the method including:
obtaining a word to be assessed that is input by a user;
extracting assessment features of the word to be assessed, the assessment features indicating the distinguishability of the word to be assessed at the acoustic level and/or the semantic level;
taking the assessment features of the word to be assessed as input and, after processing by a pre-built wake-up word assessment model, determining whether the word to be assessed is suitable as a wake-up word.
Optionally, the assessment features indicating the distinguishability of the word to be assessed at the acoustic level include a distribution feature of speech units, and extracting the assessment features of the word to be assessed includes: analyzing the speech units contained in the word to be assessed, and counting at least one of the total number of speech units, the number of distinct speech units, the number of occurrences of each distinct speech unit, the number of specified speech units, and the number of occurrences of each specified speech unit, as the distribution feature of the speech units;
and/or
the assessment features indicating the distinguishability of the word to be assessed at the acoustic level include a recognition probability of the word to be assessed, and extracting the assessment features of the word to be assessed includes: obtaining the recognition probability of each speech unit contained in the word to be assessed; and taking the mean of the recognition probabilities of the speech units as the recognition probability of the word to be assessed, the recognition probability including an accuracy rate and/or a false alarm rate;
and/or
the assessment features indicating the distinguishability of the word to be assessed at the acoustic level include a duration of the word to be assessed, and extracting the assessment features of the word to be assessed includes: obtaining the duration of each speech unit contained in the word to be assessed; and taking the sum of the durations of the speech units as the duration of the word to be assessed;
and/or
the assessment features indicating the distinguishability of the word to be assessed at the acoustic level include a tone feature of the word to be assessed, and extracting the assessment features of the word to be assessed includes: obtaining the tone of each character contained in the word to be assessed and calculating the tone difference between adjacent characters; and performing a mathematical operation on the tone differences between adjacent characters to obtain the tone feature of the word to be assessed;
and/or
the assessment features indicating the distinguishability of the word to be assessed at the semantic level include a language model score, and extracting the assessment features of the word to be assessed includes: taking the word to be assessed as input and, after processing by a pre-built language model, outputting a score of the word to be assessed, the score indicating how frequently the word to be assessed occurs;
and/or
the assessment features indicating the distinguishability of the word to be assessed at the semantic level include a part-of-speech feature of the word to be assessed, and extracting the assessment features of the word to be assessed includes: obtaining the part of speech of each constituent word of the word to be assessed; and counting the number of distinct parts of speech and the number of occurrences of each distinct part of speech, as the part-of-speech feature of the word to be assessed;
and/or
the assessment features indicating the distinguishability of the word to be assessed at the semantic level include a fluency feature of the word to be assessed, and extracting the assessment features of the word to be assessed includes: using the constituent words of the word to be assessed, calculating a forward semantic fluency and a reverse semantic fluency of the word to be assessed; and performing a mathematical operation on the forward semantic fluency and the reverse semantic fluency to obtain the fluency feature of the word to be assessed.
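As a rough illustration of two of the acoustic-level features listed above, the following sketch computes a speech-unit distribution feature and a tone feature. Several choices here are assumptions rather than anything the text fixes: tones are encoded as integers, the "mathematical operation" is taken to be the mean absolute difference between adjacent tones, and "the number of specified speech units" is read as how many of the specified units actually appear.

```python
from collections import Counter


def unit_distribution(units, specified=()):
    """Distribution feature of speech units (phonemes, syllables, ...)."""
    counts = Counter(units)
    present = [u for u in specified if u in counts]
    return {
        "total": len(units),                       # total number of units
        "distinct": len(counts),                   # number of distinct units
        "counts": dict(counts),                    # occurrences per unit
        "specified_distinct": len(present),        # specified units present
        "specified_counts": {u: counts[u] for u in present},
    }


def tone_feature(tones):
    """Mean absolute tone difference between adjacent characters (assumed)."""
    diffs = [abs(b - a) for a, b in zip(tones, tones[1:])]
    return sum(diffs) / len(diffs) if diffs else 0.0


features = unit_distribution(["p", "u", "t", "i", "p", "u", "t", "i"],
                             specified=("p", "t"))
print(features["total"], features["distinct"], features["specified_counts"])
print(tone_feature([3, 3, 4, 1]))
```

The resulting values would be concatenated with the other features into the input vector of the assessment model.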
Optionally, when it is determined that the word to be assessed is not suitable as a wake-up word, the method further includes:
extracting problem features of the word to be assessed;
determining, according to the problem features, the problem type of the word to be assessed, the problem type indicating the reason why the word to be assessed is not suitable as a wake-up word.
Optionally, the problem features include a language model score, and determining the problem type of the word to be assessed includes: taking the word to be assessed as input and, after processing by a pre-built language model, outputting a score of the word to be assessed, the score indicating how frequently the word to be assessed occurs; and when the score of the word to be assessed exceeds a preset score, determining that the problem type of the word to be assessed is a high-frequency word;
and/or
the problem features include a duration of the word to be assessed, and determining the problem type of the word to be assessed includes: obtaining the duration of each speech unit contained in the word to be assessed; taking the sum of the durations of the speech units as the duration of the word to be assessed; and when the duration of the word to be assessed is less than a preset duration, determining that the problem type of the word to be assessed is too short a duration;
and/or
the problem features include a schwa feature of the word to be assessed, and determining the problem type of the word to be assessed includes: counting the number of schwa phonemes contained in the word to be assessed; and when the number of schwa phonemes exceeds a preset number, determining that the problem type of the word to be assessed is too many schwas.
Optionally, when it is determined that the word to be assessed is not suitable as a wake-up word, the method further includes:
obtaining, according to a pre-built knowledge graph of semantically similar words, a replaceable word corresponding to the word to be assessed;
extracting assessment features of the replaceable word, the assessment features indicating the distinguishability of the replaceable word at the acoustic level and/or the semantic level;
taking the assessment features of the replaceable word as input and, after processing by the wake-up word assessment model, determining whether the replaceable word is suitable as a wake-up word;
if the replaceable word is suitable as a wake-up word, recommending the replaceable word to the user.
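The replacement-word recommendation steps could be sketched as below. The knowledge graph is modelled as a plain dictionary, and the feature extractor and the suitability model are passed in as callables; all of these names and the stub implementations are hypothetical stand-ins for components the disclosure only describes abstractly.

```python
def recommend_replacements(word, similar_words, extract_features, is_suitable):
    """Return the replaceable words that the assessment model accepts.

    similar_words: dict mapping a word to semantically similar words
                   (a stand-in for the pre-built knowledge graph).
    extract_features: callable producing assessment features for a word.
    is_suitable: callable wrapping the wake-up word assessment model.
    """
    candidates = similar_words.get(word, [])
    return [c for c in candidates if is_suitable(extract_features(c))]


# Illustrative stubs only: a length-based feature and a length threshold.
graph = {"hello": ["hello there", "hi"]}
picked = recommend_replacements(
    "hello",
    graph,
    extract_features=lambda w: {"length": len(w)},
    is_suitable=lambda f: f["length"] >= 5,
)
print(picked)
```

Reusing the same feature extractor and model for the replaceable words keeps the recommendation consistent with the original assessment.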
The present disclosure provides a wake-up word assessment device, the device including:
a to-be-assessed word acquisition module, configured to obtain a word to be assessed that is input by a user;
an assessment feature extraction module, configured to extract assessment features of the word to be assessed, the assessment features indicating the distinguishability of the word to be assessed at the acoustic level and/or the semantic level;
a wake-up word determination module, configured to take the assessment features of the word to be assessed as input and, after processing by a pre-built wake-up word assessment model, determine whether the word to be assessed is suitable as a wake-up word.
Optionally, the assessment features indicating the distinguishability of the word to be assessed at the acoustic level include a distribution feature of speech units, and the assessment feature extraction module is configured to analyze the speech units contained in the word to be assessed and count at least one of the total number of speech units, the number of distinct speech units, the number of occurrences of each distinct speech unit, the number of specified speech units, and the number of occurrences of each specified speech unit, as the distribution feature of the speech units;
and/or
the assessment features indicating the distinguishability of the word to be assessed at the acoustic level include a recognition probability of the word to be assessed, and the assessment feature extraction module is configured to obtain the recognition probability of each speech unit contained in the word to be assessed, and take the mean of the recognition probabilities of the speech units as the recognition probability of the word to be assessed, the recognition probability including an accuracy rate and/or a false alarm rate;
and/or
the assessment features indicating the distinguishability of the word to be assessed at the acoustic level include a duration of the word to be assessed, and the assessment feature extraction module is configured to obtain the duration of each speech unit contained in the word to be assessed, and take the sum of the durations of the speech units as the duration of the word to be assessed;
and/or
the assessment features indicating the distinguishability of the word to be assessed at the acoustic level include a tone feature of the word to be assessed, and the assessment feature extraction module is configured to obtain the tone of each character contained in the word to be assessed, calculate the tone difference between adjacent characters, and perform a mathematical operation on the tone differences between adjacent characters to obtain the tone feature of the word to be assessed;
and/or
the assessment features indicating the distinguishability of the word to be assessed at the semantic level include a language model score, and the assessment feature extraction module is configured to take the word to be assessed as input and, after processing by a pre-built language model, output a score of the word to be assessed, the score indicating how frequently the word to be assessed occurs;
and/or
the assessment features indicating the distinguishability of the word to be assessed at the semantic level include a part-of-speech feature of the word to be assessed, and the assessment feature extraction module is configured to obtain the part of speech of each constituent word of the word to be assessed, and count the number of distinct parts of speech and the number of occurrences of each distinct part of speech, as the part-of-speech feature of the word to be assessed;
and/or
the assessment features indicating the distinguishability of the word to be assessed at the semantic level include a fluency feature of the word to be assessed, and the assessment feature extraction module is configured to use the constituent words of the word to be assessed to calculate a forward semantic fluency and a reverse semantic fluency of the word to be assessed, and perform a mathematical operation on the forward semantic fluency and the reverse semantic fluency to obtain the fluency feature of the word to be assessed.
Optionally, the device further includes:
a problem feature extraction module, configured to extract problem features of the word to be assessed when it is determined that the word to be assessed is not suitable as a wake-up word;
a problem type determination module, configured to determine, according to the problem features, the problem type of the word to be assessed, the problem type indicating the reason why the word to be assessed is not suitable as a wake-up word.
Optionally, the problem features include a language model score, and the problem type determination module is configured to take the word to be assessed as input and, after processing by a pre-built language model, output a score of the word to be assessed, the score indicating how frequently the word to be assessed occurs; and when the score of the word to be assessed exceeds a preset score, determine that the problem type of the word to be assessed is a high-frequency word;
and/or
the problem features include a duration of the word to be assessed, and the problem type determination module is configured to obtain the duration of each speech unit contained in the word to be assessed; take the sum of the durations of the speech units as the duration of the word to be assessed; and when the duration of the word to be assessed is less than a preset duration, determine that the problem type of the word to be assessed is too short a duration;
and/or
the problem features include a schwa feature of the word to be assessed, and the problem type determination module is configured to count the number of schwa phonemes contained in the word to be assessed; and when the number of schwa phonemes exceeds a preset number, determine that the problem type of the word to be assessed is too many schwas.
Optionally, the device further includes:
a replaceable word acquisition module, configured to obtain, when it is determined that the word to be assessed is not suitable as a wake-up word, a replaceable word corresponding to the word to be assessed according to a pre-built knowledge graph of semantically similar words;
the assessment feature extraction module is configured to extract assessment features of the replaceable word, the assessment features indicating the distinguishability of the replaceable word at the acoustic level and/or the semantic level;
the wake-up word determination module is configured to take the assessment features of the replaceable word as input and, after processing by the wake-up word assessment model, determine whether the replaceable word is suitable as a wake-up word;
a replaceable word recommendation module, configured to recommend the replaceable word to the user when the replaceable word is suitable as a wake-up word.
The present disclosure provides a storage medium storing a plurality of instructions, the instructions being loaded by a processor to execute the steps of the above wake-up word assessment method.
The present disclosure provides an electronic device, the electronic device including:
the above storage medium; and
a processor, configured to execute the instructions in the storage medium.
In the scheme of the present disclosure, wake-up word assessment is performed on the basis of the assessment features of the word to be assessed. Specifically, the assessment features objectively reflect the distinguishability of the word to be assessed at the acoustic level and/or the semantic level. Compared with the prior art, in which wake-up words are assessed by subjectively set rules, the scheme of the present disclosure helps improve the accuracy of the assessment result and, in turn, the wake-up performance of the wake-up word set by the user.
Other features and advantages of the present disclosure are described in detail in the detailed description that follows.
Brief description of the drawings
The accompanying drawings are provided for further understanding of the present disclosure and constitute a part of the specification. Together with the following detailed description, they serve to explain the present disclosure but do not limit it. In the drawings:
Fig. 1 is a flow diagram of the wake-up word assessment method of the present disclosure;
Fig. 2 is a schematic diagram of the composition of the wake-up word assessment device of the present disclosure;
Fig. 3 is a structural diagram of an electronic device for wake-up word assessment according to the present disclosure.
Detailed description
Specific embodiments of the present disclosure are described in detail below with reference to the accompanying drawings. It should be understood that the specific embodiments described here serve only to describe and explain the present disclosure and do not limit it.
Referring to Fig. 1, a flow diagram of the wake-up word assessment method of the present disclosure is shown. The method may include the following steps:
S101: obtain a word to be assessed that is input by a user.
In the scheme of the present disclosure, the user may, according to their own needs, set a word to be assessed that is intended for use as a wake-up word. The present disclosure does not specifically limit the composition of the word to be assessed: it may use a single language or mix several languages. For example, the word to be assessed may be "ni hao Xunfei" (你好讯飞), "hello iflytek", or "hello Xunfei" (hello 讯飞), as set by the user according to their needs.
As an example, the user may input the word to be assessed by voice, in which case it can be obtained through a microphone; alternatively, the user may input the word to be assessed as text, in which case it can be obtained through an input device such as a keyboard. The present disclosure does not limit the specific way in which the word to be assessed is obtained.
In practice, the assessment process of the present disclosure may be carried out by a smart device with a voice wake-up function, which then sets the word to be assessed as its own wake-up word according to the assessment result. Alternatively, the assessment process may be carried out by a separate dedicated device, which then configures the word to be assessed on the corresponding smart device according to the assessment result, for waking up that smart device. The present disclosure does not specifically limit the entity that executes the assessment process.
S102: extract assessment features of the word to be assessed, the assessment features indicating the distinguishability of the word to be assessed at the acoustic level and/or the semantic level.
After the word to be assessed input by the user is obtained, assessment features indicating its distinguishability at the acoustic level and/or the semantic level can be extracted for processing by the wake-up word assessment model.
As an example, the assessment features indicating the distinguishability of the word to be assessed at the acoustic level may include at least one of the following: the distribution feature of speech units, the recognition probability of the word to be assessed, the duration of the word to be assessed, and the tone feature of the word to be assessed.
As an example, the assessment features indicating the distinguishability of the word to be assessed at the semantic level may include at least one of the following: the language model score, the part-of-speech feature of the word to be assessed, and the fluency feature of the word to be assessed.
The meaning of each feature and the specific extraction process are introduced below and are not detailed here.
S103: take the assessment features of the word to be assessed as input and, after processing by a pre-built wake-up word assessment model, determine whether the word to be assessed is suitable as a wake-up word.
After the assessment features are extracted from the word to be assessed, the pre-built wake-up word assessment model processes them to determine whether the word to be assessed is suitable as a wake-up word.
As an example, the output of the wake-up word assessment model may comprise two output nodes, which respectively represent that the word to be assessed is suitable as a wake-up word and that it is not suitable as a wake-up word. Alternatively, the output of the wake-up word assessment model may comprise one output node that gives an assessment score for the word to be assessed: if the assessment score is less than a preset value, the word to be assessed is judged not suitable as a wake-up word; otherwise it is judged suitable. The present disclosure does not specifically limit the output form of the wake-up word assessment model.
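The two output forms can be interpreted as in the following sketch. The function names and the default preset value of 0.5 are assumptions; for the two-node form, picking the node with the larger value is also an assumed convention, since the text does not say how the two nodes are compared.

```python
def decide_two_node(outputs):
    """outputs = (suitable_score, unsuitable_score); larger node wins (assumed)."""
    suitable_score, unsuitable_score = outputs
    return suitable_score >= unsuitable_score


def decide_one_node(assessment_score, preset_value=0.5):
    """Not suitable if the score is below the preset value, otherwise suitable."""
    return not (assessment_score < preset_value)


print(decide_two_node((0.7, 0.3)))
print(decide_one_node(0.49))
```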
In summary, after obtaining the word to be assessed, the scheme of the present disclosure performs wake-up word assessment according to the distinguishability of the word to be assessed at the acoustic level and/or the semantic level. In general, the more distinguishable the word to be assessed, the better its wake-up performance when used as a wake-up word. Compared with the prior art, in which wake-up words are assessed by subjectively set rules, the scheme of the present disclosure is more objective, which helps improve the accuracy of the assessment result and, in turn, the wake-up performance of the wake-up word set by the user.
As an example, if the wake-up word assessment model determines that the word to be assessed currently input by the user is not suitable as a wake-up word, the present disclosure further provides the following preferred schemes to improve the success rate with which the user sets a wake-up word.
Preferred scheme one: problem features of the word to be assessed can be extracted, and the problem type of the word to be assessed can be determined from them, that is, the reason why the word to be assessed is not suitable as a wake-up word is analyzed.
As an example, the problem feature of the word to be assessed may be the language model score. In this case, the word to be assessed is taken as input and, after processing by a pre-built language model, its score is output. The score indicates how frequently the word to be assessed occurs; in general, a higher score means a higher frequency. The score of the word to be assessed is then compared with a preset score. When the score exceeds the preset score, the word to be assessed occurs frequently and is likely to appear in everyday conversation, which increases the likelihood that the smart device is woken up by mistake. The problem type of the word to be assessed can therefore be determined to be a high-frequency word; that is, the word to be assessed is not suitable as a wake-up word because it is a high-frequency word.
For example, the language model may calculate the score of the word to be assessed as follows:
Word segmentation is performed on the word to be assessed to obtain a word sequence {w_1, w_2, ..., w_k, ..., w_f}, where w_k denotes the k-th word of the word to be assessed. The probability P(w_1, w_2, ..., w_f) that the f words occur in the order of the word sequence is then calculated and taken as the frequency with which the word to be assessed occurs, that is, the score of the word to be assessed.
In the scheme of the present disclosure, the score of the word to be assessed is preferably represented by the probability P(w_1, w_2, ..., w_f) in the direction from w_1 to w_f, which may be expressed by the following formula:
P(w_1, w_2, \ldots, w_f) = P(w_1) \prod_{k=2}^{f} P(w_k \mid w_{k-1})
where P(w_k | w_{k-1}) can be obtained from statistics over a general corpus.
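A bigram score of this kind, in which P(w_k | w_{k-1}) is counted from a corpus, might be sketched as follows. This is a minimal maximum-likelihood version under stated assumptions: the toy corpus is invented, the first word is scored by its unigram relative frequency, and no smoothing is applied, so any unseen word pair drives the score to zero.

```python
from collections import Counter


def train_bigram(corpus):
    """Count unigrams and bigrams over a list of tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    total = 0
    for sentence in corpus:
        unigrams.update(sentence)
        total += len(sentence)
        bigrams.update(zip(sentence, sentence[1:]))
    return unigrams, bigrams, total


def sequence_probability(words, unigrams, bigrams, total):
    """P(w_1) * prod_k P(w_k | w_{k-1}) with unsmoothed count ratios."""
    if not words or total == 0:
        return 0.0
    prob = unigrams[words[0]] / total
    for prev, cur in zip(words, words[1:]):
        if unigrams[prev] == 0:
            return 0.0
        prob *= bigrams[(prev, cur)] / unigrams[prev]
    return prob


# Toy corpus, for illustration only.
corpus = [["hello", "world"], ["hello", "there"], ["good", "world"]]
uni, bi, n = train_bigram(corpus)
print(sequence_probability(["hello", "world"], uni, bi, n))
```

A practical system would typically work with log-probabilities and a smoothed model, but the comparison against a preset score works the same way.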
As an example, feature can be presented as the duration of word to be assessed the problem of word to be assessed.Corresponding to this,
The duration for the voice unit that word to be assessed includes can be obtained;By the sum of the duration of each voice unit, as word to be assessed
Duration;It is then possible to the size that the duration of word more to be assessed, preset duration are between the two, when word to be assessed
When length is less than preset duration, illustrates that word to be assessed is very short, may be difficult to be captured by smart machine in actual application
It arrives, when extracting wherein useful information for carrying out smart machine wake-up, therefore can determine that problem types existing for word to be assessed are
The reason of length is too short, i.e., word to be assessed is not suitable as wake-up word is that the duration of the word to be assessed is too short.
For example, the duration of the word to be assessed may be calculated as follows:
First, the duration of each voice unit may be obtained statistically. Specifically, for each voice unit, the pronunciation durations of that voice unit by multiple speakers may be collected in advance, and the mean of the pronunciation durations of the multiple speakers is determined as the duration of the voice unit. Then, the voice units included in the word to be assessed may be analyzed, and the sum of the durations of these voice units is determined as the duration of the word to be assessed. For example, a voice unit may be embodied as a phoneme, a syllable, or the like, which is not specifically limited in the solution of the present disclosure.
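The duration calculation above can be sketched in Python as follows. The syllables, the per-speaker durations, and the preset duration of 0.6 seconds are hypothetical values chosen only for illustration; the disclosure does not specify them.

```python
from statistics import mean


def unit_durations(speaker_samples):
    """Map each voice unit to the mean of its per-speaker durations (seconds)."""
    return {unit: mean(values) for unit, values in speaker_samples.items()}


def word_duration(units, durations):
    """Duration of a word = sum of the durations of its voice units."""
    return sum(durations[unit] for unit in units)


def is_too_short(units, durations, preset_duration):
    """True when the word duration is less than the preset duration."""
    return word_duration(units, durations) < preset_duration


# Hypothetical measurements: two speakers per syllable.
samples = {"kai": [0.30, 0.34], "ji": [0.20, 0.24]}
durations = unit_durations(samples)
too_short = is_too_short(["kai", "ji"], durations, preset_duration=0.6)
```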
As an example, the problem feature of the word to be assessed may be embodied as the schwa feature of the word to be assessed. Correspondingly, the number of schwa phonemes included in the word to be assessed may be counted and compared with a preset number. When the number of schwa phonemes exceeds the preset number, the word to be assessed includes many schwa phonemes of poor distinctiveness, which may lower the success rate of waking up the smart device. It can therefore be determined that the problem type of the word to be assessed is that there are too many schwas, that is, the reason the word to be assessed is not suitable as a wake-up word is that it includes too many schwas. For example, the word to be assessed is "bodhi bodhi" (pu ti pu ti), in which the character "pu" includes the schwa p and the character "ti" includes the schwa t.
It should be understood that the preset number in the solution of the present disclosure may be a fixed value set in advance; alternatively, it may be a variable value calculated from the total number of phonemes included in the word to be assessed and a preset fixed ratio. This is not specifically limited in the solution of the present disclosure.
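The two ways of setting the preset number can be sketched in Python as follows. The schwa set {p, t} follows the "bodhi bodhi" example in the text; the phoneme split, the fixed value 3, and the ratio 0.5 are assumptions made only for illustration.

```python
def preset_number(total_phonemes, fixed=None, ratio=None):
    """Return the threshold: a fixed value, or total phonemes times a fixed ratio."""
    if fixed is not None:
        return fixed
    if ratio is not None:
        return total_phonemes * ratio
    raise ValueError("either fixed or ratio must be given")


def has_too_many_schwas(phonemes, schwa_set, fixed=None, ratio=None):
    """True when the number of schwa phonemes exceeds the preset number."""
    count = sum(1 for p in phonemes if p in schwa_set)
    return count > preset_number(len(phonemes), fixed=fixed, ratio=ratio)


# "pu ti pu ti" split into phonemes (hypothetical segmentation).
phonemes = ["p", "u", "t", "i", "p", "u", "t", "i"]
by_fixed = has_too_many_schwas(phonemes, {"p", "t"}, fixed=3)
by_ratio = has_too_many_schwas(phonemes, {"p", "t"}, ratio=0.5)
```

With these illustrative settings the two thresholds disagree (4 schwas exceed the fixed value 3 but do not exceed 8 * 0.5 = 4), which shows why the choice of threshold type matters in practice.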
In actual use, the word to be assessed may be unsuitable as a wake-up word for a single reason, or for several reasons at once. This is not specifically limited in the solution of the present disclosure.
Preferred embodiment two: based on the word to be assessed input by the user, a wake-up word may be recommended to the user on the premise that the semantics are kept the same or similar as far as possible.
Specifically, a replaceable word corresponding to the word to be assessed may be obtained from a pre-built knowledge graph of semantically similar words. Whether the replaceable word is suitable as a wake-up word may then be judged with reference to the scheme shown in Fig. 1, which may be embodied as: extracting the assessment features of the replaceable word, the assessment features indicating the distinctiveness of the replaceable word at the acoustic level and/or the semantic level; taking the assessment features of the replaceable word as input and, after processing by the wake-up word assessment model, determining whether the replaceable word is suitable as a wake-up word; and, if the replaceable word is suitable as a wake-up word, recommending the replaceable word to the user.
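The recommendation flow above can be sketched in Python as follows. The knowledge graph is modelled here as a plain dictionary, and `is_suitable` stands in for feature extraction plus the wake-up word assessment model; both are hypothetical simplifications, not the structures used in the disclosure.

```python
def recommend_replacements(word, similar_words, is_suitable):
    """Return the replaceable words of `word` that are judged suitable.

    similar_words: dict mapping a word to its semantically similar candidates
                   (a stand-in for the pre-built knowledge graph).
    is_suitable:   callable returning True when a candidate is suitable as a
                   wake-up word (a stand-in for the assessment model).
    """
    candidates = similar_words.get(word, [])
    return [candidate for candidate in candidates if is_suitable(candidate)]


# Hypothetical data and a toy suitability rule (at least three words).
graph = {"power on": ["please power on", "on"]}
suggestions = recommend_replacements(
    "power on", graph, lambda w: len(w.split()) >= 3
)
```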
As an example, the replaceable word may also be determined for the word to be assessed in combination with its problem type. For example, if the problem type of the word to be assessed "robot" is high-frequency vocabulary, it may be recommended to change it to "Little Man robot" as the replaceable word, so as to lower the score of the language model; if the problem type of the word to be assessed "power on" is that the duration is too short, it may be recommended to change it to "please power on" as the replaceable word, so as to increase the pronunciation duration; if the problem type of the word to be assessed "bodhi bodhi" is that there are too many schwas, it may be recommended to change it to "hello bodhi" as the replaceable word, so as to reduce the number of schwas.
In summary, the user can learn why the word to be assessed is not suitable as a wake-up word and then modify it in a targeted manner. In addition, to improve the success rate of the user's modification, wake-up words may also be recommended to the user for selection and confirmation. This improves the success rate with which the user sets a wake-up word and also helps to improve the user experience.
The assessment features in the solution of the present disclosure are explained below.
1. Assessment features indicating the distinctiveness of the word to be assessed at the acoustic level
(1) Distribution features of voice units
As an example, the voice units included in the word to be assessed may be analyzed to obtain the distribution features of the voice units. For example, the distribution features of the voice units may be embodied as at least one of the following: the total number of voice units, the number of distinct voice units, the number of occurrences of each distinct voice unit, the number of specified voice units, and the number of occurrences of each specified voice unit. A voice unit may be embodied as a phoneme, a syllable, or the like, which is not specifically limited in the solution of the present disclosure.
In general, if the word to be assessed includes too few voice units, for example only one or two, everyday conversation may contain many pronunciations similar to the word to be assessed, so the pronunciation distinctiveness of the word to be assessed is low and the smart device is more likely to be triggered by mistake. In addition, if the word to be assessed includes many voice units but all of them are the same, for example "uh uh uh", the pronunciation distinctiveness of such a monotonous word is likewise very low and false triggering easily occurs. In view of this, the solution of the present disclosure may extract the total number of voice units, the number of distinct voice units, and the number of occurrences of each distinct voice unit as assessment features of the word to be assessed.
Taking the syllable as the voice unit, the word to be assessed "ding-dong ding-dong" can be divided into 4 voice units: "ding", "dong", "ding", "dong". In this example, the total number of voice units is 4; the number of distinct voice units is 2, namely "ding" and "dong"; the voice unit "ding" occurs 2 times and the voice unit "dong" occurs 2 times.
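The counts in the "ding-dong ding-dong" example can be reproduced with a short Python sketch. The function name and the dictionary keys are hypothetical and chosen only for illustration.

```python
from collections import Counter


def distribution_features(units):
    """Count the total, the distinct units, and the occurrences of each unit."""
    counts = Counter(units)
    return {
        "total": len(units),           # total number of voice units
        "distinct": len(counts),       # number of distinct voice units
        "occurrences": dict(counts),   # occurrences of each distinct unit
    }


features = distribution_features(["ding", "dong", "ding", "dong"])
```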
Taking the phoneme as the voice unit, and following the bag-of-words idea, considering that Chinese or English has a total of 80 phonemes, the distribution feature of the voice units may be set as an 80-dimensional vector, in which each dimension represents one phoneme and the value of each dimension indicates the number of times that phoneme occurs in the word to be assessed.
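The bag-of-words style vector described above can be sketched in Python as follows. For brevity the sketch uses a hypothetical 5-phoneme inventory instead of the 80 phonemes mentioned in the text; the phoneme split of "ding dong" is likewise only an assumption.

```python
def phoneme_vector(phonemes, inventory):
    """Return a count vector with one dimension per phoneme in the inventory."""
    index = {phoneme: i for i, phoneme in enumerate(inventory)}
    vector = [0] * len(inventory)
    for phoneme in phonemes:
        # A phoneme outside the inventory raises KeyError in this sketch.
        vector[index[phoneme]] += 1
    return vector


# Hypothetical reduced inventory; a real one would list all 80 phonemes.
inventory = ["d", "i", "ng", "o", "a"]
vector = phoneme_vector(["d", "i", "ng", "d", "o", "ng"], inventory)
```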
In addition, it should be noted that, to improve the acoustic distinctiveness of the word to be assessed, the solution of the present disclosure may also determine some specified voice units in advance. The more specified voice units the word to be assessed includes, the better its acoustic distinctiveness and the more suitable it is as a wake-up word. In view of this, the solution of the present disclosure may also extract the number of specified voice units and the number of occurrences of each specified voice unit as assessment features of the word to be assessed.
For example, voice units that have a large mouth opening, high loudness, and clear pronunciation, and that are easy to capture, may be determined as specified voice units, such as the Chinese compound finals ua, iao, ian, iong, and the English vowels ai, ao. These may be set according to actual application requirements, which is not limited in the solution of the present disclosure.
(2) Recognition probability of the word to be assessed
In the solution of the present disclosure, the recognition probability of the word to be assessed may be embodied as the accuracy rate of the word to be assessed and/or the false alarm rate of the word to be assessed. In general, the higher the accuracy rate and the lower the false alarm rate of the word to be assessed, the better its acoustic distinctiveness and the more suitable it is as a wake-up word.
As an example, the recognition probability of the word to be assessed may be obtained by offline testing. Taking the accuracy rate of the word to be assessed as an example, N positive samples of the word to be assessed may be collected in each of several different environments, the number M of correctly recognized samples is counted, and the accuracy rate in each environment is calculated as M/N; the mean of the accuracy rates over the environments is then determined as the accuracy rate of the word to be assessed. Taking the false alarm rate of the word to be assessed as an example, the number of false wake-ups caused by the word to be assessed acting as the wake-up word may be monitored over a predetermined period in each of several different environments, for example, the false alarm rate in a certain environment is 2 false wake-ups within 24 hours; the mean of the false alarm rates over the different environments is then determined as the false alarm rate of the word to be assessed.
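The offline statistics above can be sketched in Python as follows. The per-environment counts are hypothetical, and expressing the false alarm rate as false wake-ups per hour is an assumption of this sketch; the text only gives "2 false wake-ups within 24 hours" as an example.

```python
from statistics import mean


def accuracy_rate(per_environment):
    """Mean of M/N over environments; each item is (M correct, N samples)."""
    return mean([m / n for m, n in per_environment])


def false_alarm_rate(per_environment):
    """Mean false wake-ups per hour; each item is (false wake-ups, hours)."""
    return mean([count / hours for count, hours in per_environment])


# Hypothetical test results from two environments.
accuracy = accuracy_rate([(90, 100), (80, 100)])
false_alarms = false_alarm_rate([(2, 24), (4, 24)])
```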
As an example, the recognition probability of the word to be assessed may be obtained based on the voice units it includes. Specifically, the recognition probability of each voice unit included in the word to be assessed may be obtained, and the mean of the recognition probabilities of the voice units is taken as the recognition probability of the word to be assessed. The accuracy rate and the false alarm rate of a voice unit may be obtained by offline statistics as described above, and details are not repeated here.
(3) Duration of the word to be assessed
In general, the longer the duration of the word to be assessed, the better its acoustic distinctiveness and the more suitable it is as a wake-up word. For the process of obtaining the duration of the word to be assessed, reference may be made to the description given above in the analysis of problem types, and details are not repeated here.
(4) Tone feature of the word to be assessed
As an example, the tones of the individual characters included in the word to be assessed may be obtained, and the tone variance between adjacent characters is calculated; for example, if the tones of two adjacent characters are the same, the tone variance is 0, and otherwise the tone variance is 1. A mathematical operation is then performed on the tone variances between adjacent characters to calculate the tone feature of the word to be assessed; for example, the sum of the tone variances, or the mean of the tone variances, may be determined as the tone feature of the word to be assessed. The specific mathematical operation is not limited in the solution of the present disclosure.
For example, a pre-built tone classifier may be used to obtain the tone sequence {b1, b2, ..., bj, ..., bn} of the word to be assessed, where bj denotes the tone type of the j-th character of the word to be assessed. Taking Chinese as an example, the tone types of a character may be embodied as the 4 common tones, and the identifiers "1", "2", "3", "4" may be used to indicate the different tones; alternatively, the tone types of a character may be determined for other languages. This is not specifically limited in the solution of the present disclosure.
In general, a word to be assessed whose tones rise and fall is more distinctive in pronunciation; that is, the larger the tone feature value of the word to be assessed, the better its acoustic distinctiveness and the more suitable it is as a wake-up word.
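The tone feature described above can be sketched in Python as follows, using the 0/1 variance between adjacent characters and either the sum or the mean as the mathematical operation. The tone sequences are hypothetical inputs; obtaining them from a tone classifier is outside the scope of this sketch.

```python
def tone_variances(tones):
    """0 when two adjacent characters share a tone, otherwise 1."""
    return [0 if a == b else 1 for a, b in zip(tones, tones[1:])]


def tone_feature(tones, mode="sum"):
    """Combine the adjacent tone variances by sum or by mean."""
    variances = tone_variances(tones)
    if mode == "sum":
        return sum(variances)
    if mode == "mean":
        return sum(variances) / len(variances) if variances else 0.0
    raise ValueError("mode must be 'sum' or 'mean'")


# Hypothetical tone sequences using identifiers 1 to 4.
flat = tone_feature([2, 2, 2, 2])       # no tone changes
varied = tone_feature([1, 4, 1, 4])     # tone changes at every step
```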
2. Assessment features indicating the distinctiveness of the word to be assessed at the semantic level
(1) Score of the language model
In general, the higher the score of the language model, the higher the probability of false triggering and the less suitable the word is as a wake-up word. For the process of obtaining the score of the word to be assessed, reference may be made to the description given above in the analysis of problem types, and details are not repeated here.
(2) Part-of-speech feature of the word to be assessed
As an example, the parts of speech of the words included in the word to be assessed may be obtained, and the number of distinct parts of speech and the number of occurrences of each distinct part of speech are counted as the part-of-speech feature of the word to be assessed. In general, the richer the parts of speech included in the word to be assessed, the better its semantic distinctiveness and the more suitable it is as a wake-up word.
For example, word segmentation may be performed on the word to be assessed to obtain a part-of-speech sequence {q1, q2, ..., qk, ..., qf}, where qk denotes the part of speech of the k-th word of the word to be assessed. As an example, for the following 11 parts of speech: noun, verb, adjective, numeral-classifier, pronoun, adverb, preposition, conjunction, auxiliary word, interjection, and onomatopoeia, the part-of-speech feature of the word to be assessed may be set as an 11-dimensional vector, in which each dimension represents one part of speech and the value of each dimension indicates the number of times that part of speech occurs in the word to be assessed.
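The 11-dimensional part-of-speech vector can be sketched in Python as follows. The English tag names and the tagged example are hypothetical; the sketch assumes the part-of-speech sequence has already been produced by some segmentation and tagging step that is not shown.

```python
# Hypothetical tag names for the 11 parts of speech listed in the text.
POS_TAGS = [
    "noun", "verb", "adjective", "numeral-classifier", "pronoun", "adverb",
    "preposition", "conjunction", "auxiliary", "interjection", "onomatopoeia",
]


def pos_feature(pos_sequence):
    """Return an 11-dimensional vector of part-of-speech occurrence counts."""
    vector = [0] * len(POS_TAGS)
    for tag in pos_sequence:
        vector[POS_TAGS.index(tag)] += 1  # ValueError for an unknown tag
    return vector


def distinct_pos_count(pos_sequence):
    """Number of distinct parts of speech that occur in the sequence."""
    return sum(1 for value in pos_feature(pos_sequence) if value > 0)


vector = pos_feature(["adjective", "noun", "noun"])
```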
(3) Smoothness feature of the word to be assessed
As an example, the words included in the word to be assessed may be used to calculate the forward semantic smoothness and the reverse semantic smoothness of the word to be assessed; a mathematical operation is then performed on the forward semantic smoothness and the reverse semantic smoothness to obtain the smoothness feature of the word to be assessed.
For the calculation of semantic smoothness, reference may be made to the description given above in the analysis of problem types, and details are not repeated here. The forward semantic smoothness may be embodied as the probability P(w1, w2, ..., wf) of the word to be assessed in the direction from w1 to wf, and the reverse semantic smoothness may be embodied as the probability P(wf, wf-1, ..., w1) of the word to be assessed in the direction from wf to w1.
For example, the mathematical operation on the forward semantic smoothness and the reverse semantic smoothness may be embodied as the absolute value of the difference between the forward semantic smoothness and the reverse semantic smoothness. In general, the larger the smoothness feature value obtained in this way, the more reasonable the forward direction is, the easier the word to be assessed is to say, and the more suitable it is as a wake-up word.
For example, the mathematical operation on the forward semantic smoothness and the reverse semantic smoothness may be embodied as the quotient of the forward semantic smoothness and the reverse semantic smoothness. In general, the larger the smoothness feature value obtained in this way, the more reasonable the forward direction is, the easier the word to be assessed is to say, and the more suitable it is as a wake-up word.
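The two variants of the smoothness feature can be sketched in Python as follows. The probability tables are hypothetical, the sequence probability assumes a bigram factorization, and returning infinity when the reverse smoothness is zero is a design choice of this sketch rather than something stated in the disclosure.

```python
def sequence_probability(words, unigram_p, bigram_p):
    """P(first word) times the product of P(current | previous)."""
    probability = unigram_p.get(words[0], 0.0)
    for prev, cur in zip(words, words[1:]):
        probability *= bigram_p.get((prev, cur), 0.0)
    return probability


def smoothness_feature(p_forward, p_reverse, mode="abs_diff"):
    """Combine forward and reverse smoothness by absolute difference or quotient."""
    if mode == "abs_diff":
        return abs(p_forward - p_reverse)
    if mode == "quotient":
        return float("inf") if p_reverse == 0 else p_forward / p_reverse
    raise ValueError("mode must be 'abs_diff' or 'quotient'")


# Hypothetical probabilities, for illustration only.
unigram_p = {"hello": 0.1, "robot": 0.05}
bigram_p = {("hello", "robot"): 0.2, ("robot", "hello"): 0.1}
words = ["hello", "robot"]
forward = sequence_probability(words, unigram_p, bigram_p)
reverse = sequence_probability(words[::-1], unigram_p, bigram_p)
feature = smoothness_feature(forward, reverse, mode="quotient")
```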
As an example, a large number of sample wake-up words may be collected, and the wake-up word assessment model in the solution of the present disclosure is obtained by training on them. The sample wake-up words may be embodied as positive sample wake-up words and negative sample wake-up words; in addition, the positive sample wake-up words may be labeled in advance as suitable as wake-up words, and the negative sample wake-up words may be labeled in advance as not suitable as wake-up words.
When training the model, the topology of the wake-up word assessment model may first be determined; for example, it may be embodied as a CNN (Convolutional Neural Network), an RNN (Recurrent Neural Network), a DNN (Deep Neural Network), or the like, which is not specifically limited in the solution of the present disclosure. After the assessment features are extracted from the sample wake-up words, the wake-up word assessment model may be trained with the selected topology and the assessment features of the sample wake-up words, until the assessment result output by the wake-up word assessment model is consistent with the assessment result labeled for the sample wake-up words.
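The stopping criterion described above (train until the model output agrees with the labels) can be illustrated with a very small Python sketch. Note that this sketch swaps in a single-layer perceptron for the CNN, RNN, or DNN topologies named in the text, purely to keep the example self-contained; the two features per sample (duration and tone feature) and all numeric values are hypothetical.

```python
def predict(weights, bias, features):
    """Return 1 (suitable) or 0 (not suitable) for one feature vector."""
    activation = sum(w * x for w, x in zip(weights, features)) + bias
    return 1 if activation > 0 else 0


def train_perceptron(samples, labels, max_epochs=1000, learning_rate=0.1):
    """Train until every sample is classified as labeled, or max_epochs is hit."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(max_epochs):
        errors = 0
        for features, label in zip(samples, labels):
            prediction = predict(weights, bias, features)
            if prediction != label:
                errors += 1
                step = learning_rate * (label - prediction)
                weights = [w + step * x for w, x in zip(weights, features)]
                bias += step
        if errors == 0:  # model output is consistent with all labels
            break
    return weights, bias


# Hypothetical samples: [duration in seconds, tone feature]; 1 = suitable.
samples = [[1.2, 3], [1.0, 2], [0.3, 0], [0.4, 1]]
labels = [1, 1, 0, 0]
weights, bias = train_perceptron(samples, labels)
```

A perceptron only reaches full agreement when the samples are linearly separable, which is why the loop also has an epoch limit; a neural network trained in practice would usually stop on a loss or validation criterion instead.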
Referring to Fig. 2, a schematic diagram of the composition of the wake-up word assessment device of the present disclosure is shown. The device may include:
a to-be-assessed word acquisition module 201, configured to obtain the word to be assessed input by the user;
an assessment feature extraction module 202, configured to extract the assessment features of the word to be assessed, the assessment features indicating the distinctiveness of the word to be assessed at the acoustic level and/or the semantic level;
a wake-up word determination module 203, configured to take the assessment features of the word to be assessed as input and, after processing by the pre-built wake-up word assessment model, determine whether the word to be assessed is suitable as a wake-up word.
Optionally, the assessment features indicating the distinctiveness of the word to be assessed at the acoustic level include the distribution features of voice units, and the assessment feature extraction module is configured to analyze the voice units included in the word to be assessed and count at least one of the following as the distribution features of the voice units: the total number of voice units, the number of distinct voice units, the number of occurrences of each distinct voice unit, the number of specified voice units, and the number of occurrences of each specified voice unit;
and/or
the assessment features indicating the distinctiveness of the word to be assessed at the acoustic level include the recognition probability of the word to be assessed, and the assessment feature extraction module is configured to obtain the recognition probability of each voice unit included in the word to be assessed, and take the mean of the recognition probabilities of the voice units as the recognition probability of the word to be assessed, the recognition probability including an accuracy rate and/or a false alarm rate;
and/or
the assessment features indicating the distinctiveness of the word to be assessed at the acoustic level include the duration of the word to be assessed, and the assessment feature extraction module is configured to obtain the duration of each voice unit included in the word to be assessed, and take the sum of the durations of the voice units as the duration of the word to be assessed;
and/or
the assessment features indicating the distinctiveness of the word to be assessed at the acoustic level include the tone feature of the word to be assessed, and the assessment feature extraction module is configured to obtain the tones of the individual characters included in the word to be assessed, calculate the tone variance between adjacent characters, and perform a mathematical operation on the tone variances between the adjacent characters to obtain the tone feature of the word to be assessed;
and/or
the assessment features indicating the distinctiveness of the word to be assessed at the semantic level include the score of the language model, and the assessment feature extraction module is configured to take the word to be assessed as input and, after processing by a pre-built language model, output the score of the word to be assessed, the score indicating the frequency with which the word to be assessed occurs;
and/or
the assessment features indicating the distinctiveness of the word to be assessed at the semantic level include the part-of-speech feature of the word to be assessed, and the assessment feature extraction module is configured to obtain the parts of speech of the words included in the word to be assessed, and count the number of distinct parts of speech and the number of occurrences of each distinct part of speech as the part-of-speech feature of the word to be assessed;
and/or
the assessment features indicating the distinctiveness of the word to be assessed at the semantic level include the smoothness feature of the word to be assessed, and the assessment feature extraction module is configured to use the words included in the word to be assessed to calculate the forward semantic smoothness and the reverse semantic smoothness of the word to be assessed, and perform a mathematical operation on the forward semantic smoothness and the reverse semantic smoothness to obtain the smoothness feature of the word to be assessed.
Optionally, the device further includes:
a problem feature extraction module, configured to extract the problem feature of the word to be assessed when it is determined that the word to be assessed is not suitable as a wake-up word;
a problem type determination module, configured to determine, according to the problem feature, the problem type of the word to be assessed, the problem type indicating the reason the word to be assessed is not suitable as a wake-up word.
Optionally, the problem feature includes the score of the language model, and the problem type determination module is configured to take the word to be assessed as input and, after processing by a pre-built language model, output the score of the word to be assessed, the score indicating the frequency with which the word to be assessed occurs; and, when the score of the word to be assessed exceeds a preset score, determine that the problem type of the word to be assessed is high-frequency vocabulary;
and/or
the problem feature includes the duration of the word to be assessed, and the problem type determination module is configured to obtain the duration of each voice unit included in the word to be assessed; take the sum of the durations of the voice units as the duration of the word to be assessed; and, when the duration of the word to be assessed is less than a preset duration, determine that the problem type of the word to be assessed is that the duration is too short;
and/or
the problem feature includes the schwa feature of the word to be assessed, and the problem type determination module is configured to count the number of schwa phonemes included in the word to be assessed; and, when the number of schwa phonemes exceeds a preset number, determine that the problem type of the word to be assessed is that there are too many schwas.
Optionally, the device further includes:
a replaceable word acquisition module, configured to obtain, when it is determined that the word to be assessed is not suitable as a wake-up word, a replaceable word corresponding to the word to be assessed from a pre-built knowledge graph of semantically similar words;
the assessment feature extraction module, configured to extract the assessment features of the replaceable word, the assessment features indicating the distinctiveness of the replaceable word at the acoustic level and/or the semantic level;
the wake-up word determination module, configured to take the assessment features of the replaceable word as input and, after processing by the wake-up word assessment model, determine whether the replaceable word is suitable as a wake-up word;
a replaceable word recommendation module, configured to recommend the replaceable word to the user when the replaceable word is suitable as a wake-up word.
With regard to the device in the above embodiments, the specific manner in which each module performs its operations has been described in detail in the embodiments of the method, and will not be elaborated here.
Referring to Fig. 3, a schematic structural diagram of an electronic device 300 of the present disclosure for wake-up word assessment is shown. As shown in Fig. 3, the electronic device 300 includes a processing component 301, which further includes one or more processors, and storage resources represented by a storage medium 302 for storing instructions executable by the processing component 301, such as application programs. The application programs stored in the storage medium 302 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 301 is configured to execute the instructions so as to perform the above wake-up word assessment method.
The electronic device 300 may also include a power supply component 303 configured to perform power management of the electronic device 300, a wired or wireless network interface 304 configured to connect the electronic device 300 to a network, and an input/output (I/O) interface 305. The electronic device 300 may operate based on an operating system stored in the storage medium 302, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
The preferred embodiments of the present disclosure have been described in detail above with reference to the accompanying drawings. However, the present disclosure is not limited to the specific details of the above embodiments. Within the scope of the technical concept of the present disclosure, various simple variations may be made to the technical solution of the present disclosure, and these simple variations all fall within the protection scope of the present disclosure.
It should further be noted that the specific technical features described in the above specific embodiments may be combined in any suitable manner provided there is no contradiction. To avoid unnecessary repetition, the various possible combinations are not separately described in the present disclosure.
In addition, the various embodiments of the present disclosure may also be combined arbitrarily, and as long as such a combination does not depart from the idea of the present disclosure, it should likewise be regarded as content disclosed by the present disclosure.
Claims (12)
1. A wake-up word assessment method, characterized in that the method comprises:
obtaining a word to be assessed input by a user;
extracting assessment features of the word to be assessed, the assessment features indicating the distinctiveness of the word to be assessed at the acoustic level and/or the semantic level;
taking the assessment features of the word to be assessed as input and, after processing by a pre-built wake-up word assessment model, determining whether the word to be assessed is suitable as a wake-up word.
2. The method according to claim 1, characterized in that:
the assessment features indicating the distinctiveness of the word to be assessed at the acoustic level include the distribution features of voice units, and extracting the assessment features of the word to be assessed comprises: analyzing the voice units included in the word to be assessed, and counting at least one of the following as the distribution features of the voice units: the total number of voice units, the number of distinct voice units, the number of occurrences of each distinct voice unit, the number of specified voice units, and the number of occurrences of each specified voice unit;
and/or
the assessment features indicating the distinctiveness of the word to be assessed at the acoustic level include the recognition probability of the word to be assessed, and extracting the assessment features of the word to be assessed comprises: obtaining the recognition probability of each voice unit included in the word to be assessed; and taking the mean of the recognition probabilities of the voice units as the recognition probability of the word to be assessed, the recognition probability including an accuracy rate and/or a false alarm rate;
and/or
the assessment features indicating the distinctiveness of the word to be assessed at the acoustic level include the duration of the word to be assessed, and extracting the assessment features of the word to be assessed comprises: obtaining the duration of each voice unit included in the word to be assessed; and taking the sum of the durations of the voice units as the duration of the word to be assessed;
and/or
the assessment features indicating the distinctiveness of the word to be assessed at the acoustic level include the tone feature of the word to be assessed, and extracting the assessment features of the word to be assessed comprises: obtaining the tones of the individual characters included in the word to be assessed, and calculating the tone variance between adjacent characters; and performing a mathematical operation on the tone variances between the adjacent characters to obtain the tone feature of the word to be assessed;
and/or
the assessment features indicating the distinctiveness of the word to be assessed at the semantic level include the score of a language model, and extracting the assessment features of the word to be assessed comprises: taking the word to be assessed as input and, after processing by a pre-built language model, outputting the score of the word to be assessed, the score indicating the frequency with which the word to be assessed occurs;
and/or
the assessment features indicating the distinctiveness of the word to be assessed at the semantic level include the part-of-speech feature of the word to be assessed, and extracting the assessment features of the word to be assessed comprises: obtaining the parts of speech of the words included in the word to be assessed; and counting the number of distinct parts of speech and the number of occurrences of each distinct part of speech as the part-of-speech feature of the word to be assessed;
and/or
the assessment features indicating the distinctiveness of the word to be assessed at the semantic level include the smoothness feature of the word to be assessed, and extracting the assessment features of the word to be assessed comprises: using the words included in the word to be assessed to calculate the forward semantic smoothness and the reverse semantic smoothness of the word to be assessed; and performing a mathematical operation on the forward semantic smoothness and the reverse semantic smoothness to obtain the smoothness feature of the word to be assessed.
3. The method according to claim 1 or 2, characterized in that, when it is determined that the word to be assessed is not suitable as a wake-up word, the method further comprises:
extracting the problem feature of the word to be assessed;
determining, according to the problem feature, the problem type of the word to be assessed, the problem type indicating the reason the word to be assessed is not suitable as a wake-up word.
4. The method according to claim 3, characterized in that:
the problem feature includes the score of a language model, and determining the problem type of the word to be assessed comprises: taking the word to be assessed as input and, after processing by a pre-built language model, outputting the score of the word to be assessed, the score indicating the frequency with which the word to be assessed occurs; and, when the score of the word to be assessed exceeds a preset score, determining that the problem type of the word to be assessed is high-frequency vocabulary;
and/or
the problem feature includes the duration of the word to be assessed, and determining the problem type of the word to be assessed comprises: obtaining the duration of each voice unit included in the word to be assessed; taking the sum of the durations of the voice units as the duration of the word to be assessed; and, when the duration of the word to be assessed is less than a preset duration, determining that the problem type of the word to be assessed is that the duration is too short;
and/or
the problem feature includes the schwa feature of the word to be assessed, and determining the problem type of the word to be assessed comprises: counting the number of schwa phonemes included in the word to be assessed; and, when the number of schwa phonemes exceeds a preset number, determining that the problem type of the word to be assessed is that there are too many schwas.
5. The method according to claim 1 or 2, characterized in that, when it is determined that the word to be assessed is not suitable as a wake-up word, the method further includes:
Obtaining a replaceable word corresponding to the word to be assessed according to a pre-built knowledge graph of semantically similar words;
Extracting assessment features of the replaceable word, the assessment features indicating the distinctiveness of the replaceable word at the acoustic level and/or the semantic level;
Taking the assessment features of the replaceable word as input and, after processing by the wake-up word assessment model, determining whether the replaceable word is suitable as a wake-up word;
If the replaceable word is suitable as a wake-up word, recommending the replaceable word to the user.
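The replacement flow of claim 5 can be sketched as follows, under strong simplifying assumptions: the semantically similar word knowledge graph is modelled as a plain dictionary, and the feature extractor and the assessment model are passed in as callables. The function name `recommend_replacements` and its parameters are hypothetical and do not come from the patent.

```python
from typing import Callable, Dict, List, Sequence


def recommend_replacements(
    word: str,
    similar_words: Dict[str, List[str]],
    extract_features: Callable[[str], Sequence[float]],
    is_suitable: Callable[[Sequence[float]], bool],
) -> List[str]:
    """Return the replaceable words that the assessment model accepts."""
    # Step 1: look up candidate replacements in the (stand-in) knowledge graph.
    candidates = similar_words.get(word, [])

    recommended: List[str] = []
    for candidate in candidates:
        # Step 2: extract the assessment features of the candidate.
        features = extract_features(candidate)
        # Steps 3 and 4: keep the candidate only if the model judges it suitable.
        if is_suitable(features):
            recommended.append(candidate)
    return recommended
```

Each candidate goes through the same feature extraction and model call as the original word, which matches the claim's reuse of the wake-up word assessment model.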
6. A wake-up word assessment device, characterized in that the device includes:
A to-be-assessed word acquisition module, configured to obtain a word to be assessed that is input by a user;
An assessment feature extraction module, configured to extract assessment features of the word to be assessed, the assessment features indicating the distinctiveness of the word to be assessed at the acoustic level and/or the semantic level;
A wake-up word determination module, configured to take the assessment features of the word to be assessed as input and, after processing by a pre-built wake-up word assessment model, determine whether the word to be assessed is suitable as a wake-up word.
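One way to picture how the three modules of claim 6 fit together is the small Python class below. It is only an illustrative sketch: the class name `WakeWordAssessor`, its method names, and the idea of comparing a model score against a `threshold` are assumptions made here; the claim itself only says that the model determines suitability.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class WakeWordAssessor:
    """Hypothetical composition of the three modules named in the claim."""
    extract_features: Callable[[str], Sequence[float]]  # assessment feature extraction module
    model: Callable[[Sequence[float]], float]           # pre-built assessment model (assumed to return a score)
    threshold: float = 0.5                               # assumed decision threshold

    def acquire(self, raw_input: str) -> str:
        # Acquisition module: here it only trims surrounding whitespace.
        return raw_input.strip()

    def is_suitable(self, raw_input: str) -> bool:
        # Determination module: features in, suitability decision out.
        word = self.acquire(raw_input)
        score = self.model(self.extract_features(word))
        return score >= self.threshold
```

Passing the extractor and the model in as callables keeps the sketch independent of any particular acoustic or language model.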
7. The device according to claim 6, characterized in that:
The assessment features indicating the distinctiveness of the word to be assessed at the acoustic level include distribution features of speech units, and the assessment feature extraction module is configured to analyze the speech units contained in the word to be assessed and to count at least one of the following as the distribution features of the speech units: the total number of speech units, the number of different speech units, the number of times each different speech unit occurs, the number of specified speech units, and the number of times each specified speech unit occurs;
And/or
The assessment features indicating the distinctiveness of the word to be assessed at the acoustic level include the recognition probability of the word to be assessed, and the assessment feature extraction module is configured to obtain the recognition probability of each speech unit contained in the word to be assessed, and to take the mean of the recognition probabilities of the speech units as the recognition probability of the word to be assessed, the recognition probability including an accuracy rate and/or a false alarm rate;
And/or
The assessment features indicating the distinctiveness of the word to be assessed at the acoustic level include the duration of the word to be assessed, and the assessment feature extraction module is configured to obtain the duration of each speech unit contained in the word to be assessed, and to take the sum of the durations of the speech units as the duration of the word to be assessed;
And/or
The assessment features indicating the distinctiveness of the word to be assessed at the acoustic level include a tone feature of the word to be assessed, and the assessment feature extraction module is configured to obtain the tone of each character contained in the word to be assessed, calculate the tone difference between adjacent characters, and perform a mathematical operation on the tone differences between adjacent characters to obtain the tone feature of the word to be assessed;
And/or
The assessment features indicating the distinctiveness of the word to be assessed at the semantic level include a language model score, and the assessment feature extraction module is configured to take the word to be assessed as input and, after processing by a pre-built language model, output the score of the word to be assessed, the score indicating how frequently the word to be assessed occurs;
And/or
The assessment features indicating the distinctiveness of the word to be assessed at the semantic level include a part-of-speech feature of the word to be assessed, and the assessment feature extraction module is configured to obtain the part of speech of each constituent word contained in the word to be assessed, and to count the number of different parts of speech and the number of times each part of speech occurs as the part-of-speech feature of the word to be assessed;
And/or
The assessment features indicating the distinctiveness of the word to be assessed at the semantic level include a smoothness feature of the word to be assessed, and the assessment feature extraction module is configured to calculate, using the constituent words contained in the word to be assessed, the forward semantic smoothness and the reverse semantic smoothness of the word to be assessed, and to perform a mathematical operation on the forward semantic smoothness and the reverse semantic smoothness to obtain the smoothness feature of the word to be assessed.
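The acoustic-level features of claim 7 can be illustrated with a short Python sketch. All function names are hypothetical. Two points are assumptions rather than statements from the claim: "the number of specified speech units" is read here as the number of specified units that actually occur in the word, and the unspecified "mathematical operation" on tone differences is taken to be the mean absolute difference, with tones encoded as integers.

```python
from collections import Counter
from statistics import mean
from typing import Dict, Sequence, Set


def unit_distribution(units: Sequence[str], specified: Set[str]) -> Dict[str, object]:
    """Distribution features of the speech units in one word."""
    counts = Counter(units)
    return {
        "total": len(units),                    # total number of speech units
        "distinct": len(counts),                # number of different speech units
        "per_unit": dict(counts),               # occurrences of each different unit
        # Assumed reading: how many of the specified units occur at all.
        "specified_distinct": sum(1 for u in specified if counts[u] > 0),
        "per_specified": {u: counts[u] for u in sorted(specified)},
    }


def word_recognition_probability(unit_probs: Sequence[float]) -> float:
    """Mean of the per-unit recognition probabilities."""
    return mean(unit_probs)


def word_duration(unit_durations_ms: Sequence[float]) -> float:
    """Sum of the per-unit durations."""
    return sum(unit_durations_ms)


def tone_feature(tones: Sequence[int]) -> float:
    """Mean absolute tone difference between adjacent characters (assumed operation)."""
    diffs = [abs(b - a) for a, b in zip(tones, tones[1:])]
    return mean(diffs) if diffs else 0.0
```

The semantic-level features (language model score, part-of-speech counts, forward and reverse smoothness) depend on external models and are left out of this sketch.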
8. The device according to claim 6 or 7, characterized in that the device further includes:
A problem feature extraction module, configured to extract a problem feature of the word to be assessed when it is determined that the word to be assessed is not suitable as a wake-up word;
A problem type determination module, configured to determine, according to the problem feature, the problem type of the word to be assessed, the problem type indicating the reason why the word to be assessed is not suitable as a wake-up word.
9. The device according to claim 8, characterized in that:
The problem feature includes a language model score, and the problem type determination module is configured to take the word to be assessed as input and, after processing by a pre-built language model, output the score of the word to be assessed, the score indicating how frequently the word to be assessed occurs; and, when the score of the word to be assessed exceeds a preset score, to determine that the problem type of the word to be assessed is a high-frequency word;
And/or
The problem feature includes the duration of the word to be assessed, and the problem type determination module is configured to obtain the duration of each speech unit contained in the word to be assessed; take the sum of the durations of the speech units as the duration of the word to be assessed; and, when the duration of the word to be assessed is less than a preset duration, determine that the problem type of the word to be assessed is a too-short duration;
And/or
The problem feature includes a schwa feature of the word to be assessed, and the problem type determination module is configured to count the number of schwa phonemes contained in the word to be assessed and, when the number of schwa phonemes exceeds a preset number, determine that the problem type of the word to be assessed is too many schwa phonemes.
10. The device according to claim 6 or 7, characterized in that the device further includes:
A replaceable word acquisition module, configured to obtain, when it is determined that the word to be assessed is not suitable as a wake-up word, a replaceable word corresponding to the word to be assessed according to a pre-built knowledge graph of semantically similar words;
The assessment feature extraction module is configured to extract assessment features of the replaceable word, the assessment features indicating the distinctiveness of the replaceable word at the acoustic level and/or the semantic level;
The wake-up word determination module is configured to take the assessment features of the replaceable word as input and, after processing by the wake-up word assessment model, determine whether the replaceable word is suitable as a wake-up word;
A replaceable word recommendation module, configured to recommend the replaceable word to the user when the replaceable word is suitable as a wake-up word.
11. A storage medium storing a plurality of instructions, characterized in that the instructions are loaded by a processor to execute the steps of the method according to any one of claims 1 to 5.
12. An electronic device, characterized in that the electronic device includes:
The storage medium according to claim 11; and
A processor, configured to execute the instructions in the storage medium.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810159653.2A CN108536668B (en) | 2018-02-26 | 2018-02-26 | Wake-up word evaluation method and device, storage medium and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810159653.2A CN108536668B (en) | 2018-02-26 | 2018-02-26 | Wake-up word evaluation method and device, storage medium and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108536668A (en) | 2018-09-14 |
CN108536668B (en) | 2022-06-07 |
Family
ID=63486143
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810159653.2A Active CN108536668B (en) | 2018-02-26 | 2018-02-26 | Wake-up word evaluation method and device, storage medium and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108536668B (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109767763A (en) * | 2018-12-25 | 2019-05-17 | 苏州思必驰信息科技有限公司 | It is customized wake up word determination method and for determine it is customized wake up word device |
CN110989963A (en) * | 2019-11-22 | 2020-04-10 | 北京梧桐车联科技有限责任公司 | Awakening word recommendation method and device and storage medium |
CN111105789A (en) * | 2018-10-25 | 2020-05-05 | 珠海格力电器股份有限公司 | Awakening word obtaining method and device |
CN111128171A (en) * | 2019-12-31 | 2020-05-08 | 云知声智能科技股份有限公司 | Setting method and device based on voice recognition |
CN111341317A (en) * | 2020-02-19 | 2020-06-26 | Oppo广东移动通信有限公司 | Method and device for evaluating awakening audio data, electronic equipment and medium |
Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102999161A (en) * | 2012-11-13 | 2013-03-27 | 安徽科大讯飞信息科技股份有限公司 | Implementation method and application of voice awakening module |
CN103474069A (en) * | 2013-09-12 | 2013-12-25 | 中国科学院计算技术研究所 | Method and system for fusing recognition results of a plurality of speech recognition systems |
CN104464751A (en) * | 2014-11-21 | 2015-03-25 | 科大讯飞股份有限公司 | Method and device for detecting pronunciation rhythm problem |
CN104584119A (en) * | 2012-07-03 | 2015-04-29 | 谷歌公司 | Determining hotword suitability |
CN104616653A (en) * | 2015-01-23 | 2015-05-13 | 北京云知声信息技术有限公司 | Word match awakening method, work match awakening device, voice awakening method and voice awakening device |
CN105096939A (en) * | 2015-07-08 | 2015-11-25 | 百度在线网络技术(北京)有限公司 | Voice wake-up method and device |
US9275637B1 (en) * | 2012-11-06 | 2016-03-01 | Amazon Technologies, Inc. | Wake word evaluation |
CN105654943A (en) * | 2015-10-26 | 2016-06-08 | 乐视致新电子科技(天津)有限公司 | Voice wakeup method, apparatus and system thereof |
CN105679310A (en) * | 2015-11-17 | 2016-06-15 | 乐视致新电子科技(天津)有限公司 | Method and system for speech recognition |
CN106062868A (en) * | 2014-07-25 | 2016-10-26 | 谷歌公司 | Providing pre-computed hotword models |
US20170125036A1 (en) * | 2015-11-03 | 2017-05-04 | Airoha Technology Corp. | Electronic apparatus and voice trigger method therefor |
CN106941001A (en) * | 2017-04-18 | 2017-07-11 | 何婉榕 | Automatic page turning method and device |
CN107134279A (en) * | 2017-06-30 | 2017-09-05 | 百度在线网络技术(北京)有限公司 | A kind of voice awakening method, device, terminal and storage medium |
CN107223280A (en) * | 2017-03-03 | 2017-09-29 | 深圳前海达闼云端智能科技有限公司 | robot awakening method, device and robot |
CN107578771A (en) * | 2017-07-25 | 2018-01-12 | 科大讯飞股份有限公司 | Audio recognition method and device, storage medium, electronic equipment |
CN107665708A (en) * | 2016-07-29 | 2018-02-06 | 科大讯飞股份有限公司 | Intelligent sound exchange method and system |
-
2018
- 2018-02-26 CN CN201810159653.2A patent/CN108536668B/en active Active
Patent Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104584119A (en) * | 2012-07-03 | 2015-04-29 | 谷歌公司 | Determining hotword suitability |
US9275637B1 (en) * | 2012-11-06 | 2016-03-01 | Amazon Technologies, Inc. | Wake word evaluation |
CN102999161A (en) * | 2012-11-13 | 2013-03-27 | 安徽科大讯飞信息科技股份有限公司 | Implementation method and application of voice awakening module |
CN103474069A (en) * | 2013-09-12 | 2013-12-25 | 中国科学院计算技术研究所 | Method and system for fusing recognition results of a plurality of speech recognition systems |
CN106062868A (en) * | 2014-07-25 | 2016-10-26 | 谷歌公司 | Providing pre-computed hotword models |
CN104464751A (en) * | 2014-11-21 | 2015-03-25 | 科大讯飞股份有限公司 | Method and device for detecting pronunciation rhythm problem |
CN104616653A (en) * | 2015-01-23 | 2015-05-13 | 北京云知声信息技术有限公司 | Word match awakening method, work match awakening device, voice awakening method and voice awakening device |
CN105096939A (en) * | 2015-07-08 | 2015-11-25 | 百度在线网络技术(北京)有限公司 | Voice wake-up method and device |
CN105654943A (en) * | 2015-10-26 | 2016-06-08 | 乐视致新电子科技(天津)有限公司 | Voice wakeup method, apparatus and system thereof |
US20170125036A1 (en) * | 2015-11-03 | 2017-05-04 | Airoha Technology Corp. | Electronic apparatus and voice trigger method therefor |
CN105679310A (en) * | 2015-11-17 | 2016-06-15 | 乐视致新电子科技(天津)有限公司 | Method and system for speech recognition |
CN107665708A (en) * | 2016-07-29 | 2018-02-06 | 科大讯飞股份有限公司 | Intelligent sound exchange method and system |
CN107223280A (en) * | 2017-03-03 | 2017-09-29 | 深圳前海达闼云端智能科技有限公司 | robot awakening method, device and robot |
CN106941001A (en) * | 2017-04-18 | 2017-07-11 | 何婉榕 | Automatic page turning method and device |
CN107134279A (en) * | 2017-06-30 | 2017-09-05 | 百度在线网络技术(北京)有限公司 | A kind of voice awakening method, device, terminal and storage medium |
CN107578771A (en) * | 2017-07-25 | 2018-01-12 | 科大讯飞股份有限公司 | Audio recognition method and device, storage medium, electronic equipment |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111105789A (en) * | 2018-10-25 | 2020-05-05 | 珠海格力电器股份有限公司 | Awakening word obtaining method and device |
CN109767763A (en) * | 2018-12-25 | 2019-05-17 | 苏州思必驰信息科技有限公司 | It is customized wake up word determination method and for determine it is customized wake up word device |
CN110989963A (en) * | 2019-11-22 | 2020-04-10 | 北京梧桐车联科技有限责任公司 | Awakening word recommendation method and device and storage medium |
CN111128171A (en) * | 2019-12-31 | 2020-05-08 | 云知声智能科技股份有限公司 | Setting method and device based on voice recognition |
CN111341317A (en) * | 2020-02-19 | 2020-06-26 | Oppo广东移动通信有限公司 | Method and device for evaluating awakening audio data, electronic equipment and medium |
CN111341317B (en) * | 2020-02-19 | 2023-09-01 | Oppo广东移动通信有限公司 | Method, device, electronic equipment and medium for evaluating wake-up audio data |
Also Published As
Publication number | Publication date |
---|---|
CN108536668B (en) | 2022-06-07 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11848008B2 (en) | Artificial intelligence-based wakeup word detection method and apparatus, device, and medium | |
CN108536668A (en) | Wake up word appraisal procedure and device, storage medium, electronic equipment | |
CN108320733B (en) | Voice data processing method and device, storage medium and electronic equipment | |
CN106611597B (en) | Voice awakening method and device based on artificial intelligence | |
CN108182937B (en) | Keyword recognition method, device, equipment and storage medium | |
CN110517664B (en) | Multi-party identification method, device, equipment and readable storage medium | |
CN101069230B (en) | The tone pattern information of the text message used in prediction communication system | |
US8150692B2 (en) | Method and apparatus for recognizing a user personality trait based on a number of compound words used by the user | |
CN105334743A (en) | Intelligent home control method and system based on emotion recognition | |
CN108320734A (en) | Audio signal processing method and device, storage medium, electronic equipment | |
CN108595406B (en) | User state reminding method and device, electronic equipment and storage medium | |
WO2022178969A1 (en) | Voice conversation data processing method and apparatus, and computer device and storage medium | |
CN110473536B (en) | Awakening method and device and intelligent device | |
CN110853621B (en) | Voice smoothing method and device, electronic equipment and computer storage medium | |
CN107274903A (en) | Text handling method and device, the device for text-processing | |
CN110992959A (en) | Voice recognition method and system | |
CN108269574B (en) | Method and device for processing voice signal to represent vocal cord state of user, storage medium and electronic equipment | |
CN108053826B (en) | Method and device for man-machine interaction, electronic equipment and storage medium | |
CN113314119A (en) | Voice recognition intelligent household control method and device | |
Liu et al. | Learning salient features for speech emotion recognition using CNN | |
JP6299563B2 (en) | Response generation method, response generation apparatus, and response generation program | |
CN113823265A (en) | Voice recognition method and device and computer equipment | |
CN103035244A (en) | Voice tracking method capable of feeding back loud-reading progress of user in real time | |
CN111968646A (en) | Voice recognition method and device | |
CN111554270A (en) | Training sample screening method and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||