CN101266793B - Device and method for reducing recognition error via context relation in dialog bouts - Google Patents


Info

Publication number
CN101266793B
Authority
CN
China
Prior art keywords
rule
dialogue
context
bout
relation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2007100870226A
Other languages
Chinese (zh)
Other versions
CN101266793A (en
Inventor
吴旭智
李青宪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Industrial Technology Research Institute ITRI
Original Assignee
Industrial Technology Research Institute ITRI
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Industrial Technology Research Institute ITRI
Priority to CN2007100870226A
Publication of CN101266793A
Application granted
Publication of CN101266793B
Current legal status: Active
Anticipated expiration

Abstract

The invention discloses a device and a method for reducing recognition errors via context relations between dialogue turns. The device comprises a rule storage unit, an evolutionary rule generation module, and a rule trigger. The invention first analyzes the dialogue history with a massively parallel evolutionary computing method to train a rule set that describes the context relations between dialogue turns. It then uses the rule set either to give a speech recognition module additional information or to re-evaluate the results produced by an existing speech recognizer, and it measures the confidence of the re-evaluated recognition. Finally, it uses each successful dialogue turn to dynamically adjust the rule set. The invention can improve recognition accuracy on top of existing speech recognition and can help speech recognition in new-generation or more complex dialogue systems. The learning method used for the rule set has a low training cost.

Description

Device and method for reducing recognition errors via context relations between dialogue turns
Technical field
The present invention relates to a device and a method for reducing recognition errors via the context relations between dialogue turns.
Background technology
In automatic speech recognition (ASR), reducing recognition errors is a very important problem. Research has consistently found that using more information as a reference for recognition effectively reduces the recognition error rate. The available information includes speech utterance information, speech semantics information, and dialogue context information.
Traditional speech recognition relies mainly on keyword spotting. If the keywords are recognized correctly, the dialogue can continue correctly and the required task can be completed. For a traditional information-access dialogue system (for example, a system for weather queries, personnel information queries, or voice ticket booking), a usable system can be built as long as the keyword recognition rate is improved and combined with other related techniques (for example, using different sub-dialogue systems for different dialogue states).
In newer dialogue systems, the relationship between the system and the user is no longer the fixed pattern in which one side asks and the other answers. The interaction is more complex, so keyword spotting alone cannot yield a workable dialogue system. For example, in a language learning system, the user and the system can ask each other questions, answer them, and jointly complete a task or reach a goal that both sides share. Fig. 1 is an example of such a dialogue system. With reference to Fig. 1, the user (denoted U) and the system (denoted S) converse, and the two sides must jointly agree on a time and on an activity that both can accept.
In this example, the two sides no longer follow a fixed question-and-answer pattern, so recognition errors such as the following may occur:
"Do you like dancing?" might be misrecognized as "I do like dancing.";
"Would you like to...?" might be misrecognized as "What do you like to...?"
In the examples above, keyword spotting alone may not be able to resolve such errors. If the relevant dialogue context information can be consulted, it helps considerably in improving the recognition rate.
Existing techniques already use the dialogue history to improve the recognition rate. For example, in the paper "Dialogue Context-based Re-ranking of ASR Hypotheses", presented at IEEE SLT 2006, Rebecca Jonson et al. use features at several levels, namely utterance features, immediate context features, close-context features, dialogue context features, and candidate-list features (list features), as references for detecting recognition errors. For the immediate and close-context features, however, the paper considers only the dialogue context of the two most recent dialogue turns as the basis for recognition.
Another approach that refers to the dialogue history computes summary statistics of the preceding dialogue (for example, the cancel percentage, error percentage, number of system turns, and number of user turns of the ongoing dialogue). It does not refer in detail to the content of each preceding dialogue turn, and it does not accurately describe the relations that may exist between dialogue turns.
Most current techniques use the previous utterance (usually one issued by the system) as the main basis for judging the current utterance. In real dialogue, however, the current utterance may be related to several preceding utterances, not only to the immediately preceding one. Existing techniques have no effective representation for this situation. For example, existing approaches adopt something similar to N-grams; when n > 3 is considered, the frequency distribution becomes very sparse.
In speech recognition systems, rescoring the N-best list to improve the recognition rate is also a widely used idea. Methods based on the N-best list, however, mostly focus on how to use the N-best list information to derive a confidence measure, how to generate the N-best list during recognition, and how to perform adaptive learning on the N-best list.
Summary of the invention
The object of the present invention is to provide a device and a method for reducing recognition errors via context relations between dialogue turns. By including the context relations between one or more dialogue turns when searching for the best answer during speech recognition, the invention can reduce the recognition error rate of an automatic speech recognition system. The invention can help speech recognition in new-generation or more complex dialogue systems.
The invention analyzes existing dialogue content to find a rule set composed of multiple rules that describe dialogue context relations. The information described by each rule is expressed in units of dialogue turns and can describe the context relations among multiple dialogue turns. Once trained, the rule set can be used, together with the dialogue history, to determine the probability of each context relation occurring in the current dialogue turn. This probability can be used to rescore the N-best list produced by speech recognition, thereby reducing the recognition error rate.
The device of the invention for reducing recognition errors via context relations between dialogue turns comprises a rule storage unit, an evolutionary rule generation module, and a rule trigger. The rule storage unit holds a rule set composed of one or more rules, each of which describes a set of relations between dialogue turns. The evolutionary rule generation module trains the rule set through evolutionary adaptation on a dialogue log. The rule trigger, based on the trained rule set and the dialogue history of a plurality of preceding dialogue turns, selects at least one rule and its corresponding confidence measure from the trained rule set, so that an ASR system can re-evaluate its speech recognition.
The re-evaluated recognition result is fed back into the dialogue log, and the device can further adjust the rule set through a reward/punishment element.
In the device described above, the representation of each rule in the rule set includes at least the context relation between the dialogue turns.
In the device described above, the information described by each rule in the rule set comprises the context categories of a sequence of preceding dialogue turns, the context category of the current dialogue turn, and the confidence measure corresponding to the rule.
In the device described above, the information described by each rule in the rule set contains one or more different dialogue context categories.
In the device described above, the confidence measure corresponding to a rule is the confidence score of that rule.
In the device described above, the information described by each rule in the rule set can also use a wildcard category, which stands for any of the one or more different context categories.
In the device described above, the evolutionary rule generation module comprises three operators: rule variation, rule evaluation, and rule selection.
The present invention also provides a method for reducing recognition errors via context relations between dialogue turns. The method comprises the following steps: analyzing the dialogue history with a massively parallel evolutionary computing method to train a rule set that describes the context relations between one or more dialogue turns; re-evaluating, according to the rule set, the recognition result originally produced by an automatic speech recognition system, and measuring the confidence of the re-evaluated speech recognition; and dynamically adjusting the rule set for each successful dialogue turn.
In the method described above, the step of training the rule set further comprises: randomly generating a random rule set; and passing the random rule set through the three operators of an evolutionary computation, namely rule variation, rule evaluation, and rule selection, so that the rule set is trained through generation-by-generation evolutionary adaptation.
In the method described above, the rule set is composed of one or more rules, and the information described by each rule is expressed in units of dialogue turns.
In the method described above, the rule set describes the context relations between one or more dialogue turns as follows: attributes of the content of the one or more dialogue turns are defined as one or more dialogue context categories; and each rule is written as M_1 M_2 M_3 ... M_n : R, I, where M_1 M_2 M_3 ... M_n are the context categories of the preceding n dialogue turns, R is the context category of the current dialogue turn, and I is the confidence measure corresponding to the rule.
In the method described above, rule variation means that each rule has a probability of becoming a new rule through either mutation or combination.
In the method described above, rule evaluation means evaluating the confidence of each rule.
In the method described above, rule selection comprises the following steps: retaining a predetermined proportion of the rules; generating new rules at random or by variation of existing rules; finding equivalent rules and deleting the more general rule among them; and, if any rule was deleted, returning to the step of generating new rules.
In the method described above, the step of re-evaluating and measuring the confidence of the re-evaluated speech recognition further comprises: applying the log of the preceding dialogue turns to each rule in the rule set to find the rules whose preceding-turn context categories match that log; and, among all matching rules, grouping the rules by the context category of their current dialogue turn and computing the confidence score of each dialogue context category.
In the method described above, after the confidence scores are computed, the step of re-evaluating and measuring the confidence of the re-evaluated speech recognition further comprises providing the confidence scores to the automatic speech recognition system.
In the method described above, the confidence scores give the automatic speech recognition system more information with which to produce a more accurate N-best list.
In the method described above, the confidence scores are provided to the automatic speech recognition system as post-processing and are used to adjust the scores of the N-best list originally produced by that system.
The invention defines the attributes of the dialogue content as one or more categories, namely dialogue context categories. Each utterance, according to its information, belongs to one specific dialogue context category. The information described by each rule then comprises the context categories of a sequence of preceding dialogue turns, the context category of the current dialogue turn, and the confidence measure corresponding to the rule. Each rule can also use a wildcard category, which stands for any of the one or more different context categories.
Given the structure of the device and the definition and representation of each rule in the rule set, the invention first analyzes the dialogue history with a massively parallel evolutionary computing method (evolutionary massive parallelism approach) to train a rule set. It then re-evaluates, according to the rule set, the recognition result originally produced by the automatic speech recognition system, and measures the confidence of the re-evaluated speech recognition. Finally, it dynamically adjusts the rule set for each successful dialogue turn.
The massively parallel evolutionary computing method trains the rule set from the dialogue log. First, a random rule set is generated. The random rule set is then passed through the three operators of the evolutionary computation, namely rule variation, rule evaluation, and rule selection, and the rule set is trained through generation-by-generation evolutionary adaptation.
After the rule set has been adapted over successive generations of rule selection, the resulting rules have higher confidence scores and better represent the dialogue context relations between dialogue turns. These relations can be used to further improve recognition accuracy on top of existing speech recognition. The learning method used by the invention also has a low training cost, which helps in designing a speech recognition system whose recognition rules can be adjusted dynamically. Such a mechanism is also of considerable help for handling dialogue systems with more complex interaction in the future.
The above and other objects and advantages of the invention are described in detail below with reference to the accompanying drawings, the detailed description of the embodiments, and the claims.
Description of drawings
Fig. 1 is an example of a traditional dialogue system.
Fig. 2A is a schematic diagram of a device of the invention for reducing recognition errors via context relations between dialogue turns.
Fig. 2B illustrates an automatic speech recognition system applying the invention for re-evaluation, and how the invention can adjust the rule set of Fig. 2A through a reward/punishment element.
Fig. 3 is a flow chart illustrating the operation of the method of the invention for reducing recognition errors via context relations between dialogue turns.
Fig. 4 illustrates training the rule set from the dialogue log by evolutionary computation.
Fig. 5 uses the dialogue log of Fig. 1 as an example to illustrate the 9 different dialogue context categories that the invention defines for this dialogue log.
Fig. 6 illustrates the flow of the rule selection steps.
Fig. 7 further illustrates how the invention performs re-evaluation and measures the probability of each dialogue context category in the current dialogue turn.
Fig. 8 is a schematic diagram of a section of dialogue in the dialogue log and its corresponding dialogue context category types.
Fig. 9A is an example of a rule set generated at random for the dialogue log of Fig. 8.
Figs. 9B to 9D show the rule sets trained after evolutionary adaptation over 100, 200, and 10000 generations, respectively; only the first 30 rules of each rule set are listed.
Fig. 10 illustrates the rule trigger applying the dialogue history to the rule set and computing the probability of each dialogue context category in the current dialogue turn.
The reference numerals are described as follows:
S system
U user
200 device for reducing recognition errors via context relations between dialogue turns
201 rule storage unit
203 evolutionary rule generation module
205 rule trigger
211 rule set
215a at least one rule
215b confidence measure
221 dialogue log
N natural number
223 dialogue history of the states of the preceding N dialogue turns
225 automatic speech recognition system
225a speech recognition result
237 reward/punishment element
301 analyze the dialogue history with a massively parallel evolutionary computing method to train a rule set that describes the context relations between one or more dialogue turns
302 re-evaluate, according to the rule set, the recognition result originally produced by the automatic speech recognition system, and measure the confidence of the re-evaluated speech recognition
303 for each successful dialogue turn, dynamically adjust the rule set
401 randomly generate a random rule set
402 rule variation
403 rule evaluation
404 rule selection
M_1 M_2 M_3 ... M_n : R, I rule representation
601 retain a predetermined proportion of the rules
602 generate new rules at random or by variation of existing rules
603 find equivalent rules and delete the more general rule among them
604 was any rule deleted?
701 apply the log of the preceding n dialogue turns to each rule in the rule set, one by one
702 among all matching rules, group the rules by the context category of the current dialogue turn (that is, R) and compute the confidence score of each dialogue context category
Embodiment
Fig. 2 A is a schematic diagram that reduces the device of identification mistake by context relation between the dialogue bout of the present invention.With reference to figure 2, this comprises a regular storage element 201, an evolutionary regular generation module 203 and a regular trigger 205 by the device 200 that context relation between the dialogue bout reduces the identification mistake.Rule storage element 201 has a rule group 211, and this rule group 211 is made up of one or more rule, and represents each rule take the dialogue bout as unit.Evolutionary regular generation module 203 interconnects with this rule storage element, and develops from dialogue record (a dialogue log) 221 and adjust, and trains this rule group 211.Rule trigger 205 is connected with regular storage element 201, and according to the Conversation History 223 of the rule sets 211 that trains with top n dialogue bout, from the rule sets 211 that trains, select at least one regular 215a and corresponding confidence degree thereof and measure 215b, for an automatic speech recognizing system 225 its speech recognition is reappraised, wherein N is a natural number (natural number).
Speech recognition after this is reappraised as a result 225a is fed back in the session log 221.Device 200 of the present invention can by reward reward/punishment element (reward/punishment element) 237, further be adjusted this rule group 211, shown in Fig. 2 B.
With reference to Fig. 2B, when the user's speech is input to the automatic speech recognition system 225, the system can use the at least one rule 215a selected by the rule trigger 205 and its corresponding confidence measure 215b to re-evaluate the N-best list that it originally produced, combining them with the N-best list scores in a weighted sum. In other words, rescoring takes the context relations between dialogue turns into account. This reduces speech recognition errors and makes the score assessment of the N-best list more reliable, so that a more suitable answer can be found in the N-best list and fed back into the dialogue log 221. The rules in the rule set 211 are then further adjusted through the reward/punishment element 237.
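The weighted-sum rescoring described above can be sketched as follows. This is only an illustration under stated assumptions: the interpolation weight, the score ranges, the function name `rescore_nbest`, and the category label "Y" attached to the second hypothesis are all hypothetical and do not come from the patent text.

```python
def rescore_nbest(nbest, category_conf, weight=0.5):
    """Re-rank an N-best list with context confidence.

    nbest: list of (hypothesis, context_category, asr_score).
    category_conf: dict mapping context category -> confidence from the rule trigger.
    weight: assumed interpolation weight between ASR score and context confidence.
    """
    rescored = [
        (hyp, cat, (1 - weight) * score + weight * category_conf.get(cat, 0.0))
        for hyp, cat, score in nbest
    ]
    # Highest combined score first.
    return sorted(rescored, key=lambda item: item[2], reverse=True)


# Hypothetical scores: the recognizer slightly prefers the wrong hypothesis.
nbest = [
    ("I do like dancing.", "Y", 0.55),
    ("Do you like dancing?", "V", 0.45),
]
context = {"V": 0.8, "Y": 0.1}  # assumed output of the rule trigger
best = rescore_nbest(nbest, context)[0]
```

With these made-up numbers the question hypothesis overtakes the statement, because the context confidence for category V outweighs the small acoustic advantage of the other hypothesis.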
According to the invention, the evolutionary rule generation module 203 trains the rule set 211 from an existing dialogue log. For example, a random rule set is first generated, and this random rule set is then passed through the three operators of the evolutionary rule generation module, namely rule variation, rule evaluation, and rule selection, to train the rule set 211.
Accordingly, when the device of the invention is applied to an automatic speech recognition system, it uses the context relations between one or more dialogue turns to train, through evolutionary adaptation, a set of rules that describe these dialogue context relations, where the information described by each rule is expressed in units of dialogue turns. Once trained, the rule set can be used, together with the history of the dialogue turns, to determine the probability of each context relation occurring in the current dialogue turn. This probability can be used to re-evaluate the N-best list originally produced by speech recognition, thereby reducing recognition errors and raising the system's confidence in its recognition result.
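As a rough sketch of how a trained rule set might yield a probability for each context category in the current turn, the snippet below matches the most recent turns against every rule, sums the scores of matching rules per result category, and normalizes. Aligning a rule with the last n turns and normalizing summed scores into probabilities are assumptions made for illustration; the rules and scores are invented.

```python
from collections import defaultdict


def matches(pattern, history):
    """True if the pattern fits the most recent len(pattern) turns ('#' fits anything)."""
    tail = history[-len(pattern):]
    return len(tail) == len(pattern) and all(p in ("#", h) for p, h in zip(pattern, tail))


def category_probabilities(rules, history):
    """rules: list of (pattern, result_category, score); history: string of categories."""
    totals = defaultdict(float)
    for pattern, result, score in rules:
        if matches(pattern, history):
            totals[result] += score  # group matching rules by their result category
    norm = sum(totals.values())
    return {cat: s / norm for cat, s in totals.items()} if norm else {}


# Hypothetical rules in the form (M_1..M_n, R, I).
rules = [("VY#N", "S", 50), ("##QN", "V", 30), ("VYQQ", "Q", 20)]
probs = category_probabilities(rules, "VYQN")
```

Here the first two rules match the history "VYQN" and the third does not, so only categories S and V receive probability mass.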
The information described by each rule in the rule set comprises the context categories of a sequence of one or more preceding dialogue turns, the context category of the current dialogue turn, and the confidence measure corresponding to the rule. The information described by each rule contains one or more different dialogue context categories. The confidence measure corresponding to each rule is the confidence score of that rule. Besides its own dialogue context category, the context of each dialogue turn in a rule can also be expressed with a wildcard category, which stands for any of the one or more different context categories.
Following the structural features of the invention shown in Figs. 2A and 2B, the operation of the invention, the representation of each rule, and the definition of the information each rule describes are further explained below.
Fig. 3 is a flow chart illustrating the operation of the method of the invention for reducing recognition errors via context relations between dialogue turns. First, as shown in step 301, the dialogue history is analyzed with a massively parallel evolutionary computing method to train a rule set that describes the context relations between one or more dialogue turns. Then, as shown in step 302, the recognition result originally produced by the automatic speech recognition system is re-evaluated according to the rule set, and the confidence of the re-evaluated speech recognition is measured. Finally, as shown in step 303, the rule set is dynamically adjusted for each successful dialogue turn. Steps 301 to 303 are further explained below.
In step 301, the massively parallel evolutionary computing method trains the rule set from the dialogue log. As shown in Fig. 4, a random rule set is first generated (label 401). The random rule set is then passed through the three operators of the evolutionary computation, namely rule variation 402, rule evaluation 403, and rule selection 404, and the rule set is trained through generation-by-generation evolutionary adaptation.
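The overall loop of step 301 can be sketched as below. This is a simplified illustration, not the patented procedure: the category alphabet, the fixed rule length, the population sizes, and the use of plain top-k truncation for selection are all assumptions chosen to keep the example short.

```python
import random

CATEGORIES = "VYNQS"        # hypothetical context categories
SYMBOLS = CATEGORIES + "#"  # '#' is the wildcard


def random_rule(rng, n=3):
    """A rule is (pattern over the preceding n turns, result category)."""
    return ("".join(rng.choice(SYMBOLS) for _ in range(n)), rng.choice(CATEGORIES))


def vary(rule, rng):
    """Mutate one position of the pattern."""
    pattern, result = rule
    i = rng.randrange(len(pattern))
    return (pattern[:i] + rng.choice(SYMBOLS) + pattern[i + 1:], result)


def evaluate(rule, log):
    """Count how often the pattern is followed by the result in the log."""
    pattern, result = rule
    n = len(pattern)
    return sum(
        all(p in ("#", c) for p, c in zip(pattern, log[i:i + n])) and log[i + n] == result
        for i in range(len(log) - n)
    )


def evolve(log, generations=50, size=20, keep=10, seed=0):
    rng = random.Random(seed)
    rules = [random_rule(rng) for _ in range(size)]           # 401: random rule set
    for _ in range(generations):
        rules += [vary(r, rng) for r in rules]                # 402: rule variation
        unique = list(dict.fromkeys(rules))
        ranked = sorted(unique, key=lambda r: evaluate(r, log), reverse=True)  # 403
        rules = ranked[:keep]                                 # 404: simplified selection
        rules += [random_rule(rng) for _ in range(size - len(rules))]
    unique = list(dict.fromkeys(rules))
    return sorted(unique, key=lambda r: evaluate(r, log), reverse=True)


ranked = evolve("VYNVYNVYNVYN")  # toy log of category symbols
```

The returned list is ordered by evaluation score, so the rules near the front are the ones that recur most often in the toy log.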
Next, how the rule set describes the context relations between one or more dialogue turns is explained. As noted above, the rule set is composed of one or more rules, and each rule is expressed in units of dialogue turns. First, the attributes of the dialogue content are defined as one or more categories, namely dialogue context categories. Each utterance, according to its information, belongs to one specific dialogue context category. A rule is then written as M_1 M_2 M_3 ... M_n : R, I, where M_1 M_2 M_3 ... M_n are the context categories of the preceding n dialogue turns, R is the context category of the current dialogue turn, and I is the confidence measure corresponding to the rule. Examples of I include the evaluation score of the rule, or the number of times or the probability with which the rule occurs.
Without loss of generality, Fig. 5 uses the dialogue log of Fig. 1 as an example to illustrate the 9 different dialogue context categories defined for this dialogue log. For example, the utterance "I do not like to go swimming." is defined as type n, the utterance "Do you like dancing?" is defined as type V, and the utterance "Good bye now." is defined as type X.
It is worth noting that the definition of dialogue context category types is not limited to the 9 types above; more types can be defined according to the sentence patterns of the dialogue.
In the rule representation, besides the dialogue context category type to which each dialogue turn belongs, the invention also provides a wildcard category type, written "#". In M_1 M_2 M_3 ... M_n, if a dialogue turn uses the category type "#", that turn accepts any dialogue context category. For example, suppose the possible dialogue context categories are {V, Y, N, Q, S} and the evaluation score of the rule is 50. Then in "VY#N:S, 50" the # position can be any dialogue context category. That is, "VYVN:S, 50", "VYYN:S, 50", "VYNN:S, 50", "VYQN:S, 50", and "VYSN:S, 50" all match this rule.
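The wildcard example above can be checked with a few lines of code. The sketch below uses the five categories named in the example; the helper names `matches` and `expand` are hypothetical.

```python
from itertools import product

CATEGORIES = ["V", "Y", "N", "Q", "S"]  # the categories from the example


def matches(pattern, turns):
    """True if every position is equal or the pattern has the wildcard '#'."""
    return len(pattern) == len(turns) and all(
        p == "#" or p == t for p, t in zip(pattern, turns)
    )


def expand(pattern):
    """List every concrete category sequence that a wildcard pattern covers."""
    choices = [CATEGORIES if p == "#" else [p] for p in pattern]
    return ["".join(combo) for combo in product(*choices)]


expanded = expand("VY#N")
# expanded == ["VYVN", "VYYN", "VYNN", "VYQN", "VYSN"]
```

A pattern with k wildcards over these five categories covers 5**k concrete sequences, which is why wildcard rules are described as more general.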
Rules are adapted through the three operators of the evolutionary rule generation module, and the rules produced after adaptation over many generations have higher confidence scores. In other words, each rule in the rule set describes the context relation between turns within the dialogue context. It is worth noting that this context relation is not limited by the number of dialogue turns. The three operators of the evolutionary rule generation module, rule variation 402, rule evaluation 403, and rule selection 404, are further explained below.
Rule variation 402: in the existing regular collection, each rule has the rule that a probability makes a variation (variation) or is combined into other.The variation mode be wherein certain context of dialogue classification kenel once from M iBecome M j, perhaps from M iBecome " # ", perhaps become M from " # " j, also can be that its regular result becomes R ' from R, M wherein i, M j, R, R ' all represent different context of dialogue classifications.For example, VS#Q makes a variation into VS##.The mode of combination is that rule sets different in the regular collection is synthesized a new rule.For example, VS##+##SQ is combined into VSSQ.
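A minimal sketch of the two variation operators, under stated assumptions: the names `mutate` and `combine` are hypothetical, a mutation touches exactly one symbol, and when both parents carry a concrete type at the same position the first parent wins. The text does not say how such conflicts are resolved, nor how the result R of a combined rule is chosen, so the sketch combines contexts only.

```python
import random

WILDCARD = "#"

def mutate(pattern, result, types, rng):
    """Change one randomly chosen symbol: a context position or the result R."""
    pos = rng.randrange(len(pattern) + 1)   # the last index stands for the result R
    if pos == len(pattern):
        new_result = rng.choice([t for t in types if t != result])
        return pattern, new_result
    old = pattern[pos]
    # A context position may become any other type or the wildcard.
    new = rng.choice([t for t in types + WILDCARD if t != old])
    return pattern[:pos] + new + pattern[pos + 1:], result

def combine(a, b):
    """Merge two equally long contexts, filling the wildcards of `a` from `b`."""
    return "".join(y if x == WILDCARD else x for x, y in zip(a, b))

print(combine("VS##", "##SQ"))   # the combination example from the text
print(mutate("VS#Q", "S", "VYNQS", random.Random(0)))
```

`combine("VS##", "##SQ")` reproduces the VSSQ example, and a mutation such as VS#Q to VS## is one of the outcomes `mutate` can draw.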
Rule evaluation 403: this operator evaluates the confidence measure I of a rule, which can be determined from the number of times, or the probability, that the rule occurs in the existing dialogue log. For example, the more often a rule occurs, the higher its evaluation score.
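One plausible reading of "number of occurrences" is a sliding-window count over the logged sequence of turn types. The sketch below assumes that reading; the function name `evaluate`, the one-letter-per-turn encoding and the sample log are all hypothetical.

```python
def evaluate(pattern, result, log):
    """Count log windows where `pattern` matches n consecutive turns
    and `result` is the type of the turn that follows them."""
    n = len(pattern)
    count = 0
    for start in range(len(log) - n):
        window, following = log[start:start + n], log[start + n]
        if following == result and all(
            p == "#" or p == w for p, w in zip(pattern, window)
        ):
            count += 1
    return count

log = "VYQSVYQSVNQS"   # hypothetical sequence of turn types, one letter per turn
print(evaluate("VY#", "S", log), evaluate("V##", "S", log))
```

A probability-based score could be obtained by dividing the count by the number of windows the context matches, but the text does not fix either choice.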
Rule filtering 404: rules are filtered in the following four steps, whose flow is illustrated in Fig. 6. In step 601, a predetermined proportion of the rules is kept, for example 300 rules. The probability that a rule is kept is proportional to its confidence measure. In step 602, new rules are produced at random or by variation of existing rules. In step 603, equivalent rules are found and the more general rule among them is deleted. For example, if rule VS#:R and rule VS##:R have the same evaluation score, the two rules are regarded as equivalent and the more general rule (VS##:R) is deleted. As another case, suppose two similar rules with the same evaluation score are found, M_iM_j#M_1:M_r, 23 and M_iM_jM_mM_1:M_r, 23; these two rules actually describe the same situation. That is to say, the "#" in M_iM_j#M_1:M_r, 23 can only be M_m. The present invention deletes rule M_iM_j#M_1:M_r, 23 to make the rule description more precise.
As shown in step 604, if any rule was deleted, the flow returns to step 602; otherwise rule filtering ends.
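Steps 601 and 603 can be sketched as follows. This is an illustration only: the rule set is assumed to be a dictionary mapping (context, result) to a positive score, the helper names are hypothetical, and "more general" is taken to mean "same length, with '#' where the other rule has a concrete type". Steps 602 and 604 would wrap these helpers in a loop that refills the set and repeats while step 603 still deletes something.

```python
import random

def more_general(a, b):
    """True when context `a` covers `b`: same length, differing only by '#' in `a`."""
    return (len(a) == len(b) and a != b
            and all(x == y or x == "#" for x, y in zip(a, b)))

def drop_general_equivalents(rules):
    """Step 603: among rules with equal result and score, delete the more general."""
    doomed = set()
    for (ctx_a, res_a), score_a in rules.items():
        for (ctx_b, res_b), score_b in rules.items():
            if res_a == res_b and score_a == score_b and more_general(ctx_a, ctx_b):
                doomed.add((ctx_a, res_a))
    return {k: v for k, v in rules.items() if k not in doomed}

def keep_proportional(rules, quota, rng):
    """Step 601: draw survivors with probability proportional to their score.
    Scores are assumed to be strictly positive."""
    pool, kept = dict(rules), {}
    while pool and len(kept) < quota:
        keys = list(pool)
        pick = rng.choices(keys, weights=[pool[k] for k in keys])[0]
        kept[pick] = pool.pop(pick)
    return kept

rules = {("VS##", "R"): 23, ("VS#Q", "R"): 23, ("VY##", "R"): 10}
print(drop_general_equivalents(rules))
print(keep_proportional(rules, 2, random.Random(1)))
```

In the sample, VS##:R and VS#Q:R share a score of 23, so the more general VS##:R is the one removed.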
Once the rule set has been trained, the evaluation scores of its rules rise generation by generation through evolutionary adjustment, and the rules of the whole rule set represent ever more closely the relation of dialogue context types between different dialogue turns. The rule set can therefore be used to calculate the probability of each dialogue context type occurring in the dialogue currently being recognized. This information can be used to rescore the N-best list and so improve the confidence of the recognition result.
With a trained rule set available, in step 302 the present invention also includes the following steps to re-evaluate and to measure the probability of each dialogue context type in the current dialogue turn. Referring to Fig. 7, first, as shown in step 701, the dialogue log of the previous n turns is applied to the rules of the rule set one by one. That is to say, among the rules of the rule set, those rules are found whose context types for the previous n dialogue turns (that is, M_1M_2M_3...M_n) match the dialogue log of the previous n turns.
Then, as shown in step 702, all matching rules are grouped by the context type of the current dialogue turn (that is, R), and the confidence score of each dialogue context type is calculated. The confidence score of a type is the sum of the confidence scores of all rules whose result is that type and which match the dialogue log of the previous n turns. From the confidence score of each dialogue context type, the probability of that type can be found: the higher the confidence score, the higher the probability.
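Steps 701 and 702 amount to a filter followed by a grouped sum, which might be sketched as below. The function name `score_types`, the tuple layout (context, result, confidence) and the sample rules are assumptions made for illustration; the text does not say how the summed scores are turned into probabilities, so the sketch stops at the sums.

```python
from collections import defaultdict

def score_types(rules, history):
    """Sum the confidence of every rule matching `history`, grouped by result R."""
    totals = defaultdict(float)
    for context, result, confidence in rules:
        hit = len(context) == len(history) and all(
            c == "#" or c == h for c, h in zip(context, history)
        )
        if hit:                      # step 701: the rule matches the previous n turns
            totals[result] += confidence   # step 702: accumulate per result type
    return dict(totals)

# Hypothetical rules: (context of previous turns, result R, confidence I).
rules = [("XXXQ", "S", 40.0), ("XX#Q", "S", 25.0),
         ("XXXQ", "Q", 12.0), ("VYQS", "N", 30.0)]
print(score_types(rules, "XXXQ"))
```

Here two matching rules predict type S and one predicts type Q, so S receives 65.0 and Q receives 12.0, while the non-matching rule contributes nothing.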
Feeding this confidence score information into the automatic speech recognition system can reduce the speech recognition error rate. There are two ways to do so. The first is to provide the confidence score information to the automatic speech recognition system, so that it has more information to work with and produces a more accurate N-best list. The second is to post-process the output of the automatic speech recognition system: the scores of the system's original N-best list are adjusted using the confidence score information or the rules of higher probability, thereby improving recognition accuracy.
In step 303, the recognition result output by the automatic speech recognition system is fed back to the dialogue log, and the rules of the rule set can again be adjusted dynamically through the reward/punishment element 237.
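The text does not specify how the reward/punishment element changes the rules. Purely as an assumed illustration, one simple scheme raises the confidence of matching rules that predicted the observed type and lowers the confidence of matching rules that predicted another type; the function name `adjust`, the fixed `step` and the floor at zero are all hypothetical.

```python
def adjust(rules, history, observed, step=1.0):
    """Reward matching rules whose result equals `observed`; penalise the others.
    Rules that do not match `history` are left untouched."""
    updated = []
    for context, result, confidence in rules:
        hit = len(context) == len(history) and all(
            c == "#" or c == h for c, h in zip(context, history)
        )
        if hit:
            confidence += step if result == observed else -step
            confidence = max(confidence, 0.0)   # assumed floor, not from the text
        updated.append((context, result, confidence))
    return updated

rules = [("XXXQ", "S", 40.0), ("XXXQ", "Q", 12.0), ("VYQS", "N", 30.0)]
print(adjust(rules, "XXXQ", "S"))
```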
Taking a segment of dialogue from the dialogue log as an example, with the dialogue context types defined as in Fig. 5, the practical operation flow of the present invention is illustrated below. Without loss of generality, in this example each rule is expressed by the context types of the previous 4 dialogue turns; in other words, each rule is written M_1M_2M_3M_4: R, I.
Fig. 8 is a schematic diagram of this segment of dialogue and its corresponding dialogue context types, where U denotes the user and S denotes a speech recognition system.
After the dialogue log of Fig. 8 is received, Figs. 9A to 9D illustrate how the rule set is generated by the massively parallel evolutionary computation. Fig. 9A is an example of a rule collection produced at random. Figs. 9B to 9D show the rule sets trained after 100, 200 and 10000 generations of evolutionary adjustment respectively; in each case only the first 30 of the 300 rules in the rule set are listed.
Then the dialogue history of the preceding dialogue turns is applied to the rules of the rule set in Fig. 9D one by one, and the confidence score and probability of each dialogue context type in the current dialogue turn are analyzed. The result is shown in Fig. 10.
Without loss of generality, the dialogue history at the 39th dialogue turn is as follows:
User: "What did you do yesterday morning?"
System: "what did nothing yesterday morning"
From the dialogue history of the 39th turn above, it can be seen that the system's answer is faulty.
Through the rule trigger 205, the dialogue history of the previous 4 dialogue turns is applied to the rules of the rule set one by one, to find the rules whose context types for the previous 4 dialogue turns, that is M_1M_2M_3M_4, match. In this example, the matching M_1M_2M_3M_4 found in the rule set is XXXQ. The rules with context XXXQ are then grouped by the context type of the current dialogue turn (that is, R), and the score or probability of each dialogue context type in the current dialogue turn is calculated.
In this example, the probability is calculated for each of the 9 dialogue context types defined in Fig. 5. As shown in Fig. 10, the probability of context type "Q" is 0.32, of "Y" 0.12, of "N" 0.03, of "y" 0.21, of "n" 0.04, of "S" 0.89, of "V" 0.31, of "C" 0.25, and of "X" 0.
Because context type "S" has the highest probability, the rule XXXQ:S is the most probable; in other words, the sentence pattern of the system's answer should be a statement.
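The selection in this example is a plain maximum over the Fig. 10 values quoted in the text. The short Python check below uses only those nine numbers; the variable names are illustrative.

```python
# Probabilities of the nine context types as quoted from Fig. 10.
fig10 = {"Q": 0.32, "Y": 0.12, "N": 0.03, "y": 0.21, "n": 0.04,
         "S": 0.89, "V": 0.31, "C": 0.25, "X": 0.0}

# The most probable type for the current turn, given the context XXXQ.
best = max(fig10, key=fig10.get)
print(best, fig10[best])
```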
Finally, the rule XXXQ:S can be combined by weighted sum with the original N-best list of the automatic speech recognition system, so as to find in the N-best list the answer whose sentence pattern matches best, for example "I did nothing yesterday morning.", thereby improving the reliability of the score evaluation of the N-best list. Of course, the rule XXXQ:S can also be used for post-processing in the automatic speech recognition system, that is, to adjust the scores of the N-best list directly and so improve recognition.
The scores of the rule set adjusted by the present invention are combined by weighted sum with the original N-best list of the automatic speech recognition system. This takes the relation of context types between dialogue turns into account and therefore increases the reliability of the score evaluation of the N-best list.
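The weighted sum might look like the following sketch. Everything specific in it is an assumption: the function name `rescore`, the equal weighting of 0.5, the recognizer scores, and the idea that each hypothesis arrives already labelled with a context type (for instance by a sentence-pattern classifier). Only the two type probabilities, 0.32 for Q and 0.89 for S, come from Fig. 10.

```python
def rescore(n_best, type_scores, weight=0.5):
    """Sort hypotheses by weight * ASR score + (1 - weight) * context score."""
    rescored = [
        (text, weight * asr + (1.0 - weight) * type_scores.get(ctx_type, 0.0))
        for text, ctx_type, asr in n_best
    ]
    return sorted(rescored, key=lambda item: item[1], reverse=True)

# Hypothetical N-best list: (text, assumed context type, assumed ASR score).
n_best = [
    ("what did nothing yesterday morning", "Q", 0.60),
    ("I did nothing yesterday morning.", "S", 0.55),
]
type_scores = {"Q": 0.32, "S": 0.89}
print(rescore(n_best, type_scores)[0][0])
```

With these numbers the statement-type hypothesis overtakes the one the recognizer originally ranked first, which is the effect the text describes.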
The above is only the preferred embodiment of the invention and does not limit the scope in which the invention may be implemented. All equivalent changes and modifications made according to the claims of the present invention still fall within the scope covered by this patent.

Claims (17)

1. A device for reducing recognition errors through context relations among dialogue turns, the device comprising:
a rule storage unit holding a rule set, the rule set consisting of one or more rules, the information of each rule being expressed in units of dialogue turns, and the information of each rule of the rule set comprising the context types of a sequence of preceding dialogue turns, the context type of the current dialogue turn, and the confidence measure corresponding to the rule;
an evolutionary rule generation module, connected to the rule storage unit, which evolves and adjusts from a dialogue log to train the rule set; and
a rule trigger, connected to the rule storage unit, which, according to the trained rule set and the dialogue history of one or more previous dialogue turns, selects from the trained rule set at least one rule and its corresponding confidence measure, for an automatic speech recognition system to re-evaluate its speech recognition.
2. The device for reducing recognition errors through context relations among dialogue turns as claimed in claim 1, wherein the re-evaluated result is fed back into the dialogue log, and the device further adjusts the rule set through a reward/punishment element.
3. The device for reducing recognition errors through context relations among dialogue turns as claimed in claim 1, wherein the representation of each rule of the rule set includes at least the context relation among the dialogue turns.
4. The device for reducing recognition errors through context relations among dialogue turns as claimed in claim 1, wherein the information of each rule of the rule set has one or more different dialogue context types.
5. The device for reducing recognition errors through context relations among dialogue turns as claimed in claim 1, wherein the confidence measure corresponding to the rule is the confidence score of the rule.
6. The device for reducing recognition errors through context relations among dialogue turns as claimed in claim 4, wherein the information of each rule of the rule set further includes a representation of a universal type, the universal type standing for any of the one or more different context types.
7. The device for reducing recognition errors through context relations among dialogue turns as claimed in claim 1, wherein the evolutionary rule generation module comprises three operators, namely rule variation, rule evaluation and rule filtering.
8. A method for reducing recognition errors through context relations among dialogue turns, the method comprising the following steps:
analyzing a dialogue history by a massively parallel evolutionary computation to train a rule set, the rule set consisting of one or more rules, the information of each rule being expressed in units of dialogue turns, the rule set describing the context relations among one or more dialogue turns, and the information of each rule of the rule set comprising the context types of a sequence of preceding dialogue turns, the context type of the current dialogue turn, and the confidence measure corresponding to the rule;
re-evaluating according to the rule set and the recognition result originally produced by an automatic speech recognition system, and measuring the confidence of the re-evaluated speech recognition; and
for each successful dialogue turn, dynamically adjusting the rule set.
9. The method for reducing recognition errors through context relations among dialogue turns as claimed in claim 8, wherein the step of training the rule set further comprises:
producing a rule collection at random; and
applying to the random rule collection the three operators of an evolutionary computation, comprising rule variation, rule evaluation and rule filtering, and training the rule set through generation-by-generation evolutionary adjustment.
10. The method for reducing recognition errors through context relations among dialogue turns as claimed in claim 8, wherein the rule set describes the context relations among one or more dialogue turns through the following steps:
defining the attributes of the dialogue content of the one or more dialogue turns as one or more dialogue context types; and
representing each rule with the notation M_1M_2M_3...M_n: R, I, wherein M_1M_2M_3...M_n denotes the context types of the previous n dialogue turns, R denotes the context type of the current dialogue turn, and I denotes the confidence measure corresponding to the rule.
11. The method for reducing recognition errors through context relations among dialogue turns as claimed in claim 9, wherein the rule variation means that each rule has a probability of becoming another new rule by one of mutation and combination.
12. The method for reducing recognition errors through context relations among dialogue turns as claimed in claim 9, wherein the rule evaluation means evaluating the confidence measure of each rule.
13. The method for reducing recognition errors through context relations among dialogue turns as claimed in claim 9, wherein the rule filtering comprises the following steps:
keeping a predetermined proportion of the rules;
producing new rules at random or by variation of existing rules;
finding equivalent rules and deleting the more general rule among them; and
if any rule has been deleted, returning to the step of producing new rules.
14. The method for reducing recognition errors through context relations among dialogue turns as claimed in claim 8, wherein the step of re-evaluating and measuring the confidence of the re-evaluated speech recognition further comprises the following steps:
applying the dialogue log of a plurality of preceding turns to each rule of the rule set, to find the rules whose context types for the preceding dialogue turns match the dialogue log of the preceding turns; and
among all the matching rules, grouping by the context type of the current dialogue turn of the matching rules, and calculating the confidence score information of each dialogue context type.
15. The method for reducing recognition errors through context relations among dialogue turns as claimed in claim 14, wherein, after the confidence score information is calculated, the step of re-evaluating and measuring the confidence of the re-evaluated speech recognition further comprises: providing the confidence score information to the automatic speech recognition system.
16. The method for reducing recognition errors through context relations among dialogue turns as claimed in claim 15, wherein the confidence score information gives the automatic speech recognition system more information, so that it produces a more accurate N-best list.
17. The method for reducing recognition errors through context relations among dialogue turns as claimed in claim 15, wherein the confidence score information is provided to the automatic speech recognition system for post-processing, and the scores of the original N-best list of the automatic speech recognition system are adjusted by the confidence score information.
CN2007100870226A 2007-03-14 2007-03-14 Device and method for reducing recognition error via context relation in dialog bouts Active CN101266793B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2007100870226A CN101266793B (en) 2007-03-14 2007-03-14 Device and method for reducing recognition error via context relation in dialog bouts

Publications (2)

Publication Number Publication Date
CN101266793A CN101266793A (en) 2008-09-17
CN101266793B true CN101266793B (en) 2011-02-02

Family

ID=39989145

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2007100870226A Active CN101266793B (en) 2007-03-14 2007-03-14 Device and method for reducing recognition error via context relation in dialog bouts

Country Status (1)

Country Link
CN (1) CN101266793B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105659316A (en) * 2013-11-25 2016-06-08 三菱电机株式会社 Conversation control device and conversation control method
CN104679826B (en) * 2015-01-09 2019-04-30 北京京东尚科信息技术有限公司 The method and system of context identification based on disaggregated model
US10049666B2 (en) * 2016-01-06 2018-08-14 Google Llc Voice recognition system
CN108182942B (en) * 2017-12-28 2021-11-26 瑞芯微电子股份有限公司 Method and device for supporting interaction of different virtual roles
CN111048074A (en) * 2019-12-25 2020-04-21 出门问问信息科技有限公司 Context information generation method and device for assisting speech recognition

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1637740A (en) * 2003-11-20 2005-07-13 阿鲁策株式会社 Conversation control apparatus, and conversation control method
CN1842788A (en) * 2004-10-08 2006-10-04 松下电器产业株式会社 Dialog supporting apparatus

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
JP特开2003-223185A 2003.08.08
Rebecca Jonson.DIALOGUE CONTEXT-BASED RE-RANKING OF ASR HYPOTHESES.《Spoken Language Technology Workshop,2006.IEEE》.2006,174-177. *

Also Published As

Publication number Publication date
CN101266793A (en) 2008-09-17

Similar Documents

Publication Publication Date Title
US7890329B2 (en) Apparatus and method to reduce recognition errors through context relations among dialogue turns
CN110032623B (en) Method and device for matching question of user with title of knowledge point
CA2508946C (en) Method and apparatus for natural language call routing using confidence scores
CN104903954B (en) The speaker verification distinguished using the sub- phonetic unit based on artificial neural network and identification
EP0960417B1 (en) Method of determining model-specific factors for pattern recognition, in particular for speech patterns
CN110517693B (en) Speech recognition method, speech recognition device, electronic equipment and computer-readable storage medium
CN1211779C (en) Method and appts. for determining non-target language in speech identifying system
JPH08512148A (en) Topic discriminator
CN101266793B (en) Device and method for reducing recognition error via context relation in dialog bouts
AU2008303513A1 (en) Method and system for identifying information related to a good
CN109544104A (en) A kind of recruitment data processing method and device
CN111145733A (en) Speech recognition method, speech recognition device, computer equipment and computer readable storage medium
CN113807103B (en) Recruitment method, device, equipment and storage medium based on artificial intelligence
US20040148169A1 (en) Speech recognition with shadow modeling
US20040193894A1 (en) Methods and apparatus for modeling based on conversational meta-data
JP3496706B2 (en) Voice recognition method and its program recording medium
CN1213398C (en) Method and system for non-intrusive speaker verification using behavior model
CN111680476B (en) Method for intelligently generating service hotword recognition conversion of class text
CN115905187B (en) Intelligent proposition system oriented to cloud computing engineering technician authentication
CN113836269B (en) Chapter-level core event extraction method based on question-answering system
CN112133291B (en) Language identification model training and language identification method and related device
CN113255361B (en) Automatic voice content detection method, device, equipment and storage medium
CN113239164B (en) Multi-round dialogue flow construction method and device, computer equipment and storage medium
McDermott et al. Prototype-based MCE/GPD training for word spotting and connected word recognition
KR100382473B1 (en) Speech recognition method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant