CN106558309A

CN106558309A - Spoken dialogue strategy generation method and spoken dialogue method

Info

Publication number: CN106558309A
Application number: CN201510629197.XA
Authority: CN
Inventors: 徐为群; 任航; 赵学敏; 颜永红
Original assignee: Institute of Acoustics CAS; Beijing Kexin Technology Co Ltd
Current assignee: Institute of Acoustics CAS; Beijing Kexin Technology Co Ltd
Priority date: 2015-09-28
Filing date: 2015-09-28
Publication date: 2017-04-05
Anticipated expiration: 2035-09-28
Also published as: CN106558309B

Abstract

The invention provides a spoken dialogue strategy generation method comprising: step S1) collecting real human-machine dialogue data samples using the Wizard-of-Oz approach, in which a person stands in for the machine; step S2) building a virtual user from the dialogue data samples, based on an agenda-based user simulation model, to simulate real user behavior; step S3) adding noise to the semantic information of the virtual user to build a noisy channel; step S4) building a dialogue strategy template according to the semantic information of the virtual user; step S5) extracting the free parameters contained in all conditional statements of the dialogue strategy template to form a parameter vector, and feeding it to a genetic algorithm for optimization to obtain an optimal solution; step S6) assigning the optimal solution to the dialogue strategy template to obtain the dialogue strategy. Compared with existing purely hand-crafted dialogue strategies, the strategy generated by this method is more robust to noise. In addition, the strategy language defined by the invention is easy for staff to edit and maintain, which makes it better suited to commercial environments with strict requirements on system behavior.

Description

Spoken dialogue strategy generation method and spoken dialogue method

Technical field

The present invention relates to the field of spoken interaction, and in particular to a spoken dialogue strategy generation method and a spoken dialogue method.

Background

With the rapid spread of the mobile Internet and smart terminals, spoken dialogue systems have come into ever wider use. Spoken interaction is one of the most natural and convenient modes of human-machine interaction and is widely used in voice assistants on mobile devices, automated customer service, automatic information inquiry, and similar systems. Compared with traditional input methods such as keyboard and mouse, a spoken dialogue system has a lower barrier to use and a broader potential user base, and it is better suited to human-machine interaction in special environments such as in-vehicle and wearable devices. Spoken dialogue management is the central module of a spoken dialogue system. Its main function is to receive semantic information from the speech understanding module and output a system dialogue action, which the language generation module then converts into natural language text and the speech synthesis module finally converts into speech returned to the user.

In real systems, the outputs of speech recognition and speech understanding inevitably contain some noise, which is especially severe in noisy environments. The dialogue system therefore has to allow for errors: it needs to infer the true dialogue state with the help of the dialogue history and, depending on the current state, proactively confirm information with the user in a timely manner. The main challenge in building the dialogue management module is how to track the dialogue state accurately and select the most suitable dialogue action for the current state. In current commercial systems, dialogue strategies are mainly built and maintained by hand by domain experts, which is cumbersome. In research, strategies are mostly optimized automatically with reinforcement-learning-based models and methods, which yields better robustness; however, the trained strategy is a numerical model that is hard to adjust and verify by hand, so it is rarely used in commercial environments at present.

Summary of the invention

The object of the invention is to overcome the above defects of dialogue strategy generation in current spoken dialogue systems by proposing a genetic-algorithm-based method for generating spoken dialogue strategies. The method incorporates a large amount of domain experts' heuristic knowledge, learns the strategy automatically in a data-driven manner, and has strong noise robustness.

To achieve the above object, the invention provides a spoken dialogue strategy generation method, the method comprising:

Step S1) collecting real human-machine dialogue data samples using the approach in which a person stands in for the machine;

Step S2) building a virtual user from the dialogue data samples, based on an agenda-based user simulation model, to simulate real user behavior;

Step S3) adding noise to the semantic information of the virtual user to build a noisy channel;

Step S4) building a dialogue strategy template according to the semantic information of the virtual user;

Step S5) extracting the free parameters contained in all conditional statements of the dialogue strategy template to form a parameter vector, and feeding it to a genetic algorithm for optimization to obtain an optimal solution;

Step S6) assigning the optimal solution to the dialogue strategy template to obtain the dialogue strategy.

In the above technical solution, the detailed process of step S3) is:

The noise parameters are set according to the behavior patterns of the actual recognition and understanding systems being emulated. The noisy channel receives correct semantic information and outputs semantic information containing errors: during simulation it randomly alters the slot-filling information and the dialogue action in the semantics, imitating real understanding or recognition errors.

In the above technical solution, the detailed process of step S4) is:

The dialogue strategy template specifies the basic structure of the dialogue strategy, which is expressed as a series of "condition-action" statements. The condition part expresses a constraint on the current dialogue state and contains threshold parameters; if a condition is met, the corresponding action is output. The form is as follows:

if(condition(θ₁))then(action₁)

else if(condition(θ₂))then(action₂)

…

else then(action_n)

Here θ=(θ₁…θ_n) are the free parameters to be determined; assigning a complete set of free parameters to the strategy template yields a complete dialogue strategy;

The dialogue strategy defines the basic behavior of a dialogue manager. During parameter training, the dialogue manager interacts, through the noisy channel of step S3), with the simulated user built in step S2). The order of the condition-action statements in the template corresponds to a priority relationship: the first conditional statement that matches is selected.

In the above technical solution, step S5) specifically comprises:

Step S5-1) formulating a fitness function;

The factors to be considered when formulating the fitness function include the dialogue success rate, the dialogue length, and the number of times the user is asked to repeat. The factors are combined as a weighted average according to their degree of influence on user experience, giving a fitness function that describes the overall user experience;

Step S5-2) selecting a set of parameter vectors as initial solutions and running genetic algorithm training with the fitness function formulated in step S5-1) until the number of iterations reaches a set limit or the performance of the best solution reaches the optimization target, then outputting the optimal solution.
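As a minimal sketch of the weighted-average fitness function of step S5-1), the following Python code combines the three factors named above. The weights, the normalization caps (`max_turns`, `max_repeats`), and the `DialogueStats` container are illustrative assumptions and are not specified by the invention.

```python
from dataclasses import dataclass


@dataclass
class DialogueStats:
    """Aggregate statistics over simulated dialogues (hypothetical container)."""
    success_rate: float  # fraction of successful dialogues, in [0, 1]
    avg_turns: float     # average dialogue length in turns
    avg_repeats: float   # average number of requests for the user to repeat


def fitness(stats: DialogueStats,
            w_success: float = 0.6,
            w_length: float = 0.25,
            w_repeat: float = 0.15,
            max_turns: float = 20.0,
            max_repeats: float = 5.0) -> float:
    """Weighted average of three normalized factors; higher is better.

    All weights and caps are assumed values chosen for illustration only.
    """
    # Shorter dialogues and fewer repeat requests are better, so invert them.
    length_score = 1.0 - min(stats.avg_turns / max_turns, 1.0)
    repeat_score = 1.0 - min(stats.avg_repeats / max_repeats, 1.0)
    total_weight = w_success + w_length + w_repeat
    return (w_success * stats.success_rate
            + w_length * length_score
            + w_repeat * repeat_score) / total_weight
```

Normalizing every factor to the range [0, 1] before weighting keeps the weights directly comparable, which is one plausible way to express each factor's influence on user experience.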

Based on the dialogue strategy generated by the above spoken dialogue strategy generation method, the invention also provides a spoken dialogue method, the method comprising:

Step T1) converting the user's speech input into recognized text by speech recognition;

Step T2) converting the recognized text into structured semantic information;

Step T3) updating the current dialogue state according to the semantic information;

Step T4) evaluating the current dialogue state according to the dialogue strategy and outputting the corresponding dialogue action;

Step T5) converting the output dialogue action into text, then converting the text into speech and outputting it to the user.
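Steps T1) to T5) can be read as a single pass through a chain of modules. The sketch below wires that chain together in Python; every component is passed in as a function, and all function and parameter names are hypothetical placeholders rather than parts of any real speech toolkit.

```python
from typing import Any, Callable, Dict, Tuple

State = Dict[str, Any]


def run_turn(audio: Any,
             state: State,
             recognize: Callable[[Any], str],
             understand: Callable[[str], Dict[str, Any]],
             update_state: Callable[[State, Dict[str, Any]], State],
             policy: Callable[[State], str],
             generate_text: Callable[[str], str],
             synthesize: Callable[[str], Any]) -> Tuple[State, Any]:
    """Process one user turn and return the new state and the system output."""
    text = recognize(audio)                 # T1: speech -> recognized text
    semantics = understand(text)            # T2: text -> structured semantics
    state = update_state(state, semantics)  # T3: update the dialogue state
    action = policy(state)                  # T4: strategy picks a dialogue action
    reply_text = generate_text(action)      # T5a: dialogue action -> text
    return state, synthesize(reply_text)    # T5b: text -> speech output


# Toy stand-ins so the pipeline can be exercised without real speech components.
def toy_understand(text: str) -> Dict[str, Any]:
    return {"act": "inform", "slots": {"price": text}}


def toy_update(state: State, semantics: Dict[str, Any]) -> State:
    return {**state, **semantics["slots"]}


def toy_policy(state: State) -> str:
    return "offer" if "price" in state else "request_price"
```

Passing the components as arguments keeps the dialogue strategy (`policy`) replaceable, so a strategy produced by the generation method could be dropped in without touching the rest of the loop.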

The advantages of the invention are:

1. Compared with existing purely hand-crafted dialogue strategies, the strategy produced by the generation method of the invention is more robust to noise;

2. In the method of the invention, the language in which the strategy is defined is easy for staff to edit and maintain, making it better suited to commercial environments with strict requirements on system behavior;

3. The generation method of the invention makes it easy to add domain expert knowledge, and allows direct migration and upgrade from rule-based dialogue strategies.

Brief description of the drawings

Fig. 1 is a flow chart of the spoken dialogue strategy generation method of the invention;

Fig. 2 is a schematic diagram of the genetic algorithm;

Fig. 3 is a schematic diagram of the spoken dialogue method of the invention.

Detailed description of the embodiments

The invention is described in further detail below with reference to the accompanying drawings and specific embodiments.

As shown in Fig. 1, a spoken dialogue strategy generation method comprises:

Step S1) collecting real human-machine dialogue data samples using the Wizard-of-Oz approach, in which a person stands in for the machine;

Step S2) building a virtual user from the dialogue data samples, based on an agenda-based user simulation model, to simulate real user behavior;

At the beginning of a dialogue, the stack is initialized according to a randomly generated user intent and the initial dialogue actions are pushed onto it. Then, in each round of dialogue interaction, the virtual user generates user actions according to the system action it receives and pushes them onto the stack, after which the user action stored at the top of the stack is popped as the user's reply for that round;
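The stack behavior described above can be sketched as follows. This is a simplified illustration in Python, not the simulator of the invention: the class name, the action names (`request`, `repeat`, `bye`), and the rules for reacting to system actions are all assumptions.

```python
from typing import Dict, List, Optional, Tuple

Action = Tuple[str, Dict[str, str]]  # (dialogue act, slot-value pairs)


class AgendaUser:
    """Toy agenda-based virtual user; the top of the stack is the list's end."""

    def __init__(self, goal: Dict[str, str]) -> None:
        self.goal = dict(goal)
        self.last: Optional[Action] = None
        # Initialize the stack from the user goal: the closing action sits at
        # the bottom, one inform action per goal slot is stacked above it.
        self.agenda: List[Action] = [("bye", {})]
        for slot, value in reversed(list(self.goal.items())):
            self.agenda.append(("inform", {slot: value}))

    def respond(self, system_action: Action) -> Action:
        """Push actions triggered by the system action, then pop the reply."""
        act, slots = system_action
        if act == "request" and slots.get("slot") in self.goal:
            slot = slots["slot"]
            answer: Action = ("inform", {slot: self.goal[slot]})
            # Move the requested slot to the top so it is answered first.
            self.agenda = [a for a in self.agenda if a != answer]
            self.agenda.append(answer)
        elif act == "repeat" and self.last is not None:
            # The system asked for a repetition: say the last action again.
            self.agenda.append(self.last)
        self.last = self.agenda.pop() if self.agenda else ("bye", {})
        return self.last
```

For example, a user with the goal `{"location": "Zhongguancun", "price": "cheap"}` first informs the location, repeats it if the system asks for a repetition, and says `bye` once the stack is exhausted.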

The semantic information has the form: dialogue action + semantic slot filling. For example, "I want to find a 'cheap' restaurant near 'Zhongguancun'" can be represented as inform(location=Zhongguancun, price=cheap). At the beginning of a dialogue, a dialogue goal is generated at random according to the semantic information. For example, if the dialogue goal is to "find a 'cheap' 'Sichuan restaurant' near 'Zhongguancun'", the three semantic slots 'location', 'price', and 'taste' need to be filled in this dialogue.
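To make the "dialogue action + slot filling" form concrete, the sketch below models it as a small Python data structure. The class name `Semantics` and the helper `unfilled_slots` are hypothetical names introduced only for this illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Semantics:
    """One semantic frame: a dialogue act plus its slot-value pairs."""
    act: str
    slots: Dict[str, str] = field(default_factory=dict)

    def __str__(self) -> str:
        args = ", ".join(f"{slot}={value}" for slot, value in self.slots.items())
        return f"{self.act}({args})"


def unfilled_slots(goal: Dict[str, str], filled: Dict[str, str]) -> List[str]:
    """Return the goal slots that have not been filled yet, in goal order."""
    return [slot for slot in goal if slot not in filled]


# The restaurant example: two slots are informed, one goal slot is still open.
frame = Semantics("inform", {"location": "Zhongguancun", "price": "cheap"})
goal = {"location": "Zhongguancun", "price": "cheap", "taste": "Sichuan"}
remaining = unfilled_slots(goal, frame.slots)
```

Here `str(frame)` renders the frame in the same notation as the example in the text, and `remaining` lists the goal slot that the user still has to provide.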

Step S3) adding noise to the semantic information of the virtual user to build a noisy channel;

The noisy channel applies random deletion and modification operations to the true semantic information and generates an N-best list with a confidence score for each entry. After passing through the noisy channel, the information in the example above may become {semantics=[inform(location=Zhongguancun, price=cheap)], score=0.7} or {semantics=[inform(location=Guomao)], score=0.2}.

For the noisy channel, the noise parameters should be set according to the behavior patterns of the actual recognition and understanding systems being emulated. The noisy channel receives correct semantic information and outputs semantic information containing errors, represented in N-best form: during simulation it randomly alters the slot-filling information and the dialogue action in the semantics, imitating real understanding or recognition errors.
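A minimal Python sketch of such a noisy channel is given below. It corrupts only the slot values (the dialogue action is left intact for brevity), and the deletion and modification probabilities, the scoring scheme, and all names are assumptions chosen for illustration; in practice they would be fitted to the behavior of the real recognizer and understanding module.

```python
import random
from typing import Dict, List, Tuple

Hypothesis = Tuple[str, Dict[str, str]]  # (dialogue act, slot-value pairs)


def noisy_channel(act: str,
                  slots: Dict[str, str],
                  values: Dict[str, List[str]],
                  rng: random.Random,
                  p_delete: float = 0.1,
                  p_modify: float = 0.2,
                  n_best: int = 3) -> List[dict]:
    """Corrupt a correct semantic frame and return a scored N-best list."""
    hypotheses: List[Hypothesis] = []
    for _ in range(n_best):
        noisy: Dict[str, str] = {}
        for slot, value in slots.items():
            r = rng.random()
            if r < p_delete:
                continue  # deletion error: the slot is lost
            if r < p_delete + p_modify:
                # Modification error: swap in another known value for the slot.
                alternatives = [v for v in values.get(slot, []) if v != value]
                if alternatives:
                    value = rng.choice(alternatives)
            noisy[slot] = value
        hypotheses.append((act, noisy))
    # Assumed scoring: random confidences, sorted and normalized to sum to 1.
    raw = sorted((rng.random() for _ in hypotheses), reverse=True)
    total = sum(raw) or 1.0
    return [{"semantics": hyp, "score": score / total}
            for hyp, score in zip(hypotheses, raw)]
```

Seeding the `random.Random` instance makes a simulation run reproducible, which helps when comparing the fitness of different parameter vectors under the same noise.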

if(condition(θ₁))then(action₁)

else if(condition(θ₂))then(action₂)

…

else then(action_n)

Here θ=(θ₁…θ_n) are the free parameters to be determined; assigning a complete set of free parameters to the strategy template yields a complete dialogue strategy. The dialogue strategy defines the basic behavior of a dialogue manager. During parameter training, the dialogue manager interacts, through the noisy channel of step S3), with the simulated user built in step S2).

The template makes it easy to express the heuristic knowledge of domain experts, and can be obtained by rewriting a dialogue strategy expressed as rules. A typical condition-action statement is: "if the semantic confidence score of the best input hypothesis is lower than θ₁, issue the system action 'please repeat' to the user". The order of the condition-action statements in the template corresponds to a priority relationship: the first conditional statement that matches is selected.
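The priority-ordered template can be sketched in Python as an ordered list of rules that is scanned from top to bottom. Only the "please repeat" rule comes from the text; the second rule, the default action, the action names, and the state key `top_score` are assumptions added to show how priority works.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, float]
Rule = Tuple[Callable[[State, List[float]], bool], str]

# Ordered condition-action rules; earlier rules have higher priority.
RULES: List[Rule] = [
    # Rule from the text: low confidence on the best hypothesis -> ask to repeat.
    (lambda state, theta: state["top_score"] < theta[0], "request_repeat"),
    # Hypothetical second rule: medium confidence -> ask for confirmation.
    (lambda state, theta: state["top_score"] < theta[1], "confirm"),
]
DEFAULT_ACTION = "accept"  # hypothetical final "else" branch


def select_action(state: State, theta: List[float]) -> str:
    """Return the action of the first rule whose condition matches."""
    for condition, action in RULES:
        if condition(state, theta):
            return action
    return DEFAULT_ACTION
```

With `theta = [0.3, 0.6]`, a best-hypothesis score of 0.2 triggers `request_repeat` even though the second condition also holds, because the first matching rule wins. The thresholds in `theta` are exactly the free parameters that the genetic algorithm would tune.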

As shown in Fig. 2, the genetic algorithm is an iterative optimization algorithm. In each iteration, all candidate solutions form one generation of the "population"; outstanding individuals are selected and "bred" by means such as "mutation" and "crossover" to produce the individuals of the next generation. Individuals that perform better thus have a higher probability of passing their characteristics on to the next generation, and the overall performance of each generation is generally better than that of its parents. The two breeding methods work as follows: mutation applies a random perturbation to a parent individual to produce a new individual, while crossover randomly selects two complementary parts from two parent individuals and splices them together to form a new individual.
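The two breeding operators can be sketched in Python for parameter vectors stored as lists of floats. The Gaussian perturbation, the clipping of values to [0, 1] (reasonable if the parameters are confidence thresholds), and single-point crossover are illustrative choices; the text only requires a random perturbation and the splicing of complementary parts.

```python
import random
from typing import List

Vector = List[float]


def mutate(parent: Vector, rng: random.Random, sigma: float = 0.05) -> Vector:
    """Perturb every gene with Gaussian noise, clipped to [0, 1]."""
    return [min(1.0, max(0.0, gene + rng.gauss(0.0, sigma))) for gene in parent]


def crossover(parent_a: Vector, parent_b: Vector, rng: random.Random) -> Vector:
    """Splice the head of one parent onto the complementary tail of the other."""
    if len(parent_a) != len(parent_b) or len(parent_a) < 2:
        raise ValueError("parents must have equal length of at least 2")
    cut = rng.randint(1, len(parent_a) - 1)  # cut point strictly inside the vector
    return parent_a[:cut] + parent_b[cut:]
```

Because the cut point lies strictly inside the vector, the child always inherits at least one parameter from each parent.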

Step S5) specifically comprises:

Step S5-1) formulating a fitness function;

The fitness function is used to evaluate the quality of each individual in every generation during genetic algorithm optimization. The most important factor determining the quality of a dialogue strategy is the dialogue success rate. In a typical task-driven dialogue, a dialogue is considered successful when the dialogue system completes the operation specified by the user and correctly provides the information the user requested; otherwise it is considered a failure.

Step S5-2) selecting a set of parameter vectors as initial solutions and running genetic algorithm training with the fitness function formulated in step S5-1) until the number of iterations reaches a set limit or the performance of the best solution reaches the optimization target, then outputting the optimal solution;

In each iteration, the next generation is produced using the two breeding methods "mutation" and "crossover". When selecting the parent individuals that take part in breeding, individuals with higher fitness should be preferred, so that the good characteristics of the parents are passed on to the offspring and the overall performance of the offspring improves.
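The training loop of step S5-2) might look like the following Python sketch. Tournament selection is used here as one common way to prefer fitter parents, and keeping the best individual unchanged (elitism) is an added convenience; neither is prescribed by the text. Population size, generation limit, and mutation strength are assumed values.

```python
import random
from typing import Callable, List, Optional, Tuple

Vector = List[float]


def tournament(pop: List[Vector], scores: List[float],
               rng: random.Random, k: int = 3) -> Vector:
    """Pick the fittest of k randomly drawn individuals."""
    contenders = rng.sample(range(len(pop)), k)
    return pop[max(contenders, key=lambda i: scores[i])]


def evolve(fitness: Callable[[Vector], float], dim: int, rng: random.Random,
           pop_size: int = 20, generations: int = 30,
           target: Optional[float] = None,
           sigma: float = 0.1) -> Tuple[Vector, float]:
    """Run the genetic algorithm; stop at the iteration limit or the target."""
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    best: Vector = list(pop[0])
    best_score = float("-inf")
    for _ in range(generations):
        scores = [fitness(ind) for ind in pop]
        top = max(range(pop_size), key=lambda i: scores[i])
        if scores[top] > best_score:
            best, best_score = list(pop[top]), scores[top]
        if target is not None and best_score >= target:
            break  # optimization target reached
        children = [list(best)]  # elitism: carry the best individual over
        while len(children) < pop_size:
            a = tournament(pop, scores, rng)
            if dim > 1 and rng.random() < 0.5:
                # Crossover: head of one parent plus tail of another.
                b = tournament(pop, scores, rng)
                cut = rng.randint(1, dim - 1)
                child = a[:cut] + b[cut:]
            else:
                # Mutation: Gaussian perturbation clipped to [0, 1].
                child = [min(1.0, max(0.0, g + rng.gauss(0.0, sigma))) for g in a]
            children.append(child)
        pop = children
    return best, best_score
```

In the setting of the invention, `fitness` would run simulated dialogues between the parameterized strategy and the virtual user through the noisy channel; a cheap analytic function can stand in for it when checking that the loop itself behaves.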

For a parameter vector θ to be evaluated, it is assigned to the strategy template to obtain a concrete dialogue strategy, which is placed in the simulation environment to carry out N rounds of interaction with the virtual user. If N_SUC of these rounds are successful dialogues, the dialogue success rate can be expressed as:

F=N_SUC/N
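As a small illustration of how the dialogue success rate could be estimated in simulation, the Python sketch below counts the successful dialogues among N simulated ones and divides by N. The `run_dialogue` callable is a hypothetical stand-in for one complete simulated dialogue that reports whether it succeeded.

```python
from typing import Callable


def success_rate(run_dialogue: Callable[[], bool], n_dialogues: int) -> float:
    """Fraction of simulated dialogues that end successfully."""
    if n_dialogues <= 0:
        raise ValueError("n_dialogues must be positive")
    # Count the successful dialogues among the n_dialogues simulated ones.
    n_success = sum(1 for _ in range(n_dialogues) if run_dialogue())
    return n_success / n_dialogues
```

The result always lies between 0 and 1, so it can be used directly as a fitness value or as one term of a weighted fitness function.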

Based on the dialogue strategy generated by the above spoken dialogue strategy generation method, the invention also provides a spoken dialogue method;

As shown in Fig. 3, the method specifically comprises:

Claims

1. A spoken dialogue strategy generation method, the method comprising:

2. The spoken dialogue strategy generation method according to claim 1, characterized in that the detailed process of step S3) is:

3. The spoken dialogue strategy generation method according to claim 1, characterized in that the detailed process of step S4) is:

if(condition(θ₁))then(action₁)

else if(condition(θ₂))then(action₂)

...

else then(action_n)

Here θ=(θ₁...θ_n) are the free parameters to be determined; assigning a complete set of free parameters to the strategy template yields a complete dialogue strategy;

The dialogue strategy defines the basic behavior of a dialogue manager. During parameter training, the dialogue manager interacts, through the noisy channel of step S3), with the simulated user built in step S2). The order of the condition-action statements in the dialogue strategy template corresponds to a priority relationship: the first conditional statement that matches is selected.

4. The spoken dialogue strategy generation method according to claim 1, characterized in that step S5) specifically comprises:

Step S5-1) formulating a fitness function;

The factors to be considered when formulating the fitness function include the dialogue success rate, the dialogue length, and the number of times the user is asked to repeat. The factors are combined as a weighted average according to their degree of influence on user experience, giving a fitness function that describes the overall user experience;

5. A spoken dialogue method, implemented on the basis of the dialogue strategy generated by the spoken dialogue strategy generation method according to any one of claims 1-4, the method comprising: