CN106649825A

CN106649825A - Voice interaction system, establishment method and device thereof

Info

Publication number: CN106649825A
Application number: CN201611247830.XA
Authority: CN
Inventors: 曾永梅; 李波; 朱频频
Original assignee: Shanghai Zhizhen Intelligent Network Technology Co Ltd
Current assignee: Shanghai Zhizhen Intelligent Network Technology Co Ltd
Priority date: 2016-12-29
Filing date: 2016-12-29
Publication date: 2017-05-10
Anticipated expiration: 2036-12-29
Also published as: CN106649825B

Abstract

The invention provides a voice interaction system and a method and device for establishing it. The method for establishing the voice interaction system comprises: receiving a voice user interaction flow diagram, which includes a plurality of flows that transition according to a preset process; establishing a knowledge base based on the plurality of flows, wherein the plurality of flows include a first flow and a second flow located downstream of the first flow, the answer of a first knowledge point corresponding to the first flow is an interrogative-sentence-type answer, and the question of a second knowledge point corresponding to the second flow is a response to the interrogative-sentence-type answer of the first knowledge point; providing a language model for performing voice recognition on the voice input of a user; and providing the knowledge points in the knowledge base for performing semantic recognition on the acquired voice recognition result. By establishing a knowledge base, transitions between flows are realized through the matching of knowledge points in the knowledge base, which reduces the implementation difficulty.

Description

Voice interactive system and its creation method and device

Technical field

The present invention relates to the technical field of human-computer interaction, and more particularly to a voice interaction system and a method and device for creating a voice interaction system.

Background

Human-computer interaction is the science that studies the interactive relationship between a system and its users. The system may be any of various machines, or a computerized system or software. For example, various artificial intelligence systems, such as intelligent customer service systems and voice control systems, can be realized through human-computer interaction. Artificial intelligence semantic recognition is the basis of human-computer interaction: it recognizes human language and converts it into a language that a machine can understand.

An intelligent question answering system is a typical application of human-computer interaction: after the user poses a question, the system provides the answer to it. A voice interaction system is a special kind of intelligent question answering system in which the user's question is input in the form of speech. In a voice interaction system, therefore, the user's question in speech form (the voice input) must first be recognized as a user question in text form; the question is then understood through the semantic parsing process described above, and a corresponding answer is given.

Traditionally, a voice interaction system is designed on the basis of a voice user interaction flow chart provided by the client, and the implementers write the corresponding VoiceXML to realize the understanding of user semantics and the subsequent processing flow. VoiceXML is a markup language for voice browsing built on the XML specification. VoiceXML can be used to build web-based voice applications and services.

Under this traditional design approach, a grammar is written for the corpus to be recognized in order to produce a language model, and the corpus to be understood is then sorted into classes in order to generate a semantic model. The language model and the semantic model are then loaded into the voice interaction system, and the vxml (Voice Extensible Markup Language) corresponding to each semantic class is written. The language model is used to recognize the user's voice input and convert it into user input in text form. The semantic model is used to understand the meaning of that text-form input in order to determine the subsequent flow. For example, suppose the class tag for bill inquiry is bill. The vxml must then state that when the semantic analysis result is bill, the corresponding flow is followed; for example, the next prompt is "Which month's bill would you like to query?". The system then waits for the user's input and recognizes it again; if the input is recognized as "this month", whose class tag is month, the corresponding flow continues from there.

This VoiceXML-based development approach requires dedicated developers to write the vxml, which increases the implementation difficulty. Moreover, these developers must agree on the semantic class tags with the people who write the semantic model before work can proceed, which increases communication costs. Every time a flow is added or modified, the language model, the semantic model and the vxml must be reloaded into the system, so changes cannot take effect in real time.

Summary of the invention

A brief summary of one or more aspects is given below to provide a basic understanding of these aspects. This summary is not an extensive overview of all contemplated aspects, and it is intended neither to identify key or critical elements of all aspects nor to delineate the scope of any or all aspects. Its sole purpose is to present some concepts of one or more aspects in a simplified form as a prelude to the more detailed description given later.

The present invention provides a voice interaction system and a method and device for creating it, in order to solve the problem that transitions between flows are difficult to develop and implement when a voice interaction system is created.

In a first aspect, the present invention provides a method for creating a voice interaction system, comprising:

Receiving a voice user interaction flow chart, the voice user interaction flow chart including a plurality of flows that transition according to a predetermined process;

Creating a knowledge base based on the plurality of flows, the knowledge base including a plurality of knowledge points corresponding to the plurality of flows, each knowledge point including a question and its answer,

Wherein the plurality of flows include a first flow and a second flow located downstream of the first flow, the answer of the first knowledge point corresponding to the first flow is a question-type answer, and the question of the second knowledge point corresponding to the second flow is a response to the question-type answer of the first knowledge point;

Providing a language model for performing speech recognition on the user's voice input; and

Providing the knowledge points in the knowledge base for performing semantic recognition on the obtained speech recognition result.

In a second aspect, the present invention provides a device for creating a voice interaction system, comprising:

A receiving module, for receiving a voice user interaction flow chart, the voice user interaction flow chart including a plurality of flows that transition according to a predetermined process;

A knowledge base creation module, for creating a knowledge base based on the plurality of flows, the knowledge base including a plurality of knowledge points corresponding to the plurality of flows, each knowledge point including a question and its answer;

A language model training module, for providing a language model for performing speech recognition on the user's voice input; and

A knowledge point distribution module, for providing the knowledge points in the knowledge base for performing semantic recognition on the obtained speech recognition result.

In a third aspect, the present invention provides a voice interaction system, comprising:

A knowledge base created by the above method;

A speech recognition module, for performing speech recognition on the user's voice input using the language model provided by the above method;

A semantic recognition module, for performing semantic recognition on the speech recognition result using the corresponding knowledge points in the knowledge base; and

An output module, for providing a response output to the user based on the speech recognition result.

By establishing a knowledge base, the present invention realizes transitions between flows through the matching of knowledge points in the knowledge base. This avoids the need for dedicated developers to write vxml and reduces the implementation difficulty. More importantly, compared with the traditional VoiceXML-based design, adding or deleting a flow only requires adding or deleting the corresponding knowledge point in the knowledge base; the change takes effect in real time and deployment is flexible.

Description of the drawings

The above features and advantages of the present invention will be better understood after reading the detailed description of the embodiments of the present disclosure in conjunction with the following drawings. In the drawings, components are not necessarily drawn to scale, and components with similar related characteristics or features may have the same or similar reference numerals.

Fig. 1 shows a flow chart of a method for creating a voice interaction system according to an aspect of the present invention;

Fig. 2 shows an example of a voice user interaction flow chart;

Fig. 3 shows a flow chart of a method for extending a standard question according to an aspect of the present invention;

Fig. 4 shows a block diagram of a device for creating a voice interaction system according to an aspect of the present invention;

Fig. 5 shows a block diagram of an extension unit according to another aspect of the present invention; and

Fig. 6 shows a block diagram of a voice interaction system according to an aspect of the present invention.

Detailed description of the embodiments

The present invention is described in detail below with reference to the drawings and specific embodiments. Note that the aspects described below in conjunction with the drawings and specific embodiments are merely exemplary and should not be construed as limiting the scope of protection of the present invention in any way.

In a voice interaction system, the user poses a question in the form of voice input. To answer the user's question, the background processing of the voice interaction system mainly comprises two parts: a speech recognition part and a semantic recognition part. The speech recognition part performs speech recognition on the user's voice input based on a language model, to obtain the user's question in text form. The semantic recognition part understands the text-form question based on a semantic model, so as to grasp the user's intent and then give an answer.

Speech recognition technology mainly consists of a language model training stage and a recognition stage that uses the language model. The speech recognition part described above is the recognition stage that uses the language model.

In the language model training stage, the language model is built by training on a large corpus, for example using the SRILM tool. SRILM stands for Stanford Research Institute Language Modeling Toolkit; its main goal is to support the estimation and evaluation of language models. Once the language model has been built, it is used to recognize the speech input by the user. In speech recognition, the accuracy of the language model is crucial to the recognition result: a more complete language model yields a more accurate speech recognition result.

The present invention designs a knowledge base for semantic recognition, and the knowledge base includes numerous knowledge points. The most primitive and simplest form of a knowledge point is the commonly used FAQ, whose general form is a "question-answer" pair. In the present invention, the "standard question" is the text used to represent a knowledge point; its main goal is to be clearly expressed and easy to maintain. For example, "CRBT rates" is a clearly expressed standard question. The "question" here should not be narrowly understood as an "inquiry" but broadly as an "input" that has a corresponding "output". For example, in semantic recognition for a control system, a user instruction such as "turn on the radio" should also be understood as a "question", and the corresponding "answer" may then be a call to the control program that performs the corresponding control.

Therefore, the voice interaction system of the present invention differs from a voice interaction system traditionally designed on the basis of VoiceXML in the semantic recognition part, which is a process of searching the knowledge base for a standard question based on the speech recognition result. Once a matching standard question is found, the semantics of the speech recognition result can be regarded as "understood", and the "answer" corresponding to the matched standard question can be provided to the user.

In the present invention, the matching standard question can be determined by computing the semantic similarity between the speech recognition result and all the standard questions in the knowledge base. For example, the standard question with the highest semantic similarity can be determined to be the matched standard question; the target service that the user wishes to handle can be determined from the matched standard question, and the answer associated with that standard question can then be provided to the user. For example, if the matched standard question is "CRBT rates", the answer associated with it (for example, the CRBT rate details) can be output to the user.
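The matching step described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the text does not specify how semantic similarity is computed, so a simple token-overlap (Jaccard) score stands in for it here. The names KnowledgePoint, similarity and best_match, the placeholder answer text and the second knowledge point are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class KnowledgePoint:
    standard_question: str  # text that represents the knowledge point
    answer: str             # output associated with the standard question


def similarity(a: str, b: str) -> float:
    """Stand-in for semantic similarity: Jaccard overlap of lowercase tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def best_match(recognized_text: str, knowledge_base: list) -> KnowledgePoint:
    """Return the knowledge point whose standard question scores highest."""
    return max(knowledge_base,
               key=lambda kp: similarity(recognized_text, kp.standard_question))


# Hypothetical knowledge base with two knowledge points.
knowledge_base = [
    KnowledgePoint("CRBT rates", "<CRBT rate details>"),
    KnowledgePoint("bill inquiry", "Which month's bill would you like to query?"),
]

# The text-form speech recognition result is matched against every standard question.
matched = best_match("what are the CRBT rates", knowledge_base)
print(matched.answer)
```

A real system would replace similarity with an actual semantic similarity model; only the "pick the highest-scoring standard question" structure is taken from the description.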

The knowledge base and the knowledge points in it are introduced first below.

To recognize user questions more accurately and efficiently, intelligent question answering systems have also developed the concept of abstract semantics. Abstract semantics are a further abstraction of ontology class attributes. An abstract semantic category describes the different expressions of one class of abstract semantics through a set of abstract semantic expressions; to express more abstract semantics, these abstract semantic expressions are extended in their constituent elements. Once these extended elements are assigned corresponding values, they can express a wide variety of concrete semantics.

Each abstract semantic expression mainly includes missing semantic components and semantic rule words. A missing semantic component is represented by a semantic component symbol; once the missing semantic components are filled with corresponding values (that is, content), a wide variety of concrete semantics can be expressed.

The semantic component symbols of abstract semantics may include:

[concept]: a word or phrase representing a subject or object component.

For example: "CRBT" in "how to activate CRBT"

[action]: a word or phrase representing an action component.

For example: "apply for" in "how to apply for a credit card"

[attribute]: a word or phrase representing an attribute component.

For example: "color" in "which colors does the iPhone come in"

[adjective]: a word or phrase representing a modifier component.

For example: "cheap" in "which brand of refrigerator is cheap"

Examples of some of the main abstract semantic categories are:

Concept explanation: what is [concept]

Attribute composition: which [attribute] does [concept] have

Behavior mode: [concept] how to [action]

Behavior location: where does [concept] [action]

Behavior reason: why does [concept] [action]

Behavior prediction: will [concept] [action] or not

Behavior judgment: does [concept] have [attribute] or not

Attribute status: is the [attribute] of [concept] [adjective]

Attribute judgment: whether [concept] has [attribute]

Attribute reason: why is the [attribute] of [concept] so [adjective]

Concept comparison: what is the difference between [concept1] and [concept2]

Attribute comparison: what is the difference between the [attribute] of [concept1] and [concept2]

The components of a question at the abstract semantic level can generally be identified through part-of-speech tagging: the part of speech corresponding to concept is noun, that corresponding to action is verb, that corresponding to attribute is noun, and that corresponding to adjective is adjective.

Taking the abstract semantics "[concept] how to [action]" of the category "behavior mode" as an example, the abstract semantic set of this category may include multiple abstract semantic expressions:

Abstract semantic category: behavior mode

Abstract semantic expressions:

a. [concept] [need | should] [how] <[can]> <carry out> [action]

b. {[concept]～[action]}

c. [concept] <'s> [action] <method | mode | step>

d. <which | what | whether there is> <through | using | in> [concept] [action] <'s> [method]

e. [how] [action]～[concept]

The four abstract semantic expressions a, b, c and d above are all used to describe the abstract semantic category "behavior mode". The symbol "|" denotes an "or" relation, and the symbol "" indicates that the component is optional. Taking the above abstract semantic expression c as an example, it can be expanded into the following abstract semantic expressions:

c1. [concept] <'s> [action] <method>

c2. [concept] <'s> [action] <mode>

c3. [concept] <'s> [action] <step>

c4. [concept] <'s> [action]

c5. [concept] [action] <method>

c6. [concept] [action] <mode>

c7. [concept] [action] <step>

c8. [concept] [action]
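The expansion of expression c into c1 to c8 can be reproduced mechanically. The sketch below assumes, as the expansion above implies, that each angle-bracketed component may be either kept (with one of its alternatives) or omitted; the slot representation and the name expand are illustrative choices, not taken from the patent.

```python
from itertools import product

# Expression c as an ordered list of slots: (alternatives, optional).
# An optional slot contributes one variant per alternative plus one variant
# in which the slot is omitted.
EXPRESSION_C = [
    (["[concept]"], False),
    (["<'s>"], True),
    (["[action]"], False),
    (["<method>", "<mode>", "<step>"], True),
]


def expand(expression):
    """Return every concrete variant of an expression with optional slots."""
    choices = []
    for alternatives, optional in expression:
        options = list(alternatives)
        if optional:
            options.append("")  # empty string means the slot is left out
        choices.append(options)
    return ["".join(combination) for combination in product(*choices)]


variants = expand(EXPRESSION_C)
for number, variant in enumerate(variants, start=1):
    print(f"c{number}. {variant}")
```

Because itertools.product varies the last slot fastest, the variants come out in the same order as c1 to c8 above: 1 × 2 × 1 × 4 = 8 variants in total.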

In the above abstract semantic expressions, apart from the abstract semantic component symbols that serve as missing semantic components, the other specific words that appear, such as "how", "should" and "method", are needed by the abstract semantic rules and can therefore be collectively referred to as semantic rule words.

The basic concepts of knowledge points for intelligent question answering have been described above; they are helpful for understanding the present invention.

Fig. 1 shows a flow chart of a method 100 for creating a voice interaction system according to an aspect of the present invention. As shown in Fig. 1, the method 100 may include the following steps:

Step 101: Receive a voice user interaction flow chart, which includes a plurality of flows that transition according to a predetermined process.

A voice interaction flow chart is a diagram representing the flow of the user's work when using the voice interaction system. Each node in the flow chart represents one flow, and depending on the user's question, the interaction transitions from one flow to the next. Fig. 2 shows flow 1; flows 11, 12 and 13 downstream of flow 1; flows 111 and 112 downstream of flow 11; flow 131 downstream of flow 13; and flows 1121 and 1122 downstream of flow 112.

The user's interaction with the voice interaction system transitions according to the relationships between the flows in the flow chart. For example, at the stage of flow 1, flow 11, flow 12 or flow 13 is selectively entered according to the recognition of the user's input. If flow 12 is entered, the whole interaction ends. If flow 11 is entered, flow 111 or flow 112 is selectively entered according to the recognition of the user's input. If flow 111 is entered, the whole interaction ends. If flow 112 is entered, flow 1121 or flow 1122 is selectively entered according to the recognition of the user's input.
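The upstream and downstream relationships drawn in Fig. 2 form a tree, which can be written down directly. The sketch below is only an illustration of that structure; the dictionary layout and the name paths_from are assumptions, and only the flows actually drawn in Fig. 2 are included.

```python
# Downstream relations as drawn in Fig. 2 (flow number -> downstream flows).
DOWNSTREAM = {
    "1": ["11", "12", "13"],
    "11": ["111", "112"],
    "13": ["131"],
    "112": ["1121", "1122"],
}


def paths_from(flow):
    """Yield every path from `flow` to a flow with no drawn downstream flow."""
    children = DOWNSTREAM.get(flow, [])
    if not children:
        yield [flow]
        return
    for child in children:
        for tail in paths_from(child):
            yield [flow] + tail


# Every possible course of an interaction that starts at flow 1.
all_paths = list(paths_from("1"))
for path in all_paths:
    print(" -> ".join(path))
```

Which branch is taken at each node is decided at run time by recognizing the user's input; the tree only records which transitions are possible.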

What each flow in Fig. 2 is depends on whom the voice interaction system serves. For example, for a voice interaction system for a telecom operator, these flows may be "phone bill inquiry", "CRBT service handling", "data package subscription" and so on.

The flow chart shown in Fig. 2 may be only part of a complete flow chart. For example, another flow may exist upstream of flow 1, from which flow 1 can be entered. Other flows that are not drawn may also exist downstream of flow 111, flow 112, flow 131, flow 1121 and flow 1122.

After the flow chart is received, the requirements of the system's client are known, so the system can be customized according to those requirements.

Step 102: Create a knowledge base based on the plurality of flows in the flow chart; the knowledge base includes a plurality of knowledge points corresponding to the plurality of flows, and each knowledge point includes a question and its answer.

A knowledge point is built for each flow in the flow chart, yielding the corresponding plurality of knowledge points. The knowledge points must in particular satisfy the following requirement: for an upstream flow that has a downstream flow, the answer of the knowledge point corresponding to the upstream flow is a question-type answer, and the question of the knowledge point corresponding to the downstream flow is a response to that question-type answer.

Suppose the plurality of flows include a first flow and a second flow located downstream of the first flow. The answer of the first knowledge point corresponding to the first flow is a question-type answer, and the question of the second knowledge point corresponding to the second flow is a response to the question-type answer of the first knowledge point. The terms first flow and second flow here merely denote, in general, a relative upstream-downstream relationship between flows.

Taking Fig. 2 as an example, in the relationship between flow 1 and flows 11, 12 and 13, flow 1 is the upstream flow, that is, the first flow, and flows 11, 12 and 13 are all downstream flows of flow 1, that is, second flows. In the relationship between flow 11 and flows 111 and 112, however, flow 11 becomes the upstream flow, that is, the first flow, while flows 111 and 112 are downstream flows, that is, second flows.

As described above, a knowledge point takes the form of a "question-answer" pair, and both "question" and "answer" here should be understood in a broad sense. For example, a "question" may be an instruction or a declarative sentence rather than a question in the traditional grammatical sense; correspondingly, an "answer" may be a function or command call that carries out the instruction, or it may be a confirmatory question.

Here, the "answer" itself may also take the grammatical form of a question. For example, suppose flow 13 is the "bill inquiry" flow; the "question" of its corresponding knowledge point may be "bill inquiry", and the "answer" is then "Which month's bill would you like to query?". The "question" of the knowledge point corresponding to flow 131, downstream of flow 13, may be "month of inquiry", and the "answer" is "The bill for month xx is yyyy yuan" (where month xx is the month actually input by the user).

Thus, when the user inputs "I want to check my bill", the best-matching question "bill inquiry" is found through semantic similarity computation, that is, flow 13 is entered. The answer output at this point is not the specific bill details but a question: "Which month's bill would you like to query?". When the user inputs the specific month to query, the best-matching question "month of inquiry" is found through semantic similarity computation, that is, flow 131 is entered, and the answer output at this point is the answer the user ultimately wishes to know: "The bill for month xx is yyyy yuan".

As another example, suppose the answer of the knowledge point of flow 1 is the question-type answer "What service would you like to handle?", and the questions of the knowledge points of flow 11, flow 12 and flow 13 are "data package service", "CRBT service" and "bill inquiry". Then, through semantic similarity computation on the speech recognition result of the user's input, the knowledge point of flow 11, flow 12 or flow 13 can be located and its corresponding answer given, realizing the transition from flow 1 to flow 11, flow 12 or flow 13.

In this way, a link is built between the knowledge points of upstream and downstream flows, and the transitions between the flows in the flow chart are accomplished by locating the next knowledge point through semantic similarity computation.
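The requirement on knowledge points can be sketched as a small builder that walks the flow tree and emits one knowledge point per flow. This is an illustration under stated assumptions: the Flow class and build_knowledge_base are hypothetical names, and testing whether an answer ends with a question mark is only a crude stand-in for deciding whether it is a question-type answer.

```python
from dataclasses import dataclass, field


@dataclass
class Flow:
    question: str   # question of the flow's knowledge point
    answer: str     # answer of the flow's knowledge point
    downstream: list = field(default_factory=list)


def build_knowledge_base(root: Flow) -> dict:
    """Create one question -> answer knowledge point per flow in the tree."""
    knowledge_base = {}
    pending = [root]
    while pending:
        flow = pending.pop()
        # Assumed check: an upstream flow must answer with a question, so that
        # the user's reply can be matched to a downstream knowledge point.
        if flow.downstream and not flow.answer.rstrip().endswith("?"):
            raise ValueError(f"flow {flow.question!r} needs a question-type answer")
        knowledge_base[flow.question] = flow.answer
        pending.extend(flow.downstream)
    return knowledge_base


# Flow 13 (bill inquiry) and its downstream flow 131, as in the example above.
flow_131 = Flow("month of inquiry", "The bill for month xx is yyyy yuan")
flow_13 = Flow("bill inquiry", "Which month's bill would you like to query?", [flow_131])

kb = build_knowledge_base(flow_13)
print(kb)
```

Adding or deleting a flow then amounts to adding or deleting one entry in the resulting knowledge base, which is the deployment advantage the description claims over rewriting vxml.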

When the user provides input to the system, the ideal case is that the user uses the standard question, in which case the system immediately understands what the user means. However, users often do not use the standard question but some variant of it. For example, if the standard question for switching stations on a radio is "change a station", the command the user actually uses may be "switch a station", and the machine must be able to recognize that the user is expressing the same meaning.

As described above, the present invention matches the user's question to the question of a knowledge point through semantic similarity computation. To obtain better results from the semantic similarity computation, the standard question of each knowledge point is also expanded into many extended questions. When semantic recognition is performed, the user's question in text form (that is, the speech recognition result) is actually compared, by semantic similarity computation, with all the questions of each knowledge point, including both the standard question and the extended questions, to obtain the best-matching question.
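Matching against both standard and extended questions can be sketched as below. As before, this is an illustration only: token-overlap (Jaccard) similarity stands in for the unspecified semantic similarity computation, and the dictionary layout, the match function and the placeholder answers are assumptions.

```python
def similarity(a: str, b: str) -> float:
    """Stand-in for semantic similarity: Jaccard overlap of lowercase tokens."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


# Each knowledge point carries a standard question, extended questions and an answer.
knowledge_points = [
    {"standard": "change a station",
     "extended": ["switch a station"],
     "answer": "<call station-switching control program>"},  # placeholder answer
    {"standard": "turn on the radio",
     "extended": [],
     "answer": "<call power-on control program>"},           # placeholder answer
]


def match(user_text, points):
    """Return (knowledge point, matched question, score) for the best question."""
    candidates = [
        (similarity(user_text, question), question, point)
        for point in points
        for question in [point["standard"], *point["extended"]]
    ]
    score, question, point = max(candidates, key=lambda candidate: candidate[0])
    return point, question, score


point, question, score = match("switch a station", knowledge_points)
print(point["standard"], "|", question, "|", score)
```

Here the user's wording matches the extended question exactly, so the input is resolved to the knowledge point whose standard question is "change a station" even though the standard question alone would only score 0.5 under this toy measure.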

To this end, in the present invention, building each knowledge point includes building the standard question of the knowledge point, its associated extended questions and the corresponding answer. The standard question and the answer are edited according to knowledge provided by the client (that is, the client who commissions the voice interaction system, for example a bank or a telecom operator). The client provides a corresponding description for each flow in the flow chart, for example what information needs to be fed back according to the user's input. The standard question and answer of each knowledge point can be extracted and edited from this knowledge provided by the client.

If the extension of each standard question is done by manual brainstorming, it is inefficient and many variants are missed. In the present invention, abstract semantic expressions are used to generate the extended questions of a standard question automatically.

To this end, an abstract semantic database must first be provided, which includes a plurality of abstract semantic expressions; each abstract semantic expression includes missing semantic components, as described above.

Fig. 3 shows a flow chart of a method 300 for extending a standard question. As shown in Fig. 3, the method 300 may include the following steps.

Step 302: Perform abstract semantic recommendation processing on the standard question according to the abstract semantic database, to obtain one or more abstract semantic expressions corresponding to the standard question.

For example, the standard question is: "how to check violations".

First, an abstract semantic expression corresponding to the standard question must be found in the abstract semantic database. In one example, the abstract semantic recommendation first performs word segmentation on the standard question to obtain several words, each of which is either a semantic rule word or a non-semantic-rule word.

For example, "how to check violations" can be segmented into the words "how", "check" and "violations". Of these words, "how" is a semantic rule word, while "check" and "violations" are non-semantic-rule words.

Then, part-of-speech tagging is performed on each non-semantic-rule word; for example, "check" is tagged as a verb and "violations" as a noun.

Next, word class judgment is performed on each semantic rule word to obtain the word class information of each semantic rule word. A word class can be understood simply as a group of words that share something in common; these words may or may not be semantically similar.

Finally, the abstract semantic database is searched according to this part-of-speech information and word class information, to obtain the abstract semantic expressions that match the standard question "how to check violations".

In practice, an abstract semantic expression that matches the standard question satisfies the following conditions:

1) the parts of speech corresponding to the missing semantic components of the abstract semantic expression include the parts of speech of the corresponding filling content in the standard question;

2) the corresponding semantic rule words in the abstract semantic expression and in the standard question are identical or belong to the same word class;

3) the order of the abstract semantic expression is the same as the order of expression of the standard question.

In the abstract semantic category "behavior mode" above, the part of speech of the missing semantic component action in abstract semantic expression e is verb, and the corresponding filling content "check" in the standard question "how to check violations" is also a verb; the part of speech of the missing semantic component concept is noun, and the corresponding filling content "violations" in the standard question is also a noun. Condition 1) is therefore satisfied.

Second, the semantic rule word "how" in abstract semantic expression e and the corresponding semantic rule word "how" in the standard question "how to check violations" belong to the same word class, so condition 2) is satisfied.

Finally, the order of abstract semantic expression e is also the same as the order of expression of the standard question, so condition 3) is satisfied.
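The three matching conditions can be sketched as a check of expression e against the segmented example question, rendered here as the words "how", "check" and "violations". This is a deliberately simplified illustration: optional components, alternatives and the "～" symbol are not modeled, condition 1) is reduced to part-of-speech equality, and the word class table and the name matches are assumptions.

```python
# Part of speech expected for each missing semantic component.
SLOT_POS = {"concept": "noun", "action": "verb",
            "attribute": "noun", "adjective": "adjective"}

# Hypothetical word classes for semantic rule words.
WORD_CLASS = {"how": "how", "in what way": "how"}

# Expression e, [how] [action]~[concept], as an ordered list of parts.
EXPRESSION_E = [("rule", "how"), ("slot", "action"), ("slot", "concept")]


def matches(expression, words):
    """words: (text, part_of_speech) pairs; part_of_speech is None for rule words."""
    if len(expression) != len(words):
        return False
    # Walking both sequences in parallel enforces condition 3 (same order).
    for (kind, value), (text, pos) in zip(expression, words):
        if kind == "slot" and SLOT_POS[value] != pos:
            return False  # condition 1: part of speech of the filling content
        if kind == "rule" and WORD_CLASS.get(text) != value:
            return False  # condition 2: same semantic rule word class
    return True


# Segmented and tagged standard question from the example.
standard_question = [("how", None), ("check", "verb"), ("violations", "noun")]
print(matches(EXPRESSION_E, standard_question))
```

Once one expression of a category matches in this way, the description recommends every expression of that category, so a full implementation would look up the category of the matched expression rather than stop at a boolean.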

Therefore, in abstract semantics database, the abstract semantics expression formula for asking that " how looking into violating the regulations " matches with standard is found E, i.e., [how] [action]～[concept].The abstract semantics expression formula belongs to " behavior " classification, due to a classification In abstract semantics expression formula there is identical to express implication, therefore be that above-mentioned standard asks recommendation " behavior side in the present invention The set of the abstract semantics expression formula of formula " this classification.In other words, in the classification belonging to abstract semantics expression formula for being matched All abstract semantics expression formulas are all proposed as asking corresponding abstract semantics expression formula with the standard.
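The three matching conditions can be sketched in Python as follows. This is only an illustration under stated assumptions, not the implementation of the invention: the `Token`, `Slot`, and `RuleWord` types, the tag names `"verb"`, `"noun"`, and `"how-class"`, and the strict position-by-position alignment are all hypothetical, and the `～` operator of expression e is ignored.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Token:
    """One word of a segmented standard question (hypothetical type)."""
    text: str
    is_rule_word: bool
    tag: str  # word class for rule words, part of speech otherwise


@dataclass(frozen=True)
class Slot:
    """A missing semantic component and the parts of speech it accepts."""
    name: str
    allowed_pos: frozenset


@dataclass(frozen=True)
class RuleWord:
    """A semantic rule word, represented here only by its word class."""
    word_class: str


def matches(expression, tokens):
    """Check conditions 1) to 3) by aligning items and tokens in order."""
    if len(expression) != len(tokens):
        return False
    for item, token in zip(expression, tokens):  # zip enforces condition 3)
        if isinstance(item, Slot):
            # Condition 1): the slot must accept the filler's part of speech.
            if token.is_rule_word or token.tag not in item.allowed_pos:
                return False
        else:
            # Condition 2): rule words must share a word class.
            if not token.is_rule_word or token.tag != item.word_class:
                return False
    return True


QUESTION = [
    Token("how", True, "how-class"),
    Token("look into", False, "verb"),
    Token("violations", False, "noun"),
]
EXPR_E = [
    RuleWord("how-class"),
    Slot("action", frozenset({"verb"})),
    Slot("concept", frozenset({"noun"})),
]
```

With these assumed inputs, `matches(EXPR_E, QUESTION)` holds, while an expression that places the concept slot first fails the order check.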

Step 304: content corresponding to the missing semantic components of the one or more abstract semantic expressions is extracted from the standard question, and the extracted content is filled into the corresponding missing semantic components to obtain one or more concrete semantic expressions corresponding to the standard question. These concrete semantic expressions serve as extended questions of the standard question.

Taking the above standard question "how to look into violations" as an example, the following abstract semantic expressions are recommended:

a. [concept] [need | should] [how] <[can]> <carry out> [action]

b. { [concept]～[action] }

c. [concept] <'s> [action] <method | mode | step>

d. <which are there | what is there | is there any> <through | using | at> [concept] [action] <'s> [method]

e. [how] [action]～[concept]

The standard question "how to look into violations" is expanded using the above abstract semantic expressions.

In one example, content corresponding to the missing semantic components of each abstract semantic expression is extracted from the standard question, and the extracted content is filled into the corresponding missing semantic components of each abstract semantic expression to obtain the concrete semantic expressions corresponding to the standard question.

Taking abstract semantic expression a, [concept] [need | should] [how] <[can]> <carry out> [action], as an example, content corresponding to the missing semantic components of the expression is extracted from "how", "look into", and "violations":

Content corresponding to concept: "violations"

Content corresponding to action: "look into"

Therefore, filling "look into" and "violations" into the corresponding missing semantic components yields a concrete semantic expression: [violations] [need | should] [how] <[can]> <carry out> [inquiry].

Taking abstract semantic expression b, { [concept]～[action] }, as an example, content corresponding to the missing semantic components of the expression is extracted from "how", "look into", and "violations":

Content corresponding to concept: "violations"

Content corresponding to action: "look into"

Therefore, filling "look into" and "violations" into the corresponding missing semantic components yields a concrete semantic expression: [violations] [inquiry].

Taking abstract semantic expression c, [concept] <'s> [action] <method | mode | step>, as an example, content corresponding to the missing semantic components of the expression is extracted from "how", "look into", and "violations":

Content corresponding to concept: "violations"

Content corresponding to action: "look into"

Therefore, filling "look into" and "violations" into the corresponding missing semantic components yields a concrete semantic expression: [violations] <'s> [inquiry] <method | mode | step>.

Taking abstract semantic expression d, <which are there | what is there | is there any> <through | using | at> [concept] [action] <'s> [method], as an example, content corresponding to the missing semantic components of the expression is extracted from "how", "look into", and "violations":

Content corresponding to concept: "violations"

Content corresponding to action: "look into"

Therefore, filling "look into" and "violations" into the corresponding missing semantic components yields a concrete semantic expression: <which are there | what is there | is there any> <through | using | at> [violations] [inquiry] <'s> [method].

The above describes how a standard question is extended using the abstract semantic database.
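The filling step of the expansion can be sketched as a simple template substitution. The following Python fragment is a hedged illustration only: the function name `fill_expression`, the bracket-based string notation, and the ASCII `~` used in place of `～` are assumptions, and the sketch inserts the extracted words verbatim without any normalization of the filled word.

```python
import re


def fill_expression(abstract_expression, content):
    """Replace each [slot] whose name is a key of `content` with its filler.

    Bracketed items that are not missing semantic components, such as
    [how] or [need | should], are left unchanged.
    """
    def substitute(match):
        name = match.group(1)
        return "[" + content.get(name, name) + "]"

    # Only single-word bracket contents are candidate slot names.
    return re.sub(r"\[(\w+)\]", substitute, abstract_expression)


# Content extracted from the standard question (assumed to be given).
EXTRACTED = {"concept": "violations", "action": "look into"}

EXTENDED = [
    fill_expression(expr, EXTRACTED)
    for expr in (
        "[concept] [need | should] [how] <[can]> <carry out> [action]",
        "[concept] <'s> [action] <method | mode | step>",
        "[how] [action]~[concept]",
    )
]
```

Each element of `EXTENDED` is one concrete semantic expression that could be stored as an extended question of the standard question.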

The relationship between a semantic expression and a user question differs greatly from traditional template matching. In traditional template matching, a template and a user question simply either match or do not match. By contrast, the relationship between a semantic expression and a user question is represented by a quantized value (a similarity), and this quantized value can be compared with the similarities between other similar questions and the user question.

Therefore, semantic recognition based on semantic similarity calculation in the present invention has very good discrimination and improves the user experience.

Returning to Fig. 1, in step 103 a language model is provided for performing speech recognition on the user's voice input.

For a voice interaction system, the user's voice input must first be recognized as user input in text form. As mentioned above, speech recognition requires a language model. A language model is mainly obtained by training on a large corpus. On the one hand, the larger the corpus, the more accurate the resulting language model. However, as the corpus grows, the computational cost of training and recognition grows with it. In practice, therefore, training is usually performed with a corpus size chosen as a trade-off between cost and performance.

On the other hand, the more targeted the corpus, the more accurate the trained language model. For example, for a sports application, a large number of sports-related terms may be used as the training corpus; for a financial application, a large number of finance-related terms may be used. In this way, a more accurate language model is obtained at a given cost.

In the present invention, on the one hand, the training corpus of the language model may be selected according to the application of the voice interaction system.

However, to further improve the recognition accuracy of the language model while reducing cost, the present invention adopts a more targeted strategy. That is, the voice interaction system of the present invention does not use one fixed language model for the user's voice input; instead, different language models may be used depending on the current position in the flow.

Specifically, for each flow, a language model dedicated to the downstream flow(s) of that flow is trained, for performing speech recognition of the user's voice input with respect to the downstream flow(s). Obviously, "each flow" here refers to a flow that has downstream flows. Taking Fig. 2 as an example, for flow 1, a language model dedicated to its downstream flows 11, 12, and 13 is trained. For flow 11, a language model dedicated to its downstream flows 111 and 112 is trained, and so on.

During training, the questions in the knowledge points corresponding to the downstream flow(s) are used as the training corpus for the language model. It will be understood that, because the user input to be recognized by a language model trained in this way is likely to correspond exactly to the questions in these knowledge points, the recognition accuracy is relatively high. In practice, the SRILM toolkit may be used for training.
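Assembling a per-flow training corpus can be sketched as below. The flow identifiers mirror the numbering used for Fig. 2, but the `DOWNSTREAM` and `QUESTIONS` tables, the sample questions, the file names, and both function names are hypothetical. The command line follows the commonly documented `ngram-count` usage of SRILM and should be checked against the installed version; the sketch only builds the argument list and does not run it.

```python
# Hypothetical flow tree: flow id -> ids of its downstream flows.
DOWNSTREAM = {"1": ["11", "12", "13"], "11": ["111", "112"]}

# Hypothetical questions (standard and extended) of each flow's knowledge point.
QUESTIONS = {
    "11": ["look into violations", "how to look into violations"],
    "12": ["pay a fine"],
    "13": ["talk to an agent"],
    "111": ["by license plate"],
    "112": ["by driver license"],
}


def build_corpus(flow_id):
    """Collect the questions of all knowledge points downstream of a flow."""
    lines = []
    for child in DOWNSTREAM.get(flow_id, []):
        lines.extend(QUESTIONS.get(child, []))
    return lines


def srilm_command(flow_id, order=3):
    """Build an ngram-count invocation for the flow-specific corpus file."""
    return [
        "ngram-count",
        "-text", f"corpus_{flow_id}.txt",  # one question per line (assumed)
        "-order", str(order),
        "-lm", f"lm_{flow_id}.arpa",
    ]
```

A flow without downstream flows yields an empty corpus, which corresponds to the remark that only flows with downstream flows receive a dedicated language model.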

In step 104, the knowledge points in the knowledge base are provided for performing semantic recognition on the obtained speech recognition result.

In one example, for each user input, semantic similarity calculation may be performed using all knowledge points in the knowledge base when semantic recognition is performed.

More preferably, a more targeted strategy is adopted. That is, semantic recognition of user input does not use all knowledge points in the whole knowledge base; instead, different knowledge points may be used for semantic similarity calculation depending on the current position in the flow.

Specifically, for each flow, the knowledge points corresponding to the downstream flow(s) of that flow are provided, for performing semantic recognition of the speech recognition result with respect to the downstream flow(s). Taking Fig. 2 as an example, for user input given at the stage of flow 1, once speech recognition has converted the spoken user input into a speech recognition result (i.e., user input in text form), semantic recognition performs semantic similarity calculation between that user input and only the three knowledge points corresponding to flows 11, 12, and 13.

With the above scheme of the present invention, a knowledge base is built and transitions between flows are realized by matching knowledge points in the knowledge base. This spares specialized developers from writing VXML and reduces implementation difficulty. More importantly, compared with the traditional VoiceXML-based design, adding or deleting a flow only requires adding or deleting the corresponding knowledge point in the knowledge base, and the change takes effect in real time, so deployment is flexible.
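How flow transitions can fall out of knowledge-point matching is sketched below. This is a simplified illustration under stated assumptions: the `KnowledgePoint` type, the `KB` and `DOWNSTREAM` tables, the sample dialogue text, and the `step` function are hypothetical, and exact string comparison stands in for the semantic similarity calculation described in the text.

```python
from dataclasses import dataclass


@dataclass
class KnowledgePoint:
    """A knowledge point bound to one flow (hypothetical structure)."""
    flow_id: str
    questions: list
    answer: str


# Hypothetical flow tree and knowledge base.
DOWNSTREAM = {"root": ["1"], "1": ["11", "12"]}
KB = {
    "1": KnowledgePoint("1", ["hello"],
                        "Would you like to look into violations or pay a fine?"),
    "11": KnowledgePoint("11", ["look into violations"],
                         "Please say your license plate number."),
    "12": KnowledgePoint("12", ["pay a fine"],
                         "Which fine would you like to pay?"),
}


def step(current_flow, user_text):
    """Match user text against downstream knowledge points only.

    Returns (next_flow, answer); the flow is unchanged when nothing matches.
    """
    normalized = user_text.strip().lower()
    for child in DOWNSTREAM.get(current_flow, []):
        point = KB.get(child)
        if point is not None and normalized in point.questions:
            return child, point.answer
    return current_flow, None
```

In this sketch the answer of flow 1 is itself a question, and the questions of flows 11 and 12 are the expected replies to it. Adding or removing a flow amounts to editing the entries of `KB` and `DOWNSTREAM`, with no dialogue script to rewrite.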

Although the above methods are illustrated and described as a series of actions for simplicity of explanation, it should be understood and appreciated that these methods are not limited by the order of the actions, because, in accordance with one or more embodiments, some actions may occur in a different order and/or concurrently with other actions that are illustrated and described herein, or that are not illustrated and described herein but would be understood by those skilled in the art.

Fig. 4 shows a block diagram of a device 400 for creating a voice interaction system according to an aspect of the present invention.

As shown in Fig. 4, the device 400 may include a receiving module 401, a knowledge base creation module 402, a language model training module 403, a knowledge point distribution module 404, and an abstract semantic database 405.

The receiving module 401 may be used to receive a voice user interaction flow diagram, which includes a plurality of flows that transition according to a predetermined flow.

The knowledge base creation module 402 may create a knowledge base based on the plurality of flows. The knowledge base includes a plurality of knowledge points corresponding to the plurality of flows, and each knowledge point includes a question and its answer.

Without loss of generality, the plurality of flows include a first flow and a second flow located downstream of the first flow. The answer of the first knowledge point corresponding to the first flow is an interrogative-type answer, and the question of the second knowledge point corresponding to the second flow is a response to the interrogative-type answer of the first knowledge point.

To this end, the device 400 also includes the abstract semantic database 405, which includes a plurality of abstract semantic expressions; each abstract semantic expression includes missing semantic components.

In one example, the knowledge base creation module 402 may include an extension unit 4021. The extension unit 4021 may perform abstract semantic recommendation processing on a standard question according to the abstract semantic database. When one or more abstract semantic expressions corresponding to the standard question are obtained, the extension unit extracts from the standard question the content corresponding to the missing semantic components of the one or more abstract semantic expressions, and fills the extracted content into the corresponding missing semantic components to obtain one or more concrete semantic expressions corresponding to the standard question. The concrete semantic expressions serve as extended questions of the standard question.

As shown in Fig. 5, the extension unit 4021 may include a word segmentation subunit 40211, a part-of-speech tagging subunit 40212, a word-class determination subunit 40213, and a search subunit 40214.

The word segmentation subunit 40211 may be used to perform word segmentation on the standard question to obtain a number of words, each of which is either a semantic rule word or a non-semantic-rule word. The part-of-speech tagging subunit 40212 may be used to perform part-of-speech tagging on each non-semantic-rule word to obtain its part-of-speech information. The word-class determination subunit 40213 may be used to perform word-class determination on each semantic rule word to obtain its word-class information. Finally, the search subunit 40214 may search the abstract semantic database according to the part-of-speech information and the word-class information to obtain the abstract semantic expression that matches the standard question.

An abstract semantic expression may also include semantic rule words, and an abstract semantic expression that matches the standard question needs to satisfy the following conditions:

The part of speech corresponding to each missing semantic component of the abstract semantic expression includes the part of speech of the corresponding filling content in the standard question;

The corresponding semantic rule words in the abstract semantic expression and in the standard question are identical or belong to the same word class;

The order of the abstract semantic expression is the same as the order of expression of the standard question.

The language model training module 403 may be used to provide a language model for performing speech recognition on the user's voice input.

In one example, the language model training module 403 may, for each flow, train a language model dedicated to the downstream flow(s) of that flow, for performing speech recognition of the user's voice input with respect to the downstream flow(s). During training, the language model training module 403 uses the questions in the knowledge points corresponding to the downstream flow(s) as the training corpus for the language model. In practice, the language model training module 403 may train the language model using the SRILM toolkit.

The knowledge point distribution module 404 may provide the knowledge points in the knowledge base for performing semantic recognition on the obtained speech recognition result.

In one example, the knowledge point distribution module 404 may, for each flow, provide the knowledge points corresponding to the downstream flow(s) of that flow, for performing semantic recognition of the speech recognition result with respect to the downstream flow(s).

For specific implementations of the device for creating a voice interaction system, reference may be made to the embodiments of the method for creating a voice interaction system; the details are not repeated here.

The present invention also provides a voice interaction system built using the above scheme.

Fig. 6 shows a block diagram of a voice interaction system 600 according to an aspect of the present invention.

The voice interaction system 600 may include a knowledge base 601, which may be created using the method shown in Fig. 1.

The voice interaction system 600 may also include a speech recognition module 602, a semantic recognition module 603, and an output module 604. The speech recognition module 602 may be used to perform speech recognition on the user's voice input using the language model provided by the method shown in Fig. 1.

The semantic recognition module 603 may be used to perform semantic recognition on the speech recognition result using the corresponding knowledge points in the knowledge base 601. The output module 604 may be used to provide a response output to the user based on the speech recognition result.

In one example, the semantic recognition module 603 may include a semantic similarity calculation module 6031, which performs semantic similarity calculation between the speech recognition result and the questions in the corresponding knowledge points. Among the questions whose semantic similarity is above a threshold, the question with the highest semantic similarity is determined to be the matching question. The output module 604 may provide the answer associated with the matching question to the user as the response output.
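The threshold-and-maximum selection rule can be sketched as follows. The text does not specify how semantic similarity is computed, so the word-overlap (Jaccard) measure used here is only a stand-in; the function names, the candidate format, and the threshold value 0.5 are likewise assumptions.

```python
def jaccard(a, b):
    """Word-overlap similarity in [0, 1]; a stand-in for semantic similarity."""
    words_a, words_b = set(a.lower().split()), set(b.lower().split())
    if not words_a and not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def best_match(recognized, candidates, threshold=0.5):
    """Return (question, answer, score) for the most similar question.

    Only questions scoring strictly above `threshold` are eligible;
    None is returned when no candidate qualifies.
    """
    best = None
    best_score = threshold
    for question, answer in candidates:
        score = jaccard(recognized, question)
        if score > best_score:
            best, best_score = (question, answer, score), score
    return best


# Hypothetical knowledge points available at the current flow position.
CANDIDATES = [
    ("look into violations", "Please say your license plate number."),
    ("pay a fine", "Which fine would you like to pay?"),
]
```

Unlike binary template matching, the score lets near misses be ranked against each other, and the threshold rejects inputs that resemble none of the candidate questions.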

Those skilled in the art will appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or a combination of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends on the particular application and on the design constraints imposed on the overall system. Skilled artisans may implement the described functionality in different ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present invention.

The various illustrative logical modules and circuits described in connection with the embodiments disclosed herein may be implemented or performed with a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof designed to perform the functions described herein. A general-purpose processor may be a microprocessor, but in the alternative, the processor may be any conventional processor, controller, microcontroller, or state machine. A processor may also be implemented as a combination of computing devices, for example a combination of a DSP and a microprocessor, a plurality of microprocessors, one or more microprocessors in conjunction with a DSP core, or any other such configuration.

The steps of a method or algorithm described in connection with the embodiments disclosed herein may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. In the alternative, the storage medium may be integral to the processor. The processor and the storage medium may reside in an ASIC. The ASIC may reside in a user terminal. In the alternative, the processor and the storage medium may reside as discrete components in a user terminal.

The previous description of the disclosure is provided to enable any person skilled in the art to make or use the disclosure. Various modifications to the disclosure will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other variations without departing from the spirit or scope of the disclosure. Thus, the disclosure is not intended to be limited to the examples and designs described herein, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims

1. A method for creating a voice interaction system, characterized by comprising:

receiving a voice user interaction flow diagram, the voice user interaction flow diagram including a plurality of flows that transition according to a predetermined flow;

creating a knowledge base based on the plurality of flows, the knowledge base including a plurality of knowledge points corresponding to the plurality of flows, each knowledge point including a question and its answer,

wherein the plurality of flows include a first flow and a second flow located downstream of the first flow, the answer of the first knowledge point corresponding to the first flow is an interrogative-type answer, and the question of the second knowledge point corresponding to the second flow is a response to the interrogative-type answer of the first knowledge point;

2. The method of claim 1, characterized in that the question in each knowledge point includes a standard question and extended questions of the standard question.

3. The method of claim 2, characterized in that the extended questions are established in the following manner:

providing an abstract semantic database, the abstract semantic database including a plurality of abstract semantic expressions, the abstract semantic expressions including missing semantic components;

performing abstract semantic recommendation processing on the standard question according to the abstract semantic database, and, when one or more abstract semantic expressions corresponding to the standard question are obtained, extracting from the standard question the content corresponding to the missing semantic components of the one or more abstract semantic expressions, and filling the extracted content into the corresponding missing semantic components to obtain one or more concrete semantic expressions corresponding to the standard question, the concrete semantic expressions serving as the extended questions of the standard question.

4. The method of claim 3, characterized in that the abstract semantic recommendation processing includes:

performing word segmentation on the standard question to obtain a number of words, each word being a semantic rule word or a non-semantic-rule word;

performing part-of-speech tagging on each non-semantic-rule word to obtain the part-of-speech information of each non-semantic-rule word;

performing word-class determination on each semantic rule word to obtain the word-class information of each semantic rule word;

searching the abstract semantic database according to the part-of-speech information and the word-class information to obtain the abstract semantic expression that matches the standard question.

5. The method of claim 4, characterized in that the abstract semantic expression also includes semantic rule words, and the abstract semantic expression that matches the standard question satisfies the following conditions:

6. The method of claim 1, characterized in that providing the language model includes:

for each flow, training a language model dedicated to the downstream flow of that flow, for performing speech recognition of the user's voice input with respect to the downstream flow.

7. The method of claim 6, characterized in that the training includes using the questions in the knowledge points corresponding to the downstream flow as the training corpus for the language model.

8. The method of claim 7, characterized in that the language model is trained using the SRILM toolkit.

9. The method of claim 1, characterized in that providing the knowledge points in the knowledge base includes:

for each flow, providing the knowledge points corresponding to the downstream flow of that flow, for performing semantic recognition of the speech recognition result with respect to the downstream flow.

10. A device for creating a voice interaction system, characterized by comprising:

a receiving module, configured to receive a voice user interaction flow diagram, the voice user interaction flow diagram including a plurality of flows that transition according to a predetermined flow;

a knowledge base creation module, configured to create a knowledge base based on the plurality of flows, the knowledge base including a plurality of knowledge points corresponding to the plurality of flows, each knowledge point including a question and its answer,

a language model training module, configured to provide a language model for performing speech recognition on the user's voice input; and

a knowledge point distribution module, configured to provide the knowledge points in the knowledge base for performing semantic recognition on the obtained speech recognition result.

11. The device of claim 10, characterized in that the question in each knowledge point includes a standard question and extended questions of the standard question.

12. The device of claim 11, characterized by further comprising:

an abstract semantic database, the abstract semantic database including a plurality of abstract semantic expressions, the abstract semantic expressions including missing semantic components;

wherein the knowledge base creation module includes an extension unit, configured to perform abstract semantic recommendation processing on the standard question according to the abstract semantic database, and, when one or more abstract semantic expressions corresponding to the standard question are obtained, to extract from the standard question the content corresponding to the missing semantic components of the one or more abstract semantic expressions and fill the extracted content into the corresponding missing semantic components to obtain one or more concrete semantic expressions corresponding to the standard question, the concrete semantic expressions serving as the extended questions of the standard question.

13. The device of claim 12, characterized in that the extension unit includes:

a word segmentation subunit, configured to perform word segmentation on the standard question to obtain a number of words, each word being a semantic rule word or a non-semantic-rule word;

a part-of-speech tagging subunit, configured to perform part-of-speech tagging on each non-semantic-rule word to obtain the part-of-speech information of each non-semantic-rule word;

a word-class determination subunit, configured to perform word-class determination on each semantic rule word to obtain the word-class information of each semantic rule word;

a search subunit, configured to search the abstract semantic database according to the part-of-speech information and the word-class information to obtain the abstract semantic expression that matches the standard question.

14. The device of claim 13, characterized in that the abstract semantic expression also includes semantic rule words, and the abstract semantic expression that matches the standard question satisfies the following conditions:

15. The device of claim 10, characterized in that the language model training module, for each flow, trains a language model dedicated to the downstream flow of that flow, for performing speech recognition of the user's voice input with respect to the downstream flow.

16. The device of claim 15, characterized in that the language model training module uses the questions in the knowledge points corresponding to the downstream flow as the training corpus for the language model.

17. The device of claim 16, characterized in that the language model training module trains the language model using the SRILM toolkit.

18. The device of claim 10, characterized in that the knowledge point distribution module, for each flow, provides the knowledge points corresponding to the downstream flow of that flow, for performing semantic recognition of the speech recognition result with respect to the downstream flow.

19. A voice interaction system, characterized by comprising:

a knowledge base created by the method of any one of claims 1-9;

a speech recognition module, configured to perform speech recognition on the user's voice input using the language model provided by the method of any one of claims 1-9;

a semantic recognition module, configured to perform semantic recognition on the speech recognition result using the corresponding knowledge points in the knowledge base; and

20. The voice interaction system of claim 19, characterized in that the semantic recognition module includes:

a semantic similarity calculation module, which performs semantic similarity calculation between the speech recognition result and the questions in the corresponding knowledge points, wherein, among the questions whose semantic similarity is above a threshold, the question with the highest semantic similarity is determined to be the matching question,

and the output module provides the answer associated with the matching question to the user as the response output.