CN103280218A

CN103280218A - Voice recognition-based selection method and mobile terminal device and information system thereof

Info

Publication number: CN103280218A
Application number: CN2013101828630A
Authority: CN
Inventors: 张国峰
Original assignee: Via Technologies Inc
Current assignee: Via Technologies Inc
Priority date: 2012-12-31
Filing date: 2013-05-17
Publication date: 2013-09-04
Also published as: CN106847278A; TWI511124B; TW201426736A; CN103021403A

Abstract

The invention discloses a voice recognition-based selection method, and a mobile terminal device and an information system using the method. The selection method comprises the following steps: receiving a first voice input; performing voice recognition and natural language processing on the first voice input to generate a corresponding first keyword; obtaining at least one first return answer according to the first keyword; when the number of first return answers is 1, performing a corresponding operation according to the type of that first return answer; when the number of first return answers is greater than 1, organizing the first return answers, displaying them in a first candidate list, and receiving a second voice input; performing voice recognition and natural language processing on the second voice input to generate a corresponding second keyword; and selecting a second return answer from the first return answers in the first candidate list according to the second keyword.

Description

Selection method based on speech recognition, and mobile terminal device and information system thereof

Technical field

The present invention relates to a selection method and to a mobile terminal device and an information system thereof, and in particular to a selection method based on speech recognition and to a mobile terminal device and an information system thereof.

Background technology

In natural language understanding (NLU) by computers, a specific grammar is usually used to capture the intention or information in a user's input sentence. Therefore, if a database stores enough data about users' input sentences, a reasonable judgment can be made.

One existing approach captures the user's input sentence with a built-in fixed word list. The fixed word list contains the particular terms used for specific intentions or information, and the user must express an intention or information with those particular terms before the system can identify it correctly. However, forcing the user to memorize every particular term in the fixed word list is far from user-friendly. For example, a prior-art implementation using a fixed word list requires a user who asks about the weather to say: "What will the weather be like in Shanghai (or Beijing) tomorrow (or the day after tomorrow)?" If the user instead asks about the weather in a more natural, colloquial way, such as "How about Shanghai tomorrow?", the word "weather" does not appear in the sentence, so the prior art interprets it as "there is a place called 'Tomorrow' in Shanghai", which clearly fails to capture the user's real intention. In addition, the sentences users employ are highly varied and change frequently, and users sometimes even input erroneous sentences, in which case the input sentence has to be captured by fuzzy matching. A fixed word list that offers only rigid input rules therefore performs even worse.

In addition, when natural language understanding is used to handle multiple types of user intention, some different intentions share the same grammatical structure. For example, when the user's input sentence is "I want to see Romance of the Three Kingdoms", the user may want to watch the film Romance of the Three Kingdoms or to read the book Romance of the Three Kingdoms. In such a case, two possible intentions are usually matched and the user is asked to choose between them. In many cases, however, offering redundant possible intentions for the user to choose from is unnecessary and inefficient. For example, when the user's input sentence is "I want to see MillionStar", matching the user's intention to reading a book or viewing a painting named MillionStar is quite unnecessary, because MillionStar is a TV program.

Moreover, the search results obtained by full-text search are generally unstructured data. The information in unstructured data is scattered and unrelated. For example, after a keyword is entered into a search engine such as Google or Baidu, the web search results obtained are unstructured data, because the useful information among them can be found only by reading the results manually one by one. This practice not only wastes the user's time but may also miss the desired information, so its practicality is very limited.

Summary of the invention

The invention provides a selection method based on speech recognition, together with a mobile terminal device and an information system, that improve the convenience of user operation.

The present invention proposes a selection method based on speech recognition, comprising: receiving a first voice input; performing speech recognition on the first voice input to produce a first keyword; producing at least one first return answer according to the first keyword; when the number of first return answers selected is 1, performing a corresponding operation according to the data type of the selected first return answer; when the number of first return answers selected is greater than 1, displaying a first candidate list that includes the first return answers and receiving a second voice input; performing speech recognition on the second voice input to produce a second keyword; and selecting a second return answer from the first return answers shown in the first candidate list according to the second keyword.
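The branching in this method can be illustrated with a short sketch. It is only a minimal illustration under stated assumptions, not the claimed implementation: the names `ReturnAnswer`, `select_by_voice`, `demo_lookup` and `CATALOG` are hypothetical, speech recognition is assumed to have already produced the keywords, and the second keyword is assumed to narrow the list by simple substring matching on the answer's type or title.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class ReturnAnswer:
    kind: str   # hypothetical data type of the answer, e.g. "book" or "film"
    title: str


def select_by_voice(
    first_keyword: str,
    lookup: Callable[[str], List[ReturnAnswer]],
    second_keyword_for: Callable[[List[ReturnAnswer]], str],
) -> Optional[ReturnAnswer]:
    """Return the single answer to act on, or None if nothing can be selected."""
    candidates = lookup(first_keyword)
    if not candidates:
        return None
    if len(candidates) == 1:
        # Exactly one answer: the caller acts on it according to its type.
        return candidates[0]
    # More than one answer: the callback stands in for "display the candidate
    # list, receive the second voice input, and recognize the second keyword".
    second_keyword = second_keyword_for(candidates)
    narrowed = [
        a for a in candidates
        if second_keyword and (second_keyword in a.kind or second_keyword in a.title)
    ]
    # Assumption: the selection succeeds only if exactly one candidate remains.
    return narrowed[0] if len(narrowed) == 1 else None


# Illustrative data only; not taken from the patent.
CATALOG = [
    ReturnAnswer("book", "Romance of the Three Kingdoms"),
    ReturnAnswer("film", "Romance of the Three Kingdoms"),
    ReturnAnswer("film", "Let the Bullets Fly"),
]


def demo_lookup(keyword: str) -> List[ReturnAnswer]:
    """Toy stand-in for producing return answers from the first keyword."""
    return [a for a in CATALOG if keyword in a.title]
```

With this toy catalog, a first keyword that matches a single title is acted on directly, while a first keyword that matches two entries is resolved only after a second keyword such as "film" is supplied.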

The present invention proposes a mobile terminal device comprising a voice receiving unit, a display unit, a storage unit and a data processing unit. The voice receiving unit receives a first voice input and a second voice input. The display unit displays a candidate list that includes return answers. The storage unit stores a plurality of data. The data processing unit is coupled to the voice receiving unit, the display unit and the storage unit. The data processing unit performs speech recognition on the first voice input to produce a first keyword, and selects the corresponding first return answers according to the first keyword. When the number of first return answers selected is 1, the data processing unit performs a corresponding operation according to the type of the selected first return answer. When the number of first return answers selected is greater than 1, the data processing unit controls the display unit to display a first candidate list that includes the first return answers. The data processing unit performs speech recognition on the second voice input to produce a second keyword, and selects a second return answer from the first return answers of the first candidate list according to the second keyword.

The present invention proposes an information system comprising a server and a mobile terminal device. The server stores a plurality of data and has a speech recognition function. The mobile terminal device comprises a voice receiving unit, a display unit and a data processing unit. The voice receiving unit receives a first voice input and a second voice input. The display unit displays a candidate list that includes return answers. The data processing unit is coupled to the voice receiving unit, the display unit and the server. The data processing unit performs speech recognition on the first voice input through the server to produce a first keyword, and the server selects the corresponding first return answers according to the first keyword and sends them to the data processing unit. When the number of first return answers selected is 1, the data processing unit performs a corresponding operation according to the type of the selected first return answer. When the number of first return answers selected is greater than 1, the data processing unit controls the display unit to display a first candidate list that includes the first return answers, the data processing unit performs speech recognition on the second voice input through the server to produce a second keyword, and the server selects a second return answer from the first return answers of the first candidate list according to the second keyword and sends it to the data processing unit.

The present invention proposes a selection method based on speech recognition, comprising: searching a structured database according to a first keyword to obtain at least one first return answer; when the number of first return answers selected is greater than 1, displaying a first candidate list that includes the first return answers; after displaying the first candidate list, receiving a second voice input and performing speech recognition on the second voice input to produce a second keyword; and selecting a second return answer from the first return answers of the first candidate list according to the second keyword.

In summary, the selection method based on speech recognition, and the mobile terminal device and information system, of the embodiments of the invention perform speech recognition and natural language processing on the first voice input and the second voice input to determine the keywords corresponding to each, and select among the return answers according to those keywords. The convenience of user operation is thereby improved.

To make the above features and advantages of the invention more comprehensible, embodiments are described in detail below with reference to the accompanying drawings.

Description of drawings

Fig. 1 is a block diagram of a natural language understanding system according to an embodiment of the invention.

Fig. 2 is a schematic diagram of the results produced by the natural language processor when analyzing various request information from a user, according to an embodiment of the invention.

Fig. 3A is a schematic diagram of a plurality of records with a specific data structure stored in the structured database according to an embodiment of the invention.

Fig. 3B is a schematic diagram of a plurality of records with a specific data structure stored in the structured database according to another embodiment of the invention.

Fig. 3C is a schematic diagram of the guidance data stored in the guidance data storage device according to an embodiment of the invention.

Fig. 4A is a flowchart of a retrieval method according to an embodiment of the invention.

Fig. 4B is a flowchart of the working process of a natural language understanding system according to another embodiment of the invention.

Fig. 5A is a block diagram of a natural language dialogue system according to an embodiment of the invention.

Fig. 5B is a block diagram of a natural language understanding system according to an embodiment of the invention.

Fig. 5C is a block diagram of a natural language dialogue system according to another embodiment of the invention.

Fig. 6 is a flowchart of a method for correcting a voice response according to an embodiment of the invention.

Fig. 7A is a block diagram of a natural language dialogue system according to an embodiment of the invention.

Fig. 7B is a block diagram of a natural language dialogue system according to another embodiment of the invention.

Fig. 8A is a flowchart of a natural language dialogue method according to an embodiment of the invention.

Fig. 8B is a schematic diagram of a plurality of records with a specific data structure stored in the structured database according to yet another embodiment of the invention.

Fig. 9 is a system schematic diagram of a mobile terminal device according to an embodiment of the invention.

Fig. 10 is a system schematic diagram of an information system according to an embodiment of the invention.

Fig. 11 is a flowchart of a selection method based on speech recognition according to an embodiment of the invention.

Fig. 12 is a block diagram of a voice control system according to an embodiment of the invention.

Fig. 13 is a block diagram of a voice control system according to another embodiment of the invention.

Fig. 14 is a flowchart of a voice control method according to an embodiment of the invention.

Description of reference numerals

100, 520, 520', 720, 720': natural language understanding system

102, 503, 503', 703, 902, 902': request information

104: analysis result

106: possible-intention syntax data

108, 509, 509', 711, 904, 904': keyword

110: response result

112: intention data

114: determined-intention syntax data

116: analysis result output module

200: retrieval system

220: structured database

240: search engine

260: retrieval interface unit

280: guidance data storage device

300: natural language processor

302, 832, 834, 836, 838: record

304: title field

306: content field

308: sub-field

310: guidance field

312: numeric field

314: source field

316: popularity field

318, 852, 854: preference field

320, 862, 864: dislike field

400: knowledge-assisted understanding module

500, 500', 700, 700': natural language dialogue system

501, 701: voice input

507, 507', 707: voice response

510, 710: voice sampling module

511, 511', 711, 906, 906': return answer

513, 513', 713: voice

522, 722: speech recognition module

524, 724: natural language processing module

526, 726: speech synthesis module

530, 740: speech synthesis database

702: speech synthesis processing module

715: user preference data

717: user preference record

730: property database

872, 874: field

900, 1010: mobile terminal device

908, 908': candidate list

910, 1011: voice receiving unit

920, 1013: data processing unit

930, 1015: display unit

940: storage unit

1000: information system

1020: server

SP1: first voice

SP2: second voice

1200, 1300: voice control system

1210: auxiliary activation device

1212, 1222: wireless transmission module

1214: trigger module

1216: wireless rechargeable battery

12162: battery unit

12164: wireless charging module

1220, 1320: mobile terminal device

1221: voice system

1224: voice sampling module

1226: speech synthesis module

1227: voice output interface

1228: communication module

1230: (cloud) server

1232: speech understanding module

12322: speech recognition module

12324: speech processing module

S410 to S450: steps of a retrieval method according to an embodiment of the invention

S510 to S590: steps of the working process of a natural language understanding system according to an embodiment of the invention

S602, S604, S606, S608, S610, S612: steps of the method for correcting a voice response

S802 to S890: steps of a natural language dialogue method according to an embodiment of the invention

S1100 to S1190: steps of a selection method based on speech recognition according to an embodiment of the invention

S1402 to S1412: steps of a voice control method according to an embodiment of the invention

Detailed description of the embodiments

Existing implementations that use a fixed word list can offer only rigid input rules and cope poorly with users' varied input sentences. As a result, they often misjudge the user's intention, so that the required information cannot be found or unnecessary information is output to the user. In addition, existing search engines give the user only scattered and weakly related search results, so the user must spend time inspecting them one by one to filter out the needed information, which wastes time and may cause needed information to be missed. To address these problems of the prior art, the invention proposes a retrieval method and system based on structured data, in which specific fields of the structured data store data elements of different types. When the user retrieves information by natural-speech input, the user's intention can be judged quickly and correctly, and the required information is then provided to the user, or more precise information is offered for the user to choose from.

Fig. 1 is a block diagram of a natural language understanding system according to an embodiment of the invention. As shown in Fig. 1, the natural language understanding system 100 includes a retrieval system 200, a natural language processor 300 and a knowledge-assisted understanding module 400. The knowledge-assisted understanding module 400 is coupled to the natural language processor 300 and the retrieval system 200. The retrieval system 200 further includes a structured database 220, a search engine 240 and a retrieval interface unit 260, where the search engine 240 is coupled to the structured database 220 and the retrieval interface unit 260. In this embodiment the retrieval system 200 includes the retrieval interface unit 260, but this is not intended to limit the invention; some embodiments may have no retrieval interface unit 260 and may cause the search engine 240 to perform a full-text search on the structured database 220 in some other way.

When the user sends request information 102 to the natural language understanding system 100, the natural language processor 300 analyzes the request information 102 and sends the resulting possible-intention syntax data 106 to the knowledge-assisted understanding module 400, where the possible-intention syntax data 106 includes a keyword 108 and intention data 112. The knowledge-assisted understanding module 400 then extracts the keyword 108 from the possible-intention syntax data 106 and sends it to the retrieval system 200, while storing the intention data 112 inside the knowledge-assisted understanding module 400. The search engine 240 in the retrieval system 200 performs a full-text search on the structured database 220 according to the keyword 108 and returns the response result 110 of the full-text search to the knowledge-assisted understanding module 400. Next, the knowledge-assisted understanding module 400 compares the stored intention data 112 against the response result 110 and sends the determined-intention syntax data 114 so obtained to the analysis result output module 116. According to the determined-intention syntax data 114, the analysis result output module 116 sends an analysis result 104 to a server (not shown), which looks up the data the user needs and delivers it to the user. Note that the analysis result 104 may include the keyword 108, and may also output part of the information (for example, the number of record 302) or all of the information of the record that contains the keyword 108 (for example, a record of Fig. 3A/3B). In addition, the analysis result 104 may be converted directly into voice output for the user by the server, or may undergo particular processing before the corresponding voice is output to the user (the manner and content of the "particular processing", and the information involved, are described in detail later). Those skilled in the art can design the information output by the retrieval system 200 according to actual needs, and the invention is not limited in this respect.
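The data flow just described can be sketched in a few lines. This is a minimal sketch under stated assumptions rather than the patented implementation: `PossibleIntent`, `determine_intents`, `demo_search` and `DEMO_INDEX` are hypothetical names, and the response result 110 is assumed, for illustration only, to be the set of intentions that the matched records support.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class PossibleIntent:
    """One item of possible-intention syntax data (106)."""
    intent: str                # intention data (112), e.g. "watchfilm"
    keywords: Tuple[str, ...]  # keywords (108)


def determine_intents(
    candidates: Iterable[PossibleIntent],
    full_text_search: Callable[[Tuple[str, ...]], Set[str]],
) -> List[PossibleIntent]:
    """Keep the candidates whose stored intention agrees with the search response."""
    determined = []
    for cand in candidates:
        # Only the keywords go to the retrieval system; the intention stays here.
        response = full_text_search(cand.keywords)  # response result (110), assumed form
        if cand.intent in response:
            determined.append(cand)
    return determined


# Illustrative index only: keyword -> intentions supported by matching records.
DEMO_INDEX = {"Let the Bullets Fly": {"watchfilm"}}


def demo_search(keywords: Tuple[str, ...]) -> Set[str]:
    """Toy stand-in for the search engine: intentions common to all keywords."""
    hits = [DEMO_INDEX.get(k, set()) for k in keywords]
    return set.intersection(*hits) if hits else set()
```

In this toy setup, a book candidate and a film candidate for the same keyword are both submitted, and only the film candidate survives as determined-intention data because the illustrative index holds only a film record.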

The analysis result output module 116 may be combined with other modules as circumstances require. For example, in one embodiment it may be incorporated into the knowledge-assisted understanding module 400; in another embodiment it may be separated from the natural language understanding system 100 and located in a server (for example, a server that includes the natural language understanding system), in which case the server receives the determined-intention syntax data 114 directly and processes it. In addition, the knowledge-assisted understanding module 400 may store the intention data 112 in a storage device inside the module, in the natural language understanding system 100, in a server (for example, a server that includes the natural language understanding system), or in any storage that the knowledge-assisted understanding module 400 can access; the invention is not limited in this respect. Moreover, the retrieval system 200, the natural language processor 300 and the knowledge-assisted understanding module 400 included in the natural language understanding system 100 may be implemented in hardware, software, firmware, or any combination thereof, and the invention is likewise not limited in this respect.

The natural language understanding system 100 may be located in a cloud server or in a server on a local area network, or even in a personal computer, a mobile computer (such as a notebook computer) or a mobile communication device (such as a mobile phone). The components of the natural language understanding system 100 or the retrieval system 200 need not be located in the same machine; as actual needs dictate, they may be distributed across different devices or systems and linked by various communication protocols. For example, the natural language processor 300 and the knowledge-assisted understanding module 400 may be arranged in the same smartphone while the retrieval system 200 is arranged in a cloud server; or the retrieval interface unit 260, the natural language processor 300 and the knowledge-assisted understanding module 400 may be arranged in the same notebook computer while the search engine 240 and the structured database 220 are arranged in another server on the local area network. In addition, when the whole natural language understanding system 100 is located in a server (whether a cloud server or a local-area-network server), the retrieval system 200, the natural language processor 300 and the knowledge-assisted understanding module 400 may be arranged in different hosts, with the server's main system coordinating the transfer of messages and data among them. Of course, two or all of the retrieval system 200, the natural language processor 300 and the knowledge-assisted understanding module 400 may also be merged into one host as actual needs dictate; the invention does not limit this configuration.

In embodiments of the invention, the user can send request information to the natural language processor 300 in various ways, for example by spoken voice input or by text. For instance, if the natural language understanding system 100 is located in a cloud or local-area-network server (not shown), the user first enters the request information 102 on a mobile device (for example, a mobile phone, PDA, tablet computer or similar system). The telecommunications operator then transmits the request information 102 to the natural language understanding system 100 in the server, where the natural language processor 300 analyzes it. Once the server has confirmed the user's intention, the analysis result output module 116 passes the corresponding analysis result 104 to the server for processing, and the information requested by the user is returned to the user's mobile device. For instance, the request information 102 may be a question that the user wishes the natural language understanding system 100 to answer (for example, "What will the weather be like in Shanghai tomorrow?"). When the natural language understanding system 100 determines that the user's intention is to query tomorrow's weather in Shanghai, the analysis result output module 116 delivers the retrieved weather data to the user as the analysis result 104. In addition, if the user's instruction to the natural language understanding system 100 is "I want to see Let the Bullets Fly" or "I want to listen to The Days We Passed Together", then, because "Let the Bullets Fly" or "The Days We Passed Together" may belong to different fields, the natural language processor 300 parses the user's request information 102 into one or more possible-intention syntax data 106, each of which includes a keyword 108 and intention data 112. A full-text search is then performed on the structured database 220 in the retrieval system 200 to confirm the user's intention.

Furthermore, when the user's request information 102 is "What will the weather be like in Shanghai tomorrow?", the natural language processor 300 analyzes it and produces one possible-intention syntax data 106:

"<queryweather>, <city>=Shanghai, <time>=tomorrow".

In one embodiment, if the natural language understanding system 100 considers the user's intention sufficiently clear, the analysis result output module 116 can directly output the user's intention (that is, querying tomorrow's weather in Shanghai) as the analysis result 104 to the server, and the server looks up the weather for the day the user specified and sends it to the user. As another example, when the user's request information 102 is "I want to see Romance of the Three Kingdoms", the natural language processor 300 analyzes it and produces three possible-intention syntax data 106:

"<readbook>, <bookname>=Romance of the Three Kingdoms";

"<watchTV>, <TVname>=Romance of the Three Kingdoms"; and

"<watchfilm>, <filmname>=Romance of the Three Kingdoms".

This is because the keyword 108 in the possible-intention syntax data 106 (that is, "Romance of the Three Kingdoms") may belong to different fields, namely the three fields of books (<readbook>), TV series (<watchTV>) and films (<watchfilm>). The request information 102 is therefore parsed into a plurality of possible-intention syntax data 106, which must be analyzed further by the knowledge-assisted understanding module 400 to confirm the user's intention. As a further example, if the user inputs "I want to see Let the Bullets Fly", then, because "Let the Bullets Fly" may be either a film title or a book title, at least the following two possible-intention syntax data 106 may appear:

"<readbook>, <bookname>=Let the Bullets Fly"; and

"<watchfilm>, <filmname>=Let the Bullets Fly";

which belong to the fields of books and films respectively. These possible-intention syntax data 106 are subsequently analyzed further by the knowledge-assisted understanding module 400, which derives from them the determined-intention syntax data 114 that expresses the definite intention of the user's request information. When the knowledge-assisted understanding module 400 analyzes the possible-intention syntax data 106, it sends the keyword 108 (for example, "Romance of the Three Kingdoms" or "Let the Bullets Fly" above) to the retrieval system 200 through the retrieval interface unit 260. The structured database 220 in the retrieval system 200 stores a plurality of records with a specific data structure. The search engine 240 uses the keyword 108 received through the retrieval interface unit 260 to perform a full-text search on the structured database 220, and returns the response result 110 obtained by the full-text search to the knowledge-assisted understanding module 400, which then derives the determined-intention syntax data 114 from the response result 110. The details of performing a full-text search on the structured database 220 to determine the determined-intention syntax data 114 are described more fully later with reference to Fig. 3A, Fig. 3B and the related paragraphs.
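How a single request fans out into several possible-intention syntax data can be illustrated as follows. This is a hedged sketch: the source does not disclose the grammar used by the natural language processor, so the `GRAMMAR` table, the English trigger phrases and the function `parse_request` are hypothetical stand-ins chosen only to reproduce the examples above.

```python
from typing import Dict, List, Tuple

# Hypothetical grammar table: one trigger phrase may correspond to several
# intentions, each paired with the slot name that receives the keyword.
GRAMMAR: Dict[str, List[Tuple[str, str]]] = {
    "I want to see": [
        ("readbook", "bookname"),
        ("watchTV", "TVname"),
        ("watchfilm", "filmname"),
    ],
    "I want to listen to": [("playmusic", "songname")],
}


def parse_request(text: str) -> List[Dict[str, str]]:
    """Return every possible-intention item that the request text could express."""
    for phrase, intents in GRAMMAR.items():
        if text.startswith(phrase):
            keyword = text[len(phrase):].strip()
            if not keyword:
                return []
            # One candidate per intention; all of them share the same keyword.
            return [{"intent": intent, slot: keyword} for intent, slot in intents]
    return []
```

Under these assumptions, `parse_request("I want to see Romance of the Three Kingdoms")` yields three candidates (book, TV series and film) that still have to be disambiguated by a search of the structured database.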

Under the concept of the invention, the natural language understanding system 100 first extracts the keyword 108 from the user's request information 102 and determines the field attribute of the keyword 108 from the full-text search result of the structured database 220. For the input "I want to see Romance of the Three Kingdoms" above, for example, possible-intention syntax data 106 belonging to the three fields of books, TV series and films are produced, and are then analyzed further to confirm the user's definite intention. The user can therefore express an intention or information very easily in a colloquial manner, without having to memorize particular terms such as those of the fixed word list in existing approaches.

Fig. 2 is a schematic diagram of the results produced by the natural language processor 300 when analyzing various request information from a user, according to an embodiment of the invention.

As shown in Fig. 2, when the user's request information 102 is "What will the weather be like in Shanghai tomorrow?", the natural language processor 300 analyzes it and produces the possible-intention syntax data 106:

"<queryweather>, <city>=Shanghai, <time>=tomorrow"

where the intention data 112 is "<queryweather>" and the keywords 108 are "Shanghai" and "tomorrow". Because the analysis by the natural language processor 300 yields only one set of possible-intention syntax data 106 (query weather, <queryweather>), in one embodiment the knowledge-assisted understanding module 400 can directly extract the keywords 108 "Shanghai" and "tomorrow" and send them to the server as the analysis result 104 to query the weather information (for example, tomorrow's weather overview for Shanghai, including weather conditions, temperature and so on), without performing a full-text search on the structured database 220 to determine the user's intention. Of course, in one embodiment a full-text search may still be performed on the structured database 220 for a more precise determination of the user's intention; those skilled in the art may vary this according to actual needs.

In addition, when user's solicited message 102 is " I will see and allow bullet fly ", may be intended to syntax data 106 because can produce two:

"＜readbook 〉,＜bookname 〉=allow bullet fly "; And

"＜watchfilm 〉,＜filmname 〉=allow bullet fly ";

The key word 108 identical with two corresponding intention data 112 "＜readbook〉" and "＜watchfilm〉" and two " allows bullet fly ", represents that its intention may be the film of seeing the books of " allowing bullet fly " or seeing " allowing bullet fly ".For further confirming user's intention, to transmit key word 108 by the auxiliary Understanding Module 400 of knowledge " allows bullet fly " to Retrieval Interface unit 260, then Search engine 240 " allows bullet fly " by this key word 108 to come structured database 220 is carried out full-text search, to confirm that " allowing bullet fly " should be that title claims or movie name, use the intention of confirming the user.

Furthermore, when the user's request information 102 is "I want to listen to Days of Walking Together", two possible intent syntax data 106 may be produced:

"<playmusic>, <singer>=Walking Together, <songname>=Days"; and

"<playmusic>, <songname>=Days of Walking Together"

These contain two identical intent data 112, "<playmusic>", and two groups of keywords 108: "Walking Together" with "Days", and "Days of Walking Together". They indicate, respectively, that the intent may be to listen to the song "Days" sung by a singer named "Walking Together", or to listen to the song "Days of Walking Together". In this case the knowledge-assisted understanding module 400 sends the first keyword group 108 ("Walking Together" and "Days") and the second keyword group ("Days of Walking Together") to the search interface unit 260 to check whether there is a song "Days" sung by a singer "Walking Together" (the intent implied by the first keyword group) or a song "Days of Walking Together" (the intent implied by the second keyword group), thereby confirming the user's intent. The present invention is not, however, limited to the formats and names of the possible intent syntax data and intent data shown here.

Fig. 3A is a schematic diagram of a plurality of records with a specific data structure stored in the structured database 220 according to an embodiment of the present invention.

Generally speaking, in some existing full-text search approaches, the search results obtained are unstructured data (for example, results from Google or Baidu). Because the items in such results are scattered and unrelated to one another, the user must inspect them one by one, which limits their practicality. In the concept of the present invention, by contrast, a structured database effectively improves the efficiency and accuracy of retrieval. The value data contained in each record of the disclosed structured database are related to one another, and together they express the attributes of that record. Thus, when the search engine performs a full-text search on the structured database and a value data of a record matches a keyword, the guide data corresponding to that value data can be output and used to confirm the intent of the request information. The implementation details are described further in the following examples.

In an embodiment of the present invention, each record 302 stored in the structured database 220 includes a title field 304 and a content field 306. The title field 304 includes a plurality of sub-fields 308, and each sub-field includes a guide field 310 and a value field 312; the guide fields 310 of the records 302 store guide data, and the value fields 312 of the records 302 store value data. Taking record 1 in Fig. 3A as an example, the three sub-fields 308 in the title field 304 of record 1 store, respectively:

"singerguid: Liu Dehua",

"songnameguid: Days of Walking Together"; and

"songtypeguid: Hong Kong and Taiwan, Cantonese, pop";

The guide fields 310 of these sub-fields 308 store the guide data "singerguid", "songnameguid" and "songtypeguid" respectively, and the value fields 312 of the corresponding sub-fields 308 store the value data "Liu Dehua", "Days of Walking Together" and "Hong Kong and Taiwan, Cantonese, pop" respectively. The guide data "singerguid" indicates that the field type of the value data "Liu Dehua" is singer name (singer); the guide data "songnameguid" indicates that the field type of the value data "Days of Walking Together" is song title (song); and the guide data "songtypeguid" indicates that the field type of the value data "Hong Kong and Taiwan, Cantonese, pop" is song type (song type). In practice, each guide data may be represented by a different specific string of digits or characters; the invention is not limited in this respect. The content field 306 of record 1 stores the lyrics of the song "Days of Walking Together" or other data (for example, the composer and lyricist). The actual data in the content field 306 of each record is not the focus of the present invention, however, and is therefore depicted only schematically in Fig. 3A.

In the foregoing embodiment, each record includes a title field 304 and a content field 306, and each sub-field 308 in the title field 304 includes a guide field 310 and a value field 312. This is not intended to limit the present invention: some embodiments may have no content field 306, and some embodiments may even have no guide field 310.

In addition, in an embodiment of the present invention, a first special character is stored between the data of adjacent sub-fields 308 to separate them, and a second special character is stored between the data of the guide field 310 and the value field 312 to separate the guide field from the value field. For example, as shown in Fig. 3A, "singerguid" and "Liu Dehua", "songnameguid" and "Days of Walking Together", and "songtypeguid" and "Hong Kong and Taiwan, Cantonese, pop" are each separated by the second special character ":", while the sub-fields 308 of record 1 are separated from one another by the first special character "|". The present invention is not, however, limited to ":" or "|" as the separating special characters.
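
The separator scheme just described can be sketched as follows. This is only an illustrative reading of the title field layout: the function name parse_title_field and the sample string (modeled on record 1 of Fig. 3A) are assumptions, and the separators are passed as parameters because the text does not fix them to ":" and "|".

```python
def parse_title_field(title_field, sub_field_sep="|", guide_value_sep=":"):
    """Split a title field into (guide data, value data) pairs."""
    pairs = []
    for sub_field in title_field.split(sub_field_sep):   # first special character
        sub_field = sub_field.strip()
        if not sub_field:
            continue
        # Split at the first occurrence of the second special character only,
        # so the value data may itself contain commas or spaces.
        guide, _, value = sub_field.partition(guide_value_sep)
        pairs.append((guide.strip(), value.strip()))
    return pairs

# Hypothetical title field modeled on record 1.
record_1_title = (
    "singerguid: Liu Dehua | songnameguid: Days of Walking Together | "
    "songtypeguid: Hong Kong and Taiwan, Cantonese, pop"
)
pairs = parse_title_field(record_1_title)
```

Each resulting pair couples a value data with the guide data that names its field type, which is the association the later matching steps rely on.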

On the other hand, in an embodiment of the present invention, each sub-field 308 in the title field 304 may have a fixed number of bits. For example, each sub-field 308 may be fixed at 32 characters, of which the guide field 310 may be fixed at 7 or 8 bits (enough to index at most 128 or 256 different guide data). Because the number of bits needed for the first special character and the second special character can also be fixed, the bits remaining in a sub-field 308 after deducting those occupied by the guide field 310, the first special character and the second special character can all be used to store the value data of the value field 312. Moreover, the size of a sub-field 308 is fixed and, as shown in Fig. 3A, its contents are stored in the order guide field 310 (an index to the guide data), second special character, value data of the value field 312, first special character, with the size of each of these four items also fixed as described above. In practice, therefore, the value data of the value field 312 can be obtained directly by skipping the bits of the guide field 310 (for example, the first 7 or 8 bits), skipping the bits of the second special character (for example, 1 more character, that is, 8 bits), and excluding the bits occupied by the first special character (for example, the last character, 8 bits). For example, the value data "Liu Dehua" is taken directly from the first sub-field 308 of record 1; here 32-3=29 characters are available for the value data of the value field 312, where the 3 (that is, 1+1+1) represents the 1 character each occupied by the guide data of the guide field 310, the first special character and the second special character. The required field-type determination is then performed. After the comparison of the currently extracted value data is complete (whether or not it succeeds), the value data of the next sub-field 308 can be extracted in the same way (for example, the value data "Days of Walking Together" is taken directly from the second sub-field 308 of record 1) and its field type compared. This extraction can begin with record 1; after all the value data of record 1 have been compared, the value data of the first sub-field 308 in the title field 304 of record 2 (for example, "Feng Xiaogang") is extracted and compared. The comparison continues until the value data of all records have been compared.
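
The fixed-width arithmetic above (32 - 3 = 29 characters of value data per sub-field) can be checked with a small sketch. It is an illustration under stated assumptions rather than the disclosed storage format: the guide index is held in one character, unused value positions are padded with spaces, and the helper names pack_sub_field, value_at and guide_index_at are hypothetical.

```python
SUB_FIELD_CHARS = 32   # fixed size of one sub-field 308
GUIDE_CHARS = 1        # guide field 310: a one-character index (assumption)
SEPARATOR_CHARS = 1    # each special character occupies one character
VALUE_CHARS = SUB_FIELD_CHARS - GUIDE_CHARS - 2 * SEPARATOR_CHARS  # 32 - 3 = 29

def pack_sub_field(guide_index, value):
    """Build one sub-field: guide index, ':', space-padded value, '|'."""
    if len(value) > VALUE_CHARS:
        raise ValueError("value data does not fit in the sub-field")
    return chr(guide_index) + ":" + value.ljust(VALUE_CHARS) + "|"

def value_at(title_field, n):
    """Return the value data of the n-th sub-field (0-based) by offset alone."""
    start = n * SUB_FIELD_CHARS + GUIDE_CHARS + SEPARATOR_CHARS
    return title_field[start:start + VALUE_CHARS].rstrip()

def guide_index_at(title_field, n):
    """Return the guide index stored at the start of the n-th sub-field."""
    return ord(title_field[n * SUB_FIELD_CHARS])

# Two sub-fields modeled on record 1; the index values 0 and 1 are arbitrary.
title = pack_sub_field(0, "Liu Dehua") + pack_sub_field(1, "Days of Walking Together")
second_value = value_at(title, 1)
```

Because every offset is a multiple of the sub-field size, no separator scanning is needed to reach a given value data; the trade-off in this sketch is that trailing spaces in a value cannot be distinguished from padding.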

It should be noted that the number of bits of the sub-field 308 and the numbers of bits used by the guide field 310, the first special character and the second special character may vary with the application; the invention is not limited in this respect. Extracting value data by comparison as described above is one embodiment and is not intended to limit the invention; another embodiment may use full-text search. In addition, skipping the guide field 310, the second special character and the first special character may be implemented with a bit shift (for example, division), and may be carried out in hardware, software or a combination of both; those skilled in the art may vary this according to actual requirements. In another embodiment of the present invention, each sub-field 308 in the title field 304 may have a fixed number of bits, the guide field 310 within the sub-field 308 may have another fixed number of bits, and the title field 304 may contain neither the first nor the second special character. Because the sizes of each sub-field 308 and each guide field 310 are fixed, the guide data or value data in each sub-field 308 can be extracted directly by skipping a specific number of bits or by a bit shift (for example, division).

It should also be noted that, because the sub-fields 308 have a fixed number of bits as mentioned above, a counter can be used in the natural language understanding system 100 (or in the server containing the natural language understanding system 100) to record which sub-field 308 of which record is currently being compared. Another counter can likewise store the order of the record being compared. For example, suppose a first counter records the order of the record currently being compared and a second counter records the order of the sub-field currently being compared. If the third sub-field 308 of record 2 in Fig. 3A (that is, "filenameguid: Huayi Brothers") is currently being compared, the first counter stores 2 (the record being compared is record 2) and the second counter stores 3 (the sub-field being compared is the third sub-field 308). Moreover, the purpose of storing the guide data of the guide field 310 in only 7 or 8 bits, as described above, is to devote most of the bits of the sub-field 308 to storing value data. The 7 or 8 bits serve as an index with which the actual guide data is read from the guide data storage device 280 of the retrieval system 200, where the guide data is stored in table form, although any other arrangement that the retrieval system 200 can access may also be used in the present invention. Thus, in actual operation, in addition to directly extracting value data for comparison, when a match is produced the guide data can be fetched directly according to the values of the two counters and given to the knowledge-assisted understanding module 400 as the response result 110. For example, when the second sub-field 308 of record 6 (that is, "songnameguid: Betrayal") matches, the current values of the first and second counters are 6 and 2 respectively, so these two values can be used to query the guide data storage device 280 shown in Fig. 3C and find that the guide data of sub-field 2 of record 6 is "songnameguid". In one embodiment, once the size of the sub-field 308 is fixed, all of its bits may be used to store value data, so that the guide field, the first special character and the second special character can be removed entirely. The search engine 240 need only know that each time a fixed number of bits is passed another sub-field 308 begins, and add one to the second counter (and, of course, add one to the first counter each time retrieval moves on to the next record). This leaves more bits for storing value data.
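
The two-counter lookup can be sketched as below. This is a simplified illustration, not the disclosed retrieval system 200: the dictionaries records and guide_table are hypothetical stand-ins for the value-only records and for the guide data storage device 280 of Fig. 3C, and their contents are modeled on records 6 and 7 as described in the text.

```python
# Value data only, one list per record number (hypothetical excerpt).
records = {
    6: ["Xiao Jingteng, Yang Zongwei, Cao Ge", "Betrayal"],
    7: ["Xiao Jingteng", "Betrayal"],
}
# Stand-in for the guide data table: guide data per record and sub-field.
guide_table = {
    6: ["singerguid", "songnameguid"],
    7: ["singerguid", "songnameguid"],
}

def find_guide_data(keyword):
    """Scan every sub-field; on a match, look up guide data by the counters."""
    hits = []
    for record_counter in sorted(records):                  # first counter
        values = records[record_counter]
        for field_counter, value in enumerate(values, 1):   # second counter
            if value == keyword:
                guide = guide_table[record_counter][field_counter - 1]
                hits.append((record_counter, field_counter, guide))
    return hits

hits = find_guide_data("Betrayal")
```

For the keyword "Betrayal", the first hit carries the counter values 6 and 2, which is the pair used in the example above to read "songnameguid" from the table.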

Another practical example illustrates how, when a comparison produces a match, the matched record 110 is returned to the knowledge-assisted understanding module 400 for further processing. With the data structure of the records 302 described above, in an embodiment of the present invention, when the user's request information 102 is "I want to see Let the Bullets Fly", two possible intent syntax data 106 may be produced:

"<readbook>, <bookname>=Let the Bullets Fly"; and

"<watchfilm>, <filmname>=Let the Bullets Fly";

The search engine 240 uses the keyword 108 "Let the Bullets Fly", received through the search interface unit 260, to perform a full-text search on the title fields 304 of the records stored in the structured database 220 of Fig. 3A. The full-text search finds record 5, whose title field 304 stores the value data "Let the Bullets Fly", and a match is therefore produced. Next, the retrieval system 200 returns the guide data "filmnameguid", which corresponds to the keyword 108 "Let the Bullets Fly" in the title field 304 of record 5, to the knowledge-assisted understanding module 400 as the response result 110. Because the title field of record 5 contains the guide data "filmnameguid" for the value data "Let the Bullets Fly", the knowledge-assisted understanding module 400 compares the guide data "filmnameguid" of record 5 with the intent data 112 "<watchfilm>" and "<readbook>" of the previously stored possible intent syntax data 106, and can determine that the confirmed intent syntax data 114 of this request information is "<watchfilm>, <filmname>=Let the Bullets Fly" (because both contain "film"). In other words, "Let the Bullets Fly" in the user's request information 102 is a film title, and the intent of the request information 102 is to watch the film "Let the Bullets Fly", not to read a book.
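
The comparison performed by the knowledge-assisted understanding module 400 can be sketched as follows. The text only says that the guide data and the intent data both contain "film"; the rule used here, that a guide data is a slot name followed by the suffix "guid", is an assumption made for this example, as is the function name confirm_intent.

```python
def confirm_intent(candidates, keyword, guide_data):
    """Return the first candidate whose slot matches the returned guide data."""
    # Assumption: "filmnameguid" names the slot "filmname".
    if guide_data.endswith("guid"):
        slot = guide_data[:-len("guid")]
    else:
        slot = guide_data
    for intent, slots in candidates:
        if slots.get(slot) == keyword:
            return intent, slots
    return None   # no candidate is consistent with the guide data

# The two possible intent syntax data from the example above.
candidates = [
    ("readbook", {"bookname": "Let the Bullets Fly"}),
    ("watchfilm", {"filmname": "Let the Bullets Fly"}),
]
confirmed = confirm_intent(candidates, "Let the Bullets Fly", "filmnameguid")
```

Here the guide data "filmnameguid" selects the "watchfilm" candidate and rejects "readbook", mirroring the determination described in the text.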

A further practical example follows. When the user's request information 102 is "I want to listen to Days of Walking Together", two possible intent syntax data 106 may be produced:

"<playmusic>, <singer>=Walking Together, <songname>=Days"; and

"<playmusic>, <songname>=Days of Walking Together";

The search engine 240 uses the two groups of keywords 108 received through the search interface unit 260:

"Walking Together" and "Days"; and

"Days of Walking Together"

to perform a full-text search on the title fields 304 of the records stored in the structured database 220 of Fig. 3A. In the full-text search, no match is found in the title fields 304 of any record for the first keyword group 108 ("Walking Together" and "Days"), but record 1 is found for the second keyword group 108 ("Days of Walking Together"). The retrieval system 200 therefore returns the guide data "songnameguid", which corresponds to the second keyword group 108 in the title field 304 of record 1, to the knowledge-assisted understanding module 400 as the matched record 110. Next, after receiving the guide data "songnameguid" for the value data "Days of Walking Together", the knowledge-assisted understanding module 400 compares it with the intent data 112 (that is, <singer>, <songname> and so on) in the possible intent syntax data 106 (that is, "<playmusic>, <singer>=Walking Together, <songname>=Days" and "<playmusic>, <songname>=Days of Walking Together"). It thus finds that the user's request information 102 contains no data describing a singer name but does contain data describing the song title "Days of Walking Together" (because only <songname> is matched successfully). From this comparison the knowledge-assisted understanding module 400 determines that the confirmed intent syntax data 114 of the request information 102 is "<playmusic>, <songname>=Days of Walking Together", and that the intent of the user's request information 102 is to listen to the song "Days of Walking Together".

In another embodiment of the present invention, the retrieved response result 110 may be a complete matched record that matches the keywords 108 completely or a partial matched record that matches the keywords 108 partially. For example, if the user's request information 102 is "I want to listen to Xiao Jingteng's Betrayal", the natural language processor 300, after analysis, likewise produces two possible intent syntax data 106:

"<playmusic>, <singer>=Xiao Jingteng, <songname>=Betrayal"; and

"<playmusic>, <songname>=Xiao Jingteng's Betrayal";

and sends two groups of keywords 108:

"Xiao Jingteng" and "Betrayal"; and

"Xiao Jingteng's Betrayal";

to the search interface unit 260. The search engine 240 then uses the keywords 108 received through the search interface unit 260 to perform a full-text search on the title fields 304 of the records 302 stored in the structured database 220 of Fig. 3A. In the full-text search, the second keyword group 108 ("Xiao Jingteng's Betrayal") matches no record, but the first keyword group 108 ("Xiao Jingteng" and "Betrayal") finds matches in record 6 and record 7. Because the first keyword group 108 matches only part of the value data in record 6 and does not match the other value data "Yang Zongwei" and "Cao Ge", record 6 is a partial matched record (note that record 5 for the request information 102 "I want to see Let the Bullets Fly" and record 1 for the request information "I want to listen to Days of Walking Together" above are likewise partial matched records). The keywords "Xiao Jingteng" and "Betrayal" match all the value data of record 7, so record 7 is a complete matched record.

In an embodiment of the present invention, when the search interface unit 260 outputs a plurality of matched records 110 to the knowledge-assisted understanding module 400, it may output complete matched records (all value data matched) and partial matched records (only some value data matched) in order, the priority of a complete matched record being higher than that of a partial matched record. Therefore, when the search interface unit 260 outputs the matched records 110 of record 6 and record 7, the output priority of record 7 is higher than that of record 6, because all the value data of record 7 ("Xiao Jingteng" and "Betrayal") are matched, whereas record 6 also contains "Yang Zongwei" and "Cao Ge", which are not. That is, the more closely a record stored in the structured database 220 matches the keywords 108 of the request information 102, the more readily it is output first, so that the user can consult or select the corresponding confirmed intent syntax data 114. In another embodiment, the matched record 110 corresponding to the highest-priority record may be output directly and used as the confirmed intent syntax data 114. The foregoing is not intended to limit the present invention: another embodiment may output a matched record as soon as one is found (for example, with the request information 102 "I want to listen to Xiao Jingteng's Betrayal", as soon as record 6 is retrieved and produces a match, the guide data corresponding to record 6 is output as the matched record 110), without any priority ordering, to speed up retrieval. In another embodiment, the processing corresponding to the highest-priority record may be carried out directly and provided to the user. For example, when the highest-priority result is to play the film Romance of the Three Kingdoms, the film may be played directly for the user. Likewise, if the highest-priority result is Betrayal performed by Xiao Jingteng, the song may be played directly for the user. It should be noted that this is for illustrative purposes only and the present invention is not limited thereto.
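
The ordering of complete and partial matched records can be sketched as follows. This is a simplified illustration: each record is reduced to a flat list of value data modeled on records 1, 6 and 7 (other fields such as the song type are left out), a keyword group is taken to match a record only when every keyword in the group is found in it, and the function name rank_matches is hypothetical.

```python
# Hypothetical, simplified records: record number -> value data.
records = {
    1: ["Liu Dehua", "Days of Walking Together"],
    6: ["Xiao Jingteng", "Yang Zongwei", "Cao Ge", "Betrayal"],
    7: ["Xiao Jingteng", "Betrayal"],
}

def rank_matches(records, keywords):
    """Return matching record numbers, complete matches before partial ones."""
    wanted = set(keywords)
    complete, partial = [], []
    for number, values in records.items():
        stored = set(values)
        if not wanted <= stored:
            continue                  # some keyword is missing: no match
        if stored == wanted:
            complete.append(number)   # every value data is matched
        else:
            partial.append(number)    # some value data remain unmatched
    return complete + partial

ranking = rank_matches(records, ["Xiao Jingteng", "Betrayal"])
```

An embodiment that outputs the first match immediately, without ordering, would simply return as soon as the subset test succeeds.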

In yet another embodiment of the present invention, if the user's request information 102 is "I want to listen to Liu Dehua's Betrayal", one of its possible intent syntax data 106 is:

"<playmusic>, <singer>=Liu Dehua, <songname>=Betrayal";

If the search interface unit 260 inputs the keywords 108 "Liu Dehua" and "Betrayal" to the search engine 240 together, no match is found in the database of Fig. 3A. In another embodiment of the present invention, the search interface unit 260 may input the keywords 108 "Liu Dehua" and "Betrayal" to the search engine 240 separately, and find that "Liu Dehua" is a singer name (guide data singerguid) and that "Betrayal" is a song title (guide data songnameguid, sung either by Xiao Jingteng or by Xiao Jingteng, Yang Zongwei and Cao Ge together). At this point, the natural language understanding system 100 can further prompt the user: "Is the song Betrayal the one sung by Xiao Jingteng?" (according to the match with record 7), or "Is it the one sung together by Xiao Jingteng, Yang Zongwei and Cao Ge?" (according to the match with record 6).
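
The fallback of searching each keyword separately can be sketched as below. The record contents are hypothetical and modeled on records 1, 6 and 7; the helper names lookup and suggest_singers are likewise assumptions, and the sketch only gathers the material from which a prompt to the user could be built.

```python
# Hypothetical records: record number -> {guide data: [value data, ...]}.
records = {
    1: {"singerguid": ["Liu Dehua"],
        "songnameguid": ["Days of Walking Together"]},
    6: {"singerguid": ["Xiao Jingteng", "Yang Zongwei", "Cao Ge"],
        "songnameguid": ["Betrayal"]},
    7: {"singerguid": ["Xiao Jingteng"],
        "songnameguid": ["Betrayal"]},
}

def lookup(keyword):
    """Return (record number, guide data) for every field holding the keyword."""
    return [(number, guide)
            for number, fields in records.items()
            for guide, values in fields.items()
            if keyword in values]

def suggest_singers(song):
    """List the singers of every record whose song title equals `song`."""
    return [records[number]["singerguid"]
            for number, guide in lookup(song)
            if guide == "songnameguid"]

singer_hits = lookup("Liu Dehua")        # "Liu Dehua" is a singer name
suggestions = suggest_singers("Betrayal")
```

Each entry in suggestions corresponds to one question that could be put to the user, such as whether the version sung by Xiao Jingteng is the one intended.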

In yet another embodiment of the present invention, the records stored in the structured database 220 may further include a source field 314 and a heat field 316. In the database shown in Fig. 3B, each record includes, in addition to the fields of Fig. 3A, a source field 314, a heat field 316, a preference field 318 and a dislike field 320. The source field 314 of each record stores a source value indicating which structured database the record comes from (only the structured database 220 is shown in this figure, but in practice there may be more, different structured databases) or which user or server provided it. The natural language understanding system 100 can also retrieve the structured database of a particular source according to the preferences the user has revealed in previous request information 102. The heat field 316 of each record 302 stores a search heat value or popularity value of the record 302 (for example, the number of times or the probability with which the record has been matched by a single user, a specific user group or all users within a specific period; when a full-text search with the keywords 108 of the request information 102 produces a match, the heat value of the record is incremented by one), for reference by the knowledge-assisted understanding module 400 when judging the user's intent. The use of the preference field 318 and the dislike field 320 is described in detail later. Specifically, when the user's request information 102 is "I want to see Romance of the Three Kingdoms", the natural language processor 300, after analysis, may produce a plurality of possible intent syntax data 106:

"<readbook>, <bookname>=Romance of the Three Kingdoms";

"<watchTV>, <TVname>=Romance of the Three Kingdoms"; and

"<watchfilm>, <filmname>=Romance of the Three Kingdoms".

If the natural language understanding system 100 finds from the history of the user's request information 102 (for example, by using the heat field 316 to store the number of times the record 302 has been selected by a given user) that most of the user's requests are to watch films, the natural language understanding system 100 can search the structured database that stores film records (in which case the source value in the source field 314 is the code of the structured database storing film records) and can thus preferentially determine that "<watchfilm>, <filmname>=Romance of the Three Kingdoms" is the confirmed intent syntax data 114. For example, in one embodiment, each time a record 302 is matched, one may be added to its heat field 316, which serves as the user's history. Then, when a full-text search is performed with the keyword 108 "Romance of the Three Kingdoms", the record 302 with the highest value in the heat field 316 can be selected from all the matches and used to judge the user's intent. In one embodiment, if the natural language understanding system 100 determines from the search results for the keyword 108 "Romance of the Three Kingdoms" that the record for the television program "Romance of the Three Kingdoms" has the highest search heat value in its heat field 316, it can preferentially determine that "<watchTV>, <TVname>=Romance of the Three Kingdoms" is the confirmed intent syntax data 114. In addition, the way in which the value stored in the heat field 316 is changed may be varied by the computer system on which the natural language understanding system 100 runs; the present invention is not limited in this respect. The value of the heat field 316 may also decrease over time, to indicate that the user's interest in a given record 302 is gradually fading; the present invention is not limited in this respect either.
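
The handling of the heat field 316 can be sketched as follows. This is a minimal sketch under stated assumptions: the heat values are illustrative, the function names record_match, hottest and decay are hypothetical, and the multiplicative decay rule is only one possible reading, since the text says no more than that the value may decrease over time.

```python
# Illustrative heat values: record number -> value of the heat field 316.
heat = {8: 2, 9: 5, 10: 8}

def record_match(heat, number):
    """Add one to the heat value of a record each time it is matched."""
    heat[number] = heat.get(number, 0) + 1

def hottest(heat, matched):
    """Pick, among the matched records, the one with the highest heat value."""
    return max(matched, key=lambda number: heat.get(number, 0))

def decay(heat, factor=0.5):
    """Assumed decay rule: scale every heat value down by a fixed factor."""
    for number in heat:
        heat[number] *= factor

best = hottest(heat, [8, 9, 10])
```

In practice decay would be called on some schedule chosen by the system operator, so that records matched long ago gradually lose priority to recently matched ones.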

As another example, in another embodiment, the user may be particularly fond of watching the television series Romance of the Three Kingdoms during a certain period. Because the series may be very long and the user cannot finish it in a short time, the user may select it repeatedly within a short period (assuming that each match adds one to the value in the heat field 316), so that a given record 302 is matched repeatedly; all of this can be learned by analyzing the data of the heat field 316. Moreover, in another embodiment, a telecommunications operator can also use the heat field 316 to represent how popular the data provided by a given source is, and the code of the data supplier can be stored in the source field 314. For example, suppose the "Romance of the Three Kingdoms television series" offered by a certain supplier has the highest probability of being selected. When a user inputs the request information 102 "I want to see Romance of the Three Kingdoms", a full-text search of the database of Fig. 3B finds three matches: reading the book Romance of the Three Kingdoms (record 8), watching the television series Romance of the Three Kingdoms (record 9) and watching the film Romance of the Three Kingdoms (record 10). However, because the data in the heat field 316 show that watching the television series Romance of the Three Kingdoms is currently the most popular option (that is, the heat field values of records 8, 9 and 10 are 2, 5 and 8 respectively), the guide data of record 10 is output first as the matched record 110 to the knowledge-assisted understanding module 400, as the top-priority option for judging the user's intent. In one embodiment, the data of the source field 314 may be shown to the user at the same time, so that the user can judge whether the television series he or she wants to watch is provided by a certain supplier. It should be noted that the data stored in the source field 314 and the way it is changed may also be varied by the computer system on which the natural language understanding system 100 runs; the present invention is not limited in this respect. It should also be noted, as those skilled in the art will appreciate, that the information stored in the heat field 316, preference field 318 and dislike field 320 of Fig. 3B can be further divided into two parts, one relating to the individual user and one relating to all users. The heat field 316, preference field 318 and dislike field 320 information relating to the individual user is stored in the user's mobile phone, while the server stores the heat field 316, preference field 318 and dislike field 320 information relating to all users. In this way, personal preference information relating to an individual user's selections or intents is stored only in the user's own mobile communication device (for example, a mobile phone or tablet computer), while the server stores the information relating to all users, which not only saves storage space on the server but also keeps the individual user's preferences confidential.

Clearly, the value data contained in each record of the disclosed structured database are related to one another (for example, the value data "Liu Dehua", "Days of Walking Together" and "Hong Kong and Taiwan, Cantonese, pop" in record 1 all describe features of record 1), and together they express the intent of the user's request information toward that record (for example, when "Days of Walking Together" produces a match, the user's intent may be to access the data of record 1). Thus, when the search engine performs a full-text search on the structured database and a value data of a record is matched, the guide data corresponding to that value data is output (for example, "songnameguid" is output as the response result 110), and the intent of the request information is then confirmed (for example, by comparison in the knowledge-assisted understanding module 400).

Based on the content disclosed or taught in the above exemplary embodiments, Fig. 4A is a flowchart of a search method according to an embodiment of the invention. Referring to Fig. 4A, the search method of this embodiment includes the following steps:

Provide a structured database that stores a plurality of records (step S410);

Receive at least one keyword (step S420);

Perform a full-text search on the header fields of the plurality of records using the keyword (step S430). For example, keyword 108 is input to retrieval interface unit 260 so that search engine 240 performs a full-text search on the header fields 304 of the plurality of records 302 stored in structured database 220. The search may be carried out in the manner of Fig. 3A or Fig. 3B, or in any manner that does not depart from its spirit;

Determine whether the full-text search has a matching result (step S440). For example, search engine 240 determines whether the full-text search for keyword 108 has a matching result; and

If there is a matching result, output the fully matched records and the partially matched records in order (step S450). For example, if a record in structured database 220 matches keyword 108, retrieval interface unit 260 outputs, in order, the directs data of the fully matched records and of the partially matched records that match keyword 108 (obtainable from the directs data storage device 280 of Fig. 3C) and sends them to the knowledge-assisted understanding system 400 as response result 110, wherein the fully matched records have a higher priority than the partially matched records.

On the other hand, if there is no matching result, the user may be notified directly that matching failed and the flow ends; alternatively, the user may be notified that no match was found and asked for further input, or possible options may be listed for the user to choose from (for example, the aforementioned case in which a full-text search with "Liu Dehua" and "betrayal" produces no matching result) (step S460).
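The flow of steps S410 to S460 can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the `Record` class, the `search` function, and the sample rows are hypothetical, and it assumes that a "full match" means every keyword appears in a record's header field while a "partial match" means only some do.

```python
from dataclasses import dataclass


@dataclass
class Record:
    record_id: int
    header: list   # numeric data stored in the header field (assumed layout)
    guides: list   # directs data, one entry per header value (assumed layout)


def search(records, keywords):
    """Return (record_id, matched directs data), full matches before partial ones."""
    full, partial = [], []
    for rec in records:
        # Collect the directs data of every header value that matches a keyword.
        hits = [g for v, g in zip(rec.header, rec.guides) if v in keywords]
        if not hits:
            continue
        matched_all = all(k in rec.header for k in keywords)
        (full if matched_all else partial).append((rec.record_id, hits))
    return full + partial  # an empty list corresponds to the no-match branch


# Hypothetical sample data; record 2 is invented for illustration.
records = [
    Record(2, ["Liu Dehua", "another song"], ["singerguid", "songnameguid"]),
    Record(1, ["Liu Dehua", "date of passing by together"], ["singerguid", "songnameguid"]),
]
results = search(records, ["Liu Dehua", "date of passing by together"])
```

Here the fully matched record 1 is listed before the partially matched record 2 even though it is stored later, which mirrors the priority rule of step S450.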

The foregoing steps are not intended to limit the invention, and some steps may be omitted or removed. For example, in another embodiment of the invention, step S440 may be performed by a match determination module (not shown) located outside searching system 200. In yet another embodiment, step S450 may be omitted from searching system 200, and its action of outputting the fully matched and partially matched records in order may instead be performed by a match result output module (not shown) located outside searching system 200.

Based on the content disclosed or taught in the above exemplary embodiments, Fig. 4B is a flowchart of the operation of natural language understanding system 100 according to another embodiment of the invention. Referring to Fig. 4B, the operation of natural language understanding system 100 in this embodiment includes the following steps:

Receive request information (step S510). For example, the user sends request information 102, which has voice content or text content, to natural language understanding system 100;

Provide a structured database that stores a plurality of records (step S520);

Convert the request information into syntax data (step S530). For example, natural language processing device 300 analyzes the user's request information 102 and converts it into the corresponding possible intent syntax data 106;

Identify the possible attributes of the keyword (step S540). For example, the knowledge-assisted understanding module 400 identifies the possible attributes of at least one keyword 108 in possible intent syntax data 106; keyword 108 "Romance of the Three Kingdoms", for instance, may be a book, a film, or a TV programme;

Perform a full-text search on the header fields 304 of the plurality of records using keyword 108 (step S550). For example, keyword 108 is input to retrieval interface unit 260 so that search engine 240 performs a full-text search on the header fields 304 of the plurality of records stored in structured database 220;

Determine whether the full-text search has a matching result (step S560). For example, search engine 240 determines whether the full-text search for keyword 108 has a matching result;

If there is a matching result, output the directs data corresponding to the fully matched records and the partially matched records in order as response result 110 (step S570). For example, if a record in structured database 220 matches keyword 108, retrieval interface unit 260 outputs, in order, the directs data corresponding to the fully matched records and the partially matched records that match keyword 108 as response result 110,

wherein the fully matched records have a higher priority than the partially matched records; and

Output the corresponding confirmed intent syntax data in order (step S580). For example, the knowledge-assisted understanding module 400 uses the fully matched and partially matched records output in order to output the corresponding confirmed intent syntax data 114.

On the other hand, if step S560 produces no matching result, the situation can be handled in a manner similar to step S460: the user may be notified directly that matching failed and the flow ends; alternatively, the user may be notified that no match was found and asked for further input, or possible options may be listed for the user to choose from (for example, the aforementioned case in which a full-text search with "Liu Dehua" and "betrayal" produces no matching result) (step S590).
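The comparison performed in steps S570 and S580 can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: the function `confirm_intents` is hypothetical, intents are modelled as (intent, slot, value) tuples, and the naming convention that a slot such as "bookname" pairs with directs data "booknameguid" is assumed by analogy with "songnameguid" and is not stated by the source.

```python
def confirm_intents(possible_intents, response_guides):
    """Keep the possible intents whose slot is backed by returned directs data.

    possible_intents: list of (intent, slot, value) tuples
    response_guides:  directs data in output order (full matches first)
    """
    confirmed = []
    for guide in response_guides:
        for intent in possible_intents:
            _, slot, _ = intent
            # Assumed convention: slot name + "guid" equals the directs data name.
            if slot + "guid" == guide and intent not in confirmed:
                confirmed.append(intent)
    return confirmed


title = "Romance of the Three Kingdoms"
possible = [
    ("readbook", "bookname", title),
    ("watchTV", "TVname", title),
    ("watchfilm", "filmname", title),
]
# Hypothetical response result: a film record matched first, then a book record.
confirmed = confirm_intents(possible, ["filmnameguid", "booknameguid"])
```

The confirmed intents keep the order of the response result, so intents backed by fully matched records come first.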

The foregoing steps are not intended to limit the invention, and some steps may be omitted or removed.

In summary, the invention extracts the keywords contained in the user's request information and performs a full-text search on the header fields of the records, which have a specific data structure, in the structured database. If a matching result is produced, the category of the field to which each keyword belongs can be determined, and the intent expressed by the user in the request information can thereby be determined.

Next, applications of the above structured database to speech recognition are described further. The first is an application in a natural language dialogue system that corrects an erroneous voice response according to the user's voice input and further finds other possible answers to report to the user.

As mentioned above, present mobile communication devices can provide a natural language dialogue function that lets the user communicate with the device by voice. In current voice dialogue systems, however, when the user's voice input is ambiguous, the same sentence may correspond to several different intents or purposes, so the system easily outputs a voice response that does not match the voice input. In many dialogue scenarios the user therefore has difficulty obtaining a voice response that matches his or her intent. For this reason, the invention proposes a method of correcting a voice response and a natural language dialogue system, in which the natural language dialogue system can correct an erroneous voice response according to the user's voice input and further find other possible answers to report to the user. To make the content of the invention clearer, embodiments are given below as examples by which the invention can actually be implemented.

Fig. 5A is a block diagram of a natural language dialogue system according to an embodiment of the invention. Referring to Fig. 5A, natural language dialogue system 500 includes voice sampling module 510, natural language understanding system 520, and speech synthesis database 530. In one embodiment, voice sampling module 510 receives first voice input 501 (for example, the user's voice) and parses it to produce first request information 503. Natural language understanding system 520 then parses first request information 503 to obtain first keyword 509 and finds first return answer 511 that matches first request information 503. (As described for Fig. 1, first request information 503 can be processed in the same way as request information 102: after analysis, request information 102 yields possible intent syntax data 106; keyword 108 therein is used to perform a full-text search on structured database 220 and obtain response result 110; response result 110 is compared with intent data 112 in possible intent syntax data 106 to produce confirmed intent syntax data 114; and analysis result output module 116 finally sends analysis result 104, which can serve as first return answer 511 in Fig. 5A.) A corresponding voice query is then performed on speech synthesis database 530 according to first return answer 511. (Because analysis result 104, serving as first return answer 511, may contain related data of the fully or partially matched records 302, for example the directs data stored in guide field 310, the numeric data in numeric field 312, and the data in content field 306, these data can be used for the voice query.) First voice 513 obtained by the query is used to produce first voice response 507 corresponding to first voice input 501, which is output to the user. If the user considers that first voice response 507 output by natural language understanding system 520 does not match first request information 503 in first voice input 501, the user enters another voice input, for example second voice input 501', to indicate this. Natural language understanding system 520 handles second voice input 501' in the same way as first voice input 501: it produces second request information 503', parses it to obtain second keyword 509', finds second return answer 511' that matches second request information 503', finds the corresponding second voice 513', and finally produces second voice response 507' from second voice 513' and outputs it to the user, thereby correcting first return answer 511. Evidently, natural language understanding system 520 can be based on natural language understanding system 100 of Fig. 1, with new modules added (described below with reference to Fig. 5B) to correct an erroneous voice response according to the user's voice input.

The components of the aforementioned natural language dialogue system 500 may be arranged in the same machine. For example, voice sampling module 510 and natural language understanding system 520 may be arranged in the same electronic device. The electronic device may be a mobile communication device such as a cell phone, a personal digital assistant (PDA) phone, or a smart phone, or a pocket PC, a tablet PC, a notebook computer, a personal computer, or any other electronic device that has a communication function or has communication software installed; its scope is not limited here. The electronic device may run the Android operating system, a Microsoft operating system, the Linux operating system, and so on, without limitation. Of course, the components of natural language dialogue system 500 need not be arranged in the same machine; they may be distributed over different devices or systems and connected through various communication protocols. For example, natural language understanding system 520 may be located in a cloud server or in a server on a local area network. In addition, the components of natural language understanding system 520 may themselves be distributed over different machines; for example, each component of natural language understanding system 520 may be located in the same machine as voice sampling module 510 or in a different one.

In this embodiment, voice sampling module 510 receives voice input. Voice sampling module 510 may be a sound-receiving device such as a microphone, and first voice input 501 and second voice input 501' may be the user's voice.

In addition, natural language understanding system 520 of this embodiment may be implemented by a hardware circuit composed of one or more logic gates. Alternatively, in another embodiment of the invention, natural language understanding system 520 may be implemented by computer program code. For example, natural language understanding system 520 may be implemented as an application, an operating system, or a driver from code fragments written in a programming language; these code fragments are stored in a storage unit and executed by a processing unit (not shown in Fig. 5A). To help those skilled in the art further understand natural language understanding system 520 of this embodiment, a practical example is described below. The example is illustrative only and not limiting; hardware, software, firmware, or any combination of the three may be used to implement the invention.

Fig. 5B is a block diagram of natural language understanding system 520 according to an embodiment of the invention. Referring to Fig. 5B, natural language understanding system 520 of this embodiment may include speech recognition module 522, natural language processing module 524, and speech synthesis module 526. Speech recognition module 522 receives the request information sent from voice sampling module 510, for example first request information 503 parsed from first voice input 501, and extracts one or more first keywords 509 (for example, keyword 108 of Fig. 1A, or words and phrases). Natural language processing module 524 then parses these first keywords 509 and obtains a candidate list containing at least one return answer. (The processing is the same as for Fig. 5A: for example, searching system 200 of Fig. 1A performs a full-text search on structured database 220, response result 110 is obtained and compared with intent data 112 to produce confirmed intent syntax data 114, and the return answer is finally produced from analysis result 104 sent by analysis result output module 116.) From all the return answers in the candidate list, it selects one that matches first voice input 501 (for example, a fully matched record is selected as first return answer 511). Because first return answer 511 is an answer that natural language understanding system 520 obtains through internal analysis, it must be converted into voice output before the user can hear and judge it. Speech synthesis module 526 therefore queries speech synthesis database 530 according to first return answer 511. Speech synthesis database 530 records, for example, text and the corresponding voice information, so that speech synthesis module 526 can find first voice 513 corresponding to first return answer 511 and synthesize first voice response 507 from it. Speech synthesis module 526 then outputs the synthesized first voice response 507 to the user through a voice output interface (not shown), which is, for example, a loudspeaker or an earphone. Note that when querying speech synthesis database 530 according to first return answer 511, speech synthesis module 526 may first need to convert the format of first return answer 511 and then make the call through the interface defined by speech synthesis database 530. Whether such format conversion is needed depends on how speech synthesis database 530 itself is defined; because this is well known to those skilled in the art, it is not described in detail here.

An example follows. If the user enters first voice input 501 "I want to see Romance of the Three Kingdoms", speech recognition module 522 receives from voice sampling module 510 first request information 503 parsed from first voice input 501 and extracts, for example, first keyword 509 containing "Romance of the Three Kingdoms". Natural language processing module 524 then parses first keyword 509 "Romance of the Three Kingdoms" (for example, searching system 200 of Fig. 1A performs a full-text search on structured database 220, response result 110 is obtained and compared with intent data 112 to produce confirmed intent syntax data 114, and analysis result output module 116 finally sends analysis result 104). It thereby produces return answers for three intent options involving "Romance of the Three Kingdoms" and assembles them into a candidate list (assume that each intent option has only one return answer and that the options are "read the book", "watch the TV series", and "watch the film"). From the three return answers in the candidate list, it then selects the one with the highest value in popularity field 316 (for example, record 10 of Fig. 3B) as first return answer 511. In one embodiment, the action corresponding to the record with the highest value in popularity field 316 may be performed directly (for example, as mentioned earlier, directly playing "betrayal" sung by Xiao Jingteng for the user); the invention is not limited in this respect.
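The selection rule in this example, taking the candidate with the highest value in field 316, can be sketched as follows. The dictionary layout, the function name `pick_first_answer`, and the numeric values are all hypothetical; only the record numbers 8, 9, and 10 and their categories come from the description.

```python
def pick_first_answer(candidates):
    """Return the candidate whose popularity value (field 316) is highest."""
    return max(candidates, key=lambda c: c["popularity"])


# Record numbers follow Fig. 3B as described; the popularity values are invented.
candidate_list = [
    {"record": 8, "option": "read the book", "popularity": 5},
    {"record": 9, "option": "watch the TV series", "popularity": 8},
    {"record": 10, "option": "watch the film", "popularity": 9},
]
first_answer = pick_first_answer(candidate_list)
```

With these assumed values the film record is chosen first, which matches the example in which record 10 becomes first return answer 511.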

In addition, natural language processing module 524 can parse the subsequently received second voice input 501' (which is fed into voice sampling module 510 in the same way as the earlier voice input 501) and determine whether the previous first return answer 511 was correct. Second voice input 501' is the user's reaction to first voice response 507 provided earlier, so it contains information about whether the user considers first voice response 507 correct. If analysis of second voice input 501' shows that the user considers first return answer 511 incorrect, natural language processing module 524 can select another return answer in the candidate list as second return answer 511'. For example, it removes first return answer 511 from the candidate list and selects second return answer 511' from the remaining return answers. Speech synthesis module 526 then finds second voice 513' corresponding to second return answer 511', synthesizes second voice response 507' from second voice 513', and plays it to the user.

Continuing the earlier example in which the user enters "I want to see Romance of the Three Kingdoms": if the user wants to watch the TV series, the option previously output to the user, record 10 of Fig. 3B (the film of "Romance of the Three Kingdoms"), is not what the user wants. The user may therefore enter, as second voice input 501', "I want to watch the Romance of the Three Kingdoms TV series" (the user states explicitly that the TV series is wanted) or "I do not want to watch the Romance of the Three Kingdoms film" (the user only negates the current option). After second voice input 501' is parsed to obtain second request information 503' (or second keyword 509'), second keyword 509' in second request information 503' is found to contain "TV series" (an explicit indication) or "not film" (a negation of the current option only), and first return answer 511 is judged not to meet the user's needs. Another return answer is then selected from the candidate list as second return answer 511', and the corresponding second voice response 507' is output. For example, second voice response 507' may be "I will now play the Romance of the Three Kingdoms TV series for you" (if the user stated explicitly that the TV series is wanted), or "Which option do you want?" (if the user only negated the current option), combined with the other options in the candidate list for the user to choose from (or, for example, the return answer with the second-highest value in popularity field 316 is selected as second return answer 511'). Moreover, in another embodiment, second voice input 501' may contain a "selection" message. For example, when the three options "read the Romance of the Three Kingdoms book", "watch the Romance of the Three Kingdoms TV series", and "watch the Romance of the Three Kingdoms film" are displayed for the user to choose from, the user may enter second voice input 501' "I want to watch the film". Second voice input 501' is parsed to obtain second request information 503', the user's intent is found by analyzing it (for example, second keyword 509' shows that the user selected "watch the film"), second voice response 507' "I will now play the Romance of the Three Kingdoms film for you" is output, and the film is then played directly for the user. Of course, if the user enters "I want the third option" (assume that the third option is reading the book), the application corresponding to the third option is executed: second voice response 507' "What you want is to read the Romance of the Three Kingdoms book" is output, together with the action of displaying the Romance of the Three Kingdoms e-book to the user.
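The three kinds of second voice input discussed above (an explicitly named option, a bare negation, and an ordinal such as "the third option") can be sketched as one small function. This is only one possible reading: `second_answer`, the keyword spellings, the `ORDINALS` table, and the popularity values are hypothetical, and falling back to the next most popular remaining option after a bare negation is just one of the alternatives the text mentions.

```python
ORDINALS = {"first": 0, "second": 1, "third": 2}  # assumed ordinal keywords


def second_answer(candidates, first, keywords):
    """Choose the second return answer from the displayed candidate list.

    candidates: options in display order, each with "category" and "popularity"
    first:      category of the first return answer already given
    keywords:   second keywords parsed from the second voice input
    """
    # Ordinal selection such as "I want the third option".
    for kw in keywords:
        if kw in ORDINALS and ORDINALS[kw] < len(candidates):
            return candidates[ORDINALS[kw]]["category"]
    named = [c["category"] for c in candidates if c["category"] in keywords]
    if "not" not in keywords:
        # Explicit selection; keep the first answer if nothing was named.
        return named[0] if named else first
    # Bare negation: drop the rejected options, take the most popular remainder.
    rejected = set(named) | {first}
    remaining = [c for c in candidates if c["category"] not in rejected]
    if not remaining:
        return None
    return max(remaining, key=lambda c: c["popularity"])["category"]


options = [
    {"category": "TV series", "popularity": 8},
    {"category": "film", "popularity": 9},
    {"category": "books", "popularity": 5},
]
explicit = second_answer(options, "film", ["Romance of the Three Kingdoms", "TV series"])
negated = second_answer(options, "film", ["not", "film"])
ordinal = second_answer(options, "film", ["third"])
```

A real system would work on parsed intent data rather than literal keyword strings; the string matching here only keeps the sketch short.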

In this embodiment, speech recognition module 522, natural language processing module 524, and speech synthesis module 526 of natural language understanding system 520 may be arranged in the same machine as voice sampling module 510. In other embodiments, speech recognition module 522, natural language processing module 524, and speech synthesis module 526 may be distributed over different machines (for example, computer systems, servers, or similar devices or systems). For example, in natural language understanding system 520' shown in Fig. 5C, speech synthesis module 526 and voice sampling module 510 may be arranged in the same machine 502, while speech recognition module 522 and natural language processing module 524 may be arranged in another machine. Under the architecture of Fig. 5C, natural language processing module 524 sends first return answer 511 or second return answer 511' to speech synthesis module 526, which then sends it to the speech synthesis database to look up the corresponding first voice 513 or second voice 513' as the basis for producing first voice response 507 or second voice response 507'.

Fig. 6 is a flowchart of a method of correcting first voice response 507 according to an embodiment of the invention. In the method of this embodiment, when the user considers that the first voice response 507 currently played does not match the first request information 503 entered earlier, the user enters second voice input 501', which is fed into voice sampling module 510. When natural language understanding system 520 then learns by analysis that first voice response 507 previously played to the user does not match the user's intent, it outputs second voice response 507' to correct the original first voice response 507. For convenience, only natural language dialogue system 500 of Fig. 5A is used as an example here, but the method of correcting first voice response 507 in this embodiment also applies to natural language dialogue system 500' of Fig. 5C.

Referring to Fig. 5A and Fig. 6 together, in step S602, voice sampling module 510 receives first voice input 501. First voice input 501 is, for example, the user's voice, and it carries the user's first request information 503. Specifically, first voice input 501 from the user may be a question, a command, or other request information, for example "I want to see Romance of the Three Kingdoms", "I want to listen to the music of lustily water", or "What is the temperature today".

In step S604, natural language understanding system 520 parses at least one first keyword 509 contained in first voice input 501 and obtains a candidate list, which has one or more return answers. For example, when the user's first voice input 501 is "I want to see Romance of the Three Kingdoms", the first keywords 509 that natural language understanding system 520 obtains by analysis are, for example, "Romance of the Three Kingdoms" and "see". As another example, when the user's first voice input 501 is "I want to listen to the song lustily water", the first keywords 509 obtained by analysis are, for example, "lustily water", "listen", and "song".

Next, natural language understanding system 520 queries structured database 220 according to first keyword 509 and obtains at least one search result (for example, analysis result 104 of Fig. 1), which serves as a return answer in the candidate list. The manner of selecting first return answer 511 from a plurality of return answers is as described for Fig. 1A and is not repeated here. First keyword 509 may belong to different knowledge domains (for example, films, books, music, or games), and a single knowledge domain may be further divided into several categories (for example, different authors of the same film or book title, different singers of the same song title, or different versions of the same game title). For first keyword 509, natural language understanding system 520 can therefore find in the structured database one or more search results (for example, analysis result 104) related to first keyword 509. Each search result may contain directs data related to first keyword 509 and other data. (For example, when a full-text search is performed on structured database 220 of Figs. 3A and 3B with "Xiao Jingteng" and "betrayal" as keyword 108, two matching results, such as records 6 and 7 of Fig. 3A, are obtained; they contain the directs data "singerguid" and "songnameguid" respectively, which are the data stored in guide field 310.) The other data are, for example, other keywords in the search result that are related to first keyword 509. (For example, when a full-text search of structured database 220 of Fig. 3A with the keyword "date of passing by together" yields record 1 as the matching result, "Liu Dehua" and "Hong Kong and Taiwan; Cantonese; pop" are both other data.) From another viewpoint, when first voice input 501 entered by the user has a plurality of first keywords 509, the user's first request information 503 is more definite, so natural language understanding system 520 can find search results closer to first request information 503.

For example, when first keyword 509 is "Romance of the Three Kingdoms" (for instance, the user enters the voice input "I want to see Romance of the Three Kingdoms"), analysis by natural language understanding system 520 may produce three pieces of possible intent syntax data 106 (as shown in Fig. 1):

"<readbook>, <bookname>=Romance of the Three Kingdoms";

"<watchTV>, <TVname>=Romance of the Three Kingdoms"; and

"<watchfilm>, <filmname>=Romance of the Three Kingdoms".

The search results found are therefore records (for example, records 8, 9, and 10 of Fig. 3B) about "... "Romance of the Three Kingdoms" ... "books"" (intent data <readbook>), "... "Romance of the Three Kingdoms" ... "TV series"" (intent data <watchTV>), and "... "Romance of the Three Kingdoms" ... "film"" (intent data <watchfilm>), where "books", "TV series", and "film" each indicate the corresponding user intent. As another example, when first keywords 509 are "lustily water" and "music" (for instance, the user enters the voice input "I want to listen to the music of lustily water"), analysis by natural language understanding system 520 may produce the following possible intent syntax data:

"<playmusic>, <songname>=lustily water";

The search results found are, for example, a record about "... "lustily water" ... "Liu Dehua"" (for example, record 11 of Fig. 3B) and a record about "... "lustily water" ... "Li Yijun"" (for example, record 12 of Fig. 3B), where "Liu Dehua" and "Li Yijun" are the intent data corresponding to the user. In other words, each search result may contain first keyword 509 and the intent data related to first keyword 509. According to the search results found, natural language understanding system 520 converts the data contained in each search result into a return answer and records the return answers in the candidate list for use in subsequent steps.
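The intent syntax lines quoted above follow a regular "intent, slot = value" shape, so they can be turned into structured data with a small parser. The sketch below is illustrative only: `parse_intent` and the dictionary keys are hypothetical names, and it assumes the lines are written with plain ASCII angle brackets, as in "<readbook>, <bookname>=Romance of the Three Kingdoms".

```python
import re

# Assumed surface form: "<intent>, <slot>=value"
INTENT_LINE = re.compile(r"<(\w+)>\s*,\s*<(\w+)>\s*=\s*(.+)")


def parse_intent(line):
    """Parse one intent syntax line into a dict, or return None if it does not fit."""
    match = INTENT_LINE.fullmatch(line.strip())
    if match is None:
        return None
    intent, slot, value = match.groups()
    return {"intent": intent, "slot": slot, "value": value.strip()}


parsed = [
    parse_intent("<readbook>, <bookname>=Romance of the Three Kingdoms"),
    parse_intent("<watchTV>, <TVname>=Romance of the Three Kingdoms"),
    parse_intent("<playmusic>, <songname>=lustily water"),
]
```

Each parsed entry can then be paired with the records that the full-text search returns to build the candidate list.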

In step S606, natural language understanding system 520 selects at least one first return answer 511 from the candidate list and outputs the corresponding first voice response 507 according to first return answer 511. In this embodiment, natural language understanding system 520 orders the return answers in the candidate list by priority, selects a return answer from the candidate list according to that priority, and outputs first voice response 507 accordingly.

For example, when first keyword 509 is "Romance of the Three Kingdoms", assume that priority is determined by the number of records found: natural language understanding system 520 finds the most records about "... "Romance of the Three Kingdoms" ... "books"", fewer about "... "Romance of the Three Kingdoms" ... "music"", and the fewest about "... "Romance of the Three Kingdoms" ... "TV series"". Natural language understanding system 520 then takes "the Romance of the Three Kingdoms book" as the first return answer (selected with the highest priority), "the Romance of the Three Kingdoms music" as the second return answer (selected with the second priority), and "the Romance of the Three Kingdoms TV series" as the third return answer (selected with the third priority). Of course, if more than one record relates to the first return answer "the Romance of the Three Kingdoms book", first return answer 511 may further be selected according to a ranking (for example, by number of clicks); the details were given earlier and are not repeated here.
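Ordering the return answers by how many records were found for each category can be sketched with the standard library's `collections.Counter`. The function `rank_categories` and the record counts below are hypothetical; the sketch only illustrates the count-based priority described in this example.

```python
from collections import Counter


def rank_categories(matched_records):
    """Order categories by the number of matched records, most records first."""
    counts = Counter(record["category"] for record in matched_records)
    return [category for category, _ in counts.most_common()]


# Invented counts: three book records, two music records, one TV series record.
matches = (
    [{"category": "books"}] * 3
    + [{"category": "music"}] * 2
    + [{"category": "TV series"}]
)
priority = rank_categories(matches)
```

The first entry of `priority` plays the role of the first return answer, the next entry the second, and so on.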

Then, in step S608, phonetic sampling module 510 can receive second phonetic entry 501 ', and natural language understanding system 520 can be resolved this second phonetic entry 501 ', and judges whether the previous first selected repayment answer 511 is correct.At this, phonetic sampling module 510 can be resolved second phonetic entry 501 ', to parse the second included key word 509 ' of second phonetic entry 501 ', wherein this second key word 509 ' for example is the key word that further provides of user (for example time, intention, ken ... etc.).And when second key word 509 ' in second phonetic entry 501 ' and first repayment when relevant intention data do not conform in the answer 511, natural language understanding system 520 can judge previous selected first, and to repay answer 511 be incorrect.What comprise as for second solicited message 503 ' of judging second phonetic entry 501 ' is that " correctly " or " negating " mode front of first voice answer-back 507 was carried, does not repeat them here.

Furthermore, the second voice input 501' parsed by natural language understanding system 520 may or may not contain an explicit second keyword 509'. For instance, voice sampling module 510 may receive from the user "I do not mean the books of The Romance of the Three Kingdoms" (situation A), "I do not mean the books of The Romance of the Three Kingdoms; I mean the TV play of The Romance of the Three Kingdoms" (situation B), or "I mean the TV play of The Romance of the Three Kingdoms" (situation C). The second keyword 509' in situation A is, for example, "not", "The Romance of the Three Kingdoms", "books"; the second keyword 509' in situation B is, for example, "not", "The Romance of the Three Kingdoms", "books" together with "is", "The Romance of the Three Kingdoms", "TV play"; and the second keyword 509' in situation C is, for example, "is", "The Romance of the Three Kingdoms", "TV play". For convenience of description, only situations A, B, and C are given as examples, but the present embodiment is not limited to these.

Then, natural language understanding system 520 judges, according to the second keyword 509' contained in the second voice input 501', whether the relevant intention data in the first return answer 511 is correct. That is, if the first return answer 511 has been determined to be "books of The Romance of the Three Kingdoms" and the second keyword 509' is "The Romance of the Three Kingdoms", "TV play", natural language understanding system 520 judges that the intention data in the first return answer 511 (that the user wants the "books" of The Romance of the Three Kingdoms) does not match the second keyword 509' from the user's second voice input 501' (that the user wants the "TV play" of The Romance of the Three Kingdoms), and therefore judges that the first return answer 511 is incorrect. Similarly, if the return answer is "books of The Romance of the Three Kingdoms" and the second keyword 509' is "not", "The Romance of the Three Kingdoms", "books", natural language understanding system 520 also judges that the first return answer 511 is incorrect.
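As a non-limiting sketch, the correctness judgment for situations A, B, and C can be expressed in Python as below. Encoding the second keywords as (polarity, category) pairs, and the function name `is_first_answer_correct`, are assumptions of this sketch; the embodiment does not prescribe a data format.

```python
def is_first_answer_correct(answer_category, second_keywords):
    """Return False when the second keywords negate or contradict the answer.

    second_keywords is a list of (polarity, category) pairs, where polarity
    is "is" or "not" (an encoding assumed for this sketch).
    """
    for polarity, category in second_keywords:
        if polarity == "not" and category == answer_category:
            return False  # the user explicitly rejected this category
        if polarity == "is" and category != answer_category:
            return False  # the user asked for a different category
    return True


situation_a = [("not", "books")]
situation_b = [("not", "books"), ("is", "TV play")]
situation_c = [("is", "TV play")]

for name, keywords in (("A", situation_a), ("B", situation_b), ("C", situation_c)):
    print(name, is_first_answer_correct("books", keywords))
```

In all three situations the first return answer "books" is judged incorrect, whereas a confirmation such as `[("is", "books")]` leaves it judged correct.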

After natural language understanding system 520 parses the second voice input 501', if it judges that the previously output first voice response 507 is correct, then, as shown in step S610, natural language understanding system 520 makes a response corresponding to the second voice input 501'. For instance, if the second voice input 501' from the user is "Yes, the books of The Romance of the Three Kingdoms", natural language understanding system 520 can output a second voice response 507' such as "Opening the books of The Romance of the Three Kingdoms for you". Alternatively, natural language understanding system 520 can load the book contents of The Romance of the Three Kingdoms directly through a processing unit (not shown) while playing the second voice response 507'.

However, after natural language understanding system 520 parses the second voice input 501', if it judges that the previously output first voice response 507 (that is, return answer 511) is incorrect, then, as shown in step S612, natural language understanding system 520 selects a return answer other than the first return answer 511 from the candidate list and outputs the second voice response 507' according to the selection. Here, if the second voice input 501' provided by the user contains no explicit second keyword 509' (as in situation A above), natural language understanding system 520 selects the return answer with the second-highest priority from the candidate list. Alternatively, if the second voice input 501' contains an explicit second keyword 509' (as in situations B and C above), natural language understanding system 520 directly selects from the candidate list the return answer indicated by the user's second keyword 509'.

On the other hand, if the second voice input 501' provided by the user contains an explicit second keyword 509' (as in situations B and C above) but natural language understanding system 520 finds no return answer in the candidate list that matches this second keyword 509', natural language understanding system 520 outputs a third voice response, for example "NK" or "I don't know".
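The three branches of step S612 (no explicit keyword, explicit keyword with a match, explicit keyword without a match) can be sketched as follows. This is only an illustrative sketch: the function `select_next_answer`, the use of category strings as return answers, and the `None` result standing in for the third voice response are all assumptions.

```python
def select_next_answer(candidates, rejected, wanted=None):
    """Pick the next return answer after the first one was judged incorrect.

    candidates: categories ordered by priority (highest first).
    rejected:   set of categories the user has already refused.
    wanted:     category explicitly named by the user, or None.
    Returns None when nothing suitable remains, which would trigger the
    third voice response.
    """
    remaining = [c for c in candidates if c not in rejected]
    if wanted is not None:
        # Situations B and C: follow the user's explicit keyword.
        return wanted if wanted in remaining else None
    # Situation A: fall back to the next-highest priority.
    return remaining[0] if remaining else None


candidates = ["books", "music", "TV play"]
print(select_next_answer(candidates, {"books"}))                    # music
print(select_next_answer(candidates, {"books"}, wanted="TV play"))  # TV play
print(select_next_answer(candidates, {"books"}, wanted="comic"))    # None
```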

To help those skilled in the art further understand the method of correcting a voice response and the natural language dialogue system of the present embodiment, a further example is described in detail below.

First, suppose that the first voice input 501 received by voice sampling module 510 is "I want to see The Romance of the Three Kingdoms" (step S602). Natural language understanding system 520 then parses out the first keyword 509 as "see", "The Romance of the Three Kingdoms" and obtains a candidate list containing a plurality of first return answers, where each return answer has relevant keywords and other data (the other data can be stored in the content field 306 of Fig. 3A/3B, or be part of the numeric field 312 of each record 302) (step S604), as shown in Table 1 (assuming that the search result contains only one entry each for the books, TV play, music, and film of The Romance of the Three Kingdoms).

Table one

Then, natural language understanding system 520 selects the required return answer from the candidate list. Suppose natural language understanding system 520 selects return answers in order and takes return answer a in the candidate list as the first return answer 511; natural language understanding system 520 then outputs, for example, "Do you want to play the books of The Romance of the Three Kingdoms?", which is the first voice response 507 (step S606).

At this point, if the second voice input 501' received by voice sampling module 510 is "Yes" (step S608), natural language understanding system 520 judges that return answer a is correct, outputs another voice response "Please wait a moment" (that is, the second voice response 507'), and loads the book contents of The Romance of the Three Kingdoms through a processing unit (not shown) (step S610).

However, if the second voice input 501' received by voice sampling module 510 is "I do not mean the books of The Romance of the Three Kingdoms" (step S608), natural language understanding system 520 judges that return answer a is incorrect and selects another return answer from return answers b to e in the candidate list as the second return answer 511', for example return answer b, "Do you want to play the TV play of The Romance of the Three Kingdoms?". If the user goes on to answer "Not the TV play", natural language understanding system 520 selects one of return answers c to e to report. In addition, if return answers a to e in the candidate list have all been reported to the user by natural language understanding system 520 and none of them matches the user's voice input 501, natural language understanding system 520 outputs the voice response 507 "No data found" (step S612).
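The walk-through of steps S606 to S612 amounts to offering candidates one at a time until one is confirmed or the list is exhausted. A minimal Python sketch follows; the function `confirm_loop` and the callable `user_accepts`, which stands in for receiving and parsing the user's next voice input, are hypothetical names introduced only for illustration.

```python
def confirm_loop(candidates, user_accepts):
    """Offer each return answer in priority order until one is accepted.

    user_accepts is a callable standing in for parsing the user's next
    voice input; it returns True when the user confirms the offered answer.
    Returns the accepted answer (or None) and the voice response to output.
    """
    for answer in candidates:
        if user_accepts(answer):
            return answer, "Please wait a moment"
    return None, "No data found"


candidates = ["books", "TV play", "music", "film"]
print(confirm_loop(candidates, lambda answer: answer == "music"))
print(confirm_loop(candidates, lambda answer: False))
```

The first call ends as soon as "music" is confirmed; the second exhausts the candidate list and yields the "No data found" response.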

In another embodiment, in step S608 above, suppose voice sampling module 510 receives the user's second voice input 501' "I mean the comic of The Romance of the Three Kingdoms". Since the candidate list contains no return answer about comics, natural language understanding system 520 directly outputs the second voice response 507' "No data found".

Based on the above, natural language understanding system 520 outputs the corresponding first voice response 507 according to the first voice input 501 from the user. When the first voice response 507 output by natural language understanding system 520 does not match the request information 503 of the user's first voice input 501, natural language understanding system 520 corrects the originally output first voice response 507 and, according to the second voice input 501' subsequently provided by the user, further outputs a second voice response 507' that matches the user's first request information 503. Thus, if the user is still dissatisfied with the answer provided by natural language understanding system 520, natural language understanding system 520 can correct itself automatically and report a new voice response to the user, improving the convenience of dialogue between the user and natural language dialogue system 500.

It is worth mentioning that, in step S606 and step S612 of Fig. 6, natural language understanding system 520 can also sort the return answers in the candidate list according to different methods of assessing priority, select a return answer from the candidate list according to this priority, and then output the voice response corresponding to that return answer.

For instance, natural language understanding system 520 can sort the priority of the first return answers 511 in the candidate list according to the public's usage habits, with answers that the public uses more often ranked first. For example, again taking the first keyword 509 "The Romance of the Three Kingdoms", suppose the return answers found by natural language understanding system 520 are the TV play, the books, and the music of The Romance of the Three Kingdoms. If most people mean the books when they mention "The Romance of the Three Kingdoms", fewer mean the TV play, and fewer still mean the music (for example, when the value stored in temperature field 316 of Fig. 3C represents the matching situation across all users, the value of temperature field 316 is highest on the "books" record of "The Romance of the Three Kingdoms"), natural language understanding system 520 orders the return answers by priority as "books", "TV play", "music". That is, natural language understanding system 520 preferentially selects "books of The Romance of the Three Kingdoms" as the first return answer 511 and outputs the first voice response 507 according to this first return answer 511.

In addition, natural language understanding system 520 can also determine the priority of return answers according to the user's habits. Specifically, natural language understanding system 520 can record voice inputs previously received from the user (including the first voice input 501, the second voice input 501', or any other voice input from the user) in a property database (for example, as shown in Fig. 7A/7B), where the property database can be stored in a storage device such as a hard disk. The property database can record data about the user's preferences and habits, such as the first keyword 509 obtained when natural language understanding system 520 parses the user's voice input 501 and the response records produced by natural language understanding system 520. The storage and retrieval of user preference/habit data is explained further below with reference to Fig. 7A/7B/8. In addition, in one embodiment, when the value stored in temperature field 316 of Fig. 3C is related to the user's habits (for example, the number of matches), the value of temperature field 316 can be used to judge the user's usage habits or priority. Therefore, when selecting a return answer, natural language understanding system 520 can sort the return answers by priority according to information such as the user's habits recorded in property database 730, and thereby output a voice response 507 that matches the user's voice input 501. For instance, in Fig. 3B, the values stored in temperature field 316 of records 8/9/10 are 2/5/8 respectively, representing that the "books", "TV play", and "film" of "The Romance of the Three Kingdoms" have been matched 2, 5, and 8 times respectively, so the return answer corresponding to "film of The Romance of the Three Kingdoms" is selected first.
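The example of records 8/9/10 can be reproduced with a simple sort on the stored match count. The sketch below is illustrative only; representing each record as a Python dictionary with a `temperature` key is an assumption and is not the record layout of Fig. 3B.

```python
# Hypothetical stand-ins for records 8, 9, and 10, with the match counts
# 2, 5, and 8 given in the text for temperature field 316.
records = [
    {"id": 8, "category": "books", "temperature": 2},
    {"id": 9, "category": "TV play", "temperature": 5},
    {"id": 10, "category": "film", "temperature": 8},
]

# Higher match count means higher priority in the candidate list.
ranked = sorted(records, key=lambda record: record["temperature"], reverse=True)

print([record["category"] for record in ranked])  # ['film', 'TV play', 'books']
```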

On the other hand, natural language understanding system 520 can also select a return answer according to the user's habits. For instance, suppose that, in dialogue with natural language understanding system 520, the user often says "I want to see the books of The Romance of the Three Kingdoms", less often says "I want to see the TV play of The Romance of the Three Kingdoms", and even less often says "I want to listen to the music of The Romance of the Three Kingdoms" (for example, the user dialogue database holds 20 records about "books of The Romance of the Three Kingdoms" (for example, as shown in hobby field 318 of record 8 in Fig. 3B), 8 records about "TV play of The Romance of the Three Kingdoms" (for example, as shown in hobby field 318 of record 9 in Fig. 3B), and 1 record about "music of The Romance of the Three Kingdoms"). The priority of the return answers in the candidate list is then, in order, "books of The Romance of the Three Kingdoms", "TV play of The Romance of the Three Kingdoms", and "music of The Romance of the Three Kingdoms". That is, when the first keyword 509 is "The Romance of the Three Kingdoms", natural language understanding system 520 selects "books of The Romance of the Three Kingdoms" as the first return answer 511 and outputs the first voice response 507 according to this first return answer 511.

It is worth mentioning that natural language understanding system 520 can also determine the priority of return answers according to the user's preferences. Specifically, the user dialogue database can also record keywords that the user has expressed, for example "like", "idol", "detest", or "dislike". Natural language understanding system 520 can therefore sort the return answers in the candidate list according to the number of times these keywords have been recorded. For instance, if a return answer is associated with "like" more times, that return answer is selected preferentially; if a return answer is associated with "detest" more times, it is selected later.

For instance, suppose that, in dialogue with natural language understanding system 520, the user often says "I dislike watching the TV play of The Romance of the Three Kingdoms", less often says "I dislike listening to the music of The Romance of the Three Kingdoms", and even less often says "I dislike reading the books of The Romance of the Three Kingdoms" (for example, the user dialogue database holds 20 records about "I dislike watching the TV play of The Romance of the Three Kingdoms" (recorded, for example, in detest field 320 of record 9 in Fig. 3B), 8 records about "I dislike listening to the music of The Romance of the Three Kingdoms", and 1 record about "I dislike reading the books of The Romance of the Three Kingdoms" (recorded, for example, in detest field 320 of record 8 in Fig. 3B)). Because return answers that are disliked more often are selected later, the priority of the return answers in the candidate list is then, in order, "books of The Romance of the Three Kingdoms", "music of The Romance of the Three Kingdoms", and "TV play of The Romance of the Three Kingdoms". That is, when the first keyword 509 is "The Romance of the Three Kingdoms", natural language understanding system 520 selects the books of "The Romance of the Three Kingdoms" as the first return answer 511 and outputs the first voice response 507 according to this first return answer 511. In one embodiment, a "detest field 320" can be added in addition to temperature field 316 of Fig. 3B to record the user's "degree of dislike". In another embodiment, when "detest" information from the user about a certain record is parsed, the temperature field 316 (or hobby field 318) of the corresponding record can be directly decremented by one (or by another value), so that the user's preferences are recorded without adding a field. Any embodiment of recording user preferences can be applied in embodiments of the invention, and the invention is not limited in this respect. Other embodiments of recording and using user habit data, and of providing responses and return answers according to the usage habits and preferences of the user or of the public, are explained in more detail below with reference to Fig. 7A/7B/8.
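The two ways of recording dislike described above (a separate detest field, or decrementing the existing temperature field) can be contrasted in a short Python sketch. The dictionary layout, the function names, and in particular the scoring rule "temperature minus detest" are assumptions made for illustration; the embodiment does not specify how the two fields are combined.

```python
def record_dislike(record, use_detest_field, step=1):
    """Record one expression of dislike against a record."""
    if use_detest_field:
        # Variant 1: keep a separate counter alongside the temperature field.
        record["detest"] = record.get("detest", 0) + step
    else:
        # Variant 2: subtract from the temperature field, adding no new field.
        record["temperature"] -= step


def score(record):
    """Assumed scoring rule: match count minus recorded dislikes."""
    return record["temperature"] - record.get("detest", 0)


def rank(records):
    """Order records from highest to lowest score."""
    return sorted(records, key=score, reverse=True)


books = {"category": "books", "temperature": 5}
tv_play = {"category": "TV play", "temperature": 5}
record_dislike(tv_play, use_detest_field=True)

ranked = rank([tv_play, books])
print([record["category"] for record in ranked])  # ['books', 'TV play']
```

Under this assumed scoring rule both variants lower the priority of the disliked record by the same amount; the second variant simply loses the distinction between "rarely matched" and "actively disliked".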

On the other hand, natural language understanding system 520 can also determine the priority of at least one return answer according to a voice input that the user entered before natural language dialogue system 500 provided a return answer (for example, before the first voice input 501 was played, when the user did not yet know which return answers natural language dialogue system 500 would offer for selection). That is, suppose a voice input (for example, a fourth voice input) is received by voice sampling module 510 earlier than the first voice input 501 is played; natural language understanding system 520 can then parse the fourth keyword in the fourth voice input, preferentially select from the candidate list a fourth return answer that matches this fourth keyword, and output a fourth voice response according to this fourth return answer.

For instance, suppose natural language understanding system 520 first receives the first voice input 501 "I want to watch a TV play" and shortly afterwards (for example, a few seconds later) receives the fourth voice input 501 "Play The Romance of the Three Kingdoms for me". Natural language understanding system 520 recognizes the first keyword 509 "TV play" in the first voice input 501 and subsequently recognizes "The Romance of the Three Kingdoms" as the fourth keyword. Natural language understanding system 520 can therefore select from the candidate list the return answer concerning both "The Romance of the Three Kingdoms" and "TV play", and output the fourth voice response to the user according to this fourth return answer.
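Combining the keywords of two consecutive voice inputs, as in the example above, can be sketched in Python as follows. Representing each return answer as a tuple of its keywords, and the function name `select_with_context`, are assumptions of this sketch.

```python
def select_with_context(candidates, earlier_keywords, later_keywords):
    """Return the first candidate that contains the keywords of both inputs.

    Each candidate is a tuple of the keywords attached to a return answer.
    Returns None when no candidate covers every keyword.
    """
    wanted = set(earlier_keywords) | set(later_keywords)
    for answer in candidates:
        if wanted <= set(answer):
            return answer
    return None


title = "The Romance of the Three Kingdoms"
candidates = [
    (title, "books"),
    (title, "TV play"),
    (title, "music"),
]

chosen = select_with_context(candidates, ["TV play"], [title])
print(chosen)  # ('The Romance of the Three Kingdoms', 'TV play')
```

Although "books" is listed first, the keyword "TV play" carried over from the earlier input causes the TV play answer to be chosen.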

Based on the above, natural language understanding system 520 can, according to the voice input from the user and with reference to information such as the public's usage habits, the user's preferences, the user's habits, or the user's preceding and following dialogue, output to the user a voice response that better matches the request information of the voice input. Natural language understanding system 520 can sort the return answers in the candidate list by priority according to different sorting methods, for example the public's usage habits, the user's preferences, the user's habits, or the user's preceding and following dialogue. Accordingly, if the voice input from the user is not explicit, natural language understanding system 520 can refer to the public's usage habits, the user's preferences, the user's habits, or the user's preceding and following dialogue to judge the intention implied in the user's voice input 501 (for example, the attribute or knowledge domain of keyword 509 in the first voice input 501). In other words, if a return answer is close to the intention that the user has previously expressed or that the public usually means, natural language understanding system 520 gives priority to that return answer. Thus, the voice response output by natural language dialogue system 500 can better match the user's request information.

In summary, in the method of correcting a voice response and the natural language dialogue system of the present embodiment, the natural language dialogue system outputs the corresponding first voice response 507 according to the first voice input 501 from the user. When the first voice response 507 output by the natural language dialogue system does not match the first request information 503 or the first keyword 509 of the user's first voice input 501, the natural language dialogue system corrects the originally output first voice response 507 and, according to the second voice input 501' subsequently provided by the user, further selects a second voice response 507' that matches the user's request. In addition, the natural language dialogue system can preferentially select a more suitable return answer according to the public's usage habits, the user's preferences, the user's habits, or the user's preceding and following dialogue, and output the corresponding voice response to the user accordingly. Thus, if the user is dissatisfied with the answer provided by the natural language dialogue system, the natural language dialogue system can correct itself automatically according to the request information the user states each time and report a new voice response to the user, improving the convenience of dialogue between the user and the natural language dialogue system.

Next, the architecture and components of natural language understanding system 100, structured database 220, and so on are applied to further examples of providing responses and return answers according to the user's dialogue scenario and context, the user's usage habits, the public's usage habits, and the user's preferences.

Fig. 7A is a block diagram of a natural language dialogue system according to one embodiment of the invention. Referring to Fig. 7A, natural language dialogue system 700 comprises voice sampling module 710, natural language understanding system 720, property database 730, and speech synthesis database 740. In fact, voice sampling module 710 in Fig. 7A is identical to voice sampling module 510 of Fig. 5A, and natural language understanding system 720 is identical to natural language understanding system 520, so they perform the same functions. In addition, when natural language understanding system 720 analyzes request information 703, it can also obtain the user's intention by performing a full-text search on structured database 220 of Fig. 1; this technique was explained in the description of Fig. 1 and is not repeated here. Property database 730 stores the user preference data 715 sent by natural language understanding system 720 and provides user preference records 717 to natural language understanding system 720; this is described in detail later. Speech synthesis database 740 is equivalent to speech synthesis database 530 and provides voice output to the user. In the present embodiment, voice sampling module 710 receives voice input 701 (that is, the first/second voice input 501/501' of Fig. 5A/B, which is speech from the user), and natural language understanding system 720 parses the request information 703 in the voice input (that is, the first/second request information 503/503' of Fig. 5A/B) and outputs the corresponding voice response 707 (that is, the first/second voice response 507/507' of Fig. 5A/B). The components of natural language dialogue system 700 can be configured in the same machine, and the invention is not limited in this respect.

Natural language understanding system 720 receives from voice sampling module 710 the request information 703 parsed from voice input 701. Natural language understanding system 720 then produces a candidate list containing at least one return answer according to the one or more keywords 709 in voice input 701, finds in the candidate list the one that matches keywords 709 as return answer 711, queries speech synthesis database 740 accordingly to find the speech 713 corresponding to return answer 711, and finally outputs voice response 707 according to speech 713. In addition, natural language understanding system 720 of the present embodiment can be implemented by a hardware circuit composed of one or several logic gates, or by computer program code; this is only an illustration and not a limitation.

Fig. 7B is a block diagram of natural language dialogue system 700' according to another embodiment of the invention. Natural language understanding system 720' of Fig. 7B can comprise speech recognition module 722 and natural language processing module 724, while voice sampling module 710 and speech synthesis module 726 can be combined in a speech synthesis processing module 702. Speech recognition module 722 receives from voice sampling module 710 the request information 703 parsed from voice input 701 and converts it into one or more keywords 709. Natural language processing module 724 then processes these keywords 709, obtains at least one candidate list, and selects from the candidate list the one that matches the voice input as return answer 711. Since return answer 711 is an answer obtained by the internal analysis of natural language understanding system 720, it must still be converted into text or speech before it can be output to the user. Speech synthesis module 726 therefore queries speech synthesis database 740 according to return answer 711; speech synthesis database 740 records, for example, text and its corresponding speech information, so that speech synthesis module 726 can find the speech 713 corresponding to return answer 711 and synthesize voice response 707. Afterwards, speech synthesis module 726 outputs the synthesized speech to the user through a voice output interface (not shown; for example, a loudspeaker or earphones). It should be noted that, in Fig. 7A, natural language understanding system 720 incorporates speech synthesis module 726 (for example, as in the architecture of Fig. 5B, although speech synthesis module 726 is not shown in Fig. 7A), and the speech synthesis module uses return answer 711 to query speech synthesis database 740 to obtain speech 713 as the basis for synthesizing voice response 707.

In the present embodiment, speech recognition module 722, natural language processing module 724, and speech synthesis module 726 in natural language understanding system 720 are equivalent to speech recognition module 522, natural language processing module 524, and speech synthesis module 526 of Fig. 5B respectively, and provide the same functions. In addition, speech recognition module 722, natural language processing module 724, and speech synthesis module 726 can be configured in the same machine as voice sampling module 710. In other embodiments, speech recognition module 722, natural language processing module 724, and speech synthesis module 726 can also be distributed across different machines (for example, computer systems, servers, or similar devices/systems). For example, in natural language understanding system 720' shown in Fig. 7B, speech synthesis module 726 can be configured in the same machine 702 as voice sampling module 710, while speech recognition module 722 and natural language processing module 724 can be configured in another machine. It should be noted that, in the architecture of Fig. 7B, because speech synthesis module 726 and voice sampling module 710 are configured in machine 702, natural language understanding system 720 needs to send return answer 711 to machine 702, and speech synthesis module 726 sends return answer 711 to speech synthesis database 740 to look up the corresponding speech 713 as the basis for producing voice response 707. In addition, when speech synthesis module 726 calls speech synthesis database 740 according to return answer 711, it may first need to convert the format of return answer 711 and then make the call through the interface defined by speech synthesis database 740; since this is known to those skilled in the art, it is not described in detail here.
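As a purely illustrative sketch of the format conversion and lookup mentioned above, the following Python code flattens a structured return answer into a text key and uses it to query a stand-in speech synthesis database. The dictionary-based database, the placeholder audio strings, and the names `to_query_text` and `look_up_speech` are all hypothetical; the actual interface is whatever speech synthesis database 740 defines.

```python
def to_query_text(return_answer):
    """Format conversion: flatten a structured return answer into a text key."""
    return f'{return_answer["category"]} of {return_answer["title"]}'


def look_up_speech(return_answer, speech_db):
    """Query the stand-in database; returns None when no speech is stored."""
    return speech_db.get(to_query_text(return_answer))


# Hypothetical database mapping text to a placeholder for stored speech.
speech_db = {
    "books of The Romance of the Three Kingdoms": "<speech: books>",
    "TV play of The Romance of the Three Kingdoms": "<speech: TV play>",
}

answer = {"title": "The Romance of the Three Kingdoms", "category": "TV play"}
print(look_up_speech(answer, speech_db))  # <speech: TV play>
```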

The natural language dialogue method is described below with reference to natural language dialogue system 700 of Fig. 7A. Fig. 8 is a flowchart of a natural language dialogue method according to one embodiment of the invention. For convenience of description, only natural language dialogue system 700 of Fig. 7A is taken as an example here, but the natural language dialogue method of the present embodiment is also applicable to natural language dialogue system 700' of Fig. 7B. Compared with Fig. 5/6, which deal with automatically correcting the output information according to the user's voice input, Fig. 7A/7B/8 deal with recording user preference data 715 in property database 730, selecting return answer 711 from the candidate list accordingly, and playing the corresponding speech to the user. In fact, the embodiments of Fig. 5/6 and Fig. 7A/7B/8 can be used alternatively or together, and the invention is not limited in this respect.

Referring to Fig. 7A and Fig. 8 together, in step S810, voice sampling module 710 receives voice input 701. Voice input 701 is, for example, speech from the user and can carry the user's request information 703. Specifically, voice input 701 from the user can be an interrogative sentence, an imperative sentence, or other request information, such as the earlier examples "I want to see The Romance of the Three Kingdoms", "I want to listen to the music of lustily water", or "What is the temperature today". It should be noted that steps S802-S806 are the flow in which natural language dialogue system 700 stores user preference data 715 from the user's previous voice inputs, and the subsequent steps S810-S840 operate on the basis of the user preference data 715 previously stored in property database 730. The details of steps S802-S806 are described later; the operations of steps S820-S840 are described first.

In step S820, the natural language understanding system 720 parses at least one keyword 709 contained in the first voice input 701 and then obtains a candidate list having one or more return answers. Specifically, the natural language understanding system 720 parses the voice input 701 and obtains one or more keywords 709 from it. For example, when the user's voice input 701 is "I want to see Romance of the Three Kingdoms", the keywords 709 obtained after analysis are, for example, "Romance of the Three Kingdoms" and "see" (as noted earlier, the system then also analyzes whether the user wants a book, a TV series, or a film). Likewise, when the voice input 701 is "I want to listen to the song Lustily Water", the keywords 709 are, for example, "Lustily Water", "listen", and "song" (as noted earlier, the system then analyzes whether the user wants the version sung by Liu Dehua or the one sung by Li Yijun). Next, the natural language understanding system 720 performs a full-text search on the structured database according to these keywords 709 and obtains at least one search result (which may be at least one of the records of Fig. 3A/3B) to serve as a return answer in the candidate list.

A keyword 709 may belong to different knowledge domains (for example films, books, music, or games), and a single domain may be further divided into several categories (for example different authors of the same film or book title, different singers of the same song title, or different versions of the same game title). For one keyword 709, the natural language understanding system 720 may therefore obtain, after analysis (for example by a full-text search on the structured database 220), one or more search results related to that keyword, each containing the keyword 709 and other information besides it (the other information is as shown in Table 1). From another viewpoint, when the first voice input 701 contains several keywords 709, the user's request information 703 is more definite, so the natural language understanding system 720 can find search results closer to the request information 703 (if it finds a complete match, that match should be the option the user wants).
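As an illustrative sketch only, the keyword-driven full-text search of step S820 might look like the following. The record layout, the field names, and the popularity values are assumptions made for this example and are not taken from Table 1 or Fig. 3A/3B.

```python
# Hypothetical records standing in for the structured database 220.
RECORDS = [
    {"title": "Romance of the Three Kingdoms", "category": "books", "popularity": 12},
    {"title": "Romance of the Three Kingdoms", "category": "TV series", "popularity": 30},
    {"title": "Lustily Water", "category": "music", "singer": "Liu Dehua", "popularity": 25},
    {"title": "Lustily Water", "category": "music", "singer": "Li Yijun", "popularity": 9},
]


def full_text_search(records, keywords):
    """Return every record whose field values contain all of the keywords."""
    hits = []
    for record in records:
        # Flatten the record into one lowercase string and search it as text.
        text = " ".join(str(value) for value in record.values()).lower()
        if all(keyword.lower() in text for keyword in keywords):
            hits.append(record)
    return hits


# Each hit becomes one entry of the candidate list.
candidates = full_text_search(RECORDS, ["Lustily Water", "music"])
print([c["singer"] for c in candidates])  # ['Liu Dehua', 'Li Yijun']
```

Adding a further keyword, such as "books" next to "Romance of the Three Kingdoms", narrows the hits to a single record, which mirrors the remark that more keywords make the request more definite.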

For example, when the keyword 709 is "Romance of the Three Kingdoms", the search results found by the natural language understanding system 720 are, for example, records of the form "... 'Romance of the Three Kingdoms' ... 'TV series'" and "... 'Romance of the Three Kingdoms' ... 'books'", where "TV series" and "books" indicate the user intent associated with each result. Likewise, when the keywords 709 are "Lustily Water" and "music", the search results may be the records "... 'Lustily Water' ... 'music' ... 'Liu Dehua'" and "... 'Lustily Water' ... 'music' ... 'Li Yijun'", where "Liu Dehua" and "Li Yijun" distinguish the user intent of each result. In other words, after the natural language understanding system 720 performs a full-text search on the structured database 220, each search result contains the keyword 709 and other data related to it (as shown in Table 1), and the system converts the search results into a candidate list containing at least one return answer for use in the subsequent steps.

In step S830, the natural language understanding system 720 selects one return answer 711 from the candidate list according to the user preference record 717 sent by the property database 730 (for example a summary compiled from the user preference data 715 stored there, as explained later). In this embodiment, the natural language understanding system 720 ranks the candidate list by a priority and selects the return answer 711 accordingly (the possible priority schemes are detailed below). In step S840, the voice response 707 is output according to the return answer 711.

For example, in one embodiment the number of search results serves as the priority. When the keyword 709 is "Romance of the Three Kingdoms", suppose the natural language understanding system 720 finds that the structured database 220 holds the most records of the form "... 'Romance of the Three Kingdoms' ... 'books'", fewer records of the form "... 'Romance of the Three Kingdoms' ... 'music'", and the fewest of the form "... 'Romance of the Three Kingdoms' ... 'TV series'". The system then treats the records related to "the Romance of the Three Kingdoms books" as the first-priority return answers (for example by compiling all of those records into one candidate list, which may be sorted by the value of the popularity field 316), the records related to "the Romance of the Three Kingdoms music" as the second priority, and the records related to "the Romance of the Three Kingdoms TV series" as the third priority. Besides the number of search results, the priority may also be based on user preferences, user habits, or general usage habits, as described later.
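The count-based priority just described can be sketched as follows. This is a minimal illustration; the function name and the record counts (5, 3, 1) are invented for the example, since the text only states which category has the most and the fewest records.

```python
from collections import Counter


def rank_categories_by_count(search_results):
    """Order categories by how many search results fall into each one."""
    counts = Counter(result["category"] for result in search_results)
    # most_common() lists the category with the most records first.
    return [category for category, _ in counts.most_common()]


# Illustrative result set: most hits are books, then music, then TV series.
results = (
    [{"category": "TV series"}] * 1
    + [{"category": "books"}] * 5
    + [{"category": "music"}] * 3
)
print(rank_categories_by_count(results))  # ['books', 'music', 'TV series']
```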

To help those skilled in the art further understand the natural language dialogue method and system of this embodiment, a detailed example follows.

First, suppose the first voice input 701 received by the speech sampling module 710 is "I want to see Romance of the Three Kingdoms" (step S810). The natural language understanding system 720 then parses the keywords 709 "see" and "Romance of the Three Kingdoms" and obtains a candidate list with several return answers, each of which has the relevant keywords and other information, as shown in Table 1 above (step S820).

Next, the natural language understanding system 720 selects a return answer from the candidate list. Suppose it chooses return answer a in the candidate list (see Table 1) as the first return answer 711; the system then outputs, for example, "Play the Romance of the Three Kingdoms books?" as the voice response 707 (steps S830-S840).

As mentioned above, the natural language understanding system 720 may also rank the return answers in the candidate list by different priority schemes and output the voice response 707 corresponding to the selected return answer 711. For example, it may judge the user's preferences from the user's dialogue records (for example from the positive and negative terms used by the user, as mentioned earlier) and use the resulting user preference record 717 to decide the priority of the return answers 711. Before explaining how the user's positive and negative terms are used, the way the user preference data 715 stores the likes, dislikes, and habits of the user and of users in general is explained first.

The storage of the user preference data 715 in steps S802-S806 is now described. In one embodiment, before the voice input 701 is received in step S810, a number of voice inputs (that is, the previous dialogue history) are received in step S802; user preference data 715 is obtained from these earlier voice inputs 701 (step S804) and then stored in the property database 730. In practice, the user preference data 715 may instead be stored in the structured database 220 (in other words, the property database 730 may be merged into the structured database 220). For example, in one embodiment the popularity field 316 of Fig. 3B is used directly to record the user's preferences; the recording method was described earlier (for example, the popularity field of a record 302 is incremented by one each time the record is matched) and is not repeated here. Alternatively, additional fields may be set up in the structured database 220 to store the user preference data 715. For example, taking a keyword such as "Romance of the Three Kingdoms" as the basis and combining it with the user's expressed preference (when the user uses a positive term such as "like" or a negative term such as "dislike", the value of the like field 318 or the dislike field 320 of Fig. 3B is incremented by one, respectively), the number of expressed preferences can be counted (for example the total numbers of positive and negative terms). When the natural language understanding system 720 queries the structured database 220 for the user preference record 717, it can then directly read the values of the like field 318 and the dislike field 320 (that is, how many positive and negative terms were counted) and judge the user's preference from them (the counts of positive and negative terms are sent to the natural language understanding system 720 as the user preference record 717).
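A minimal sketch of the counting scheme for fields 318 and 320 is given below. The term lists, the dictionary keys, and the whole-word matching rule are assumptions chosen for illustration; the text does not specify how positive and negative terms are detected.

```python
POSITIVE_TERMS = ("like", "love")     # illustrative positive terms
NEGATIVE_TERMS = ("dislike", "hate")  # illustrative negative terms


def update_preference_counts(record, utterance):
    """Add one to the record's like or dislike counter based on the utterance.

    `record` stands in for one database record, with "like" playing the role
    of field 318 and "dislike" the role of field 320.
    """
    words = utterance.lower().split()
    # Whole-word matching keeps "dislike" from also counting as "like".
    if any(term in words for term in NEGATIVE_TERMS):
        record["dislike"] += 1
    elif any(term in words for term in POSITIVE_TERMS):
        record["like"] += 1
    return record


record = {"keyword": "Romance of the Three Kingdoms", "like": 0, "dislike": 0}
update_preference_counts(record, "I like the Romance of the Three Kingdoms books")
update_preference_counts(record, "I dislike the Romance of the Three Kingdoms TV series")
print(record["like"], record["dislike"])  # 1 1
```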

The case in which the user preference data 715 is stored in the property database 730 (that is, the property database 730 is not merged into the structured database 220) is described below. In one embodiment, the user preference data 715 is stored as a correspondence between keywords and the user's preference for them. For example, the like field 852 and the dislike field 862 of Fig. 8B record the individual user's like and dislike of a set of keywords, and the like field 854 and the dislike field 864 record the like and dislike of users in general for the same set of keywords. In Fig. 8B, record 832 stores the keywords "Romance of the Three Kingdoms" and "books" with values of 20 and 1 in the like field 852 and the dislike field 862; record 834 stores the keywords "Romance of the Three Kingdoms" and "TV series" with values of 8 and 20; and record 836 stores the keywords "Romance of the Three Kingdoms" and "music" with values of 1 and 8. These values are the individual user's like and dislike data for the keywords (a higher value in the like field 852 means a stronger liking, and a higher value in the dislike field 862 means a stronger dislike). In addition, the values of the like field 854 and the dislike field 864 are 5 and 3 for record 832, 80 and 20 for record 834, and 2 and 10 for record 836; these are the like and dislike data of users in general for the keywords (abbreviated as the "preference indication"). The values of the like field 852 and the dislike field 862 can thus be increased according to the user's preferences.

Therefore, when the user inputs the speech "I want to see the Romance of the Three Kingdoms TV series", the natural language understanding system 720 combines a "preference indication" that increases the like field value with the keywords "Romance of the Three Kingdoms" and "TV series" into user preference data 715 and sends it to the property database 730. The property database 730 then increments the like field 852 of record 834 by one (because the user wants to see the "Romance of the Three Kingdoms" "TV series", the user's liking for it has increased). With preference data recorded in this way, when the user later inputs a related keyword, for example "I want to see Romance of the Three Kingdoms", the natural language understanding system 720 finds the three records 832, 834, and 836 related to "Romance of the Three Kingdoms" in the property database 730 of Fig. 8B, and the property database 730 returns the values of the like field 852 and the dislike field 862 to the natural language understanding system 720 as the user preference record 717, which the system uses as the basis for judging the individual user's preference. The property database 730 may also return the values of the like field 854 and the dislike field 864 as the user preference record 717, in which case the record serves as the basis for judging general preferences; the invention does not limit whether the user preference record 717 represents the preference of the individual user or of users in general.
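The records of Fig. 8B and the lookup that produces record 717 can be sketched as follows. The numeric values are those quoted above for records 832, 834, and 836; the class name, attribute names, and the `scope` parameter are hypothetical and serve only to illustrate the personal versus general distinction.

```python
from dataclasses import dataclass


@dataclass
class PreferenceRecord:
    record_id: int
    keywords: tuple
    user_like: int     # field 852
    user_dislike: int  # field 862
    all_like: int      # field 854
    all_dislike: int   # field 864


PROPERTY_DB = [
    PreferenceRecord(832, ("Romance of the Three Kingdoms", "books"), 20, 1, 5, 3),
    PreferenceRecord(834, ("Romance of the Three Kingdoms", "TV series"), 8, 20, 80, 20),
    PreferenceRecord(836, ("Romance of the Three Kingdoms", "music"), 1, 8, 2, 10),
]


def query_preference_record(db, keyword, scope="user"):
    """Return {record_id: (like, dislike)} for every record matching the keyword.

    scope="user" reads fields 852/862; any other value reads fields 854/864.
    """
    result = {}
    for rec in db:
        if keyword in rec.keywords:
            if scope == "user":
                result[rec.record_id] = (rec.user_like, rec.user_dislike)
            else:
                result[rec.record_id] = (rec.all_like, rec.all_dislike)
    return result


print(query_preference_record(PROPERTY_DB, "Romance of the Three Kingdoms"))
# {832: (20, 1), 834: (8, 20), 836: (1, 8)}
```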

In another embodiment, the values of the like field 852 and the dislike field 862 also serve as a basis for judging the habits of the user or of users in general. For example, after receiving the user preference record 717, the natural language understanding system 720 first compares the values of the like field 852/854 and the dislike field 862/864. If the two values differ by more than a certain threshold, the user is accustomed to a particular way of conversing. For example, when the value of the like field 852 is more than 10 times the value of the dislike field 862, the user habitually uses "positive terms" in dialogue (this is one way of recording a "user habit"), and in this case the natural language understanding system 720 may select the return answer using the like field 852 alone. When the natural language understanding system 720 uses the values of the like field 854 and the dislike field 864 stored in the property database 730, what is judged is the preference record of all users of the property database 730, and the result serves as reference data for general usage habits. Note that the user preference record 717 returned by the property database 730 to the natural language understanding system 720 may contain both the individual user's preference record (for example the values of the like field 852 and the dislike field 862) and the general preference record (for example the values of the like field 854 and the dislike field 864); the invention is not limited in this respect.
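The threshold test for a term-usage habit can be sketched as below. Only the "more than 10 times" case for positive terms comes from the text; the symmetric negative case, the function name, and the return labels are assumptions added for illustration.

```python
def detect_term_habit(like_count, dislike_count, ratio=10):
    """Classify a term-usage habit from a like count and a dislike count."""
    if like_count > ratio * dislike_count:
        return "positive"  # rank by the like field alone
    if dislike_count > ratio * like_count:
        return "negative"  # assumed mirror case, not stated in the text
    return "none"


print(detect_term_habit(20, 1))  # positive (record 832: 20 > 10 * 1)
print(detect_term_habit(8, 20))  # none (record 834: neither count dominates)
```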

As for storing the user preference data 715 obtained from the current voice input, the natural language dialogue system 700 may store it when the candidate list is produced in step S820 (whether the match is complete or partial). For example, in step S820, when the keywords produce a matching result in the structured database 220, the user can be judged to have some preference for that result. The "keywords" and the "preference indication" are therefore sent to the property database 730, and once the corresponding record is found there, the value of its like field 852/854 or dislike field 862/864 is changed (for example, when the user inputs "I want to read the Romance of the Three Kingdoms books", the values of the like fields 852/854 of record 832 in Fig. 8B are incremented by one). In another embodiment, the natural language dialogue system 700 stores the user preference data 715 in step S830, only after the user has chosen a return answer. In addition, if no corresponding keywords are found in the property database 730, a new record may be created to store the user preference data 715. For example, when the user inputs the speech "I want to listen to Liu Dehua's Lustily Water", producing the keywords "Liu Dehua" and "Lustily Water", and no corresponding record is found in the property database 730, a new record 838 is created in the property database 730 and its like fields 852/854 are incremented by one. The storage timing and storage method described above are for illustration only. Those skilled in the art may vary the embodiments shown according to the practical application, and all equivalent modifications that do not depart from the spirit of the invention are covered by the claims of the invention.
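The update-or-create behavior described here can be sketched as follows. Keying the store by a frozen set of keywords and using a single like/dislike pair per record are simplifying assumptions for this example; the text itself keeps separate personal and general fields.

```python
def store_preference(db, keywords, indication):
    """Increment the 'like' or 'dislike' counter for a keyword set.

    A new record is created when the keyword set is not yet in the store,
    mirroring the creation of record 838 in the example above.
    """
    key = frozenset(keywords)
    record = db.setdefault(key, {"like": 0, "dislike": 0})
    record[indication] += 1
    return record


db = {frozenset({"Romance of the Three Kingdoms", "books"}): {"like": 20, "dislike": 1}}

# Existing record: the like counter goes from 20 to 21.
store_preference(db, ["Romance of the Three Kingdoms", "books"], "like")
# Unknown keyword set: a new record is created with like == 1.
store_preference(db, ["Liu Dehua", "Lustily Water"], "like")
print(len(db))  # 2
```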

In addition, although the format of records 832-838 stored in the property database 730 shown in Fig. 8B differs from the record format of the structured database 220 (for example those shown in Fig. 3A/3B/3C), the invention does not limit the storage format of the records. Moreover, although the above embodiment only describes the storage and use of the like fields 852/854 and the dislike fields 862/864, in another embodiment additional fields 872/874 may be set up in the property database 730 to store other habits of the user and of users in general, for example the number of times the data of a record has been downloaded, quoted, recommended, commented on, or referred. In yet another embodiment, these counts may instead be accumulated in the like fields 852/854 and the dislike fields 862/864: for example, each time the user provides a comment on a record or refers it to others, the value of the like field 852/854 is incremented by one, and when the user gives a record a bad comment, the value of the dislike field 862/864 is incremented by one. The invention limits neither the number of records nor the way the field values are recorded. Those skilled in the art will appreciate that, because the like field 852, the dislike field 862, and similar fields in Fig. 8B relate only to the individual user's choices and preferences, this personal information can be stored in the user's mobile communication device, while the like field 854, the dislike field 864, and similar fields relating to all users are stored in the server. This saves storage space on the server and also keeps the individual user's preferences confidential.

The actual usage situation is further explained below with reference to Fig. 7A and Fig. 8B. Based on the dialogue content of several voice inputs 701, suppose that in conversation with the natural language understanding system 720 the user often says "I dislike watching the Romance of the Three Kingdoms TV series", less often says "I dislike listening to the Romance of the Three Kingdoms music", and even less often says "I dislike reading the Romance of the Three Kingdoms books". For example, the property database 730 holds 20 records of "I dislike watching the Romance of the Three Kingdoms TV series" (that is, in Fig. 8B the number of negative terms for "Romance of the Three Kingdoms" plus "TV series" is 20), 8 records of "I dislike listening to the Romance of the Three Kingdoms music" (the number of negative terms for "Romance of the Three Kingdoms" plus "music" is 8), and 1 record of "I dislike reading the Romance of the Three Kingdoms books" (the number of negative terms for "Romance of the Three Kingdoms" plus "books" is 1). Because the user preference record 717 returned from the property database 730 contains these three negative-term counts (20, 8, and 1), the natural language understanding system 720 orders the return answers 711 in the candidate list as "the Romance of the Three Kingdoms books", "the Romance of the Three Kingdoms music", and "the Romance of the Three Kingdoms TV series". That is, when the keyword 709 is "Romance of the Three Kingdoms", the natural language understanding system 720 selects the Romance of the Three Kingdoms books as the return answer 711 and outputs the voice response 707 according to it. Although the ordering above uses only the count of negative terms used by the user, in another embodiment the ordering may use only the count of positive terms (for example, as mentioned earlier, when the value of the like field 852 exceeds that of the dislike field 862 by a certain threshold).
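The ordering by negative-term counts in this example reduces to an ascending sort, sketched below with the counts 20, 8, and 1 quoted above. The function and variable names are illustrative.

```python
def rank_by_dislike(dislike_counts):
    """Order candidates so that the least-disliked one comes first."""
    return sorted(dislike_counts, key=dislike_counts.get)


# Negative-term counts for "Romance of the Three Kingdoms" from the example.
counts = {"TV series": 20, "music": 8, "books": 1}
print(rank_by_dislike(counts))  # ['books', 'music', 'TV series']
```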

It is worth mentioning that the natural language understanding system 720 may also decide the priority of the return answers from the numbers of both the positive and the negative terms used by the user. Specifically, the property database 730 may record the terms the user has expressed, for example "like" and "idol" (positive terms) or "detest" and "dislike" (negative terms). Besides comparing how many times the user has used "like" and "detest", the natural language understanding system 720 may therefore sort the return answers in the candidate list directly by the positive or negative term counts associated with the keywords (that is, by comparing which of the positive and negative terms was used more often). For example, a return answer associated with more uses of "like" (more positive-term uses, or a larger value in the like field 852) is selected preferentially, whereas a return answer associated with more uses of "detest" (more negative-term uses, or a larger value in the dislike field 862) is selected later. The natural language understanding system 720 can thus compile all return answers into one candidate list according to this priority ordering. Some users prefer positive terms (for example, the value of the like field 852 is especially large), while others are accustomed to negative terms (for example, the value of the dislike field 862 is especially large). Because the user preference record 717 reflects the individual user's usage habits, this embodiment can offer the user options that better match those habits.
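One possible way to combine both counts is sketched below. Scoring each candidate by the like count minus the dislike count is an assumption made for this example; the text only says that more positive terms move an answer earlier and more negative terms move it later. The values are the personal counts of fields 852 and 862 quoted for Fig. 8B.

```python
def rank_by_net_preference(counts):
    """Order candidates by (like - dislike), highest first.

    `counts` maps a candidate name to a (like, dislike) pair.
    """
    return sorted(counts, key=lambda name: counts[name][0] - counts[name][1], reverse=True)


user_counts = {
    "books": (20, 1),      # net +19
    "TV series": (8, 20),  # net -12
    "music": (1, 8),       # net -7
}
print(rank_by_net_preference(user_counts))  # ['books', 'music', 'TV series']
```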

In addition, the natural language understanding system 720 may rank the return answers 711 in the candidate list according to general usage habits, giving higher priority to the answers that users in general choose more often (recorded, for example, with the popularity field 316 of Fig. 3C). For example, when the keyword 709 is "Romance of the Three Kingdoms", suppose the return answers found by the natural language understanding system 720 are the TV series, the books, and the music of Romance of the Three Kingdoms. If people mentioning "Romance of the Three Kingdoms" usually mean the TV series, fewer mean the film, and fewer still mean the books (for example, in Fig. 8B the values of the like field 854 of the corresponding records are 80, 40, and 5), the natural language understanding system 720 ranks the return answers 711 in the order "TV series", "film", "books". That is, the natural language understanding system 720 preferentially selects "the Romance of the Three Kingdoms TV series" as the return answer 711 and outputs the voice response 707 according to it. The "priority of answers commonly used by users in general" can be recorded with the popularity field 316 of Fig. 3C; the recording method was disclosed in the paragraphs on Fig. 3C and is not repeated here.

The natural language understanding system 720 may also decide the priority of the return answer 711 according to the user's frequency of use. Specifically, because the natural language understanding system 720 records the voice inputs 701 previously received from the user in the property database 730, the property database 730 holds the keywords 709 obtained when the system parsed each voice input 701 and all the response information the system produced, such as the return answers 711. When selecting a return answer 711 later, the natural language understanding system 720 can therefore use the response information recorded in the property database 730 (for example the user's likes, dislikes, and habits, or even those of users in general) to find, by priority ordering, the return answer 711 that matches the user's intent (as judged from the user's voice input) and produce the corresponding voice response. The "priority of the return answer 711 decided by user habits" can likewise be recorded with the popularity field 316 of Fig. 3C; the recording method was disclosed in the paragraphs on Fig. 3C and is not repeated here.

In summary, the natural language understanding system 720 stores the user preference attributes (for example positive and negative terms), user habits, and general usage habits described above in the property database 730 (step S806). That is, in steps S802, S804, and S806, the user preference data 715 is learned from the user's previous dialogue history and added to the property database 730, and the user habits and general usage habits are stored there as well, so that the natural language understanding system 720 can use the rich information in the property database 730 (for example the user preference record 717) to give the user more accurate responses.

The details of step S830 are described next. After the voice input is received in step S810 and its keywords 709 are parsed in step S820 to obtain the candidate list, the natural language understanding system 720 determines the priority of the at least one return answer according to the user preference record 717, which reflects user preferences, user habits, or general usage habits (step S880). As mentioned above, the priority may be based on the number of records found, or on the positive and negative terms used by the user or by users in general. A return answer 711 is then selected from the candidate list according to the priority (step S890); as mentioned above, the answer with the highest matching degree or the highest priority may be selected. Finally, the voice response 707 is output according to the return answer 711 (step S840).

On the other hand, the natural language understanding system 720 may also decide the priority of the at least one return answer according to a voice input 701 that the user entered earlier. That is, suppose another voice input 701 (for example the fourth voice input mentioned above) is received by the speech sampling module 710 before the voice response 707 is played. The natural language understanding system 720 then also parses the keyword in that voice input 701 (that is, the fourth keyword 709), preferentially chooses from the candidate list the return answer that matches this keyword as the return answer 711, and outputs the voice response 707 according to it.

For example, suppose the natural language understanding system 720 first receives the voice input 701 "I want to watch a TV series" and, a few seconds later, receives the voice input 701 "Please play Romance of the Three Kingdoms for me". The natural language understanding system 720 recognizes the keyword "TV series" (the first keyword) in the first voice input 701 and the keyword "Romance of the Three Kingdoms" (the fourth keyword) in the later one. It therefore chooses from the candidate list the return answer whose intent data relates to both "Romance of the Three Kingdoms" and "TV series", and outputs the voice response 707 to the user according to this return answer 711.
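Using keywords from consecutive voice inputs to pick one candidate can be sketched as follows. The candidate structure and the "most keywords matched" rule are assumptions for illustration; the text only states that the answer related to both keywords is chosen.

```python
def select_with_context(candidates, keyword_history):
    """Pick the candidate matching the most keywords gathered across turns."""
    def matched(candidate):
        return sum(1 for keyword in keyword_history if keyword in candidate["keywords"])
    return max(candidates, key=matched)


candidates = [
    {"answer": "books", "keywords": {"Romance of the Three Kingdoms", "books"}},
    {"answer": "TV series", "keywords": {"Romance of the Three Kingdoms", "TV series"}},
    {"answer": "music", "keywords": {"Romance of the Three Kingdoms", "music"}},
]

# Keyword from the earlier input, then keyword from the later input.
history = ["TV series", "Romance of the Three Kingdoms"]
print(select_with_context(candidates, history)["answer"])  # TV series
```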

Based on the above, the natural language understanding system 720 can take the user's voice input and, considering general usage habits, user preferences, user habits, or the user's preceding and following dialogue, output a voice response 707 that better matches the request information 703 of the voice input 701. The natural language understanding system 720 can rank the return answers in the candidate list by different schemes, such as general usage habits, user preferences, user habits, or the dialogue context. Thus, when the user's voice input 701 is ambiguous, the natural language understanding system 720 can weigh these factors to judge the intent implied in the voice input 701 (for example the attribute or knowledge domain of the keyword 709). In other words, if a return answer 711 is close to an intent that the user or users in general have expressed before, the natural language understanding system 720 gives that return answer 711 priority. In this way, the voice response 707 output by the natural language dialogue system 700 better matches the user's request information 703.

Note that although the property database 730 and the structured database 220 are described above as separate databases, the two may be combined; those skilled in the art can choose according to the practical application.

In summary, the invention provides a natural language dialogue method and system in which the natural language dialogue system outputs a voice response corresponding to the user's voice input. The natural language dialogue system of the invention can also preferentially select a more suitable return answer according to general usage habits, user preferences, user habits, or the user's dialogue context, and output the voice response to the user accordingly, which makes dialogue between the user and the natural language dialogue system more convenient.

Next, an example is described in which the architecture and components of the natural language understanding system 100, the structured database 220, and so on are used to decide, from the number of return answers obtained by analyzing the request information of the user's voice input, whether to operate directly according to the data type or to ask the user for a further indication; once only one return answer remains, the operation can likewise be performed directly according to the data type. The benefit of giving the user this choice is that the system does not have to screen the return answers on the user's behalf. Instead, it presents the candidate list containing the return answers directly to the user and lets the user decide, by choosing a return answer, which software to run or which service to use, so as to provide a user-friendly interface.
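The decision described in this paragraph, to operate directly when one answer remains and otherwise to show a candidate list and narrow it with a further voice input, can be sketched as follows. The function names, the tuple return values, and the handling of an empty result are assumptions for illustration and are not specified in the text.

```python
def dispatch(answers):
    """Decide the next action from the number of answers found."""
    if not answers:
        return ("no_result", None)              # assumed behavior, not in the text
    if len(answers) == 1:
        return ("operate", answers[0]["type"])  # act according to the data type
    return ("show_candidates", answers)         # let the user choose


def refine(candidates, second_keywords):
    """Keep the candidates that match every keyword of the second voice input."""
    return [
        c for c in candidates
        if all(k in (c["title"], c["type"]) for k in second_keywords)
    ]


answers = [
    {"title": "Romance of the Three Kingdoms", "type": "books"},
    {"title": "Romance of the Three Kingdoms", "type": "TV series"},
]
print(dispatch(answers)[0])                         # show_candidates
print(dispatch(refine(answers, ["TV series"])))     # ('operate', 'TV series')
```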

Fig. 9 is a system schematic diagram of a mobile terminal device according to one embodiment of the invention. Referring to Fig. 9, in the present embodiment the mobile terminal device 900 comprises a voice receiving unit 910, a data processing unit 920, a display unit 930, and a storage unit 940. The data processing unit 920 is coupled to the voice receiving unit 910, the display unit 930, and the storage unit 940. The voice receiving unit 910 receives the first input voice SP1 and the second input voice SP2 and sends them to the data processing unit 920. The first input voice SP1 and the second input voice SP2 can be the voice inputs 501, 701. The display unit 930 is controlled by the data processing unit 920 to display the first/second candidate list 908/908'. The storage unit 940 stores a plurality of data, which can include the data of the aforementioned structured database 220 or property database 730; details are not repeated here. In addition, the storage unit 940 can be any type of memory in a server or computer system, for example dynamic random access memory (DRAM), static random access memory (SRAM), flash memory, or read-only memory (ROM). The invention is not limited in this respect, and those skilled in the art can choose according to actual demand.

In the present embodiment, the data processing unit 920 functions like the natural language understanding system 100 of Fig. 1. It performs speech recognition on the first input voice SP1 to produce the first request information 902, then analyzes the first request information 902 with natural language processing to produce the first keyword 904 corresponding to the first input voice SP1, and finds the first return answers 906 (for example, the first return answers 511/711) from the data of the storage unit 940 according to the first keyword 904 (for example, the search engine 240 performs a full-text search on the structured database 220 according to the keyword 108). When the number of first return answers 906 found is 1, the data processing unit 920 can directly perform the corresponding operation according to the type of the data corresponding to the first return answer 906. When the number of first return answers 906 is greater than 1, the data processing unit 920 organizes the first return answers 906 into a first candidate list 908 and then controls the display unit 930 to display the first candidate list 908 to the user. While the first candidate list 908 is displayed for the user to make a further choice, the data processing unit 920 receives the second input voice SP2, performs speech recognition on it to produce the second request information 902', performs natural language processing on the second request information 902' to produce the second keyword 904' corresponding to the second input voice SP2, and selects the corresponding part of the first candidate list 908 according to the second keyword 904'. The first keyword 904 and the second keyword 904' can each consist of a plurality of keywords. The second input voice SP2 can be analyzed to produce the second request information 902' and the second keyword 904' in the same manner as the voice input is analyzed in Figs. 5A and 7A, so the details are not repeated.

Similarly, when the number of second return answers 906' is 1, the data processing unit 920 can perform the corresponding operation according to the type of the second return answer 906'. When the number of second return answers 906' is greater than 1, the data processing unit 920 organizes the second return answers 906' into a second candidate list 908' and controls the display unit 930 to display it. The corresponding part is then selected according to the user's next input voice, and the corresponding operation is performed according to the number of subsequent return answers. This follows by analogy from the description above and is not repeated here.

Furthermore, data processing unit 920 can be imported a plurality of records 302 of structured database 220 (for example each minute field 308 the numeric data in the header field 304) and first the first corresponding key word 904 of voice SP1 compare (as the front to as described in Fig. 1, Fig. 3 A, 3B, the 3C).When first key word 904 of structured database 220 certain record 302 and first input voice SP1 is at least part of coupling, then this record 302 is considered as the matching result (for example generation matching result of Fig. 3 A/3B) that the first input voice SP1 produces.Wherein, if the type of data is the music shelves, then record 302 can comprise song title, singer, album name, publication time, broadcast order ... Deng; If the type of data is the image shelves, then record 302 can comprise film title, publication time, staff's (comprising the performance personnel) ... Deng; If the type of data is the webpage shelves, then record 302 can comprise web site name, type of webpage, corresponding user account ... Deng; If the type of data is the picture shelves, then record 302 can comprise picture name, pictorial information ... Deng; If the type of data is cardfile, then record 302 can comprise contact man's title, contact man's phone, contact man address ... Deng.Above-mentioned record 302 is to give an example with explanation, and records 302 and can decide according to practical application, and the embodiment of the invention is not as limit.

Then, the data processing unit 920 can judge whether the second keyword 904' corresponding to the second input voice SP2 includes an ordinal term indicating an order (for example, as in "I want the third option" or "I select the third one"). When the second keyword 904' includes an ordinal term, the data processing unit 920 selects the data at the corresponding position in the first candidate list 908 according to the ordinal term. When the second keyword 904' does not include an ordinal term, the user may have directly chosen a particular first return answer 906 in the first candidate list 908. In that case the data processing unit 920 compares the record 302 corresponding to each first return answer 906 in the first candidate list 908 with the second keyword 904' to determine the degree of correspondence between each first return answer 906 and the second input voice SP2, and then decides according to the degree of correspondence whether any first return answer 906 in the first candidate list 908 corresponds to the second input voice SP2. In one embodiment of the invention, the data processing unit 920 can decide whether a first return answer 906 in the first candidate list 908 corresponds to the second input voice SP2 according to the degree to which the first return answer 906 corresponds to the second keyword 904' (for example, a full match or a partial match), thereby simplifying the selection flow. The data processing unit 920 can select the data with the highest degree of correspondence as the data corresponding to the second input voice SP2.
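The two selection paths described above (by ordinal term, or by degree of correspondence) can be sketched as follows. This is a sketch under stated assumptions, not the embodiment's implementation: the ordinal vocabulary `ORDINALS`, the scoring rule in `degree_of_correspondence` (the number of keywords found in a record), and the sample `candidate_list` are all hypothetical.

```python
from typing import Dict, List, Optional

Record = Dict[str, str]  # hypothetical stand-in for a record 302

# Assumed ordinal vocabulary; a real system would cover many more forms.
ORDINALS = {"first": 1, "second": 2, "third": 3, "1st": 1, "2nd": 2, "3rd": 3}


def ordinal_position(keywords: List[str]) -> Optional[int]:
    """Return the 1-based position named by an ordinal keyword, if any."""
    for kw in keywords:
        if kw.lower() in ORDINALS:
            return ORDINALS[kw.lower()]
    return None


def degree_of_correspondence(record: Record, keywords: List[str]) -> int:
    """Count the keywords that occur in at least one field value (assumed metric)."""
    return sum(any(kw in value for value in record.values()) for kw in keywords)


def choose(candidates: List[Record], keywords: List[str]) -> List[Record]:
    """Select from the candidate list by ordinal term, else by best correspondence."""
    position = ordinal_position(keywords)
    if position is not None:
        # Ordinal path: pick the entry at that position, if it exists.
        return [candidates[position - 1]] if 1 <= position <= len(candidates) else []
    scores = [degree_of_correspondence(r, keywords) for r in candidates]
    best = max(scores, default=0)
    if best == 0:
        return []  # nothing in the list corresponds to the input
    return [r for r, s in zip(candidates, scores) if s == best]


candidate_list = [
    {"name": "Zhang Wei", "phone": "13911112222", "address": "Beijing"},
    {"name": "Zhang Min", "phone": "13822223333", "address": "Shanghai"},
    {"name": "Zhang Lei", "phone": "13733334444", "address": "Beijing"},
]
```

With this data, `choose(candidate_list, ["third"])` takes the ordinal path, while `choose(candidate_list, ["Beijing", "Zhang"])` leaves two equally good entries, which would then form a second candidate list. The substring test is a simplification; for instance, it does not check that a phone number actually begins with "139".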

For instance, if the first input voice SP1 is "How is the weather today", then after speech recognition and natural language processing the first keyword 904 corresponding to the first input voice SP1 can include "today" and "weather". The data processing unit 920 therefore reads the data corresponding to today's weather and displays these weather data as the first candidate list 908 through the display unit 930. Then, if the second input voice SP2 is "I want to see the third entry" or "I select the third one", after speech recognition and natural language processing the second keyword 904' can include "the third", which is interpreted as an ordinal term indicating an order. The data processing unit 920 therefore reads the third entry in the first candidate list 908 (that is, the third first return answer 906 in the first candidate list 908) and displays the corresponding weather information through the display unit 930. Alternatively, if the second input voice SP2 is "I want to see the weather in Beijing" or "I select the weather in Beijing", after speech recognition and natural language processing the second keyword 904' can include "Beijing" and "weather", so the data processing unit 920 reads the data corresponding to Beijing in the first candidate list 908. When the number of first return answers 906 selected at this point is 1, the corresponding weather information can be displayed directly through the display unit 930. When the number of selected first return answers 906 is greater than 1, a second candidate list 908' (containing at least one second return answer 906') is displayed for the user to make a further selection.

If the first input voice SP1 is "I want to phone Lao Zhang", then after speech recognition and natural language processing the first keyword 904 corresponding to the first input voice SP1 can include "phone" and "Zhang". The data processing unit 920 therefore reads the contact data whose surname is "Zhang" (by performing a full-text search on the structured database 220 and then obtaining the detailed data corresponding to the records 302), and displays the first candidate list 908 of these contact data (that is, the first return answers 906) through the display unit 930. Then, if the second input voice SP2 is "the third Lao Zhang" or "I select the third one", after speech recognition and natural language processing the second keyword 904' can include "the third", which is interpreted as an ordinal term indicating an order. The data processing unit 920 therefore reads the third entry in the first candidate list 908 (that is, the third first return answer 906) and dials the call according to the selected data. Alternatively, if the second input voice SP2 is "I select the one beginning with 139", after speech recognition and natural language processing the second keyword 904' can include "139" and "beginning". Here "139" is not interpreted as an ordinal term indicating an order, so the data processing unit 920 reads the contact data in the first candidate list 908 whose telephone number begins with 139. If the second input voice SP2 is "I want the Lao Zhang in Beijing", after speech recognition and natural language processing the second keyword 904' can include "Beijing" and "Zhang", and the data processing unit 920 reads the contact data in the first candidate list 908 whose address is in Beijing. When the number of selected first return answers 906 is 1, the call is dialed according to the selected data. When the number of selected first return answers 906 is greater than 1, the first return answers 906 selected at this point are taken as the second return answers 906' and organized into a second candidate list 908', which is displayed to the user for selection.

If the first input voice SP1 is "I want to find a restaurant", then after speech recognition and natural language processing the first keyword 904 of the first input voice SP1 can include "restaurant", and the data processing unit 920 reads all first return answers 906 corresponding to restaurants. Because such an indication is not very specific, the first candidate list 908 (containing the first return answers 906 corresponding to all restaurant data) is displayed to the user through the display unit 930, and the system waits for a further indication from the user. Then, if the user inputs "the third restaurant" or "I select the third one" through the second input voice SP2, after speech recognition and natural language processing the second keyword 904' can include "the third", which is interpreted as an ordinal term indicating an order. The data processing unit 920 therefore reads the third entry in the first candidate list 908 and displays it according to the selected data. Alternatively, if the second input voice SP2 is "I select the nearest one", after speech recognition and natural language processing the second keyword 904' can include "nearest", so the data processing unit 920 reads the data in the first candidate list 908 for the restaurant whose address is nearest to the user. If the second input voice SP2 is "I want a restaurant in Beijing", after speech recognition and natural language processing the second keyword 904' can include "Beijing" and "restaurant", so the data processing unit 920 reads the data in the first candidate list 908 for restaurants whose address is in Beijing. When the number of selected first return answers 906 is 1, the display is made according to the selected data. When the number of selected first return answers 906 is greater than 1, the first return answers 906 selected at this point are taken as the second return answers 906' and organized into a second candidate list 908', which is displayed to the user for selection.

According to the above, the data processing unit 920 can perform the corresponding operation according to the type of the data of the selected first return answer 906 (or second return answer 906'). For instance, when the type of the data corresponding to the selected first return answer 906 is a music file, the data processing unit 920 plays music according to the selected data. When the type of the selected data is a video file, the data processing unit 920 plays the video according to the selected data. When the type of the selected data is a web page file, the data processing unit 920 displays it according to the selected data. When the type of the selected data is a picture file, the data processing unit 920 displays the picture according to the selected data. When the type of the selected data is a business card file, the data processing unit 920 dials the call according to the selected data.
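The type-dependent operation described above amounts to a dispatch on the data type. A minimal sketch in Python follows; the type labels (`"music"`, `"video"`, and so on), the handler functions, and the returned strings are illustrative assumptions, and a real device would invoke its media player, browser, image viewer, or dialer instead.

```python
from typing import Callable, Dict

Record = Dict[str, str]  # hypothetical stand-in for the selected data


# Placeholder handlers: each returns a description of the action it stands for.
def play_music(record: Record) -> str:
    return f"play music: {record['title']}"


def play_video(record: Record) -> str:
    return f"play video: {record['title']}"


def show_web_page(record: Record) -> str:
    return f"show web page: {record['title']}"


def show_picture(record: Record) -> str:
    return f"show picture: {record['title']}"


def dial(record: Record) -> str:
    return f"dial: {record['phone']}"


# Assumed type labels mapped to the operation performed for that type.
OPERATIONS: Dict[str, Callable[[Record], str]] = {
    "music": play_music,
    "video": play_video,
    "webpage": show_web_page,
    "picture": show_picture,
    "contact": dial,
}


def operate(record: Record) -> str:
    """Perform the operation that corresponds to the record's data type."""
    handler = OPERATIONS.get(record["type"])
    if handler is None:
        raise ValueError(f"no operation defined for type {record['type']!r}")
    return handler(record)
```

For example, `operate({"type": "contact", "name": "Zhang Wei", "phone": "13911112222"})` takes the dialing branch. A lookup table keeps the set of supported types easy to extend, which fits the statement that the records can be defined according to the practical application.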

Fig. 10 is a system schematic diagram of an information system according to one embodiment of the invention. Referring to Fig. 9 and Fig. 10, in the present embodiment the information system 1000 comprises a mobile terminal device 1010 and a server 1020. The server 1020 can be a cloud server, a local area network server, or another similar device, but the embodiment of the invention is not limited thereto. The mobile terminal device 1010 comprises a voice receiving unit 1011, a data processing unit 1013, and a display unit 1015. The data processing unit 1013 is coupled to the voice receiving unit 1011, the display unit 1015, and the server 1020. The mobile terminal device 1010 can be a mobile communication device such as a cell phone, a personal digital assistant (PDA) phone, or a smart phone, and the invention is not limited in this respect either. The function of the voice receiving unit 1011 is similar to that of the voice receiving unit 910, and the function of the display unit 1015 is similar to that of the display unit 930. The server 1020 stores a plurality of data and has a speech recognition function.

In the present embodiment, the data processing unit 1013 can have the server 1020 perform speech recognition on the first input voice SP1 to produce the first request information 902 and then perform natural language processing on the first request information 902 to produce the first keyword 904 corresponding to the first input voice SP1. The server 1020 can perform a full-text search on the structured database 220 according to the first keyword 904 to find the first return answers 906 and send them to the data processing unit 1013. When the number of first return answers 906 is 1, the data processing unit 1013 can perform the corresponding operation according to the type of the data corresponding to the first return answer 906. When the number of first return answers 906 is greater than 1, the data processing unit 1013 organizes the first return answers 906 selected at this point into a first candidate list 908, controls the display unit 1015 to display it to the user, and waits for a further indication from the user. After the user inputs a further indication, the data processing unit 1013 can have the server 1020 perform speech recognition on the second input voice SP2 to produce the second request information 902' and then analyze the second request information 902' with natural language processing to produce the second keyword 904' corresponding to the second input voice SP2. The server 1020 selects the corresponding first return answers 906 from the first candidate list 908 as the second return answers 906' according to the second keyword 904' and sends them to the data processing unit 1013. Similarly, when the number of second return answers 906' at this point is 1, the data processing unit 1013 can perform the corresponding operation according to the type of the data corresponding to the second return answer 906'. When the number of second return answers 906' is greater than 1, the data processing unit 1013 organizes the second return answers 906' selected at this point into a second candidate list 908' and controls the display unit 1015 to display it to the user for further selection. The server 1020 can then select the corresponding part according to subsequent input voices, and the data processing unit 1013 can perform the corresponding operation according to the number of data selected. This follows by analogy from the description above and is not repeated here.

It should be noted that, in one embodiment, if the number of first return answers 906 selected according to the first keyword 904 corresponding to the first input voice SP1 is 1, the operation corresponding to that data can be performed directly. In another embodiment, a prompt can first be output to notify the user that the operation corresponding to the selected first return answer 906 is about to be performed. Moreover, in another embodiment, when the number of second return answers 906' selected according to the second keyword 904' corresponding to the second input voice SP2 is 1, the operation corresponding to that data can likewise be performed directly. Of course, in yet another embodiment, a prompt can first be output to notify the user that the operation corresponding to the selected data is about to be performed. The invention is not limited in this respect.

Furthermore, the server 1020 can compare each record 302 of the structured database 220 with the first keyword 904 corresponding to the first input voice SP1. When a record 302 at least partially matches the first keyword 904, that record 302 is regarded as data matching the first input voice SP1 and is taken as one of the first return answers 906. If the number of first return answers 906 selected according to the first keyword 904 is greater than 1, the user may input a further indication through the second input voice SP2. The indication input through the second input voice SP2 may specify an order (indicating which item of the displayed information to select), may directly select a particular item of the displayed information (for example, by directly stating the content of that item), or may call for the user's intention to be judged from the indication (for example, when the user chooses the nearest restaurant, the "nearest" restaurant is displayed to the user). The server 1020 therefore judges whether the second keyword 904' corresponding to the second input voice SP2 includes an ordinal term indicating an order. When the second keyword 904' includes an ordinal term, the server 1020 selects the first return answer 906 at the corresponding position in the first candidate list 908 according to the ordinal term. When the second keyword 904' does not include an ordinal term, the server 1020 compares each first return answer 906 in the first candidate list 908 with the second keyword 904' to determine the degree of correspondence between each first return answer 906 and the second input voice SP2, and can decide according to these degrees of correspondence whether a first return answer 906 in the first candidate list 908 corresponds to the second input voice SP2. In one embodiment of the invention, the server 1020 can decide which first return answers 906 in the first candidate list 908 correspond to the second input voice SP2 according to the degree of correspondence between the first return answers 906 and the second keyword 904', so as to simplify the selection flow. The server 1020 can select the first return answer 906 with the highest degree of correspondence as the one corresponding to the second input voice SP2.

Fig. 11 is a flow chart of a selection method based on speech recognition according to one embodiment of the invention. Referring to Fig. 11, in the present embodiment the first input voice SP1 is received (step S1100), speech recognition is performed on the first input voice SP1 to produce the first request information 902 (step S1110), and the first request information 902 is analyzed with natural language processing to produce the first keyword 904 corresponding to the first input voice (step S1120). Then, the corresponding first return answers 906 are selected from a plurality of data according to the first keyword 904 (step S1130), and it is judged whether the number of selected first return answers 906 is 1 (step S1140). When the number of selected first return answers 906 is 1, that is, the result of step S1140 is "yes", the corresponding operation is performed according to the type of the data corresponding to the first return answer 906 (step S1150). When the number of selected first return answers 906 is greater than 1, that is, the result of step S1140 is "no", the first candidate list 908 is displayed according to the selected first return answers 906 and the second input voice SP2 is received (step S1160), speech recognition is performed on the second input voice to produce the second request information 902' (step S1170), and the second request information 902' is analyzed with natural language processing to produce the second keyword 904' corresponding to the second input voice (step S1180). Then, the corresponding part of the first return answers 906 in the first candidate list 908 is selected according to the second keyword 904' obtained from the second request information 902' (step S1190), and the flow returns to step S1140 to judge whether the number of selected return answers is 1. The order of the above steps is given for illustration, and the embodiment of the invention is not limited thereto. For details of the above steps, refer to the embodiments of Fig. 9 and Fig. 10; they are not repeated here.
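The flow of Fig. 11 can be summarized as a loop that narrows the answers until exactly one remains. The sketch below makes several simplifying assumptions that are not part of the embodiment: speech recognition is stubbed out (`recognize` treats the input as already-recognized text), keyword extraction is a lookup in a fixed `vocabulary`, a record must contain every keyword to be kept, and the function simply returns `None` when the inputs are exhausted without a unique answer. All function and variable names are hypothetical.

```python
from typing import Dict, Iterable, List, Optional

Record = Dict[str, str]  # hypothetical stand-in for a stored data record


def recognize(speech: str) -> str:
    """Stub for speech recognition (S1110/S1170): input is assumed to be text."""
    return speech


def extract_keywords(request: str, vocabulary: List[str]) -> List[str]:
    """Stub for natural language processing (S1120/S1180): keep known words."""
    return [word for word in vocabulary if word in request]


def filter_answers(candidates: List[Record], keywords: List[str]) -> List[Record]:
    """Keep the records in which every keyword occurs (S1130/S1190)."""
    return [
        r for r in candidates
        if all(any(kw in value for value in r.values()) for kw in keywords)
    ]


def selection_method(
    utterances: Iterable[str], records: List[Record], vocabulary: List[str]
) -> Optional[Record]:
    """Narrow the answers with each input voice until exactly one remains."""
    candidates = records
    for speech in utterances:  # S1100, then S1160 for each later input
        keywords = extract_keywords(recognize(speech), vocabulary)
        candidates = filter_answers(candidates, keywords)
        if len(candidates) == 1:  # S1140 answered "yes"
            return candidates[0]  # caller performs the operation (S1150)
        # Otherwise a candidate list would be displayed and the loop
        # waits for the next input voice.
    return None


restaurants = [
    {"type": "restaurant", "name": "Golden Duck", "city": "Beijing"},
    {"type": "restaurant", "name": "Noodle House", "city": "Shanghai"},
    {"type": "restaurant", "name": "Tea Garden", "city": "Beijing"},
]
vocabulary = ["restaurant", "Beijing", "Shanghai"]

chosen = selection_method(
    ["I want to find a restaurant", "I want the restaurant in Shanghai"],
    restaurants,
    vocabulary,
)
```

The first utterance leaves three candidates, so a list would be shown; the second narrows them to a single record, at which point the type-dependent operation can be carried out.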

In summary, the selection method based on speech recognition and the mobile terminal device and information system of the embodiments of the invention perform speech recognition and natural language processing on the first input voice and the second input voice to identify the keywords corresponding to the first input voice and the second input voice, and then select among the return answers according to those keywords. The convenience of user operation can thereby be improved.

Next, an operating example is described in which the architecture and components of the disclosed natural language understanding system 100, the structured database 220, and so on are combined with an auxiliary activation device.

Fig. 12 is a block diagram of a voice control system according to one embodiment of the invention. Referring to Fig. 12, the voice control system 1200 comprises an auxiliary activation device 1210, a mobile terminal device 1220, and a server 1230. In the present embodiment, the auxiliary activation device 1210 can start the voice system of the mobile terminal device 1220 through a wireless transmission signal, so that the mobile terminal device 1220 communicates with the server 1230 according to a voice signal.

Specifically, the auxiliary activation device 1210 comprises a first wireless transmission module 1212 and a trigger module 1214, and the trigger module 1214 is coupled to the first wireless transmission module 1212. The first wireless transmission module 1212 is, for example, a device supporting a communication protocol such as wireless fidelity (Wi-Fi), Worldwide Interoperability for Microwave Access (WiMAX), Bluetooth, ultra-wideband (UWB), or radio-frequency identification (RFID). It can send a wireless transmission signal in order to pair with another wireless transmission module and establish a wireless link. The trigger module 1214 is, for example, a button or a key. In the present embodiment, after the user presses the trigger module 1214 to generate a trigger signal, the first wireless transmission module 1212 receives the trigger signal and starts up. The first wireless transmission module 1212 then sends the wireless transmission signal to the mobile terminal device 1220. In one embodiment, the auxiliary activation device 1210 can be a Bluetooth headset.

It should be noted that although some hands-free earphones/microphones currently also have a design for starting certain functions of the mobile terminal device 1220, in another embodiment of the invention the auxiliary activation device 1210 can differ from such an earphone/microphone. Such an earphone/microphone, through its connection with the mobile terminal device, replaces the earpiece/microphone on the mobile terminal device 1220 for listening and talking, and the start function is an additional design. By contrast, the auxiliary activation device 1210 of the invention is used "only" to turn on the voice system of the mobile terminal device 1220 and has no listening or talking function, so its internal circuit design can be simplified and its cost is lower. In other words, relative to the hands-free earphone/microphone described above, the auxiliary activation device 1210 is a separate device; that is, the user may possess both a hands-free earphone/microphone and the auxiliary activation device 1210 of the invention.

In addition, the body of the auxiliary activation device 1210 can be an article within easy reach of the user, for example an accessory such as a ring, watch, earring, necklace, or glasses, that is, any of various portable personal articles, or an installed component such as a driving accessory mounted on a steering wheel; it is not limited to the above. That is to say, the auxiliary activation device 1210 is a "lifestyle" device whose built-in system allows the user to touch the trigger module 1214 easily to turn on the voice system. For instance, when the body of the auxiliary activation device 1210 is a ring, the user can easily move a finger to press the trigger module 1214 of the ring and trigger it. Likewise, when the body of the auxiliary activation device 1210 is a device mounted on a driving accessory, the user can easily trigger the trigger module 1214 of the driving accessory while driving. In addition, compared with the discomfort of wearing an earphone/microphone to listen and talk, the auxiliary activation device 1210 of the invention can turn on the voice system in the mobile terminal device 1220, and even turn on the speakerphone function (described in detail later), so that the user can listen and talk directly through the mobile terminal device 1220 without wearing an earphone/microphone.

Moreover, because these "lifestyle" auxiliary activation devices 1210 are articles that the user already wears or uses, they cause no unfamiliarity or discomfort in use and require no time to get used to. For instance, when the user is cooking in the kitchen and needs to place a call with a mobile phone left in the living room, and the user is wearing an auxiliary activation device 1210 of the invention whose body is a ring, necklace, or watch, the user can simply touch the ring, necklace, or watch to turn on the voice system and ask a friend for the details of a recipe. Although some earphones/microphones with a start function can currently achieve the same purpose, the user does not need to call a friend every time he or she cooks, so wearing an earphone/microphone at all times while cooking, just to be able to control the mobile terminal device, is quite inconvenient for the user.

In other embodiments, the auxiliary activation device 1210 can also be provided with a wireless rechargeable battery 1216 to drive the first wireless transmission module 1212. Furthermore, the wireless rechargeable battery 1216 comprises a battery unit 12162 and a wireless charging module 12164, and the wireless charging module 12164 is coupled to the battery unit 12162. Here, the wireless charging module 12164 can receive energy supplied by a wireless power supply device (not shown) and convert this energy into electric power to charge the battery unit 12162. In this way, the first wireless transmission module 1212 of the auxiliary activation device 1210 can be conveniently powered through the wireless rechargeable battery 1216.

On the other hand, the mobile terminal device 1220 is, for example, a cell phone, a personal digital assistant (PDA) phone, a smart phone, or a Pocket PC, tablet PC, or notebook computer on which communication software is installed. The mobile terminal device 1220 can be any portable mobile device with a communication function, and its scope is not limited here. In addition, the mobile terminal device 1220 can use the Android operating system, a Microsoft operating system, the Linux operating system, and so on; it is not limited to the above.

The mobile terminal device 1220 comprises a second wireless transmission module 1222. The second wireless transmission module 1222 can be paired with the first wireless transmission module 1212 of the auxiliary activation device 1210 and adopts a corresponding wireless communication protocol (for example, Wi-Fi, WiMAX, Bluetooth, UWB, or RFID) so as to establish a wireless link with the first wireless transmission module 1212. It should be noted that the terms "first" wireless transmission module 1212 and "second" wireless transmission module 1222 are used here to indicate that the wireless transmission modules are disposed in different devices and are not intended to limit the invention.

In other embodiments, the mobile terminal device 1220 also comprises a voice system 1221 coupled to the second wireless transmission module 1222, so that after the user triggers the trigger module 1214 of the auxiliary actuating device 1210, the voice system 1221 can be started wirelessly through the first wireless transmission module 1212 and the second wireless transmission module 1222. In one embodiment, the voice system 1221 may comprise a voice sampling module 1224, a speech synthesis module 1226, and a voice output interface 1227. The voice sampling module 1224 receives voice signals from the user and is, for example, a sound-receiving device such as a microphone. The speech synthesis module 1226 can query a speech synthesis database that records, for example, text and its corresponding speech, so that the speech synthesis module 1226 can find the speech corresponding to a given text message and synthesize speech for that message. The speech synthesis module 1226 then outputs the synthesized speech through the voice output interface 1227 to play it to the user. The voice output interface 1227 is, for example, a speaker or an earphone.

In addition, the mobile terminal device 1220 may also be configured with a communication module 1228. The communication module 1228 is, for example, an element that can transmit and receive wireless signals, such as a radio-frequency transceiver. Specifically, the communication module 1228 allows the user to answer or place calls, or to use other services provided by the telecom operator, through the mobile terminal device 1220. In the present embodiment, the communication module 1228 can receive a response message from the server 1230 over the Internet and establish a call connection between the mobile terminal device 1220 and at least one electronic device according to this response message, where the electronic device is, for example, another mobile terminal device (not shown).

The server 1230 is, for example, a network server or a cloud server, and has a speech understanding module 1232. In the present embodiment, the speech understanding module 1232 comprises a speech recognition module 12322 and a speech processing module 12324, with the speech processing module 12324 coupled to the speech recognition module 12322. The speech recognition module 12322 receives the voice signal transmitted from the voice sampling module 1224 and converts it into a plurality of segment semantics (for example keywords or phrases). The speech processing module 12324 then parses what these segment semantics represent (for example intent, time, or place) and thereby determines the meaning expressed in the voice signal. The speech processing module 12324 also produces a corresponding response message according to the parsing result. In the present embodiment, the speech understanding module 1232 may be implemented as a hardware circuit composed of one or several logic gates, or as computer program code. It is worth mentioning that in another embodiment the speech understanding module 1232 may be configured in the mobile terminal device 1320, as in the speech control system 1300 shown in Figure 13. The speech understanding module 1232 of the server 1230 may operate like the natural language understanding system 100 of Figure 1A or the natural language dialogue systems 500/700/700' of Figures 5A/7A/7B.

The voice control method is described below in conjunction with the speech control system 1200. Figure 14 is a flowchart of the voice control method according to one embodiment of the invention. Referring to Figure 12 and Figure 14 together, in step S1402 the auxiliary actuating device 1210 sends a wireless transmission signal to the mobile terminal device 1220. In more detail, when the first wireless transmission module 1212 of the auxiliary actuating device 1210 is triggered by receiving a trigger signal, the auxiliary actuating device 1210 sends the wireless transmission signal to the mobile terminal device 1220. Specifically, when the user presses the trigger module 1214 of the auxiliary actuating device 1210, the trigger module 1214 is triggered by the trigger signal and causes the first wireless transmission module 1212 to send the wireless transmission signal to the second wireless transmission module 1222 of the mobile terminal device 1220, so that the first wireless transmission module 1212 links with the second wireless transmission module 1222 through the wireless communication protocol. The auxiliary actuating device 1210 is used only to turn on the voice system of the mobile terminal device 1220 and has no listening or call function, so its internal circuit design can be simplified and its cost is lower. In other words, the auxiliary actuating device 1210 is a separate device from the hands-free headset/microphone commonly added to a mobile terminal device 1220; that is, the user may have both a hands-free headset/microphone and the auxiliary actuating device 1210 of the present invention.

It is worth mentioning that the body of the auxiliary actuating device 1210 may be an article within the user's easy reach, for example a portable personal article such as a ring, watch, earring, necklace, or pair of glasses, or an installed component such as a driving accessory mounted on a steering wheel, without being limited to these. That is to say, the auxiliary actuating device 1210 is an "everyday-life" device whose built-in system lets the user easily touch the trigger module 1214 to turn on the voice system 1221. Therefore, using the auxiliary actuating device 1210 of the present invention, the voice system 1221 in the mobile terminal device 1220 can be turned on, and a speakerphone function can even be turned on as well (described in detail later), so that the user can listen and talk directly through the mobile terminal device 1220 without wearing a headset/microphone. In addition, because these "everyday-life" auxiliary actuating devices 1210 are articles the user already wears or uses, they cause no unfamiliarity or discomfort in use.

In addition, the first wireless transmission module 1212 and the second wireless transmission module 1222 can each be in a sleep mode or a working mode. In sleep mode the wireless transmission module is off: it neither receives nor detects wireless transmission signals and cannot link with other wireless transmission modules. In working mode the wireless transmission module is on: it continuously detects wireless transmission signals, or can send a wireless transmission signal at any time, and can link with other wireless transmission modules. When the trigger module 1214 is triggered while the first wireless transmission module 1212 is in sleep mode, the trigger module 1214 wakes the first wireless transmission module 1212 so that it enters working mode, sends the wireless transmission signal to the second wireless transmission module 1222, and links with the second wireless transmission module 1222 of the mobile terminal device 1220 through the wireless communication protocol.

On the other hand, to keep the first wireless transmission module 1212 from staying in working mode and consuming too much power, if the trigger module 1214 is not triggered again within a preset time (for example 5 minutes) after the first wireless transmission module 1212 enters working mode, the first wireless transmission module 1212 returns from working mode to sleep mode and stops linking with the second wireless transmission module 1222 of the mobile terminal device 1220.
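The sleep/working behavior of the first wireless transmission module 1212 can be pictured as a small state machine. The following is only a minimal sketch under stated assumptions: the class name `FirstWirelessModule`, the methods `on_trigger` and `tick`, and the explicit time arguments are hypothetical illustration devices, not part of the disclosed embodiment. It also assumes that a repeated trigger restarts the preset time.

```python
from dataclasses import dataclass, field
from enum import Enum


class Mode(Enum):
    SLEEP = "sleep"
    WORKING = "working"


@dataclass
class FirstWirelessModule:
    """Toy model of the first wireless transmission module 1212 (hypothetical API)."""
    idle_timeout_s: float = 300.0  # the 5-minute example given in the text
    mode: Mode = Mode.SLEEP
    linked: bool = False
    last_trigger_s: float = 0.0
    sent_signals: list = field(default_factory=list)

    def on_trigger(self, now_s: float) -> None:
        """Trigger module 1214 pressed: wake if asleep, then signal and link."""
        if self.mode is Mode.SLEEP:
            self.mode = Mode.WORKING
        self.last_trigger_s = now_s  # assumption: each trigger restarts the timer
        self.sent_signals.append(now_s)  # stands in for the wireless transmission signal
        self.linked = True

    def tick(self, now_s: float) -> None:
        """Go back to sleep and drop the link once the preset time has elapsed."""
        idle = now_s - self.last_trigger_s
        if self.mode is Mode.WORKING and idle >= self.idle_timeout_s:
            self.mode = Mode.SLEEP
            self.linked = False
```

Passing the current time in explicitly, rather than reading a real clock, keeps the sketch deterministic; an actual device would more likely rely on a hardware timer.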

Afterwards, in step S1404, the second wireless transmission module 1222 of the mobile terminal device 1220 receives the wireless transmission signal in order to start the voice system 1221. Then, in step S1406, when the second wireless transmission module 1222 detects the wireless transmission signal, the mobile terminal device 1220 starts the voice system 1221, and the voice sampling module 1224 of the voice system 1221 begins receiving voice signals, for example "What is the temperature today?", "Call Lao Wang.", or "Please look up a telephone number."

In step S1408, the voice sampling module 1224 sends the voice signal to the speech understanding module 1232 in the server 1230, so that the speech understanding module 1232 parses the voice signal and produces a response message. Specifically, the speech recognition module 12322 in the speech understanding module 1232 receives the voice signal from the voice sampling module 1224 and divides it into a plurality of segment semantics, and the speech processing module 12324 then performs speech understanding on these segment semantics to produce a response message that answers the voice signal.

In another embodiment of the present invention, the mobile terminal device 1220 can further receive the response message produced by the speech processing module 12324 and, accordingly, output the content of the response message through the voice output interface 1227 or perform the operation specified by the response message. In step S1410, the speech synthesis module 1226 of the mobile terminal device 1220 receives the response message produced by the speech understanding module 1232, performs speech synthesis according to the content of the response message (for example words or phrases), and produces a voice response. In step S1412, the voice output interface 1227 receives and outputs this voice response.

For example, when the user presses the trigger module 1214 of the auxiliary actuating device 1210, the first wireless transmission module 1212 sends the wireless transmission signal to the second wireless transmission module 1222, so that the mobile terminal device 1220 starts the voice sampling module 1224 of the voice system 1221. Suppose the voice signal from the user is a query, for example "What is the temperature today?". The voice sampling module 1224 receives this voice signal and sends it to the speech understanding module 1232 in the server 1230 for parsing, and the speech understanding module 1232 sends the response message produced by the parsing back to the mobile terminal device 1220. Suppose the content of the response message produced by the speech understanding module 1232 is "30 ℃". The speech synthesis module 1226 then synthesizes the "30 ℃" message into a voice response, and the voice output interface 1227 plays this voice response to the user.
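The query example, covering steps S1402 to S1412, can be sketched end to end as follows. This is an illustrative sketch only: `fake_server_understand` is a lookup table standing in for the speech understanding module 1232, the `MobileTerminal` class and its method names are hypothetical, and utterances are passed as text because real audio capture, speech recognition, and speech synthesis are outside the scope of the sketch.

```python
from typing import Callable, List


def fake_server_understand(utterance: str) -> str:
    """Stand-in for speech understanding module 1232: a canned table, not real NLU."""
    canned = {"What is the temperature today?": "30 degrees Celsius"}
    return canned.get(utterance, "Sorry, I did not understand.")


class MobileTerminal:
    """Toy model of mobile terminal device 1220 (hypothetical API)."""

    def __init__(self, understand: Callable[[str], str]):
        self.understand = understand   # plays the role of the link to server 1230
        self.voice_system_on = False   # voice system 1221 starts switched off
        self.spoken: List[str] = []    # what voice output interface 1227 played

    def on_wireless_signal(self) -> None:
        # S1404/S1406: the wireless transmission signal starts the voice system.
        self.voice_system_on = True

    def handle_utterance(self, utterance: str) -> str:
        if not self.voice_system_on:
            raise RuntimeError("voice system has not been started")
        response = self.understand(utterance)   # S1408: parse on the server
        answer = f"[synthesized] {response}"    # S1410: speech synthesis (mocked)
        self.spoken.append(answer)              # S1412: output to the user
        return answer
```

In this sketch the terminal refuses utterances until the wireless transmission signal arrives, mirroring the role of the auxiliary actuating device 1210 as the only thing that switches the voice system on.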

In another embodiment, suppose the voice signal from the user is a command, for example "Call Lao Wang.". The speech understanding module 1232 can identify this command as "a request to call Lao Wang". The speech understanding module 1232 can then produce a new response message, for example "Please confirm: call Lao Wang?", and send this new response message to the mobile terminal device 1220. The speech synthesis module 1226 synthesizes this new response message into a voice response and plays it to the user through the voice output interface 1227. Further, when the user replies with an affirmative answer such as "yes", the voice sampling module 1224 likewise receives this voice signal and transmits it to the server 1230 for the speech understanding module 1232 to parse. After parsing, the speech understanding module 1232 records dialing command information in the response message and sends it to the mobile terminal device 1220. The communication module 1228 then looks up the telephone number of "Lao Wang" in the contact information recorded in a call database and establishes a call connection between the mobile terminal device 1220 and another electronic device, that is, it calls "Lao Wang".
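The two-turn confirmation exchange can be sketched as below. Everything here is an assumption made for illustration: the class `DialDialogue`, the response dictionaries, the tiny set of affirmative words, the "call " prefix rule, and the sample phone number are hypothetical and far simpler than a real speech understanding module. The sketch does keep one detail from the text: the number is resolved on the terminal side from locally stored contact records.

```python
from typing import Dict, Optional

AFFIRMATIVE = {"yes", "ok", "sure"}  # hypothetical list of affirmative replies


class DialDialogue:
    """Toy two-turn dialogue: request, confirmation, then dial (hypothetical API)."""

    def __init__(self, contacts: Dict[str, str]):
        self.contacts = contacts             # local contact records: name -> number
        self.pending: Optional[str] = None   # name awaiting confirmation

    def server_turn(self, utterance: str) -> dict:
        """Plays the role of speech understanding module 1232 for one utterance."""
        cleaned = utterance.strip().rstrip(".!")
        text = cleaned.lower()
        if self.pending is None and text.startswith("call "):
            self.pending = cleaned[5:]       # keep the name's original casing
            return {"type": "confirm", "text": f"Please confirm: call {self.pending}?"}
        if self.pending is not None and text in AFFIRMATIVE:
            name, self.pending = self.pending, None
            return {"type": "dial", "name": name}  # dialing command information
        self.pending = None
        return {"type": "none", "text": "No action taken."}

    def terminal_act(self, response: dict) -> Optional[str]:
        """Plays the role of communication module 1228: resolve the number locally."""
        if response["type"] == "dial":
            return self.contacts.get(response["name"])
        return None
```

For example, feeding "Call Lao Wang." and then "Yes." through `server_turn` yields first a confirmation prompt and then a dial response, which `terminal_act` resolves against the local contacts.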

In other embodiments, the above method of operation may also be carried out with the speech control system 1300 or other similar systems in addition to the speech control system 1200, and is not limited to the above embodiments.

In summary, in the speech control system and method of the present embodiment, the auxiliary actuating device can wirelessly turn on the voice function of the mobile terminal device. Moreover, the body of this auxiliary actuating device may be an "everyday-life" article within the user's easy reach, for example a portable personal article or ornament such as a ring, watch, earring, necklace, or pair of glasses, or an installed component such as a driving accessory mounted on a steering wheel, without being limited to these. Thus, compared with the discomfort of additionally wearing a hands-free headset/microphone, using the auxiliary actuating device 1210 of the present invention to turn on the voice system in the mobile terminal device 1220 is more convenient.

It should be noted that the server 1230 with the speech understanding module may be a network server or a cloud server, and a cloud server may raise concerns about the user's privacy. For example, the user has to upload the complete address book to the cloud server before operations related to the address book, such as placing calls or sending text messages, can be completed. Even if the cloud server uses an encrypted connection and processes the data in transit without storing it, the user's worries are still difficult to dispel. Accordingly, another voice control method and its corresponding voice interactive system are provided below, in which the mobile terminal device can carry out interactive voice services with the cloud server without uploading the complete address book. To make the content of the present invention clearer, embodiments are given below as examples by which the present invention can actually be implemented.

Although the present invention is disclosed above by way of embodiments, these are not intended to limit the present invention. Those skilled in the art may make minor changes and refinements without departing from the spirit and scope of the present invention, so the protection scope of the present invention is defined by the claims of the present invention.

Claims

1. A selection method based on speech recognition, comprising:

receiving a first voice input;

performing speech recognition on the first voice input to produce a first keyword;

performing natural language processing on the first keyword to produce a first user intent corresponding to the first voice input;

selecting at least one first return answer according to the first user intent;

when the number of selected first return answers is 1, performing a corresponding operation according to the type of the selected first return answer;

when the number of selected first return answers is greater than 1, displaying a first candidate list comprising the first return answers;

receiving a second voice input, and performing speech recognition on the second voice input to produce a second keyword;

performing natural language processing on the second keyword to produce a second user intent corresponding to the second voice input; and

selecting a second return answer from the first return answers in the first candidate list according to the second user intent.

2. The selection method based on speech recognition as claimed in claim 1, wherein the step of selecting the first return answer according to the first user intent comprises:

comparing a record stored in a structured database with the first user intent; and

when the record and the first user intent at least partially match, regarding the record as the first return answer corresponding to the first voice input.

3. The selection method based on speech recognition as claimed in claim 1, wherein the step of selecting the second return answer from the first return answers in the first candidate list according to the second user intent comprises:

determining whether the second user intent comprises an ordinal term indicating an order;

when the second user intent comprises the ordinal term indicating an order, selecting the first return answer at the corresponding position in the first candidate list according to the ordinal term;

when the second user intent does not comprise the ordinal term indicating an order, comparing the record corresponding to each first return answer in the first candidate list with the second user intent; and

determining, according to the comparison result, which first return answer in the first candidate list corresponds to the second voice input.

4. The selection method based on speech recognition as claimed in claim 3, wherein the step of determining, according to the comparison result, which first return answer in the first candidate list corresponds to the second voice input comprises:

selecting, among the first return answers, the one with the highest matching degree as corresponding to the second voice input.

5. The selection method based on speech recognition as claimed in claim 1, wherein the step of performing the corresponding operation according to the type of the selected first return answer comprises:

when the type of the selected first return answer is a music file, playing the selected first return answer as music;

when the type of the selected first return answer is a video file, playing the selected first return answer as video;

when the type of the selected first return answer is a web page file, displaying the selected first return answer;

when the type of the selected first return answer is a picture file, displaying the selected first return answer as a picture; and

when the type of the selected first return answer is a business card file, dialing and connecting a call according to the selected first return answer.

6. A mobile terminal device, comprising:

a voice receiving unit, receiving a first voice input and a second voice input;

a display unit;

a storage unit, for storing a plurality of data; and

a data processing unit, coupled to the voice receiving unit, the display unit, and the storage unit, wherein the data processing unit performs speech recognition on the first voice input to produce a first keyword, performs natural language processing on the first keyword to produce a first user intent corresponding to the first voice input, and selects first return answers according to the first user intent; when the number of selected first return answers is 1, the data processing unit performs a corresponding operation according to the type of the selected first return answer; when the number of selected first return answers is greater than 1, the data processing unit controls the display unit to display a first candidate list comprising the first return answers; and the data processing unit performs speech recognition on the second voice input to produce a second keyword, performs natural language processing on the second keyword to produce a second user intent corresponding to the second voice input, and selects a second return answer from the first return answers in the first candidate list according to the second user intent.

7. The mobile terminal device as claimed in claim 6, wherein the data processing unit compares the record corresponding to each first return answer with the first user intent, and when the record and the first user intent at least partially match, the record is regarded as the first return answer corresponding to the first voice input.

8. The mobile terminal device as claimed in claim 6, wherein the data processing unit determines whether the second user intent comprises an ordinal term indicating an order; when the second user intent comprises the ordinal term indicating an order, the data processing unit selects the first return answer at the corresponding position in the first candidate list according to the ordinal term; when the second user intent does not comprise the ordinal term indicating an order, the data processing unit compares the record corresponding to each first return answer in the first candidate list with the second user intent to determine the matching degree between each first return answer and the second voice input, and determines according to these matching degrees which first return answer in the first candidate list corresponds to the second voice input.

9. The mobile terminal device as claimed in claim 8, wherein the data processing unit selects, among the first return answers, the one with the highest matching degree as corresponding to the second voice input.

10. The mobile terminal device as claimed in claim 6, wherein when the type of the selected first return answer is a music file, the data processing unit plays music according to the selected first return answer; when the type of the selected first return answer is a video file, the data processing unit plays video according to the selected first return answer; when the type of the selected first return answer is a web page file, the data processing unit displays the selected first return answer; when the type of the selected first return answer is a picture file, the data processing unit displays a picture according to the selected first return answer; and when the type of the selected first return answer is a business card file, the data processing unit dials and connects a call according to the selected first return answer.

11. An information system, comprising:

a server, for storing a plurality of data and having a speech recognition function; and

a mobile terminal device, comprising:

a display unit;

a data processing unit, coupled to the voice receiving unit, the display unit, and the server, wherein the data processing unit, through the server, performs speech recognition on the first voice input to produce a first keyword and performs natural language processing on the first keyword to produce a first user intent corresponding to the first voice input, and the server selects, according to the first user intent, at least one corresponding first return answer from the records comprised in a structured database and sends it to the data processing unit; when the number of selected first return answers is 1, the data processing unit performs a corresponding operation according to the type of the selected first return answer; when the number of selected first return answers is greater than 1, the data processing unit controls the display unit, according to the selected first return answers, to display a first candidate list comprising the first return answers; and the data processing unit, through the server, performs speech recognition on the second voice input to produce a second keyword and performs natural language processing on the second keyword to produce a second user intent corresponding to the second voice input, and the server selects a second return answer from the first return answers in the first candidate list according to the second user intent and sends it to the data processing unit.

12. The information system as claimed in claim 11, wherein the server compares the record of each first return answer with the first user intent, and when the record and the first user intent at least partially match, the record is regarded as the first return answer corresponding to the first voice input.

13. The information system as claimed in claim 11, wherein the server determines whether the second user intent comprises an ordinal term indicating an order; when the second user intent comprises the ordinal term indicating an order, the server selects the first return answer at the corresponding position in the first candidate list according to the ordinal term; when the second user intent does not comprise the ordinal term indicating an order, the server compares each record in the first candidate list with the second user intent to determine the matching degree between each first return answer and the second voice input, and determines according to the matching degree which first return answer in the first candidate list corresponds to the second voice input.

14. The information system as claimed in claim 13, wherein the server selects, among the first return answers, the one with the highest matching degree as corresponding to the second voice input.

15. The information system as claimed in claim 11, wherein when the type of the selected first return answer is a music file, the data processing unit plays music according to the selected first return answer; when the type of the selected first return answer is a video file, the data processing unit plays video according to the selected first return answer; when the type of the selected first return answer is a web page file, the data processing unit displays the selected first return answer; when the type of the selected first return answer is a picture file, the data processing unit displays a picture according to the selected first return answer; and when the type of the selected first return answer is a business card file, the data processing unit dials and connects a call according to the selected first return answer.

16. A selection method based on speech recognition, comprising:

performing speech recognition on a first voice input to produce a first keyword;

retrieving from a structured database according to the first keyword to obtain at least one first return answer;

after displaying the first candidate list, receiving a second voice input, and performing speech recognition on the second voice input to produce a second keyword; and

17. The selection method based on speech recognition as claimed in claim 16, wherein the step of retrieving from a structured database according to the first keyword to obtain at least one first return answer comprises:

when a record of the structured database and the first keyword at least partially match, regarding the record as the first return answer corresponding to the first voice input.

18. The selection method based on speech recognition as claimed in claim 16, wherein the step of selecting the second return answer from the first return answers in the first candidate list according to the second keyword comprises:

when the second keyword comprises the ordinal term indicating an order, selecting the first return answer at the corresponding position in the first candidate list according to the ordinal term;

when the second keyword does not comprise the ordinal term indicating an order, comparing the record corresponding to each first return answer in the first candidate list with the second keyword; and

19. The selection method based on speech recognition as claimed in claim 18, wherein the step of determining, according to the comparison result, which first return answer in the first candidate list corresponds to the second voice input comprises:

20. The selection method based on speech recognition as claimed in claim 16, wherein the step of performing the corresponding operation according to the type of the selected first return answer comprises: