TWI578175B - Searching method, searching system and natural language understanding system - Google Patents


Publication number: TWI578175B
Authority: TW (Taiwan)
Application number: TW102149041A
Other languages: Chinese (zh)
Other versions: TW201428517A (en)
Inventors: 張國峰, 朱逸斐
Original Assignee: 威盛電子股份有限公司 (VIA Technologies, Inc.)
Priority claims: CN2012105930648A (CN103049567A), CN2013101845443A (CN103218463A), TW102121406, CN201310690513.5A (CN103761242B)
Application TW102149041A filed by 威盛電子股份有限公司
Publication of TW201428517A; application granted; publication of TWI578175B


Description

Search method, retrieval system, and natural language understanding system

The present invention relates to a retrieval technique, and more particularly to a retrieval method, a retrieval system, and a natural language understanding system for performing full-text retrieval on a structured database.

In computer-based natural language understanding, a specific grammar is usually used to capture the intent or information in the user's input sentence. If enough data about user input sentences is stored in a database, a reasonable judgment can therefore be made.

In existing practice, a built-in fixed word list is used to capture the user's input sentence. The fixed word list contains the specific terms used to express particular intents or information, and the user must express his or her intent or information according to those specific terms before the system can identify it correctly. Forcing the user to memorize every specific term in the fixed word list is, however, quite unnatural. For example, a prior-art implementation based on a fixed word list requires that a user asking about the weather must say: "What is the weather in Shanghai (or Beijing) tomorrow (or the day after tomorrow)?" If the user instead asks about the weather with a more natural, colloquial expression such as "How about Shanghai tomorrow?", the word "weather" does not appear in the sentence, so the prior art understands it as "there is a place in Shanghai called Tomorrow", which obviously fails to capture the user's true intent. In addition, the sentence patterns used by users are very varied, often change over time, and sometimes contain errors, in which case the user's input sentence must be captured by fuzzy matching. A fixed word list that only provides rigid input rules therefore performs even worse.

In addition, when natural language understanding is used to handle multiple kinds of user intent, some different intents share the same grammatical structure. For example, when the user's input sentence is "I want to see the Romance of the Three Kingdoms", the user may intend to watch a movie of the Romance of the Three Kingdoms or to read the book, and in such a case two possible intents are usually presented for the user to choose from. In many other cases, however, offering unnecessary possible intents for the user to choose from is superfluous and inefficient. For example, when the user's input sentence is "I want to see Super Star Avenue", there is no need to match the user's intent to a book or a painting named Super Star Avenue, because Super Star Avenue is a television show.

Furthermore, the results obtained by a conventional full-text search are generally unstructured data, in which the information is scattered and unrelated. For example, after keywords are entered into a search engine such as Google or Baidu, the web search results obtained are unstructured data, and the results must be read by a human to find the useful information. This not only wastes the user's time but may also miss the information the user wants, so its practicality is quite limited.

The invention provides a retrieval method and a retrieval system that perform a full-text search on a structured database, so that the search results obtained by the full-text search are meaningful structured data.

The invention further provides a natural language understanding system for assisting in determining the intent expressed by the user's request information by performing a full-text search on the structured database.

The invention provides a retrieval system comprising a structured database and a search engine. The structured database stores multiple records, and the numerical data contained in each record are related to each other and together express the intent that request information from a user has toward that record. The search engine is configured to perform a full-text search on the structured database, and when a piece of numerical data is matched, the guidance data corresponding to that numerical data is output to confirm the intent of the request information.

The present invention provides a natural language understanding system including a natural language processor, a knowledge assisted understanding module, and a retrieval system. The natural language processor analyzes the user's request information into at least one piece of possible intent grammar data, each of which includes at least one keyword and one piece of intent data. The knowledge assisted understanding module, coupled to the natural language processor, is configured to determine, among the possible intent grammar data, the determined intent grammar data that expresses the intent of the user's request information. The aforementioned retrieval system includes a structured database and a search engine. The structured database stores multiple records, and the search engine performs a full-text search of the structured database. The knowledge assisted understanding module transmits the keyword to the retrieval system and uses the retrieval system's response to assist in determining the determined intent grammar data.

The present invention proposes a retrieval method in which a structured database storing multiple records is first provided, and a full-text search is then performed on the structured database.

According to an embodiment of the present invention, each of the foregoing records includes a title bar, the title bar includes at least one sub-column, and each sub-column includes a guide bar and a value column, where the guide bar of a record stores guidance data and the value column of the record stores numerical data.

According to an embodiment of the invention, each of the foregoing records further includes a content column, and the content column of a record stores the detailed content of that record.

According to an embodiment of the present invention, when a plurality of sub-columns of data are stored in the title bar of a record, a first special character is stored between the sub-columns to separate their data, and a second special character is stored between the guide bar and the value column to separate the data of the guide bar from the data of the value column.

In accordance with an embodiment of the invention, the sub-columns in the title bar have a fixed number of bits.

The present invention provides a retrieval system comprising a structured database and a search engine. The structured database stores at least one record, each record includes at least one column, and the data stored in the columns are attributes that together describe the record. The search engine performs a full-text search on the structured database according to a keyword of the request information, and when a column of at least one record of the structured database matches the keyword, the guidance data corresponding to that column is output to confirm the intent of the request information.

The present invention provides a retrieval method comprising: receiving a keyword, wherein the keyword is generated from request information; and performing a full-text search on a structured database according to the keyword, wherein the structured database stores at least one record, each record includes at least one column, and the data stored in the columns are attributes that together describe the record; when a column of at least one record of the structured database matches the keyword, the guidance data corresponding to that column is output to confirm the intent of the request information.

Based on the above, the present invention uses the keywords contained in the user's request information to perform a full-text search on records having a specific data structure in the structured database, so as to assist in judging the intent expressed by the user's request information.

The above described features and advantages of the invention will be apparent from the following description.

100, 520, 520’, 720, 720’ ‧ ‧ natural language understanding system

102, 503, 503', 703, 902, 902'‧‧‧ request information

104‧‧‧Analysis results

106‧‧‧possible intent grammar data

108, 509, 509’, 711, 904, 904’ ‧ ‧ keywords

110‧‧‧response result

112‧‧‧intent data

114‧‧‧determined intent grammar data

116‧‧‧Analysis Results Output Module

200‧‧‧retrieval system

220‧‧‧structured database

240‧‧‧search engine

260‧‧‧retrieval interface unit

280‧‧‧guidance data storage device

300‧‧‧Natural Language Processor

302, 832, 834, 836, 838 ‧ ‧ records

304‧‧‧ title bar

306‧‧‧Content bar

308‧‧‧sub-column

310‧‧‧guide bar

312‧‧‧Value column

314‧‧‧Source column

316‧‧‧heat column

318, 852, 854‧‧‧preference column

320, 862, 864‧‧‧aversion column

400‧‧‧Knowledge-assisted understanding module

500, 500’, 700, 700’‧‧‧ Natural Language Dialogue System

501, 701‧‧‧ voice input

507, 507’, 707‧ ‧ voice response

510, 710‧‧‧ voice sampling module

511, 511', 711, 906, 906' ‧ ‧ return answers

513, 513’, 713 ‧ ‧ voice

522, 722‧‧‧ voice recognition module

524, 724‧‧‧ natural language processing module

526, 726‧‧‧Speech synthesis module

530, 740‧‧‧Speech synthesis database

702‧‧‧voice integrated processing module

715‧‧‧User preference information

717‧‧‧User preference record

730‧‧‧Characteristic Database

872, 874‧‧‧columns

900, 1010‧‧‧ mobile terminal devices

908, 908’‧‧‧ Candidate List

910, 1011‧‧‧ voice receiving unit

920, 1013‧‧‧ Data Processing Unit

930, 1015‧‧‧ display unit

940‧‧‧storage unit

1000‧‧‧Information System

1020‧‧‧Server

SP1‧‧‧ first voice

SP2‧‧‧second voice

1200, 1300‧‧‧ voice control system

1210‧‧‧Auxiliary starter

1212, 1222‧‧‧ wireless transmission module

1214‧‧‧ Trigger Module

1216‧‧‧Wireless rechargeable battery

12162‧‧‧ battery unit

12164‧‧‧Wireless charging module

1220, 1320‧‧‧ mobile terminal devices

1221‧‧‧ voice system

1224‧‧‧Voice sampling module

1226‧‧‧Speech synthesis module

1227‧‧‧Voice output interface

1228‧‧‧Communication Module

1230‧‧‧ (cloud) server

1232‧‧‧Voice Understanding Module

12322‧‧‧Voice recognition module

12324‧‧‧Voice Processing Module

S410~S450‧‧‧ steps of a retrieval method according to an embodiment of the present invention

S510~S590‧‧‧ steps of the working process of the natural language understanding system according to an embodiment of the invention

S602, S604, S606, S608, S610, S612‧‧‧ steps of modifying the voice response

S802~S890‧‧‧ steps of a natural language dialogue method according to an embodiment of the present invention

S1100~S1190‧‧‧ steps of a speech recognition based selection method according to an embodiment of the invention

S1402~S1412‧‧‧ steps of a voice manipulation method according to an embodiment of the present invention

FIG. 1 is a block diagram of a natural language understanding system in accordance with an embodiment of the present invention.

FIG. 2 is a diagram showing the results of analysis of various request information of a user by a natural language processor according to an embodiment of the present invention.

FIG. 3A is a schematic diagram of a plurality of records having a particular data structure stored by a structured database, in accordance with an embodiment of the present invention.

FIG. 3B is a schematic diagram of a plurality of records having a particular data structure stored by a structured database in accordance with another embodiment of the present invention.

FIG. 3C is a schematic diagram of the guidance data stored by the guidance data storage device according to an embodiment of the invention.

FIG. 4A is a flowchart of a retrieval method in accordance with an embodiment of the present invention.

FIG. 4B is a flowchart showing the operation of a natural language understanding system in accordance with another embodiment of the present invention.

FIG. 5A is a block diagram of a natural language dialogue system according to an embodiment of the invention.

FIG. 5B is a block diagram of a natural language understanding system according to an embodiment of the invention.

FIG. 5C is a block diagram of a natural language dialogue system according to another embodiment of the present invention.

FIG. 6 is a flow chart of a method for modifying a voice response according to an embodiment of the invention.

FIG. 7A is a block diagram of a natural language dialogue system according to an embodiment of the invention.

FIG. 7B is a block diagram of a natural language dialogue system according to another embodiment of the present invention.

FIG. 8A is a flowchart of a natural language dialogue method according to an embodiment of the invention.

FIG. 8B is a schematic diagram of a plurality of records having a particular data structure stored by a structured database in accordance with yet another embodiment of the present invention.

FIG. 9 is a schematic diagram of a system of a mobile terminal device according to an embodiment of the invention.

FIG. 10 is a schematic diagram of a system of an information system according to an embodiment of the invention.

FIG. 11 is a flowchart of a selection method based on speech recognition according to an embodiment of the present invention.

FIG. 12 is a block diagram of a voice control system according to an embodiment of the invention.

FIG. 13 is a block diagram of a voice control system according to another embodiment of the invention.

FIG. 14 is a flowchart of a voice control method according to an embodiment of the invention.

Since the existing fixed-word-list implementation can only provide rigid input rules, its ability to judge the user's variable input sentences is very limited, so it often misjudges the user's intent and fails to find the required information, or outputs unnecessary information to the user because of insufficient judgment. In addition, existing search engines can only provide users with scattered, unrelated search results, so users have to spend time filtering out the required information one item at a time, which not only wastes time but may also miss the required information. For the aforementioned problems of the prior art, the present invention proposes a structured data retrieval method and system that provide specific columns in the structured data to store different types of data elements, allow users to input information for retrieval in natural speech, quickly and correctly determine the user's intent, and then provide the required information to the user or provide more precise information for the user's selection.

FIG. 1 is a block diagram of a natural language understanding system in accordance with an embodiment of the present invention. As shown in FIG. 1, the natural language understanding system 100 includes a retrieval system 200, a natural language processor 300, and a knowledge assisted understanding module 400. The knowledge assisted understanding module 400 is coupled to the natural language processor 300 and the retrieval system 200, and the retrieval system 200 further includes a structured database 220, a search engine 240, and a retrieval interface unit 260. The search engine 240 is coupled to the structured database 220 and the retrieval interface unit 260. In the present embodiment the retrieval system 200 includes the retrieval interface unit 260, but this is not intended to limit the present invention; in some embodiments the retrieval interface unit 260 may be omitted, and the keyword 108 may instead be received in other ways (for example, through an API (Application Programming Interface) call) to cause the search engine 240 to perform a full-text search of the structured database 220.

When the user issues the request information 102 to the natural language understanding system 100, the natural language processor 300 analyzes the request information 102 and sends the resulting possible intent grammar data 106 to the knowledge assisted understanding module 400, where each possible intent grammar data 106 includes a keyword 108 and an intent data 112. The knowledge assisted understanding module 400 then extracts the keywords 108 from the possible intent grammar data 106 and sends them to the retrieval system 200, and stores the intent data 112 within the knowledge assisted understanding module 400; the search engine 240 in the retrieval system 200 performs a full-text search on the structured database 220 according to the keyword 108 and transmits the response result 110 of the full-text search back to the knowledge assisted understanding module 400. The knowledge assisted understanding module 400 then compares the stored intent data 112 against the response result 110 and sends the resulting determined intent grammar data 114 to the analysis result output module 116, and the analysis result output module 116, based on the determined intent grammar data 114, transmits the analysis result 104 to a server (not shown), which queries the information required by the user and then sends it to the user. It should be noted that the analysis result 104 may include the keyword 108, and may also include part of the information of the record containing the keyword 108 (for example, the number of the record 302 of FIG. 3A or 3B), or all of it. In addition, the analysis result 104 can be converted by the server directly into a voice output to the user, or converted into the corresponding voice output after specific processing (the specific methods, and the content and information involved, are described in detail later). The information output by the retrieval system 200 can be designed by a person skilled in the art according to actual needs, and the present invention is not limited in this respect.
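
To make the data flow above concrete, the following is a minimal Python sketch of the interaction just described, with the natural language processor, the retrieval system, and the intent comparison modeled as plain callables; all names and signatures here are illustrative assumptions rather than the patent's implementation.

```python
# A minimal sketch (not the patent's implementation) of the flow just described.
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class IntentGrammarData:          # a possible intent grammar data 106
    intent: str                   # intent data 112, e.g. "<watchfilm>"
    keyword: str                  # keyword 108, e.g. "Let the Bullets Fly"


def knowledge_assisted_understanding(
    request_info: str,                                        # request information 102
    parse: Callable[[str], List[IntentGrammarData]],          # natural language processor 300
    full_text_search: Callable[[str], Optional[str]],         # retrieval system 200; returns guidance data
    matches_intent: Callable[[str, str], bool],               # compares guidance data with intent data 112
) -> Optional[IntentGrammarData]:
    """Sketch of the knowledge assisted understanding module 400."""
    candidates = parse(request_info)                          # possible intent grammar data 106
    for candidate in candidates:
        guidance_data = full_text_search(candidate.keyword)   # response result 110
        if guidance_data and matches_intent(guidance_data, candidate.intent):
            return candidate                                  # determined intent grammar data 114
    return None
```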

The analysis result output module 116 may be combined with other modules; for example, in one embodiment it may be incorporated into the knowledge assisted understanding module 400, or in another embodiment it may be separated from the natural language understanding system 100 and located in a server (for example, a server that includes the natural language understanding system 100), in which case the server receives the determined intent grammar data 114 for processing. In addition, the natural language understanding system 100 can store the intent data 112 in a storage device inside the knowledge assisted understanding module 400, elsewhere in the natural language understanding system 100, in a server (for example, one that includes the natural language understanding system 100), or in any storage device from which the knowledge assisted understanding module 400 can retrieve it; the present invention is not limited in this respect. Moreover, the retrieval system 200, the natural language processor 300, and the knowledge assisted understanding module 400 included in the natural language understanding system 100 can each be implemented in hardware, software, firmware, or any combination of these, and the present invention is not limited thereto.

The aforementioned natural language understanding system 100 may be located in a cloud server, in a server in a local area network, or even in a personal computer, a mobile computing device (such as a notebook computer), or a mobile communication device (such as a mobile phone). The components of the natural language understanding system 100 or the retrieval system 200 need not be disposed in the same machine; depending on actual needs, they may be distributed across different devices or systems that communicate through various communication protocols. For example, the natural language processor 300 and the knowledge assisted understanding module 400 can be configured in the same smart phone while the retrieval system 200 is configured in a cloud server; or the retrieval interface unit 260, the natural language processor 300, and the knowledge assisted understanding module 400 can be configured in the same notebook computer while the search engine 240 and the structured database 220 are configured in another server in the local area network. In addition, when the natural language understanding system 100 is located at a server (whether a cloud server or a local area network server), the retrieval system 200, the natural language processor 300, and the knowledge assisted understanding module 400 may be configured in different computer hosts, with the server's main system coordinating the transmission of information and data between them. Of course, two or all of the retrieval system 200, the natural language processor 300, and the knowledge assisted understanding module 400 can also be combined into one computer host according to actual needs, and the present invention does not limit the configuration of this part.

In an embodiment of the present invention, the user can send the request information to the natural language processor 300 in various ways, for example by spoken voice input or by a text description. For example, if the natural language understanding system 100 is located in a server (not shown) in the cloud or in a local area network, the user can first input the request information 102 with a mobile device (such as a mobile phone, a PDA, a tablet computer, or the like), and the request information 102 is then transmitted through the telecommunication system provider to the natural language understanding system 100 in the server so that the natural language processor 300 can analyze the request information 102. Finally, after the user's intent is confirmed, the server, through the analysis result output module 116 and the corresponding analysis result 104, queries the information requested by the user and transmits it back to the user's mobile device. For example, the request information 102 may be a question for which the user expects the natural language understanding system 100 to provide an answer (for example, "How is the weather in Shanghai tomorrow?"); when the natural language understanding system 100 determines that the user's intent is to query tomorrow's weather in Shanghai, the queried weather data is sent to the user through the analysis result output module 116 as the analysis result 104. On the other hand, if the user's instruction to the natural language understanding system 100 is "I want to see Let the Bullets Fly" or "I want to listen to The Days We Walked Through Together", then because "Let the Bullets Fly" or "The Days We Walked Through Together" may belong to different fields, the natural language processor 300 parses the user's request information 102 into one or more possible intent grammar data 106, which include the keyword 108 and the intent data 112, and the user's intent is further confirmed after the search engine 240 performs a full-text search on the structured database 220 in the retrieval system 200.

Further, when the user's request information 102 is "How is the weather in Shanghai tomorrow?", the natural language processor 300 can generate one possible intent grammar data 106 after analysis: "<queryweather>, <city>=Shanghai, <time>=tomorrow".

In an embodiment, if the natural language understanding system 100 considers the user's intent to be sufficiently clear, the user's intent (that is, querying tomorrow's weather in Shanghai) can be output directly to the server through the analysis result output module 116, and the server can return the weather specified by the user. For another example, when the user's request information 102 is "I want to see the Romance of the Three Kingdoms", the natural language processor 300 can generate three possible intent grammar data 106 after analysis: "<readbook>, <bookname>=Romance of the Three Kingdoms"; "<watchTV>, <TVname>=Romance of the Three Kingdoms"; and "<watchfilm>, <filmname>=Romance of the Three Kingdoms".

This is because the keyword 108 (that is, "Romance of the Three Kingdoms") in the possible intent grammar data 106 may belong to different fields, namely books (<readbook>), television dramas (<watchTV>), and movies (<watchfilm>), so one request information 102 can be analyzed into a plurality of possible intent grammar data 106, and further analysis by the knowledge assisted understanding module 400 is needed to confirm the user's intent. As another example, if the user enters "I want to see Let the Bullets Fly", then because "Let the Bullets Fly" may be a movie name or a book name, there may be at least the following two possible intent grammar data 106: "<readbook>, <bookname>=Let the Bullets Fly"; and "<watchfilm>, <filmname>=Let the Bullets Fly"; these belong to the two fields of books and movies. The above possible intent grammar data 106 are then further analyzed by the knowledge assisted understanding module 400 to obtain the determined intent grammar data 114 that expresses the clear intent of the user's request information. When analyzing the possible intent grammar data 106, the knowledge assisted understanding module 400 can transmit the keyword 108 (for example, "Romance of the Three Kingdoms" or "Let the Bullets Fly") to the retrieval system 200 via the retrieval interface unit 260. The structured database 220 in the retrieval system 200 stores a plurality of records having a particular data structure, and the search engine 240 can perform a full-text search of the structured database 220 with the keyword 108 received by the retrieval interface unit 260 and transmit the response result 110 obtained by the full-text search back to the knowledge assisted understanding module 400, from which the knowledge assisted understanding module 400 can obtain the determined intent grammar data 114. The details of how the full-text search of the structured database 220 determines the intent grammar data 114 are described more fully later with reference to FIGS. 3A and 3B and the related paragraphs.

In the concept of the present invention, the natural language understanding system 100 can first extract the keyword 108 from the user's request information 102 and determine the domain attribute of the keyword 108 from the result of the full-text search of the structured database 220. For example, as described above, entering "I want to see the Romance of the Three Kingdoms" generates possible intent grammar data 106 belonging to the three fields of books, television dramas, and movies, which are then further analyzed to confirm the user's clear intent. Users can therefore easily express their intent or information in a colloquial manner without having to memorize specific terms, such as the specific terms of the fixed word list in the existing practice.

FIG. 2 is a diagram showing the results of the analysis of various request information of a user by the natural language processor 300, in accordance with an embodiment of the present invention.

As shown in FIG. 2, when the user's request information 102 is "How is the weather in Shanghai tomorrow?", the natural language processor 300 may generate, after analysis, one possible intent grammar data 106: "<queryweather>, <city>=Shanghai, <time>=tomorrow".

Here the intent data 112 is "<queryweather>", and the keywords 108 are "Shanghai" and "tomorrow". Since only one set of intent grammar data 106 (query weather, <queryweather>) is obtained after analysis by the natural language processor 300, in an embodiment the knowledge assisted understanding module 400 can directly send the keywords 108 "Shanghai" and "tomorrow" to the server as the analysis result 104 to look up the weather information (for example, to query tomorrow's weather profile for Shanghai, including weather, temperature, and so on), without performing a full-text search on the structured database 220 to determine the user's intent (because the knowledge assisted understanding module 400 can confirm the user's intent from the single possible intent grammar data 106 generated from the request information 102). Of course, in an embodiment the structured database 220 can still be searched in full text for a more accurate determination of the user's intent, and those skilled in the art can make changes according to actual needs.

In addition, when the user's request information 102 is "I want to see Let the Bullets Fly", two possible intent grammar data 106 can be generated: "<readbook>, <bookname>=Let the Bullets Fly" and "<watchfilm>, <filmname>=Let the Bullets Fly", with two corresponding intent data 112, "<readbook>" and "<watchfilm>", and two identical keywords 108, "Let the Bullets Fly", indicating that the intent may be to read the book "Let the Bullets Fly" or to watch the movie "Let the Bullets Fly". To further confirm the user's intent, the knowledge assisted understanding module 400 transmits the keyword 108 "Let the Bullets Fly" to the retrieval interface unit 260, and the search engine 240 then performs a full-text search on the structured database 220 with the keyword 108 "Let the Bullets Fly" to confirm whether "Let the Bullets Fly" is a book title or a movie name, and thereby confirm the user's intent.

Furthermore, when the user's request information 102 is "I want to listen to The Days We Walked Through Together", two possible intent grammar data 106 can be generated: "<playmusic>, <singer>=walked through together, <songname>=the days" and "<playmusic>, <songname>=the days we walked through together", with two corresponding identical intent data 112 "<playmusic>" and two sets of corresponding keywords 108: "walked through together" together with "the days", and "the days we walked through together". These indicate that the intent may be to listen to the song "the days" sung by the singer "walked through together", or to listen to the song "The Days We Walked Through Together". At this time the knowledge assisted understanding module 400 can transmit the first set of keywords 108 ("walked through together" and "the days") and the second set of keywords ("the days we walked through together") to the retrieval interface unit 260, to confirm whether there is a song called "the days" sung by a singer called "walked through together" (the user intent implied by the first set of keywords), or whether there is a song called "The Days We Walked Through Together" (the user intent implied by the second set of keywords), and thereby confirm the user's intent. However, the present invention is not limited to the format and names used here to represent each possible intent grammar data and intent data.
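
For illustration only, the three requests analyzed in FIG. 2 can be written as plain Python dictionaries (intent data plus keyword slots) as sketched below; the tag names follow the examples in the text, while the dictionary representation itself is an assumption.

```python
# Illustrative representation of the possible intent grammar data 106 of FIG. 2.
possible_intent_grammar_data = {
    "How is the weather in Shanghai tomorrow?": [
        {"intent": "<queryweather>", "city": "Shanghai", "time": "tomorrow"},
    ],
    "I want to see Let the Bullets Fly": [
        {"intent": "<readbook>",  "bookname": "Let the Bullets Fly"},
        {"intent": "<watchfilm>", "filmname": "Let the Bullets Fly"},
    ],
    "I want to listen to The Days We Walked Through Together": [
        {"intent": "<playmusic>", "singer": "walked through together", "songname": "the days"},
        {"intent": "<playmusic>", "songname": "the days we walked through together"},
    ],
}
```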

FIG. 3A is a schematic diagram of a plurality of records having a particular data structure stored by the structured database 220, in accordance with an embodiment of the present invention.

In general, in some existing full-text search methods the search results obtained are unstructured data (for example those returned by Google or Baidu), because the information in the search results is scattered and unrelated, so the user must review each item of information one by one, which limits practicality. In the concept of the present invention, however, the efficiency and correctness of retrieval can be effectively improved by the structured database, because the numerical data contained in each record of the structured database disclosed by the present invention are related to each other and together express the attributes of the record. Therefore, when the search engine performs a full-text search on the structured database and a record's numerical data matches the keyword, the guidance data corresponding to that numerical data is output to confirm the intent of the request information. The implementation details of this part are further described by the following examples.

In the embodiment of the present invention, each record 302 stored in the structured database 220 includes a title bar 304 and a content bar 306. The title bar 304 includes a plurality of sub-columns 308, each of which includes a guide bar 310 and a value column 312. The guide bars 310 of the records 302 are used to store guidance data, and the value columns 312 of the records 302 are used to store numerical data. Taking record 1 shown in FIG. 3A as an example, the three sub-columns 308 in the title bar 304 of record 1 respectively store: "singerguid: Andy Lau", "songnameguid: the days we walked through together", and "songtypeguid: Hong Kong and Taiwan, Cantonese, popular". The guide bars 310 of these sub-columns 308 store the guidance data "singerguid", "songnameguid", and "songtypeguid" respectively, and the corresponding value columns 312 store the numerical data "Andy Lau", "the days we walked through together", and "Hong Kong and Taiwan, Cantonese, popular" respectively. The guidance data "singerguid" indicates that its numerical data is a singer's name, "songnameguid" indicates that its numerical data is a song name, and "songtypeguid" indicates that its numerical data ("Hong Kong and Taiwan, Cantonese, popular") is a song type. Each guidance data here may be represented by a different specific string or word and is not limited to those shown. The content bar 306 of record 1 stores the lyrics of "the days we walked through together" or other data (such as the composer and lyricist); the actual data in the content bar 306 of each record is not the focus of the present invention and is therefore only schematically depicted in FIG. 3A.
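
For readability only, record 1 of FIG. 3A can be written as a plain Python dictionary as sketched below; this representation is an assumption made for illustration and is not the storage format used by the structured database 220.

```python
# Record 1 of FIG. 3A, sketched as a dictionary.
record_1 = {
    "title_bar": {                                   # title bar 304: three sub-columns 308
        # guide bar 310 (guidance data)  ->  value column 312 (numerical data)
        "singerguid": "Andy Lau",
        "songnameguid": "the days we walked through together",
        "songtypeguid": "Hong Kong and Taiwan, Cantonese, popular",
    },
    "content_bar": "lyrics, composer/lyricist and other details",   # content bar 306
}
```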

In the foregoing embodiment, each record includes a title bar 304 and a content bar 306, and each sub-column 308 in the title bar 304 includes a guide bar 310 and a value column 312, but this is not intended to limit the present invention; some embodiments may have no content bar 306, and some embodiments may even have no guide bar 310.

In addition, in the embodiment of the present invention, a first special character is stored between the data of adjacent sub-columns 308 to separate the data of each sub-column 308, and a second special character is stored between the data of the guide bar 310 and the value column 312 to separate the guide bar data from the value column data. For example, as shown in FIG. 3A, "singerguid" and "Andy Lau", "songnameguid" and "the days we walked through together", and "songtypeguid" and "Hong Kong and Taiwan, Cantonese, popular" are each separated by the second special character ":", and the sub-columns 308 of record 1 are separated from one another by the first special character "|". The present invention is not, however, limited to ":" or "|" as the special characters used for separation.
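
The following minimal sketch shows how a title bar could be serialized and parsed with the two separators described above, assuming ":" as the second special character and "|" as the first special character, as in FIG. 3A.

```python
# Illustrative separators; the patent allows other special characters.
FIRST_SPECIAL = "|"     # separates sub-columns 308
SECOND_SPECIAL = ":"    # separates guide bar 310 from value column 312


def serialize_title_bar(sub_columns: dict) -> str:
    return FIRST_SPECIAL.join(
        f"{guide}{SECOND_SPECIAL}{value}" for guide, value in sub_columns.items()
    )


def parse_title_bar(title_bar: str) -> dict:
    return dict(
        sub_column.split(SECOND_SPECIAL, 1)
        for sub_column in title_bar.split(FIRST_SPECIAL)
    )


# serialize_title_bar(record_1["title_bar"]) would give, for example:
# "singerguid:Andy Lau|songnameguid:the days we walked through together|songtypeguid:Hong Kong and Taiwan, Cantonese, popular"
```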

On the other hand, in the embodiment of the present invention, each sub-column 308 in the title bar 304 may have a fixed length, for example 32 words, and the guidance data indicator in the guide bar 310 may occupy 7 or 8 bits (allowing up to 128 or 256 different guidance data); in addition, the numbers of bits required for the first special character and the second special character can be fixed, so after the guide bar 310, the first special character, and the second special character are deducted from the fixed length of the sub-column 308, the remaining bits can be used to store the numerical data of the value column 312. Furthermore, since the length of each sub-column 308 is fixed, the data stored in a sub-column 308 can be laid out in sequence as the guide bar 310 (the indicator of the guidance data), the second special character, the numerical data of the value column 312, and the first special character, and, as mentioned above, the lengths of these four pieces of data are also fixed. In practice, therefore, the bits of the guide bar 310 can be skipped (for example, by skipping the first 7 or 8 bits), followed by the bits of the second special character (for example, skipping 1 word, that is, 8 bits), and after deducting the bits occupied by the first special character at the end (for example, the last word, 8 bits), the numerical data of the value column 312 can be obtained directly. For example, in the first sub-column 308 of record 1, 32-3=29 words remain for storing the numerical data of the value column 312, where the 3 (that is, 1+1+1) represents one word each for the guidance data indicator of the guide bar 310, the first special character, and the second special character; the required comparison of the field type can then be performed. After the comparison of the currently obtained numerical data is completed (whether or not the comparison succeeds), the numerical data of the next sub-column 308 can be taken out in the same way (for example, in the second sub-column 308 of record 1 the numerical data "the days we walked through together" is taken out directly) and compared. Comparison can start from record 1 in this way, and after all the numerical data of record 1 have been compared, the numerical data of the first sub-column 308 in the title bar 304 of record 2 (for example, "Feng Xiaogang") is taken out and compared. This comparison procedure continues until the numerical data of all records have been compared.

It should be noted that the length of the sub-column 308 mentioned above, and the numbers of bits used by the guide bar 310, the first special character, and the second special character, may be changed according to the actual application, and the present invention does not limit this. The foregoing method of taking out the numerical data for comparison is only one embodiment and is not intended to limit the present invention; another embodiment may perform the full-text search by other means (for example, by character-by-character comparison). In addition, skipping the guide bar 310, the second special character, and the first special character may be achieved by bit shifting (for example, by division), and this part may be implemented in hardware, in software, or in a combination of both; those skilled in the art can make changes according to actual needs. In another embodiment of the present invention, each sub-column 308 in the title bar 304 may have a fixed length, the guide bar 310 in the sub-column 308 may have another fixed length, and the title bar 304 may not include the first special character and the second special character; since the lengths of each sub-column 308 and each guide bar 310 are fixed, the guidance data or numerical data in a sub-column 308 can be taken out directly by skipping a specific number of bits or by bit shifting (for example, by division).
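
A sketch of the fixed-length layout described above follows. The sizes are assumptions: one byte stands in for one "word", and the guidance indicator, the second special character, the value data, and the trailing first special character are assumed to occupy 1, 1, 29, and 1 bytes of a 32-byte sub-column respectively.

```python
# Assumed fixed-width sub-column layout:
# [guidance indicator][second special character][value data][first special character]
COLUMN_SIZE = 32     # fixed size of a sub-column 308 (assumed unit: bytes)
GUIDE_SIZE = 1       # the 7- or 8-bit guidance indicator fits in one byte
SEP_SIZE = 1         # second special character
TRAIL_SIZE = 1       # first special character at the end of the sub-column
VALUE_SIZE = COLUMN_SIZE - GUIDE_SIZE - SEP_SIZE - TRAIL_SIZE    # 32 - 3 = 29


def nth_value(title_bar: bytes, n: int) -> bytes:
    """Jump straight to the value data of the n-th sub-column (0-based)."""
    start = n * COLUMN_SIZE + GUIDE_SIZE + SEP_SIZE
    return title_bar[start:start + VALUE_SIZE].rstrip(b"\x00")


def nth_guidance_indicator(title_bar: bytes, n: int) -> int:
    """Read the guidance indicator, an index into the guidance data storage device."""
    return title_bar[n * COLUMN_SIZE]
```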

It should be noted that, since the sub-column 308 has a fixed length as mentioned above, a counter can be used in the natural language understanding system 100 (or in a server including the natural language understanding system 100) to record which sub-column 308 of a record is currently being compared. In addition, another counter may be used to record the order of the record currently being compared. For example, when a first counter is used to indicate the order of the record currently being compared and a second counter is used to indicate the order of the sub-column currently being compared, if the third sub-column 308 of record 2 of FIG. 3A is currently being compared (that is, "filenameguid: Huayi Brothers" is being compared), the value stored in the first counter is 2 (indicating that record 2 is currently being compared) and the value stored in the second counter is 3 (indicating that the third sub-column 308 is currently being compared). Moreover, the reason the guidance data of the guide bar 310 is stored with only 7 or 8 bits, as described above, is to leave most of the words of the sub-column 308 for storing the numerical data; the actual guidance data can use those 7 or 8 bits as an indicator and be read from the guidance data storage device 280 of the retrieval system 200, in which the guidance data is stored in the form of a table, although any other means that the retrieval system 200 can access may also be used in the present invention. Therefore, in actual operation, besides directly taking out the numerical data for comparison, when a matching result is produced the guidance data can be taken out directly, according to the values of the two counters, as the response result 110 for the knowledge assisted understanding module 400. For example, when the second sub-column 308 of record 6 (that is, "songnameguid: Betrayal") is matched successfully, the current values of the first counter and the second counter are known to be 6 and 2 respectively, so these two values are used to access the guidance data storage device 280 shown in FIG. 3C, and the guidance data of sub-column 2 of record 6 is found to be "songnameguid". In an embodiment, when the length of the sub-column 308 is fixed, all of the bits of the sub-column 308 can be used to store the numerical data, so that the guide bar, the first special character, and the second special character can be removed entirely; the search engine 240 only needs to know that each time the fixed length is crossed it has moved to another sub-column 308 and to add one to the second counter (and, of course, the stored value of the first counter is also incremented every time the next record is searched). For example, in an embodiment the size of each record can be set to a predetermined value and the number of sub-columns 308 it contains can be fixed to a predetermined number, so that the search engine 240 can easily know that it has reached the end of a record after scanning the predetermined amount of data in that record. In another embodiment, a specific third special character (such as a period or other similar symbol) may be stored at the end of a record, and the search engine 240 likewise knows that it has reached the end of the record when it finds that symbol. This method can provide more bits for storing the numerical data.
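
The following sketch illustrates the two counters and the table-style guidance data storage device 280 described above; the table contents and the record representation are assumptions modeled on FIG. 3A and FIG. 3C.

```python
# Guidance data storage device 280, sketched as a table keyed by the two counters.
guidance_data_storage = {
    # (record, sub-column) -> guidance data
    (1, 1): "singerguid",
    (1, 2): "songnameguid",
    (1, 3): "songtypeguid",
}


def full_text_search(records, keyword):
    """records: {record number: [numerical data of each sub-column, in order]}"""
    for record_counter, values in records.items():                 # first counter
        for column_counter, value in enumerate(values, start=1):   # second counter
            if keyword in value:
                # Matched: return the guidance data as the response result 110.
                return guidance_data_storage.get((record_counter, column_counter))
    return None


# full_text_search({1: ["Andy Lau",
#                       "the days we walked through together",
#                       "Hong Kong and Taiwan, Cantonese, popular"]},
#                  "the days we walked through together")
# returns "songnameguid" (first counter = 1, second counter = 2).
```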

Another example illustrates the process of returning the response result 110 to the knowledge assisted understanding module 400 for further processing when a matching result is produced. Corresponding to the data structure of the record 302 described above, in the embodiment of the present invention, when the user's request information 102 is "I want to see Let the Bullets Fly", two possible intent grammar data 106 can be generated: "<readbook>, <bookname>=Let the Bullets Fly" and "<watchfilm>, <filmname>=Let the Bullets Fly". The search engine 240 uses the keyword 108 "Let the Bullets Fly" received by the retrieval interface unit 260 to perform a full-text search on the title bars 304 of the records stored in the structured database 220 of FIG. 3A. In this full-text search, record 5, whose title bar 304 stores the numerical data "Let the Bullets Fly", is found, and a matching result is therefore produced. Next, the retrieval system 200 returns the guidance data "filmnameguid" that corresponds to the keyword 108 "Let the Bullets Fly" in the third sub-column 308 of the title bar 304 of record 5 as the response result 110 back to the knowledge assisted understanding module 400. Since the title bar of record 5 contains the guidance data "filmnameguid" corresponding to the numerical data "Let the Bullets Fly", the knowledge assisted understanding module 400, by comparing the guidance data "filmnameguid" of record 5 with the intent data 112 "<watchfilm>" and "<readbook>" previously stored from the possible intent grammar data 106, can determine that the determined intent grammar data 114 of the request information is "<watchfilm>, <filmname>=Let the Bullets Fly" (because both contain "film"). In other words, "Let the Bullets Fly" in the user's request information 102 describes a movie name, and the user's request information 102 intends to watch the movie "Let the Bullets Fly" rather than to read the book. The confirmed "<watchfilm>, <filmname>=Let the Bullets Fly" is treated as the determined intent grammar data 114 and sent to the analysis result output module 116 for further processing.

Another example provides further explanation. When the user's request information 102 is "I want to listen to The Days We Walked Through Together", two possible intent grammar data 106 may be generated: "<playmusic>, <singer>=walked through together, <songname>=the days" and "<playmusic>, <songname>=the days we walked through together". The search engine 240 uses the two sets of keywords 108 received by the retrieval interface unit 260 ("walked through together" with "the days"; and "the days we walked through together") to perform a full-text search on the title bars 304 of the records stored in the structured database 220 of FIG. 3A. In this full-text search, no matching result for the first set of keywords 108 ("walked through together" and "the days") is found in the title bars 304 of any record, but record 1, which corresponds to the second set of keywords 108 "the days we walked through together", is found. The retrieval system 200 therefore returns the guidance data "songnameguid" in the title bar 304 of record 1 that corresponds to the second set of keywords 108 as the response result 110 back to the knowledge assisted understanding module 400. After receiving the guidance data "songnameguid" corresponding to the numerical data "the days we walked through together", the knowledge assisted understanding module 400 compares it with the intent data 112 in the possible intent grammar data 106 (that is, with <singer> and <songname> in "<playmusic>, <singer>=walked through together, <songname>=the days" and "<playmusic>, <songname>=the days we walked through together"), and finds that this request information 102 does not describe a singer's name but describes a song whose name is "the days we walked through together" (because only <songname> is matched successfully). The knowledge assisted understanding module 400 can therefore determine from this comparison that the determined intent grammar data 114 of the request information 102 is "<playmusic>, <songname>=the days we walked through together", and that the intent of the user's request information 102 is to listen to the song "The Days We Walked Through Together".
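
The comparison performed by the knowledge assisted understanding module 400 in the two examples above can be sketched as follows; the guide-to-slot mapping and the search callable are assumptions introduced only for illustration, and a candidate is kept only when every one of its slots is confirmed by the guidance data returned for its value.

```python
# Assumed mapping from guidance data to intent grammar slot names.
guide_to_slot = {
    "singerguid": "singer",
    "songnameguid": "songname",
    "filmnameguid": "filmname",
    "booknameguid": "bookname",
}


def pick_determined_intent(candidates, search):
    """candidates: dicts such as {"intent": "<playmusic>", "songname": "..."};
    search(value) returns the guidance data matched for that value, or None."""
    for candidate in candidates:
        slots = {k: v for k, v in candidate.items() if k != "intent"}
        confirmed = all(
            guide_to_slot.get(search(value)) == slot for slot, value in slots.items()
        )
        if slots and confirmed:
            return candidate              # determined intent grammar data 114
    return None


# In the song example, the candidate with only "songname" is confirmed by
# "songnameguid", while the candidate that also needs "singer" is rejected.
```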

In another embodiment of the present invention, the retrieved response result 110 may be an exact match record that completely matches the keywords 108, or a partial match record that only partially matches the keywords 108. For example, if the user's request information 102 is "I want to listen to Xiao Jingteng's Betrayal", the natural language processor 300 similarly analyzes it into two possible intent grammar data 106: "<playmusic>, <singer>=Xiao Jingteng, <songname>=Betrayal" and "<playmusic>, <songname>=Xiao Jingteng's Betrayal", and sends two sets of keywords 108 ("Xiao Jingteng" and "Betrayal"; and "Xiao Jingteng's Betrayal") to the retrieval interface unit 260; the search engine 240 then performs a full-text search of the title bars 304 of the records 302 stored in the structured database 220 of FIG. 3A with the keywords 108 received by the retrieval interface unit 260. In this full-text search the second set of keywords 108, "Xiao Jingteng's Betrayal", does not match any record, but the first set of keywords 108, "Xiao Jingteng" and "Betrayal", produces matching results in record 6 and record 7. Since the first set of keywords 108 matches only the numerical data "Xiao Jingteng" and "Betrayal" in record 6 but does not match its other numerical data "Yang Zongwei" and "Cao Ge", record 6 is a partial match record (note that, likewise, record 5 corresponding to the request information 102 "I want to see Let the Bullets Fly" above and record 1 corresponding to the request information "I want to listen to The Days We Walked Through Together" are partial match records), whereas the keywords "Xiao Jingteng" and "Betrayal" match all of the numerical data of record 7, so record 7 is an exact match record. In the embodiment of the present invention, when the retrieval interface unit 260 outputs a plurality of response results 110 to the knowledge assisted understanding module 400, it may output in order the response results 110 of the exact match records (that is, records in which all the numerical data are matched) and of the partial match records (that is, records in which only part of the numerical data is matched), where the priority of the exact match records is higher than that of the partial match records. Therefore, when the retrieval interface unit 260 outputs the response results 110 of record 6 and record 7, the output priority of record 7 is higher than that of record 6, because all of the numerical data of record 7, "Xiao Jingteng" and "Betrayal", produce matching results, while record 6 also contains "Yang Zongwei" and "Cao Ge", which do not. That is to say, the more closely a record stored in the structured database 220 matches the keywords 108 in the request information 102, the more likely it is to be output first so that the user can view or select it and the corresponding determined intent grammar data 114 can be obtained. In another embodiment, the response result 110 corresponding to the record with the highest priority can be output directly as the determined intent grammar data 114.
The foregoing is not intended to limit the invention; in another embodiment the search may stop as soon as a matching record is found (for example, with "I want to listen to Xiao Jingteng's Betrayal" as the request information 102, when record 6 is retrieved and a matching result is produced, the guidance data corresponding to record 6 is output as the response result 110), without sorting by priority, in order to speed up retrieval. In another embodiment, the processing corresponding to the record with the highest priority can be executed directly and the result provided to the user; for example, when the record with the highest priority is the movie of the Romance of the Three Kingdoms, the movie can be played for the user directly, and if the record with the highest priority is "Betrayal" sung by Xiao Jingteng, the song can be played for the user directly. It should be noted that the description here is only illustrative and is not intended to limit the invention.
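
The ordering of exact match records before partial match records can be sketched as follows, using the record contents of the "Betrayal" example; the representation of the records is illustrative only.

```python
# Rank matching records: exact match records first, then partial match records.
def rank_matches(records, keywords):
    """records: {record number: [numerical data of the record]}"""
    exact, partial = [], []
    for number, values in records.items():
        matched = [v for v in values if any(kw in v for kw in keywords)]
        if matched and len(matched) == len(values):
            exact.append(number)       # e.g. record 7: "Xiao Jingteng", "Betrayal"
        elif matched:
            partial.append(number)     # e.g. record 6 also holds "Yang Zongwei", "Cao Ge"
    return exact + partial             # exact match records have higher priority


# rank_matches({6: ["Xiao Jingteng", "Yang Zongwei", "Cao Ge", "Betrayal"],
#               7: ["Xiao Jingteng", "Betrayal"]},
#              ["Xiao Jingteng", "Betrayal"])   # -> [7, 6]
```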

In still another embodiment of the present invention, if the user's request information 102 is "I want to listen to Andy Lau's Betrayal", then one of the possible intent grammar data 106 is "<playmusic>, <singer>=Andy Lau, <songname>=Betrayal". If the retrieval interface unit 260 passes the keywords 108 "Andy Lau" and "Betrayal" to the search engine 240 together, no matching result will be found in the database of FIG. 3A. In still another embodiment of the present invention, the retrieval interface unit 260 may input the keywords 108 "Andy Lau" and "Betrayal" into the search engine 240 separately, and determine respectively that "Andy Lau" is a singer's name (guidance data singerguid) and that "Betrayal" is a song name (guidance data songnameguid, where the singer may be Xiao Jingteng, or the chorus of Xiao Jingteng, Yang Zongwei, and Cao Ge). At this point, the natural language understanding system 100 can further ask the user: "Is 'Betrayal' the song sung by Xiao Jingteng (according to the matching result of record 7)?", or "Is it the chorus version by Xiao Jingteng, Yang Zongwei, and Cao Ge (according to the matching result of record 6)?".

In still another embodiment of the present invention, the records stored in the structured database 220 may further include a source column 314 and a heat column 316. The database shown in FIG. 3B includes, in addition to the columns of FIG. 3A, a source column 314, a heat column 316, a preference column 318, and an aversion column 320. The source column 314 of each record can be used to store an indication or indicator of which structured database the record was generated from (note that only the structured database 220 is shown in the figure, but more structured databases may actually exist), or of which user or server provided the source value. Moreover, the natural language understanding system 100 can search a structured database of a specific source according to the preferences revealed by the user in previous request information 102 (for example, when a full-text search with the keyword 108 in the request information 102 produces a match, one is added to the heat value of the matching record). The heat column 316 of each record 302 is used to store the search heat value or popularity value of the record 302 (for example, the number of times or the probability that the record is matched by a single user, by a specific user group, or by all users within a specific period) as a reference for the knowledge assisted understanding module 400 in determining the user's intent; the use of the preference column 318 and the aversion column 320 will be described in detail later. In detail, when the user's request information 102 is "I want to see the Romance of the Three Kingdoms", the natural language processor 300 can generate a plurality of possible intent grammar data 106 after analysis: "<readbook>, <bookname>=Romance of the Three Kingdoms"; "<watchTV>, <TVname>=Romance of the Three Kingdoms"; and "<watchfilm>, <filmname>=Romance of the Three Kingdoms".

If the retrieval system 200 counts, from the history of the user's request information 102 (for example, by counting through the heat column 316 the number of times a record 302 has been selected by a user), that most of the requests concern movies (assuming that only one structured database has corresponding book, television drama, and movie records for the Three Kingdoms, and that the heat of the movie record is higher than that of the other two), the retrieval system 200 can perform the search on the structured database that stores the movie record (the source value in the source column 314 in this case being the code of the structured database storing movie records), so that "<watchfilm>, <filmname>=Romance of the Three Kingdoms" can be preferentially determined as the determined intent grammar data 114. For example, in an embodiment, each time a record 302 is matched, one can be added to its heat column 316 as the user's history. Therefore, when the full-text search is performed with the keyword "Romance of the Three Kingdoms", the record 302 with the highest value in its heat column 316 can be selected from all the matching results as the judgment of the user's intent. In an embodiment, if the retrieval system 200 determines from the search results for the keyword 108 "Romance of the Three Kingdoms" that the search heat value stored in the heat column 316 of the record corresponding to the television drama of the Three Kingdoms is the highest, "<watchTV>, <TVname>=Romance of the Three Kingdoms" is preferentially determined as the determined intent grammar data 114. In yet another embodiment, if there are several matching records in each domain, the retrieval system 200 can total the heat values of all of those records. For example, if the structured database 220 contains multiple books, television dramas, and movies corresponding to the Romance of the Three Kingdoms, the retrieval system 200 can first total the heat values of the corresponding records and determine which domain has the highest total. For example, if there are 5, 13, and 16 records corresponding respectively to books, television dramas, and movies of the Romance of the Three Kingdoms, and the totals of the heat values of these 5, 13, and 16 records are 30, 18, and 25 respectively, the retrieval system 200 can select, among the five records related to books of the Three Kingdoms, the record with the highest value in its heat column 316, and its corresponding guide bar data (which may include the source value in the source column 314) is passed to the knowledge assisted understanding module 400 for further processing. In addition, the source value stored in the source column 314 can also be output to the knowledge assisted understanding module 400 as part of the response result 110 to show the user where the desired television show can be accessed. Furthermore, the manner of changing the value stored in the heat column 316 can be varied by the computer system in which the natural language understanding system 100 is located, and the present invention is not limited thereto. The value of the heat column 316 may also decrease over time to indicate that the user's interest in a certain record 302 has gradually decreased; the present invention does not limit this part either.

As another example, in another embodiment, a user may particularly like watching the TV drama of the Romance of the Three Kingdoms during a certain period; since the drama is long and cannot be finished in a short time, the user may select it repeatedly within that period (assuming the value in the heat column 316 is incremented by one each time it is matched), causing a certain record 302 to be matched repeatedly, which can be learned by analyzing the data of the heat column 316. Moreover, in another embodiment, a data provider may also utilize the heat column 316 to indicate the popularity of the material provided by a source, and the provider's code may be stored in the source column 314. For example, when a user enters the request information 102 "I want to see the Romance of the Three Kingdoms", the full-text search of the database of FIG. 3B will find a record for reading the book of the Romance of the Three Kingdoms (record 8), a record for watching the TV drama of the Romance of the Three Kingdoms (record 9), and a record for watching the movie of the Romance of the Three Kingdoms (record 10); however, according to the information in the heat column 316, the movie of the Romance of the Three Kingdoms is currently the hottest option (that is, the heat column values of records 8, 9, and 10 are 2, 5, and 8 respectively), so the guidance data of record 10 is provided first as the matching record 110 and output to the knowledge-assisted understanding module 400 as the highest-priority basis for determining the user's intention. In an embodiment, the data of the source column 314 can be displayed to the user at the same time, allowing the user to determine whether the program he wants to watch is provided by a certain provider (and the user can link to that provider to play it). In another embodiment, if more than one record provides a movie of the Romance of the Three Kingdoms, the retrieval system 200 can transmit the data stored in the source columns 314 of the records having the highest heat values to the knowledge-assisted understanding module 400. It should be noted that the information stored in the source column 314, and the manner of changing it, may be decided by the computer system in which the natural language understanding system 100 is located, and is not limited by the present invention. It should also be noted that those skilled in the art could further divide the information stored in the heat column 316, the preference column 318, and the aversion column 320 of FIG. 3B into two parts, one related to the individual user and one related to all users: the heat column 316, preference column 318, and aversion column 320 information related to the individual user is stored in the user's mobile phone, while the server stores the heat column 316, preference column 318, and aversion column 320 information related to all users. In this way, only the personal preferences related to the user's own choices or intentions are stored in the user's personal mobile communication device (e.g., a mobile phone, tablet computer, or small notebook), and the server need not store that individual user's information, which not only saves storage space on the server but also preserves the privacy of the user's personal preferences.
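The time-based decrease of the heat value mentioned above is only sketched here; the exponential form and the half-life parameter are assumptions, since the text leaves the exact update rule to the computer system hosting the natural language understanding system 100.

```python
import math
import time

HALF_LIFE_SECONDS = 30 * 24 * 3600  # assumed: heat halves after roughly one month of inactivity

def decayed_heat(heat, last_matched_at, now=None):
    """Return the heat value of a record after time-based decay.

    A record that has not been matched for a long time gradually loses heat,
    reflecting that the user's interest in it has cooled down.
    """
    now = now or time.time()
    elapsed = max(0.0, now - last_matched_at)
    return heat * math.pow(0.5, elapsed / HALF_LIFE_SECONDS)

def bump_heat(record, last_matched_at):
    """Add one to the (decayed) heat each time the record is matched."""
    record.heat = decayed_heat(record.heat, last_matched_at) + 1
    return record.heat
```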

Obviously, the numerical data contained in each record of the structured database disclosed by the present invention are related to one another (for example, in record 1, the numerical data "Andy Lau", "days passed together", and "Hong Kong and Taiwan, Cantonese, popular" are used together to describe the characteristics of record 1), and these numerical data (together with the corresponding guidance data) jointly express the intention, contained in the user's request information, toward that record (for example, when the matching result is "days passed together", the user's intention is probably to access the data of record 1). Therefore, when the search engine performs a full-text search on the structured database and the numerical data of a record is matched, the guidance data corresponding to that numerical data can be output (for example, "songnameguid" is output as the response result 110), so that the intention of the request information can be confirmed (for example, by the knowledge-assisted understanding module 400).

Based on the disclosure or teachings of the above exemplary embodiments, FIG. 4A is a flowchart of a retrieval method in accordance with an embodiment of the present invention. Referring to FIG. 4A, the retrieval method of this embodiment includes the following steps: providing a structured database that stores a plurality of records (step S410); receiving at least one keyword (step S420); and performing a full-text search on the title columns of the plurality of records according to the keyword (step S430). For example, the keyword 108 is input into the search interface unit 260, causing the search engine 240 to perform a full-text search on the title columns 304 of the plurality of records 302 stored in the structured database 220; the search may be performed in the manner described for FIG. 3A or FIG. 3B, or in any other manner that does not depart from the spirit of the invention. It is then determined whether the full-text search has a matching result (step S440); for example, the search engine 240 determines whether the full-text search for the keyword 108 has produced a matching result. If there is a matching result, the exact-match records and the partial-match records are output in order (step S450). For example, if records in the structured database 220 match the keyword 108, the retrieval interface unit 260 outputs, in order, the matching data of the exact-match records and of the partial-match records (obtained via the feedback data storage device 280 of FIG. 3C) as the response result 110 to the knowledge-assisted understanding module 400 (in another embodiment, the response result 110 may include other information related to the matched records, such as the value stored in the heat column 316, which can be displayed to the user for linking to other material), wherein the priority of the exact-match records is higher than the priority of the partial-match records.
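A minimal sketch of steps S430 to S450 is given below; the substring-based notions of "exact match" and "partial match" are assumptions made for illustration, not the patent's exact matching rules.

```python
def full_text_search(keyword, records):
    """Steps S430-S450: search the title column and order the results.

    Exact-match records (title equals the keyword) are returned before
    partial-match records (title merely contains the keyword), reflecting
    the higher priority of exact matches.
    """
    exact, partial = [], []
    for rec in records:
        if rec.title == keyword:
            exact.append(rec)
        elif keyword in rec.title:
            partial.append(rec)
    if not exact and not partial:
        return None          # step S460: no matching result
    return exact + partial   # step S450: exact matches first, then partial matches
```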

On the other hand, if there is no matching result, the natural language understanding system 100 can directly notify the user that the matching failed and end the process, notify the user that no matching result was found and request further input, or enumerate possible options for the user to make a further selection (for example, the above example in which "Andy Lau" and "Betrayal" do not produce a matching result in the full-text search) (step S460).

The foregoing process steps are not intended to limit the present invention, and some of the steps may be omitted. For example, in another embodiment of the present invention, step S440 may be performed by a matching judgment module (not shown) located outside the retrieval system 200; or, in another embodiment, step S450 may be omitted, and the action of sequentially outputting the exact-match records and the partial-match records may instead be performed by a matching-result output module (not shown) located outside the retrieval system 200.

Based on the content disclosed or taught by the above exemplary embodiments, FIG. 4B is a flowchart of the working process of the natural language understanding system 100 in accordance with another embodiment of the present invention. Referring to FIG. 4B, the working process of the natural language understanding system 100 of this embodiment includes the following steps: receiving request information (step S510); for example, the user transmits request information 102 having voice content or text content to the natural language understanding system 100. A structured database storing a plurality of records is provided (step S520). The request information is converted into grammar data (step S530); for example, the natural language processor 300 analyzes the user's request information 102 and converts it into the corresponding possible intent grammar data 106. The possible attributes of the keyword are discriminated (step S540); for example, the knowledge-assisted understanding module 400 identifies the possible attributes of at least one keyword 108 in the possible intent grammar data 106, for example, the keyword 108 "The Romance of the Three Kingdoms" may be a book, a movie, or a television program. A full-text search is performed on the title columns 304 of the plurality of records (step S550); for example, the keyword 108 is input to the search interface unit 260, causing the search engine 240 to perform a full-text search on the title columns 304 of the plurality of records stored in the structured database 220. It is determined whether the full-text search has a matching result (step S560); for example, the search engine 240 determines whether the full-text search for the keyword 108 has produced a matching result. If there is a matching result, the guidance data corresponding to the exact-match records and the partial-match records are output in order as the response result 110 (step S570); for example, if records in the structured database 220 match the keyword 108, the search interface unit 260 outputs, in order, the guidance data corresponding to the exact-match records and the partial-match records of the keyword 108 as the response result 110, wherein the priority of the exact-match records is higher than that of the partial-match records. Finally, the corresponding determined intent grammar data is output in order (step S580); for example, the knowledge-assisted understanding module 400 outputs the corresponding determined intent grammar data 114 according to the order of the exact-match records and the partial-match records.
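The overall flow of FIG. 4B (steps S510 to S580) might be chained together roughly as in the sketch below; the helper names (parse_intents, search, match_intent) are hypothetical stand-ins for the natural language processor 300, the retrieval system 200, and the knowledge-assisted understanding module 400.

```python
def understand(request_info, parse_intents, search, match_intent):
    """Sketch of steps S510-S580 (illustrative only).

    parse_intents: request text -> possible intent grammar data, each with a keyword (S530/S540)
    search:        keyword -> ordered matches, exact before partial (S550-S570),
                   e.g. the full_text_search sketch shown earlier
    match_intent:  (guidance data, possible intent) -> determined intent grammar data (S580)
    """
    determined = []
    for intent in parse_intents(request_info):
        matches = search(intent["keyword"])
        if matches:
            # exact matches come first, so matches[0] has the highest priority
            determined.append(match_intent(matches[0].guide, intent))
    return determined or None   # no match at all -> handled as in step S590
```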

On the other hand, if no matching result is produced in step S560, the situation may be handled in a manner similar to step S460, for example, by directly notifying the user that the matching failed and ending the process, by notifying the user that no matching result was found and requesting further input, or by listing possible options for the user to make a further selection (for example, the foregoing example in which "Andy Lau" and "Betrayal" do not produce a matching result in the full-text search) (step S590).

The foregoing process steps are not intended to limit the invention, and some steps may be omitted or removed.

In summary, the present invention extracts a keyword contained in a user's request information and performs a full-text search on the title columns of records having a specific data structure in the structured database, such as the structures of FIGS. 3A and 3B; if a matching result is produced, the domain type to which the keyword belongs can be determined (by comparison with the information in the guidance column), thereby determining the intention expressed by the user in the request information.

Next, further explanation is given of how the above structured database can be applied to speech recognition. First, in a natural language dialogue system, the natural language understanding system 100 can be used to correct an erroneous voice response according to the user's voice input, and further find other possible answers to report back to the user.

As mentioned earlier, today's mobile communication devices are already able to provide natural language dialogue, allowing users to communicate with them by voice. However, in current voice dialogue systems, when the user's voice input is not clear, and because the same spoken sentence may express several different intentions or purposes, the system may easily output a voice response that does not match the voice input. Therefore, in many conversation situations it is difficult for a user to obtain a voice response that meets his or her intention. To this end, the present invention proposes a method of correcting a voice response and a natural language dialogue system, in which the natural language dialogue system can correct a wrong voice response according to the user's voice input, and further find other possible answers to report back to the user. In order to clarify the content of the present invention, specific examples in which the present invention can be implemented are given below.

FIG. 5A is a block diagram of a natural language dialogue system according to an embodiment of the invention. Referring to FIG. 5A, the natural language dialogue system 500 includes a speech sampling module 510, a natural language understanding system 520, and a speech synthesis database 530. In an embodiment, the voice sampling module 510 is configured to receive the first voice input 501 (e.g., voice from the user) and parse it to generate the first request information 503; the natural language understanding system 520 then parses the first request information 503 to obtain the first keyword 509 therein and finds the first return answer 511 that meets the first request information 503. (According to the description of FIG. 1, the first request information 503 can be processed in the same way as the request information 102: the request information 102 is analyzed to generate the possible intent grammar data 106, the keyword 108 therein is used to perform a full-text search on the structured database 220 to obtain the response result 110, the response result 110 is compared with the intent data 112 in the possible intent grammar data 106 to generate the determined intent grammar data 114, and finally the analysis result 104 is output by the analysis result output module 116; this analysis result 104 can serve as the first return answer 511 of FIG. 5A.) The speech synthesis database 530 is then queried according to the first return answer 511 (because the analysis result 104 serving as the first return answer 511 may include relevant information of the fully or partially matched records 302, such as the guidance data stored in the guidance column 310, the numerical data in the value column 312, and the data in the content column 306, it can be used to perform a voice query), and the first voice 513 found by the query is used to generate a first voice response 507 corresponding to the first voice input 501, which is output to the user. If the user believes that the first voice response 507 output by the natural language understanding system 520 does not conform to the first request information 503 of the first voice input 501, the user provides another voice input, such as the second voice input 501', to indicate this. The natural language understanding system 520 processes the second voice input 501' using the same method as the first voice input 501 to generate the second request information 503', then parses the second request information 503' to obtain the second keyword 509', finds the second return answer 511' that matches the second request information 503', finds the corresponding second voice 513', and finally generates the corresponding second voice response 507' according to the second voice 513' and outputs it to the user as a correction of the first return answer 511. Obviously, the natural language understanding system 520 can be based on the natural language understanding system 100 of FIG. 1, with new modules added (explained in conjunction with FIG. 5B below), to achieve the purpose of correcting a voice response based on the user's voice input.

The components in the aforementioned natural language dialogue system 500 can be configured in the same machine. For example, the speech sampling module 510 and the natural language understanding system 520 may be disposed in the same electronic device. The electronic device may be a mobile communication device such as a cell phone, a personal digital assistant (PDA) phone, or a smart phone, or a pocket PC, a tablet PC, a notebook computer, a personal computer, or any other electronic device having a communication function or with communication software installed; the scope is not limited here. In addition, the above electronic device may use an Android operating system, a Microsoft operating system, a Linux operating system, or the like, without being limited thereto. Of course, the components of the natural language dialogue system 500 need not necessarily be disposed in the same machine; they may be distributed over different devices or systems and connected through various communication protocols. For example, the natural language understanding system 520 may be located in a cloud server or in a server on a local area network. Moreover, the various components of the natural language understanding system 520 may themselves be distributed among different machines; for example, they may be located in the same machine as the speech sampling module 510 or in different machines.

In this embodiment, the voice sampling module 510 is configured to receive voice input. The voice sampling module 510 may be a device for receiving audio, such as a microphone, and the first voice input 501 / second voice input 501' may be voice from the user.

Furthermore, the natural language understanding system 520 of the present embodiment can be implemented as a hardware circuit composed of one or several logic gates. Alternatively, in another embodiment of the invention, the natural language understanding system 520 can be implemented by computer program code. For example, the natural language understanding system 520 may consist of program code segments written in a programming language and implemented in an application, an operating system, or a driver; these code segments are stored in a storage unit and executed by a processing unit (not shown in FIG. 5A). In order to enable those skilled in the art to further understand the natural language understanding system 520 of the present embodiment, an example is described below. However, the example is intended to be illustrative only and is not limiting; the invention may be practiced using hardware, software, firmware, or a combination thereof.

FIG. 5B is a block diagram of a natural language understanding system 520, in accordance with an embodiment of the invention. Referring to FIG. 5B, the natural language understanding system 520 of the present embodiment may include a voice recognition module 522, a natural language processing module 524, and a speech synthesis module 526. The voice recognition module 522 receives the request information transmitted from the voice sampling module 510, for example the first request information 503 obtained by parsing the first voice input 501, and extracts one or more first keywords 509 (for example, the keyword 108 or sentence of FIG. 1A). The natural language processing module 524 can further parse the first keywords 509 to obtain a candidate list that includes at least one return answer (processed in the same way as described for FIG. 5A, that is, for example, the retrieval system 200 of FIG. 1A performs a full-text search on the structured database 220, the response result 110 is obtained and compared with the intent data 112 to generate the determined intent grammar data 114, and finally the analysis result 104 output by the analysis result output module 116 serves as a return answer), and selects from all the return answers of the candidate list an answer that matches the first voice input 501 as the first return answer 511 (for example, by picking an exact-match record). Since the first return answer 511 is an internal analysis result of the natural language understanding system 520, it must be converted into speech before being output so that the user can make a judgment on it. The speech synthesis module 526 therefore queries the speech synthesis database 530 according to the first return answer 511. The speech synthesis database 530 records, for example, text and corresponding speech information, so that the speech synthesis module 526 can find the first speech 513 corresponding to the first return answer 511 and use it to synthesize the first voice response 507. Thereafter, the speech synthesis module 526 can output the synthesized first voice response 507 to the user through a voice output interface (not shown), wherein the voice output interface is, for example, a speaker or a headset. It should be noted that when the speech synthesis module 526 queries the speech synthesis database 530 according to the first return answer 511, the first return answer 511 may need to be format-converted first, with the query then performed through the interface specified by the speech synthesis database 530. Whether format conversion is needed when calling the speech synthesis database 530 depends on the definition of the speech synthesis database 530 itself; since this belongs to techniques well known to those skilled in the art, it is not described in detail here.

Next, an example is given. If the user inputs the first voice input 501 of "I want to see the Romance of the Three Kingdoms", the voice recognition module 522 receives from the voice sampling module 510 the parsed first request information 503 of the first voice input 501, and then extracts, for example, "The Romance of the Three Kingdoms" as a first keyword 509. The natural language processing module 524 can then parse the first keyword 509 "The Romance of the Three Kingdoms" (for example, the retrieval system 200 of FIG. 1A performs a full-text search on the structured database 220, the response result 110 is obtained and compared with the intent data 112 to generate the determined intent grammar data 114, and finally the analysis result 104 is output by the analysis result output module 116), generate return answers containing the three intent options related to "The Romance of the Three Kingdoms", and integrate them into a candidate list (assuming each intent option has only one return answer, the answers are classified into the three options of "reading the book", "watching the TV series", and "watching the movie"); it then selects from the candidate list the one of the three return answers whose heat column 316 has the highest value (for example, record 10 of FIG. 3B) as the first return answer 511. In an embodiment, the action corresponding to the answer whose heat column 316 has the highest value can be executed directly (for example, directly playing the previously mentioned "Betrayal" by Xiao Jingteng to the user), and the present invention is not limited in this respect.

In addition, the natural language processing module 524 can also determine whether the previous first return answer 511 was correct by parsing the subsequently received second voice input 501' (which is fed into the voice sampling module 510 in the same manner as the first voice input 501). Because the second voice input 501' is the user's response to the first voice response 507 previously provided, it contains information about whether the user considered that first voice response 507 correct. If analysis of the second voice input 501' indicates that the user considers the first return answer 511 incorrect, the natural language processing module 524 can select another return answer in the candidate list as the second return answer 511', for example by removing the first return answer 511 from the candidate list and re-selecting a second return answer 511' from the remaining return answers; the speech synthesis module 526 then finds the second speech 513' corresponding to the second return answer 511', and finally synthesizes the second speech 513' into a second voice response 507' that is played to the user.
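A rough sketch of this correction step follows; treating a simple negation word in the second input as the signal that the first answer was wrong, and the dictionary shape of the answers, are assumptions made only for illustration.

```python
def correct_answer(candidates, first_answer, second_keywords):
    """Pick the second return answer 511' after the user reacts to the first one.

    candidates:      return answers remaining in the candidate list
    first_answer:    the answer previously read back to the user
    second_keywords: keywords parsed from the second voice input 501'
    """
    negated = any(k in ("no", "not", "don't want") for k in second_keywords)
    if not negated:
        return first_answer                      # user confirmed the first answer

    remaining = [c for c in candidates if c is not first_answer]
    # Prefer an answer whose intent matches an explicit keyword ("TV series", "movie", ...);
    # otherwise fall back to the next answer in priority (e.g. heat) order.
    for cand in remaining:
        if cand["intent"] in second_keywords:
            return cand
    return remaining[0] if remaining else None   # None -> e.g. a "no data found" response
```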

Continuing the earlier example in which the user inputs "I want to see the Romance of the Three Kingdoms": if the user actually wants to watch the TV series of the Romance of the Three Kingdoms, then the option previously output to the user, record 10 of FIG. 3B (watching the movie of the Romance of the Three Kingdoms), is not what the user wants, so the user may input, as the second voice input 501', "I want to see the TV series of the Romance of the Three Kingdoms" (the user explicitly indicates the TV series) or "I don't want to watch the Romance of the Three Kingdoms movie" (the user only negates the current option), and so on. After the second voice input 501' has been parsed to obtain its second request information 503' (or second keyword 509'), it will be found that the second keyword 509' in the second request information 503' contains "TV series" (the user gave a clear indication) or "don't want the movie" (the user only negated the current option), so it can be judged that the first return answer 511 did not meet the user's needs. At this point, another return answer can be selected from the candidate list as the second return answer 511' and the corresponding second voice response 507' can be output: for example, the second voice response 507' "I am playing the TV series of the Romance of the Three Kingdoms for you" (if the user explicitly indicated the TV series), or the second voice response 507' "Which option would you like?" (if the user only negated the current option), combined with the other options in the candidate list for the user to choose from (for example, the return answer with the next-highest value in the heat column 316 can be chosen as the second return answer 511'). Furthermore, in another embodiment, if the information presented to the user includes a selection, for example the three options "read the book of the Romance of the Three Kingdoms", "watch the TV series of the Romance of the Three Kingdoms", and "watch the movie of the Romance of the Three Kingdoms" for the user to choose from, the user may enter "I want to watch the movie" as the second voice input 501'; after the second voice input 501' is parsed to obtain its second request information 503' and the user's intention is discovered (for example, the second keyword 509' shows that the user selects "watch the movie"), the system outputs the second voice response 507' "I am playing the Romance of the Three Kingdoms movie for you" and then plays the movie directly for the user. Of course, if the user enters "I want the third option" (assuming the third option corresponds to reading the book), the application corresponding to the third option is executed, that is, the voice response "What you want is to read the book of the Romance of the Three Kingdoms" is output and the e-book of the Romance of the Three Kingdoms is displayed to the user.

In this embodiment, the speech recognition module 522, the natural language processing module 524, and the speech synthesis module 526 of the natural language understanding system 520 can be disposed in the same machine as the speech sampling module 510. In other embodiments, the speech recognition module 522, the natural language processing module 524, and the speech synthesis module 526 can also be distributed among different machines (e.g., computer systems, servers, or similar devices or systems). For example, in the natural language understanding system 520' shown in FIG. 5C, the speech synthesis module 526 is disposed in the same machine 502 as the speech sampling module 510, while the speech recognition module 522 and the natural language processing module 524 are disposed on another machine. In addition, under the architecture of FIG. 5C, the natural language processing module 524 transmits the first return answer 511 / second return answer 511' to the speech synthesis module 526, which then sends the first return answer 511 / second return answer 511' to the speech synthesis database to find the corresponding first speech 513 / second speech 513' as the basis for generating the first voice response 507 / second voice response 507'.

FIG. 6 is a flowchart of a method for correcting the first voice response 507 according to an embodiment of the invention. In the method of this embodiment, when the user thinks that the currently played first voice response 507 does not match the previously input first request information 503, the user re-enters a second voice input 501', which is fed into the speech sampling module 510 and then analyzed by the natural language understanding system 520; when the analysis shows that the first voice response 507 previously played to the user does not conform to the user's intent, the natural language understanding system 520 can output a second voice response 507', thereby correcting the original first voice response 507. For convenience of explanation, only the natural language dialogue system 500 of FIG. 5A is taken as an example, but the method of correcting the first voice response 507 of this embodiment can also be applied to the above-described natural language dialogue system of FIG. 5C.

Referring to FIG. 5A and FIG. 6 simultaneously, in step S602, the voice sampling module 510 receives the first voice input 501. The first voice input 501 is, for example, voice from a user and may carry the user's first request information 503. Specifically, the first voice input 501 from the user may be an inquiry sentence, a command sentence, or other request information, such as "I want to see the Romance of the Three Kingdoms", "I want to listen to the song forget the water", or "What is today's temperature", and so on.

In step S604, the natural language understanding system 520 parses at least one first keyword 509 included in the first voice input 501 to obtain a candidate list, wherein the candidate list has one or more return answers. For example, when the user's first voice input 501 is "I want to see the Romance of the Three Kingdoms", the first keywords 509 obtained by the natural language understanding system 520 after analysis are, for example, "The Romance of the Three Kingdoms" and "see". For another example, when the user's first voice input 501 is "I want to listen to the song forget the water", the first keywords 509 obtained after analysis are, for example, "forget the water", "listen", and "song".

After that, the natural language understanding system 520 can query the structured database 220 according to the first keyword 509 to obtain at least one search result (for example, the analysis result 104 of FIG. 1), which is used as a return answer in the candidate list. The manner of selecting the first return answer 511 from the plurality of return answers may be as described for FIG. 1A and is not repeated here. Since the first keyword 509 may belong to different knowledge domains (such as movies, books, music, or games), and the same knowledge domain may be further divided into multiple categories (for example, different authors of the same movie or book title, different singers of the same song title, different versions of the same game title, etc.), the natural language understanding system 520 can, for the first keyword 509, query the structured database for one or more search results related to the first keyword 509 (for example, the analysis result 104), wherein each search result may include guidance data related to the first keyword 509 (for example, when "Xiao Jingteng" and "Betrayal" are used as keywords 108 for a full-text search of the structured database 220 of FIGS. 3A and 3B, the matching results of records 6 and 7 of FIG. 3A are obtained, which respectively contain the guidance data "singerguid" and "songnameguid" stored in the guidance column 310) and other data. The other data are, for example, other keywords in the search result that are related to the first keyword 509 (for example, when "days passed together" is used as the keyword for a full-text search of the structured database 220 of FIG. 3A and record 1 is the matching result, "Andy Lau" and "Hong Kong and Taiwan, Cantonese, popular" are the other data). Therefore, from another point of view, when the first voice input 501 entered by the user contains multiple first keywords 509, the user's first request information 503 is clearer, so that the natural language understanding system 520 can find search results that are closer to the first request information 503.

For example, when the first keyword 509 is "The Romance of the Three Kingdoms" (for example, when the user inputs the voice input "I want to see the Romance of the Three Kingdoms"), the natural language understanding system 520 may generate three possible intent grammar data 106 after analysis (as described for FIG. 1): "<readbook>, <bookname>=The Romance of the Three Kingdoms"; "<watchTV>, <TVname>=The Romance of the Three Kingdoms"; and "<watchfilm>, <filmname>=The Romance of the Three Kingdoms".

Therefore, the search results of the query are, for example, records about the "book" of "The Romance of the Three Kingdoms" (intent data <readbook>), the "TV drama" of "The Romance of the Three Kingdoms" (intent data <watchTV>), and the "movie" of "The Romance of the Three Kingdoms" (intent data <watchfilm>) (for example, records 8, 9, and 10 of FIG. 3B), where "book", "TV drama", and "movie" correspond to the respective user intents. For another example, when the first keywords 509 are "forget the water" and "music" (for example, the user inputs "I want to listen to the song forget the water"), the natural language understanding system 520 may generate the following possible intent grammar data after analysis: "<playmusic>, <songname>=forget the water"; the search results are then, for example, the record of "forget the water" by "Andy Lau" (for example, record 11 of FIG. 3B) and the record of "forget the water" by "Li Yijun (李翊君)" (for example, record 12 of FIG. 3B), wherein "Andy Lau" and "Li Yijun" correspond to the user's intent data. In other words, each search result may include the first keyword 509 and intent data related to the first keyword 509, and the natural language understanding system 520 converts the data included in the queried search results into return answers and records the return answers in the candidate list for use in subsequent steps.

In step S606, the natural language understanding system 520 selects at least one first return answer 511 from the candidate list, and outputs the corresponding first voice response 507 according to the first return answer 511. In the present embodiment, the natural language understanding system 520 can arrange the return answers in the candidate list in order of priority, and select a return answer from the candidate list according to that priority order, thereby outputting the first voice response 507.

For example, when the first keyword 509 is "Three Kingdoms", it is assumed that the natural language understanding system 520 queries a lot of records about "..."Three Kingdoms"... "Books" (ie, by query) The number of materials to be given is prioritized, for example, 20 records on books), followed by the records of "...the Romance of the Three Kingdoms..."Music" (for example, 18 pens), and about "..." "The Romance of the Three Kingdoms"... "TV drama" has the fewest records (for example, 10 strokes), and the Natural Language Understanding System 520 will use the "Book of the Three Kingdoms" as the first return answer (the most preferred return answer), " "The music of the Romance of the Three Kingdoms" as the second return answer (the second preferred choice of the answer), "the drama of the Three Kingdoms" as the third return answer (the third preferred choice answer). Of course, if the first return answer related to the "Book of the Three Kingdoms" is not only a record, the first return answer 511 can also be selected according to the order of precedence (for example, the number of times selected or the highest value of the heat column 316). The relevant details have been mentioned before and will not be described here.

Next, in step S608, the speech sampling module 510 receives the second speech input 501', and the natural language understanding system 520 parses the second speech input 501' and determines whether the previously selected first return answer 511 is correct. Here, the second speech input 501' is parsed to obtain the second keyword 509' included in it, wherein the second keyword 509' is, for example, a keyword further provided by the user (such as a time, an intention, or a knowledge domain). Moreover, when the second keyword 509' in the second voice input 501' does not match the intent data associated with the first return answer 511, the natural language understanding system 520 determines that the previously selected first return answer 511 is incorrect. The manner in which the second request information 503' of the second voice input 501' may confirm or negate the first voice response 507 has been mentioned above and is not repeated here.

Further, the second speech input 501' parsed by the natural language understanding system 520 may or may not include an explicit second keyword 509'. For example, the voice sampling module 510 receives from the user, for example, "I am not referring to the book of the Romance of the Three Kingdoms" (case A), "I am not referring to the book of the Romance of the Three Kingdoms, I am referring to the TV series of the Romance of the Three Kingdoms" (case B), or "I mean the TV series of the Romance of the Three Kingdoms" (case C), and so on. The second keywords 509' in case A are, for example, "not", "Romance of the Three Kingdoms", and "book"; the second keywords 509' in case B are, for example, "not", "Romance of the Three Kingdoms", "book", "is", "Romance of the Three Kingdoms", and "TV drama"; and the second keywords 509' in case C are, for example, "is", "Romance of the Three Kingdoms", and "TV drama". For convenience of explanation, only cases A, B, and C are given above, but the embodiment is not limited thereto.

Next, the natural language understanding system 520 determines whether the intent data associated with the first return answer 511 is correct based on the second keyword 509' included in the second speech input 501'. That is, if the first return answer 511 is "the book of the Romance of the Three Kingdoms" and the second keywords 509' are "Romance of the Three Kingdoms" and "TV drama", the natural language understanding system 520 will judge that the intent data associated with the first return answer 511 (that is, that the user wants to see the "book" of the Romance of the Three Kingdoms) does not match the second keywords 509' of the user's second voice input 501' (that is, that the user wants to see the "TV drama" of the Romance of the Three Kingdoms), and therefore that the first return answer 511 is incorrect. Similarly, if the first return answer 511 is "the book of the Romance of the Three Kingdoms" and the second keywords 509' are "not", "Romance of the Three Kingdoms", and "book", the natural language understanding system 520 will likewise determine that the first return answer 511 is incorrect.
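As a rough illustration only, the check described here could be expressed as below; the negation words, the fixed set of intent labels, and the way intent data is attached to an answer are assumptions.

```python
NEGATION_WORDS = {"no", "not", "don't"}                       # assumed markers of a negative reply
INTENT_LABELS = {"book", "TV drama", "movie", "music"}        # assumed intent vocabulary

def first_answer_is_correct(first_answer, second_keywords):
    """Judge the first return answer 511 against the second keywords 509'.

    first_answer:    dict with an "intent" field, e.g. {"intent": "book", ...}
    second_keywords: keywords parsed from the second voice input 501'
    """
    negated = any(k in NEGATION_WORDS for k in second_keywords)
    mentions_other_intent = any(
        k in INTENT_LABELS and k != first_answer["intent"]
        for k in second_keywords
    )
    # Case A ("not ... book")   -> negated, so incorrect
    # Case C ("... TV drama")   -> a different intent is named, so incorrect
    # A plain confirmation keeps the first answer
    return not negated and not mentions_other_intent
```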

After the natural language understanding system 520 parses the second voice input 501' and determines that the previously output first voice response 507 is correct, the natural language understanding system 520 makes a response corresponding to the second voice input 501', as shown in step S610. For example, if the second voice input 501' from the user is "Yes, the book of the Romance of the Three Kingdoms", the natural language understanding system 520 may output a second voice response 507' of "I am opening the book of the Romance of the Three Kingdoms for you". Alternatively, the natural language understanding system 520 can load the book content of the Romance of the Three Kingdoms directly through the processing unit (not shown) while playing the second voice response 507'.

However, if after parsing the second voice input 501' the natural language understanding system 520 determines that the previously output first voice response 507 (i.e., the first return answer 511) is incorrect, the natural language understanding system 520 will, as shown in step S612, select from the candidate list a return answer other than the first return answer 511, and output the second voice response 507' according to the selection result. Here, if the second voice input 501' provided by the user does not contain an explicit second keyword 509' (such as the second voice input 501' of case A above), the natural language understanding system 520 can select the second-preferred return answer from the candidate list according to the priority order. Alternatively, if the second voice input 501' provided by the user contains an explicit second keyword 509' (such as the second voice inputs 501' of cases B and C above), the natural language understanding system 520 can directly select from the candidate list the return answer corresponding to the second keyword 509' indicated by the user.

On the other hand, if the second voice input 501' provided by the user contains an explicit second keyword 509' (such as the second voice inputs of cases B and C above) but the natural language understanding system 520 finds no return answer corresponding to the second keyword 509' in the candidate list, the natural language understanding system 520 outputs a third voice response, such as "No such item found" or "I don't know".

In order to enable those skilled in the art to further understand the method for correcting the voice response and the natural language dialogue system of the present embodiment, a detailed description will be given below.

First, it is assumed that the first voice input 501 received by the voice sampling module 510 is "I want to see the Romance of the Three Kingdoms" (step S602). The natural language understanding system 520 can then parse out "see" and "the Romance of the Three Kingdoms" as first keywords 509 and obtain a candidate list having a plurality of return answers, wherein each return answer has associated keywords and other data (the other data may be stored in the content column 306 of FIG. 3A/3B, or in parts of the value column 312 of each record 302) (step S604), as shown in Table 1 (assuming that the book, TV drama, music, and movie of the Romance of the Three Kingdoms each have only one piece of data).

Table I

Next, the natural language understanding system 520 selects the desired return answer from the candidate list. Assuming the natural language understanding system 520 selects, in priority order, return answer a of the candidate list as the first return answer 511, the natural language understanding system 520 outputs, for example, "Do you want to open the book of the Romance of the Three Kingdoms?" as the first voice response 507 (step S606).

At this time, if the second voice input 501' received by the voice sampling module 510 is "Yes" (step S608), the natural language understanding system 520 determines that the above return answer a is correct, and the natural language understanding system 520 will output another voice response of "Please wait" (i.e., the second voice response 507') and load the book content of the Romance of the Three Kingdoms through the processing unit (not shown) (step S610).

However, if the second voice input 501' received by the voice sampling module 510 is "I don't mean the book of the Romance of the Three Kingdoms" (step S608), the natural language understanding system 520 determines that the above return answer a is incorrect, and the natural language understanding system 520 will then select another return answer from the return answers b–e of the candidate list as the second return answer 511', for example return answer b, "Do you want to play the TV drama of the Romance of the Three Kingdoms?". If the user continues to answer "not the TV drama", the natural language understanding system 520 will select one of the return answers c–e to report back. In addition, if all the return answers a–e in the candidate list have been reported back to the user by the natural language understanding system 520 and none of them matches the user's voice input 501, the natural language understanding system 520 outputs a voice response 507 of "No data found" (step S612).

In another embodiment, in the above step S608, if the second voice input 501' received by the voice sampling module 510 from the user is "I mean the comics of the Romance of the Three Kingdoms", then, because there is no return answer about comics in the candidate list, the natural language understanding system 520 will directly output a second voice response 507' of "No data found".

Based on the above, the natural language understanding system 520 can output a corresponding first voice response 507 in accordance with the first voice input 501 from the user. When the first voice response 507 output by the natural language understanding system 520 does not meet the request information 503 of the user's first voice input 501, the natural language understanding system 520 can correct the first voice response 507 that was originally output and, according to the second voice input 501' subsequently provided by the user, further output a second voice response 507' corresponding to the user's first request information 503. In this way, if the user is dissatisfied with an answer provided by the natural language understanding system 520, the natural language understanding system 520 can automatically correct it and report a new voice response to the user, thereby enhancing the convenience of the conversation between the user and the natural language dialogue system 500.

It is worth mentioning that, in step S606 and step S612 of FIG. 6, the natural language understanding system 520 can also sort the return answers in the candidate list according to different methods of evaluating the priority order, select a return answer from the candidate list according to that priority order, and output the voice response corresponding to the selected answer.

For example, the natural language understanding system 520 can prioritize the first return answer 511 in the candidate list according to the usage habits of the general public (for example, when the preference column 318 and the aversion column 320 of FIG. 3B are each divided into two parts, one storing the user's personal preference and one storing the preference of the general public, the latter parts can be regarded as the public preference information): the more frequently an answer is used by the general public, the higher its priority. Taking the first keyword 509 "The Romance of the Three Kingdoms" as an example, assume that the natural language understanding system 520 finds return answers about the TV drama, the book, and the music of the Romance of the Three Kingdoms. If, when people refer to "The Romance of the Three Kingdoms", they usually mean the book (for example, 20 records), fewer people mean the TV drama (for example, 18 records), and even fewer mean the music (for example, 10 records), then, when the value stored in the heat column 316 of FIG. 3B represents the matching situation of all users, the heat column 316 value of the "book" record of "The Romance of the Three Kingdoms" will be the highest, and the natural language understanding system 520 will sort the answers about the "book", the "TV drama", and the "music" in that order of priority. That is to say, the natural language understanding system 520 preferentially selects "the book of the Romance of the Three Kingdoms" as the first return answer 511, and outputs the first voice response 507 according to the first return answer 511.

In addition, the natural language understanding system 520 can also determine the priority order of the return answers according to the user's own habits (for example, by referring to the two column parts concerning the user's personal preference when the preference column 318 and the aversion column 320 of FIG. 3B are each divided into a personal part and a public part). In particular, the natural language understanding system 520 can record, in a feature database (for example, as shown in FIG. 7A/7B), the voice inputs that have been received from the user (including the first voice input 501, the second voice input 501', or any other voice input entered by the user), wherein the feature database can be stored in a storage device such as a hard disk. The feature database can record information about the user's preferences, habits, and the like, such as the first keyword 509 parsed from the user's voice input 501 and the response record generated by the natural language understanding system 520. The storage and retrieval of user preference/habit data will be further explained later with reference to FIG. 7A/7B/8. Further, in an embodiment, when the value stored in the heat column 316 of FIG. 3B is related to the user's habits (e.g., the number of matches), the value of the heat column 316 can be used to determine the user's usage habits or the priority order. Therefore, when selecting a return answer, the natural language understanding system 520 can prioritize the answers according to the user habits and similar information recorded in the feature database 730, thereby outputting a voice response 507 that better matches the user's voice input 501. For example, in FIG. 3B the values stored in the heat columns 316 of records 8/9/10 are 2/5/8, which can represent that the "book", the "TV drama", and the "movie" of "The Romance of the Three Kingdoms" have been matched 2, 5, and 8 times respectively, so the return answer about the movie of "The Romance of the Three Kingdoms" will be given priority.
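A sketch of this habit-based prioritization, using the heat values of records 8–10 quoted above, might read as follows; the structure of the per-user feature records is an assumption.

```python
def rank_by_user_heat(answers, user_heat):
    """Order return answers by the per-user heat values kept in the feature database.

    answers:   candidate return answers, each tagged with the id of its record 302
    user_heat: mapping record id -> number of times this user matched that record
    """
    return sorted(answers, key=lambda a: user_heat.get(a["record_id"], 0), reverse=True)

# Hypothetical values matching the example: records 8, 9, 10 matched 2, 5 and 8 times.
user_heat = {8: 2, 9: 5, 10: 8}
answers = [
    {"record_id": 8, "intent": "book"},
    {"record_id": 9, "intent": "TV drama"},
    {"record_id": 10, "intent": "movie"},
]
# rank_by_user_heat(answers, user_heat)[0] -> the movie answer, as in the text.
```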

On the other hand, the natural language understanding system 520 can also select a return answer based on the user's own habits. For example, suppose that, when conversing with the natural language understanding system 520, a user often mentions "I want to read the book of the Romance of the Three Kingdoms", less often mentions "I want to watch the TV series of the Romance of the Three Kingdoms", and even less often mentions "I want to listen to the music of the Romance of the Three Kingdoms" (for example, the user dialogue database holds 20 records about "the book of the Romance of the Three Kingdoms" (for example, in the preference column 318 of record 8 of FIG. 3B), 8 records about "the TV series of the Romance of the Three Kingdoms" (for example, in the preference column 318 of record 9 of FIG. 3B), and 1 record about "the music of the Romance of the Three Kingdoms"); the priority order of the return answers in the candidate list will then be "the book of the Romance of the Three Kingdoms", "the TV series of the Romance of the Three Kingdoms", and "the music of the Romance of the Three Kingdoms". That is to say, when the first keyword 509 is "The Romance of the Three Kingdoms", the natural language understanding system 520 selects "the book of the Romance of the Three Kingdoms" as the first return answer 511, and outputs the first voice response 507 according to the first return answer 511.

It is worth mentioning that the natural language understanding system 520 can also determine the priority order of the return answers according to user preferences. Specifically, the user dialogue database can also record preference keywords that the user has expressed, such as "like", "idol", "disgust", or "hate". Thus, the natural language understanding system 520 can sort the return answers in the candidate list based on the number of times such keywords have been recorded. For example, if a return answer is associated with "like" more often, that return answer will be selected earlier; conversely, if a return answer is associated with "disgust" more often, it will be selected later.

For example, suppose that when conversing with the natural language understanding system 520, a user often mentions "I hate watching the TV series of the Romance of the Three Kingdoms", less often mentions "I hate listening to the music of the Romance of the Three Kingdoms", and even less often mentions "I hate reading the book of the Romance of the Three Kingdoms" (for example, the user dialogue database holds 20 records about "I hate watching the TV series of the Romance of the Three Kingdoms" (which can be recorded, for example, through the aversion column 320 of record 9 in FIG. 3B), 8 records about "I hate listening to the music of the Romance of the Three Kingdoms", and 1 record about "I hate reading the book of the Romance of the Three Kingdoms" (for example, recorded through the aversion column 320 of record 8 in FIG. 3B)); the priority order of the return answers in the candidate list will then run from least to most disliked, namely "the book of the Romance of the Three Kingdoms", "the music of the Romance of the Three Kingdoms", and "the TV series of the Romance of the Three Kingdoms". That is to say, when the first keyword 509 is "The Romance of the Three Kingdoms", the natural language understanding system 520 selects "the book of the Romance of the Three Kingdoms" as the first return answer 511, and outputs the first voice response 507 according to the first return answer 511. In one embodiment, an aversion column 320 may be added alongside the heat column 316 of FIG. 3B to record the user's degree of dislike. In another embodiment, when the user's expression of "disgust" toward a certain record is parsed, one (or another value) may be directly subtracted from the heat column 316 (or the preference column 318) of the corresponding record, so that the user's preference is recorded without adding a column. Various embodiments for recording user preferences may be applied to the embodiments of the present invention, and the present invention is not limited thereto. Other examples of recording and applying user habit information, and of providing return answers according to the user's or the public's usage habits and preferences, are explained in more detail later with reference to FIG. 7A/7B/8.
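The like/dislike adjustments discussed here could be folded into a single score per record, as in the sketch below; the keyword sets and the equal weighting of heat, preference, and aversion are assumptions (it re-uses the assumed Record fields from the earlier sketch).

```python
LIKE_WORDS = {"like", "idol"}         # assumed preference keywords
DISLIKE_WORDS = {"disgust", "hate"}   # assumed aversion keywords

def update_preference(record, keywords, step=1):
    """Adjust a record's preference/aversion counts from parsed preference keywords."""
    if any(k in LIKE_WORDS for k in keywords):
        record.preference += step
    if any(k in DISLIKE_WORDS for k in keywords):
        record.aversion += step       # or, alternatively, subtract from record.heat

def preference_score(record):
    """Score used to sort return answers: liked records rise, disliked records sink."""
    return record.heat + record.preference - record.aversion
```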

On the other hand, the natural language understanding system 520 can also determine the priority order of at least one return answer according to voice input that the user provided before the first voice input 501 (that is, before the natural language dialogue system 500 has provided any return answer for the first voice input 501, at which point the user does not yet know what kind of return answer will be given). That is, if a voice input (e.g., a fourth voice input) is received by the voice sampling module 510 earlier than the first voice input 501, the natural language understanding system 520 can also parse a fourth keyword from the fourth voice input, preferentially select from the candidate list a fourth return answer corresponding to the fourth keyword, and output a fourth voice response according to the fourth return answer.

For example, assume that the natural language understanding system 520 first receives the first voice input 501 "I want to watch a TV series", and a short time later (e.g., after a few seconds) receives the fourth voice input "Play the Romance of the Three Kingdoms for me". At this time, the natural language understanding system 520 can recognize the first keyword 509 "TV series" in the first voice input 501, and then recognize "The Romance of the Three Kingdoms" in the fourth keyword. Therefore, the natural language understanding system 520 selects from the candidate list the return answer relating to both "The Romance of the Three Kingdoms" and "TV series", and outputs a fourth voice response to the user based on that fourth return answer.
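A minimal sketch of combining keywords across consecutive inputs is given below; keeping a short sliding window of recent keywords is an assumption about how this conversational context might be held.

```python
from collections import deque

class DialogueContext:
    """Keep keywords from the last few voice inputs so a later request can reuse them."""

    def __init__(self, window=3):
        self.recent = deque(maxlen=window)

    def add_turn(self, keywords):
        self.recent.append(set(keywords))

    def combined_keywords(self, current_keywords):
        """Merge the current keywords with those of earlier turns."""
        merged = set(current_keywords)
        for turn in self.recent:
            merged |= turn
        return merged

ctx = DialogueContext()
ctx.add_turn(["TV series"])                                        # earlier input: "I want to watch a TV series"
query = ctx.combined_keywords(["Romance of the Three Kingdoms"])   # later input: "Play the Romance of the Three Kingdoms for me"
# query now contains both keywords, so the TV-series answer is selected, as in the example.
```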

Based on the above, the natural language understanding system 520 can, according to the voice input from the user together with information such as the usage habits of the public, the user's preferences, the user's own habits, or the user's preceding and following utterances, output a voice response that better matches the request information of the voice input. The natural language understanding system 520 can prioritize the return answers in the candidate list according to different sorting methods, such as the usage habits of the public, the user's preferences, the user's habits, or the context of the user's conversation. Thereby, even if the voice input from the user is less clear, the natural language understanding system 520 can determine the intent of the user's voice input 501 (e.g., the attributes or knowledge domain of the keywords 509 in the first voice input 501) by taking these factors into account. In other words, if a return answer is close to what the user or the public has expressed or intended, the natural language understanding system 520 will select that return answer preferentially. In this way, the voice response output by the natural language dialogue system 500 can better match the user's request information.

In summary, in the method for correcting a voice response and the natural language dialogue system of the present embodiment, the natural language dialogue system can output a corresponding first voice response 507 according to the first voice input 501 from the user. When the first voice response 507 output by the natural language dialogue system does not match the first request information 503 or the first keyword 509 of the user's first voice input 501, the natural language dialogue system can correct the originally output first voice response 507 and, according to the second voice input 501' provided by the user, further select a second voice response 507' that better matches the user's needs. In addition, the natural language dialogue system can also preferentially select a more appropriate reward answer according to the crowd's usage habits, the user's preferences, the user's habits, or the user's utterances, and output a corresponding voice response to the user. In this way, if the user is dissatisfied with the answer provided by the natural language dialogue system, the natural language dialogue system can automatically correct itself according to the user's request information each time and report a new voice response to the user, thereby enhancing the user's convenience when talking to the natural language dialogue system.

Next, examples are described in which the architecture and components of the natural language understanding system 100 and the structured database 220 are applied to provide responses and reward answers according to the context of the conversation with the user, as well as user habits, crowd usage habits, and user preferences.

FIG. 7A is a block diagram of a natural language dialogue system according to an embodiment of the invention. Referring to FIG. 7A, the natural language dialogue system 700 includes a voice sampling module 710, a natural language understanding system 720, a feature database 730, and a speech synthesis database 740. The voice sampling module 710 in FIG. 7A is the same as the voice sampling module 510 of FIG. 5A, and the natural language understanding system 720 is the same as the natural language understanding system 520, so they perform the same functions. In addition, when the natural language understanding system 720 analyzes the request information 703, the user's intent can also be obtained by performing a full-text search on the structured database 220 of FIG. 1; this part of the technique has been described above with reference to FIG. 1 and the related description and is not repeated here. The feature database 730 is used to store the user preference data 715 sent by the natural language understanding system 720, or to provide the user preference record 717 to the natural language understanding system 720, which will be described in more detail later. The speech synthesis database 740 is equivalent to the speech synthesis database 530 and is used to provide the voice output to the user. In this embodiment, the voice sampling module 710 is configured to receive the voice input 701 (i.e., the first/second voice input 501/501' of FIG. 5A/5B, which is the voice from the user), and the natural language understanding system 720 parses the request information 703 in the voice input (i.e., the first/second request information 503/503' of FIG. 5A/5B) and outputs the corresponding voice response 707 (i.e., the first/second voice response 507/507' of FIG. 5A/5B). The components of the aforementioned natural language dialogue system 700 may be configured in the same machine, and the present invention is not limited in this regard.

The natural language understanding system 720 receives the request information 703 obtained after the voice sampling module 710 parses the voice input 701. The natural language understanding system 720 generates a candidate list including at least one reward answer according to one or more keywords 709 in the voice input 701, then selects from the candidate list one answer corresponding to the keywords 709 as the reward answer 711, queries the speech synthesis database 740 to find the speech 713 corresponding to the reward answer 711, and finally outputs the voice response 707 according to the speech 713. In addition, the natural language understanding system 720 of the present embodiment may be implemented by a hardware circuit composed of one or several logic gates, or implemented by computer program code, which is merely an example and is not a limitation.
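
As a rough illustration of this flow, the following sketch strings the steps together with small in-memory stand-ins for the structured database 220 and the speech synthesis database 740; all function names and data layouts are assumptions for illustration, not the actual modules of FIG. 7A.

```python
structured_db = [
    {"title": "Romance of the Three Kingdoms", "category": "book"},
    {"title": "Romance of the Three Kingdoms", "category": "TV series"},
    {"title": "Forget the Water", "category": "music", "singer": "Andy Lau"},
]
speech_synthesis_db = {          # reward answer category -> speech 713
    "book": "Playing the book of the Romance of the Three Kingdoms",
    "TV series": "Playing the TV series of the Romance of the Three Kingdoms",
    "music": "Playing Forget the Water",
}

def parse_keywords(text):
    # Stand-in for speech recognition plus natural language processing (keywords 709).
    vocabulary = ["Romance of the Three Kingdoms", "Forget the Water", "book", "TV series"]
    return [w for w in vocabulary if w in text]

def build_candidate_list(keywords):
    # Candidate list: every record matching at least one keyword.
    return [r for r in structured_db if any(k in r.values() for k in keywords)]

def handle_voice_input(text):
    candidates = build_candidate_list(parse_keywords(text))   # candidate list
    answer = candidates[0] if candidates else None             # reward answer 711 (no ranking here)
    return speech_synthesis_db[answer["category"]] if answer else "Could you repeat that?"

print(handle_voice_input("I want to see the book of the Romance of the Three Kingdoms"))
```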

FIG. 7B is a block diagram of a natural language dialogue system 700' in accordance with another embodiment of the present invention. The natural language understanding system 720' of FIG. 7B may include a speech recognition module 722 and a natural language processing module 724, and the voice sampling module 710 may be combined with the speech synthesis module 726 in a speech synthesis processing module 702. The speech recognition module 722 receives the request information 703 sent from the voice sampling module 710 in order to parse the voice input 701 and converts it into one or more keywords 709. The natural language processing module 724 further processes the keywords 709 to obtain at least one candidate list, and selects from the candidate list the answer that best matches the voice input 701 as the reward answer 711. Since the reward answer 711 is an internal analysis result of the natural language understanding system 720', it must be converted into text or voice before it can be output to the user. The speech synthesis module 726 therefore queries the speech synthesis database 740 according to the reward answer 711; the speech synthesis database 740 records, for example, text and its corresponding voice information, so that the speech synthesis module 726 can find the speech 713 corresponding to the reward answer 711 and thereby synthesize the voice response 707. Thereafter, the speech synthesis module 726 can output the synthesized speech through a voice output interface (not shown), where the voice output interface is, for example, a device such as a speaker or a headset, to output the voice to the user. It should be noted that in FIG. 7A the natural language understanding system 720 incorporates the speech synthesis module 726 (as in the architecture of FIG. 5B, although the speech synthesis module 726 is not shown in FIG. 7A), and the speech synthesis module uses the reward answer 711 to query the speech synthesis database 740 to obtain the speech 713 as a basis for synthesizing the voice response 707.

In this embodiment, the speech recognition module 722, the natural language processing module 724, and the speech synthesis module 726 in the natural language understanding system 720' are respectively equivalent to the speech recognition module 522, the natural language processing module 524, and the speech synthesis module 526 of FIG. 5B, and provide the same functionality. In addition, the speech recognition module 722, the natural language processing module 724, and the speech synthesis module 726 can be disposed in the same machine as the voice sampling module 710. In other embodiments, the speech recognition module 722, the natural language processing module 724, and the speech synthesis module 726 can also be distributed among different machines (e.g., computer systems, servers, or the like). For example, in the natural language understanding system 720' shown in FIG. 7B, the speech synthesis module 726 can be disposed in the same machine 702 as the voice sampling module 710, while the speech recognition module 722 and the natural language processing module 724 can be configured on another machine. It should be noted that in the architecture of FIG. 7B, since the speech synthesis module 726 and the voice sampling module 710 are disposed in the machine 702, the natural language understanding system 720' needs to transmit the reward answer 711 to the machine 702, and the speech synthesis module 726 sends the reward answer 711 to the speech synthesis database 740 to find the corresponding speech 713 as a basis for generating the voice response 707. In addition, when the speech synthesis module 726 calls the speech synthesis database 740 according to the reward answer 711, it may be necessary to first convert the reward answer 711 and then make the call through the interface specified by the speech synthesis database 740; as this belongs to techniques well known to those skilled in the art, it is not described in detail herein.

The natural language dialogue method will be described below in conjunction with the natural language dialogue system 700 of FIG. 7A. FIG. 8A is a flowchart of a natural language dialogue method according to an embodiment of the invention. For convenience of explanation, only the natural language dialogue system 700 of FIG. 7A is taken as an example, but the natural language dialogue method of the present embodiment can also be applied to the natural language dialogue system 700' of FIG. 7B. Whereas FIG. 5/6 deal with automatically correcting the output voice response according to the user's subsequent voice input, FIG. 7A/7B/8 deal with selecting the reward answer 711 from the candidate list according to the user preference data 715 recorded in the feature database 730 and playing the corresponding voice to the user. In fact, the embodiments of FIG. 5/6 and FIG. 7A/7B/8 may be implemented alternatively or may coexist, and the invention is not limited thereto.

Referring to FIG. 7A and FIG. 8 simultaneously, in step S810, the voice sampling module 710 receives the voice input 701. The voice input 701 is, for example, a voice from a user and may carry the user's request information 703. Specifically, the voice input 701 from the user may be a query sentence, a command sentence, or other request information, such as the aforementioned examples "I want to see the Romance of the Three Kingdoms", "I want to listen to Forget the Water", or "What is the temperature today". It should be noted that steps S802-S806 are the flow in which the natural language dialogue system 700 stores the user preference data 715 for the user's previous voice inputs, and the subsequent steps S810-S840 operate based on the user preference data 715 previously stored in the feature database 730. The details of steps S802-S806 will be described later, and the operations of steps S820-S840 are described below.

In step S820, the natural language understanding system 720 parses at least one keyword 709 included in the voice input 701 to obtain a candidate list, wherein the candidate list has one or more reward answers. In detail, the natural language understanding system 720 parses the voice input 701 and obtains one or more keywords 709 of the voice input 701. For example, when the user's voice input 701 is "I want to see the Romance of the Three Kingdoms", the keywords 709 obtained by the natural language understanding system 720 after analysis are, for example, "Romance of the Three Kingdoms" and "see" (as mentioned before, it can also be analyzed whether the user wants to see a book, a TV series, or a movie). For another example, when the user's voice input 701 is "I want to listen to the song Forget the Water", the keywords 709 obtained after analysis are, for example, "Forget the Water", "listen", and "song" (as mentioned earlier, it can be further analyzed whether the user wants to hear the version sung by Andy Lau or by Li Yijun). After that, the natural language understanding system 720 can perform a full-text search on the structured database according to the above keywords 709 and obtain at least one search result (which may be at least one of the records in FIG. 3A/3B) as a reward answer in the candidate list. Since a keyword 709 may belong to different knowledge fields (such as movies, books, music, games, etc.), and can be further divided into multiple categories within the same knowledge field (for example, different authors of the same movie or book title, different singers of the same song title, different versions of the same game title, etc.), for a given keyword 709 the natural language understanding system 720 can find by analysis (e.g., a full-text search of the structured database 220) one or more search results related to this keyword 709, each of which includes the keyword 709 and data other than the keyword 709 (the contents of the other data are shown in Table 1). Therefore, from another point of view, when the voice input 701 input by the user has multiple keywords 709, the user's request information 703 is clearer, so the natural language understanding system 720 can find search results closer to the request information 703 (because if the natural language understanding system 720 can find an exact match, it should be the option that the user wants).
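
A minimal sketch of this step, assuming a simple in-memory stand-in for the structured database 220, is shown below; it illustrates how a single keyword such as "Forget the Water" can match records in several categories, all of which become reward answers in the candidate list, and how additional keywords narrow the list. The field names are hypothetical.

```python
structured_db_220 = [
    {"title": "Forget the Water", "category": "music", "singer": "Andy Lau"},
    {"title": "Forget the Water", "category": "music", "singer": "Li Yijun"},
    {"title": "Romance of the Three Kingdoms", "category": "book"},
    {"title": "Romance of the Three Kingdoms", "category": "TV series"},
]

def full_text_search(keywords):
    """Return every record whose fields contain all of the given keywords 709."""
    hits = []
    for record in structured_db_220:
        text = " ".join(str(v) for v in record.values())
        if all(k in text for k in keywords):
            hits.append(record)
    return hits          # each hit becomes a reward answer in the candidate list

# A vaguer request matches several categories; more keywords narrow the list.
print(full_text_search(["Forget the Water"]))               # two singers match
print(full_text_search(["Forget the Water", "Andy Lau"]))   # exactly one match
```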

For example, when the keyword 709 is "Romance of the Three Kingdoms", the search results analyzed by the natural language understanding system 720 are, for example, the records "... "Romance of the Three Kingdoms"... "TV series"..." and "... "Romance of the Three Kingdoms"... "books"..." (where "TV series" and "books" reflect the user's possible intentions in the results). For another example, when the keywords 709 are "Forget the Water" and "music", the search results analyzed by the natural language understanding system 720 may be the records "... "Forget the Water"... "music"... "Andy Lau"..." and "... "Forget the Water"... "music"... "Li Yijun"...", where "Andy Lau" and "Li Yijun" indicate the user's possible intentions. In other words, after the natural language understanding system 720 performs a full-text search on the structured database 220, each search result may include a keyword 709 and other data related to the keyword 709 (as shown in Table 1), and the natural language understanding system 720 converts the analyzed search results into a candidate list containing at least one reward answer for use in the subsequent steps.

In step S830, the natural language understanding system 720 selects the reward answer 711 from the candidate list according to the user preference record 717 sent by the feature database 730 (which is obtained, for example, from the user preference data 715 previously stored therein, as will be described later). In the present embodiment, the natural language understanding system 720 can select the reward answer 711 from the candidate list according to a priority order (the determination of the priority order is described below). Then, in step S840, the voice response 707 is output based on the reward answer 711.

For example, in an embodiment, the priority order may be determined by the number of search results. For example, when the keyword 709 is "Romance of the Three Kingdoms", assume that after analysis the natural language understanding system 720 finds in the structured database 220 that the records of "... "Romance of the Three Kingdoms"... "books"..." are the most numerous, followed by "... "Romance of the Three Kingdoms"... "music"...", while "... "Romance of the Three Kingdoms"... "TV series"..." has the fewest records. The natural language understanding system 720 then treats the records related to the books of the Romance of the Three Kingdoms as the first-priority reward answers (for example, sorting all the records about the books of the Romance of the Three Kingdoms into one candidate list, which can be ordered by the value of the heat column 316), the records related to the music of the Romance of the Three Kingdoms as the second-priority reward answers, and the records related to the TV series of the Romance of the Three Kingdoms as the third-priority reward answers. It should be noted that in addition to the number of search results, the priority order may also be based on user preferences, user habits, or crowd usage habits, and the related description is detailed later.
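
The following sketch illustrates this prioritization under the same assumptions: search results are grouped by category, categories with more matching records come first, and records within a category are ordered by the heat column 316. The field names and numeric values are illustrative only.

```python
from collections import defaultdict

results = [
    {"title": "Romance of the Three Kingdoms", "category": "book",      "heat": 7},
    {"title": "Romance of the Three Kingdoms", "category": "book",      "heat": 12},
    {"title": "Romance of the Three Kingdoms", "category": "book",      "heat": 3},
    {"title": "Romance of the Three Kingdoms", "category": "music",     "heat": 9},
    {"title": "Romance of the Three Kingdoms", "category": "music",     "heat": 1},
    {"title": "Romance of the Three Kingdoms", "category": "TV series", "heat": 20},
]

by_category = defaultdict(list)
for r in results:
    by_category[r["category"]].append(r)

# Categories with more matching records come first; within a category,
# records with a higher heat value come first.
ranked_categories = sorted(by_category, key=lambda c: len(by_category[c]), reverse=True)
candidate_list = [rec
                  for c in ranked_categories
                  for rec in sorted(by_category[c], key=lambda r: r["heat"], reverse=True)]

print([(r["category"], r["heat"]) for r in candidate_list])
# book (3 records) first, then music (2 records), then TV series (1 record)
```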

In order to enable those skilled in the art to further understand the natural language dialogue method and the natural language dialogue system of the present embodiment, an embodiment will be described in detail below.

First, assume that the voice input 701 received by the voice sampling module 710 is "I want to see the Romance of the Three Kingdoms" (step S810). The natural language understanding system 720 then parses the keywords 709 as "see" and "Romance of the Three Kingdoms", and obtains a candidate list having multiple reward answers, wherein each reward answer contains the related keywords and other data (step S820), as shown in Table 1 above.

Next, the natural language understanding system 720 selects the reward answer from the candidate list. Assuming that the natural language understanding system 720 selects reward answer A in the candidate list (refer to Table 1) as the reward answer 711, the natural language understanding system 720 outputs, for example, "Do you want to play the book of the Romance of the Three Kingdoms?" as the voice response 707 (steps S830 to S840).

As described above, the natural language understanding system 720 can also sort the reward answers in the candidate list according to different methods of evaluating the priority order, and accordingly output the voice response 707 corresponding to the reward answer 711. For example, the natural language understanding system 720 can determine user preferences based on multiple conversation records with the user (e.g., from the user's positive/negative terms mentioned above), and can use the user preference record 717 to determine the priority order of the reward answer 711. Before explaining the use of the user's positive/negative terms, the manner in which the user preference data 715 stores the user's or the crowd's likes/dislikes or habits is first described.

The manner in which the user preference data 715 is stored according to steps S802-S806 is now described. In an embodiment, before the voice input 701 is received in step S810, a plurality of voice inputs, that is, the previous history of conversation records, may be received in step S802, and the user preference data 715 is captured based on these previous voice inputs (step S804) and then stored in the feature database 730. In fact, the user preference data 715 can also be stored in the structured database 220 (that is, the feature database 730 is incorporated into the structured database 220). For example, in an embodiment, the heat column 316 of FIG. 3B can be directly used to record the user's preferences; the recording manner of the heat column 316 has been mentioned before (for example, when a certain record 302 is matched, its heat column is incremented by one) and is not repeated here. Of course, the user preference data 715 can also be stored in columns of the structured database 220, for example by combining the keyword (such as "Romance of the Three Kingdoms") with the user's expressed preference (for example, when the user uses positive terms such as "like" or negative terms such as "disgust", the values of the preference column 318 and the disgust column 320 of FIG. 3B can be incremented by one, respectively), so that the number of likes (for example, the count of positive terms) can then be calculated. Therefore, when the natural language understanding system 720 queries the structured database 220 for the user preference record 717, the values of the preference column 318 and the disgust column 320 can be queried directly (i.e., the counts of positive and negative terms can be queried) to judge the user's preference (that is, the statistical values of the positive and negative terms are transmitted to the natural language understanding system 720 as the user preference record 717).

The case where the user preference data 715 is stored in the feature database 730 (i.e., the feature database 730 is not incorporated into the structured database 220) is described below. In an embodiment, the user preference data 715 may be stored in a manner that records the user's degree of liking for a keyword. For example, the preference column 852 and the disgust column 862 of FIG. 8B may be used directly to record the user's personal likes and dislikes for a certain keyword, while the preference column 854 and the disgust column 864 record the crowd's likes and dislikes for that keyword. For example, in FIG. 8B, the values of the preference column 852 and the disgust column 862 corresponding to the keywords "Romance of the Three Kingdoms" and "books" stored in record 832 are 20 and 1, respectively; record 834 stores the corresponding values for the keywords "Romance of the Three Kingdoms" and "TV series"; and the values of the preference column 852 and the disgust column 862 corresponding to the keywords "Romance of the Three Kingdoms" and "music" stored in record 836 are 1 and 8, respectively. These values represent the user's personal like and dislike data for the relevant keywords (for example, the higher the value of the preference column 852, the stronger the liking, and the higher the value of the disgust column 862, the stronger the dislike). In addition, the values of the preference column 854 and the disgust column 864 corresponding to record 832 are 5 and 3, respectively, those corresponding to record 834 are 80 and 20, respectively, and those corresponding to record 836 are 2 and 10, respectively, which represent the crowd's like and dislike data for the relevant keywords. When the user expresses a like or a dislike, the keywords together with a "favorite indication" are used to update the values of the preference column 852 and the disgust column 862 accordingly. Therefore, if the user inputs the voice "I want to watch the TV series of the Romance of the Three Kingdoms", the natural language understanding system 720 can combine the keywords "Romance of the Three Kingdoms" and "TV series" with a favorite indication of incrementing the preference column, and send them as the user preference data 715 to the feature database 730, so that the feature database 730 increments the value of the preference column 852 of record 834 (because the user wants to see the TV series of the Romance of the Three Kingdoms, indicating an increase in his preference for it). According to the above manner of recording user preference data, when the user later inputs a related keyword, for example "I want to see the Romance of the Three Kingdoms", the natural language understanding system 720 can, based on the keyword "Romance of the Three Kingdoms", query the feature database 730 of FIG. 8B for the three records 832/834/836 related to the "Romance of the Three Kingdoms", and the feature database 730 can return the values of the preference column 852 and the disgust column 862 as the user preference record 717 to the natural language understanding system 720, which can then determine the user's personal preference based on the user preference record 717.
Of course, the feature database 730 can also return the values of the preference column 854 and the disgust column 864 as the user preference record 717 to the natural language understanding system 720, in which case the user preference record 717 serves as a basis for judging the crowd's preference. In other words, the user preference record 717 may represent either the individual user's or the crowd's preferences, and the present invention is not limited thereto.
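
The following sketch shows one possible in-memory layout for the FIG. 8B records and the update performed when keywords arrive together with a "favorite indication". The column values follow the example where the description gives them; the remaining value, the dictionary layout, and the helper names are assumptions for illustration only.

```python
# Feature database 730; each record mirrors FIG. 8B (keywords plus the
# personal columns 852/862 and the group columns 854/864). The personal
# preference value of record 834 is a placeholder, not given in the text.
feature_db_730 = [
    {"keywords": ("Romance of the Three Kingdoms", "book"),
     "like_852": 20, "dislike_862": 1,  "like_854": 5,  "dislike_864": 3},   # record 832
    {"keywords": ("Romance of the Three Kingdoms", "TV series"),
     "like_852": 0,  "dislike_862": 20, "like_854": 80, "dislike_864": 20},  # record 834
    {"keywords": ("Romance of the Three Kingdoms", "music"),
     "like_852": 1,  "dislike_862": 8,  "like_854": 2,  "dislike_864": 10},  # record 836
]

def store_preference(keywords, indication):
    """Apply user preference data 715: the keywords plus a favorite indication."""
    for record in feature_db_730:
        if record["keywords"] == tuple(keywords):
            record[indication] += 1
            return record
    # No matching record yet: create one, like record 838 in the description.
    record = dict.fromkeys(("like_852", "dislike_862", "like_854", "dislike_864"), 0)
    record["keywords"] = tuple(keywords)
    record[indication] += 1
    feature_db_730.append(record)
    return record

def user_preference_record(keyword):
    """Return the user preference record 717 for every record mentioning the keyword."""
    return [r for r in feature_db_730 if keyword in r["keywords"]]

store_preference(["Romance of the Three Kingdoms", "TV series"], "like_852")
print(user_preference_record("Romance of the Three Kingdoms"))
```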

In another embodiment, the values of the preference column 852 and the disgust column 862 can also serve as a basis for judging the user's or the crowd's habits. For example, after receiving the user preference record 717, the natural language understanding system 720 can compare the values of the preference column 852/854 and the disgust column 862/864; if the two values differ by more than a certain threshold, it indicates that the user habitually converses in a specific way. For example, when the value of the preference column 852 is more than 10 times the value of the disgust column 862, it indicates that the user particularly likes to use positive terms in conversation (this is one way of recording a "user habit"); therefore, in this case the natural language understanding system 720 can select the reward answer based on the preference column 852 only. When the natural language understanding system 720 uses the values of the preference column 854/disgust column 864 stored in the feature database 730, the judgment is made over all users' preference records in the feature database 730, and the result can be used as reference data for the crowd's habits. It should be noted that the user preference record 717 returned to the natural language understanding system 720 by the feature database 730 can include both the user's personal preference record (e.g., the values of the preference column 852/disgust column 862) and the crowd's preference record (e.g., the values of the preference column 854/disgust column 864), and the present invention is not limited thereto.
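
A minimal sketch of this habit check, assuming the 10x factor mentioned above as the threshold, might look as follows; the function name and the zero-value guard are illustrative assumptions.

```python
THRESHOLD_FACTOR = 10   # the 10x factor from the example above

def habitually_positive(like_852, dislike_862):
    """True if positive terms dominate enough to be treated as the user's habit."""
    return like_852 > THRESHOLD_FACTOR * max(dislike_862, 1)   # guard against a zero dislike count

print(habitually_positive(like_852=50, dislike_862=3))   # True  -> pick answers from the preference column
print(habitually_positive(like_852=8,  dislike_862=5))   # False -> consider both columns
```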

As for storing the user preference data 715 obtained from the current voice input, whenever the candidate list is generated in step S820 (whether by an exact match or a partial match), the natural language dialogue system 700 may store the user preference data 715 obtained from the user's voice input. For example, in step S820, whenever a keyword produces a matching result in the structured database 220, it can be determined that the user has a preference for that matching result, so the keywords and the "favorite indication" can be sent to the feature database 730, the corresponding record found therein, and the corresponding preference column 852/854 or disgust column 862/864 updated (for example, when the user inputs "I want to read the books of the Romance of the Three Kingdoms", the value of the preference column 852/854 of record 832 of FIG. 8B may be incremented by one). In another embodiment, the natural language dialogue system 700 may also store the user preference data 715 after the user selects a reward answer in step S830. In addition, if no corresponding keyword is found in the feature database 730, a new record can be created to store the user preference data 715. For example, when the user inputs the voice "I want to listen to Andy Lau's Forget the Water" and the keywords "Andy Lau" and "Forget the Water" are generated, if no corresponding record is found in the feature database 730 at the time of storing, the feature database 730 creates a new record 838 and increments the value of its corresponding preference column 852/854. The above timing and manner of storing the user preference data 715 are for illustrative purposes only; those skilled in the art can modify the embodiments shown in the present invention according to the actual application, and all equivalent modifications that do not deviate from the spirit of the present invention are included in the scope of the patent application of the present invention.

Further, although the format in which the records 832-838 are stored in the feature database 730 shown in FIG. 8B differs from the record format of the structured database 220 (for example, as shown in FIGS. 3A/3B/3C), the present invention does not limit the storage format of the records. Furthermore, although the above embodiment only describes the storage and use of the preference column 852/854 and the disgust column 862/864, in another embodiment additional columns 872/874 may be opened in the feature database 730 to store other habits of the user or the crowd, for example, data such as the number of times the corresponding record has been downloaded, cited, recommended, commented on, or referred. In another embodiment, the number or data of these downloads, citations, recommendations, comments, or referrals may also be stored in the preference column 852/854 and the disgust column 862/864; for example, each time the user provides a good comment or a referral for a record, the value of the preference column 852/854 may be incremented by one, and if the user provides a bad comment on a record, the value of the disgust column 862/864 may be incremented by one. The number of columns in a record and the meaning of their values are not limited. It should be noted that those skilled in the art will understand that, since the preference column 852, the disgust column 862, the column 872, etc. in FIG. 8B relate only to the user's personal selections and preferences, these personal like/dislike data can be stored in the user's mobile communication device, while the data associated with all users (or at least a specific group of users), such as the preference column 854, the disgust column 864, the column 874, etc., can be stored in the server; this saves storage space on the server and also preserves the privacy of the user's personal preferences.

The actual use of the user preference record is further explained below using FIG. 7A and FIG. 8B. Suppose that, over the conversation content of a plurality of voice inputs 701, the user often mentions "I hate watching the TV series of the Romance of the Three Kingdoms" when conversing with the natural language understanding system 720, less often mentions "I hate listening to the music of the Romance of the Three Kingdoms", and rarely mentions "I hate reading the books of the Romance of the Three Kingdoms" (for example, the feature database 730 records 20 instances of "I hate watching the TV series of the Romance of the Three Kingdoms" (that is, in record 834 of FIG. 8B, the count of negative terms for "Romance of the Three Kingdoms" plus "TV series" is 20), 8 instances of "I hate listening to the music of the Romance of the Three Kingdoms" (that is, in record 836 of FIG. 8B, the count of negative terms for "Romance of the Three Kingdoms" plus "music" is 8), and 1 instance of "I hate reading the books of the Romance of the Three Kingdoms" (that is, in record 832 of FIG. 8B, the count of negative terms for "Romance of the Three Kingdoms" plus "books" is 1)). Because the user preference record 717 returned from the feature database 730 contains these three negative-term counts (i.e., 20, 8, 1), the natural language understanding system 720 orders the priority of the reward answers 711 in the candidate list as "the books of the Romance of the Three Kingdoms", "the music of the Romance of the Three Kingdoms", and "the TV series of the Romance of the Three Kingdoms". That is to say, when the keyword 709 is "Romance of the Three Kingdoms", the natural language understanding system 720 selects the book of the "Romance of the Three Kingdoms" as the reward answer 711, and outputs the voice response 707 based on the reward answer 711. It should be noted that although the above uses the statistics of the negative terms used by the user to determine the priority order, in another embodiment the statistics of the positive terms used by the user may also be used alone to determine the priority order (for example, as previously mentioned, when the value of the preference column 852 exceeds that of the disgust column 862 by a certain threshold).

It is worth mentioning that the natural language understanding system 720 can also determine the priority order of the reward answers according to the number of positive and negative terms used by the user. Specifically, the feature database 730 can also record terms that the user has expressed, such as "like" and "idol" (positive terms), or "disgust" and "hate" (negative terms). Therefore, in addition to comparing the number of times the user uses "like" and "disgust", the natural language understanding system 720 can also sort the reward answers in the candidate list directly according to the number of positive/negative terms corresponding to the keyword (that is, comparing which answers are cited more often with positive or negative terms). For example, if a reward answer is more often associated with "like" (that is, the count of positive terms is larger, or the value of the preference column 852 is larger), that reward answer will be selected earlier; conversely, if a reward answer is more often associated with "disgust" (that is, the count of negative terms is larger, or the value of the disgust column 862 is larger), it will be selected later. The natural language understanding system 720 can thus sort all the reward answers into a candidate list according to this priority order. Since some users may prefer to use positive terms (for example, the value of the preference column 852 is particularly large), while other users are accustomed to using negative terms (for example, the value of the disgust column 862 is particularly large), in the above embodiment the user preference record 717 reflects the usage habits of individual users and can therefore provide options that better match the user's habits.

In addition, the natural language understanding system 720 can also prioritize the reward answers 711 in the candidate list according to the crowd's usage habits, where the answers more frequently requested by the crowd are given higher priority (this can be recorded, for example, using the heat column 316 of FIG. 3C, or the preference column 854 and the disgust column 864 of FIG. 8B). For example, when the keyword 709 is "Romance of the Three Kingdoms", assume that the reward answers found by the natural language understanding system 720 include, for example, the TV series of the Romance of the Three Kingdoms, the books of the Romance of the Three Kingdoms, and the music of the Romance of the Three Kingdoms. If, when people mention the "Romance of the Three Kingdoms", they usually mean the TV series, fewer people mean the movie, and even fewer mean the books (for example, when the values of the preference column 854 of the related records in FIG. 8B are 80, 40, and 5, respectively), the natural language understanding system 720 sorts the reward answers 711 in the priority order "TV series", "movie", "books". In other words, the natural language understanding system 720 will give priority to the TV series of the Romance of the Three Kingdoms as the reward answer 711, and output the voice response 707 according to this reward answer 711. As for the above "giving priority to frequently used answers", the heat column 316 of FIG. 3C (or the preference column 854 and the disgust column 864 of FIG. 8B) can be used for recording; the recording manner has already been disclosed in the relevant paragraphs regarding FIG. 3C (and FIG. 8B) above and is not repeated here.
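
A minimal sketch of ordering by the crowd's usage habits, assuming the group preference column 854 values 80, 40, and 5 from the example above, is shown below; the field names are hypothetical.

```python
candidate_list = [
    {"title": "Romance of the Three Kingdoms", "category": "book",      "group_like_854": 5},
    {"title": "Romance of the Three Kingdoms", "category": "TV series", "group_like_854": 80},
    {"title": "Romance of the Three Kingdoms", "category": "movie",     "group_like_854": 40},
]

# Answers the crowd asks for most often come first.
ranked = sorted(candidate_list, key=lambda r: r["group_like_854"], reverse=True)
print([r["category"] for r in ranked])   # ['TV series', 'movie', 'book']
```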

In addition, the natural language understanding system 720 can also determine the priority order of the reward answer 711 according to the user's frequency of use. Specifically, since the natural language understanding system 720 can record in the feature database 730 the voice inputs 701 that have been received from the user, the feature database 730 can record the keywords 709 parsed by the natural language understanding system 720 from the user's voice inputs 701 as well as the response information, such as the reward answers 711, generated by the natural language understanding system 720. Therefore, when the natural language understanding system 720 later selects the reward answer 711, it can determine the priority order based on the response information recorded in the feature database 730 (for example, user preferences/dislikes/habits, or even the crowd's preferences/dislikes/habits), find a reward answer 711 that better matches the user's intention (as determined from the user's voice input), and obtain the corresponding voice response. As for the above "determining the priority order of the reward answer 711 according to the user's habits", the heat column 316 of FIG. 3C (or the preference column 852 and the disgust column 862 of FIG. 8B) can also be used for recording; the recording manner has already been disclosed in the relevant paragraphs regarding FIG. 3C (and FIG. 8B) above and is not repeated here.

In summary, the natural language understanding system 720 can store the user preference attributes described above (e.g., positive and negative terms), the user habits, and the crowd usage habits into the feature database 730 (step S806). That is, in steps S802, S804, and S806, the user preference data 715 is learned from the user's previous history of conversation records, and the collected user preference data 715 is added to the feature database 730; the user's or the crowd's preference data stored there is later read back as the user preference record 717. In addition, the user habits and the crowd usage habits are also stored in the feature database 730, so that the natural language understanding system 720 can utilize the rich information in the feature database 730 (e.g., obtained via the user preference record 717) to provide a response that more accurately matches the user's input.

Next, the details of step S830 are further described. After receiving the voice input in step S810 and parsing the keywords 709 of the voice input in step S820 to obtain the candidate list, the natural language understanding system 720 determines at least one priority order for the reward answers according to the user preference record 717, which reflects user preferences, user habits, or crowd usage habits (step S880). As mentioned above, the priority order can be based on the number of records found, or on the positive/negative terms of the user or of the crowd. Then, a reward answer 711 is selected from the candidate list according to the priority order (step S890); the selection of the reward answer may also be the closest match or the highest priority as described above. Thereafter, the voice response 707 is output based on the reward answer 711 (step S840).
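
One possible way to combine these signals in steps S880/S890 is sketched below; the scoring formula (personal counts first, crowd counts as a tie-breaker) is an illustrative assumption rather than the only ordering the description allows.

```python
def priority_key(answer, preference_record_717):
    p = preference_record_717.get(answer["category"],
                                  {"like": 0, "dislike": 0, "group_like": 0, "group_dislike": 0})
    return (p["like"] - p["dislike"],                # personal preference first (step S880)
            p["group_like"] - p["group_dislike"])    # crowd habit as a tie-breaker

def select_reward_answer(candidate_list, preference_record_717):
    ranked = sorted(candidate_list,
                    key=lambda a: priority_key(a, preference_record_717),
                    reverse=True)
    return ranked[0]                                  # reward answer 711 (step S890)

preference_record_717 = {
    "book":      {"like": 20, "dislike": 1,  "group_like": 5,  "group_dislike": 3},
    "TV series": {"like": 1,  "dislike": 20, "group_like": 80, "group_dislike": 20},
}
candidate_list = [{"title": "Romance of the Three Kingdoms", "category": c}
                  for c in ("book", "TV series")]
print(select_reward_answer(candidate_list, preference_record_717))   # the "book" record
```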

On the other hand, the natural language understanding system 720 can also determine the priority order of at least one reward answer based on a voice input 701 entered by the user earlier. That is, assuming that another voice input 701 (e.g., the aforementioned fourth voice input) is received by the voice sampling module 710 before the voice response 707 is played, the natural language understanding system 720 can also parse the keyword in this voice input 701 (i.e., the fourth keyword 709), preferentially select from the candidate list the reward answer that matches this keyword as the reward answer 711, and output the voice response 707 based on this reward answer 711.

For example, assume that the natural language understanding system 720 first receives the voice input 701 of "I want to watch a TV series", and a few seconds later receives the voice input 701 of "Help me put on the Romance of the Three Kingdoms". At this time, the natural language understanding system 720 can recognize the keyword "TV series" (the first keyword) in the first voice input 701, and recognize the keyword "Romance of the Three Kingdoms" (the fourth keyword) in the later voice input. Therefore, the natural language understanding system 720 selects from the candidate list the record matching both "Romance of the Three Kingdoms" and "TV series" as the reward answer 711, and outputs the voice response 707 to the user based on the reward answer 711.

Based on the above, the natural language understanding system 720 can output, according to the voice input from the user together with information such as the crowd's usage habits, user preferences, user habits, or the user's preceding and following utterances, a voice response 707 that matches the request information 703 of the voice input 701. The natural language understanding system 720 can prioritize the reward answers in the candidate list according to different sorting methods, such as the crowd usage habits, user preferences, user habits, or the user's preceding and following utterances. Therefore, if the voice input 701 from the user is relatively unclear, the natural language understanding system 720 can still determine the intent of the user's voice input 701 (such as the attribute or knowledge field of the keyword 709 in the voice input) according to the crowd usage habits, user preferences, user habits, or the user's utterances. In other words, if a reward answer 711 is close to what the user or the crowd has habitually expressed or intended, the natural language understanding system 720 will select that reward answer 711 with priority. In this way, the voice response 707 output by the natural language dialogue system 700 can better match the user's request information 703.

It should be noted that although the feature database 730 and the structured database 220 are described above as different databases, the two databases may be combined, and those skilled in the art may choose according to the actual application.

In summary, the present invention provides a natural language dialogue method and a system thereof, in which the natural language dialogue system can output a corresponding voice response according to a voice input from a user. The natural language dialogue system of the present invention can also preferentially select a more appropriate reward answer according to the crowd's usage habits, the user's preferences, the user's habits, or the user's utterances, and output a voice response to the user accordingly, thereby enhancing the user's convenience in a conversation with the natural language dialogue system.

Next, examples are described in which the architecture and components of the natural language understanding system 100 and the structured database 220 are applied to decide, according to the number of reward answers obtained by analyzing the request information of the user's voice input, whether to perform an operation directly according to the data type or to ask the user for further instructions; when there is only one reward answer, the operation can likewise be performed directly according to the data type. The benefit of giving the user this choice is that the system does not have to filter the answers on the user's behalf; instead, the candidate list containing the reward answers is provided directly to the user, and the user decides, by selecting (or not selecting) an answer, which software is executed or which service is provided, thereby achieving the purpose of providing a user-friendly interface.

FIG. 9 is a schematic diagram of a system of a mobile terminal device according to an embodiment of the invention. Referring to FIG. 9, in this embodiment, the mobile terminal device 900 includes a voice receiving unit 910, a data processing unit 920, a display unit 930, and a storage unit 940. The data processing unit 920 is coupled to the voice receiving unit 910, the display unit 930, and the storage unit 940. The voice receiving unit 910 is configured to receive the first input speech SP1 and the second input speech SP2 and transmit them to the data processing unit 920. The first input speech SP1 and the second input speech SP2 may be the voice inputs 501, 701 described above. The display unit 930 is configured to be controlled by the data processing unit 920 to display the first/second candidate list 908/908'. The storage unit 940 is configured to store a plurality of data, which may include the data of the structured database 220 or the feature database 730, and details are not repeated here. Furthermore, the storage unit 940 can be any type of memory within a server or computer system, such as dynamic random access memory (DRAM), static random access memory (SRAM), flash memory, or read only memory (ROM); the invention is not limited thereto, and those skilled in the art can choose according to actual needs.

In the present embodiment, the data processing unit 920 functions as the natural language understanding system 100 of FIG. 1: it performs speech recognition on the first input speech SP1 to generate the first request information 902, then analyzes the first request information 902 with natural language processing to generate the first keyword 904 corresponding to the first input speech SP1, and, according to the first keyword 904, searches the data of the storage unit 940 (e.g., the search engine 240 performs a full-text search on the structured database 220 according to the keyword 108) to find the first reward answer 906 (e.g., the first reward answer 511/711). When the number of first reward answers 906 found is 1, the data processing unit 920 can directly perform the corresponding operation according to the document data corresponding to the first reward answer 906; when the number of first reward answers 906 is greater than 1, the data processing unit 920 can organize the first reward answers 906 into a first candidate list 908 and then control the display unit 930 to display the first candidate list 908 to the user. In the case where the first candidate list 908 is displayed for the user to make a further selection, the data processing unit 920 receives the second input speech SP2 and performs speech recognition to generate the second request information 902', then performs natural language processing on the second request information 902' to generate the second keyword 904' corresponding to the second input speech SP2, and selects the corresponding portion from the first candidate list 908 according to the second keyword 904'. The first keyword 904 and the second keyword 904' may each consist of a plurality of keywords. The manner of analyzing the second input speech SP2 to generate the second request information 902' and the second keyword 904' may follow the manner in which the second voice input is analyzed in FIGS. 5A and 7A, and is therefore not repeated.
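
A minimal sketch of this decision logic for the data processing unit 920 is given below: act directly when exactly one first reward answer 906 is found, otherwise display a first candidate list 908 and refine it with the second input speech SP2. Function and field names are hypothetical.

```python
def handle_first_input(first_answers_906):
    if len(first_answers_906) == 1:
        return ("execute", first_answers_906[0])     # operate directly on the single answer
    return ("display", first_answers_906)            # first candidate list 908 shown to the user

def handle_second_input(candidate_list_908, second_keywords):
    # Keep only the answers that also match the second keyword 904'.
    selected = [a for a in candidate_list_908
                if all(k in " ".join(str(v) for v in a.values()) for k in second_keywords)]
    if len(selected) == 1:
        return ("execute", selected[0])
    return ("display", selected)                     # second candidate list 908'

answers = [{"city": "Beijing",  "weather": "sunny"},
           {"city": "Shanghai", "weather": "rain"}]
action, payload = handle_first_input(answers)        # two answers -> display the candidate list
if action == "display":
    print(handle_second_input(payload, ["Beijing"])) # ('execute', {'city': 'Beijing', ...})
```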

Similarly, when the number of second reward answers 906' is 1, the data processing unit 920 performs the corresponding operation according to the type of the second reward answer 906'; when the number of second reward answers 906' is greater than 1, the data processing unit 920 further organizes the second reward answers 906' into a second candidate list 908' and controls the display unit 930 to display it. Then, the corresponding portion is selected according to the user's next input speech, and the corresponding operation is performed according to the number of subsequent reward answers; this can be deduced by analogy from the above description and is not repeated here.

Further, the data processing unit 920 compares the plurality of records 302 of the structured database 220 (e.g., the data of each sub-column 308 in the title column 304) with the first keyword 904 corresponding to the first input speech SP1 (as described above with respect to FIGS. 1, 3A, 3B, and 3C). When the structured database 220 has a record 302 that matches at least a portion of the first keyword 904 of the first input speech SP1, that record 302 is considered a matching result produced by the first input speech SP1 (e.g., the generation of the matching results of FIGS. 3A/3B). If the document data to which the matching result belongs is a music file, the record 302 may include the song name, singer, album name, publication time, play order, and the like; if the document data is a video file, the record 302 may include the movie name, publication time, cast and crew (including performers), and the like; if the document data is a webpage file, the record 302 may include the website name, webpage type, corresponding user account, and the like; if the document data is a picture file, the record 302 may include the picture name, picture information, and the like; if the document data is a business card file, the record 302 may include the contact name, contact phone number, contact address, and the like. The foregoing records 302 are given by way of example, and the record 302 may be determined according to the actual application; the embodiments of the present invention are not limited thereto.

Next, the data processing unit 920 can determine whether the second keyword 904' corresponding to the second input speech SP2 contains a sequential vocabulary indicating an order (e.g., "I want the third option" or "I choose the third one"). When the second keyword 904' corresponding to the second input speech SP2 includes such a sequential vocabulary, the data processing unit 920 selects the data at the corresponding position from the first candidate list 908 according to the sequential vocabulary. When the second keyword 904' corresponding to the second input speech SP2 does not include a sequential vocabulary, indicating that the user may be directly selecting a certain first reward answer 906 in the first candidate list 908, the data processing unit 920 compares the record 302 corresponding to each first reward answer 906 in the first candidate list 908 with the second keyword 904' to determine the degree of correspondence between each first reward answer 906 and the second input speech SP2, and then determines, according to the degree of correspondence, whether a certain first reward answer 906 in the first candidate list 908 corresponds to the second input speech SP2. In an embodiment of the present invention, the data processing unit 920 may determine, according to the degree of correspondence between the first reward answers 906 and the second keyword 904' (e.g., a complete match or a partial match), whether a certain first reward answer 906 in the first candidate list 908 corresponds to the second input speech SP2, thereby simplifying the selection process. The data processing unit 920 can select the data with the highest degree of correspondence as corresponding to the second input speech SP2.
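
The two selection paths can be sketched as follows: if the second keyword 904' contains a sequential vocabulary such as "3rd", the answer at that position is taken from the first candidate list 908; otherwise the keywords are matched against each answer and the one with the highest degree of correspondence is kept. The ordinal table and the scoring rule are simplifying assumptions.

```python
ORDINALS = {"1st": 1, "2nd": 2, "3rd": 3, "first": 1, "second": 2, "third": 3}

def select_from_candidates(candidate_list_908, second_keywords):
    # Path 1: a sequential vocabulary indicating order picks by position.
    for k in second_keywords:
        if k in ORDINALS:
            return candidate_list_908[ORDINALS[k] - 1]
    # Path 2: otherwise keep the answer whose record corresponds best to the keywords.
    def correspondence(answer):
        text = " ".join(str(v) for v in answer.values())
        return sum(1 for k in second_keywords if k in text)
    return max(candidate_list_908, key=correspondence)

candidates = [{"city": "Shanghai"}, {"city": "Tianjin"}, {"city": "Beijing"}]
print(select_from_candidates(candidates, ["3rd"]))       # picked by position
print(select_from_candidates(candidates, ["Beijing"]))   # picked by degree of correspondence
```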

For example, if the first input speech SP1 is "What is the weather today", then after speech recognition and natural language processing the first keyword 904 corresponding to the first input speech SP1 may include "today" and "weather", and the data processing unit 920 reads the data corresponding to today's weather and displays the weather data as the first candidate list 908 through the display unit 930. Then, if the second input speech SP2 is "I want to see the third entry" or "I choose the third one", after speech recognition and natural language processing the second keyword 904' corresponding to the second input speech SP2 will include "third", where "third" is interpreted as a sequential vocabulary indicating an order, so the data processing unit 920 reads the third entry in the first candidate list 908 (i.e., the third first reward answer 906 in the first candidate list 908) and displays the corresponding weather data again through the display unit 930. Alternatively, if the second input speech SP2 is "I want to see the weather in Beijing" or "I choose the weather in Beijing", after speech recognition and natural language processing the second keyword 904' corresponding to the second input speech SP2 will include "Beijing" and "weather", and the data processing unit 920 reads the data corresponding to Beijing in the first candidate list 908. When the number of first reward answers 906 corresponding to the selection is 1, the corresponding weather data may be displayed directly through the display unit 930; when the number of selected first reward answers 906 is greater than 1, a further second candidate list 908' (containing at least one second reward answer 906') is displayed for further selection by the user.

If the first input speech SP1 is "I want to call Lao Zhang", after speech recognition and natural language processing the first keyword 904 corresponding to the first input speech SP1 may include "phone" and "Zhang", and the data processing unit 920 reads the contact data whose surname is "Zhang" (a full-text search can be performed on the structured database 220, and the detailed data corresponding to the records 302 is then obtained), and displays the contact data as the first candidate list 908 (i.e., the first reward answers 906) through the display unit 930. Then, if the second input speech SP2 is "the third Lao Zhang" or "I choose the third one", after speech recognition and natural language processing the second keyword 904' corresponding to the second input speech SP2 may include "third", where "third" is interpreted as a sequential vocabulary indicating an order, so the data processing unit 920 reads the third entry in the first candidate list 908 (i.e., the third first reward answer 906) and dials according to the selected data. Alternatively, if the second input speech SP2 is "I choose the one starting with 139", after speech recognition and natural language processing the second keyword 904' corresponding to the second input speech SP2 may include "139" and "starting with", where "139" is not interpreted as a sequential vocabulary, so the data processing unit 920 reads the contact data in the first candidate list 908 whose telephone number starts with 139. If the second input speech SP2 instead indicates the Zhang in Beijing, after speech recognition and natural language processing the second keyword 904' corresponding to the second input speech SP2 will include "Beijing" and "Zhang", and the data processing unit 920 reads the contact data in the first candidate list 908 whose address is Beijing. When the number of selected first reward answers 906 is 1, dialing is performed according to the selected data; when the number of selected first reward answers 906 is greater than 1, the first reward answers 906 selected at this time are taken as the second reward answers 906' and organized into a second candidate list 908' for display to the user for further selection.

If the first input speech SP1 is "I am looking for a restaurant", after speech recognition and natural language processing the first keyword 904 of the first input speech SP1 will include "restaurant", and the data processing unit 920 reads all the first reward answers 906 corresponding to restaurants. Since such an instruction is not very specific, the first candidate list 908 (containing the first reward answers 906 corresponding to all restaurant data) is displayed to the user through the display unit 930, and further instructions from the user are awaited. Then, if the user inputs "the third restaurant" or "I choose the third one" through the second input speech SP2, after speech recognition and natural language processing the second keyword 904' corresponding to the second input speech SP2 will include "third", where "third" is interpreted as a sequential vocabulary indicating an order, so the data processing unit 920 reads the third entry in the first candidate list 908 and performs display according to the selected data. Alternatively, if the second input speech SP2 is "I choose the nearest one", after speech recognition and natural language processing the second keyword 904' corresponding to the second input speech SP2 may include "nearest", so the data processing unit 920 reads the restaurant data in the first candidate list 908 whose address is closest to the user; if the second input speech SP2 is "I want a restaurant in Beijing", after speech recognition and natural language processing the second keyword 904' corresponding to the second input speech SP2 will include "Beijing" and "restaurant", so the data processing unit 920 reads the restaurant data in the first candidate list 908 whose address is in Beijing. When the number of selected first reward answers 906 is 1, display is performed according to the selected data; when the number of selected first reward answers 906 is greater than 1, the first reward answers 906 selected at this time are taken as the second reward answers 906' and organized into a second candidate list 908' for display to the user for selection.

According to the above, the data processing unit 920 can perform a corresponding operation according to the document data of the selected first reward answer 906 (or second reward answer 906'). For example, when the document data corresponding to the selected first reward answer 906 is a music file, the data processing unit 920 performs music playback according to the selected data; when the selected document data is a video file, the data processing unit 920 performs video playback according to the selected data; when the selected document data is a webpage file, the data processing unit 920 performs display according to the selected data; when the selected document data is a picture file, the data processing unit 920 performs image display according to the selected data; and when the type of the selected data is a business card file, the data processing unit 920 performs dialing according to the selected data.
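
A minimal sketch of this dispatch on the document type of the selected reward answer is shown below, mirroring the mapping listed above (music and video are played, webpages and pictures are displayed, business cards are dialed); the handler names are hypothetical placeholders.

```python
def play_music(d):   print("playing music:", d["name"])
def play_video(d):   print("playing video:", d["name"])
def show_page(d):    print("displaying webpage:", d["name"])
def show_photo(d):   print("showing picture:", d["name"])
def dial_number(d):  print("dialing:", d["phone"])

OPERATIONS = {
    "music":         play_music,
    "video":         play_video,
    "webpage":       show_page,
    "picture":       show_photo,
    "business card": dial_number,
}

def perform_operation(selected_answer):
    OPERATIONS[selected_answer["type"]](selected_answer)

perform_operation({"type": "business card", "name": "Lao Zhang", "phone": "139xxxxxxxx"})
```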

FIG. 10 is a schematic diagram of an information system according to an embodiment of the invention. Referring to FIG. 9 and FIG. 10, in this embodiment the information system 1000 includes a mobile terminal device 1010 and a server 1020. The server 1020 may be a cloud server, a local area network server, or the like, but the embodiments of the invention are not limited thereto. The mobile terminal device 1010 includes a voice receiving unit 1011, a data processing unit 1013, and a display unit 1015. The data processing unit 1013 is coupled to the voice receiving unit 1011, the display unit 1015, and the server 1020. The mobile terminal device 1010 may be a mobile communication device such as a cell phone, a personal digital assistant (PDA) phone, or a smart phone, and the invention is not limited thereto. The voice receiving unit 1011 functions similarly to the voice receiving unit 910, and the display unit 1015 functions similarly to the display unit 930. The server 1020 is configured to store a plurality of data and has a voice recognition function.

In this embodiment, the data processing unit 1013 performs voice recognition on the first input voice SP1 through the server 1020 to generate the first request information 902, and then performs natural language processing on the first request information 902 to generate the first keyword 904 corresponding to the first input voice SP1. The server 1020 performs a full-text search on the structured database 220 according to the first keyword 904 to find the first return answers 906 and transmits them to the data processing unit 1013. When the number of first return answers 906 is 1, the data processing unit 1013 performs a corresponding operation according to the document data corresponding to that first return answer 906; when the number of first return answers 906 is greater than 1, the data processing unit 1013 organizes the selected first return answers 906 into the first candidate list 908, controls the display unit 1015 to display it to the user, and waits for a further instruction from the user. After the user inputs an instruction, the data processing unit 1013 performs voice recognition on the second input voice SP2 through the server 1020 to generate the second request information 902', and then analyzes the second request information 902' with natural language processing to generate the second keyword 904' corresponding to the second input voice SP2. The server 1020 then selects, from the first candidate list 908 according to the second keyword 904' corresponding to the second input voice SP2, the corresponding first return answers 906 as the second return answers 906' and transmits them to the data processing unit 1013. Similarly, when the number of corresponding second return answers 906' is 1, the data processing unit 1013 performs the corresponding operation according to the type of the data corresponding to that second return answer 906'; when the number of second return answers 906' is greater than 1, the data processing unit 1013 organizes the second return answers 906' selected at this time into a second candidate list 908' and controls the display unit 1015 to display it to the user again for further selection. Thereafter, the server 1020 selects the corresponding data according to subsequent input voices, and the data processing unit 1013 performs the corresponding operation according to the number of selected data. This can be deduced by analogy with the above description and will not be repeated here.

It should be noted that, in an embodiment, if the number of first return answers 906 selected according to the first keyword 904 corresponding to the first input voice SP1 is 1, the operation corresponding to the data may be performed directly. Moreover, in another embodiment, a prompt may first be output to inform the user that the operation corresponding to the selected first return answer 906 is about to be performed. Likewise, in another embodiment, when the number of second return answers 906' selected according to the second keyword 904' corresponding to the second input voice SP2 is 1, the operation corresponding to the data may be performed directly. Of course, in yet another embodiment, a prompt may be output to notify the user that the operation corresponding to the selected data is about to be performed; the invention does not limit this.

Further, the server 1020 compares each record 302 of the structured database 220 with the first keyword 904 corresponding to the first input voice SP1. When a record 302 at least partially matches the first keyword 904, this record 302 is regarded as data matched by the first input voice SP1 and is taken as one of the first return answers 906. If the number of first return answers 906 selected according to the first keyword 904 corresponding to the first input voice SP1 is greater than 1, the user may input an instruction through the second input voice SP2. The instruction input by the user through the second input voice SP2 may include an order (for example, indicating that the first item of the displayed data is to be selected), may directly select one of the displayed items (for example, by directly stating the content of an item), or may require determining the user's intention from the instruction (for example, when the user asks for the nearest restaurant, the "nearest" restaurant among the displayed data is matched). The server 1020 therefore first determines whether the second keyword 904' corresponding to the second input voice SP2 contains a sequential vocabulary indicating an order. When the second keyword 904' corresponding to the second input voice SP2 includes a sequential vocabulary indicating an order, the server 1020 selects the first return answer 906 located at the corresponding position from the first candidate list 908 according to that sequential vocabulary. When the second keyword 904' corresponding to the second input voice SP2 does not include a sequential vocabulary indicating an order, the server 1020 compares each first return answer 906 of the first candidate list 908 with the second keyword 904' corresponding to the second input voice SP2 to determine the degree of correspondence between each first return answer 906 and the second input voice SP2, and may determine, according to the degree of correspondence, whether a first return answer 906 in the first candidate list 908 corresponds to the second input voice SP2. In an embodiment of the invention, the server 1020 may determine, according to the degree of correspondence between the first return answers 906 and the second keyword 904', those first return answers 906 in the first candidate list 908 that correspond to the second input voice SP2, so as to simplify the selection process. The server 1020 may also select the first return answer 906 with the highest degree of correspondence as the one corresponding to the second input voice SP2.
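
The decision described above, picking by position when the second keywords contain a sequential vocabulary and otherwise ranking by degree of correspondence, can be sketched as follows. The regular expression, the scoring rule, and the tie handling are assumptions for this illustration only.

import re

ORDINAL = re.compile(r"^(\d+)(st|nd|rd|th)$")   # "3rd" is an ordinal, "139" is not

def ordinal_index(keywords):
    """Return a zero-based position if some keyword is a sequential vocabulary, else None."""
    for keyword in keywords:
        match = ORDINAL.match(keyword)
        if match:
            return int(match.group(1)) - 1      # "3rd" -> index 2
    return None

def correspondence(answer, keywords):
    """Degree of correspondence: how many second keywords appear in the answer's fields."""
    text = " ".join(str(value) for value in answer.values())
    return sum(1 for keyword in keywords if keyword in text)

def select(first_candidates, second_keywords):
    index = ordinal_index(second_keywords)
    if index is not None:                        # sequential vocabulary: pick by position
        return [first_candidates[index]] if index < len(first_candidates) else []
    scored = [(correspondence(answer, second_keywords), answer)
              for answer in first_candidates]
    best = max((score for score, _ in scored), default=0)
    return [answer for score, answer in scored if score == best and score > 0]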

FIG. 11 is a flow chart of a voice recognition based selection method in accordance with an embodiment of the present invention. Referring to FIG. 11, in this embodiment, the first input voice SP1 is received (step S1100), voice recognition is performed on the first input voice SP1 to generate the first request information 902 (step S1110), and natural language processing is performed on the first request information 902 to generate the first keyword 904 corresponding to the first input voice (step S1120). Then, the corresponding first return answers 906 are selected from the plurality of data according to the first keyword 904 (step S1130), and it is determined whether the number of selected first return answers 906 is 1 (step S1140). When the number of selected first return answers 906 is 1, that is, the determination result of step S1140 is YES, the corresponding operation is performed according to the document data corresponding to the first return answer 906 (step S1150). When the number of selected first return answers 906 is greater than 1, that is, the determination result of step S1140 is NO, the first candidate list 908 is displayed according to the selected first return answers 906 and the second input voice SP2 is received (step S1160), voice recognition is performed on the second input voice to generate the second request information 902' (step S1170), and natural language processing is performed on the second request information 902' to generate the second keyword 904' corresponding to the second input voice (step S1180). Next, the corresponding first return answers 906 are selected from the first candidate list 908 according to the second keyword 904' (step S1190), and the process returns to step S1140 to determine whether the number of selected first return answers 906 is 1. The order of the above steps is for illustration only, and the embodiments of the present invention are not limited thereto. For details of the above steps, reference may be made to the embodiments of FIG. 9 and FIG. 10, and they are not repeated here.
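
A control-loop sketch of steps S1100 to S1190 is given below; the recognize, extract_keywords, search, select, perform and display callables are placeholders standing in for the modules described above, not an implementation of them.

def selection_loop(first_voice, get_next_voice,
                   recognize, extract_keywords, search, select, perform, display):
    request = recognize(first_voice)          # S1110: speech recognition
    keywords = extract_keywords(request)      # S1120: natural language processing
    answers = search(keywords)                # S1130: pick the first return answers
    while len(answers) != 1:                  # S1140: exactly one answer ends the loop
        if not answers:
            display("no matching record")     # not part of FIG. 11; added so the loop ends
            return None
        display(answers)                      # S1160: show the candidate list and
        voice = get_next_voice()              #        receive the next input voice
        request = recognize(voice)            # S1170
        keywords = extract_keywords(request)  # S1180
        answers = select(answers, keywords)   # S1190: narrow the candidate list
    perform(answers[0])                       # S1150: perform the corresponding operation
    return answers[0]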

In summary, the voice recognition based selection method, the mobile terminal device and the information system described above perform voice recognition and natural language processing on the first input voice and the second input voice to obtain the keywords corresponding to the first input voice and the second input voice, and then select data according to those keywords. Thereby, the convenience of the user's operation can be improved.

Next, an example is given of how the architecture and components of the natural language understanding system 100 and the structured database 220 disclosed in the present invention can be combined with an auxiliary activation device.

FIG. 12 is a block diagram of a voice control system according to an embodiment of the invention. Referring to FIG. 12, the voice control system 1200 includes an auxiliary activation device 1210, a mobile terminal device 1220, and a server 1230. In the present embodiment, the auxiliary activation device 1210 activates the voice system of the mobile terminal device 1220 by wirelessly transmitting signals, so that the mobile terminal device 1220 communicates with the server 1230 according to the voice signal.

In detail, the auxiliary activation device 1210 includes a first wireless transmission module 1212 and a trigger module 1214, wherein the trigger module 1214 is coupled to the first wireless transmission module 1212. The first wireless transmission module 1212 is, for example, a device supporting a communication protocol such as Wireless Fidelity (Wi-Fi), Worldwide Interoperability for Microwave Access (WiMAX), Bluetooth, Ultra-wideband (UWB), or Radio-frequency identification (RFID), and can transmit a wireless transmission signal to establish a wireless connection with another wireless transmission module. The trigger module 1214 is, for example, a button or a key. In this embodiment, after the user presses the trigger module 1214 to generate a trigger signal, the first wireless transmission module 1212 receives the trigger signal and is activated, and the auxiliary activation device 1210 transmits a wireless transmission signal to the mobile terminal device 1220 through the first wireless transmission module 1212. In an embodiment, the auxiliary activation device 1210 may be a Bluetooth headset.

It is worth noting that although some hands-free headsets/microphones currently on the market are designed to activate certain functions of the mobile terminal device 1220, in another embodiment of the invention the auxiliary activation device 1210 can be different from such headsets/microphones. Such a headset/microphone connects with the mobile terminal device and replaces the headset/microphone on the mobile terminal device 1220 for listening/talking, the activation function being an additional design; the auxiliary activation device 1210 of the present invention, by contrast, is used "only" to activate the voice system in the mobile terminal device 1220 and does not itself provide listening/talking, so its internal circuit design can be simplified and its cost is low. In other words, relative to the hands-free headset/microphone described above, the auxiliary activation device 1210 is a separate device, that is, the user may own both a hands-free headset/microphone and the auxiliary activation device 1210 of the present invention.

In addition, the auxiliary activation device 1210 may take the shape of an item that is readily accessible to the user, such as a ring, a watch, an earring, a necklace, or a pair of glasses, that is, various portable items, or of a mounted accessory, for example a driving accessory disposed on a steering wheel, and is not limited to the above. That is to say, the auxiliary activation device 1210 is a "living" device whose design allows the user to easily touch the trigger module 1214 to turn on the voice system. For example, when the auxiliary activation device 1210 has the shape of a ring, the user can easily move a finger to press the trigger module 1214 on the ring to trigger it. Likewise, when the auxiliary activation device 1210 is a device disposed on a driving accessory, the user can easily trigger the trigger module 1214 of that driving accessory while driving. In addition, compared with the discomfort of listening/talking through an earphone/microphone, the auxiliary activation device 1210 of the present invention can be used to turn on the voice system in the mobile terminal device 1220 and even to turn on the amplification function (described in detail later), allowing the user to listen/talk directly through the mobile terminal device 1220 without wearing an earphone/microphone. Moreover, for the user these "living" auxiliary activation devices 1210 are items that would be worn or used anyway, so there is no problem of discomfort or awkwardness in use, that is, no time is needed to adapt. For example, when a user is cooking in the kitchen and needs to dial a mobile phone placed in the living room, assuming the user is wearing the auxiliary activation device 1210 of the present invention in the shape of a ring, a necklace, or a watch, the user can lightly touch the ring, necklace, or watch to turn on the voice system and ask a friend for the details of a recipe. Although some earphones/microphones with an activation function can achieve the above purposes, not every moment of cooking requires calling a friend, so for the user it is quite inconvenient to wear an earphone/microphone at all times while cooking just in order to control the mobile terminal device at any moment.

In other embodiments, the auxiliary activation device 1210 can also be configured with a wireless rechargeable battery 1216 for driving the first wireless transmission module 1212. Further, the wireless charging battery 1216 includes a battery unit 12162 and a wireless charging module 12164. The wireless charging module 12164 is coupled to the battery unit 12162. Here, the wireless charging module 12164 can receive energy supplied from a wireless power supply device (not shown) and convert the energy into power to charge the battery unit 12162. As such, the first wireless transmission module 1212 of the auxiliary activation device 1210 can be conveniently charged by the wireless rechargeable battery 1216.

On the other hand, the mobile terminal device 1220 is, for example, a cell phone, a personal digital assistant (PDA) phone, a smart phone, a palmtop computer (Pocket PC) with communication software installed, a tablet computer, or a notebook computer. The mobile terminal device 1220 can be any portable mobile device with a communication function, and its scope is not limited here. In addition, the mobile terminal device 1220 may use an Android operating system, a Microsoft operating system, a Linux operating system, or the like, and is not limited to the above.

The mobile terminal device 1220 includes a second wireless transmission module 1222 that matches the first wireless transmission module 1212 of the auxiliary activation device 1210 and adopts a corresponding wireless communication protocol (for example, Wi-Fi, WiMAX, Bluetooth, ultra-wideband, or radio-frequency identification), thereby establishing a wireless connection with the first wireless transmission module 1212. It should be noted that "first" and "second" in the first wireless transmission module 1212 and the second wireless transmission module 1222 are used only to indicate that the wireless transmission modules are configured in different devices, and are not intended to limit the present invention.

In other embodiments, the mobile terminal device 1220 further includes a voice system 1221 coupled to the second wireless transmission module 1222. Therefore, after the user triggers the trigger module 1214 of the auxiliary activation device 1210, the voice system 1221 can be activated wirelessly through the first wireless transmission module 1212 and the second wireless transmission module 1222. In an embodiment, the voice system 1221 can include a voice sampling module 1224, a speech synthesis module 1226, and a voice output interface 1227. The voice sampling module 1224 is configured to receive a voice signal from the user and is, for example, a microphone or another device for receiving audio. The speech synthesis module 1226 can query a speech synthesis database that records, for example, text and its corresponding speech, so that the speech synthesis module 1226 can find the speech corresponding to a specific text message and thereby synthesize that text message into speech. Thereafter, the speech synthesis module 1226 can output the synthesized speech through the voice output interface 1227 for playback to the user. The voice output interface 1227 is, for example, a speaker or an earphone.
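
The table lookup performed by the speech synthesis module, as just described, can be pictured with the small sketch below; the table contents and file paths are invented for this illustration.

SYNTHESIS_DB = {                  # stands in for the speech synthesis database
    "30": "audio/30.wav",
    "degrees": "audio/degrees.wav",
    "Celsius": "audio/celsius.wav",
}

def synthesize(text):
    """Collect the stored speech clips that correspond to each word of the text."""
    clips = []
    for word in text.split():
        clip = SYNTHESIS_DB.get(word)
        if clip is not None:
            clips.append(clip)
    return clips                  # handed to the voice output interface for playback

print(synthesize("30 degrees Celsius"))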

In addition, the mobile terminal device 1220 may also be configured with a communication module 1228. The communication module 1228 is, for example, an element capable of transmitting and receiving wireless signals, such as a radio frequency transceiver. Further, the communication module 1228 enables the user to answer or make a call through the mobile terminal device 1220 or to use other services provided by the telecommunications carrier. In this embodiment, the communication module 1228 can receive response information from the server 1230 through the Internet and establish a call connection between the mobile terminal device 1220 and at least one electronic device according to the response information, wherein the electronic device is, for example, another mobile terminal device (not shown).

The server 1230 is, for example, a network server or a cloud server, and has a speech understanding module 1232. In this embodiment, the speech understanding module 1232 includes a speech recognition module 12322 and a speech processing module 12324, and the speech processing module 12324 is coupled to the speech recognition module 12322. Here, the speech recognition module 12322 receives the voice signal transmitted from the voice sampling module 1224 and converts the voice signal into a plurality of segmentation semantics (for example, keywords or words). The speech processing module 12324 can parse the meanings represented by these segmentation semantics (such as intent, time, location, and so on) and thereby determine the meaning represented by the voice signal. In addition, the speech processing module 12324 also generates corresponding response information according to the parsed result. In this embodiment, the speech understanding module 1232 can be implemented by a hardware circuit composed of one or several logic gates, and can also be implemented by computer program code. It is worth mentioning that, in another embodiment, the speech understanding module 1232 can be configured in the mobile terminal device 1320, as in the voice control system 1300 shown in FIG. 13. The operation of the speech understanding module 1232 of the server 1230 can be as described for the natural language understanding system 100 of FIG. 1A and the natural language dialogue systems 500/700/700' of FIGS. 5A/7A/7B.
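
A rough sketch of the two-stage speech understanding described above follows; the slot vocabularies, intent labels, and response text are invented for this illustration and do not reproduce the actual modules.

def speech_recognition(utterance):
    """Stand-in for module 12322: split recognized text into segmentation semantics."""
    return utterance.lower().replace("?", "").split()

def speech_processing(segments):
    """Stand-in for module 12324: derive intent/time/location and build response information."""
    meaning = {"intent": None, "time": None, "location": None}
    if "temperature" in segments or "degrees" in segments:
        meaning["intent"] = "query_weather"
    if "call" in segments:
        meaning["intent"] = "dial"
    for segment in segments:
        if segment in ("today", "tomorrow"):
            meaning["time"] = segment
        if segment in ("beijing", "shanghai"):
            meaning["location"] = segment
    return {"meaning": meaning, "text": "ok"}   # response information for the device

print(speech_processing(speech_recognition("How many degrees is it today?")))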

The method of voice control will be described below in conjunction with the voice control system 1200 described above. FIG. 14 is a flowchart of a voice control method according to an embodiment of the invention. Referring to FIG. 12 and FIG. 14 together, in step S1402 the auxiliary activation device 1210 transmits a wireless transmission signal to the mobile terminal device 1220. In detail, when the first wireless transmission module 1212 of the auxiliary activation device 1210 is activated by receiving a trigger signal, the auxiliary activation device 1210 transmits the wireless transmission signal to the mobile terminal device 1220. Specifically, when the trigger module 1214 of the auxiliary activation device 1210 is pressed by the user, the trigger module 1214 generates the trigger signal, and the first wireless transmission module 1212 sends the wireless transmission signal to the second wireless transmission module 1222 of the mobile terminal device 1220, so that the first wireless transmission module 1212 can be coupled to the second wireless transmission module 1222 via a wireless communication protocol. The auxiliary activation device 1210 described above is used only to activate the voice system in the mobile terminal device 1220 and does not provide listening/talking, so its internal circuit design can be simplified and its cost is low. In other words, relative to the hands-free headset/microphone used with a general mobile terminal device 1220, the auxiliary activation device 1210 is a separate device, that is, the user may own both a hands-free headset/microphone and the auxiliary activation device 1210 of the present invention.

It is worth mentioning that the auxiliary activation device 1210 described above may take the shape of a portable item that is readily accessible to the user, such as a ring, a watch, an earring, a necklace, or a pair of glasses, or of a mounted accessory, for example a driving accessory disposed on a steering wheel, and is not limited to the above. That is to say, the auxiliary activation device 1210 is a "living" device whose design allows the user to easily touch the trigger module 1214 to turn on the voice system 1221. Therefore, the auxiliary activation device 1210 of the present invention can be used to turn on the voice system 1221 in the mobile terminal device 1220 and even to turn on the amplification function (described in detail later), so that the user can listen/talk directly through the mobile terminal device 1220 without wearing an earphone/microphone. In addition, for the user these "living" auxiliary activation devices 1210 are items that would be worn or used anyway, so there is no problem of discomfort or awkwardness in use.

In addition, the first wireless transmission module 1212 and the second wireless transmission module 1222 can each be in a sleep mode or a working mode. The sleep mode means that the wireless transmission module is in a closed state, that is, it neither receives/detects wireless transmission signals nor connects with other wireless transmission modules. The working mode means that the wireless transmission module is turned on, that is, it can continuously detect wireless transmission signals, can transmit wireless transmission signals at any time, and can connect with other wireless transmission modules. Here, when the trigger module 1214 is triggered, if the first wireless transmission module 1212 is in the sleep mode, the trigger module 1214 wakes up the first wireless transmission module 1212 so that it enters the working mode, sends the wireless transmission signal to the second wireless transmission module 1222, and connects with the second wireless transmission module 1222 of the mobile terminal device 1220 via the wireless communication protocol.

On the other hand, in order to prevent the first wireless transmission module 1212 from remaining in the working mode and consuming excessive power, if the trigger module 1214 is not triggered again within a preset time (for example, 5 minutes) after the first wireless transmission module 1212 enters the working mode, the first wireless transmission module 1212 returns from the working mode to the sleep mode and stops the connection with the second wireless transmission module 1222 of the mobile terminal device 1220.
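
The sleep/working behaviour just described can be summarised by the following sketch; the five-minute timeout value is taken from the example above, while the class and method names are assumptions for this illustration.

import time

class WirelessModule:
    IDLE_TIMEOUT = 5 * 60              # seconds without a new trigger before sleeping again

    def __init__(self):
        self.mode = "sleep"
        self.last_trigger = None

    def on_trigger(self):
        """Trigger module pressed: wake the module if needed and send the wireless signal."""
        if self.mode == "sleep":
            self.mode = "working"
        self.last_trigger = time.monotonic()
        self.send_wireless_signal()

    def send_wireless_signal(self):
        print("wireless transmission signal sent to the mobile terminal device")

    def tick(self):
        """Called periodically: return to sleep when no trigger arrives within the timeout."""
        if self.mode == "working" and time.monotonic() - self.last_trigger > self.IDLE_TIMEOUT:
            self.mode = "sleep"
            print("connection stopped, module back in sleep mode")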

Thereafter, in step S1404, the second wireless transmission module 1222 of the mobile terminal device 1220 receives the wireless transmission signal to activate the voice system 1221; that is, when the second wireless transmission module 1222 detects the wireless transmission signal, the mobile terminal device 1220 activates the voice system 1221. Next, in step S1406, the voice sampling module 1224 of the voice system 1221 starts to receive a voice signal, for example, "How many degrees is it today?", "Call Old Wang.", or "Please look up a phone number.".

In step S1408, the voice sampling module 1224 transmits the voice signal to the speech understanding module 1232 in the server 1230, so that the speech understanding module 1232 parses the voice signal and generates response information. Specifically, the speech recognition module 12322 in the speech understanding module 1232 receives the voice signal from the voice sampling module 1224 and divides the voice signal into a plurality of segmentation semantics, and the speech processing module 12324 then performs speech understanding on these segmentation semantics to generate the response information for responding to the voice signal.

In another embodiment of the present invention, the mobile terminal device 1220 can further receive the response information generated by the speech processing module 12324 and output the content of the response information, or perform the operation indicated by the response information, through the voice output interface 1227. In step S1410, the speech synthesis module 1226 of the mobile terminal device 1220 receives the response information generated by the speech understanding module 1232 and performs speech synthesis according to the content of the response information (such as a vocabulary or a sentence) to generate a voice response. Then, in step S1412, the voice output interface 1227 receives and outputs the voice response.

For example, when the user presses the trigger module 1214 of the auxiliary activation device 1210, the first wireless transmission module 1212 sends the wireless transmission signal to the second wireless transmission module 1222, so that the mobile terminal device 1220 activates the voice sampling module 1224 of the voice system 1221. Here, assuming that the voice signal from the user is a query sentence such as "How many degrees is it today?", the voice sampling module 1224 receives the voice signal and transmits it to the speech understanding module 1232 in the server 1230, and the speech understanding module 1232 transmits the response information generated by its parsing back to the mobile terminal device 1220. Assuming that the content of the response information generated by the speech understanding module 1232 is "30°C", the speech synthesis module 1226 synthesizes the "30°C" message into a voice response, and the voice output interface 1227 broadcasts the voice response to the user.

In another embodiment, assuming that the voice signal from the user is a command sentence such as "Call Old Wang.", the speech understanding module 1232 can recognize this command sentence as a request to "dial the phone to Old Wang". In addition, the speech understanding module 1232 generates new response information, such as "Please confirm whether to dial to Old Wang", and transmits the new response information to the mobile terminal device 1220. Here, the speech synthesis module 1226 synthesizes the new response information into a voice response and broadcasts it to the user through the voice output interface 1227. Further, when the user's reply is a positive answer such as "Yes", the voice sampling module 1224 similarly receives the voice signal and transmits it to the server 1230 for the speech understanding module 1232 to parse. After the speech understanding module 1232 finishes parsing, a dialing instruction is recorded in the response information and transmitted to the mobile terminal device 1220. At this time, the communication module 1228 looks up the phone number of "Old Wang" according to the contact information recorded in the phone database, so as to establish a call connection between the mobile terminal device 1220 and another electronic device, that is, to dial to "Old Wang".
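
The confirm-then-dial exchange described in this example can be sketched as below; the contact table, the phrasing checks, and the helper names are assumptions for this illustration only.

PHONE_BOOK = {"Old Wang": "13700001111"}    # stands in for the phone database

def understand(utterance):
    """Very small stand-in for the speech understanding module."""
    text = utterance.strip(" .")
    if text.lower().startswith("call "):
        name = text[5:]
        return {"type": "confirm",
                "text": "Please confirm whether to dial to " + name,
                "name": name}
    if text.lower() in ("yes", "ok"):
        return {"type": "dial"}
    return {"type": "answer", "text": text}

def dialogue():
    pending = understand("Call Old Wang.")      # command sentence from the user
    print(pending["text"])                      # synthesized and played back to the user
    reply = understand("Yes")                   # the user's positive answer
    if reply["type"] == "dial":
        number = PHONE_BOOK.get(pending["name"])
        print("dialing", pending["name"], "at", number)

dialogue()

Running dialogue() prints the confirmation prompt first and the dialing line second, mirroring the order of the exchange described above.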

In other embodiments, the above-described operation method may be performed by using the voice control system 1300 or other similar system in addition to the voice control system 1200 described above, and is not limited to the above embodiments.

In summary, in the voice control system and method of this embodiment, the auxiliary activation device can wirelessly activate the voice function of the mobile terminal device. Moreover, the auxiliary activation device may take the shape of a "living" item accessible to the user, such as a ring, a watch, an earring, a necklace, or a pair of glasses, that is, various portable items, or of a mounted accessory, for example a driving accessory disposed on a steering wheel, and is not limited to the above. As a result, compared with the discomfort of wearing a hands-free headset/microphone, it is more convenient to use the auxiliary activation device 1210 of the present invention to activate the voice system in the mobile terminal device 1220.

It should be noted that the server 1230 with the speech understanding module may be a network server or a cloud server, and a cloud server may involve the user's privacy. For example, the user would need to upload the complete address book to the cloud server in order to complete address-book-related operations such as making a call or sending a text message. Even if the cloud server uses an encrypted connection and does not retain the data, it is difficult to eliminate the user's concerns. Accordingly, the following provides another voice control method and a corresponding voice interaction system with which the mobile terminal device can carry out a voice interaction service with the cloud server without uploading the complete address book. In order to clarify the content of the present invention, specific embodiments are given below as examples by which the present invention can be implemented.

Although the present invention has been disclosed in the above embodiments, it is not intended to limit the present invention, and any one of ordinary skill in the art can make some changes and refinements without departing from the spirit and scope of the present invention. The scope of the invention is defined by the scope of the appended claims.

100‧‧‧Natural Language Understanding System

102‧‧‧Request information

104‧‧‧Analysis results

106‧‧‧ possible intent grammar information

108‧‧‧Keyword

110‧‧‧Responding results

112‧‧‧Intentional information

114‧‧‧Intentional grammar information

116‧‧‧Analysis Results Output Module

200‧‧‧Search System

220‧‧‧ Structured Database

240‧‧‧Search Engine

260‧‧‧Search interface unit

280‧‧‧Guide data storage device

300‧‧‧Natural Language Processor

400‧‧‧Knowledge-assisted understanding module

Claims (46)

  1. A retrieval system includes: a structured database for storing a plurality of records, wherein each of the records has a data structure; and a search engine for performing a full-text search on the structured database, wherein the data structure includes a title bar, the title bar includes a plurality of columns, and each of the columns includes a guide bar and a value column, wherein the guide bar stores a guidance data and the value column stores a numerical data, the numerical data corresponding to the guidance data belonging to the same column, wherein the guidance data is used to indicate the name of one of a plurality of attribute categories, and the numerical data records the content of the attribute category indicated by the corresponding guidance data, wherein the plurality of numerical data belonging to a first record are associated with each other according to the corresponding plurality of guidance data, and the plurality of numerical data belonging to the first record are used together to express a plurality of attributes of the first record.
  2. The retrieval system of claim 1, wherein the data structure further comprises a content field, and the content column of the records stores content details of each of the records.
  3. The search system of claim 1, wherein a first special character is stored between each of the columns to separate each of the columns, and a second special character is stored between the data of the guide bar and the value column to separate the data of the guide bar from the data of the value column.
  4. The retrieval system of claim 1, wherein the columns in the title bar have a fixed number of digits.
  5. The search system of claim 1, further comprising a search interface unit coupled to the search engine for receiving at least one keyword and transmitting it to the search engine, so that the search engine performs the full-text search on the title bars of the records and, in response to a matching result of the search engine, outputs at least one search matching record among the records.
  6. The retrieval system of claim 5, wherein the search matching record is a full matching record that completely matches the at least one keyword or a partial matching record that partially matches the at least one keyword.
  7. The retrieval system of claim 6, wherein when the search interface unit outputs a plurality of search matching records, the full matching records and the partial matching records are output in order, wherein the priority of the full matching records is greater than the priority of the partial matching records.
  8. The retrieval system of claim 1, wherein the size of each of the records is equal to a specific value, and after searching the specific value, the search engine performs the full-text search on the record next to the record currently being searched.
  9. The search system of claim 1, wherein a third special character is stored after the last column of the title bar of each record, and when the search engine finds the third special character during the full-text search, the search engine performs the full-text search on the next record of the record.
  10. A natural language understanding system includes: a natural language processor for analyzing a request information of a user to generate at least one possible intent grammar data, wherein each of the possible intent grammar data includes at least one keyword and an intent data; a knowledge assisted understanding module coupled to the natural language processor for obtaining the at least one possible intent grammar data and determining the intent grammar data that expresses the intention of the user's request information; and a retrieval system comprising: a structured database for storing a plurality of records, wherein each of the records has a data structure; and a search engine for performing a full-text search on the structured database, the data structure including a title bar, the title bar including a plurality of columns, each of the columns including a guide bar and a value column, wherein the guide bar stores a guidance data and the value column stores a numerical data, the numerical data corresponding to the guidance data belonging to the same column, wherein the guidance data is used to indicate the name of one of a plurality of attribute categories, and the numerical data records the content of the attribute category indicated by the corresponding guidance data, wherein the plurality of numerical data belonging to a first record are associated with each other according to the corresponding plurality of guidance data, and the plurality of numerical data belonging to the first record are used together to express a plurality of attributes of the first record; wherein the knowledge assisted understanding module transmits the keyword to the retrieval system and, by means of a response from the retrieval system, assists in determining the determined intent grammar data, wherein the data structure further includes a content field, the content fields of the records storing the content details of each of the records.
  11. The natural language understanding system of claim 10, wherein a first special character is stored between each of the columns for separating each of the columns, and a second special character is stored between the data of the guide bar and the value column for separating the data of the guide bar from the data of the value column.
  12. The natural language understanding system of claim 10, wherein the column in the title bar has a fixed number of digits.
  13. The natural language understanding system of claim 10, wherein the retrieval system further comprises a search interface unit coupled to the search engine and the knowledge assisted understanding module for receiving the keyword and transmitting it to the search engine, wherein the search engine performs the full-text search on the title bars of the records and, in response to a matching result of the search engine, outputs at least one search matching record of the records, and wherein the knowledge assisted understanding module compares the guidance data stored in the title bar of the at least one search matching record with the intent data included in the at least one possible intent grammar data, thereby determining the intention of the user's request information.
  14. The natural language understanding system of claim 13, wherein the search matching record is a full matching record that completely matches the keyword or a partial matching record that partially matches the keyword.
  15. The natural language understanding system of claim 14, wherein when the search interface unit outputs a plurality of search matching records, the full matching records and the partial matching records are output in order, wherein the priority order of the full matching records is greater than the priority order of the partial matching records.
  16. The natural language understanding system of claim 10, wherein the size of each record is equal to a specific value, and after searching the specific value, the search engine performs the full-text search on the record next to the record currently being searched.
  17. The natural language understanding system of claim 10, wherein a third special character is stored after the last column of the title bar of the record, and when the search engine finds the third special character during the full-text search, the search engine performs the full-text search on the next record of the record.
  18. A retrieval method includes: providing a structured database, the structured database storing a plurality of records, wherein each of the records has a data structure; and performing a full-text search on the structured database, wherein the data structure includes a title bar, the title bar includes a plurality of columns, and each of the columns includes a guide bar and a value column, wherein the guide bar stores a guidance data and the value column stores a numerical data, the numerical data corresponding to the guidance data belonging to the same column, wherein the guidance data is used to indicate the name of one of a plurality of attribute categories, and the numerical data records the content of the attribute category indicated by the corresponding guidance data, wherein the plurality of numerical data belonging to a first record are associated with each other according to the corresponding plurality of guidance data, and the plurality of numerical data belonging to the first record are used together to express a plurality of attributes of the first record.
  19. The search method of claim 18, wherein the data structure further comprises a content field, the content fields of the records storing the content details of each of the records.
  20. The search method of claim 18, wherein a first special character is stored between each of the columns to separate each of the columns, and a second special character is stored between the data of the guide bar and the value column to separate the data of the guide bar from the data of the value column.
  21. The search method of claim 18, wherein the column in the title column has a fixed number of bits.
  22. The method of claim 18, wherein the step of performing a full-text search on the structured database further comprises: receiving at least one keyword; performing the full-text search on the title bars of the records according to the keyword; and, if the full-text search has a matching result, outputting at least one search matching record among the records.
  23. The retrieval method of claim 22, wherein the search matching record is a full matching record that completely matches the keyword or a partial matching record that partially matches the keyword.
  24. The search method of claim 23, wherein the step of outputting the search matching records among the records further comprises: outputting the full matching records and the partial matching records in order, wherein the priority of the full matching records is greater than the priority of the partial matching records.
  25. The search method of claim 18, wherein the size of each record is equal to a specific value, and after searching the specific value, the search engine performs the full-text search on the record next to the record currently being searched.
  26. The search method of claim 18, wherein a third special character is stored after the last column of the title bar of each record, and when the search engine finds the third special character during the full-text search, the search engine performs the full-text search on the next record of the record.
  27. A retrieval system comprising: a structured database for storing a plurality of records, wherein each of the records comprises at least one column, wherein the column comprises a plurality of columns, each of the columns comprising a guide bar and a value column, wherein the guide bar stores a guidance data and the value column stores a numerical data, the numerical data corresponding to the guidance data belonging to the same column, wherein the guidance data is used to indicate the name of one attribute category of a plurality of attribute categories, and the numerical data records the content of the attribute category indicated by the corresponding guidance data, wherein the plurality of numerical data belonging to a first record are related to each other according to the corresponding plurality of guidance data, and the plurality of numerical data belonging to the first record are used together to describe a plurality of attributes of the first record; and a search engine for performing a full-text search on the structured database according to a keyword of a request information, wherein when a first numerical data of a first column of a first record of the structured database matches the keyword, a first guidance data corresponding to the first numerical data in the first column is output to confirm the intention of the request information, wherein when the intent data of the request information matches the guidance data, it is confirmed that the record corresponding to the guidance data is intended by the request information, wherein a match between the keyword and the numerical data includes a full matching record in which the keyword completely matches the numerical data or a partial matching record in which the keyword partially matches the numerical data, and wherein, when the intention of the request information is confirmed, the priority of the full matching record is greater than the priority of the partial matching record.
  28. The retrieval system of claim 27, wherein a first special character is stored between each of the columns to separate each of the columns.
  29. The search system of claim 28, wherein a second special character is stored between the guide bar and the value field for separating the guide data of the guide bar from the numerical data of the value column.
  30. The retrieval system of claim 27, wherein the column has a fixed number of digits.
  31. The retrieval system of claim 27, wherein the column includes a content field for storing corresponding content details of the record.
  32. The retrieval system of claim 27, wherein the request information is from a voice input by a user.
  33. The retrieval system of claim 31, wherein the voice is input via a mobile communication device.
  34. The search system of claim 27, further comprising a search interface unit coupled to the search engine for receiving the keyword and transmitting it to the search engine.
  35. The retrieval system of claim 27, wherein the size of each of the records is equal to a specific value, and after searching the specific value, the search engine performs the full-text search on the record next to the record currently being searched.
  36. The retrieval system of claim 27, wherein a third special character is stored after the last column of the title bar of each record, and when the search engine finds the third special character during the full-text search, the search engine performs the full-text search on the next record of the record.
  37. A retrieval method includes: inputting a keyword, wherein the keyword is generated from a request information; and performing a full-text search on a structured database according to the keyword, wherein the structured database stores a plurality of records, each of the records includes at least one column, wherein the column includes a plurality of columns, each of the columns includes a guide bar and a value column, wherein the guide bar stores a guidance data and the value column stores a numerical data, the numerical data corresponding to the guidance data belonging to the same column, wherein the guidance data is used to indicate the name of one of a plurality of attribute categories, and the numerical data records the content of the attribute category indicated by the corresponding guidance data, wherein the plurality of numerical data belonging to a first record are related to each other according to the corresponding plurality of guidance data, and the numerical data belonging to the first record are used together to describe a plurality of attributes of the first record; wherein when a first numerical data of a first column of a first record in the structured database matches the keyword, a first guidance data corresponding to the first numerical data in the first column is output to confirm the intention of the request information, wherein when the intent data of the request information matches the guidance data, it is confirmed that the record corresponding to the guidance data is intended by the request information, wherein a match between the keyword and the numerical data includes a full matching record in which the keyword completely matches the numerical data or a partial matching record in which the keyword partially matches the numerical data, and wherein, when the intention of the request information is confirmed, the priority of the full matching record is greater than the priority of the partial matching record.
  38. The search method of claim 37, wherein a first special character is stored between each of the columns to separate each of the columns.
  39. The search method of claim 38, wherein a second special character is stored between the guide bar and the value field for separating the guide data of the guide bar from the numerical data of the value column.
  40. The search method of claim 37, wherein the column has a fixed number of bits.
  41. The search method of claim 37, wherein the column includes a content field for storing corresponding content details of the record.
  42. The retrieval method of claim 37, wherein the request information is from a voice input by a user.
  43. The search method of claim 42, wherein the voice is input via a mobile communication device.
  44. The search method of claim 37, further comprising a search interface unit coupled to the search engine for receiving the keyword for transmission to the search engine.
  45. The search method of claim 37, wherein the size of each record is equal to a specific value, and after searching the specific value, the search engine performs the full-text search on the record next to the record currently being searched.
  46. The search method of claim 37, wherein a third special character is stored after the last column of the title bar of each record, and when the search engine finds the third special character during the full-text search, the search engine performs the full-text search on the next record of the record.
TW102149041A 2012-12-31 2013-12-30 Searching method, searching system and nature language understanding system TWI578175B (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
CN2012105930648A CN103049567A (en) 2012-12-31 2012-12-31 Retrieval method, retrieval system and natural language understanding system
CN2013101845443A CN103218463A (en) 2012-12-31 2013-05-17 Retrieval method, retrieval system and natural language understanding system
TW102121406 2013-06-17
CN201310690513.5A CN103761242B (en) 2012-12-31 2013-12-13 Search method, searching system and natural language understanding system
TW102149041A TWI578175B (en) 2012-12-31 2013-12-30 Searching method, searching system and nature language understanding system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW102149041A TWI578175B (en) 2012-12-31 2013-12-30 Searching method, searching system and nature language understanding system

Publications (2)

Publication Number Publication Date
TW201428517A TW201428517A (en) 2014-07-16
TWI578175B true TWI578175B (en) 2017-04-11

Family

ID=51726093

Family Applications (1)

Application Number Title Priority Date Filing Date
TW102149041A TWI578175B (en) 2012-12-31 2013-12-30 Searching method, searching system and nature language understanding system

Country Status (1)

Country Link
TW (1) TWI578175B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5963940A (en) * 1995-08-16 1999-10-05 Syracuse University Natural language information retrieval system and method
CN1281191A (en) * 1999-07-19 2001-01-24 松下电器产业株式会社 The method of information retrieval and information retrieval device
TW200811676A (en) * 2006-08-23 2008-03-01 Inventec Besta Co Ltd Chinese input method using number of selections of Chinese character to determine arrangement of characters
US20080071771A1 (en) * 2006-09-14 2008-03-20 Sashikumar Venkataraman Methods and Systems for Dynamically Rearranging Search Results into Hierarchically Organized Concept Clusters
CN100421104C (en) * 2001-10-18 2008-09-24 英业达股份有限公司 Document library system storage and fetch recording method
CN1542657B (en) * 2003-04-07 2010-04-28 汤姆森特许公 Method for ensuring data compatibility when storing data item in database
CN101751422A (en) * 2008-12-08 2010-06-23 北京摩软科技有限公司 Method, mobile terminal and server for carrying out intelligent search at mobile terminal
CN102150158A (en) * 2008-09-12 2011-08-10 诺基亚公司 Method, system, and apparatus for arranging content search results
TW201131402A (en) * 2009-11-09 2011-09-16 Arcsight Inc Enabling faster full-text searching using a structured data store
CN102436458A (en) * 2011-03-02 2012-05-02 奇智软件(北京)有限公司 Command analyzing method and system

Also Published As

Publication number Publication date
TW201428517A (en) 2014-07-16
