TWI578175B - Searching method, searching system and natural language understanding system - Google Patents


Publication number: TWI578175B
Authority: TW (Taiwan)
Application number: TW102149041A
Other languages: Chinese (zh)
Other versions: TW201428517A (en)
Inventors: 張國峰, 朱逸斐
Original Assignee: 威盛電子股份有限公司 (VIA Technologies, Inc.)
Priority claims: CN2012105930648A (CN103049567A), CN2013101845443A (CN103218463A), TW102121406, CN201310690513.5A (CN103761242B)
Application TW102149041A filed by 威盛電子股份有限公司
Publication of TW201428517A; application granted; publication of TWI578175B


Description

Search method, retrieval system, and natural language understanding system

The present invention relates to a retrieval technique, and more particularly to a retrieval method, a retrieval system, and a natural language understanding system for performing full-text retrieval on a structured database.

In computer-based natural language understanding, a specific grammar is usually used to capture the intent or information in the user's input sentence. If enough data about user input sentences is stored in a database, a reasonable judgment can therefore be made.

In existing practice, a built-in fixed word list is used to capture the user's input sentence. The fixed word list contains the specific terms used to express particular intents or information, and the user must express his or her intent or information according to those specific terms before the system can identify it correctly. Forcing the user to memorize every specific term in the fixed word list is, however, quite unnatural. For example, a prior-art implementation based on a fixed word list requires that a user asking about the weather must say: "What is the weather in Shanghai (or Beijing) tomorrow (or the day after tomorrow)?" If the user instead asks about the weather with a more natural, colloquial expression such as "How about Shanghai tomorrow?", the word "weather" does not appear in the sentence, so the prior art understands it as "there is a place in Shanghai called Tomorrow", which obviously fails to capture the user's true intent. In addition, the sentence patterns used by users are very varied, often change over time, and sometimes contain errors, in which case the user's input sentence must be captured by fuzzy matching. A fixed word list that only provides rigid input rules therefore performs even worse.

In addition, when natural language understanding is used to handle multiple kinds of user intent, some different intents share the same grammatical structure. For example, when the user's input sentence is "I want to see the Romance of the Three Kingdoms", the user may intend to watch a movie of the Romance of the Three Kingdoms or to read the book, and in such a case two possible intents are usually presented for the user to choose from. In many other cases, however, offering unnecessary possible intents for the user to choose from is superfluous and inefficient. For example, when the user's input sentence is "I want to see Super Star Avenue", there is no need to match the user's intent to a book or a painting named Super Star Avenue, because Super Star Avenue is a television show.

Furthermore, the results obtained by a conventional full-text search are generally unstructured data, in which the information is scattered and unrelated. For example, after keywords are entered into a search engine such as Google or Baidu, the web search results obtained are unstructured data, and the results must be read by a human to find the useful information. This not only wastes the user's time but may also miss the information the user wants, so its practicality is quite limited.

The invention provides a retrieval method and a retrieval system that perform a full-text search on a structured database, so that the search results obtained by the full-text search are meaningful structured data.

The invention further provides a natural language understanding system for assisting in determining the intent expressed by the user's request information by performing a full-text search on the structured database.

The invention provides a retrieval system comprising a structured database and a search engine. The structured database stores multiple records, and the numerical data contained in each record are related to each other and together express the intent that request information from a user has toward that record. The search engine is configured to perform a full-text search on the structured database, and when a piece of numerical data is matched, the guidance data corresponding to that numerical data is output to confirm the intent of the request information.

The present invention provides a natural language understanding system including a natural language processor, a knowledge assisted understanding module, and a retrieval system. The natural language processor analyzes the user's request information into at least one piece of possible intent grammar data, each of which includes at least one keyword and one piece of intent data. The knowledge assisted understanding module, coupled to the natural language processor, is configured to determine, among the possible intent grammar data, the determined intent grammar data that expresses the intent of the user's request information. The aforementioned retrieval system includes a structured database and a search engine. The structured database stores multiple records, and the search engine performs a full-text search of the structured database. The knowledge assisted understanding module transmits the keyword to the retrieval system and uses the retrieval system's response to assist in determining the determined intent grammar data.

The present invention proposes a retrieval method in which a structured database storing multiple records is first provided, and a full-text search is then performed on the structured database.

According to an embodiment of the present invention, each of the foregoing records includes a title bar, the title bar includes at least one sub-column, and each sub-column includes a guide bar and a value column, where the guide bar of a record stores guidance data and the value column of the record stores numerical data.

According to an embodiment of the invention, each of the foregoing records further includes a content column, and the content column of a record stores the detailed content of that record.

According to an embodiment of the present invention, when a plurality of sub-columns of data are stored in the title bar of a record, a first special character is stored between the sub-columns to separate their data, and a second special character is stored between the guide bar and the value column to separate the data of the guide bar from the data of the value column.

In accordance with an embodiment of the invention, the sub-columns in the title bar have a fixed number of bits.

The present invention provides a retrieval system comprising a structured database and a search engine. The structured database stores at least one record, each record includes at least one column, and the data stored in the columns are attributes that together describe the record. The search engine performs a full-text search on the structured database according to a keyword of the request information, and when a column of at least one record of the structured database matches the keyword, the guidance data corresponding to that column is output to confirm the intent of the request information.

The present invention provides a retrieval method comprising: receiving a keyword, wherein the keyword is generated from request information; and performing a full-text search on a structured database according to the keyword, wherein the structured database stores at least one record, each record includes at least one column, and the data stored in the columns are attributes that together describe the record; when a column of at least one record of the structured database matches the keyword, the guidance data corresponding to that column is output to confirm the intent of the request information.

Based on the above, the present invention uses the keywords contained in the user's request information to perform a full-text search on records having a specific data structure in the structured database, so as to assist in judging the intent expressed by the user's request information.

The above described features and advantages of the invention will be apparent from the following description.

100, 520, 520’, 720, 720’ ‧ ‧ natural language understanding system

102, 503, 503', 703, 902, 902'‧‧‧ request information

104‧‧‧Analysis results

106‧‧‧possible intent grammar data

108, 509, 509’, 711, 904, 904’ ‧ ‧ keywords

110‧‧‧response result

112‧‧‧intent data

114‧‧‧determined intent grammar data

116‧‧‧Analysis Results Output Module

200‧‧‧retrieval system

220‧‧‧structured database

240‧‧‧search engine

260‧‧‧retrieval interface unit

280‧‧‧guidance data storage device

300‧‧‧Natural Language Processor

302, 832, 834, 836, 838 ‧ ‧ records

304‧‧‧ title bar

306‧‧‧Content bar

308‧‧‧sub-column

310‧‧‧guide bar

312‧‧‧Value column

314‧‧‧Source column

316‧‧‧heat column

318, 852, 854‧‧‧preference column

320, 862, 864‧‧‧aversion column

400‧‧‧Knowledge-assisted understanding module

500, 500’, 700, 700’‧‧‧ Natural Language Dialogue System

501, 701‧‧‧ voice input

507, 507’, 707‧ ‧ voice response

510, 710‧‧‧ voice sampling module

511, 511', 711, 906, 906' ‧ ‧ return answers

513, 513’, 713 ‧ ‧ voice

522, 722‧‧‧ voice recognition module

524, 724‧‧‧ natural language processing module

526, 726‧‧‧Speech synthesis module

530, 740‧‧‧Speech synthesis database

702‧‧‧voice integrated processing module

715‧‧‧User preference information

717‧‧‧User preference record

730‧‧‧Characteristic Database

872, 874‧‧‧columns

900, 1010‧‧‧ mobile terminal devices

908, 908’‧‧‧ Candidate List

910, 1011‧‧‧ voice receiving unit

920, 1013‧‧‧ Data Processing Unit

930, 1015‧‧‧ display unit

940‧‧‧storage unit

1000‧‧‧Information System

1020‧‧‧Server

SP1‧‧‧ first voice

SP2‧‧‧second voice

1200, 1300‧‧‧ voice control system

1210‧‧‧Auxiliary starter

1212, 1222‧‧‧ wireless transmission module

1214‧‧‧ Trigger Module

1216‧‧‧Wireless rechargeable battery

12162‧‧‧ battery unit

12164‧‧‧Wireless charging module

1220, 1320‧‧‧ mobile terminal devices

1221‧‧‧ voice system

1224‧‧‧Voice sampling module

1226‧‧‧Speech synthesis module

1227‧‧‧Voice output interface

1228‧‧‧Communication Module

1230‧‧‧ (cloud) server

1232‧‧‧Voice Understanding Module

12322‧‧‧Voice recognition module

12324‧‧‧Voice Processing Module

S410~S450‧‧‧ steps of a retrieval method according to an embodiment of the present invention

S510~S590‧‧‧ steps of the working process of the natural language understanding system according to an embodiment of the invention

S602, S604, S606, S608, S610, S612‧‧‧ steps of modifying the voice response

S802~S890‧‧‧ steps of a natural language dialogue method according to an embodiment of the present invention

S1100~S1190‧‧‧ steps of a speech recognition based selection method according to an embodiment of the invention

S1402~S1412‧‧‧ steps of a voice manipulation method according to an embodiment of the present invention

FIG. 1 is a block diagram of a natural language understanding system in accordance with an embodiment of the present invention.

FIG. 2 is a diagram showing the results of analysis of various request information of a user by a natural language processor according to an embodiment of the present invention.

FIG. 3A is a schematic diagram of a plurality of records having a particular data structure stored by a structured database, in accordance with an embodiment of the present invention.

FIG. 3B is a schematic diagram of a plurality of records having a particular data structure stored by a structured database in accordance with another embodiment of the present invention.

FIG. 3C is a schematic diagram of the guidance data stored by the guidance data storage device according to an embodiment of the invention.

FIG. 4A is a flowchart of a retrieval method in accordance with an embodiment of the present invention.

FIG. 4B is a flowchart showing the operation of a natural language understanding system in accordance with another embodiment of the present invention.

FIG. 5A is a block diagram of a natural language dialogue system according to an embodiment of the invention.

FIG. 5B is a block diagram of a natural language understanding system according to an embodiment of the invention.

FIG. 5C is a block diagram of a natural language dialogue system according to another embodiment of the present invention.

FIG. 6 is a flow chart of a method for modifying a voice response according to an embodiment of the invention.

FIG. 7A is a block diagram of a natural language dialogue system according to an embodiment of the invention.

FIG. 7B is a block diagram of a natural language dialogue system according to another embodiment of the present invention.

FIG. 8A is a flowchart of a natural language dialogue method according to an embodiment of the invention.

FIG. 8B is a schematic diagram of a plurality of records having a particular data structure stored by a structured database in accordance with yet another embodiment of the present invention.

FIG. 9 is a schematic diagram of a system of a mobile terminal device according to an embodiment of the invention.

FIG. 10 is a schematic diagram of a system of an information system according to an embodiment of the invention.

FIG. 11 is a flowchart of a selection method based on speech recognition according to an embodiment of the present invention.

FIG. 12 is a block diagram of a voice control system according to an embodiment of the invention.

FIG. 13 is a block diagram of a voice control system according to another embodiment of the invention.

FIG. 14 is a flowchart of a voice control method according to an embodiment of the invention.

Since the existing fixed-word-list implementation can only provide rigid input rules, its ability to judge the user's variable input sentences is very limited, so it often misjudges the user's intent and fails to find the required information, or outputs unnecessary information to the user because of insufficient judgment. In addition, existing search engines can only provide users with scattered, unrelated search results, so users have to spend time filtering out the required information one item at a time, which not only wastes time but may also miss the required information. For the aforementioned problems of the prior art, the present invention proposes a structured data retrieval method and system that provide specific columns in the structured data to store different types of data elements, allow users to input information for retrieval in natural speech, quickly and correctly determine the user's intent, and then provide the required information to the user or provide more precise information for the user's selection.

FIG. 1 is a block diagram of a natural language understanding system in accordance with an embodiment of the present invention. As shown in FIG. 1, the natural language understanding system 100 includes a retrieval system 200, a natural language processor 300, and a knowledge assisted understanding module 400. The knowledge assisted understanding module 400 is coupled to the natural language processor 300 and the retrieval system 200, and the retrieval system 200 further includes a structured database 220, a search engine 240, and a retrieval interface unit 260. The search engine 240 is coupled to the structured database 220 and the retrieval interface unit 260. In the present embodiment the retrieval system 200 includes the retrieval interface unit 260, but this is not intended to limit the present invention; in some embodiments the retrieval interface unit 260 may be omitted, and the keyword 108 may instead be received in other ways (for example, through an API (Application Programming Interface) call) to cause the search engine 240 to perform a full-text search of the structured database 220.

When the user issues the request information 102 to the natural language understanding system 100, the natural language processor 300 analyzes the request information 102 and sends the resulting possible intent grammar data 106 to the knowledge assisted understanding module 400, where each possible intent grammar data 106 includes a keyword 108 and an intent data 112. The knowledge assisted understanding module 400 then extracts the keywords 108 from the possible intent grammar data 106 and sends them to the retrieval system 200, and stores the intent data 112 within the knowledge assisted understanding module 400; the search engine 240 in the retrieval system 200 performs a full-text search on the structured database 220 according to the keyword 108 and transmits the response result 110 of the full-text search back to the knowledge assisted understanding module 400. The knowledge assisted understanding module 400 then compares the stored intent data 112 against the response result 110 and sends the resulting determined intent grammar data 114 to the analysis result output module 116, and the analysis result output module 116, based on the determined intent grammar data 114, transmits the analysis result 104 to a server (not shown), which queries the information required by the user and then sends it to the user. It should be noted that the analysis result 104 may include the keyword 108, and may also include part of the information of the record containing the keyword 108 (for example, the number of the record 302 of FIG. 3A or 3B), or all of it. In addition, the analysis result 104 can be converted by the server directly into a voice output to the user, or converted into the corresponding voice output after specific processing (the specific methods, and the content and information involved, are described in detail later). The information output by the retrieval system 200 can be designed by a person skilled in the art according to actual needs, and the present invention is not limited in this respect.
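
To make the data flow above concrete, the following is a minimal Python sketch of the interaction just described, with the natural language processor, the retrieval system, and the intent comparison modeled as plain callables; all names and signatures here are illustrative assumptions rather than the patent's implementation.

```python
# A minimal sketch (not the patent's implementation) of the flow just described.
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class IntentGrammarData:          # a possible intent grammar data 106
    intent: str                   # intent data 112, e.g. "<watchfilm>"
    keyword: str                  # keyword 108, e.g. "Let the Bullets Fly"


def knowledge_assisted_understanding(
    request_info: str,                                        # request information 102
    parse: Callable[[str], List[IntentGrammarData]],          # natural language processor 300
    full_text_search: Callable[[str], Optional[str]],         # retrieval system 200; returns guidance data
    matches_intent: Callable[[str, str], bool],               # compares guidance data with intent data 112
) -> Optional[IntentGrammarData]:
    """Sketch of the knowledge assisted understanding module 400."""
    candidates = parse(request_info)                          # possible intent grammar data 106
    for candidate in candidates:
        guidance_data = full_text_search(candidate.keyword)   # response result 110
        if guidance_data and matches_intent(guidance_data, candidate.intent):
            return candidate                                  # determined intent grammar data 114
    return None
```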

The analysis result output module 116 may be combined with other modules; for example, in one embodiment it may be incorporated into the knowledge assisted understanding module 400, or in another embodiment it may be separated from the natural language understanding system 100 and located in a server (for example, a server that includes the natural language understanding system 100), in which case the server receives the determined intent grammar data 114 for processing. In addition, the natural language understanding system 100 can store the intent data 112 in a storage device inside the knowledge assisted understanding module 400, elsewhere in the natural language understanding system 100, in a server (for example, one that includes the natural language understanding system 100), or in any storage device from which the knowledge assisted understanding module 400 can retrieve it; the present invention is not limited in this respect. Moreover, the retrieval system 200, the natural language processor 300, and the knowledge assisted understanding module 400 included in the natural language understanding system 100 can each be implemented in hardware, software, firmware, or any combination of these, and the present invention is not limited thereto.

The aforementioned natural language understanding system 100 may be located in a cloud server, in a server in a local area network, or even in a personal computer, a mobile computing device (such as a notebook computer), or a mobile communication device (such as a mobile phone). The components of the natural language understanding system 100 or the retrieval system 200 need not be disposed in the same machine; depending on actual needs, they may be distributed across different devices or systems that communicate through various communication protocols. For example, the natural language processor 300 and the knowledge assisted understanding module 400 can be configured in the same smart phone while the retrieval system 200 is configured in a cloud server; or the retrieval interface unit 260, the natural language processor 300, and the knowledge assisted understanding module 400 can be configured in the same notebook computer while the search engine 240 and the structured database 220 are configured in another server in the local area network. In addition, when the natural language understanding system 100 is located at a server (whether a cloud server or a local area network server), the retrieval system 200, the natural language processor 300, and the knowledge assisted understanding module 400 may be configured in different computer hosts, with the server's main system coordinating the transmission of information and data between them. Of course, two or all of the retrieval system 200, the natural language processor 300, and the knowledge assisted understanding module 400 can also be combined into one computer host according to actual needs, and the present invention does not limit the configuration of this part.

In an embodiment of the present invention, the user can send the request information to the natural language processor 300 in various ways, for example by spoken voice input or by a text description. For example, if the natural language understanding system 100 is located in a server (not shown) in the cloud or in a local area network, the user can first input the request information 102 with a mobile device (such as a mobile phone, a PDA, a tablet computer, or the like), and the request information 102 is then transmitted through the telecommunication system provider to the natural language understanding system 100 in the server so that the natural language processor 300 can analyze the request information 102. Finally, after the user's intent is confirmed, the server, through the analysis result output module 116 and the corresponding analysis result 104, queries the information requested by the user and transmits it back to the user's mobile device. For example, the request information 102 may be a question for which the user expects the natural language understanding system 100 to provide an answer (for example, "How is the weather in Shanghai tomorrow?"); when the natural language understanding system 100 determines that the user's intent is to query tomorrow's weather in Shanghai, the queried weather data is sent to the user through the analysis result output module 116 as the analysis result 104. On the other hand, if the user's instruction to the natural language understanding system 100 is "I want to see Let the Bullets Fly" or "I want to listen to The Days We Walked Through Together", then because "Let the Bullets Fly" or "The Days We Walked Through Together" may belong to different fields, the natural language processor 300 parses the user's request information 102 into one or more possible intent grammar data 106, which include the keyword 108 and the intent data 112, and the user's intent is further confirmed after the search engine 240 performs a full-text search on the structured database 220 in the retrieval system 200.

Further, when the user's request information 102 is "How is the weather in Shanghai tomorrow?", the natural language processor 300 can generate one possible intent grammar data 106 after analysis: "<queryweather>, <city>=Shanghai, <time>=tomorrow".

In an embodiment, if the natural language understanding system 100 considers the user's intent to be sufficiently clear, the user's intent (that is, querying tomorrow's weather in Shanghai) can be output directly to the server through the analysis result output module 116, and the server can return the weather specified by the user. For another example, when the user's request information 102 is "I want to see the Romance of the Three Kingdoms", the natural language processor 300 can generate three possible intent grammar data 106 after analysis: "<readbook>, <bookname>=Romance of the Three Kingdoms"; "<watchTV>, <TVname>=Romance of the Three Kingdoms"; and "<watchfilm>, <filmname>=Romance of the Three Kingdoms".

This is because the keyword 108 (that is, "Romance of the Three Kingdoms") in the possible intent grammar data 106 may belong to different fields, namely books (<readbook>), television dramas (<watchTV>), and movies (<watchfilm>), so one request information 102 can be analyzed into a plurality of possible intent grammar data 106, and further analysis by the knowledge assisted understanding module 400 is needed to confirm the user's intent. As another example, if the user enters "I want to see Let the Bullets Fly", then because "Let the Bullets Fly" may be a movie name or a book name, there may be at least the following two possible intent grammar data 106: "<readbook>, <bookname>=Let the Bullets Fly"; and "<watchfilm>, <filmname>=Let the Bullets Fly"; these belong to the two fields of books and movies. The above possible intent grammar data 106 are then further analyzed by the knowledge assisted understanding module 400 to obtain the determined intent grammar data 114 that expresses the clear intent of the user's request information. When analyzing the possible intent grammar data 106, the knowledge assisted understanding module 400 can transmit the keyword 108 (for example, "Romance of the Three Kingdoms" or "Let the Bullets Fly") to the retrieval system 200 via the retrieval interface unit 260. The structured database 220 in the retrieval system 200 stores a plurality of records having a particular data structure, and the search engine 240 can perform a full-text search of the structured database 220 with the keyword 108 received by the retrieval interface unit 260 and transmit the response result 110 obtained by the full-text search back to the knowledge assisted understanding module 400, from which the knowledge assisted understanding module 400 can obtain the determined intent grammar data 114. The details of how the full-text search of the structured database 220 determines the intent grammar data 114 are described more fully later with reference to FIGS. 3A and 3B and the related paragraphs.

In the concept of the present invention, the natural language understanding system 100 can first extract the keyword 108 from the user's request information 102 and determine the domain attribute of the keyword 108 from the result of the full-text search of the structured database 220. For example, as described above, entering "I want to see the Romance of the Three Kingdoms" generates possible intent grammar data 106 belonging to the three fields of books, television dramas, and movies, which are then further analyzed to confirm the user's clear intent. Users can therefore easily express their intent or information in a colloquial manner without having to memorize specific terms, such as the specific terms of the fixed word list in the existing practice.

FIG. 2 is a diagram showing the results of the analysis of various request information of a user by the natural language processor 300, in accordance with an embodiment of the present invention.

As shown in FIG. 2, when the user's request information 102 is "How is the weather in Shanghai tomorrow?", the natural language processor 300 may generate, after analysis, one possible intent grammar data 106: "<queryweather>, <city>=Shanghai, <time>=tomorrow".

Here the intent data 112 is "<queryweather>", and the keywords 108 are "Shanghai" and "tomorrow". Since only one set of intent grammar data 106 (query weather, <queryweather>) is obtained after analysis by the natural language processor 300, in an embodiment the knowledge assisted understanding module 400 can directly send the keywords 108 "Shanghai" and "tomorrow" to the server as the analysis result 104 to look up the weather information (for example, to query tomorrow's weather profile for Shanghai, including weather, temperature, and so on), without performing a full-text search on the structured database 220 to determine the user's intent (because the knowledge assisted understanding module 400 can confirm the user's intent from the single possible intent grammar data 106 generated from the request information 102). Of course, in an embodiment the structured database 220 can still be searched in full text for a more accurate determination of the user's intent, and those skilled in the art can make changes according to actual needs.

In addition, when the user's request information 102 is "I want to see Let the Bullets Fly", two possible intent grammar data 106 can be generated: "<readbook>, <bookname>=Let the Bullets Fly" and "<watchfilm>, <filmname>=Let the Bullets Fly", with two corresponding intent data 112, "<readbook>" and "<watchfilm>", and two identical keywords 108, "Let the Bullets Fly", indicating that the intent may be to read the book "Let the Bullets Fly" or to watch the movie "Let the Bullets Fly". To further confirm the user's intent, the knowledge assisted understanding module 400 transmits the keyword 108 "Let the Bullets Fly" to the retrieval interface unit 260, and the search engine 240 then performs a full-text search on the structured database 220 with the keyword 108 "Let the Bullets Fly" to confirm whether "Let the Bullets Fly" is a book title or a movie name, and thereby confirm the user's intent.

Furthermore, when the user's request information 102 is "I want to listen to The Days We Walked Through Together", two possible intent grammar data 106 can be generated: "<playmusic>, <singer>=walked through together, <songname>=the days" and "<playmusic>, <songname>=the days we walked through together", with two corresponding identical intent data 112 "<playmusic>" and two sets of corresponding keywords 108: "walked through together" together with "the days", and "the days we walked through together". These indicate that the intent may be to listen to the song "the days" sung by the singer "walked through together", or to listen to the song "The Days We Walked Through Together". At this time the knowledge assisted understanding module 400 can transmit the first set of keywords 108 ("walked through together" and "the days") and the second set of keywords ("the days we walked through together") to the retrieval interface unit 260, to confirm whether there is a song called "the days" sung by a singer called "walked through together" (the user intent implied by the first set of keywords), or whether there is a song called "The Days We Walked Through Together" (the user intent implied by the second set of keywords), and thereby confirm the user's intent. However, the present invention is not limited to the format and names used here to represent each possible intent grammar data and intent data.
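
For illustration only, the three requests analyzed in FIG. 2 can be written as plain Python dictionaries (intent data plus keyword slots) as sketched below; the tag names follow the examples in the text, while the dictionary representation itself is an assumption.

```python
# Illustrative representation of the possible intent grammar data 106 of FIG. 2.
possible_intent_grammar_data = {
    "How is the weather in Shanghai tomorrow?": [
        {"intent": "<queryweather>", "city": "Shanghai", "time": "tomorrow"},
    ],
    "I want to see Let the Bullets Fly": [
        {"intent": "<readbook>",  "bookname": "Let the Bullets Fly"},
        {"intent": "<watchfilm>", "filmname": "Let the Bullets Fly"},
    ],
    "I want to listen to The Days We Walked Through Together": [
        {"intent": "<playmusic>", "singer": "walked through together", "songname": "the days"},
        {"intent": "<playmusic>", "songname": "the days we walked through together"},
    ],
}
```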

FIG. 3A is a schematic diagram of a plurality of records having a particular data structure stored by the structured database 220, in accordance with an embodiment of the present invention.

In general, in some existing full-text search methods the search results obtained are unstructured data (for example those returned by Google or Baidu), because the information in the search results is scattered and unrelated, so the user must review each item of information one by one, which limits practicality. In the concept of the present invention, however, the efficiency and correctness of retrieval can be effectively improved by the structured database, because the numerical data contained in each record of the structured database disclosed by the present invention are related to each other and together express the attributes of the record. Therefore, when the search engine performs a full-text search on the structured database and a record's numerical data matches the keyword, the guidance data corresponding to that numerical data is output to confirm the intent of the request information. The implementation details of this part are further described by the following examples.

In the embodiment of the present invention, each record 302 stored in the structured database 220 includes a title bar 304 and a content bar 306. The title bar 304 includes a plurality of sub-columns 308, each of which includes a guide bar 310 and a value column 312. The guide bars 310 of the records 302 are used to store guidance data, and the value columns 312 of the records 302 are used to store numerical data. Taking record 1 shown in FIG. 3A as an example, the three sub-columns 308 in the title bar 304 of record 1 respectively store: "singerguid: Andy Lau", "songnameguid: the days we walked through together", and "songtypeguid: Hong Kong and Taiwan, Cantonese, popular". The guide bars 310 of these sub-columns 308 store the guidance data "singerguid", "songnameguid", and "songtypeguid" respectively, and the corresponding value columns 312 store the numerical data "Andy Lau", "the days we walked through together", and "Hong Kong and Taiwan, Cantonese, popular" respectively. The guidance data "singerguid" indicates that its numerical data is a singer's name, "songnameguid" indicates that its numerical data is a song name, and "songtypeguid" indicates that its numerical data ("Hong Kong and Taiwan, Cantonese, popular") is a song type. Each guidance data here may be represented by a different specific string or word and is not limited to those shown. The content bar 306 of record 1 stores the lyrics of "the days we walked through together" or other data (such as the composer and lyricist); the actual data in the content bar 306 of each record is not the focus of the present invention and is therefore only schematically depicted in FIG. 3A.
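
For readability only, record 1 of FIG. 3A can be written as a plain Python dictionary as sketched below; this representation is an assumption made for illustration and is not the storage format used by the structured database 220.

```python
# Record 1 of FIG. 3A, sketched as a dictionary.
record_1 = {
    "title_bar": {                                   # title bar 304: three sub-columns 308
        # guide bar 310 (guidance data)  ->  value column 312 (numerical data)
        "singerguid": "Andy Lau",
        "songnameguid": "the days we walked through together",
        "songtypeguid": "Hong Kong and Taiwan, Cantonese, popular",
    },
    "content_bar": "lyrics, composer/lyricist and other details",   # content bar 306
}
```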

In the foregoing embodiment, each record includes a title bar 304 and a content bar 306, and each sub-column 308 in the title bar 304 includes a guide bar 310 and a value column 312, but this is not intended to limit the present invention; some embodiments may have no content bar 306, and some embodiments may even have no guide bar 310.

In addition, in the embodiment of the present invention, a first special character is stored between the data of adjacent sub-columns 308 to separate the data of each sub-column 308, and a second special character is stored between the data of the guide bar 310 and the value column 312 to separate the guide bar data from the value column data. For example, as shown in FIG. 3A, "singerguid" and "Andy Lau", "songnameguid" and "the days we walked through together", and "songtypeguid" and "Hong Kong and Taiwan, Cantonese, popular" are each separated by the second special character ":", and the sub-columns 308 of record 1 are separated from one another by the first special character "|". The present invention is not, however, limited to ":" or "|" as the special characters used for separation.
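
The following minimal sketch shows how a title bar could be serialized and parsed with the two separators described above, assuming ":" as the second special character and "|" as the first special character, as in FIG. 3A.

```python
# Illustrative separators; the patent allows other special characters.
FIRST_SPECIAL = "|"     # separates sub-columns 308
SECOND_SPECIAL = ":"    # separates guide bar 310 from value column 312


def serialize_title_bar(sub_columns: dict) -> str:
    return FIRST_SPECIAL.join(
        f"{guide}{SECOND_SPECIAL}{value}" for guide, value in sub_columns.items()
    )


def parse_title_bar(title_bar: str) -> dict:
    return dict(
        sub_column.split(SECOND_SPECIAL, 1)
        for sub_column in title_bar.split(FIRST_SPECIAL)
    )


# serialize_title_bar(record_1["title_bar"]) would give, for example:
# "singerguid:Andy Lau|songnameguid:the days we walked through together|songtypeguid:Hong Kong and Taiwan, Cantonese, popular"
```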

On the other hand, in the embodiment of the present invention, each sub-column 308 in the title bar 304 may have a fixed length, for example 32 words, and the guidance data indicator in the guide bar 310 may occupy 7 or 8 bits (allowing up to 128 or 256 different guidance data); in addition, the numbers of bits required for the first special character and the second special character can be fixed, so after the guide bar 310, the first special character, and the second special character are deducted from the fixed length of the sub-column 308, the remaining bits can be used to store the numerical data of the value column 312. Furthermore, since the length of each sub-column 308 is fixed, the data stored in a sub-column 308 can be laid out in sequence as the guide bar 310 (the indicator of the guidance data), the second special character, the numerical data of the value column 312, and the first special character, and, as mentioned above, the lengths of these four pieces of data are also fixed. In practice, therefore, the bits of the guide bar 310 can be skipped (for example, by skipping the first 7 or 8 bits), followed by the bits of the second special character (for example, skipping 1 word, that is, 8 bits), and after deducting the bits occupied by the first special character at the end (for example, the last word, 8 bits), the numerical data of the value column 312 can be obtained directly. For example, in the first sub-column 308 of record 1, 32-3=29 words remain for storing the numerical data of the value column 312, where the 3 (that is, 1+1+1) represents one word each for the guidance data indicator of the guide bar 310, the first special character, and the second special character; the required comparison of the field type can then be performed. After the comparison of the currently obtained numerical data is completed (whether or not the comparison succeeds), the numerical data of the next sub-column 308 can be taken out in the same way (for example, in the second sub-column 308 of record 1 the numerical data "the days we walked through together" is taken out directly) and compared. Comparison can start from record 1 in this way, and after all the numerical data of record 1 have been compared, the numerical data of the first sub-column 308 in the title bar 304 of record 2 (for example, "Feng Xiaogang") is taken out and compared. This comparison procedure continues until the numerical data of all records have been compared.

It should be noted that the length of the sub-column 308 mentioned above, and the numbers of bits used by the guide bar 310, the first special character, and the second special character, may be changed according to the actual application, and the present invention does not limit this. The foregoing method of taking out the numerical data for comparison is only one embodiment and is not intended to limit the present invention; another embodiment may perform the full-text search by other means (for example, by character-by-character comparison). In addition, skipping the guide bar 310, the second special character, and the first special character may be achieved by bit shifting (for example, by division), and this part may be implemented in hardware, in software, or in a combination of both; those skilled in the art can make changes according to actual needs. In another embodiment of the present invention, each sub-column 308 in the title bar 304 may have a fixed length, the guide bar 310 in the sub-column 308 may have another fixed length, and the title bar 304 may not include the first special character and the second special character; since the lengths of each sub-column 308 and each guide bar 310 are fixed, the guidance data or numerical data in a sub-column 308 can be taken out directly by skipping a specific number of bits or by bit shifting (for example, by division).
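
A sketch of the fixed-length layout described above follows. The sizes are assumptions: one byte stands in for one "word", and the guidance indicator, the second special character, the value data, and the trailing first special character are assumed to occupy 1, 1, 29, and 1 bytes of a 32-byte sub-column respectively.

```python
# Assumed fixed-width sub-column layout:
# [guidance indicator][second special character][value data][first special character]
COLUMN_SIZE = 32     # fixed size of a sub-column 308 (assumed unit: bytes)
GUIDE_SIZE = 1       # the 7- or 8-bit guidance indicator fits in one byte
SEP_SIZE = 1         # second special character
TRAIL_SIZE = 1       # first special character at the end of the sub-column
VALUE_SIZE = COLUMN_SIZE - GUIDE_SIZE - SEP_SIZE - TRAIL_SIZE    # 32 - 3 = 29


def nth_value(title_bar: bytes, n: int) -> bytes:
    """Jump straight to the value data of the n-th sub-column (0-based)."""
    start = n * COLUMN_SIZE + GUIDE_SIZE + SEP_SIZE
    return title_bar[start:start + VALUE_SIZE].rstrip(b"\x00")


def nth_guidance_indicator(title_bar: bytes, n: int) -> int:
    """Read the guidance indicator, an index into the guidance data storage device."""
    return title_bar[n * COLUMN_SIZE]
```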

It should be noted that, since the sub-column 308 has a fixed length as mentioned above, a counter can be used in the natural language understanding system 100 (or in a server including the natural language understanding system 100) to record which sub-column 308 of a record is currently being compared. In addition, another counter may be used to record the order of the record currently being compared. For example, when a first counter is used to indicate the order of the record currently being compared and a second counter is used to indicate the order of the sub-column currently being compared, if the third sub-column 308 of record 2 of FIG. 3A is currently being compared (that is, "filenameguid: Huayi Brothers" is being compared), the value stored in the first counter is 2 (indicating that record 2 is currently being compared) and the value stored in the second counter is 3 (indicating that the third sub-column 308 is currently being compared). Moreover, the reason the guidance data of the guide bar 310 is stored with only 7 or 8 bits, as described above, is to leave most of the words of the sub-column 308 for storing the numerical data; the actual guidance data can use those 7 or 8 bits as an indicator and be read from the guidance data storage device 280 of the retrieval system 200, in which the guidance data is stored in the form of a table, although any other means that the retrieval system 200 can access may also be used in the present invention. Therefore, in actual operation, besides directly taking out the numerical data for comparison, when a matching result is produced the guidance data can be taken out directly, according to the values of the two counters, as the response result 110 for the knowledge assisted understanding module 400. For example, when the second sub-column 308 of record 6 (that is, "songnameguid: Betrayal") is matched successfully, the current values of the first counter and the second counter are known to be 6 and 2 respectively, so these two values are used to access the guidance data storage device 280 shown in FIG. 3C, and the guidance data of sub-column 2 of record 6 is found to be "songnameguid". In an embodiment, when the length of the sub-column 308 is fixed, all of the bits of the sub-column 308 can be used to store the numerical data, so that the guide bar, the first special character, and the second special character can be removed entirely; the search engine 240 only needs to know that each time the fixed length is crossed it has moved to another sub-column 308 and to add one to the second counter (and, of course, the stored value of the first counter is also incremented every time the next record is searched). For example, in an embodiment the size of each record can be set to a predetermined value and the number of sub-columns 308 it contains can be fixed to a predetermined number, so that the search engine 240 can easily know that it has reached the end of a record after scanning the predetermined amount of data in that record. In another embodiment, a specific third special character (such as a period or other similar symbol) may be stored at the end of a record, and the search engine 240 likewise knows that it has reached the end of the record when it finds that symbol. This method can provide more bits for storing the numerical data.
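
The following sketch illustrates the two counters and the table-style guidance data storage device 280 described above; the table contents and the record representation are assumptions modeled on FIG. 3A and FIG. 3C.

```python
# Guidance data storage device 280, sketched as a table keyed by the two counters.
guidance_data_storage = {
    # (record, sub-column) -> guidance data
    (1, 1): "singerguid",
    (1, 2): "songnameguid",
    (1, 3): "songtypeguid",
}


def full_text_search(records, keyword):
    """records: {record number: [numerical data of each sub-column, in order]}"""
    for record_counter, values in records.items():                 # first counter
        for column_counter, value in enumerate(values, start=1):   # second counter
            if keyword in value:
                # Matched: return the guidance data as the response result 110.
                return guidance_data_storage.get((record_counter, column_counter))
    return None


# full_text_search({1: ["Andy Lau",
#                       "the days we walked through together",
#                       "Hong Kong and Taiwan, Cantonese, popular"]},
#                  "the days we walked through together")
# returns "songnameguid" (first counter = 1, second counter = 2).
```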

Another example illustrates the process of returning the response result 110 to the knowledge assisted understanding module 400 for further processing when a matching result is produced. Corresponding to the data structure of the record 302 described above, in the embodiment of the present invention, when the user's request information 102 is "I want to see Let the Bullets Fly", two possible intent grammar data 106 can be generated: "<readbook>, <bookname>=Let the Bullets Fly" and "<watchfilm>, <filmname>=Let the Bullets Fly". The search engine 240 uses the keyword 108 "Let the Bullets Fly" received by the retrieval interface unit 260 to perform a full-text search on the title bars 304 of the records stored in the structured database 220 of FIG. 3A. In this full-text search, record 5, whose title bar 304 stores the numerical data "Let the Bullets Fly", is found, and a matching result is therefore produced. Next, the retrieval system 200 returns the guidance data "filmnameguid" that corresponds to the keyword 108 "Let the Bullets Fly" in the third sub-column 308 of the title bar 304 of record 5 as the response result 110 back to the knowledge assisted understanding module 400. Since the title bar of record 5 contains the guidance data "filmnameguid" corresponding to the numerical data "Let the Bullets Fly", the knowledge assisted understanding module 400, by comparing the guidance data "filmnameguid" of record 5 with the intent data 112 "<watchfilm>" and "<readbook>" previously stored from the possible intent grammar data 106, can determine that the determined intent grammar data 114 of the request information is "<watchfilm>, <filmname>=Let the Bullets Fly" (because both contain "film"). In other words, "Let the Bullets Fly" in the user's request information 102 describes a movie name, and the user's request information 102 intends to watch the movie "Let the Bullets Fly" rather than to read the book. The confirmed "<watchfilm>, <filmname>=Let the Bullets Fly" is treated as the determined intent grammar data 114 and sent to the analysis result output module 116 for further processing.

Another example provides further explanation. When the user's request information 102 is "I want to listen to The Days We Walked Through Together", two possible intent grammar data 106 may be generated: "<playmusic>, <singer>=walked through together, <songname>=the days" and "<playmusic>, <songname>=the days we walked through together". The search engine 240 uses the two sets of keywords 108 received by the retrieval interface unit 260 ("walked through together" with "the days"; and "the days we walked through together") to perform a full-text search on the title bars 304 of the records stored in the structured database 220 of FIG. 3A. In this full-text search, no matching result for the first set of keywords 108 ("walked through together" and "the days") is found in the title bars 304 of any record, but record 1, which corresponds to the second set of keywords 108 "the days we walked through together", is found. The retrieval system 200 therefore returns the guidance data "songnameguid" in the title bar 304 of record 1 that corresponds to the second set of keywords 108 as the response result 110 back to the knowledge assisted understanding module 400. After receiving the guidance data "songnameguid" corresponding to the numerical data "the days we walked through together", the knowledge assisted understanding module 400 compares it with the intent data 112 in the possible intent grammar data 106 (that is, with <singer> and <songname> in "<playmusic>, <singer>=walked through together, <songname>=the days" and "<playmusic>, <songname>=the days we walked through together"), and finds that this request information 102 does not describe a singer's name but describes a song whose name is "the days we walked through together" (because only <songname> is matched successfully). The knowledge assisted understanding module 400 can therefore determine from this comparison that the determined intent grammar data 114 of the request information 102 is "<playmusic>, <songname>=the days we walked through together", and that the intent of the user's request information 102 is to listen to the song "The Days We Walked Through Together".
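
The comparison performed by the knowledge assisted understanding module 400 in the two examples above can be sketched as follows; the guide-to-slot mapping and the search callable are assumptions introduced only for illustration, and a candidate is kept only when every one of its slots is confirmed by the guidance data returned for its value.

```python
# Assumed mapping from guidance data to intent grammar slot names.
guide_to_slot = {
    "singerguid": "singer",
    "songnameguid": "songname",
    "filmnameguid": "filmname",
    "booknameguid": "bookname",
}


def pick_determined_intent(candidates, search):
    """candidates: dicts such as {"intent": "<playmusic>", "songname": "..."};
    search(value) returns the guidance data matched for that value, or None."""
    for candidate in candidates:
        slots = {k: v for k, v in candidate.items() if k != "intent"}
        confirmed = all(
            guide_to_slot.get(search(value)) == slot for slot, value in slots.items()
        )
        if slots and confirmed:
            return candidate              # determined intent grammar data 114
    return None


# In the song example, the candidate with only "songname" is confirmed by
# "songnameguid", while the candidate that also needs "singer" is rejected.
```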

In another embodiment of the present invention, the retrieved response result 110 may be an exact match record that completely matches the keywords 108, or a partial match record that only partially matches the keywords 108. For example, if the user's request information 102 is "I want to listen to Xiao Jingteng's Betrayal", the natural language processor 300 similarly analyzes it into two possible intent grammar data 106: "<playmusic>, <singer>=Xiao Jingteng, <songname>=Betrayal" and "<playmusic>, <songname>=Xiao Jingteng's Betrayal", and sends two sets of keywords 108 ("Xiao Jingteng" and "Betrayal"; and "Xiao Jingteng's Betrayal") to the retrieval interface unit 260; the search engine 240 then performs a full-text search of the title bars 304 of the records 302 stored in the structured database 220 of FIG. 3A with the keywords 108 received by the retrieval interface unit 260. In this full-text search the second set of keywords 108, "Xiao Jingteng's Betrayal", does not match any record, but the first set of keywords 108, "Xiao Jingteng" and "Betrayal", produces matching results in record 6 and record 7. Since the first set of keywords 108 matches only the numerical data "Xiao Jingteng" and "Betrayal" in record 6 but does not match its other numerical data "Yang Zongwei" and "Cao Ge", record 6 is a partial match record (note that, likewise, record 5 corresponding to the request information 102 "I want to see Let the Bullets Fly" above and record 1 corresponding to the request information "I want to listen to The Days We Walked Through Together" are partial match records), whereas the keywords "Xiao Jingteng" and "Betrayal" match all of the numerical data of record 7, so record 7 is an exact match record. In the embodiment of the present invention, when the retrieval interface unit 260 outputs a plurality of response results 110 to the knowledge assisted understanding module 400, it may output in order the response results 110 of the exact match records (that is, records in which all the numerical data are matched) and of the partial match records (that is, records in which only part of the numerical data is matched), where the priority of the exact match records is higher than that of the partial match records. Therefore, when the retrieval interface unit 260 outputs the response results 110 of record 6 and record 7, the output priority of record 7 is higher than that of record 6, because all of the numerical data of record 7, "Xiao Jingteng" and "Betrayal", produce matching results, while record 6 also contains "Yang Zongwei" and "Cao Ge", which do not. That is to say, the more closely a record stored in the structured database 220 matches the keywords 108 in the request information 102, the more likely it is to be output first so that the user can view or select it and the corresponding determined intent grammar data 114 can be obtained. In another embodiment, the response result 110 corresponding to the record with the highest priority can be output directly as the determined intent grammar data 114.
The foregoing is not intended to limit the invention; in another embodiment the search may stop as soon as a matching record is found (for example, with "I want to listen to Xiao Jingteng's Betrayal" as the request information 102, when record 6 is retrieved and a matching result is produced, the guidance data corresponding to record 6 is output as the response result 110), without sorting by priority, in order to speed up retrieval. In another embodiment, the processing corresponding to the record with the highest priority can be executed directly and the result provided to the user; for example, when the record with the highest priority is the movie of the Romance of the Three Kingdoms, the movie can be played for the user directly, and if the record with the highest priority is "Betrayal" sung by Xiao Jingteng, the song can be played for the user directly. It should be noted that the description here is only illustrative and is not intended to limit the invention.
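
The ordering of exact match records before partial match records can be sketched as follows, using the record contents of the "Betrayal" example; the representation of the records is illustrative only.

```python
# Rank matching records: exact match records first, then partial match records.
def rank_matches(records, keywords):
    """records: {record number: [numerical data of the record]}"""
    exact, partial = [], []
    for number, values in records.items():
        matched = [v for v in values if any(kw in v for kw in keywords)]
        if matched and len(matched) == len(values):
            exact.append(number)       # e.g. record 7: "Xiao Jingteng", "Betrayal"
        elif matched:
            partial.append(number)     # e.g. record 6 also holds "Yang Zongwei", "Cao Ge"
    return exact + partial             # exact match records have higher priority


# rank_matches({6: ["Xiao Jingteng", "Yang Zongwei", "Cao Ge", "Betrayal"],
#               7: ["Xiao Jingteng", "Betrayal"]},
#              ["Xiao Jingteng", "Betrayal"])   # -> [7, 6]
```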

In still another embodiment of the present invention, if the user's request information 102 is "I want to listen to Andy Lau's Betrayal", then one of the possible intent grammar data 106 is "<playmusic>, <singer>=Andy Lau, <songname>=Betrayal". If the retrieval interface unit 260 passes the keywords 108 "Andy Lau" and "Betrayal" to the search engine 240 together, no matching result will be found in the database of FIG. 3A. In still another embodiment of the present invention, the retrieval interface unit 260 may input the keywords 108 "Andy Lau" and "Betrayal" into the search engine 240 separately, and determine respectively that "Andy Lau" is a singer's name (guidance data singerguid) and that "Betrayal" is a song name (guidance data songnameguid, where the singer may be Xiao Jingteng, or the chorus of Xiao Jingteng, Yang Zongwei, and Cao Ge). At this point, the natural language understanding system 100 can further ask the user: "Is 'Betrayal' the song sung by Xiao Jingteng (according to the matching result of record 7)?", or "Is it the chorus version by Xiao Jingteng, Yang Zongwei, and Cao Ge (according to the matching result of record 6)?".

In still another embodiment of the present invention, the records stored in the structured database 220 may further include a source column 314 and a heat column 316. The database shown in FIG. 3B includes, in addition to the columns of FIG. 3A, a source column 314, a heat column 316, a preference column 318, and an aversion column 320. The source column 314 of each record can be used to store an indication or indicator of which structured database the record was generated from (note that only the structured database 220 is shown in the figure, but more structured databases may actually exist), or of which user or server provided the source value. Moreover, the natural language understanding system 100 can search a structured database of a specific source according to the preferences revealed by the user in previous request information 102 (for example, when a full-text search with the keyword 108 in the request information 102 produces a match, one is added to the heat value of the matching record). The heat column 316 of each record 302 is used to store the search heat value or popularity value of the record 302 (for example, the number of times or the probability that the record is matched by a single user, by a specific user group, or by all users within a specific period) as a reference for the knowledge assisted understanding module 400 in determining the user's intent; the use of the preference column 318 and the aversion column 320 will be described in detail later. In detail, when the user's request information 102 is "I want to see the Romance of the Three Kingdoms", the natural language processor 300 can generate a plurality of possible intent grammar data 106 after analysis: "<readbook>, <bookname>=Romance of the Three Kingdoms"; "<watchTV>, <TVname>=Romance of the Three Kingdoms"; and "<watchfilm>, <filmname>=Romance of the Three Kingdoms".

If the retrieval system 200 counts, from the history of the user's request information 102 (for example, by counting through the heat column 316 the number of times a record 302 has been selected by a user), that most of the requests concern movies (assuming that only one structured database has corresponding book, television drama, and movie records for the Three Kingdoms, and that the heat of the movie record is higher than that of the other two), the retrieval system 200 can perform the search on the structured database that stores the movie record (the source value in the source column 314 in this case being the code of the structured database storing movie records), so that "<watchfilm>, <filmname>=Romance of the Three Kingdoms" can be preferentially determined as the determined intent grammar data 114. For example, in an embodiment, each time a record 302 is matched, one can be added to its heat column 316 as the user's history. Therefore, when the full-text search is performed with the keyword "Romance of the Three Kingdoms", the record 302 with the highest value in its heat column 316 can be selected from all the matching results as the judgment of the user's intent. In an embodiment, if the retrieval system 200 determines from the search results for the keyword 108 "Romance of the Three Kingdoms" that the search heat value stored in the heat column 316 of the record corresponding to the television drama of the Three Kingdoms is the highest, "<watchTV>, <TVname>=Romance of the Three Kingdoms" is preferentially determined as the determined intent grammar data 114. In yet another embodiment, if there are several matching records in each domain, the retrieval system 200 can total the heat values of all of those records. For example, if the structured database 220 contains multiple books, television dramas, and movies corresponding to the Romance of the Three Kingdoms, the retrieval system 200 can first total the heat values of the corresponding records and determine which domain has the highest total. For example, if there are 5, 13, and 16 records corresponding respectively to books, television dramas, and movies of the Romance of the Three Kingdoms, and the totals of the heat values of these 5, 13, and 16 records are 30, 18, and 25 respectively, the retrieval system 200 can select, among the five records related to books of the Three Kingdoms, the record with the highest value in its heat column 316, and its corresponding guide bar data (which may include the source value in the source column 314) is passed to the knowledge assisted understanding module 400 for further processing. In addition, the source value stored in the source column 314 can also be output to the knowledge assisted understanding module 400 as part of the response result 110 to show the user where the desired television show can be accessed. Furthermore, the manner of changing the value stored in the heat column 316 can be varied by the computer system in which the natural language understanding system 100 is located, and the present invention is not limited thereto. The value of the heat column 316 may also decrease over time to indicate that the user's interest in a certain record 302 has gradually decreased; the present invention does not limit this part either.

As another example, in another embodiment, a user may particularly like watching the TV drama of the Romance of the Three Kingdoms during a certain period; since the drama is long and cannot be finished in a short time, the user may select it repeatedly within that period (assuming the value in the heat column 316 is incremented by one each time it is matched), causing a certain record 302 to be matched repeatedly, which can be learned by analyzing the data of the heat column 316. Moreover, in another embodiment, a data provider may also utilize the heat column 316 to indicate the popularity of the material provided by a source, and the provider's code may be stored in the source column 314. For example, when a user enters the request information 102 "I want to see the Romance of the Three Kingdoms", the full-text search of the database of FIG. 3B will find a record for reading the book of the Romance of the Three Kingdoms (record 8), a record for watching the TV drama of the Romance of the Three Kingdoms (record 9), and a record for watching the movie of the Romance of the Three Kingdoms (record 10); however, according to the information in the heat column 316, the movie of the Romance of the Three Kingdoms is currently the hottest option (that is, the heat column values of records 8, 9, and 10 are 2, 5, and 8 respectively), so the guidance data of record 10 is provided first as the matching record 110 and output to the knowledge-assisted understanding module 400 as the highest-priority basis for determining the user's intention. In an embodiment, the data of the source column 314 can be displayed to the user at the same time, allowing the user to determine whether the program he wants to watch is provided by a certain provider (and the user can link to that provider to play it). In another embodiment, if more than one record provides a movie of the Romance of the Three Kingdoms, the retrieval system 200 can transmit the data stored in the source columns 314 of the records having the highest heat values to the knowledge-assisted understanding module 400. It should be noted that the information stored in the source column 314, and the manner of changing it, may be decided by the computer system in which the natural language understanding system 100 is located, and is not limited by the present invention. It should also be noted that those skilled in the art could further divide the information stored in the heat column 316, the preference column 318, and the aversion column 320 of FIG. 3B into two parts, one related to the individual user and one related to all users: the heat column 316, preference column 318, and aversion column 320 information related to the individual user is stored in the user's mobile phone, while the server stores the heat column 316, preference column 318, and aversion column 320 information related to all users. In this way, only the personal preferences related to the user's own choices or intentions are stored in the user's personal mobile communication device (e.g., a mobile phone, tablet computer, or small notebook), and the server need not store that individual user's information, which not only saves storage space on the server but also preserves the privacy of the user's personal preferences.
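The time-based decrease of the heat value mentioned above is only sketched here; the exponential form and the half-life parameter are assumptions, since the text leaves the exact update rule to the computer system hosting the natural language understanding system 100.

```python
import math
import time

HALF_LIFE_SECONDS = 30 * 24 * 3600  # assumed: heat halves after roughly one month of inactivity

def decayed_heat(heat, last_matched_at, now=None):
    """Return the heat value of a record after time-based decay.

    A record that has not been matched for a long time gradually loses heat,
    reflecting that the user's interest in it has cooled down.
    """
    now = now or time.time()
    elapsed = max(0.0, now - last_matched_at)
    return heat * math.pow(0.5, elapsed / HALF_LIFE_SECONDS)

def bump_heat(record, last_matched_at):
    """Add one to the (decayed) heat each time the record is matched."""
    record.heat = decayed_heat(record.heat, last_matched_at) + 1
    return record.heat
```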

Obviously, the numerical data contained in each record of the structured database disclosed by the present invention are related to one another (for example, in record 1, the numerical data "Andy Lau", "days passed together", and "Hong Kong and Taiwan, Cantonese, popular" are used together to describe the characteristics of record 1), and these numerical data (together with the corresponding guidance data) jointly express the intention, contained in the user's request information, toward that record (for example, when the matching result is "days passed together", the user's intention is probably to access the data of record 1). Therefore, when the search engine performs a full-text search on the structured database and the numerical data of a record is matched, the guidance data corresponding to that numerical data can be output (for example, "songnameguid" is output as the response result 110), so that the intention of the request information can be confirmed (for example, by the knowledge-assisted understanding module 400).

Based on the disclosure or teachings of the above exemplary embodiments, FIG. 4A is a flowchart of a retrieval method in accordance with an embodiment of the present invention. Referring to FIG. 4A, the retrieval method of this embodiment includes the following steps: providing a structured database that stores a plurality of records (step S410); receiving at least one keyword (step S420); and performing a full-text search on the title columns of the plurality of records according to the keyword (step S430). For example, the keyword 108 is input into the search interface unit 260, causing the search engine 240 to perform a full-text search on the title columns 304 of the plurality of records 302 stored in the structured database 220; the search may be performed in the manner described for FIG. 3A or FIG. 3B, or in any other manner that does not depart from the spirit of the invention. It is then determined whether the full-text search has a matching result (step S440); for example, the search engine 240 determines whether the full-text search for the keyword 108 has produced a matching result. If there is a matching result, the exact-match records and the partial-match records are output in order (step S450). For example, if records in the structured database 220 match the keyword 108, the retrieval interface unit 260 outputs, in order, the matching data of the exact-match records and of the partial-match records (obtained via the feedback data storage device 280 of FIG. 3C) as the response result 110 to the knowledge-assisted understanding module 400 (in another embodiment, the response result 110 may include other information related to the matched records, such as the value stored in the heat column 316, which can be displayed to the user for linking to other material), wherein the priority of the exact-match records is higher than the priority of the partial-match records.
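A minimal sketch of steps S430 to S450 is given below; the substring-based notions of "exact match" and "partial match" are assumptions made for illustration, not the patent's exact matching rules.

```python
def full_text_search(keyword, records):
    """Steps S430-S450: search the title column and order the results.

    Exact-match records (title equals the keyword) are returned before
    partial-match records (title merely contains the keyword), reflecting
    the higher priority of exact matches.
    """
    exact, partial = [], []
    for rec in records:
        if rec.title == keyword:
            exact.append(rec)
        elif keyword in rec.title:
            partial.append(rec)
    if not exact and not partial:
        return None          # step S460: no matching result
    return exact + partial   # step S450: exact matches first, then partial matches
```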

On the other hand, if there is no matching result, the natural language understanding system 100 can directly notify the user that the matching failed and end the process, notify the user that no matching result was found and request further input, or enumerate possible options for the user to make a further selection (for example, the above example in which "Andy Lau" and "Betrayal" do not produce a matching result in the full-text search) (step S460).

The foregoing process steps are not intended to limit the present invention, and some of the steps may be omitted. For example, in another embodiment of the present invention, step S440 may be performed by a matching judgment module (not shown) located outside the retrieval system 200; or, in another embodiment, step S450 may be omitted, and the action of sequentially outputting the exact-match records and the partial-match records may instead be performed by a matching-result output module (not shown) located outside the retrieval system 200.

Based on the content disclosed or taught by the above exemplary embodiments, FIG. 4B is a flowchart of the working process of the natural language understanding system 100 in accordance with another embodiment of the present invention. Referring to FIG. 4B, the working process of the natural language understanding system 100 of this embodiment includes the following steps: receiving request information (step S510); for example, the user transmits request information 102 having voice content or text content to the natural language understanding system 100. A structured database storing a plurality of records is provided (step S520). The request information is converted into grammar data (step S530); for example, the natural language processor 300 analyzes the user's request information 102 and converts it into the corresponding possible intent grammar data 106. The possible attributes of the keyword are discriminated (step S540); for example, the knowledge-assisted understanding module 400 identifies the possible attributes of at least one keyword 108 in the possible intent grammar data 106, for example, the keyword 108 "The Romance of the Three Kingdoms" may be a book, a movie, or a television program. A full-text search is performed on the title columns 304 of the plurality of records (step S550); for example, the keyword 108 is input to the search interface unit 260, causing the search engine 240 to perform a full-text search on the title columns 304 of the plurality of records stored in the structured database 220. It is determined whether the full-text search has a matching result (step S560); for example, the search engine 240 determines whether the full-text search for the keyword 108 has produced a matching result. If there is a matching result, the guidance data corresponding to the exact-match records and the partial-match records are output in order as the response result 110 (step S570); for example, if records in the structured database 220 match the keyword 108, the search interface unit 260 outputs, in order, the guidance data corresponding to the exact-match records and the partial-match records of the keyword 108 as the response result 110, wherein the priority of the exact-match records is higher than that of the partial-match records. Finally, the corresponding determined intent grammar data is output in order (step S580); for example, the knowledge-assisted understanding module 400 outputs the corresponding determined intent grammar data 114 according to the order of the exact-match records and the partial-match records.
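The overall flow of FIG. 4B (steps S510 to S580) might be chained together roughly as in the sketch below; the helper names (parse_intents, search, match_intent) are hypothetical stand-ins for the natural language processor 300, the retrieval system 200, and the knowledge-assisted understanding module 400.

```python
def understand(request_info, parse_intents, search, match_intent):
    """Sketch of steps S510-S580 (illustrative only).

    parse_intents: request text -> possible intent grammar data, each with a keyword (S530/S540)
    search:        keyword -> ordered matches, exact before partial (S550-S570),
                   e.g. the full_text_search sketch shown earlier
    match_intent:  (guidance data, possible intent) -> determined intent grammar data (S580)
    """
    determined = []
    for intent in parse_intents(request_info):
        matches = search(intent["keyword"])
        if matches:
            # exact matches come first, so matches[0] has the highest priority
            determined.append(match_intent(matches[0].guide, intent))
    return determined or None   # no match at all -> handled as in step S590
```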

On the other hand, if no matching result is produced in step S560, the situation may be handled in a manner similar to step S460, for example, by directly notifying the user that the matching failed and ending the process, by notifying the user that no matching result was found and requesting further input, or by listing possible options for the user to make a further selection (for example, the foregoing example in which "Andy Lau" and "Betrayal" do not produce a matching result in the full-text search) (step S590).

The foregoing process steps are not intended to limit the invention, and some steps may be omitted or removed.

In summary, the present invention extracts a keyword contained in a user's request information and performs a full-text search on the title columns of records having a specific data structure in the structured database, such as the structures of FIGS. 3A and 3B; if a matching result is produced, the domain type to which the keyword belongs can be determined (by comparison with the information in the guidance column), thereby determining the intention expressed by the user in the request information.

Next, further explanation is given of how the above structured database can be applied to speech recognition. First, in a natural language dialogue system, the natural language understanding system 100 can be used to correct an erroneous voice response according to the user's voice input, and further find other possible answers to report back to the user.

As mentioned earlier, today's mobile communication devices are already able to provide natural language dialogue, allowing users to communicate with them by voice. However, in current voice dialogue systems, when the user's voice input is not clear, and because the same spoken sentence may express several different intentions or purposes, the system may easily output a voice response that does not match the voice input. Therefore, in many conversation situations it is difficult for a user to obtain a voice response that meets his or her intention. To this end, the present invention proposes a method of correcting a voice response and a natural language dialogue system, in which the natural language dialogue system can correct a wrong voice response according to the user's voice input, and further find other possible answers to report back to the user. In order to clarify the content of the present invention, specific examples in which the present invention can be implemented are given below.

FIG. 5A is a block diagram of a natural language dialogue system according to an embodiment of the invention. Referring to FIG. 5A, the natural language dialogue system 500 includes a speech sampling module 510, a natural language understanding system 520, and a speech synthesis database 530. In an embodiment, the voice sampling module 510 is configured to receive the first voice input 501 (e.g., voice from the user) and parse it to generate the first request information 503; the natural language understanding system 520 then parses the first request information 503 to obtain the first keyword 509 therein and finds the first return answer 511 that meets the first request information 503. (According to the description of FIG. 1, the first request information 503 can be processed in the same way as the request information 102: the request information 102 is analyzed to generate the possible intent grammar data 106, the keyword 108 therein is used to perform a full-text search on the structured database 220 to obtain the response result 110, the response result 110 is compared with the intent data 112 in the possible intent grammar data 106 to generate the determined intent grammar data 114, and finally the analysis result 104 is output by the analysis result output module 116; this analysis result 104 can serve as the first return answer 511 of FIG. 5A.) The speech synthesis database 530 is then queried according to the first return answer 511 (because the analysis result 104 serving as the first return answer 511 may include relevant information of the fully or partially matched records 302, such as the guidance data stored in the guidance column 310, the numerical data in the value column 312, and the data in the content column 306, it can be used to perform a voice query), and the first voice 513 found by the query is used to generate a first voice response 507 corresponding to the first voice input 501, which is output to the user. If the user believes that the first voice response 507 output by the natural language understanding system 520 does not conform to the first request information 503 of the first voice input 501, the user provides another voice input, such as the second voice input 501', to indicate this. The natural language understanding system 520 processes the second voice input 501' using the same method as the first voice input 501 to generate the second request information 503', then parses the second request information 503' to obtain the second keyword 509', finds the second return answer 511' that matches the second request information 503', finds the corresponding second voice 513', and finally generates the corresponding second voice response 507' according to the second voice 513' and outputs it to the user as a correction of the first return answer 511. Obviously, the natural language understanding system 520 can be based on the natural language understanding system 100 of FIG. 1, with new modules added (explained in conjunction with FIG. 5B below), to achieve the purpose of correcting a voice response based on the user's voice input.

The components in the aforementioned natural language dialogue system 500 can be configured in the same machine. For example, the speech sampling module 510 and the natural language understanding system 520 may be disposed in the same electronic device. The electronic device may be a mobile communication device such as a cell phone, a personal digital assistant (PDA) phone, or a smart phone, or a pocket PC, a tablet PC, a notebook computer, a personal computer, or any other electronic device having a communication function or with communication software installed; the scope is not limited here. In addition, the above electronic device may use an Android operating system, a Microsoft operating system, a Linux operating system, or the like, without being limited thereto. Of course, the components of the natural language dialogue system 500 need not necessarily be disposed in the same machine; they may be distributed over different devices or systems and connected through various communication protocols. For example, the natural language understanding system 520 may be located in a cloud server or in a server on a local area network. Moreover, the various components of the natural language understanding system 520 may themselves be distributed among different machines; for example, they may be located in the same machine as the speech sampling module 510 or in different machines.

In this embodiment, the voice sampling module 510 is configured to receive voice input. The voice sampling module 510 may be a device for receiving audio, such as a microphone, and the first voice input 501 / second voice input 501' may be voice from the user.

Furthermore, the natural language understanding system 520 of the present embodiment can be implemented as a hardware circuit composed of one or several logic gates. Alternatively, in another embodiment of the invention, the natural language understanding system 520 can be implemented by computer program code. For example, the natural language understanding system 520 may consist of program code segments written in a programming language and implemented in an application, an operating system, or a driver; these code segments are stored in a storage unit and executed by a processing unit (not shown in FIG. 5A). In order to enable those skilled in the art to further understand the natural language understanding system 520 of the present embodiment, an example is described below. However, the example is intended to be illustrative only and is not limiting; the invention may be practiced using hardware, software, firmware, or a combination thereof.

FIG. 5B is a block diagram of a natural language understanding system 520, in accordance with an embodiment of the invention. Referring to FIG. 5B, the natural language understanding system 520 of the present embodiment may include a voice recognition module 522, a natural language processing module 524, and a speech synthesis module 526. The voice recognition module 522 receives the request information transmitted from the voice sampling module 510, for example the first request information 503 obtained by parsing the first voice input 501, and extracts one or more first keywords 509 (for example, the keyword 108 or sentence of FIG. 1A). The natural language processing module 524 can further parse the first keywords 509 to obtain a candidate list that includes at least one return answer (processed in the same way as described for FIG. 5A, that is, for example, the retrieval system 200 of FIG. 1A performs a full-text search on the structured database 220, the response result 110 is obtained and compared with the intent data 112 to generate the determined intent grammar data 114, and finally the analysis result 104 output by the analysis result output module 116 serves as a return answer), and selects from all the return answers of the candidate list an answer that matches the first voice input 501 as the first return answer 511 (for example, by picking an exact-match record). Since the first return answer 511 is an internal analysis result of the natural language understanding system 520, it must be converted into speech before being output so that the user can make a judgment on it. The speech synthesis module 526 therefore queries the speech synthesis database 530 according to the first return answer 511. The speech synthesis database 530 records, for example, text and corresponding speech information, so that the speech synthesis module 526 can find the first speech 513 corresponding to the first return answer 511 and use it to synthesize the first voice response 507. Thereafter, the speech synthesis module 526 can output the synthesized first voice response 507 to the user through a voice output interface (not shown), wherein the voice output interface is, for example, a speaker or a headset. It should be noted that when the speech synthesis module 526 queries the speech synthesis database 530 according to the first return answer 511, the first return answer 511 may need to be format-converted first, with the query then performed through the interface specified by the speech synthesis database 530. Whether format conversion is needed when calling the speech synthesis database 530 depends on the definition of the speech synthesis database 530 itself; since this belongs to techniques well known to those skilled in the art, it is not described in detail here.

Next, an example is given. If the user inputs the first voice input 501 of "I want to see the Romance of the Three Kingdoms", the voice recognition module 522 receives from the voice sampling module 510 the parsed first request information 503 of the first voice input 501, and then extracts, for example, "The Romance of the Three Kingdoms" as a first keyword 509. The natural language processing module 524 can then parse the first keyword 509 "The Romance of the Three Kingdoms" (for example, the retrieval system 200 of FIG. 1A performs a full-text search on the structured database 220, the response result 110 is obtained and compared with the intent data 112 to generate the determined intent grammar data 114, and finally the analysis result 104 is output by the analysis result output module 116), generate return answers containing the three intent options related to "The Romance of the Three Kingdoms", and integrate them into a candidate list (assuming each intent option has only one return answer, the answers are classified into the three options of "reading the book", "watching the TV series", and "watching the movie"); it then selects from the candidate list the one of the three return answers whose heat column 316 has the highest value (for example, record 10 of FIG. 3B) as the first return answer 511. In an embodiment, the action corresponding to the answer whose heat column 316 has the highest value can be executed directly (for example, directly playing the previously mentioned "Betrayal" by Xiao Jingteng to the user), and the present invention is not limited in this respect.

In addition, the natural language processing module 524 can also determine whether the previous first return answer 511 was correct by parsing the subsequently received second voice input 501' (which is fed into the voice sampling module 510 in the same manner as the first voice input 501). Because the second voice input 501' is the user's response to the first voice response 507 previously provided, it contains information about whether the user considered that first voice response 507 correct. If analysis of the second voice input 501' indicates that the user considers the first return answer 511 incorrect, the natural language processing module 524 can select another return answer in the candidate list as the second return answer 511', for example by removing the first return answer 511 from the candidate list and re-selecting a second return answer 511' from the remaining return answers; the speech synthesis module 526 then finds the second speech 513' corresponding to the second return answer 511', and finally synthesizes the second speech 513' into a second voice response 507' that is played to the user.
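A rough sketch of this correction step follows; treating a simple negation word in the second input as the signal that the first answer was wrong, and the dictionary shape of the answers, are assumptions made only for illustration.

```python
def correct_answer(candidates, first_answer, second_keywords):
    """Pick the second return answer 511' after the user reacts to the first one.

    candidates:      return answers remaining in the candidate list
    first_answer:    the answer previously read back to the user
    second_keywords: keywords parsed from the second voice input 501'
    """
    negated = any(k in ("no", "not", "don't want") for k in second_keywords)
    if not negated:
        return first_answer                      # user confirmed the first answer

    remaining = [c for c in candidates if c is not first_answer]
    # Prefer an answer whose intent matches an explicit keyword ("TV series", "movie", ...);
    # otherwise fall back to the next answer in priority (e.g. heat) order.
    for cand in remaining:
        if cand["intent"] in second_keywords:
            return cand
    return remaining[0] if remaining else None   # None -> e.g. a "no data found" response
```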

Continuing the earlier example in which the user inputs "I want to see the Romance of the Three Kingdoms": if the user actually wants to watch the TV series of the Romance of the Three Kingdoms, then the option previously output to the user, record 10 of FIG. 3B (watching the movie of the Romance of the Three Kingdoms), is not what the user wants, so the user may input, as the second voice input 501', "I want to see the TV series of the Romance of the Three Kingdoms" (the user explicitly indicates the TV series) or "I don't want to watch the Romance of the Three Kingdoms movie" (the user only negates the current option), and so on. After the second voice input 501' has been parsed to obtain its second request information 503' (or second keyword 509'), it will be found that the second keyword 509' in the second request information 503' contains "TV series" (the user gave a clear indication) or "don't want the movie" (the user only negated the current option), so it can be judged that the first return answer 511 did not meet the user's needs. At this point, another return answer can be selected from the candidate list as the second return answer 511' and the corresponding second voice response 507' can be output: for example, the second voice response 507' "I am playing the TV series of the Romance of the Three Kingdoms for you" (if the user explicitly indicated the TV series), or the second voice response 507' "Which option would you like?" (if the user only negated the current option), combined with the other options in the candidate list for the user to choose from (for example, the return answer with the next-highest value in the heat column 316 can be chosen as the second return answer 511'). Furthermore, in another embodiment, if the information presented to the user includes a selection, for example the three options "read the book of the Romance of the Three Kingdoms", "watch the TV series of the Romance of the Three Kingdoms", and "watch the movie of the Romance of the Three Kingdoms" for the user to choose from, the user may enter "I want to watch the movie" as the second voice input 501'; after the second voice input 501' is parsed to obtain its second request information 503' and the user's intention is discovered (for example, the second keyword 509' shows that the user selects "watch the movie"), the system outputs the second voice response 507' "I am playing the Romance of the Three Kingdoms movie for you" and then plays the movie directly for the user. Of course, if the user enters "I want the third option" (assuming the third option corresponds to reading the book), the application corresponding to the third option is executed, that is, the voice response "What you want is to read the book of the Romance of the Three Kingdoms" is output and the e-book of the Romance of the Three Kingdoms is displayed to the user.

In this embodiment, the speech recognition module 522, the natural language processing module 524, and the speech synthesis module 526 of the natural language understanding system 520 can be disposed in the same machine as the speech sampling module 510. In other embodiments, the speech recognition module 522, the natural language processing module 524, and the speech synthesis module 526 can also be distributed among different machines (e.g., computer systems, servers, or similar devices or systems). For example, in the natural language understanding system 520' shown in FIG. 5C, the speech synthesis module 526 is disposed in the same machine 502 as the speech sampling module 510, while the speech recognition module 522 and the natural language processing module 524 are disposed on another machine. In addition, under the architecture of FIG. 5C, the natural language processing module 524 transmits the first return answer 511 / second return answer 511' to the speech synthesis module 526, which then sends the first return answer 511 / second return answer 511' to the speech synthesis database to find the corresponding first speech 513 / second speech 513' as the basis for generating the first voice response 507 / second voice response 507'.

FIG. 6 is a flowchart of a method for correcting the first voice response 507 according to an embodiment of the invention. In the method of this embodiment, when the user thinks that the currently played first voice response 507 does not match the previously input first request information 503, the user re-enters a second voice input 501', which is fed into the speech sampling module 510 and then analyzed by the natural language understanding system 520; when the analysis shows that the first voice response 507 previously played to the user does not conform to the user's intent, the natural language understanding system 520 can output a second voice response 507', thereby correcting the original first voice response 507. For convenience of explanation, only the natural language dialogue system 500 of FIG. 5A is taken as an example, but the method of correcting the first voice response 507 of this embodiment can also be applied to the above-described natural language dialogue system of FIG. 5C.

Referring to FIG. 5A and FIG. 6 simultaneously, in step S602, the voice sampling module 510 receives the first voice input 501. The first voice input 501 is, for example, voice from a user and may carry the user's first request information 503. Specifically, the first voice input 501 from the user may be an inquiry sentence, a command sentence, or other request information, such as "I want to see the Romance of the Three Kingdoms", "I want to listen to the song forget the water", or "What is today's temperature", and so on.

In step S604, the natural language understanding system 520 parses at least one first keyword 509 included in the first voice input 501 to obtain a candidate list, wherein the candidate list has one or more return answers. For example, when the user's first voice input 501 is "I want to see the Romance of the Three Kingdoms", the first keywords 509 obtained by the natural language understanding system 520 after analysis are, for example, "The Romance of the Three Kingdoms" and "see". For another example, when the user's first voice input 501 is "I want to listen to the song forget the water", the first keywords 509 obtained after analysis are, for example, "forget the water", "listen", and "song".

After that, the natural language understanding system 520 can query the structured database 220 according to the first keyword 509 to obtain at least one search result (for example, the analysis result 104 of FIG. 1), which is used as a return answer in the candidate list. The manner of selecting the first return answer 511 from the plurality of return answers may be as described for FIG. 1A and is not repeated here. Since the first keyword 509 may belong to different knowledge domains (such as movies, books, music, or games), and the same knowledge domain may be further divided into multiple categories (for example, different authors of the same movie or book title, different singers of the same song title, different versions of the same game title, etc.), the natural language understanding system 520 can, for the first keyword 509, query the structured database for one or more search results related to the first keyword 509 (for example, the analysis result 104), wherein each search result may include guidance data related to the first keyword 509 (for example, when "Xiao Jingteng" and "Betrayal" are used as keywords 108 for a full-text search of the structured database 220 of FIGS. 3A and 3B, the matching results of records 6 and 7 of FIG. 3A are obtained, which respectively contain the guidance data "singerguid" and "songnameguid" stored in the guidance column 310) and other data. The other data are, for example, other keywords in the search result that are related to the first keyword 509 (for example, when "days passed together" is used as the keyword for a full-text search of the structured database 220 of FIG. 3A and record 1 is the matching result, "Andy Lau" and "Hong Kong and Taiwan, Cantonese, popular" are the other data). Therefore, from another point of view, when the first voice input 501 entered by the user contains multiple first keywords 509, the user's first request information 503 is clearer, so that the natural language understanding system 520 can find search results that are closer to the first request information 503.

For example, when the first keyword 509 is "The Romance of the Three Kingdoms" (for example, when the user inputs the voice input "I want to see the Romance of the Three Kingdoms"), the natural language understanding system 520 may generate three possible intent grammar data 106 after analysis (as described for FIG. 1): "<readbook>, <bookname>=The Romance of the Three Kingdoms"; "<watchTV>, <TVname>=The Romance of the Three Kingdoms"; and "<watchfilm>, <filmname>=The Romance of the Three Kingdoms".

Therefore, the search results of the query are, for example, records about the "book" of "The Romance of the Three Kingdoms" (intent data <readbook>), the "TV drama" of "The Romance of the Three Kingdoms" (intent data <watchTV>), and the "movie" of "The Romance of the Three Kingdoms" (intent data <watchfilm>) (for example, records 8, 9, and 10 of FIG. 3B), where "book", "TV drama", and "movie" correspond to the respective user intents. For another example, when the first keywords 509 are "forget the water" and "music" (for example, the user inputs "I want to listen to the song forget the water"), the natural language understanding system 520 may generate the following possible intent grammar data after analysis: "<playmusic>, <songname>=forget the water"; the search results are then, for example, the record of "forget the water" by "Andy Lau" (for example, record 11 of FIG. 3B) and the record of "forget the water" by "Li Yijun (李翊君)" (for example, record 12 of FIG. 3B), wherein "Andy Lau" and "Li Yijun" correspond to the user's intent data. In other words, each search result may include the first keyword 509 and intent data related to the first keyword 509, and the natural language understanding system 520 converts the data included in the queried search results into return answers and records the return answers in the candidate list for use in subsequent steps.

In step S606, the natural language understanding system 520 selects at least one first return answer 511 from the candidate list, and outputs the corresponding first voice response 507 according to the first return answer 511. In the present embodiment, the natural language understanding system 520 can arrange the return answers in the candidate list in order of priority, and select a return answer from the candidate list according to that priority order, thereby outputting the first voice response 507.

For example, when the first keyword 509 is "Three Kingdoms", it is assumed that the natural language understanding system 520 queries a lot of records about "..."Three Kingdoms"... "Books" (ie, by query) The number of materials to be given is prioritized, for example, 20 records on books), followed by the records of "...the Romance of the Three Kingdoms..."Music" (for example, 18 pens), and about "..." "The Romance of the Three Kingdoms"... "TV drama" has the fewest records (for example, 10 strokes), and the Natural Language Understanding System 520 will use the "Book of the Three Kingdoms" as the first return answer (the most preferred return answer), " "The music of the Romance of the Three Kingdoms" as the second return answer (the second preferred choice of the answer), "the drama of the Three Kingdoms" as the third return answer (the third preferred choice answer). Of course, if the first return answer related to the "Book of the Three Kingdoms" is not only a record, the first return answer 511 can also be selected according to the order of precedence (for example, the number of times selected or the highest value of the heat column 316). The relevant details have been mentioned before and will not be described here.

Next, in step S608, the speech sampling module 510 receives the second speech input 501', and the natural language understanding system 520 parses the second speech input 501' and determines whether the previously selected first return answer 511 is correct. Here, the second speech input 501' is parsed to obtain the second keyword 509' included in it, wherein the second keyword 509' is, for example, a keyword further provided by the user (such as a time, an intention, or a knowledge domain). Moreover, when the second keyword 509' in the second voice input 501' does not match the intent data associated with the first return answer 511, the natural language understanding system 520 determines that the previously selected first return answer 511 is incorrect. The manner in which the second request information 503' of the second voice input 501' may confirm or negate the first voice response 507 has been mentioned above and is not repeated here.

Further, the second speech input 501' parsed by the natural language understanding system 520 may or may not include an explicit second keyword 509'. For example, the voice sampling module 510 receives from the user, for example, "I am not referring to the book of the Romance of the Three Kingdoms" (case A), "I am not referring to the book of the Romance of the Three Kingdoms, I am referring to the TV series of the Romance of the Three Kingdoms" (case B), or "I mean the TV series of the Romance of the Three Kingdoms" (case C), and so on. The second keywords 509' in case A are, for example, "not", "Romance of the Three Kingdoms", and "book"; the second keywords 509' in case B are, for example, "not", "Romance of the Three Kingdoms", "book", "is", "Romance of the Three Kingdoms", and "TV drama"; and the second keywords 509' in case C are, for example, "is", "Romance of the Three Kingdoms", and "TV drama". For convenience of explanation, only cases A, B, and C are given above, but the embodiment is not limited thereto.

Next, the natural language understanding system 520 determines whether the intent data associated with the first return answer 511 is correct based on the second keyword 509' included in the second speech input 501'. That is, if the first return answer 511 is "the book of the Romance of the Three Kingdoms" and the second keywords 509' are "Romance of the Three Kingdoms" and "TV drama", the natural language understanding system 520 will judge that the intent data associated with the first return answer 511 (that is, that the user wants to see the "book" of the Romance of the Three Kingdoms) does not match the second keywords 509' of the user's second voice input 501' (that is, that the user wants to see the "TV drama" of the Romance of the Three Kingdoms), and therefore that the first return answer 511 is incorrect. Similarly, if the first return answer 511 is "the book of the Romance of the Three Kingdoms" and the second keywords 509' are "not", "Romance of the Three Kingdoms", and "book", the natural language understanding system 520 will likewise determine that the first return answer 511 is incorrect.
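As a rough illustration only, the check described here could be expressed as below; the negation words, the fixed set of intent labels, and the way intent data is attached to an answer are assumptions.

```python
NEGATION_WORDS = {"no", "not", "don't"}                       # assumed markers of a negative reply
INTENT_LABELS = {"book", "TV drama", "movie", "music"}        # assumed intent vocabulary

def first_answer_is_correct(first_answer, second_keywords):
    """Judge the first return answer 511 against the second keywords 509'.

    first_answer:    dict with an "intent" field, e.g. {"intent": "book", ...}
    second_keywords: keywords parsed from the second voice input 501'
    """
    negated = any(k in NEGATION_WORDS for k in second_keywords)
    mentions_other_intent = any(
        k in INTENT_LABELS and k != first_answer["intent"]
        for k in second_keywords
    )
    # Case A ("not ... book")   -> negated, so incorrect
    # Case C ("... TV drama")   -> a different intent is named, so incorrect
    # A plain confirmation keeps the first answer
    return not negated and not mentions_other_intent
```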

After the natural language understanding system 520 parses the second voice input 501' and determines that the previously output first voice response 507 is correct, the natural language understanding system 520 makes a response corresponding to the second voice input 501', as shown in step S610. For example, if the second voice input 501' from the user is "Yes, the book of the Romance of the Three Kingdoms", the natural language understanding system 520 may output a second voice response 507' of "I am opening the book of the Romance of the Three Kingdoms for you". Alternatively, the natural language understanding system 520 can load the book content of the Romance of the Three Kingdoms directly through the processing unit (not shown) while playing the second voice response 507'.

However, if after parsing the second voice input 501' the natural language understanding system 520 determines that the previously output first voice response 507 (i.e., the first return answer 511) is incorrect, the natural language understanding system 520 will, as shown in step S612, select from the candidate list a return answer other than the first return answer 511, and output the second voice response 507' according to the selection result. Here, if the second voice input 501' provided by the user does not contain an explicit second keyword 509' (such as the second voice input 501' of case A above), the natural language understanding system 520 can select the second-preferred return answer from the candidate list according to the priority order. Alternatively, if the second voice input 501' provided by the user contains an explicit second keyword 509' (such as the second voice inputs 501' of cases B and C above), the natural language understanding system 520 can directly select from the candidate list the return answer corresponding to the second keyword 509' indicated by the user.

On the other hand, if the second voice input 501' provided by the user contains an explicit second keyword 509' (such as the second voice inputs of cases B and C above) but the natural language understanding system 520 finds no return answer corresponding to the second keyword 509' in the candidate list, the natural language understanding system 520 outputs a third voice response, such as "No such item found" or "I don't know".

In order to enable those skilled in the art to further understand the method for correcting the voice response and the natural language dialogue system of the present embodiment, a detailed description will be given below.

First, it is assumed that the first voice input 501 received by the voice sampling module 510 is "I want to see the Romance of the Three Kingdoms" (step S602). The natural language understanding system 520 can then parse out "see" and "the Romance of the Three Kingdoms" as first keywords 509 and obtain a candidate list having a plurality of return answers, wherein each return answer has associated keywords and other data (the other data may be stored in the content column 306 of FIG. 3A/3B, or in parts of the value column 312 of each record 302) (step S604), as shown in Table 1 (assuming that the book, TV drama, music, and movie of the Romance of the Three Kingdoms each have only one piece of data).

Table I

Next, the natural language understanding system 520 selects the desired return answer from the candidate list. Assuming the natural language understanding system 520 selects, in priority order, return answer a of the candidate list as the first return answer 511, the natural language understanding system 520 outputs, for example, "Do you want to open the book of the Romance of the Three Kingdoms?" as the first voice response 507 (step S606).

At this time, if the second voice input 501' received by the voice sampling module 510 is "Yes" (step S608), the natural language understanding system 520 determines that the above return answer a is correct, and the natural language understanding system 520 will output another voice response of "Please wait" (i.e., the second voice response 507') and load the book content of the Romance of the Three Kingdoms through the processing unit (not shown) (step S610).

However, if the second voice input 501' received by the voice sampling module 510 is "I don't mean the book of the Romance of the Three Kingdoms" (step S608), the natural language understanding system 520 determines that the above return answer a is incorrect, and the natural language understanding system 520 will then select another return answer from the return answers b–e of the candidate list as the second return answer 511', for example return answer b, "Do you want to play the TV drama of the Romance of the Three Kingdoms?". If the user continues to answer "not the TV drama", the natural language understanding system 520 will select one of the return answers c–e to report back. In addition, if all the return answers a–e in the candidate list have been reported back to the user by the natural language understanding system 520 and none of them matches the user's voice input 501, the natural language understanding system 520 outputs a voice response 507 of "No data found" (step S612).

In another embodiment, in the above step S608, if the second voice input 501' received by the voice sampling module 510 from the user is "I mean the comics of the Romance of the Three Kingdoms", then, because there is no return answer about comics in the candidate list, the natural language understanding system 520 will directly output a second voice response 507' of "No data found".

Based on the above, the natural language understanding system 520 can output a corresponding first voice response 507 in accordance with the first voice input 501 from the user. When the first voice response 507 output by the natural language understanding system 520 does not meet the request information 503 of the user's first voice input 501, the natural language understanding system 520 can correct the first voice response 507 that was originally output and, according to the second voice input 501' subsequently provided by the user, further output a second voice response 507' corresponding to the user's first request information 503. In this way, if the user is dissatisfied with an answer provided by the natural language understanding system 520, the natural language understanding system 520 can automatically correct it and report a new voice response to the user, thereby enhancing the convenience of the conversation between the user and the natural language dialogue system 500.

It is worth mentioning that, in step S606 and step S612 of FIG. 6, the natural language understanding system 520 can also sort the return answers in the candidate list according to different methods of evaluating the priority order, select a return answer from the candidate list according to that priority order, and output the voice response corresponding to the selected answer.

For example, the natural language understanding system 520 can prioritize the first return answer 511 in the candidate list according to the usage habits of the general public (for example, when the preference column 318 and the aversion column 320 of FIG. 3B are each divided into two parts, one storing the user's personal preference and one storing the preference of the general public, the latter parts can be regarded as the public preference information): the more frequently an answer is used by the general public, the higher its priority. Taking the first keyword 509 "The Romance of the Three Kingdoms" as an example, assume that the natural language understanding system 520 finds return answers about the TV drama, the book, and the music of the Romance of the Three Kingdoms. If, when people refer to "The Romance of the Three Kingdoms", they usually mean the book (for example, 20 records), fewer people mean the TV drama (for example, 18 records), and even fewer mean the music (for example, 10 records), then, when the value stored in the heat column 316 of FIG. 3B represents the matching situation of all users, the heat column 316 value of the "book" record of "The Romance of the Three Kingdoms" will be the highest, and the natural language understanding system 520 will sort the answers about the "book", the "TV drama", and the "music" in that order of priority. That is to say, the natural language understanding system 520 preferentially selects "the book of the Romance of the Three Kingdoms" as the first return answer 511, and outputs the first voice response 507 according to the first return answer 511.

In addition, the natural language understanding system 520 can also determine the priority order of the return answers according to the user's own habits (for example, by referring to the two column parts concerning the user's personal preference when the preference column 318 and the aversion column 320 of FIG. 3B are each divided into a personal part and a public part). In particular, the natural language understanding system 520 can record, in a feature database (for example, as shown in FIG. 7A/7B), the voice inputs that have been received from the user (including the first voice input 501, the second voice input 501', or any other voice input entered by the user), wherein the feature database can be stored in a storage device such as a hard disk. The feature database can record information about the user's preferences, habits, and the like, such as the first keyword 509 parsed from the user's voice input 501 and the response record generated by the natural language understanding system 520. The storage and retrieval of user preference/habit data will be further explained later with reference to FIG. 7A/7B/8. Further, in an embodiment, when the value stored in the heat column 316 of FIG. 3B is related to the user's habits (e.g., the number of matches), the value of the heat column 316 can be used to determine the user's usage habits or the priority order. Therefore, when selecting a return answer, the natural language understanding system 520 can prioritize the answers according to the user habits and similar information recorded in the feature database 730, thereby outputting a voice response 507 that better matches the user's voice input 501. For example, in FIG. 3B the values stored in the heat columns 316 of records 8/9/10 are 2/5/8, which can represent that the "book", the "TV drama", and the "movie" of "The Romance of the Three Kingdoms" have been matched 2, 5, and 8 times respectively, so the return answer about the movie of "The Romance of the Three Kingdoms" will be given priority.
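A sketch of this habit-based prioritization, using the heat values of records 8–10 quoted above, might read as follows; the structure of the per-user feature records is an assumption.

```python
def rank_by_user_heat(answers, user_heat):
    """Order return answers by the per-user heat values kept in the feature database.

    answers:   candidate return answers, each tagged with the id of its record 302
    user_heat: mapping record id -> number of times this user matched that record
    """
    return sorted(answers, key=lambda a: user_heat.get(a["record_id"], 0), reverse=True)

# Hypothetical values matching the example: records 8, 9, 10 matched 2, 5 and 8 times.
user_heat = {8: 2, 9: 5, 10: 8}
answers = [
    {"record_id": 8, "intent": "book"},
    {"record_id": 9, "intent": "TV drama"},
    {"record_id": 10, "intent": "movie"},
]
# rank_by_user_heat(answers, user_heat)[0] -> the movie answer, as in the text.
```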

On the other hand, the natural language understanding system 520 can also select a return answer based on the user's own habits. For example, suppose that, when conversing with the natural language understanding system 520, a user often mentions "I want to read the book of the Romance of the Three Kingdoms", less often mentions "I want to watch the TV series of the Romance of the Three Kingdoms", and even less often mentions "I want to listen to the music of the Romance of the Three Kingdoms" (for example, the user dialogue database holds 20 records about "the book of the Romance of the Three Kingdoms" (for example, in the preference column 318 of record 8 of FIG. 3B), 8 records about "the TV series of the Romance of the Three Kingdoms" (for example, in the preference column 318 of record 9 of FIG. 3B), and 1 record about "the music of the Romance of the Three Kingdoms"); the priority order of the return answers in the candidate list will then be "the book of the Romance of the Three Kingdoms", "the TV series of the Romance of the Three Kingdoms", and "the music of the Romance of the Three Kingdoms". That is to say, when the first keyword 509 is "The Romance of the Three Kingdoms", the natural language understanding system 520 selects "the book of the Romance of the Three Kingdoms" as the first return answer 511, and outputs the first voice response 507 according to the first return answer 511.

It is worth mentioning that the natural language understanding system 520 can also determine the priority order of the return answers according to user preferences. Specifically, the user dialogue database can also record preference keywords that the user has expressed, such as "like", "idol", "disgust", or "hate". Thus, the natural language understanding system 520 can sort the return answers in the candidate list based on the number of times such keywords have been recorded. For example, if a return answer is associated with "like" more often, that return answer will be selected earlier; conversely, if a return answer is associated with "disgust" more often, it will be selected later.

For example, suppose that when conversing with the natural language understanding system 520, a user often mentions "I hate watching the TV series of the Romance of the Three Kingdoms", less often mentions "I hate listening to the music of the Romance of the Three Kingdoms", and even less often mentions "I hate reading the book of the Romance of the Three Kingdoms" (for example, the user dialogue database holds 20 records about "I hate watching the TV series of the Romance of the Three Kingdoms" (which can be recorded, for example, through the aversion column 320 of record 9 in FIG. 3B), 8 records about "I hate listening to the music of the Romance of the Three Kingdoms", and 1 record about "I hate reading the book of the Romance of the Three Kingdoms" (for example, recorded through the aversion column 320 of record 8 in FIG. 3B)); the priority order of the return answers in the candidate list will then run from least to most disliked, namely "the book of the Romance of the Three Kingdoms", "the music of the Romance of the Three Kingdoms", and "the TV series of the Romance of the Three Kingdoms". That is to say, when the first keyword 509 is "The Romance of the Three Kingdoms", the natural language understanding system 520 selects "the book of the Romance of the Three Kingdoms" as the first return answer 511, and outputs the first voice response 507 according to the first return answer 511. In one embodiment, an aversion column 320 may be added alongside the heat column 316 of FIG. 3B to record the user's degree of dislike. In another embodiment, when the user's expression of "disgust" toward a certain record is parsed, one (or another value) may be directly subtracted from the heat column 316 (or the preference column 318) of the corresponding record, so that the user's preference is recorded without adding a column. Various embodiments for recording user preferences may be applied to the embodiments of the present invention, and the present invention is not limited thereto. Other examples of recording and applying user habit information, and of providing return answers according to the user's or the public's usage habits and preferences, are explained in more detail later with reference to FIG. 7A/7B/8.
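The like/dislike adjustments discussed here could be folded into a single score per record, as in the sketch below; the keyword sets and the equal weighting of heat, preference, and aversion are assumptions (it re-uses the assumed Record fields from the earlier sketch).

```python
LIKE_WORDS = {"like", "idol"}         # assumed preference keywords
DISLIKE_WORDS = {"disgust", "hate"}   # assumed aversion keywords

def update_preference(record, keywords, step=1):
    """Adjust a record's preference/aversion counts from parsed preference keywords."""
    if any(k in LIKE_WORDS for k in keywords):
        record.preference += step
    if any(k in DISLIKE_WORDS for k in keywords):
        record.aversion += step       # or, alternatively, subtract from record.heat

def preference_score(record):
    """Score used to sort return answers: liked records rise, disliked records sink."""
    return record.heat + record.preference - record.aversion
```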

On the other hand, the natural language understanding system 520 can also determine the priority order of at least one return answer according to voice input that the user provided before the first voice input 501 (that is, before the natural language dialogue system 500 has provided any return answer for the first voice input 501, at which point the user does not yet know what kind of return answer will be given). That is, if a voice input (e.g., a fourth voice input) is received by the voice sampling module 510 earlier than the first voice input 501, the natural language understanding system 520 can also parse a fourth keyword from the fourth voice input, preferentially select from the candidate list a fourth return answer corresponding to the fourth keyword, and output a fourth voice response according to the fourth return answer.

For example, assume that the natural language understanding system 520 first receives the first voice input 501 "I want to watch a TV series", and a short time later (e.g., after a few seconds) receives the fourth voice input "Play the Romance of the Three Kingdoms for me". At this time, the natural language understanding system 520 can recognize the first keyword 509 "TV series" in the first voice input 501, and then recognize "The Romance of the Three Kingdoms" in the fourth keyword. Therefore, the natural language understanding system 520 selects from the candidate list the return answer relating to both "The Romance of the Three Kingdoms" and "TV series", and outputs a fourth voice response to the user based on that fourth return answer.
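A minimal sketch of combining keywords across consecutive inputs is given below; keeping a short sliding window of recent keywords is an assumption about how this conversational context might be held.

```python
from collections import deque

class DialogueContext:
    """Keep keywords from the last few voice inputs so a later request can reuse them."""

    def __init__(self, window=3):
        self.recent = deque(maxlen=window)

    def add_turn(self, keywords):
        self.recent.append(set(keywords))

    def combined_keywords(self, current_keywords):
        """Merge the current keywords with those of earlier turns."""
        merged = set(current_keywords)
        for turn in self.recent:
            merged |= turn
        return merged

ctx = DialogueContext()
ctx.add_turn(["TV series"])                                        # earlier input: "I want to watch a TV series"
query = ctx.combined_keywords(["Romance of the Three Kingdoms"])   # later input: "Play the Romance of the Three Kingdoms for me"
# query now contains both keywords, so the TV-series answer is selected, as in the example.
```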

Based on the above, the natural language understanding system 520 can, according to the voice input from the user together with information such as the usage habits of the public, the user's preferences, the user's own habits, or the user's preceding and following utterances, output a voice response that better matches the request information of the voice input. The natural language understanding system 520 can prioritize the return answers in the candidate list according to different sorting methods, such as the usage habits of the public, the user's preferences, the user's habits, or the context of the user's conversation. Thereby, even if the voice input from the user is less clear, the natural language understanding system 520 can determine the intent of the user's voice input 501 (e.g., the attributes or knowledge domain of the keywords 509 in the first voice input 501) by taking these factors into account. In other words, if a return answer is close to what the user or the public has expressed or intended, the natural language understanding system 520 will select that return answer preferentially. In this way, the voice response output by the natural language dialogue system 500 can better match the user's request information.

In summary, in the method for correcting a voice response and the natural language dialogue system of the present embodiment, the natural language dialogue system can output a corresponding first voice response 507 according to the first voice input 501 from the user. When the first voice response 507 output by the natural language dialogue system does not match the first request information 503 or the first keyword 509 of the user's first voice input 501, the natural language dialogue system can correct the originally output first voice response 507 and, according to the second voice input 501' provided by the user, further select a second voice response 507' that better matches the user's needs. In addition, the natural language dialogue system can also preferentially select a more appropriate reward answer according to the crowd's usage habits, the user's preferences, the user's habits, or the user's utterances, and output a corresponding voice response to the user. In this way, if the user is dissatisfied with the answer provided by the natural language dialogue system, the natural language dialogue system can automatically correct itself according to the user's request information each time and report a new voice response to the user, thereby enhancing the user's convenience when talking to the natural language dialogue system.

Next, examples are described in which the architecture and components of the natural language understanding system 100 and the structured database 220 are applied to provide responses and reward answers according to the context of the conversation with the user, as well as user habits, crowd usage habits, and user preferences.

FIG. 7A is a block diagram of a natural language dialogue system according to an embodiment of the invention. Referring to FIG. 7A, the natural language dialogue system 700 includes a voice sampling module 710, a natural language understanding system 720, a feature database 730, and a speech synthesis database 740. The voice sampling module 710 in FIG. 7A is the same as the voice sampling module 510 of FIG. 5A, and the natural language understanding system 720 is the same as the natural language understanding system 520, so they perform the same functions. In addition, when the natural language understanding system 720 analyzes the request information 703, the user's intent can also be obtained by performing a full-text search on the structured database 220 of FIG. 1; this part of the technique has been described above with reference to FIG. 1 and the related description and is not repeated here. The feature database 730 is used to store the user preference data 715 sent by the natural language understanding system 720, or to provide the user preference record 717 to the natural language understanding system 720, which will be described in more detail later. The speech synthesis database 740 is equivalent to the speech synthesis database 530 and is used to provide the voice output to the user. In this embodiment, the voice sampling module 710 is configured to receive the voice input 701 (i.e., the first/second voice input 501/501' of FIG. 5A/5B, which is the voice from the user), and the natural language understanding system 720 parses the request information 703 in the voice input (i.e., the first/second request information 503/503' of FIG. 5A/5B) and outputs the corresponding voice response 707 (i.e., the first/second voice response 507/507' of FIG. 5A/5B). The components of the aforementioned natural language dialogue system 700 may be configured in the same machine, and the present invention is not limited in this regard.

The natural language understanding system 720 receives the request information 703 obtained after the voice sampling module 710 parses the voice input 701. The natural language understanding system 720 generates a candidate list including at least one reward answer according to one or more keywords 709 in the voice input 701, then selects from the candidate list one answer corresponding to the keywords 709 as the reward answer 711, queries the speech synthesis database 740 to find the speech 713 corresponding to the reward answer 711, and finally outputs the voice response 707 according to the speech 713. In addition, the natural language understanding system 720 of the present embodiment may be implemented by a hardware circuit composed of one or several logic gates, or implemented by computer program code, which is merely an example and is not a limitation.
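
As a rough illustration of this flow, the following sketch strings the steps together with small in-memory stand-ins for the structured database 220 and the speech synthesis database 740; all function names and data layouts are assumptions for illustration, not the actual modules of FIG. 7A.

```python
structured_db = [
    {"title": "Romance of the Three Kingdoms", "category": "book"},
    {"title": "Romance of the Three Kingdoms", "category": "TV series"},
    {"title": "Forget the Water", "category": "music", "singer": "Andy Lau"},
]
speech_synthesis_db = {          # reward answer category -> speech 713
    "book": "Playing the book of the Romance of the Three Kingdoms",
    "TV series": "Playing the TV series of the Romance of the Three Kingdoms",
    "music": "Playing Forget the Water",
}

def parse_keywords(text):
    # Stand-in for speech recognition plus natural language processing (keywords 709).
    vocabulary = ["Romance of the Three Kingdoms", "Forget the Water", "book", "TV series"]
    return [w for w in vocabulary if w in text]

def build_candidate_list(keywords):
    # Candidate list: every record matching at least one keyword.
    return [r for r in structured_db if any(k in r.values() for k in keywords)]

def handle_voice_input(text):
    candidates = build_candidate_list(parse_keywords(text))   # candidate list
    answer = candidates[0] if candidates else None             # reward answer 711 (no ranking here)
    return speech_synthesis_db[answer["category"]] if answer else "Could you repeat that?"

print(handle_voice_input("I want to see the book of the Romance of the Three Kingdoms"))
```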

FIG. 7B is a block diagram of a natural language dialogue system 700' in accordance with another embodiment of the present invention. The natural language understanding system 720' of FIG. 7B may include a speech recognition module 722 and a natural language processing module 724, and the voice sampling module 710 may be combined with the speech synthesis module 726 in a speech synthesis processing module 702. The speech recognition module 722 receives the request information 703 sent from the voice sampling module 710 in order to parse the voice input 701 and converts it into one or more keywords 709. The natural language processing module 724 further processes the keywords 709 to obtain at least one candidate list, and selects from the candidate list the answer that best matches the voice input 701 as the reward answer 711. Since the reward answer 711 is an internal analysis result of the natural language understanding system 720', it must be converted into text or voice before it can be output to the user. The speech synthesis module 726 therefore queries the speech synthesis database 740 according to the reward answer 711; the speech synthesis database 740 records, for example, text and its corresponding voice information, so that the speech synthesis module 726 can find the speech 713 corresponding to the reward answer 711 and thereby synthesize the voice response 707. Thereafter, the speech synthesis module 726 can output the synthesized speech through a voice output interface (not shown), where the voice output interface is, for example, a device such as a speaker or a headset, to output the voice to the user. It should be noted that in FIG. 7A the natural language understanding system 720 incorporates the speech synthesis module 726 (as in the architecture of FIG. 5B, although the speech synthesis module 726 is not shown in FIG. 7A), and the speech synthesis module uses the reward answer 711 to query the speech synthesis database 740 to obtain the speech 713 as a basis for synthesizing the voice response 707.

In this embodiment, the speech recognition module 722, the natural language processing module 724, and the speech synthesis module 726 in the natural language understanding system 720' are respectively equivalent to the speech recognition module 522, the natural language processing module 524, and the speech synthesis module 526 of FIG. 5B, and provide the same functionality. In addition, the speech recognition module 722, the natural language processing module 724, and the speech synthesis module 726 can be disposed in the same machine as the voice sampling module 710. In other embodiments, the speech recognition module 722, the natural language processing module 724, and the speech synthesis module 726 can also be distributed among different machines (e.g., computer systems, servers, or the like). For example, in the natural language understanding system 720' shown in FIG. 7B, the speech synthesis module 726 can be disposed in the same machine 702 as the voice sampling module 710, while the speech recognition module 722 and the natural language processing module 724 can be configured on another machine. It should be noted that in the architecture of FIG. 7B, since the speech synthesis module 726 and the voice sampling module 710 are disposed in the machine 702, the natural language understanding system 720' needs to transmit the reward answer 711 to the machine 702, and the speech synthesis module 726 sends the reward answer 711 to the speech synthesis database 740 to find the corresponding speech 713 as a basis for generating the voice response 707. In addition, when the speech synthesis module 726 calls the speech synthesis database 740 according to the reward answer 711, it may be necessary to first convert the reward answer 711 and then make the call through the interface specified by the speech synthesis database 740; as this belongs to techniques well known to those skilled in the art, it is not described in detail herein.

The natural language dialogue method will be described below in conjunction with the natural language dialogue system 700 of FIG. 7A. FIG. 8A is a flowchart of a natural language dialogue method according to an embodiment of the invention. For convenience of explanation, only the natural language dialogue system 700 of FIG. 7A is taken as an example, but the natural language dialogue method of the present embodiment can also be applied to the natural language dialogue system 700' of FIG. 7B. Whereas FIG. 5/6 deal with automatically correcting the output voice response according to the user's subsequent voice input, FIG. 7A/7B/8 deal with selecting the reward answer 711 from the candidate list according to the user preference data 715 recorded in the feature database 730 and playing the corresponding voice to the user. In fact, the embodiments of FIG. 5/6 and FIG. 7A/7B/8 may be implemented alternatively or may coexist, and the invention is not limited thereto.

Referring to FIG. 7A and FIG. 8 simultaneously, in step S810, the voice sampling module 710 receives the voice input 701. The voice input 701 is, for example, a voice from a user and may carry the user's request information 703. Specifically, the voice input 701 from the user may be a query sentence, a command sentence, or other request information, such as the aforementioned examples "I want to see the Romance of the Three Kingdoms", "I want to listen to Forget the Water", or "What is the temperature today". It should be noted that steps S802-S806 are the flow in which the natural language dialogue system 700 stores the user preference data 715 for the user's previous voice inputs, and the subsequent steps S810-S840 operate based on the user preference data 715 previously stored in the feature database 730. The details of steps S802-S806 will be described later, and the operations of steps S820-S840 are described below.

In step S820, the natural language understanding system 720 parses at least one keyword 709 included in the voice input 701 to obtain a candidate list, wherein the candidate list has one or more reward answers. In detail, the natural language understanding system 720 parses the voice input 701 and obtains one or more keywords 709 of the voice input 701. For example, when the user's voice input 701 is "I want to see the Romance of the Three Kingdoms", the keywords 709 obtained by the natural language understanding system 720 after analysis are, for example, "Romance of the Three Kingdoms" and "see" (as mentioned before, it can also be analyzed whether the user wants to see a book, a TV series, or a movie). For another example, when the user's voice input 701 is "I want to listen to the song Forget the Water", the keywords 709 obtained after analysis are, for example, "Forget the Water", "listen", and "song" (as mentioned earlier, it can be further analyzed whether the user wants to hear the version sung by Andy Lau or by Li Yijun). After that, the natural language understanding system 720 can perform a full-text search on the structured database according to the above keywords 709 and obtain at least one search result (which may be at least one of the records in FIG. 3A/3B) as a reward answer in the candidate list. Since a keyword 709 may belong to different knowledge fields (such as movies, books, music, games, etc.), and can be further divided into multiple categories within the same knowledge field (for example, different authors of the same movie or book title, different singers of the same song title, different versions of the same game title, etc.), for a given keyword 709 the natural language understanding system 720 can find by analysis (e.g., a full-text search of the structured database 220) one or more search results related to this keyword 709, each of which includes the keyword 709 and data other than the keyword 709 (the contents of the other data are shown in Table 1). Therefore, from another point of view, when the voice input 701 input by the user has multiple keywords 709, the user's request information 703 is clearer, so the natural language understanding system 720 can find search results closer to the request information 703 (because if the natural language understanding system 720 can find an exact match, it should be the option that the user wants).
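
A minimal sketch of this step, assuming a simple in-memory stand-in for the structured database 220, is shown below; it illustrates how a single keyword such as "Forget the Water" can match records in several categories, all of which become reward answers in the candidate list, and how additional keywords narrow the list. The field names are hypothetical.

```python
structured_db_220 = [
    {"title": "Forget the Water", "category": "music", "singer": "Andy Lau"},
    {"title": "Forget the Water", "category": "music", "singer": "Li Yijun"},
    {"title": "Romance of the Three Kingdoms", "category": "book"},
    {"title": "Romance of the Three Kingdoms", "category": "TV series"},
]

def full_text_search(keywords):
    """Return every record whose fields contain all of the given keywords 709."""
    hits = []
    for record in structured_db_220:
        text = " ".join(str(v) for v in record.values())
        if all(k in text for k in keywords):
            hits.append(record)
    return hits          # each hit becomes a reward answer in the candidate list

# A vaguer request matches several categories; more keywords narrow the list.
print(full_text_search(["Forget the Water"]))               # two singers match
print(full_text_search(["Forget the Water", "Andy Lau"]))   # exactly one match
```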

For example, when the keyword 709 is "Romance of the Three Kingdoms", the search results analyzed by the natural language understanding system 720 are, for example, the records "... "Romance of the Three Kingdoms"... "TV series"..." and "... "Romance of the Three Kingdoms"... "books"..." (where "TV series" and "books" reflect the user's possible intentions in the results). For another example, when the keywords 709 are "Forget the Water" and "music", the search results analyzed by the natural language understanding system 720 may be the records "... "Forget the Water"... "music"... "Andy Lau"..." and "... "Forget the Water"... "music"... "Li Yijun"...", where "Andy Lau" and "Li Yijun" indicate the user's possible intentions. In other words, after the natural language understanding system 720 performs a full-text search on the structured database 220, each search result may include a keyword 709 and other data related to the keyword 709 (as shown in Table 1), and the natural language understanding system 720 converts the analyzed search results into a candidate list containing at least one reward answer for use in the subsequent steps.

In step S830, the natural language understanding system 720 selects the reward answer 711 from the candidate list according to the user preference record 717 sent by the feature database 730 (which is obtained, for example, from the user preference data 715 previously stored therein, as will be described later). In the present embodiment, the natural language understanding system 720 can select the reward answer 711 from the candidate list according to a priority order (the determination of the priority order is described below). Then, in step S840, the voice response 707 is output based on the reward answer 711.

For example, in an embodiment, the priority order may be determined by the number of search results. For example, when the keyword 709 is "Romance of the Three Kingdoms", assume that after analysis the natural language understanding system 720 finds in the structured database 220 that the records of "... "Romance of the Three Kingdoms"... "books"..." are the most numerous, followed by "... "Romance of the Three Kingdoms"... "music"...", while "... "Romance of the Three Kingdoms"... "TV series"..." has the fewest records. The natural language understanding system 720 then treats the records related to the books of the Romance of the Three Kingdoms as the first-priority reward answers (for example, sorting all the records about the books of the Romance of the Three Kingdoms into one candidate list, which can be ordered by the value of the heat column 316), the records related to the music of the Romance of the Three Kingdoms as the second-priority reward answers, and the records related to the TV series of the Romance of the Three Kingdoms as the third-priority reward answers. It should be noted that in addition to the number of search results, the priority order may also be based on user preferences, user habits, or crowd usage habits, and the related description is detailed later.
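
The following sketch illustrates this prioritization under the same assumptions: search results are grouped by category, categories with more matching records come first, and records within a category are ordered by the heat column 316. The field names and numeric values are illustrative only.

```python
from collections import defaultdict

results = [
    {"title": "Romance of the Three Kingdoms", "category": "book",      "heat": 7},
    {"title": "Romance of the Three Kingdoms", "category": "book",      "heat": 12},
    {"title": "Romance of the Three Kingdoms", "category": "book",      "heat": 3},
    {"title": "Romance of the Three Kingdoms", "category": "music",     "heat": 9},
    {"title": "Romance of the Three Kingdoms", "category": "music",     "heat": 1},
    {"title": "Romance of the Three Kingdoms", "category": "TV series", "heat": 20},
]

by_category = defaultdict(list)
for r in results:
    by_category[r["category"]].append(r)

# Categories with more matching records come first; within a category,
# records with a higher heat value come first.
ranked_categories = sorted(by_category, key=lambda c: len(by_category[c]), reverse=True)
candidate_list = [rec
                  for c in ranked_categories
                  for rec in sorted(by_category[c], key=lambda r: r["heat"], reverse=True)]

print([(r["category"], r["heat"]) for r in candidate_list])
# book (3 records) first, then music (2 records), then TV series (1 record)
```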

In order to enable those skilled in the art to further understand the natural language dialogue method and the natural language dialogue system of the present embodiment, an embodiment will be described in detail below.

First, assume that the voice input 701 received by the voice sampling module 710 is "I want to see the Romance of the Three Kingdoms" (step S810). The natural language understanding system 720 then parses the keywords 709 as "see" and "Romance of the Three Kingdoms", and obtains a candidate list having multiple reward answers, wherein each reward answer contains the related keywords and other data (step S820), as shown in Table 1 above.

Next, the natural language understanding system 720 selects the reward answer from the candidate list. Assuming that the natural language understanding system 720 selects reward answer A in the candidate list (refer to Table 1) as the reward answer 711, the natural language understanding system 720 outputs, for example, "Do you want to play the book of the Romance of the Three Kingdoms?" as the voice response 707 (steps S830 to S840).

As described above, the natural language understanding system 720 can also sort the reward answers in the candidate list according to different methods of evaluating the priority order, and accordingly output the voice response 707 corresponding to the reward answer 711. For example, the natural language understanding system 720 can determine user preferences based on multiple conversation records with the user (e.g., from the user's positive/negative terms mentioned above), and can use the user preference record 717 to determine the priority order of the reward answer 711. Before explaining the use of the user's positive/negative terms, the manner in which the user preference data 715 stores the user's or the crowd's likes/dislikes or habits is first described.

The manner in which the user preference data 715 is stored according to steps S802-S806 is now described. In an embodiment, before the voice input 701 is received in step S810, a plurality of voice inputs, that is, the previous history of conversation records, may be received in step S802, and the user preference data 715 is captured based on these previous voice inputs (step S804) and then stored in the feature database 730. In fact, the user preference data 715 can also be stored in the structured database 220 (that is, the feature database 730 is incorporated into the structured database 220). For example, in an embodiment, the heat column 316 of FIG. 3B can be directly used to record the user's preferences; the recording manner of the heat column 316 has been mentioned before (for example, when a certain record 302 is matched, its heat column is incremented by one) and is not repeated here. Of course, the user preference data 715 can also be stored in columns of the structured database 220, for example by combining the keyword (such as "Romance of the Three Kingdoms") with the user's expressed preference (for example, when the user uses positive terms such as "like" or negative terms such as "disgust", the values of the preference column 318 and the disgust column 320 of FIG. 3B can be incremented by one, respectively), so that the number of likes (for example, the count of positive terms) can then be calculated. Therefore, when the natural language understanding system 720 queries the structured database 220 for the user preference record 717, the values of the preference column 318 and the disgust column 320 can be queried directly (i.e., the counts of positive and negative terms can be queried) to judge the user's preference (that is, the statistical values of the positive and negative terms are transmitted to the natural language understanding system 720 as the user preference record 717).

The case where the user preference data 715 is stored in the feature database 730 (i.e., the feature database 730 is not incorporated into the structured database 220) is described below. In an embodiment, the user preference data 715 may be stored in a manner that records the user's degree of liking for a keyword. For example, the preference column 852 and the disgust column 862 of FIG. 8B may be used directly to record the user's personal likes and dislikes for a certain keyword, while the preference column 854 and the disgust column 864 record the crowd's likes and dislikes for that keyword. For example, in FIG. 8B, the values of the preference column 852 and the disgust column 862 corresponding to the keywords "Romance of the Three Kingdoms" and "books" stored in record 832 are 20 and 1, respectively; record 834 stores the corresponding values for the keywords "Romance of the Three Kingdoms" and "TV series"; and the values of the preference column 852 and the disgust column 862 corresponding to the keywords "Romance of the Three Kingdoms" and "music" stored in record 836 are 1 and 8, respectively. These values represent the user's personal like and dislike data for the relevant keywords (for example, the higher the value of the preference column 852, the stronger the liking, and the higher the value of the disgust column 862, the stronger the dislike). In addition, the values of the preference column 854 and the disgust column 864 corresponding to record 832 are 5 and 3, respectively, those corresponding to record 834 are 80 and 20, respectively, and those corresponding to record 836 are 2 and 10, respectively, which represent the crowd's like and dislike data for the relevant keywords. When the user expresses a like or a dislike, the keywords together with a "favorite indication" are used to update the values of the preference column 852 and the disgust column 862 accordingly. Therefore, if the user inputs the voice "I want to watch the TV series of the Romance of the Three Kingdoms", the natural language understanding system 720 can combine the keywords "Romance of the Three Kingdoms" and "TV series" with a favorite indication of incrementing the preference column, and send them as the user preference data 715 to the feature database 730, so that the feature database 730 increments the value of the preference column 852 of record 834 (because the user wants to see the TV series of the Romance of the Three Kingdoms, indicating an increase in his preference for it). According to the above manner of recording user preference data, when the user later inputs a related keyword, for example "I want to see the Romance of the Three Kingdoms", the natural language understanding system 720 can, based on the keyword "Romance of the Three Kingdoms", query the feature database 730 of FIG. 8B for the three records 832/834/836 related to the "Romance of the Three Kingdoms", and the feature database 730 can return the values of the preference column 852 and the disgust column 862 as the user preference record 717 to the natural language understanding system 720, which can then determine the user's personal preference based on the user preference record 717.
Of course, the feature database 730 can also return the values of the preference column 854 and the disgust column 864 as the user preference record 717 to the natural language understanding system 720, in which case the user preference record 717 serves as a basis for judging the crowd's preference. In other words, the user preference record 717 may represent either the individual user's or the crowd's preferences, and the present invention is not limited thereto.
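
The following sketch shows one possible in-memory layout for the FIG. 8B records and the update performed when keywords arrive together with a "favorite indication". The column values follow the example where the description gives them; the remaining value, the dictionary layout, and the helper names are assumptions for illustration only.

```python
# Feature database 730; each record mirrors FIG. 8B (keywords plus the
# personal columns 852/862 and the group columns 854/864). The personal
# preference value of record 834 is a placeholder, not given in the text.
feature_db_730 = [
    {"keywords": ("Romance of the Three Kingdoms", "book"),
     "like_852": 20, "dislike_862": 1,  "like_854": 5,  "dislike_864": 3},   # record 832
    {"keywords": ("Romance of the Three Kingdoms", "TV series"),
     "like_852": 0,  "dislike_862": 20, "like_854": 80, "dislike_864": 20},  # record 834
    {"keywords": ("Romance of the Three Kingdoms", "music"),
     "like_852": 1,  "dislike_862": 8,  "like_854": 2,  "dislike_864": 10},  # record 836
]

def store_preference(keywords, indication):
    """Apply user preference data 715: the keywords plus a favorite indication."""
    for record in feature_db_730:
        if record["keywords"] == tuple(keywords):
            record[indication] += 1
            return record
    # No matching record yet: create one, like record 838 in the description.
    record = dict.fromkeys(("like_852", "dislike_862", "like_854", "dislike_864"), 0)
    record["keywords"] = tuple(keywords)
    record[indication] += 1
    feature_db_730.append(record)
    return record

def user_preference_record(keyword):
    """Return the user preference record 717 for every record mentioning the keyword."""
    return [r for r in feature_db_730 if keyword in r["keywords"]]

store_preference(["Romance of the Three Kingdoms", "TV series"], "like_852")
print(user_preference_record("Romance of the Three Kingdoms"))
```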

In another embodiment, the values of the preference column 852 and the disgust column 862 can also serve as a basis for judging the user's or the crowd's habits. For example, after receiving the user preference record 717, the natural language understanding system 720 can compare the values of the preference column 852/854 and the disgust column 862/864; if the two values differ by more than a certain threshold, it indicates that the user habitually converses in a specific way. For example, when the value of the preference column 852 is more than 10 times the value of the disgust column 862, it indicates that the user particularly likes to use positive terms in conversation (this is one way of recording a "user habit"); therefore, in this case the natural language understanding system 720 can select the reward answer based on the preference column 852 only. When the natural language understanding system 720 uses the values of the preference column 854/disgust column 864 stored in the feature database 730, the judgment is made over all users' preference records in the feature database 730, and the result can be used as reference data for the crowd's habits. It should be noted that the user preference record 717 returned to the natural language understanding system 720 by the feature database 730 can include both the user's personal preference record (e.g., the values of the preference column 852/disgust column 862) and the crowd's preference record (e.g., the values of the preference column 854/disgust column 864), and the present invention is not limited thereto.
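
A minimal sketch of this habit check, assuming the 10x factor mentioned above as the threshold, might look as follows; the function name and the zero-value guard are illustrative assumptions.

```python
THRESHOLD_FACTOR = 10   # the 10x factor from the example above

def habitually_positive(like_852, dislike_862):
    """True if positive terms dominate enough to be treated as the user's habit."""
    return like_852 > THRESHOLD_FACTOR * max(dislike_862, 1)   # guard against a zero dislike count

print(habitually_positive(like_852=50, dislike_862=3))   # True  -> pick answers from the preference column
print(habitually_positive(like_852=8,  dislike_862=5))   # False -> consider both columns
```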

As for storing the user preference data 715 obtained from the current voice input, whenever the candidate list is generated in step S820 (whether by an exact match or a partial match), the natural language dialogue system 700 may store the user preference data 715 obtained from the user's voice input. For example, in step S820, whenever a keyword produces a matching result in the structured database 220, it can be determined that the user has a preference for that matching result, so the keywords and the "favorite indication" can be sent to the feature database 730, the corresponding record found therein, and the corresponding preference column 852/854 or disgust column 862/864 updated (for example, when the user inputs "I want to read the books of the Romance of the Three Kingdoms", the value of the preference column 852/854 of record 832 of FIG. 8B may be incremented by one). In another embodiment, the natural language dialogue system 700 may also store the user preference data 715 after the user selects a reward answer in step S830. In addition, if no corresponding keyword is found in the feature database 730, a new record can be created to store the user preference data 715. For example, when the user inputs the voice "I want to listen to Andy Lau's Forget the Water" and the keywords "Andy Lau" and "Forget the Water" are generated, if no corresponding record is found in the feature database 730 at the time of storing, the feature database 730 creates a new record 838 and increments the value of its corresponding preference column 852/854. The above timing and manner of storing the user preference data 715 are for illustrative purposes only; those skilled in the art can modify the embodiments shown in the present invention according to the actual application, and all equivalent modifications that do not deviate from the spirit of the present invention are included in the scope of the patent application of the present invention.

Further, although the format in which the records 832-838 are stored in the feature database 730 shown in FIG. 8B differs from the record format of the structured database 220 (for example, as shown in FIGS. 3A/3B/3C), the present invention does not limit the storage format of the records. Furthermore, although the above embodiment only describes the storage and use of the preference column 852/854 and the disgust column 862/864, in another embodiment additional columns 872/874 may be opened in the feature database 730 to store other habits of the user or the crowd, for example, data such as the number of times the corresponding record has been downloaded, cited, recommended, commented on, or referred. In another embodiment, the number or data of these downloads, citations, recommendations, comments, or referrals may also be stored in the preference column 852/854 and the disgust column 862/864; for example, each time the user provides a good comment or a referral for a record, the value of the preference column 852/854 may be incremented by one, and if the user provides a bad comment on a record, the value of the disgust column 862/864 may be incremented by one. The number of columns in a record and the meaning of their values are not limited. It should be noted that those skilled in the art will understand that, since the preference column 852, the disgust column 862, the column 872, etc. in FIG. 8B relate only to the user's personal selections and preferences, these personal like/dislike data can be stored in the user's mobile communication device, while the data associated with all users (or at least a specific group of users), such as the preference column 854, the disgust column 864, the column 874, etc., can be stored in the server; this saves storage space on the server and also preserves the privacy of the user's personal preferences.

The actual use of the user preference record is further explained below using FIG. 7A and FIG. 8B. Suppose that, over the conversation content of a plurality of voice inputs 701, the user often mentions "I hate watching the TV series of the Romance of the Three Kingdoms" when conversing with the natural language understanding system 720, less often mentions "I hate listening to the music of the Romance of the Three Kingdoms", and rarely mentions "I hate reading the books of the Romance of the Three Kingdoms" (for example, the feature database 730 records 20 instances of "I hate watching the TV series of the Romance of the Three Kingdoms" (that is, in record 834 of FIG. 8B, the count of negative terms for "Romance of the Three Kingdoms" plus "TV series" is 20), 8 instances of "I hate listening to the music of the Romance of the Three Kingdoms" (that is, in record 836 of FIG. 8B, the count of negative terms for "Romance of the Three Kingdoms" plus "music" is 8), and 1 instance of "I hate reading the books of the Romance of the Three Kingdoms" (that is, in record 832 of FIG. 8B, the count of negative terms for "Romance of the Three Kingdoms" plus "books" is 1)). Because the user preference record 717 returned from the feature database 730 contains these three negative-term counts (i.e., 20, 8, 1), the natural language understanding system 720 orders the priority of the reward answers 711 in the candidate list as "the books of the Romance of the Three Kingdoms", "the music of the Romance of the Three Kingdoms", and "the TV series of the Romance of the Three Kingdoms". That is to say, when the keyword 709 is "Romance of the Three Kingdoms", the natural language understanding system 720 selects the book of the "Romance of the Three Kingdoms" as the reward answer 711, and outputs the voice response 707 based on the reward answer 711. It should be noted that although the above uses the statistics of the negative terms used by the user to determine the priority order, in another embodiment the statistics of the positive terms used by the user may also be used alone to determine the priority order (for example, as previously mentioned, when the value of the preference column 852 exceeds that of the disgust column 862 by a certain threshold).

It is worth mentioning that the natural language understanding system 720 can also determine the priority order of the reward answers according to the number of positive and negative terms used by the user. Specifically, the feature database 730 can also record terms that the user has expressed, such as "like" and "idol" (positive terms), or "disgust" and "hate" (negative terms). Therefore, in addition to comparing the number of times the user uses "like" and "disgust", the natural language understanding system 720 can also sort the reward answers in the candidate list directly according to the number of positive/negative terms corresponding to the keyword (that is, comparing which answers are cited more often with positive or negative terms). For example, if a reward answer is more often associated with "like" (that is, the count of positive terms is larger, or the value of the preference column 852 is larger), that reward answer will be selected earlier; conversely, if a reward answer is more often associated with "disgust" (that is, the count of negative terms is larger, or the value of the disgust column 862 is larger), it will be selected later. The natural language understanding system 720 can thus sort all the reward answers into a candidate list according to this priority order. Since some users may prefer to use positive terms (for example, the value of the preference column 852 is particularly large), while other users are accustomed to using negative terms (for example, the value of the disgust column 862 is particularly large), in the above embodiment the user preference record 717 reflects the usage habits of individual users and can therefore provide options that better match the user's habits.

In addition, the natural language understanding system 720 can also prioritize the reward answers 711 in the candidate list according to the crowd's usage habits, where the answers more frequently requested by the crowd are given higher priority (this can be recorded, for example, using the heat column 316 of FIG. 3C, or the preference column 854 and the disgust column 864 of FIG. 8B). For example, when the keyword 709 is "Romance of the Three Kingdoms", assume that the reward answers found by the natural language understanding system 720 include, for example, the TV series of the Romance of the Three Kingdoms, the books of the Romance of the Three Kingdoms, and the music of the Romance of the Three Kingdoms. If, when people mention the "Romance of the Three Kingdoms", they usually mean the TV series, fewer people mean the movie, and even fewer mean the books (for example, when the values of the preference column 854 of the related records in FIG. 8B are 80, 40, and 5, respectively), the natural language understanding system 720 sorts the reward answers 711 in the priority order "TV series", "movie", "books". In other words, the natural language understanding system 720 will give priority to the TV series of the Romance of the Three Kingdoms as the reward answer 711, and output the voice response 707 according to this reward answer 711. As for the above "giving priority to frequently used answers", the heat column 316 of FIG. 3C (or the preference column 854 and the disgust column 864 of FIG. 8B) can be used for recording; the recording manner has already been disclosed in the relevant paragraphs regarding FIG. 3C (and FIG. 8B) above and is not repeated here.
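
A minimal sketch of ordering by the crowd's usage habits, assuming the group preference column 854 values 80, 40, and 5 from the example above, is shown below; the field names are hypothetical.

```python
candidate_list = [
    {"title": "Romance of the Three Kingdoms", "category": "book",      "group_like_854": 5},
    {"title": "Romance of the Three Kingdoms", "category": "TV series", "group_like_854": 80},
    {"title": "Romance of the Three Kingdoms", "category": "movie",     "group_like_854": 40},
]

# Answers the crowd asks for most often come first.
ranked = sorted(candidate_list, key=lambda r: r["group_like_854"], reverse=True)
print([r["category"] for r in ranked])   # ['TV series', 'movie', 'book']
```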

In addition, the natural language understanding system 720 can also determine the priority order of the reward answer 711 according to the user's frequency of use. Specifically, since the natural language understanding system 720 can record in the feature database 730 the voice inputs 701 that have been received from the user, the feature database 730 can record the keywords 709 parsed by the natural language understanding system 720 from the user's voice inputs 701 as well as the response information, such as the reward answers 711, generated by the natural language understanding system 720. Therefore, when the natural language understanding system 720 later selects the reward answer 711, it can determine the priority order based on the response information recorded in the feature database 730 (for example, user preferences/dislikes/habits, or even the crowd's preferences/dislikes/habits), find a reward answer 711 that better matches the user's intention (as determined from the user's voice input), and obtain the corresponding voice response. As for the above "determining the priority order of the reward answer 711 according to the user's habits", the heat column 316 of FIG. 3C (or the preference column 852 and the disgust column 862 of FIG. 8B) can also be used for recording; the recording manner has already been disclosed in the relevant paragraphs regarding FIG. 3C (and FIG. 8B) above and is not repeated here.

In summary, the natural language understanding system 720 can store the user preference attributes described above (e.g., positive and negative terms), the user habits, and the crowd usage habits into the feature database 730 (step S806). That is, in steps S802, S804, and S806, the user preference data 715 is learned from the user's previous history of conversation records, and the collected user preference data 715 is added to the feature database 730; the user's or the crowd's preference data stored there is later read back as the user preference record 717. In addition, the user habits and the crowd usage habits are also stored in the feature database 730, so that the natural language understanding system 720 can utilize the rich information in the feature database 730 (e.g., obtained via the user preference record 717) to provide a response that more accurately matches the user's input.

Next, the details of step S830 are further described. After receiving the voice input in step S810 and parsing the keywords 709 of the voice input in step S820 to obtain the candidate list, the natural language understanding system 720 determines at least one priority order for the reward answers according to the user preference record 717, which reflects user preferences, user habits, or crowd usage habits (step S880). As mentioned above, the priority order can be based on the number of records found, or on the positive/negative terms of the user or of the crowd. Then, a reward answer 711 is selected from the candidate list according to the priority order (step S890); the selection of the reward answer may also be the closest match or the highest priority as described above. Thereafter, the voice response 707 is output based on the reward answer 711 (step S840).
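
One possible way to combine these signals in steps S880/S890 is sketched below; the scoring formula (personal counts first, crowd counts as a tie-breaker) is an illustrative assumption rather than the only ordering the description allows.

```python
def priority_key(answer, preference_record_717):
    p = preference_record_717.get(answer["category"],
                                  {"like": 0, "dislike": 0, "group_like": 0, "group_dislike": 0})
    return (p["like"] - p["dislike"],                # personal preference first (step S880)
            p["group_like"] - p["group_dislike"])    # crowd habit as a tie-breaker

def select_reward_answer(candidate_list, preference_record_717):
    ranked = sorted(candidate_list,
                    key=lambda a: priority_key(a, preference_record_717),
                    reverse=True)
    return ranked[0]                                  # reward answer 711 (step S890)

preference_record_717 = {
    "book":      {"like": 20, "dislike": 1,  "group_like": 5,  "group_dislike": 3},
    "TV series": {"like": 1,  "dislike": 20, "group_like": 80, "group_dislike": 20},
}
candidate_list = [{"title": "Romance of the Three Kingdoms", "category": c}
                  for c in ("book", "TV series")]
print(select_reward_answer(candidate_list, preference_record_717))   # the "book" record
```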

On the other hand, the natural language understanding system 720 can also determine the priority order of at least one reward answer based on a voice input 701 entered by the user earlier. That is, assuming that another voice input 701 (e.g., the aforementioned fourth voice input) is received by the voice sampling module 710 before the voice response 707 is played, the natural language understanding system 720 can also parse the keyword in this voice input 701 (i.e., the fourth keyword 709), preferentially select from the candidate list the reward answer that matches this keyword as the reward answer 711, and output the voice response 707 based on this reward answer 711.

For example, assume that the natural language understanding system 720 first receives the voice input 701 of "I want to watch a TV series", and a few seconds later receives the voice input 701 of "Help me put on the Romance of the Three Kingdoms". At this time, the natural language understanding system 720 can recognize the keyword "TV series" (the first keyword) in the first voice input 701, and recognize the keyword "Romance of the Three Kingdoms" (the fourth keyword) in the later voice input. Therefore, the natural language understanding system 720 selects from the candidate list the record matching both "Romance of the Three Kingdoms" and "TV series" as the reward answer 711, and outputs the voice response 707 to the user based on the reward answer 711.

Based on the above, the natural language understanding system 720 can output, according to the voice input from the user together with information such as the crowd's usage habits, user preferences, user habits, or the user's preceding and following utterances, a voice response 707 that matches the request information 703 of the voice input 701. The natural language understanding system 720 can prioritize the reward answers in the candidate list according to different sorting methods, such as the crowd usage habits, user preferences, user habits, or the user's preceding and following utterances. Therefore, if the voice input 701 from the user is relatively unclear, the natural language understanding system 720 can still determine the intent of the user's voice input 701 (such as the attribute or knowledge field of the keyword 709 in the voice input) according to the crowd usage habits, user preferences, user habits, or the user's utterances. In other words, if a reward answer 711 is close to what the user or the crowd has habitually expressed or intended, the natural language understanding system 720 will select that reward answer 711 with priority. In this way, the voice response 707 output by the natural language dialogue system 700 can better match the user's request information 703.

It should be noted that although the feature database 730 and the structured database 220 are described above as different databases, the two databases may be combined, and those skilled in the art may choose according to the actual application.

In summary, the present invention provides a natural language dialogue method and a system thereof, in which the natural language dialogue system can output a corresponding voice response according to a voice input from a user. The natural language dialogue system of the present invention can also preferentially select a more appropriate reward answer according to the crowd's usage habits, the user's preferences, the user's habits, or the user's utterances, and output a voice response to the user accordingly, thereby enhancing the user's convenience in a conversation with the natural language dialogue system.

Next, examples are described in which the architecture and components of the natural language understanding system 100 and the structured database 220 are applied to decide, according to the number of reward answers obtained by analyzing the request information of the user's voice input, whether to perform an operation directly according to the data type or to ask the user for further instructions; when there is only one reward answer, the operation can likewise be performed directly according to the data type. The benefit of giving the user this choice is that the system does not have to filter the answers on the user's behalf; instead, the candidate list containing the reward answers is provided directly to the user, and the user decides, by selecting (or not selecting) an answer, which software is executed or which service is provided, thereby achieving the purpose of providing a user-friendly interface.

FIG. 9 is a schematic diagram of a system of a mobile terminal device according to an embodiment of the invention. Referring to FIG. 9, in this embodiment, the mobile terminal device 900 includes a voice receiving unit 910, a data processing unit 920, a display unit 930, and a storage unit 940. The data processing unit 920 is coupled to the voice receiving unit 910, the display unit 930, and the storage unit 940. The voice receiving unit 910 is configured to receive the first input speech SP1 and the second input speech SP2 and transmit them to the data processing unit 920. The first input speech SP1 and the second input speech SP2 may be the voice inputs 501, 701 described above. The display unit 930 is configured to be controlled by the data processing unit 920 to display the first/second candidate list 908/908'. The storage unit 940 is configured to store a plurality of data, which may include the data of the structured database 220 or the feature database 730, and details are not repeated here. Furthermore, the storage unit 940 can be any type of memory within a server or computer system, such as dynamic random access memory (DRAM), static random access memory (SRAM), flash memory, or read only memory (ROM); the invention is not limited thereto, and those skilled in the art can choose according to actual needs.

In the present embodiment, the data processing unit 920 functions as the natural language understanding system 100 of FIG. 1: it performs speech recognition on the first input speech SP1 to generate the first request information 902, then analyzes the first request information 902 with natural language processing to generate the first keyword 904 corresponding to the first input speech SP1, and, according to the first keyword 904, searches the data of the storage unit 940 (e.g., the search engine 240 performs a full-text search on the structured database 220 according to the keyword 108) to find the first reward answer 906 (e.g., the first reward answer 511/711). When the number of first reward answers 906 found is 1, the data processing unit 920 can directly perform the corresponding operation according to the document data corresponding to the first reward answer 906; when the number of first reward answers 906 is greater than 1, the data processing unit 920 can organize the first reward answers 906 into a first candidate list 908 and then control the display unit 930 to display the first candidate list 908 to the user. In the case where the first candidate list 908 is displayed for the user to make a further selection, the data processing unit 920 receives the second input speech SP2 and performs speech recognition to generate the second request information 902', then performs natural language processing on the second request information 902' to generate the second keyword 904' corresponding to the second input speech SP2, and selects the corresponding portion from the first candidate list 908 according to the second keyword 904'. The first keyword 904 and the second keyword 904' may each consist of a plurality of keywords. The manner of analyzing the second input speech SP2 to generate the second request information 902' and the second keyword 904' may follow the manner in which the second voice input is analyzed in FIGS. 5A and 7A, and is therefore not repeated.
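
A minimal sketch of this decision logic for the data processing unit 920 is given below: act directly when exactly one first reward answer 906 is found, otherwise display a first candidate list 908 and refine it with the second input speech SP2. Function and field names are hypothetical.

```python
def handle_first_input(first_answers_906):
    if len(first_answers_906) == 1:
        return ("execute", first_answers_906[0])     # operate directly on the single answer
    return ("display", first_answers_906)            # first candidate list 908 shown to the user

def handle_second_input(candidate_list_908, second_keywords):
    # Keep only the answers that also match the second keyword 904'.
    selected = [a for a in candidate_list_908
                if all(k in " ".join(str(v) for v in a.values()) for k in second_keywords)]
    if len(selected) == 1:
        return ("execute", selected[0])
    return ("display", selected)                     # second candidate list 908'

answers = [{"city": "Beijing",  "weather": "sunny"},
           {"city": "Shanghai", "weather": "rain"}]
action, payload = handle_first_input(answers)        # two answers -> display the candidate list
if action == "display":
    print(handle_second_input(payload, ["Beijing"])) # ('execute', {'city': 'Beijing', ...})
```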

Similarly, when the number of second reward answers 906' is 1, the data processing unit 920 performs the corresponding operation according to the type of the second reward answer 906'; when the number of second reward answers 906' is greater than 1, the data processing unit 920 further organizes the second reward answers 906' into a second candidate list 908' and controls the display unit 930 to display it. Then, the corresponding portion is selected according to the user's next input speech, and the corresponding operation is performed according to the number of subsequent reward answers; this can be deduced by analogy from the above description and is not repeated here.

Further, the data processing unit 920 compares the plurality of records 302 of the structured database 220 (e.g., the data of each sub-column 308 in the title column 304) with the first keyword 904 corresponding to the first input speech SP1 (as described above with respect to FIGS. 1, 3A, 3B, and 3C). When the structured database 220 has a record 302 that matches at least a portion of the first keyword 904 of the first input speech SP1, that record 302 is considered a matching result produced by the first input speech SP1 (e.g., the generation of the matching results of FIGS. 3A/3B). If the document data to which the matching result belongs is a music file, the record 302 may include the song name, singer, album name, publication time, play order, and the like; if the document data is a video file, the record 302 may include the movie name, publication time, cast and crew (including performers), and the like; if the document data is a webpage file, the record 302 may include the website name, webpage type, corresponding user account, and the like; if the document data is a picture file, the record 302 may include the picture name, picture information, and the like; if the document data is a business card file, the record 302 may include the contact name, contact phone number, contact address, and the like. The foregoing records 302 are given by way of example, and the record 302 may be determined according to the actual application; the embodiments of the present invention are not limited thereto.

Next, the data processing unit 920 can determine whether the second keyword 904' corresponding to the second input speech SP2 contains a sequential vocabulary indicating an order (e.g., "I want the third option" or "I choose the third one"). When the second keyword 904' corresponding to the second input speech SP2 includes such a sequential vocabulary, the data processing unit 920 selects the data at the corresponding position from the first candidate list 908 according to the sequential vocabulary. When the second keyword 904' corresponding to the second input speech SP2 does not include a sequential vocabulary, indicating that the user may be directly selecting a certain first reward answer 906 in the first candidate list 908, the data processing unit 920 compares the record 302 corresponding to each first reward answer 906 in the first candidate list 908 with the second keyword 904' to determine the degree of correspondence between each first reward answer 906 and the second input speech SP2, and then determines, according to the degree of correspondence, whether a certain first reward answer 906 in the first candidate list 908 corresponds to the second input speech SP2. In an embodiment of the present invention, the data processing unit 920 may determine, according to the degree of correspondence between the first reward answers 906 and the second keyword 904' (e.g., a complete match or a partial match), whether a certain first reward answer 906 in the first candidate list 908 corresponds to the second input speech SP2, thereby simplifying the selection process. The data processing unit 920 can select the data with the highest degree of correspondence as corresponding to the second input speech SP2.
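
The two selection paths can be sketched as follows: if the second keyword 904' contains a sequential vocabulary such as "3rd", the answer at that position is taken from the first candidate list 908; otherwise the keywords are matched against each answer and the one with the highest degree of correspondence is kept. The ordinal table and the scoring rule are simplifying assumptions.

```python
ORDINALS = {"1st": 1, "2nd": 2, "3rd": 3, "first": 1, "second": 2, "third": 3}

def select_from_candidates(candidate_list_908, second_keywords):
    # Path 1: a sequential vocabulary indicating order picks by position.
    for k in second_keywords:
        if k in ORDINALS:
            return candidate_list_908[ORDINALS[k] - 1]
    # Path 2: otherwise keep the answer whose record corresponds best to the keywords.
    def correspondence(answer):
        text = " ".join(str(v) for v in answer.values())
        return sum(1 for k in second_keywords if k in text)
    return max(candidate_list_908, key=correspondence)

candidates = [{"city": "Shanghai"}, {"city": "Tianjin"}, {"city": "Beijing"}]
print(select_from_candidates(candidates, ["3rd"]))       # picked by position
print(select_from_candidates(candidates, ["Beijing"]))   # picked by degree of correspondence
```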

For example, if the first input speech SP1 is "What is the weather today", then after speech recognition and natural language processing the first keyword 904 corresponding to the first input speech SP1 may include "today" and "weather", and the data processing unit 920 reads the data corresponding to today's weather and displays the weather data as the first candidate list 908 through the display unit 930. Then, if the second input speech SP2 is "I want to see the third entry" or "I choose the third one", after speech recognition and natural language processing the second keyword 904' corresponding to the second input speech SP2 will include "third", where "third" is interpreted as a sequential vocabulary indicating an order, so the data processing unit 920 reads the third entry in the first candidate list 908 (i.e., the third first reward answer 906 in the first candidate list 908) and displays the corresponding weather data again through the display unit 930. Alternatively, if the second input speech SP2 is "I want to see the weather in Beijing" or "I choose the weather in Beijing", after speech recognition and natural language processing the second keyword 904' corresponding to the second input speech SP2 will include "Beijing" and "weather", and the data processing unit 920 reads the data corresponding to Beijing in the first candidate list 908. When the number of first reward answers 906 corresponding to the selection is 1, the corresponding weather data may be displayed directly through the display unit 930; when the number of selected first reward answers 906 is greater than 1, a further second candidate list 908' (containing at least one second reward answer 906') is displayed for further selection by the user.

If the first input speech SP1 is "I want to call Lao Zhang", after speech recognition and natural language processing the first keyword 904 corresponding to the first input speech SP1 may include "phone" and "Zhang", and the data processing unit 920 reads the contact data whose surname is "Zhang" (a full-text search can be performed on the structured database 220, and the detailed data corresponding to the records 302 is then obtained), and displays the contact data as the first candidate list 908 (i.e., the first reward answers 906) through the display unit 930. Then, if the second input speech SP2 is "the third Lao Zhang" or "I choose the third one", after speech recognition and natural language processing the second keyword 904' corresponding to the second input speech SP2 may include "third", where "third" is interpreted as a sequential vocabulary indicating an order, so the data processing unit 920 reads the third entry in the first candidate list 908 (i.e., the third first reward answer 906) and dials according to the selected data. Alternatively, if the second input speech SP2 is "I choose the one starting with 139", after speech recognition and natural language processing the second keyword 904' corresponding to the second input speech SP2 may include "139" and "starting with", where "139" is not interpreted as a sequential vocabulary, so the data processing unit 920 reads the contact data in the first candidate list 908 whose telephone number starts with 139. If the second input speech SP2 instead indicates the Zhang in Beijing, after speech recognition and natural language processing the second keyword 904' corresponding to the second input speech SP2 will include "Beijing" and "Zhang", and the data processing unit 920 reads the contact data in the first candidate list 908 whose address is Beijing. When the number of selected first reward answers 906 is 1, dialing is performed according to the selected data; when the number of selected first reward answers 906 is greater than 1, the first reward answers 906 selected at this time are taken as the second reward answers 906' and organized into a second candidate list 908' for display to the user for further selection.

If the first input speech SP1 is "I am looking for a restaurant", after speech recognition and natural language processing the first keyword 904 of the first input speech SP1 will include "restaurant", and the data processing unit 920 reads all the first reward answers 906 corresponding to restaurants. Since such an instruction is not very specific, the first candidate list 908 (containing the first reward answers 906 corresponding to all restaurant data) is displayed to the user through the display unit 930, and further instructions from the user are awaited. Then, if the user inputs "the third restaurant" or "I choose the third one" through the second input speech SP2, after speech recognition and natural language processing the second keyword 904' corresponding to the second input speech SP2 will include "third", where "third" is interpreted as a sequential vocabulary indicating an order, so the data processing unit 920 reads the third entry in the first candidate list 908 and performs display according to the selected data. Alternatively, if the second input speech SP2 is "I choose the nearest one", after speech recognition and natural language processing the second keyword 904' corresponding to the second input speech SP2 may include "nearest", so the data processing unit 920 reads the restaurant data in the first candidate list 908 whose address is closest to the user; if the second input speech SP2 is "I want a restaurant in Beijing", after speech recognition and natural language processing the second keyword 904' corresponding to the second input speech SP2 will include "Beijing" and "restaurant", so the data processing unit 920 reads the restaurant data in the first candidate list 908 whose address is in Beijing. When the number of selected first reward answers 906 is 1, display is performed according to the selected data; when the number of selected first reward answers 906 is greater than 1, the first reward answers 906 selected at this time are taken as the second reward answers 906' and organized into a second candidate list 908' for display to the user for selection.

According to the above, the data processing unit 920 can perform a corresponding operation according to the document data of the selected first reward answer 906 (or second reward answer 906'). For example, when the document data corresponding to the selected first reward answer 906 is a music file, the data processing unit 920 performs music playback according to the selected data; when the selected document data is a video file, the data processing unit 920 performs video playback according to the selected data; when the selected document data is a webpage file, the data processing unit 920 performs display according to the selected data; when the selected document data is a picture file, the data processing unit 920 performs image display according to the selected data; and when the type of the selected data is a business card file, the data processing unit 920 performs dialing according to the selected data.
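
A minimal sketch of this dispatch on the document type of the selected reward answer is shown below, mirroring the mapping listed above (music and video are played, webpages and pictures are displayed, business cards are dialed); the handler names are hypothetical placeholders.

```python
def play_music(d):   print("playing music:", d["name"])
def play_video(d):   print("playing video:", d["name"])
def show_page(d):    print("displaying webpage:", d["name"])
def show_photo(d):   print("showing picture:", d["name"])
def dial_number(d):  print("dialing:", d["phone"])

OPERATIONS = {
    "music":         play_music,
    "video":         play_video,
    "webpage":       show_page,
    "picture":       show_photo,
    "business card": dial_number,
}

def perform_operation(selected_answer):
    OPERATIONS[selected_answer["type"]](selected_answer)

perform_operation({"type": "business card", "name": "Lao Zhang", "phone": "139xxxxxxxx"})
```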

FIG. 10 is a schematic diagram of an information system according to an embodiment of the invention. Referring to FIG. 9 and FIG. 10, in this embodiment the information system 1000 includes a mobile terminal device 1010 and a server 1020. The server 1020 may be a cloud server, a local area network server, or the like, but the embodiments of the invention are not limited thereto. The mobile terminal device 1010 includes a voice receiving unit 1011, a data processing unit 1013, and a display unit 1015. The data processing unit 1013 is coupled to the voice receiving unit 1011, the display unit 1015, and the server 1020. The mobile terminal device 1010 may be a mobile communication device such as a cell phone, a personal digital assistant (PDA) phone, or a smart phone, and the invention is not limited thereto. The voice receiving unit 1011 functions similarly to the voice receiving unit 910, and the display unit 1015 functions similarly to the display unit 930. The server 1020 is configured to store a plurality of data and has a voice recognition function.

In this embodiment, the data processing unit 1013 performs voice recognition on the first input voice SP1 through the server 1020 to generate the first request information 902, and then performs natural language processing on the first request information 902 to generate the first keyword 904 corresponding to the first input voice SP1. The server 1020 performs a full-text search on the structured database 220 according to the first keyword 904 to find the first return answers 906 and transmits them to the data processing unit 1013. When the number of first return answers 906 is 1, the data processing unit 1013 performs a corresponding operation according to the document data corresponding to that first return answer 906; when the number of first return answers 906 is greater than 1, the data processing unit 1013 organizes the selected first return answers 906 into the first candidate list 908, controls the display unit 1015 to display it to the user, and waits for a further instruction from the user. After the user inputs an instruction, the data processing unit 1013 performs voice recognition on the second input voice SP2 through the server 1020 to generate the second request information 902', and then analyzes the second request information 902' with natural language processing to generate the second keyword 904' corresponding to the second input voice SP2. The server 1020 then selects, from the first candidate list 908 according to the second keyword 904' corresponding to the second input voice SP2, the corresponding first return answers 906 as the second return answers 906' and transmits them to the data processing unit 1013. Similarly, when the number of corresponding second return answers 906' is 1, the data processing unit 1013 performs the corresponding operation according to the type of the data corresponding to that second return answer 906'; when the number of second return answers 906' is greater than 1, the data processing unit 1013 organizes the second return answers 906' selected at this time into a second candidate list 908' and controls the display unit 1015 to display it to the user again for further selection. Thereafter, the server 1020 selects the corresponding data according to subsequent input voices, and the data processing unit 1013 performs the corresponding operation according to the number of selected data. This can be deduced by analogy with the above description and will not be repeated here.

It should be noted that, in an embodiment, if the number of first return answers 906 selected according to the first keyword 904 corresponding to the first input voice SP1 is 1, the operation corresponding to the data may be performed directly. Moreover, in another embodiment, a prompt may first be output to inform the user that the operation corresponding to the selected first return answer 906 is about to be performed. Likewise, in another embodiment, when the number of second return answers 906' selected according to the second keyword 904' corresponding to the second input voice SP2 is 1, the operation corresponding to the data may be performed directly. Of course, in yet another embodiment, a prompt may be output to notify the user that the operation corresponding to the selected data is about to be performed; the invention does not limit this.

Further, the server 1020 compares each record 302 of the structured database 220 with the first keyword 904 corresponding to the first input voice SP1. When a record 302 at least partially matches the first keyword 904, this record 302 is regarded as data matched by the first input voice SP1 and is taken as one of the first return answers 906. If the number of first return answers 906 selected according to the first keyword 904 corresponding to the first input voice SP1 is greater than 1, the user may input an instruction through the second input voice SP2. The instruction input by the user through the second input voice SP2 may include an order (for example, indicating that the first item of the displayed data is to be selected), may directly select one of the displayed items (for example, by directly stating the content of an item), or may require determining the user's intention from the instruction (for example, when the user asks for the nearest restaurant, the "nearest" restaurant among the displayed data is matched). The server 1020 therefore first determines whether the second keyword 904' corresponding to the second input voice SP2 contains a sequential vocabulary indicating an order. When the second keyword 904' corresponding to the second input voice SP2 includes a sequential vocabulary indicating an order, the server 1020 selects the first return answer 906 located at the corresponding position from the first candidate list 908 according to that sequential vocabulary. When the second keyword 904' corresponding to the second input voice SP2 does not include a sequential vocabulary indicating an order, the server 1020 compares each first return answer 906 of the first candidate list 908 with the second keyword 904' corresponding to the second input voice SP2 to determine the degree of correspondence between each first return answer 906 and the second input voice SP2, and may determine, according to the degree of correspondence, whether a first return answer 906 in the first candidate list 908 corresponds to the second input voice SP2. In an embodiment of the invention, the server 1020 may determine, according to the degree of correspondence between the first return answers 906 and the second keyword 904', those first return answers 906 in the first candidate list 908 that correspond to the second input voice SP2, so as to simplify the selection process. The server 1020 may also select the first return answer 906 with the highest degree of correspondence as the one corresponding to the second input voice SP2.
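
The decision described above, picking by position when the second keywords contain a sequential vocabulary and otherwise ranking by degree of correspondence, can be sketched as follows. The regular expression, the scoring rule, and the tie handling are assumptions for this illustration only.

import re

ORDINAL = re.compile(r"^(\d+)(st|nd|rd|th)$")   # "3rd" is an ordinal, "139" is not

def ordinal_index(keywords):
    """Return a zero-based position if some keyword is a sequential vocabulary, else None."""
    for keyword in keywords:
        match = ORDINAL.match(keyword)
        if match:
            return int(match.group(1)) - 1      # "3rd" -> index 2
    return None

def correspondence(answer, keywords):
    """Degree of correspondence: how many second keywords appear in the answer's fields."""
    text = " ".join(str(value) for value in answer.values())
    return sum(1 for keyword in keywords if keyword in text)

def select(first_candidates, second_keywords):
    index = ordinal_index(second_keywords)
    if index is not None:                        # sequential vocabulary: pick by position
        return [first_candidates[index]] if index < len(first_candidates) else []
    scored = [(correspondence(answer, second_keywords), answer)
              for answer in first_candidates]
    best = max((score for score, _ in scored), default=0)
    return [answer for score, answer in scored if score == best and score > 0]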

FIG. 11 is a flow chart of a voice recognition based selection method in accordance with an embodiment of the present invention. Referring to FIG. 11, in this embodiment, the first input voice SP1 is received (step S1100), voice recognition is performed on the first input voice SP1 to generate the first request information 902 (step S1110), and natural language processing is performed on the first request information 902 to generate the first keyword 904 corresponding to the first input voice (step S1120). Then, the corresponding first return answers 906 are selected from the plurality of data according to the first keyword 904 (step S1130), and it is determined whether the number of selected first return answers 906 is 1 (step S1140). When the number of selected first return answers 906 is 1, that is, the determination result of step S1140 is YES, the corresponding operation is performed according to the document data corresponding to the first return answer 906 (step S1150). When the number of selected first return answers 906 is greater than 1, that is, the determination result of step S1140 is NO, the first candidate list 908 is displayed according to the selected first return answers 906 and the second input voice SP2 is received (step S1160), voice recognition is performed on the second input voice to generate the second request information 902' (step S1170), and natural language processing is performed on the second request information 902' to generate the second keyword 904' corresponding to the second input voice (step S1180). Next, the corresponding first return answers 906 are selected from the first candidate list 908 according to the second keyword 904' (step S1190), and the process returns to step S1140 to determine whether the number of selected first return answers 906 is 1. The order of the above steps is for illustration only, and the embodiments of the present invention are not limited thereto. For details of the above steps, reference may be made to the embodiments of FIG. 9 and FIG. 10, and they are not repeated here.
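
A control-loop sketch of steps S1100 to S1190 is given below; the recognize, extract_keywords, search, select, perform and display callables are placeholders standing in for the modules described above, not an implementation of them.

def selection_loop(first_voice, get_next_voice,
                   recognize, extract_keywords, search, select, perform, display):
    request = recognize(first_voice)          # S1110: speech recognition
    keywords = extract_keywords(request)      # S1120: natural language processing
    answers = search(keywords)                # S1130: pick the first return answers
    while len(answers) != 1:                  # S1140: exactly one answer ends the loop
        if not answers:
            display("no matching record")     # not part of FIG. 11; added so the loop ends
            return None
        display(answers)                      # S1160: show the candidate list and
        voice = get_next_voice()              #        receive the next input voice
        request = recognize(voice)            # S1170
        keywords = extract_keywords(request)  # S1180
        answers = select(answers, keywords)   # S1190: narrow the candidate list
    perform(answers[0])                       # S1150: perform the corresponding operation
    return answers[0]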

In summary, the voice recognition based selection method, the mobile terminal device and the information system described above perform voice recognition and natural language processing on the first input voice and the second input voice to obtain the keywords corresponding to the first input voice and the second input voice, and then select data according to those keywords. Thereby, the convenience of the user's operation can be improved.

Next, an example is given of how the architecture and components of the natural language understanding system 100 and the structured database 220 disclosed in the present invention can be combined with an auxiliary activation device.

FIG. 12 is a block diagram of a voice control system according to an embodiment of the invention. Referring to FIG. 12, the voice control system 1200 includes an auxiliary activation device 1210, a mobile terminal device 1220, and a server 1230. In the present embodiment, the auxiliary activation device 1210 activates the voice system of the mobile terminal device 1220 by wirelessly transmitting signals, so that the mobile terminal device 1220 communicates with the server 1230 according to the voice signal.

In detail, the auxiliary activation device 1210 includes a first wireless transmission module 1212 and a trigger module 1214, wherein the trigger module 1214 is coupled to the first wireless transmission module 1212. The first wireless transmission module 1212 is, for example, a device supporting a communication protocol such as Wireless Fidelity (Wi-Fi), Worldwide Interoperability for Microwave Access (WiMAX), Bluetooth, Ultra-wideband (UWB), or Radio-frequency identification (RFID), and can transmit a wireless transmission signal to establish a wireless connection with another wireless transmission module. The trigger module 1214 is, for example, a button or a key. In this embodiment, after the user presses the trigger module 1214 to generate a trigger signal, the first wireless transmission module 1212 receives the trigger signal and is activated, and the auxiliary activation device 1210 transmits a wireless transmission signal to the mobile terminal device 1220 through the first wireless transmission module 1212. In an embodiment, the auxiliary activation device 1210 may be a Bluetooth headset.

It is worth noting that although some hands-free headsets/microphones currently on the market are designed to activate certain functions of the mobile terminal device 1220, in another embodiment of the invention the auxiliary activation device 1210 can be different from such headsets/microphones. Such a headset/microphone connects with the mobile terminal device and replaces the headset/microphone on the mobile terminal device 1220 for listening/talking, the activation function being an additional design; the auxiliary activation device 1210 of the present invention, by contrast, is used "only" to activate the voice system in the mobile terminal device 1220 and does not itself provide listening/talking, so its internal circuit design can be simplified and its cost is low. In other words, relative to the hands-free headset/microphone described above, the auxiliary activation device 1210 is a separate device, that is, the user may own both a hands-free headset/microphone and the auxiliary activation device 1210 of the present invention.

In addition, the auxiliary activation device 1210 may take the shape of an item that is readily accessible to the user, such as a ring, a watch, an earring, a necklace, or a pair of glasses, that is, various portable items, or of a mounted accessory, for example a driving accessory disposed on a steering wheel, and is not limited to the above. That is to say, the auxiliary activation device 1210 is a "living" device whose design allows the user to easily touch the trigger module 1214 to turn on the voice system. For example, when the auxiliary activation device 1210 has the shape of a ring, the user can easily move a finger to press the trigger module 1214 on the ring to trigger it. Likewise, when the auxiliary activation device 1210 is a device disposed on a driving accessory, the user can easily trigger the trigger module 1214 of that driving accessory while driving. In addition, compared with the discomfort of listening/talking through an earphone/microphone, the auxiliary activation device 1210 of the present invention can be used to turn on the voice system in the mobile terminal device 1220 and even to turn on the amplification function (described in detail later), allowing the user to listen/talk directly through the mobile terminal device 1220 without wearing an earphone/microphone. Moreover, for the user these "living" auxiliary activation devices 1210 are items that would be worn or used anyway, so there is no problem of discomfort or awkwardness in use, that is, no time is needed to adapt. For example, when a user is cooking in the kitchen and needs to dial a mobile phone placed in the living room, assuming the user is wearing the auxiliary activation device 1210 of the present invention in the shape of a ring, a necklace, or a watch, the user can lightly touch the ring, necklace, or watch to turn on the voice system and ask a friend for the details of a recipe. Although some earphones/microphones with an activation function can achieve the above purposes, not every moment of cooking requires calling a friend, so for the user it is quite inconvenient to wear an earphone/microphone at all times while cooking just in order to control the mobile terminal device at any moment.

In other embodiments, the auxiliary activation device 1210 can also be configured with a wireless rechargeable battery 1216 for driving the first wireless transmission module 1212. Further, the wireless charging battery 1216 includes a battery unit 12162 and a wireless charging module 12164. The wireless charging module 12164 is coupled to the battery unit 12162. Here, the wireless charging module 12164 can receive energy supplied from a wireless power supply device (not shown) and convert the energy into power to charge the battery unit 12162. As such, the first wireless transmission module 1212 of the auxiliary activation device 1210 can be conveniently charged by the wireless rechargeable battery 1216.

On the other hand, the mobile terminal device 1220 is, for example, a cell phone, a personal digital assistant (PDA) phone, a smart phone, a palmtop computer (Pocket PC) with communication software installed, a tablet computer, or a notebook computer. The mobile terminal device 1220 can be any portable mobile device with a communication function, and its scope is not limited here. In addition, the mobile terminal device 1220 may use an Android operating system, a Microsoft operating system, a Linux operating system, or the like, and is not limited to the above.

The mobile terminal device 1220 includes a second wireless transmission module 1222 that matches the first wireless transmission module 1212 of the auxiliary activation device 1210 and adopts a corresponding wireless communication protocol (for example, Wi-Fi, WiMAX, Bluetooth, ultra-wideband, or radio-frequency identification), thereby establishing a wireless connection with the first wireless transmission module 1212. It should be noted that "first" and "second" in the first wireless transmission module 1212 and the second wireless transmission module 1222 are used only to indicate that the wireless transmission modules are configured in different devices, and are not intended to limit the present invention.

In other embodiments, the mobile terminal device 1220 further includes a voice system 1221 coupled to the second wireless transmission module 1222. Therefore, after the user triggers the trigger module 1214 of the auxiliary activation device 1210, the voice system 1221 can be activated wirelessly through the first wireless transmission module 1212 and the second wireless transmission module 1222. In an embodiment, the voice system 1221 can include a voice sampling module 1224, a speech synthesis module 1226, and a voice output interface 1227. The voice sampling module 1224 is configured to receive a voice signal from the user and is, for example, a microphone or another device for receiving audio. The speech synthesis module 1226 can query a speech synthesis database that records, for example, text and its corresponding speech, so that the speech synthesis module 1226 can find the speech corresponding to a specific text message and thereby synthesize that text message into speech. Thereafter, the speech synthesis module 1226 can output the synthesized speech through the voice output interface 1227 for playback to the user. The voice output interface 1227 is, for example, a speaker or an earphone.
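
The table lookup performed by the speech synthesis module, as just described, can be pictured with the small sketch below; the table contents and file paths are invented for this illustration.

SYNTHESIS_DB = {                  # stands in for the speech synthesis database
    "30": "audio/30.wav",
    "degrees": "audio/degrees.wav",
    "Celsius": "audio/celsius.wav",
}

def synthesize(text):
    """Collect the stored speech clips that correspond to each word of the text."""
    clips = []
    for word in text.split():
        clip = SYNTHESIS_DB.get(word)
        if clip is not None:
            clips.append(clip)
    return clips                  # handed to the voice output interface for playback

print(synthesize("30 degrees Celsius"))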

In addition, the mobile terminal device 1220 may also be configured with a communication module 1228. The communication module 1228 is, for example, an element capable of transmitting and receiving wireless signals, such as a radio frequency transceiver. Further, the communication module 1228 enables the user to answer or make a call through the mobile terminal device 1220 or to use other services provided by the telecommunications carrier. In this embodiment, the communication module 1228 can receive response information from the server 1230 through the Internet and establish a call connection between the mobile terminal device 1220 and at least one electronic device according to the response information, wherein the electronic device is, for example, another mobile terminal device (not shown).

The server 1230 is, for example, a network server or a cloud server, and has a speech understanding module 1232. In this embodiment, the speech understanding module 1232 includes a speech recognition module 12322 and a speech processing module 12324, and the speech processing module 12324 is coupled to the speech recognition module 12322. Here, the speech recognition module 12322 receives the voice signal transmitted from the voice sampling module 1224 and converts the voice signal into a plurality of segmentation semantics (for example, keywords or words). The speech processing module 12324 can parse the meanings represented by these segmentation semantics (such as intent, time, location, and so on) and thereby determine the meaning represented by the voice signal. In addition, the speech processing module 12324 also generates corresponding response information according to the parsed result. In this embodiment, the speech understanding module 1232 can be implemented by a hardware circuit composed of one or several logic gates, and can also be implemented by computer program code. It is worth mentioning that, in another embodiment, the speech understanding module 1232 can be configured in the mobile terminal device 1320, as in the voice control system 1300 shown in FIG. 13. The operation of the speech understanding module 1232 of the server 1230 can be as described for the natural language understanding system 100 of FIG. 1A and the natural language dialogue systems 500/700/700' of FIGS. 5A/7A/7B.
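
A rough sketch of the two-stage speech understanding described above follows; the slot vocabularies, intent labels, and response text are invented for this illustration and do not reproduce the actual modules.

def speech_recognition(utterance):
    """Stand-in for module 12322: split recognized text into segmentation semantics."""
    return utterance.lower().replace("?", "").split()

def speech_processing(segments):
    """Stand-in for module 12324: derive intent/time/location and build response information."""
    meaning = {"intent": None, "time": None, "location": None}
    if "temperature" in segments or "degrees" in segments:
        meaning["intent"] = "query_weather"
    if "call" in segments:
        meaning["intent"] = "dial"
    for segment in segments:
        if segment in ("today", "tomorrow"):
            meaning["time"] = segment
        if segment in ("beijing", "shanghai"):
            meaning["location"] = segment
    return {"meaning": meaning, "text": "ok"}   # response information for the device

print(speech_processing(speech_recognition("How many degrees is it today?")))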

The method of voice control will be described below in conjunction with the voice control system 1200 described above. FIG. 14 is a flowchart of a voice control method according to an embodiment of the invention. Referring to FIG. 12 and FIG. 14 together, in step S1402 the auxiliary activation device 1210 transmits a wireless transmission signal to the mobile terminal device 1220. In detail, when the first wireless transmission module 1212 of the auxiliary activation device 1210 is activated by receiving a trigger signal, the auxiliary activation device 1210 transmits the wireless transmission signal to the mobile terminal device 1220. Specifically, when the trigger module 1214 of the auxiliary activation device 1210 is pressed by the user, the trigger module 1214 generates the trigger signal, and the first wireless transmission module 1212 sends the wireless transmission signal to the second wireless transmission module 1222 of the mobile terminal device 1220, so that the first wireless transmission module 1212 can be coupled to the second wireless transmission module 1222 via a wireless communication protocol. The auxiliary activation device 1210 described above is used only to activate the voice system in the mobile terminal device 1220 and does not provide listening/talking, so its internal circuit design can be simplified and its cost is low. In other words, relative to the hands-free headset/microphone used with a general mobile terminal device 1220, the auxiliary activation device 1210 is a separate device, that is, the user may own both a hands-free headset/microphone and the auxiliary activation device 1210 of the present invention.

It is worth mentioning that the auxiliary activation device 1210 described above may take the shape of a portable item that is readily accessible to the user, such as a ring, a watch, an earring, a necklace, or a pair of glasses, or of a mounted accessory, for example a driving accessory disposed on a steering wheel, and is not limited to the above. That is to say, the auxiliary activation device 1210 is a "living" device whose design allows the user to easily touch the trigger module 1214 to turn on the voice system 1221. Therefore, the auxiliary activation device 1210 of the present invention can be used to turn on the voice system 1221 in the mobile terminal device 1220 and even to turn on the amplification function (described in detail later), so that the user can listen/talk directly through the mobile terminal device 1220 without wearing an earphone/microphone. In addition, for the user these "living" auxiliary activation devices 1210 are items that would be worn or used anyway, so there is no problem of discomfort or awkwardness in use.

In addition, the first wireless transmission module 1212 and the second wireless transmission module 1222 can each be in a sleep mode or a working mode. The sleep mode means that the wireless transmission module is in a closed state, that is, it neither receives/detects wireless transmission signals nor connects with other wireless transmission modules. The working mode means that the wireless transmission module is turned on, that is, it can continuously detect wireless transmission signals, can transmit wireless transmission signals at any time, and can connect with other wireless transmission modules. Here, when the trigger module 1214 is triggered, if the first wireless transmission module 1212 is in the sleep mode, the trigger module 1214 wakes up the first wireless transmission module 1212 so that it enters the working mode, sends the wireless transmission signal to the second wireless transmission module 1222, and connects with the second wireless transmission module 1222 of the mobile terminal device 1220 via the wireless communication protocol.

On the other hand, in order to prevent the first wireless transmission module 1212 from remaining in the working mode and consuming excessive power, if the trigger module 1214 is not triggered again within a preset time (for example, 5 minutes) after the first wireless transmission module 1212 enters the working mode, the first wireless transmission module 1212 returns from the working mode to the sleep mode and stops the connection with the second wireless transmission module 1222 of the mobile terminal device 1220.
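
The sleep/working behaviour just described can be summarised by the following sketch; the five-minute timeout value is taken from the example above, while the class and method names are assumptions for this illustration.

import time

class WirelessModule:
    IDLE_TIMEOUT = 5 * 60              # seconds without a new trigger before sleeping again

    def __init__(self):
        self.mode = "sleep"
        self.last_trigger = None

    def on_trigger(self):
        """Trigger module pressed: wake the module if needed and send the wireless signal."""
        if self.mode == "sleep":
            self.mode = "working"
        self.last_trigger = time.monotonic()
        self.send_wireless_signal()

    def send_wireless_signal(self):
        print("wireless transmission signal sent to the mobile terminal device")

    def tick(self):
        """Called periodically: return to sleep when no trigger arrives within the timeout."""
        if self.mode == "working" and time.monotonic() - self.last_trigger > self.IDLE_TIMEOUT:
            self.mode = "sleep"
            print("connection stopped, module back in sleep mode")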

Thereafter, in step S1404, the second wireless transmission module 1222 of the mobile terminal device 1220 receives the wireless transmission signal to activate the voice system 1221; that is, when the second wireless transmission module 1222 detects the wireless transmission signal, the mobile terminal device 1220 activates the voice system 1221. Next, in step S1406, the voice sampling module 1224 of the voice system 1221 starts to receive a voice signal, for example, "How many degrees is it today?", "Call Old Wang.", or "Please look up a phone number.".

In step S1408, the voice sampling module 1224 transmits the voice signal to the speech understanding module 1232 in the server 1230, so that the speech understanding module 1232 parses the voice signal and generates response information. Specifically, the speech recognition module 12322 in the speech understanding module 1232 receives the voice signal from the voice sampling module 1224 and divides the voice signal into a plurality of segmentation semantics, and the speech processing module 12324 then performs speech understanding on these segmentation semantics to generate the response information for responding to the voice signal.

In another embodiment of the present invention, the mobile terminal device 1220 can further receive the response information generated by the speech processing module 12324 and output the content of the response information, or perform the operation indicated by the response information, through the voice output interface 1227. In step S1410, the speech synthesis module 1226 of the mobile terminal device 1220 receives the response information generated by the speech understanding module 1232 and performs speech synthesis according to the content of the response information (such as a vocabulary or a sentence) to generate a voice response. Then, in step S1412, the voice output interface 1227 receives and outputs the voice response.

For example, when the user presses the trigger module 1214 of the auxiliary activation device 1210, the first wireless transmission module 1212 sends the wireless transmission signal to the second wireless transmission module 1222, so that the mobile terminal device 1220 activates the voice sampling module 1224 of the voice system 1221. Here, assuming that the voice signal from the user is a query sentence such as "How many degrees is it today?", the voice sampling module 1224 receives the voice signal and transmits it to the speech understanding module 1232 in the server 1230, and the speech understanding module 1232 transmits the response information generated by its parsing back to the mobile terminal device 1220. Assuming that the content of the response information generated by the speech understanding module 1232 is "30°C", the speech synthesis module 1226 synthesizes the "30°C" message into a voice response, and the voice output interface 1227 broadcasts the voice response to the user.

In another embodiment, assuming that the voice signal from the user is a command sentence such as "Call Old Wang.", the speech understanding module 1232 can recognize this command sentence as a request to "dial the phone to Old Wang". In addition, the speech understanding module 1232 generates new response information, such as "Please confirm whether to dial to Old Wang", and transmits the new response information to the mobile terminal device 1220. Here, the speech synthesis module 1226 synthesizes the new response information into a voice response and broadcasts it to the user through the voice output interface 1227. Further, when the user's reply is a positive answer such as "Yes", the voice sampling module 1224 similarly receives the voice signal and transmits it to the server 1230 for the speech understanding module 1232 to parse. After the speech understanding module 1232 finishes parsing, a dialing instruction is recorded in the response information and transmitted to the mobile terminal device 1220. At this time, the communication module 1228 looks up the phone number of "Old Wang" according to the contact information recorded in the phone database, so as to establish a call connection between the mobile terminal device 1220 and another electronic device, that is, to dial to "Old Wang".
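
The confirm-then-dial exchange described in this example can be sketched as below; the contact table, the phrasing checks, and the helper names are assumptions for this illustration only.

PHONE_BOOK = {"Old Wang": "13700001111"}    # stands in for the phone database

def understand(utterance):
    """Very small stand-in for the speech understanding module."""
    text = utterance.strip(" .")
    if text.lower().startswith("call "):
        name = text[5:]
        return {"type": "confirm",
                "text": "Please confirm whether to dial to " + name,
                "name": name}
    if text.lower() in ("yes", "ok"):
        return {"type": "dial"}
    return {"type": "answer", "text": text}

def dialogue():
    pending = understand("Call Old Wang.")      # command sentence from the user
    print(pending["text"])                      # synthesized and played back to the user
    reply = understand("Yes")                   # the user's positive answer
    if reply["type"] == "dial":
        number = PHONE_BOOK.get(pending["name"])
        print("dialing", pending["name"], "at", number)

dialogue()

Running dialogue() prints the confirmation prompt first and the dialing line second, mirroring the order of the exchange described above.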

In other embodiments, the above-described operation method may be performed by using the voice control system 1300 or other similar system in addition to the voice control system 1200 described above, and is not limited to the above embodiments.

In summary, in the voice control system and method of this embodiment, the auxiliary activation device can wirelessly activate the voice function of the mobile terminal device. Moreover, the auxiliary activation device may take the shape of a "living" item accessible to the user, such as a ring, a watch, an earring, a necklace, or a pair of glasses, that is, various portable items, or of a mounted accessory, for example a driving accessory disposed on a steering wheel, and is not limited to the above. As a result, compared with the discomfort of wearing a hands-free headset/microphone, it is more convenient to use the auxiliary activation device 1210 of the present invention to activate the voice system in the mobile terminal device 1220.

It should be noted that the server 1230 with the speech understanding module may be a network server or a cloud server, and a cloud server may involve the user's privacy. For example, the user would need to upload the complete address book to the cloud server in order to complete address-book-related operations such as making a call or sending a text message. Even if the cloud server uses an encrypted connection and does not retain the data, it is difficult to eliminate the user's concerns. Accordingly, the following provides another voice control method and a corresponding voice interaction system with which the mobile terminal device can carry out a voice interaction service with the cloud server without uploading the complete address book. In order to clarify the content of the present invention, specific embodiments are given below as examples by which the present invention can be implemented.

Although the present invention has been disclosed in the above embodiments, it is not intended to limit the present invention, and any one of ordinary skill in the art can make some changes and refinements without departing from the spirit and scope of the present invention. The scope of the invention is defined by the scope of the appended claims.

100‧‧‧Natural Language Understanding System

102‧‧‧Request information

104‧‧‧Analysis results

106‧‧‧ possible intent grammar information

108‧‧‧Keyword

110‧‧‧Responding results

112‧‧‧Intentional information

114‧‧‧Intentional grammar information

116‧‧‧Analysis Results Output Module

200‧‧‧Search System

220‧‧‧ Structured Database

240‧‧‧Search Engine

260‧‧‧Search interface unit

280‧‧‧Guide data storage device

300‧‧‧Natural Language Processor

400‧‧‧Knowledge-assisted understanding module

Claims (46)

  1. A retrieval system includes: a structured database for storing a plurality of records, wherein each of the records has a data structure; and a search engine for performing a full-text search on the structured database, wherein the data structure includes a title bar, the title bar includes a plurality of columns, and each of the columns includes a guide bar and a value column, wherein the guide bar stores a guidance data and the value column stores a numerical data, the numerical data corresponding to the guidance data belonging to the same column, wherein the guidance data is used to indicate the name of one of a plurality of attribute categories, and the numerical data records the content of the attribute category indicated by the corresponding guidance data, wherein the plurality of numerical data belonging to a first record are associated with each other according to the corresponding plurality of guidance data, and the plurality of numerical data belonging to the first record are used together to express a plurality of attributes of the first record.
  2. The retrieval system of claim 1, wherein the data structure further comprises a content field, and the content column of the records stores content details of each of the records.
  3. The search system of claim 1, wherein a first special character is stored between each of the columns to separate each of the columns, and a second special character is stored between the data of the guide bar and the value column to separate the data of the guide bar from the data of the value column.
  4. The retrieval system of claim 1, wherein the columns in the title bar have a fixed number of digits.
  5. The search system of claim 1, further comprising a search interface unit coupled to the search engine for receiving at least one keyword and transmitting it to the search engine, so that the search engine performs the full-text search on the title bars of the records and, in response to a matching result of the search engine, outputs at least one search matching record among the records.
  6. The retrieval system of claim 5, wherein the search matching record is a full matching record that completely matches the at least one keyword or a partial matching record that partially matches the at least one keyword.
  7. The retrieval system of claim 6, wherein when the search interface unit outputs a plurality of search matching records, the full matching records and the partial matching records are output in order, wherein the priority of the full matching records is greater than the priority of the partial matching records.
  8. The retrieval system of claim 1, wherein the size of each of the records is equal to a specific value, and after searching the specific value, the search engine performs the full-text search on the record next to the record currently being searched.
  9. The search system of claim 1, wherein a third special character is stored after the last column of the title bar of each record, and when the search engine finds the third special character during the full-text search, the search engine performs the full-text search on the next record of the record.
  10. A natural language understanding system includes: a natural language processor for analyzing a request information of a user to generate at least one possible intent grammar data, wherein each of the possible intent grammar data includes at least one keyword and an intent data; a knowledge assisted understanding module coupled to the natural language processor for obtaining the at least one possible intent grammar data and determining the intent grammar data that expresses the intention of the user's request information; and a retrieval system comprising: a structured database for storing a plurality of records, wherein each of the records has a data structure; and a search engine for performing a full-text search on the structured database, the data structure including a title bar, the title bar including a plurality of columns, each of the columns including a guide bar and a value column, wherein the guide bar stores a guidance data and the value column stores a numerical data, the numerical data corresponding to the guidance data belonging to the same column, wherein the guidance data is used to indicate the name of one of a plurality of attribute categories, and the numerical data records the content of the attribute category indicated by the corresponding guidance data, wherein the plurality of numerical data belonging to a first record are associated with each other according to the corresponding plurality of guidance data, and the plurality of numerical data belonging to the first record are used together to express a plurality of attributes of the first record; wherein the knowledge assisted understanding module transmits the keyword to the retrieval system and, by means of a response from the retrieval system, assists in determining the determined intent grammar data, wherein the data structure further includes a content field, the content fields of the records storing the content details of each of the records.
  11. The natural language understanding system of claim 10, wherein a first special character is stored between each of the columns for separating each of the columns, and a second special character is stored between the data of the guide bar and the value column for separating the data of the guide bar from the data of the value column.
  12. The natural language understanding system of claim 10, wherein the column in the title bar has a fixed number of digits.
  13. The natural language understanding system of claim 10, wherein the retrieval system further comprises a search interface unit coupled to the search engine and the knowledge assisted understanding module for receiving the keyword and transmitting it to the search engine, wherein the search engine performs the full-text search on the title bars of the records and, in response to a matching result of the search engine, outputs at least one search matching record of the records, and wherein the knowledge assisted understanding module compares the guidance data stored in the title bar of the at least one search matching record with the intent data included in the at least one possible intent grammar data, thereby determining the intention of the user's request information.
  14. The natural language understanding system of claim 13, wherein the search matching record is a full matching record that completely matches the keyword or a partial matching record that partially matches the keyword.
  15. The natural language understanding system of claim 14, wherein when the search interface unit outputs a plurality of search matching records, the full matching records and the partial matching records are output in order, wherein the priority order of the full matching records is greater than the priority order of the partial matching records.
  16. The natural language understanding system of claim 10, wherein the size of each record is equal to a specific value, and after searching the specific value, the search engine performs the full-text search on the record next to the record currently being searched.
  17. The natural language understanding system of claim 10, wherein a third special character is stored after the last column of the title bar of the record, and when the search engine finds the third special character during the full-text search, the search engine performs the full-text search on the next record of the record.
  18. A retrieval method includes: providing a structured database, the structured database storing a plurality of records, wherein each of the records has a data structure; and performing a full-text search on the structured database, wherein the data structure includes a title bar, the title bar includes a plurality of columns, and each of the columns includes a guide bar and a value column, wherein the guide bar stores a guidance data and the value column stores a numerical data, the numerical data corresponding to the guidance data belonging to the same column, wherein the guidance data is used to indicate the name of one of a plurality of attribute categories, and the numerical data records the content of the attribute category indicated by the corresponding guidance data, wherein the plurality of numerical data belonging to a first record are associated with each other according to the corresponding plurality of guidance data, and the plurality of numerical data belonging to the first record are used together to express a plurality of attributes of the first record.
  19. The search method of claim 18, wherein the data structure further comprises a content field, the content fields of the records storing the content details of each of the records.
  20. The search method of claim 18, wherein a first special character is stored between each of the columns to separate each of the columns, and a second special character is stored between the data of the guide bar and the value column to separate the data of the guide bar from the data of the value column.
  21. The search method of claim 18, wherein the column in the title column has a fixed number of bits.
  22. The method of claim 18, wherein the step of performing a full-text search on the structured database further comprises: receiving at least one keyword; performing the full-text search on the title bars of the records according to the keyword; and, if the full-text search has a matching result, outputting at least one search matching record among the records.
  23. The retrieval method of claim 22, wherein the search matching record is a full matching record that completely matches the keyword or a partial matching record that partially matches the keyword.
  24. The search method of claim 23, wherein the step of outputting the search matching records among the records further comprises: outputting the full matching records and the partial matching records in order, wherein the priority of the full matching records is greater than the priority of the partial matching records.
  25. The search method of claim 18, wherein the size of each record is equal to a specific value, and after searching the specific value, the search engine performs the full-text search on the record next to the record currently being searched.
  26. The search method of claim 18, wherein a third special character is stored after the last column of the title bar of each record, and when the search engine finds the third special character during the full-text search, the search engine performs the full-text search on the next record of the record.
  27. A retrieval system comprising: a structured database for storing a plurality of records, wherein each of the records comprises at least one column, wherein the column comprises a plurality of columns, each of the columns comprising a guide bar and a value column, wherein the guide bar stores a guidance data and the value column stores a numerical data, the numerical data corresponding to the guidance data belonging to the same column, wherein the guidance data is used to indicate the name of one attribute category of a plurality of attribute categories, and the numerical data records the content of the attribute category indicated by the corresponding guidance data, wherein the plurality of numerical data belonging to a first record are related to each other according to the corresponding plurality of guidance data, and the plurality of numerical data belonging to the first record are used together to describe a plurality of attributes of the first record; and a search engine for performing a full-text search on the structured database according to a keyword of a request information, wherein when a first numerical data of a first column of a first record of the structured database matches the keyword, a first guidance data corresponding to the first numerical data in the first column is output to confirm the intention of the request information, wherein when the intent data of the request information matches the guidance data, it is confirmed that the record corresponding to the guidance data is intended by the request information, wherein a match between the keyword and the numerical data includes a full matching record in which the keyword completely matches the numerical data or a partial matching record in which the keyword partially matches the numerical data, and wherein, when the intention of the request information is confirmed, the priority of the full matching record is greater than the priority of the partial matching record.
  28. The retrieval system of claim 27, wherein a first special character is stored between each of the columns to separate each of the columns.
  29. The search system of claim 28, wherein a second special character is stored between the guide bar and the value field for separating the guide data of the guide bar from the numerical data of the value column.
  30. The retrieval system of claim 27, wherein the column has a fixed number of digits.
  31. The retrieval system of claim 27, wherein the column includes a content field for storing corresponding content details of the record.
  32. The retrieval system of claim 27, wherein the request information is from a voice input by a user.
  33. The retrieval system of claim 31, wherein the voice is input via a mobile communication device.
  34. The search system of claim 27, further comprising a search interface unit coupled to the search engine for receiving the keyword and transmitting it to the search engine.
  35. The retrieval system of claim 27, wherein the size of each of the records is equal to a specific value, and after searching the specific value, the search engine performs the full-text search on the record next to the record currently being searched.
  36. The retrieval system of claim 27, wherein a third special character is stored after the last column of the title bar of each record, and when the search engine finds the third special character during the full-text search, the search engine performs the full-text search on the next record of the record.
  37. A retrieval method includes: inputting a keyword, wherein the keyword is generated from a request information; and performing a full-text search on a structured database according to the keyword, wherein the structured database stores a plurality of records, each of the records includes at least one column, wherein the column includes a plurality of columns, each of the columns includes a guide bar and a value column, wherein the guide bar stores a guidance data and the value column stores a numerical data, the numerical data corresponding to the guidance data belonging to the same column, wherein the guidance data is used to indicate the name of one of a plurality of attribute categories, and the numerical data records the content of the attribute category indicated by the corresponding guidance data, wherein the plurality of numerical data belonging to a first record are related to each other according to the corresponding plurality of guidance data, and the numerical data belonging to the first record are used together to describe a plurality of attributes of the first record; wherein when a first numerical data of a first column of a first record in the structured database matches the keyword, a first guidance data corresponding to the first numerical data in the first column is output to confirm the intention of the request information, wherein when the intent data of the request information matches the guidance data, it is confirmed that the record corresponding to the guidance data is intended by the request information, wherein a match between the keyword and the numerical data includes a full matching record in which the keyword completely matches the numerical data or a partial matching record in which the keyword partially matches the numerical data, and wherein, when the intention of the request information is confirmed, the priority of the full matching record is greater than the priority of the partial matching record.
  38. The search method of claim 37, wherein a first special character is stored between each of the columns to separate each of the columns.
  39. The search method of claim 38, wherein a second special character is stored between the guide bar and the value field for separating the guide data of the guide bar from the numerical data of the value column.
  40. The search method of claim 37, wherein the column has a fixed number of bits.
  41. The search method of claim 37, wherein the column includes a content field for storing corresponding content details of the record.
  42. The retrieval method of claim 37, wherein the request information is from a voice input by a user.
  43. The search method of claim 42, wherein the voice is input via a mobile communication device.
  44. The search method of claim 37, further comprising a search interface unit coupled to the search engine for receiving the keyword for transmission to the search engine.
  45. The search method of claim 37, wherein the size of each record is equal to a specific value, and after searching the specific value, the search engine performs the full-text search on the record next to the record currently being searched.
  46. The search method of claim 37, wherein a third special character is stored after the last column of the title bar of each record, and when the search engine finds the third special character during the full-text search, the search engine performs the full-text search on the next record of the record.
TW102149041A 2012-12-31 2013-12-30 Searching method, searching system and nature language understanding system TWI578175B (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
CN2012105930648A CN103049567A (en) 2012-12-31 2012-12-31 Retrieval method, retrieval system and natural language understanding system
CN2013101845443A CN103218463A (en) 2012-12-31 2013-05-17 Retrieval method, retrieval system and natural language understanding system
TW102121406 2013-06-17
CN201310690513.5A CN103761242B (en) 2012-12-31 2013-12-13 Search method, searching system and natural language understanding system
TW102149041A TWI578175B (en) 2012-12-31 2013-12-30 Searching method, searching system and nature language understanding system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW102149041A TWI578175B (en) 2012-12-31 2013-12-30 Searching method, searching system and nature language understanding system

Publications (2)

Publication Number Publication Date
TW201428517A TW201428517A (en) 2014-07-16
TWI578175B true TWI578175B (en) 2017-04-11

Family

ID=51726093

Family Applications (1)

Application Number Title Priority Date Filing Date
TW102149041A TWI578175B (en) 2012-12-31 2013-12-30 Searching method, searching system and nature language understanding system

Country Status (1)

Country Link
TW (1) TWI578175B (en)

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5963940A (en) * 1995-08-16 1999-10-05 Syracuse University Natural language information retrieval system and method
CN1281191A (en) * 1999-07-19 2001-01-24 松下电器产业株式会社 The method of information retrieval and information retrieval device
TW200811676A (en) * 2006-08-23 2008-03-01 Inventec Besta Co Ltd Chinese input method using number of selections of Chinese character to determine arrangement of characters
US20080071771A1 (en) * 2006-09-14 2008-03-20 Sashikumar Venkataraman Methods and Systems for Dynamically Rearranging Search Results into Hierarchically Organized Concept Clusters
CN100421104C (en) * 2001-10-18 2008-09-24 英业达股份有限公司 Document library system storage and fetch recording method
CN1542657B (en) * 2003-04-07 2010-04-28 汤姆森特许公 Method for ensuring data compatibility when storing data item in database
CN101751422A (en) * 2008-12-08 2010-06-23 北京摩软科技有限公司 Method, mobile terminal and server for carrying out intelligent search at mobile terminal
CN102150158A (en) * 2008-09-12 2011-08-10 诺基亚公司 Method, system, and apparatus for arranging content search results
TW201131402A (en) * 2009-11-09 2011-09-16 Arcsight Inc Enabling faster full-text searching using a structured data store
CN102436458A (en) * 2011-03-02 2012-05-02 奇智软件(北京)有限公司 Command analyzing method and system

Also Published As

Publication number Publication date
TW201428517A (en) 2014-07-16
