TW476895B

TW476895B - Natural language inquiry system and method

Info

Publication number: TW476895B
Application number: TW089123053A
Authority: TW
Inventors: Ching-Lung Ye; Feng-Ling Jang
Original assignee: Semcity Technology Corp
Priority date: 2000-11-02
Filing date: 2000-11-02
Publication date: 2002-02-21
Also published as: US20020052871A1

Abstract

The invention relates to a natural language inquiry system and method. The natural language inquiry system includes: a natural language processing program that processes a Chinese query sentence input by a user to obtain the deep grammatical structure of the sentence; a document database that stores documents related to the knowledge of an application domain; a document conversion database, linked to the document database, that stores information describing the application domain knowledge; a comparison program, connected between the natural language processing program and the document conversion database, that compares the deep grammatical structure of the Chinese query sentence with the information stored in the document conversion database; and an answer retrieval program, connected to the comparison program and the document database, that finds the corresponding answer in the document database according to the comparison result.

Description

The present invention relates to a natural language query system and method that allow a user to input a Chinese query sentence, and in particular to a system and method that allow the user to input the Chinese query sentence by voice.

FIG. 1 shows how the conventional art handles such queries. When a user 100 wants to look up an object, such as a book or a magazine, the user typically enters keywords about the object through an input interface 102 into a processing program 104. The processing program 104 searches a database 106 for data related to the keywords and then sends the data to the user 100 through an output interface 108. This conventional approach has the following disadvantages:

1. Usually only keywords can be entered as a query. For example, when a book title is entered into a library query system, the data returned is sometimes not what the user wanted.
2. Sentences that are not simple sentences cannot be entered. When a query has two or more keywords, the intended meaning cannot be expressed accurately.

In view of this, the present invention proposes a natural language query system that accepts a complete sentence as a query, entered by voice or by keyboard, and that allows the sentence to contain two or more keywords. The natural language query system includes a natural language processing program, a document database, a document conversion database, an answer retrieval program, and a comparison program.

The natural language processing program processes the Chinese query sentence input by the user to obtain a deep grammatical structure of the sentence. The document database stores documents related to the knowledge of an application domain.

The document conversion database is linked to the document database and stores information describing the application domain knowledge. This information is stored in the document conversion database in the form of deep grammatical structures and records the meaning of the content of the related documents stored in the document database. The comparison program is connected between the natural language processing program and the document conversion database; it compares the deep grammatical structure of the Chinese query sentence, obtained by the natural language processing program, with the information stored in the document conversion database. The answer retrieval program is connected to the comparison program and the document database; according to the result obtained by the comparison program, it finds the corresponding answer in the document database.

The natural language query system further includes an input interface, an output interface, a natural language processing database, and a comparison database. The input interface is connected to the natural language processing program and lets the user input the Chinese query sentence by voice. The output interface is connected to the answer retrieval program and presents the corresponding answer found in the document database to the user. The natural language processing database is connected to the natural language processing program and provides the information that the program needs when processing the Chinese query sentence. The comparison database is connected to the comparison program and stores the elements that exist in the application domain knowledge, the properties of those elements, and the relationships among them.
The present invention also proposes a natural language query method in which the user inputs a Chinese query sentence by voice or by keyboard and obtains information about the Chinese query sentence. The method includes the following steps: first, the Chinese query sentence is processed to obtain its deep grammatical structure; next, the deep grammatical structure is compared with the content of the document conversion database; then, according to the matching result of the comparison, the corresponding answer is retrieved from the document database; finally, the corresponding answer is output to the user.

The present invention further proposes a natural language processing element that analyzes the structure of a Chinese query sentence input by the user by voice. The natural language processing element is the natural language processing program, which processes the Chinese query sentence to obtain its deep grammatical structure. Connected to the natural language processing program is a natural language processing database, which provides the information that the program needs when processing the Chinese query sentence. The natural language processing program includes a word segmentation program, a parsing program, and a semantic interpretation program.
The word segmentation program segments the Chinese query sentence input by the user into words. The parsing program is connected to the word segmentation program and parses the segmented Chinese query sentence. The semantic interpretation program is connected to the parsing program and performs semantic interpretation on the parsed Chinese query sentence: based on the parts of speech obtained by the parsing program, it represents the Chinese query sentence as a deep grammatical structure.

The present invention also proposes a natural language processing method that analyzes the structure of a Chinese query sentence input by the user by voice. The method includes the following steps: first, the Chinese query sentence is segmented into words; next, the segmented sentence is parsed; then, the parsed sentence is semantically interpreted; finally, the deep grammatical structure of the Chinese query sentence is obtained.
Using the "deep grammatical structure" of the present invention as the semantic representation makes comparison easier, simplifies the work of semantic interpretation, and allows the meaning of a sentence to be determined correctly even when the sentence is not a simple sentence.

To make the above and other objects, features, and advantages of the present invention more comprehensible, preferred embodiments are described in detail below with reference to the accompanying drawings.

Brief description of the drawings:

FIG. 1 is a schematic diagram of the conventional art;
FIG. 2 is a schematic diagram of the present invention;
FIG. 3 is a flowchart of the present invention;
FIG. 4 is a schematic diagram of the present invention;
FIG. 5 is a flowchart of the present invention;
FIG. 6 is a flowchart of the present invention;
FIG. 7 is a schematic diagram of the present invention;
FIG. 8 is a schematic diagram of the present invention;
FIG. 9 is a schematic diagram of the present invention; and
FIG. 10 is a schematic diagram of the present invention.

Reference numerals of important elements:

100, 201: user
102, 202: input interface
104: processing program
106: database
108, 214: output interface
200: natural language query system
204: natural language processing program
206: comparison program
208: document conversion database
210: document database
212: answer retrieval program
216: natural language processing database
218: comparison database
400: Chinese query sentence
404: word segmentation program
406: parsing program
408: semantic interpretation program
412: deep grammatical structure
500: input sentence
502: inference engine
504: grammar rules
506: output sentence structure
Steps s302 to s308: steps of one embodiment of the present invention
Steps s502 to s604: steps of another embodiment of the present invention

Preferred embodiments

FIG. 2 shows a natural language query system according to a preferred embodiment of the present invention, in which a user inputs a Chinese query sentence by voice or by keyboard and obtains information about the sentence through the system. The natural language query system includes a natural language processing program 204, a document database 210, a document conversion database 208, an answer retrieval program 212, and a comparison program 206.

The natural language processing program 204 processes the Chinese query sentence input by the user 201 to obtain a deep grammatical structure of the sentence.

The document database 210 stores documents related to the knowledge of an application domain. The application domain is the field of a particular industry or environment; for example, if the environment is a finance department, the document database 210 stores finance-related documents.

The document conversion database 208 is linked to the document database 210 and stores information describing the application domain knowledge. This information is stored in the document conversion database 208 in the form of deep grammatical structures and records the meaning of the content of the related documents stored in the document database 210.

The comparison program 206 is connected between the natural language processing program 204 and the document conversion database 208. It compares the deep grammatical structure of the Chinese query sentence, obtained by the natural language processing program 204, with the information stored in the document conversion database 208.

The answer retrieval program 212 is connected to the comparison program 206 and the document database 210. According to the result obtained by the comparison program 206, it finds the corresponding answer in the document database 210.

The natural language query system 200 further includes an input interface 202, an output interface 214, a natural language processing database 216, and a comparison database 218. The input interface 202 is connected to the natural language processing program 204 and lets the user 201 input the Chinese query sentence by voice. The output interface 214 is connected to the answer retrieval program 212 and presents the corresponding answer found in the document database 210 to the user 201.
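The data flow among the natural language processing program 204, the comparison program 206, the document conversion database 208, the document database 210, and the answer retrieval program 212 can be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the `DeepStructure` class, the function names, the toy data, and above all the matching rule (two structures match when they share a range term) are hypothetical and are not taken from the patent, which does not specify how the comparison is computed. The four fields correspond to the topic, domain, type, and range parts that this description gives for the deep grammatical structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeepStructure:
    """Four-part deep structure (topic, domain, type, range)."""
    topic: str
    domain: str
    query_type: str      # the "type" part
    range_terms: tuple   # the "range" part, one or more terms


def compare(query, conversion_db):
    """Stand-in for comparison program 206.

    Assumed rule: a stored structure matches when it shares at least
    one range term with the query structure.
    """
    wanted = set(query.range_terms)
    return [doc_id for doc_id, stored in conversion_db.items()
            if wanted & set(stored.range_terms)]


def retrieve_answers(doc_ids, document_db):
    """Stand-in for answer retrieval program 212."""
    return [document_db[doc_id] for doc_id in doc_ids]


def answer_query(sentence, nlp_program, conversion_db, document_db):
    """Process the sentence, compare, then retrieve the answers."""
    deep = nlp_program(sentence)            # program 204
    matches = compare(deep, conversion_db)  # program 206 against database 208
    return retrieve_answers(matches, document_db)  # program 212 against 210


# Toy data, invented for illustration only.
document_db = {
    "d1": "All about kittens.",
    "d2": "Quarterly financial report.",
}
conversion_db = {
    "d1": DeepStructure("小貓", "寵物", "介紹", ("小貓",)),
    "d2": DeepStructure("公司", "財務", "報告", ("財務狀況",)),
}


def toy_nlp_program(sentence):
    # A lookup table stands in for real segmentation, parsing,
    # and semantic interpretation.
    table = {"我愛小貓": DeepStructure("我", "我", "喜歡", ("小貓",))}
    return table[sentence]


answers = answer_query("我愛小貓", toy_nlp_program, conversion_db, document_db)
print(answers)
```

Keeping the conversion database and the document database keyed by the same document identifier is one simple way to model the link between the two databases; other designs are equally compatible with the description.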
The natural language processing database 216 is connected to the natural language processing program 204 and provides the information that the program needs when processing the Chinese query sentence. This information includes a dictionary, grammar rules, and semantic interpretation rules, which the natural language processing program 204 uses to perform word segmentation, parsing, and semantic interpretation on the Chinese query sentence. The comparison database 218 is connected to the comparison program 206 and stores the elements that exist in the application domain knowledge, the properties of those elements, and the relationships among them.

As an example, suppose the user 201 inputs the Chinese query sentence "I love kittens" (我愛小貓) through the input interface 202. The natural language processing program 204 processes the sentence and obtains its deep grammatical structure: topic: I; domain: I; type: like; range: kitten. The comparison program 206 then finds in the document conversion database 208 a term similar to this deep grammatical structure, here "kitten", and outputs "kitten" to the answer retrieval program 212. The answer retrieval program 212 retrieves from the document database 210 all categories and related content concerning kittens and sends them to the user.

FIG. 3 shows another preferred embodiment of the present invention: a natural language query method in which the user inputs a Chinese query sentence by voice and obtains information about the sentence. The method includes the following steps: in step s302, the Chinese query sentence is processed to obtain its deep grammatical structure; in step s304, the deep grammatical structure is compared with the content of the document conversion database; in step s306, according to the matching result of the comparison, the corresponding answer is retrieved from the document database; in step s308, the corresponding answer is output to the user. The content of the document conversion database is likewise stored in the form of deep grammatical structures.

FIG. 4 shows yet another preferred embodiment of the present invention: a natural language processing element that analyzes the structure of a Chinese query sentence input by the user by voice. The natural language processing element is the natural language processing program 204, which processes the Chinese query sentence to obtain its deep grammatical structure. Connected to the natural language processing program 204 is the natural language processing database 216, which provides the information that the program needs when processing the Chinese query sentence, including a dictionary, grammar rules, and semantic interpretation rules.
The natural language processing program 204 includes a word segmentation program 404, a parsing program 406, and a semantic interpretation program 408.

The word segmentation program 404 segments the Chinese query sentence input by the user into words. Segmentation compares the substring at the beginning of the Chinese query sentence with the entries of the dictionary to obtain the word string of the sentence.

The parsing program 406 is connected to the word segmentation program 404 and parses the segmented Chinese query sentence: it analyzes the word string obtained by segmentation and assigns a part of speech to each word. There are many techniques for building a parser; the present invention uses Definite Clause Grammar (DCG).

The semantic interpretation program 408 is connected to the parsing program 406 and performs semantic interpretation on the parsed Chinese query sentence: based on the parts of speech obtained by the parsing program 406, it represents the Chinese query sentence as a deep grammatical structure.

FIG. 5 shows still another preferred embodiment of the present invention: a natural language processing method that analyzes the structure of a Chinese query sentence input by the user by voice. The method includes the following steps: in step s502, the Chinese query sentence is segmented into words; in step s504, the segmented sentence is parsed; in step s506, the parsed sentence is semantically interpreted; finally, in step s508, the deep grammatical structure of the Chinese query sentence is obtained.

FIG. 6 shows the steps of word segmentation: in step s600, the substring at the beginning of the Chinese query sentence is compared with the entries of the dictionary; in step s602, following the longest-word-first rule, the longest matching substring is selected from the Chinese query sentence; in step s604, if part of the Chinese query sentence remains unmatched, comparison and selection of the longest substring continue until the whole sentence has been processed. FIG. 7 shows the word segmentation algorithm.

FIG. 8 shows the operation of the DCG parser. Given an input sentence 800 and grammar rules 804 written as a context-free grammar, Prolog's inference engine 802 derives the output sentence structure 806, expressed in DCG form. FIG. 9 shows a sentence pattern in DCG form and the result of parsing.
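The longest-word-first segmentation of steps s600 to s604 can be sketched in Python as follows. This is a minimal sketch rather than the algorithm of FIG. 7: the function name `segment`, the toy dictionary, and the fallback of emitting a single character when no dictionary entry matches are assumptions made for illustration.

```python
def segment(sentence, dictionary):
    """Greedy longest-match segmentation from the start of the sentence."""
    # Longest dictionary entry bounds how far ahead we need to look.
    max_len = max((len(word) for word in dictionary), default=1)
    words = []
    i = 0
    while i < len(sentence):
        # Assumed fallback: an unmatched character becomes its own word.
        match = sentence[i]
        # Try the longest candidate first, then shorter ones (step s602).
        for length in range(min(max_len, len(sentence) - i), 1, -1):
            candidate = sentence[i:i + length]
            if candidate in dictionary:
                match = candidate
                break
        words.append(match)
        # Continue with the remaining, still unmatched part (step s604).
        i += len(match)
    return words


# Toy dictionary, invented for illustration only.
toy_dictionary = {"我", "想", "知道", "公司", "的", "財務", "狀況", "財務狀況"}
print(segment("我想知道公司的財務狀況", toy_dictionary))
```

With this toy dictionary the longer entry 財務狀況 is preferred over the shorter 財務, which is the effect the longest-word-first rule is meant to have.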
In the grammar rule of FIG. 9, the left side of the arrow represents a sentence and its structure, and the right side lists its constituents in order: subject, auxiliary verb and interrogative adverb, adverb phrase, verb phrase, and question mark. The returned result is question(Type, Subj, Subj, AdvP, VP), where Type is the type of the interrogative adverb, the second and third arguments are the topic and the subject of the sentence respectively, and these are followed by the adverb phrase and the verb phrase. For how to write a DCG, see a Prolog textbook such as Clocksin and Mellish, Programming in Prolog, 3rd ed., 1996, Springer-Verlag.

Semantic interpretation converts the received sentence structure into a deep grammatical structure. The deep grammatical structure is represented as a "feature structure". A feature structure is a set of feature-value pairs in which each feature is an atom and each value is either an atom or another feature structure. Unification is the main operation on feature structures.
After the two characteristic structures A and B are combined and operated, the result is the smallest characteristic structure that can cover A and B. If this characteristic structure does not exist, the union operation fails. The deep grammatical structure produced by the semantic interpretation process is four parts: topic '', `` domain knowledge '', ``, '', type, and `` range ''. . Here is an example to explain the process of a Chinese query sentence processed by word segmentation, parsing, and Swiss interpretation. If a Chinese query sentence is entered as `` I want to know the financial status of the company '', in the word segmentation processing, this sentence will be segmented into a Chinese query sentence according to long-term rules, I ,,, `` know The company's financial situation ''. These substrings are analyzed and processed to obtain the sentence structure in DCG form, Type ·· blank; Subj ·· me; AdvP ·· blank; VP: Know + (company, financial status), you can know the part of speech of the Chinese query sentence entered . Then this sentence structure is processed by semantic interpretation to obtain a deep grammatical structure ’as topi c: I; domain: I; type: know; range: (company, financial status). Please refer to Figure 10 for more examples to learn the deep grammatical structure of these Chinese query sentences. Printed by the Consumer Cooperatives of the Intellectual Property Bureau of the Ministry of Economics In summary, the advantages of the present invention are as follows: 1 · Use "丨 layer grammatical structure" as Chinese query sentences and file conversions ^ The semantic representation of the data in the database, but So that the comparison program can do the comparison). 2. Use "deep grammatical structure" as the semantic representation of Chinese query sentences' to enable natural language processing programs to simplify semantic processing. 
This paper scale applies Chinese National Standard (CNS) A4 specification (210 X 297 public) (%) A7 476895 6672twf.doc / 008 _B7_ V. Explanation of Invention (VT). 3. If it is not a simple sentence, for example, a sentence may have two subjects or the like, the semantic meaning of the sentence can be determined correctly. Although the present invention has been disclosed as above with a preferred embodiment, it is not intended to limit the present invention. Any person skilled in the art can make various modifications and retouches without departing from the spirit and scope of the present invention. Therefore, the present invention The scope of protection shall be determined by the scope of the attached patent application. (Please read the precautions on the back before filling out this page) Printed by the Consumer Cooperatives of the Intellectual Property Bureau of the Ministry of Economic Affairs This paper is in accordance with China National Standard (CNS) A4 (210 X 297 mm)
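The unification operation described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the implementation disclosed in the patent: feature structures are modeled as plain dictionaries, atoms as strings or tuples of strings, and the names `unify`, `query`, and `entry` are hypothetical. The `entry` value and the conflicting atom "比較" are made up for illustration; only the `query` structure is taken from the example in the text.

```python
from typing import Optional, Union

# Atoms are modeled as strings or tuples of strings; a feature structure is a
# dict mapping feature names to atoms or to nested feature structures.
Value = Union[str, tuple, dict]


def unify(a: Value, b: Value) -> Optional[Value]:
    """Return the smallest structure covering both a and b, or None on failure."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)  # start from a shallow copy of a
        for feature, b_value in b.items():
            if feature in result:
                merged = unify(result[feature], b_value)
                if merged is None:
                    return None  # conflicting values: unification fails
                result[feature] = merged
            else:
                result[feature] = b_value  # feature only in b: carry it over
        return result
    # Two atoms (or an atom and a structure) unify only when they are equal.
    return a if a == b else None


# Deep grammatical structure of the example query, as given in the text.
query = {"topic": "我", "domain": "我", "type": "知道", "range": ("公司", "財務狀況")}

# Hypothetical partial entry; its contents are illustrative only.
entry = {"type": "知道", "range": ("公司", "財務狀況")}

print(unify(query, entry) == query)   # the query already covers the entry
print(unify(query, {"type": "比較"}))  # the "type" values conflict
```

Returning `None` on failure mirrors the statement that unification fails when no covering feature structure exists. Whether the comparison program itself relies on unification is not stated in the text, so the use of `unify` for matching a query against an entry is an assumption of this sketch.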

Claims

1. A natural language query system, wherein a user inputs a Chinese query sentence by one of voice input and keyboard input and obtains information about the Chinese query sentence through the natural language query system, the natural language query system comprising:
a natural language processing program, which processes the Chinese query sentence input by the user to obtain a deep grammatical structure of the Chinese query sentence;
a document database, for storing related documents of an application domain knowledge;
a document conversion database, linked with the document database, for storing a plurality of data describing the application domain knowledge, wherein the data exist in the document conversion database in the form of the deep grammatical structure and record the meaning of one of the related documents stored in the document database;
a comparison program, connected between the natural language processing program and the document conversion database, which compares the deep grammatical structure of the Chinese query sentence obtained by the natural language processing program with the data in the document conversion database; and
an answer retrieval program, connected with the comparison program and the document database, which finds a corresponding answer from the document database according to a result obtained by the comparison program.

2. The natural language query system as described in claim 1, further comprising:
an input interface, connected with the natural language processing program, for allowing the user to input the Chinese query sentence by voice;
an output interface, connected with the answer retrieval program, through which the corresponding answer found in the document database is presented to the user for reading;
a natural language processing database, connected with the natural language processing program, for providing a plurality of data required by the natural language processing program to process the Chinese query sentence; and
a comparison database, connected with the comparison program, for storing a plurality of components existing in the application domain knowledge, a plurality of properties of the components, and the correlations between the components.

3. The natural language query system as described in claim 2, wherein the data provided by the natural language processing database include a dictionary, a grammar rule, and a semantic interpretation rule.

4. The natural language query system as described in claim 2, wherein the natural language processing program processes the Chinese query sentence by using the data to perform a word segmentation process, a parsing process, and a semantic interpretation process on the Chinese query sentence.

5. A natural language query method, wherein a user inputs a Chinese query sentence by one of voice input and keyboard input and obtains information about the Chinese query sentence through the natural language query method, the natural language query method comprising the steps of:
processing the Chinese query sentence to obtain a deep grammatical structure of the Chinese query sentence;
comparing the obtained deep grammatical structure with a content in a document conversion database;
retrieving a corresponding answer from a document database according to a result of the comparison; and
sending the corresponding answer to the user.

6. The natural language query method as described in claim 5, wherein the content existing in the document conversion database is in the form of the deep grammatical structure.

7. A natural language processing element, wherein a user inputs a Chinese query sentence by one of voice input and keyboard input and a structure of the Chinese query sentence is analyzed through the natural language processing element, the natural language processing element being a natural language processing program that processes the Chinese query sentence to obtain a deep grammatical structure of the Chinese query sentence.

8. The natural language processing element as described in claim 7, wherein a natural language processing database is connected to the natural language processing element and provides a plurality of data required by the natural language processing program when processing the Chinese query sentence.

9. The natural language processing element as described in claim 8, wherein the data provided by the natural language processing database connected to the natural language processing element include a dictionary, a grammar rule, and a semantic interpretation rule.

10. The natural language processing element as described in claim 9, wherein the natural language processing element comprises:
a word segmentation program, which performs a word segmentation process on the Chinese query sentence input by the user;
a parsing program, connected with the word segmentation program, which performs a parsing process on the Chinese query sentence that has undergone the word segmentation process; and
a semantic interpretation program, connected with the parsing program, which performs a semantic interpretation process on the Chinese query sentence that has undergone the parsing process.

11. The natural language processing element as described in claim 10, wherein the word segmentation process compares a substring at the beginning of the Chinese query sentence with a content of the dictionary to obtain a word string of the Chinese query sentence.

12. The natural language processing element as described in claim 11, wherein the parsing process parses the word string to obtain a part of speech of the word string.

13. The natural language processing element as described in claim 12, wherein the semantic interpretation process represents the Chinese query sentence as the deep grammatical structure according to the part of speech.

14. A natural language processing method, wherein a user inputs a Chinese query sentence by one of voice input and keyboard input and a structure of the Chinese query sentence is analyzed through the natural language processing method, the natural language processing method comprising the steps of:
performing a word segmentation process on the Chinese query sentence;
performing a parsing process on the Chinese query sentence that has undergone the word segmentation process;
performing a semantic interpretation process on the Chinese query sentence that has undergone the parsing process; and
obtaining a deep grammatical structure.

15. The natural language processing method as described in claim 14, wherein the word segmentation step comprises:
comparing a substring at the beginning of the Chinese query sentence with a content of a dictionary;
selecting the longest substring from the Chinese query sentence according to a long-word-first rule; and
if part of the Chinese query sentence remains unmatched, continuing to compare and select the longest substring until the whole Chinese query sentence has been processed.
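The long-word-first word segmentation step can be illustrated with a greedy forward longest-match loop. The following Python sketch is one plausible reading of that step, not the patented implementation. The function name `segment`, the sample `dictionary`, and the fallback of emitting a single character when no dictionary word matches are all assumptions introduced for illustration.

```python
def segment(sentence: str, dictionary: set) -> list:
    """Greedy forward longest-match segmentation (long-word-first rule)."""
    # The longest dictionary entry bounds how far ahead we need to look.
    max_len = max((len(word) for word in dictionary), default=1)
    words = []
    i = 0
    while i < len(sentence):
        # Try the longest candidate substring starting at position i first.
        for length in range(min(max_len, len(sentence) - i), 0, -1):
            candidate = sentence[i:i + length]
            # Assumption: an unknown single character is emitted as its own word.
            if length == 1 or candidate in dictionary:
                words.append(candidate)
                i += length
                break
    return words


# Hypothetical dictionary; a real system would load a full lexicon.
dictionary = {"我", "想", "知道", "公司", "的", "財務", "財務狀況"}
print(segment("我想知道公司的財務狀況", dictionary))
```

Because the longest candidate is tried first, "財務狀況" is chosen over the shorter dictionary entry "財務". The loop repeats on the unmatched remainder until the whole sentence is consumed, which corresponds to the "continue to compare and select" condition.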
