TW202238435A - Natural language dialogue intention analysis method

Info

Publication number: TW202238435A
Application number: TW110111481A
Authority: TW
Inventors: 王文傑
Original assignee: 卓騰語言科技股份有限公司
Priority date: 2021-03-26
Filing date: 2021-03-26
Publication date: 2022-10-01
Also published as: TWI799822B

Abstract

A natural language dialogue intent analysis method provides a language-oriented interactive interface built on key vocabulary, together with an NLU engine designed around Chinese sentence structure. The invention anchors on the [verb] and builds pattern models between each [entity] and the [verb]; it can also extract Chinese [vocabulary] for later use. Across a plurality of sentence structures, the verb position in each sentence pattern is set as an anchor point, and an intent model is established between the entity vocabulary and that anchor point. The anchor point serves as the reference point for the intent analysis. The system can output the analyzed sentence intent as program code, which developers combine with their own business or application logic blocks; the result computed by that logic is returned to the user or used in other applications.

Description

Natural language dialogue intent analysis method

The invention relates to the technical field of natural language processing, and in particular to an intent analysis method based on semantic computation over the deep structure of natural language.

Natural language processing (NLP) and natural language understanding (NLU) are important human-computer interaction interface technologies in the era of artificial intelligence. In the latest wave of AI, NLP/NLU is the most important technology to reach practical deployment apart from image recognition. Because NLP/NLU processes natural language, however, it is necessarily constrained by the characteristics of language data.

In the prior art, a Clickbot (click-based chatbot) works by matching against scripts, and the constraints it places on those scripts are very strict. This approach needs a large amount of script logic to tell intent associations apart, and when the script logic is poorly built the bot often simply lists every script option. A Keywordbot (keyword-based chatbot) instead needs a large number of keywords, including abbreviations; users often get irrelevant replies because the Keywordbot cannot match a keyword, and the longer the project lives, the harder it is to maintain. NLU engines built with machine learning (ML) also need a large number of sentences to train character-level models. Their model errors are hard to correct, and their heavy computation requires a large server to run well, which restricts them to online use.

Most existing intelligent question-answering systems rely on big data, deep learning, and AI for natural language processing. The drawbacks are that the analysis is too coarse and simplistic, so semantic analysis results are not accurate enough, and that the database needs so much storage that the system is difficult to download to and run on a mobile phone.

In view of this, the present invention generates its models from syntactic structure and does not need a large number of sentences. Model errors are easy to trace and correct, and the computing requirements are extremely low, so the system can be installed inside an enterprise or on a personal mobile phone. The invention greatly reduces hardware capacity requirements: it needs only about 32 MB of storage, compared with the gigabytes or more required by the prior art. Most semantic understanding systems on the market require a third-party server. Because the invention's capacity requirement is small, no third-party server is needed, and it can even be downloaded to a mobile phone as an app component to carry out preliminary natural language dialogue. Since no third-party server is needed, a network connection is not a prerequisite for the system.

The present invention is a natural language dialogue intent analysis method: a language-oriented interactive interface built on key vocabulary, with an NLU engine designed around Chinese sentence structure. Notably, the invention anchors on the [verb] and builds pattern models between each [entity] and the [verb] for intent analysis and classification. Unlike other machine-learning-based solutions, it can additionally extract Chinese [vocabulary] for later use. The design matches sentence patterns and selects the words needed for semantic computation as parameters, so that the sentence-pattern-to-meaning relationship of the language is preserved while the key words needed by the computing interface are also extracted.

The present invention divides dialogue into three layers. Project name (Project): the scenario in which a group of intents applies, for example a convenience store. Intent name (Intent): one particular intent; in the convenience store scenario there is a bill-payment intent and a ticket-purchase intent, but no lodging intent. Utterance: a set of language expressions that can express a given intent in a given scenario; each may be a complete or an incomplete sentence.
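The three layers can be pictured as a small nested data structure. The sketch below is only an illustration under assumptions: the class names, the intent names, and the sample utterances are invented here and are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class Intent:
    """One intent within a project, with the utterances that express it."""
    name: str
    utterances: list[str] = field(default_factory=list)


@dataclass
class Project:
    """A scenario that groups the intents applicable within it."""
    name: str
    intents: dict[str, Intent] = field(default_factory=dict)

    def add_utterance(self, intent_name: str, utterance: str) -> None:
        # Create the intent on first use, then file the utterance under it.
        intent = self.intents.setdefault(intent_name, Intent(intent_name))
        intent.utterances.append(utterance)


# Hypothetical convenience-store project; the utterances are invented examples.
store = Project("convenience_store")
store.add_utterance("pay_bill", "我要繳電費")
store.add_utterance("buy_ticket", "我想買兩張車票")
store.add_utterance("buy_ticket", "兩張車票")  # incomplete sentences are allowed

print(sorted(store.intents))       # intents defined for this scenario
print("lodging" in store.intents)  # this scenario has no lodging intent
```

Keeping intents inside a project means the same utterance can map to different intents in different scenarios, which matches the role the text gives to the project layer.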

In the natural language dialogue intent analysis method of the present invention, the plurality of syntactic structure models are established by the following steps: S1. A manager first establishes an application scenario as the context. S2. A plurality of intents are established under that application scenario. S3. Under each intent, a plurality of example sentences matching that intent in the application scenario are entered, one example per sentence pattern. S4. The sentence analysis module of the system performs word segmentation and part-of-speech tagging on the example sentences and translates the tagged sentences into a plurality of sentence structures; based on the words and parts of speech of the example sentences, the sentence analysis module also provides an entity vocabulary selection screen on which the manager checks the entity vocabulary associated with the intent. S5. An intent analysis module binds the sentence structures under the application scenario to their respective intents, encodes them as syntactic structure models, and stores them in an application scenario model unit of the scenario database. S6. An output conversion module of the system extracts the syntactic structure models from the application scenario model unit and converts each of them into program code output that is available for download.

The intent analysis itself proceeds as follows: S7. A user connects from a terminal device to a dialogue page of the application scenario and enters a dialogue sentence on that page. S8. The sentence analysis module takes the dialogue sentence from the dialogue page, performs word segmentation and part-of-speech tagging, sets the verb position as the anchor point, and translates the tagged sentence into a sentence structure. S9. Based on the application scenario, the intent analysis module connects to the application scenario model unit of the scenario database and compares the sentence structure of the dialogue sentence with the syntactic structure models in that unit, using the anchor point as the reference point for the sentence structure comparison. S10. If one of the syntactic structure models matches, the intent bound to that model is defined as the expressed intent of the dialogue sentence.

The method can, for example, establish "math word problems" as the project name, indicating that this group of intents applies to the language scenario (discourse) of math word problems, and then establish "addition and subtraction" as the intent name. Under this intent, every sentence is an utterance describing addition or subtraction, for example 「爸爸吃掉兩顆蘋果」 ("Dad ate two apples"), 「姐姐弄破三張」 ("Big sister tore three sheets"), or 「哥哥又給他兩枝筆」 ("Big brother gave him two more pens"). After the sentence analysis module and the intent analysis module perform word segmentation, part-of-speech tagging, and named entity recognition, every sentence is converted into a regular expression that keeps only the tags used for sentence pattern recognition. Taking 「爸爸吃掉兩顆蘋果」 as an example, the complete model generation process is as follows:

In the above, the intent analysis module transcribes the training sentence 「爸爸吃掉兩顆蘋果」 into a regular expression and, at the same time, transcribes the verb as the bracketed form [吃掉]. The bracketed form extends the pattern to any verb ending in 「掉」 and to any verb or verb phrase containing 「吃」, all of which express the "decrease" intent, while the reversed meaning that the negation 「不」 would introduce is excluded. With this design, very little data is needed: semantic intents are distinguished by preserving the sentence pattern of the utterance, which gives maximum compatibility.
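The text does not reproduce the regular expression itself, so the following is only a sketch of the idea under stated assumptions. The `<TAG>word</TAG>` format, the tag names, the optional noun slot, and the two extra test sentences are all invented for illustration; only the character class `[吃掉]` and the exclusion of 「不」 come from the description above.

```python
import re

# Assumed tag format: each token is wrapped as <TAG>word</TAG>.
# [吃掉] is a character class: the verb must contain 吃 or 掉.
# [^<不] keeps the match inside the verb token and rejects a verb containing 不.
DECREASE = re.compile(
    r"<PERSON>[^<]+</PERSON>"
    r"<VERB>[^<不]*[吃掉][^<不]*</VERB>"
    r"<NUM>[^<]+</NUM>"
    r"(?:<NOUN>[^<]+</NOUN>)?"  # the counted noun is optional (assumption)
)


def is_decrease(tagged: str) -> bool:
    """Return True when the whole tagged sentence fits the 'decrease' pattern."""
    return DECREASE.fullmatch(tagged) is not None


# The training sentence from the text, hand-tagged.
trained = "<PERSON>爸爸</PERSON><VERB>吃掉</VERB><NUM>兩顆</NUM><NOUN>蘋果</NOUN>"
# Invented sentences: a different verb ending in 掉, and a negated verb.
other_verb = "<PERSON>弟弟</PERSON><VERB>丟掉</VERB><NUM>三張</NUM>"
negated = "<PERSON>爸爸</PERSON><VERB>吃不掉</VERB><NUM>兩顆</NUM><NOUN>蘋果</NOUN>"

for sentence in (trained, other_verb, negated):
    print(is_decrease(sentence))
```

One training sentence therefore accepts a family of verbs sharing a character with 「吃掉」, while the negated form is rejected because 「不」 is excluded from the verb slot.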

To compare the method with prior art based on machine learning, such as LUIS, we first set up the same 18 example sentences related to a currency exchange intent in both this system and LUIS, as follows:

1. 上星期三美金兌台幣是多少； 1. What was the exchange rate between US dollars and Taiwan dollars last Wednesday;

2. 今天美金兌台幣是多少； 2. What is the exchange rate between US dollars and Taiwan dollars today;

3. 我想要美金100元； 3. I want USD 100;

4. 我想要100元美金； 4. I want 100 USD;

5. 我想買美金100元； 5. I want to buy USD 100;

6. 我想買100元美金； 6. I want to buy 100 USD;

7. 美金100要台幣多少； 7. How much is US$100 in Taiwan dollars;

8. 美金100要多少台幣； 8. How much is US$100 in Taiwan dollars;

9. 100美金要台幣多少； 9. How much is 100 US dollars in Taiwan dollars;

10. 100美金要多少台幣； 10. How much is 100 US dollars in Taiwan dollars;

11. 美金100元要台幣多少； 11. How much is 100 US dollars in Taiwan dollars;

12. 美金100元要多少台幣； 12. How much is 100 US dollars in Taiwan dollars;

13. 100元美金要台幣多少； 13. How much is 100 US dollars in Taiwan dollars;

14. 100元美金要多少台幣； 14. How much is 100 US dollars in Taiwan dollars;

15. 100元美金可以兌換多少台幣； 15. How many Taiwan dollars can be exchanged for 100 US dollars;

16. 100元美金可以兌換台幣多少； 16. How much can 100 US dollars be converted into Taiwan dollars;

17. 美金100元可以兌換多少台幣； 17. How many Taiwan dollars can be exchanged for 100 US dollars;

18. 美金100元可以兌換台幣多少； 18. How much can 100 US dollars be converted into Taiwan dollars;

In this comparison, the 18 example sentences were entered into LUIS and the model was trained. We then ran the LUIS test with related questions and obtained similarity scores for the currency exchange intent, as shown in [Figure 4]. The question 『我想要美金100元』 ("I want USD 100"), identical to example sentence 3, scored 0.985. 『我想換美金200元』 ("I want to exchange for USD 200"), which has the same intent, dropped to only 0.677. With the same sentence pattern and the same intent, and only the amount differing, the similarity differs that much, so technology like that of LUIS needs a large number of example sentences to train its model. Another question, 『口罩三個15元台幣再加7元台幣運費很貴嗎』 ("Three masks cost NT$15, plus NT$7 shipping; is that expensive?"), has nothing to do with the currency exchange intent, yet its similarity is as high as 0.978. This shows why existing language understanding technologies, which build machine learning models on mathematical principles, often produce off-topic answers.

Continuing the comparison with LUIS, the same 18 example sentences were entered to train the model of this system, which compares sentence structures. Entering 『我想買歐元100元』 ("I want to buy EUR 100") into this system gives the following intent analysis: the same sentence structure is found among the example sentence models.

Entering 『我想買美金200元』 ("I want to buy USD 200") into this system gives the following intent analysis: the same sentence structure is again found among the example sentence models, and the system does not fail to match an example sentence model merely because the amount differs. Because the system compares sentence structures instead of relying on a machine learning model built on mathematical principles, an example sentence model with the same structure is found whenever the sentence structure is the same.
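The amount-independence claimed here can be illustrated with a toy structure function. This is a minimal sketch, not the patent's engine: the two regular expressions are crude stand-ins for real word segmentation and tagging, and the placeholder names `<CUR>` and `<AMT>` are assumptions. The training sentences are example sentences 3 to 6 from the list above.

```python
import re

# Toy stand-ins for the tagger: a fixed currency list and a simple amount shape.
CURRENCY = re.compile(r"美金|台幣|歐元")
AMOUNT = re.compile(r"(?:[0-9]+|[一二兩三四五六七八九十百千萬]+)元?")


def structure(sentence: str) -> str:
    """Replace currency names and amounts with placeholders; keep other words."""
    masked = CURRENCY.sub("<CUR>", sentence)
    return AMOUNT.sub("<AMT>", masked)


# Example sentences 3 to 6 from the list above, reduced to their structures.
TRAINED = {
    structure(s)
    for s in ("我想要美金100元", "我想要100元美金", "我想買美金100元", "我想買100元美金")
}


def same_structure_found(sentence: str) -> bool:
    """True when the sentence reduces to a structure seen in training."""
    return structure(sentence) in TRAINED


print(structure("我想買美金200元"))
print(same_structure_found("我想買美金200元"))
print(same_structure_found("我想買歐元100元"))
print(same_structure_found("口罩三個15元台幣再加7元台幣運費很貴嗎"))
```

Because the comparison is an exact match on structure, a different amount or currency cannot lower a score the way it does in a similarity-based model; the sentence either fits a trained structure or it does not.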

Entering 『我想換美金兩萬元』 ("I want to exchange for USD 20,000") into this system gives the following intent analysis: the same sentence structure is likewise found among the example sentence models.

Entering 『口罩三個15元台幣再加7元台幣運費很貴嗎』 ("Three masks cost NT$15, plus NT$7 shipping; is that expensive?") into this system gives the following intent analysis: the system finds no matching sentence structure among the example sentence models.

The comparison shows that the invention recognizes language intent very accurately. It needs neither a large corpus for training nor deep learning; intent analysis is achieved with as few as one example sentence per sentence structure. These features distinguish the invention from current technology.

The method can also identify the focal person of a specific event for information extraction. For example, if a person in a news article is suspected of an offence related to anti-money laundering (AML), that person must be a suspect because they "did something". As shown in [Figure 8], we first process the news texts of the training data: the sentence analysis module marks the verbs in each text, and the extracted verbs are stored as sets in a data set. Next, the verb set extracted from [non-AML news] is subtracted from the verb set extracted from [AML news]; the result is the AML news classifier used by this system.
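The subtraction of one verb set from the other is an ordinary set difference. The sketch below assumes the verbs have already been extracted; the verb lists are invented placeholders, and the rule used in `looks_like_aml` (flag an article that uses at least one remaining verb) is an assumption, since the text does not say how the resulting set is applied.

```python
def build_classifier_verbs(aml_verbs: set[str], non_aml_verbs: set[str]) -> set[str]:
    """Keep only the verbs that appear in AML news but not in non-AML news."""
    return aml_verbs - non_aml_verbs


def looks_like_aml(article_verbs: set[str], classifier_verbs: set[str]) -> bool:
    """Flag an article that uses at least one AML-specific verb (assumed rule)."""
    return not classifier_verbs.isdisjoint(article_verbs)


# Invented verb sets standing in for verbs extracted from real news texts.
aml_news_verbs = {"洗錢", "起訴", "匯款", "表示", "指出"}
non_aml_news_verbs = {"匯款", "表示", "指出", "參加"}

classifier_verbs = build_classifier_verbs(aml_news_verbs, non_aml_news_verbs)
print(sorted(classifier_verbs))
print(looks_like_aml({"表示", "起訴"}, classifier_verbs))
print(looks_like_aml({"表示", "參加"}, classifier_verbs))
```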

In the above, if a person is suspected of an AML-related offence, the pattern of the sentences a reporter uses to describe that person must differ from the pattern used for people who are not suspects. We therefore use the sentence analysis module to extract every [sentence containing a personal name] from the [AML news] and feed those sentences into the intent analysis module as the training model. The sentence analysis module learns [sentence patterns], not the probability distribution between characters. The number of sentence patterns is limited, although their combinations are endless. Two sentences such as [張三遭判刑10年6個月] ("Zhang San was sentenced to 10 years and 6 months") and [李四遭處6個月] ("Li Si was given 6 months") share few characters, but both have the pattern "person + 遭 + verb + time", so the intent analysis module treats them as the same data and the pattern needs to be entered only once. The demand for training data (models) is therefore greatly reduced, in contrast to existing semantic analysis techniques that need large amounts of data to build machine learning models.
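The contrast between character overlap and pattern identity can be checked directly on the two sentences quoted above. In this sketch the tokens are hand-tagged (standing in for the sentence analysis module), and the tag names and the choice to keep 「遭」 as a literal word are assumptions made for illustration.

```python
Token = tuple[str, str]  # (word, tag)

KEEP_LITERAL = {"PASSIVE"}  # assumed: the passive marker 遭 stays as a literal word


def pattern(tokens: list[Token]) -> str:
    """Collapse a tagged sentence into its sentence pattern."""
    return "+".join(word if tag in KEEP_LITERAL else tag for word, tag in tokens)


def char_overlap(a: str, b: str) -> float:
    """Jaccard overlap of the character sets of two sentences."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / len(sa | sb)


# Hand-tagged tokens standing in for the output of a real tagger.
zhang = [("張三", "PERSON"), ("遭", "PASSIVE"), ("判刑", "VERB"), ("10年6個月", "TIME")]
li = [("李四", "PERSON"), ("遭", "PASSIVE"), ("處", "VERB"), ("6個月", "TIME")]

text_zhang = "".join(word for word, _ in zhang)
text_li = "".join(word for word, _ in li)

print(pattern(zhang))                            # the shared sentence pattern
print(pattern(zhang) == pattern(li))             # same pattern
print(char_overlap(text_zhang, text_li) < 0.5)   # yet fewer than half the characters shared
print(len({pattern(zhang), pattern(li)}))        # patterns to store after deduplication
```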

As shown above, the system's sentence structure comparison overcomes the problems of the prior art and improves the recognition of sentence intent. Developers of semantic analysis programs can apply the system's analysis of the dialogue text to their own business or application logic, inserting the intent program code into the corresponding blocks of their own code to complete a chatbot. Developers can then install the program on the user's handheld device or on their own server, without renting expensive network bandwidth, so that the chatbot can serve as many customers as possible at the same time.

Preferably, because the system does not need a large trained model, it can be deployed as an app component on the user's own mobile phone, so that most questions are computed and answered by the local app; only genuine business requests are sent over the network, as instructions, to a cloud server.

100: input module

200: database module

300: sentence analysis module

400: intent analysis module

500: output conversion module

S1~S8: steps of the syntactic structure model building process

S100~S103: steps of the intent analysis method

A1~A5: model building operation steps of Embodiment 1

B1~B8: exception-case process steps of Embodiment 2

C3~C8: innocent-person process steps of Embodiment 2

D3~D8: money-laundering-suspect process steps of Embodiment 2

E1~E7: model application process steps of Embodiment 2

[Figure 1] Schematic diagram of the natural language dialogue intent analysis system

[Figure 2] Flow chart of the syntactic structure model building process

[Figure 3] Flow chart of the intent analysis method

[Figure 4] Similarity scores from the LUIS intent analysis method

[Figure 5] Flow chart of the model building operation steps of Embodiment 1

[Figure 6] Example sentence input screen of Embodiment 1

[Figure 7] Entity vocabulary selection screen of Embodiment 1

[Figure 8] Flow chart of the focal person analysis for a specific event

[Figure 9] Flow chart of the exception case of Embodiment 2

[Figure 10] Flow chart of the innocent-person case of Embodiment 2

[Figure 11] Flow chart of the money-laundering-suspect case of Embodiment 2

[Figure 12] Flow chart of the model application of Embodiment 2

[Figure 13] Schematic diagram of the function event pool of Embodiment 3

[Figure 14a] to [Figure 14d] Schematic diagrams of the Math project of Embodiment 3

To make the purpose, technical solution, and advantages of the present invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings. These descriptions are exemplary only and are not intended to limit the scope of the invention.

A schematic diagram of the natural language dialogue intent analysis system of the present invention is shown in [Figure 1]. The system comprises: an input module 100, through which a manager creates the system's data; a database module 200, connected to the input module 100 and comprising a verb database that stores a plurality of verbs and a scenario database that stores a plurality of scenarios; a sentence analysis module 300, connected to the input module 100 and the database module 200, which segments a sentence or a text into words and tags their parts of speech; an intent analysis module 400, connected to the database module 200 and the sentence analysis module 300, which classifies the sentence or text as containing an intent; and an output conversion module 500, connected to the intent analysis module 400 and the database module 200, which converts the intent bound to the sentence or text into program code output.

The input module 100 further comprises: a custom dictionary unit, through which the manager can import a plurality of example sentences matching an application scenario; and an example sentence creation unit, through which the manager can create the example sentences matching the application scenario and store them as a syntactic structure model.

In the natural language dialogue intent analysis method of the present invention, the plurality of syntactic structure models are established as shown in [Figure 2], by the following steps:

S1. An example sentence creation unit of the input module 100 lets a manager first establish an m-th application scenario;

S2. The manager then establishes, under the m-th application scenario, an n-th intent data set belonging to that scenario;

S3. Under the n-th intent, the manager enters a plurality of example sentences matching the n-th intent of the m-th application scenario, one example per sentence pattern, and the sentences are sent to a sentence analysis module 300;

S4. The sentence analysis module 300 performs word segmentation and part-of-speech tagging on the example sentences, then translates the tagged example sentences into a plurality of sentence structures;

S5. An intent analysis module 400 binds the sentence structures under the m-th application scenario to the n-th intent, encodes them as syntactic structure models, and stores them in an m-th application scenario model unit of the scenario database;

S6. Steps S2 to S5 are repeated until all intents belonging to the m-th application scenario have been established;

S7. Steps S1 to S6 are repeated until all application scenarios have been established;

S8. An output conversion module 500 of the system extracts the syntactic structure models from the m-th application scenario model unit and converts each of them into program code (a sentence structure representation) that is output and available for download.

In a preferred embodiment, shown in [Figure 7], step S4 further comprises step S4': based on the words and parts of speech of the example sentences, the sentence analysis module 300 provides an entity vocabulary selection screen on which the manager checks the entity vocabulary associated with the n-th intent.

In the above embodiment, the sentence structures set the verb position in each sentence pattern as an anchor point, and the intent models are established between the entity vocabulary and the anchor point. Notably, the anchor point is the reference point for comparison in the natural language dialogue intent analysis.
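Steps S1 to S5, together with the entity selection of S4' and the verb anchor, can be sketched as follows. This is a simplified illustration, not the patented encoding: the tokens are hand-tagged in place of the sentence analysis module, and the tag names, the placeholder format, the storage layout, and the scenario and intent names are all assumptions.

```python
from collections import defaultdict

Token = tuple[str, str]  # (word, tag) as a tagger might emit them


def to_structure(tokens: list[Token], checked: set[str]) -> tuple[tuple[str, ...], int]:
    """Return (structure, anchor): checked entity tags become placeholders,
    other words stay literal, and anchor is the index of the first verb."""
    anchor = next(i for i, (_, tag) in enumerate(tokens) if tag == "VERB")
    structure = tuple(f"<{tag}>" if tag in checked else word for word, tag in tokens)
    return structure, anchor


# Assumed storage layout: models[scenario][(structure, anchor)] = intent
models: dict[str, dict] = defaultdict(dict)


def add_example(scenario: str, intent: str, tokens: list[Token], checked: set[str]) -> None:
    """Bind the example's structure to the intent under the given scenario."""
    models[scenario][to_structure(tokens, checked)] = intent


# Hand-tagged versions of two example sentences (我想買美金100元 / 我想買100元美金).
example_a = [("我", "PRON"), ("想", "MODAL"), ("買", "VERB"), ("美金", "CURRENCY"), ("100元", "AMOUNT")]
example_b = [("我", "PRON"), ("想", "MODAL"), ("買", "VERB"), ("100元", "AMOUNT"), ("美金", "CURRENCY")]
checked_entities = {"CURRENCY", "AMOUNT"}  # what the manager ticks on the selection screen

add_example("finance", "exchange", example_a, checked_entities)
add_example("finance", "exchange", example_b, checked_entities)

for (structure, anchor), intent in models["finance"].items():
    print(structure, "anchor at", anchor, "->", intent)
```

Only the checked entities are generalized, so each stored structure still carries the literal words around the verb, and one example per word order is enough.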

The flow of the intent analysis method is shown in [Figure 3], and its steps are as follows:

S100. A user connects from a terminal device to a dialogue page of the m-th application scenario and enters a dialogue sentence on the dialogue page, which sends the sentence to the sentence analysis module 300;

S101. The sentence analysis module 300 takes the dialogue sentence from the dialogue page, performs word segmentation and part-of-speech tagging, sets the verb position as the anchor point, and translates the tagged sentence into a sentence structure; it then sends the sentence structure and the code of the m-th application scenario to the intent analysis module 400;

S102. Based on the m-th application scenario, the intent analysis module 400 connects to the m-th application scenario model unit of the scenario database and compares the sentence structure of the dialogue sentence with the syntactic structure models in that unit, using the anchor point as the reference point for the sentence structure comparison;

S103. If one of the syntactic structure models matches, the n-th intent bound to that model is defined as the expressed intent of the dialogue sentence.

If, in step S103, none of the syntactic structure models matches, the dialogue sentence is stored in a data set of sentences with no corresponding intent.

Step S101 further comprises the sentence analysis module 300 translating the dialogue sentence into the sentence structure according to its words, its parts of speech, and the entity vocabulary.
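Steps S100 to S103 and the no-match fallback can be sketched as one lookup. The sketch rests on assumptions: the stored model is written as (words left of the anchor, anchor verb, words right of the anchor), the tokens are hand-tagged, and the names `MODELS`, `ENTITY_TAGS`, and `unmatched` are hypothetical.

```python
Token = tuple[str, str]  # (word, tag)

ENTITY_TAGS = {"CURRENCY", "AMOUNT"}  # assumed to be the entities the manager checked

# Assumed stored model for one scenario:
# (left of anchor, anchor verb, right of anchor) -> intent
MODELS = {
    "finance": {
        (("我", "想"), "買", ("<CURRENCY>", "<AMOUNT>")): "exchange",
    }
}

unmatched: list[str] = []  # dialogue sentences with no corresponding intent


def split_at_anchor(tokens: list[Token]):
    """Generalize entities, then split the sentence around the first verb."""
    parts = [f"<{tag}>" if tag in ENTITY_TAGS else word for word, tag in tokens]
    anchor = next((i for i, (_, tag) in enumerate(tokens) if tag == "VERB"), None)
    if anchor is None:
        return None
    return tuple(parts[:anchor]), parts[anchor], tuple(parts[anchor + 1:])


def analyze(scenario: str, tokens: list[Token]):
    """Return the intent bound to a matching structure, else record the sentence."""
    key = split_at_anchor(tokens)
    intent = MODELS.get(scenario, {}).get(key)
    if intent is None:
        unmatched.append("".join(word for word, _ in tokens))
    return intent


# Hand-tagged inputs: 我想買美金200元 and the mask question from the comparison.
query = [("我", "PRON"), ("想", "MODAL"), ("買", "VERB"), ("美金", "CURRENCY"), ("200元", "AMOUNT")]
masks = [
    ("口罩", "NOUN"), ("三個", "QUANT"), ("15元", "AMOUNT"), ("台幣", "CURRENCY"),
    ("再", "ADV"), ("加", "VERB"), ("7元", "AMOUNT"), ("台幣", "CURRENCY"),
    ("運費", "NOUN"), ("很", "ADV"), ("貴", "ADJ"), ("嗎", "PARTICLE"),
]

result_query = analyze("finance", query)
result_masks = analyze("finance", masks)
print(result_query)
print(result_masks)
print(unmatched)
```

Sentences collected in `unmatched` are the ones a manager would later review to decide whether a new example sentence or a new intent is needed.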

In one embodiment, input to the dialogue page and to the input module 100 may also be given as a text, in which case the sentence analysis module 300 performs sentence splitting, word segmentation, and part-of-speech tagging on the text.

In one embodiment, the program code lets developers fill their own business or application logic into the corresponding blocks of the code to complete a chatbot; the developer can then install the program on the user's handheld device or on their own server.

本發明一實施例，該自然對話意圖分析方法，該表示意圖配合使用者勾選結構送入自身的商業或應用程式邏輯區塊計算，開發者將自身的商業或應用程式的計算結果回傳給使用者，或進行其它應用。 According to an embodiment of the present invention, the natural dialogue intent analysis method, the diagram is sent to the logic block of the business or application program for calculation according to the structure selected by the user, and the developer sends back the calculation result of the business or application program users, or other applications.

Embodiment 1, as shown in [Figure 5], is an example of building a model with a currency exchange intent:

A1. An example sentence creation unit of the system receives from a manager the creation of a financial business scenario related to financial services and then, under that financial business scenario, the creation of a currency exchange intent related to questions about exchanging different currencies;

A2. Further, the system provides a plurality of example sentence input screens for the manager, who enters one example per sentence structure to cover the sentence patterns a client might use when asking a currency exchange question, as shown in [Figure 6];

A3. After the example sentences have been entered, a whole-sentence analysis function is executed, which notifies a sentence analysis module 300 to segment the example sentences into words and tag their parts of speech; further, based on the words and parts of speech, an entity vocabulary check screen is provided for the manager to check the parameters required for intent analysis, as shown in [Figure 7];

A4. The manager checks a plurality of entity words related to the currency exchange intent; as shown in [Figure 7], these entity words are the currency and the amount;

A5. After checking the parameters required by the intent, the manager executes model generation, whereupon an intent analysis module 400 generates a plurality of syntactic structure models carrying the currency exchange intent; once generated, these syntactic structure models can be used for natural language dialogue intent analysis.

Continuing Embodiment 1, when the dialogue sentence 『我想買歐元100元』 ("I want to buy 100 euros") is input, the currency differs from that in the model's example sentences, yet the system's natural language dialogue intent analysis finds the same sentence structure among the example sentence models (it matches the syntactic structure model of example sentence 16 in [Figure 6]).

Continuing Embodiment 1, when the dialogue sentence 『我想買人民幣200 元』 ("I want to buy 200 RMB") is input, the amount differs from that in the model's example sentences, yet the system's natural language dialogue intent analysis finds the same sentence structure among the example sentence models (it matches the syntactic structure model of example sentence 16 in [Figure 6]).

Continuing Embodiment 1, when the dialogue sentence 『口罩三個15元台幣再加7元台幣運費很貴嗎』 ("Is three masks for NT$15 plus NT$7 shipping expensive?") is input, the system's natural language dialogue intent analysis finds no identical sentence structure among the example sentence models, so the system judges the dialogue sentence to be unrelated to the currency exchange intent. This demonstrates a natural language understanding effect different from that of current machine learning models based on mathematical principles.

Preferably, the system always combines "named entities (e.g., place names, currency amounts, addresses, etc.)", "time words", "quantifiers", "classifiers", "measure words", "localizers", "quantifying verbs", and "sentence-type markers (declarative/interrogative)" with the anchored verb automatically to form the sentence pattern model. Only "adjectives", "adverbs", and "nouns (including user-defined words)" are left to the developer's discretion as to whether they are built into the model, so check boxes are provided for them.
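The split between always-included and optional categories can be illustrated with a short sketch. The tag names, the `build_pattern` function, and the choice to drop unchecked optional words are assumptions made for illustration; the source only states which categories are fixed and which are left to the developer's check boxes.

```python
from typing import List, Set, Tuple

# Categories the text says are always combined with the anchored verb
# (tag names are hypothetical English stand-ins).
ALWAYS_KEPT = {
    "NAMED_ENTITY", "TIME", "QUANTIFIER", "CLASSIFIER",
    "MEASURE", "LOCALIZER", "QUANT_VERB", "SENTENCE_MARK",
}
# Categories the developer may opt into through check boxes.
OPTIONAL = {"ADJECTIVE", "ADVERB", "NOUN"}


def build_pattern(tagged: List[Tuple[str, str]], checked: Set[str]) -> List[str]:
    """Build a sentence pattern from (tag, word) pairs.

    The verb keeps its word because it is the anchor; fixed categories are
    always kept; optional categories are kept only when checked (assumption).
    """
    pattern = []
    for tag, word in tagged:
        if tag == "VERB":
            pattern.append(f"VERB({word})")
        elif tag in ALWAYS_KEPT:
            pattern.append(tag)
        elif tag in OPTIONAL and tag in checked:
            pattern.append(tag)
    return pattern
```

For instance, calling `build_pattern` with an empty `checked` set keeps only the anchor verb and the fixed categories, while passing `{"NOUN"}` also keeps noun slots.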

Embodiment 2: as shown in [Figure 9] to [Figure 12], an example of building a model for suspects (persons of interest) in an "anti-money laundering (AML)" context:

B1. The example sentence creation unit of the system receives the manager's creation of an AML project scenario;

B2. Three intents belonging to the AML project are created under the AML project;

B3. Intent 01, an exception case, is created, as shown in [Figure 9];

B4. The input module 100 receives a plurality of texts for the exception case; the system passes these texts through the sentence analysis module 300 for word segmentation and part-of-speech tagging, sets the verb position as the anchor point, and translates the tagged sentences into sentence structures;

B4'. An entity vocabulary check screen is provided for the manager to check the names of persons in the exception case as parameters (optional step);

B5. The intent analysis module 400 binds the sentence structures and the AML project scenario to the exception case intent, encodes them into syntactic structure models, and stores them in an AML project model unit of the scenario database;

B6. The output conversion module 500 generates the downloadable program code from the model content of the AML project model unit;

B7. The program developer downloads the program code and stores the person names tagged by the sentence analysis module 300 in an exception case data set of the database module 200;

B8. The program developer writes the program onto the server.

Continuing Embodiment 2, another of the three intents created in step B2 proceeds as follows:

C3. Intent 02, an innocent person, is created, as shown in [Figure 10];

C4. The input module 100 receives a plurality of texts for the innocent person intent; the system passes these texts through the sentence analysis module 300 for word segmentation and part-of-speech tagging, sets the verb position as the anchor point, and translates the tagged sentences into sentence structures;

C4'. The entity vocabulary check screen is provided for the manager to check the names of innocent persons as parameters (optional step);

C5. The intent analysis module 400 binds the sentence structures and the AML project to the innocent person intent, encodes them into syntactic structure models, and stores them in the AML project model unit of the scenario database;

C6. The output conversion module 500 generates the downloadable program code from the model content of the AML project model unit;

C7. The program developer downloads the program code and stores the person names tagged by the sentence analysis module 300 in an innocent person data set of the database module 200;

C8. The program developer writes the program onto the server.

D3. Intent 03, money laundering suspicion, is created, as shown in [Figure 11];

D4. The input module 100 receives a plurality of texts for the money laundering suspicion intent; the system passes these texts through the sentence analysis module 300 for word segmentation and part-of-speech tagging, sets the verb position as the anchor point, and translates the tagged sentences into sentence structures;

D4'. The entity vocabulary check screen is provided for the manager to check the names of persons suspected of money laundering as parameters (optional step);

D5. The intent analysis module 400 binds the sentence structures and the AML project to the money laundering suspicion intent, encodes them into syntactic structure models, and stores them in the AML project model unit of the scenario database;

D6. The output conversion module 500 generates the downloadable program code from the model content of the AML project model unit;

D7. The program developer downloads the program code and stores the person names tagged by the sentence analysis module 300 in a money laundering suspect data set of the database module 200;

D8. The program developer writes the program onto the server.

Continuing Embodiment 2, the steps for applying the model to suspects in an "anti-money laundering (AML)" context are shown in [Figure 12]:

E1. A user connects to the AML project scenario from a terminal device and inputs a plurality of news articles through the input module 100, which are sent to the sentence analysis module 300;

E2. The sentence analysis module 300 performs word segmentation and part-of-speech tagging on the news articles, sets the verb position as the anchor point, and translates the tagged sentences into sentence structures; it then encodes the sentence structures together with the AML project and sends them to the intent analysis module 400;

E3. The intent analysis module 400 links to the AML project model unit of the scenario database according to the AML project, and then compares the sentence structures of the news sentences against the syntactic structure models in the AML project model unit, using the anchor point as the reference point for the sentence structure comparison; if one group of models among the syntactic structure models yields a matching comparison result, the intent bound to that group of models is defined as the expressed intent contained in the sentence;

E4. The expressed intent, together with the structures checked by the user, is passed to the application logic block for calculation;

E5. The sentence "1. Wang ○○ embezzled another two million in public funds..." in the news text matches a sentence structure under the money laundering suspicion intent; according to the checked items, the person name [Wang ○○] is output and stored in the money laundering suspect data set;

E6. The sentence "2. Prosecutor Li ○○ pointed out that the plaintiff..." in the news text matches a sentence structure under the exception case intent; according to the checked items, the person name [Li ○○] is output and stored in the exception case data set;

E7. The sentence "3. Lin ○○ was acquitted on appeal in the second instance..." in the news text matches a sentence structure under the innocent person intent; according to the checked items, the person name [Lin ○○] is output and stored in the innocent person data set.

In Embodiment 2 above, checking a person name as entity vocabulary tells the system that the person name is a variable parameter. In other words, the system stores "[Xie ○○] embezzled USD 1.2 million" as the sentence pattern "[someone] embezzled USD 1.2 million". That sentence pattern is then further translated into the following "sentence structure": [someone]-Verb(embezzle)-Currency. As a result, the next time the system encounters a sentence of the form [someone]-Verb(embezzle)-[some amount], it knows that this [someone] is a [money laundering suspect].
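The idea of treating the person name as a variable slot in the structure [someone]-Verb(embezzle)-Currency can be sketched with a regular expression. The angle-bracket tagged-string format, the English verb form, and the `extract_suspect` function are hypothetical choices made for this illustration; the source does not specify how the tagged sentence is serialized.

```python
import re
from typing import Optional

# Hypothetical serialization: each token is written as <TAG>text</TAG>.
# The person slot is a capture group (a variable parameter), the verb is
# fixed (the anchor), and the currency slot accepts any amount.
SUSPECT_PATTERN = re.compile(
    r"<PERSON>(?P<name>[^<]+)</PERSON>"
    r"<VERB>embezzled</VERB>"
    r"<CURRENCY>[^<]+</CURRENCY>"
)


def extract_suspect(tagged_sentence: str) -> Optional[str]:
    """Return the person name if the sentence fits the pattern, else None."""
    match = SUSPECT_PATTERN.search(tagged_sentence)
    return match.group("name") if match else None
```

Because only the verb is fixed, a sentence with a different person and a different amount still fits the pattern, while a sentence anchored on a different verb does not.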

Embodiment 3: the flow of an embodiment with an "addition event" and a "subtraction event" is as follows:

F1. A Math project scenario is created, an addition intent data set and a subtraction intent data set are created under it, and the syntactic structure models are then built according to steps S3~S8;

F2. A function event pool (Event Pool) defining the two events "addition" and "subtraction" is shown in [Figure 13]; each function contains mapping rules between "verb and sentence pattern combinations" and "mathematical operators and formulas"; text expressing a calculation intent, once matched against the events, yields the mathematical operation to be performed on that text; for example, "the younger brother has X oranges; give him Y more" yields X+Y (an "addition event"), and "the older brother has X pencils and lends Y to the younger brother" yields X-Y (a "subtraction event");

F3. The "addition event" is designed as an independent function; the intent analysis module 400 translates the event into a regular expression, in which the words checked as entity vocabulary (gift/give/purchase...) mark the sentences whose semantics express an "addition event";

F4. The "subtraction event" is designed as an independent function; the intent analysis module 400 translates the event into a regular expression, in which the words checked as entity vocabulary (eat/borrow/sell...) mark the sentences whose semantics express a "subtraction event";

F5. The system sends the sentences of a math problem to the Math project one at a time, and the sentence analysis module 300 processes each sentence into a string containing part-of-speech and named-entity tags (POS/NER);

F6. After the intent analysis module 400 has compared the string against the syntactic structure models under the Math project, it is translated into a question sentence pattern string that expresses an "addition event" or "subtraction event" intent;

F7. The functions in the event pool are retrieved according to the question sentence pattern string, the entity vocabulary, and the different sentence structures, and the problem can then be solved.
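A function event pool of the kind described in steps F2 to F4 might look like the sketch below. The dictionary layout, the English verb stand-ins, and the function names `find_event` and `apply_event` are assumptions for illustration; [Figure 13] is not reproduced in the text, so the actual structure may differ.

```python
import operator
from typing import Optional

# Each event pairs a set of trigger verbs with a mathematical operator.
# The English verbs are illustrative stand-ins for the Chinese verbs
# listed in steps F3 and F4.
EVENT_POOL = {
    "addition": {"verbs": {"gift", "give", "buy"}, "op": operator.add},
    "subtraction": {"verbs": {"eat", "lend", "sell"}, "op": operator.sub},
}


def find_event(verb: str) -> Optional[str]:
    """Return the name of the event triggered by the verb, or None."""
    for name, event in EVENT_POOL.items():
        if verb in event["verbs"]:
            return name
    return None


def apply_event(verb: str, x: int, y: int) -> int:
    """Apply the operator of the event triggered by the verb to X and Y."""
    name = find_event(verb)
    if name is None:
        raise ValueError(f"no event registered for verb {verb!r}")
    return EVENT_POOL[name]["op"](x, y)
```

With this layout, "has X, give him Y" resolves through the verb "give" to X+Y, and "has X, lends Y" resolves through "lend" to X-Y; adding a new event only requires a new dictionary entry.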

In Embodiment 3 above, take the problem "There are three apples on the table; Xiao Ming eats one; how many apples are left?" as an example. Following the flow described, the system sends the three sentences "There are three apples on the table", "Xiao Ming eats one", and "How many apples are left" in turn. After comparison against the Math project, "There are three apples on the table" returns the "Definition" intent (defining the context), and the computable lexical units in the sentence pattern are three arguments: "table (Possessor)", "apple (Entity)", and "three (Quantity) with its classifier 顆 (Classifier)", as shown in [Figure 14a].

In Embodiment 3 above, the second sentence, "Xiao Ming eats one", returns the "Calculation" intent, so the computable lexical units in the sentence pattern are the arguments "Xiao Ming (Possessor)", "apple (Entity)", and "one (Quantity) with its classifier 顆 (Classifier)". Because "eat" is a "subtraction event", "one" takes a minus sign and becomes "-1", as shown in [Figure 14b].

In Embodiment 3 above, the last sentence, "How many apples are left", returns the "Question" intent (the solution target) from the Math project. The target of the question is "(how many) 顆" and the entity asked about is "apple"; because the sentence mentions no possessor, that field is left empty, as shown in [Figure 14c]. Following the solution instruction, the entity asked about is "apple", so the calculation starts from the "Definition" records and obtains the initial definition "apples = 3"; it then moves to the calculation records and obtains "apples - 1". As there are no further calculation records, the result is 3-1=2, giving the final answer that 2 apples are left.

In Embodiment 3 above, if the question is not "How many apples are left" but "How many apples did Xiao Ming eat in total", the "Question" intent is as shown in [Figure 14d]. During the calculation, because the "Definition" data contain no record of "Xiao Ming holds apples", the definition stage is skipped and the calculation goes directly to the "Calculation" data, where it retrieves the record "the number of apples held by Xiao Ming is -1". Because "eat" is a "subtraction event", the retrieved value takes another minus sign, so "-(-1)" becomes "1". The final answer is that Xiao Ming ate 1 apple in total.
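The worked apple example can be traced with a small sketch that keeps "Definition" and "Calculation" records separately and answers a "Question" over them. The `Record` class, the `solve` function, and the rule of negating the total whenever the question's verb is a subtraction verb are assumptions made to reproduce the two answers described above; the source does not give the underlying data layout.

```python
from dataclasses import dataclass
from typing import List, Optional

SUBTRACTION_VERBS = {"eat"}  # illustrative stand-in for the subtraction event verbs


@dataclass
class Record:
    possessor: Optional[str]
    entity: str
    quantity: int  # already signed: a subtraction event stores a negative value


def solve(definitions: List[Record], calculations: List[Record], entity: str,
          possessor: Optional[str] = None, question_verb: Optional[str] = None) -> int:
    """Sum the matching definition and calculation records for the question.

    If the question names no possessor, every record for the entity counts.
    If the question's verb is a subtraction verb, the total is negated,
    mirroring the "-(-1) = 1" step in the text (assumed rule).
    """
    def quantities(records: List[Record]) -> List[int]:
        return [r.quantity for r in records
                if r.entity == entity
                and (possessor is None or r.possessor == possessor)]

    total = sum(quantities(definitions)) + sum(quantities(calculations))
    if question_verb in SUBTRACTION_VERBS:
        total = -total
    return total


# "There are three apples on the table" -> Definition record
definitions = [Record("table", "apple", 3)]
# "Xiao Ming eats one" -> Calculation record with a negative quantity
calculations = [Record("Xiao Ming", "apple", -1)]

remaining = solve(definitions, calculations, "apple")
eaten = solve(definitions, calculations, "apple",
              possessor="Xiao Ming", question_verb="eat")
```

Here `remaining` evaluates to 2 (3 plus -1) and `eaten` evaluates to 1, because no definition record names Xiao Ming and the single calculation record of -1 is negated.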

It should be understood that the above specific embodiments of the present invention serve only to illustrate or explain the principles of the present invention and do not limit it. Therefore, any modification, equivalent replacement, or improvement made without departing from the spirit and scope of the present invention shall fall within the scope of protection of the present invention. Furthermore, the appended claims are intended to cover all changes and modifications that fall within the scope and boundaries of the appended claims, or within equivalents of such scope and boundaries.

100: Input module

200: Database module

300: Sentence analysis module

400: Intent analysis module

500: Output conversion module

Claims

A method for analyzing the intention of a natural language dialogue, wherein the method for establishing a plurality of syntactic structure models is as follows:

S1. An example sentence creation unit of an input module of the system allows a manager to create an mth application scenario;

S2. Further, the manager creates, under the mth application scenario, a data set belonging to an nth intent;

S3. Further, under the nth intent, the manager creates a plurality of example sentences that fit the nth intent of the mth application scenario, one example per sentence pattern, and transmits them to a sentence analysis module;

S4. The sentence analysis module performs word segmentation and part-of-speech tagging on the example sentences, and further translates the tagged example sentences into a plurality of sentence structures;

S5. An intent analysis module binds the sentence structures and the mth application scenario to the nth intent, encodes them into the syntactic structure models, and stores them in an mth application scenario model unit of a scenario database;

S6. Repeat steps S2~S5 until all intents belonging to the mth application scenario have been established;

S7. An output conversion module of the system extracts the syntactic structure models of the mth application scenario model unit and converts them into a program code for output.

The method for analyzing the intention of a natural language dialogue as described in Claim 1, wherein step S4 further includes the sentence analysis module providing an entity vocabulary check screen according to the words and parts of speech of the example sentences.

The method for analyzing the intention of a natural language dialogue as described in Claim 1, wherein the input module further includes a custom dictionary unit, which can import the example sentences.

The method for analyzing the intention of a natural language dialogue as described in Claim 1, wherein the process continues after step S7 as follows:

S8. A user connects to a dialog page of the m-th application scene from a terminal device, inputs a dialog sentence in the input module of the terminal device, and sends it to the sentence analysis module;

S9. The sentence analysis module extracts the dialogue sentence from the dialogue page, performs word segmentation and part-of-speech tagging on the dialogue sentence, sets the verb position as the anchor point, translates the tagged dialogue sentence into the sentence structure, encodes the sentence structure together with the mth application scenario, and sends them to the intent analysis module;

S10. The intent analysis module links to the mth application scenario model unit of the scenario database according to the mth application scenario, and compares the sentence structure of the dialogue sentence against the syntactic structure models in the mth application scenario model unit, using the anchor point as the reference point for the sentence structure comparison;

S11. If the dialogue sentence matches a group of models among the syntactic structure models, the nth intent bound to that group of models is defined as an expressed intent contained in the dialogue sentence.

The method for analyzing the intention of a natural language dialogue as described in Claim 1, wherein the sentence structure sets the position of the verb in the sentence as an anchor point, which is used as a reference point for sentence structure comparison.

The method for analyzing the intent of a natural language dialogue as described in Claim 2, wherein the sentence analysis module translates the dialogue sentence into the sentence structure according to the vocabulary, part of speech and the entity vocabulary.

The method for analyzing the intention of a natural language dialogue as described in Claim 1 or Claim 4, wherein the input module can also accept input in text form.

The method for analyzing the intention of a natural language dialogue as described in Claim 4, wherein the expressed intent can be sent to logic blocks of multiple applications.

The method for analyzing the intention of a natural language dialogue as described in Claim 4, wherein the expressed intent can be connected to a function event pool to perform mathematical operations.

The method for analyzing the intention of a natural language dialogue as described in Claim 3, wherein the expressed intent can be used for text classification, and can further be used for person name extraction or other information extraction.