TWI745878B - Chat robot system and chat robot model training method - Google Patents

Chat robot system and chat robot model training method

Info

Publication number: TWI745878B
Authority: TW (Taiwan)
Prior art keywords: training, corpus, model, entity, answer
Application number: TW109107231A
Other languages: Chinese (zh)
Other versions: TW202135045A (en)
Inventor: 賴彥佐
Original Assignee: Acer Incorporated (宏碁股份有限公司)
Application filed by Acer Incorporated; priority to TW109107231A
Publication of TW202135045A (en)
Application granted; publication of TWI745878B (en)

Landscapes

  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Electrically Operated Instructional Devices (AREA)

Abstract

A chat robot model training method includes: recording, in a knowledge base, the association between a combination of an intent and an entity and an answer; analyzing the intent and entity in a plurality of training corpora; using the generalization ability of a machine learning model to separate a plurality of synonyms in the training corpora and removing those synonyms to obtain a corpus set; inputting the corpus set into a neural network model to obtain a training model and testing the training model to obtain a test result; and analyzing the test result according to a predefined rule to generate an analysis result, and generating a new version of the training corpora according to the analysis result.

Description

Chat robot system and chat robot model training method

The invention relates to an interactive robot system, and in particular to a chat robot system and a chat robot model training method.

When a user browses the web, some webpages can open an interactive window in which the user enters a message, and the window displays a reply based on that message. A user can also submit a question, in text or voice, to a robot with a user interface, and the robot responds to the question through that interface.

Traditional web servers or robots require a large amount of training corpus for machine learning; the corresponding reply can be found only when the user's input matches the training corpus.

However, the same user intent (for example, turning on a light) can be expressed in many ways ("I want to turn on the light", "it feels dark", "turn on the light for me", and so on), and traditional web servers or robots cannot enumerate every phrasing a user might use as training corpus. Even if a large training corpus is adopted, the system's computational load may rise, requiring hardware upgrades or more time. How to efficiently generate a training model for a chat robot has therefore become one of the problems to be solved in this field.

To solve the above problems, one aspect of the present disclosure provides a chat robot system. The chat robot system includes a storage device and a processor. The storage device stores a knowledge base that records the association between a combination of an intent and an entity and an answer. The processor analyzes the intent and entity in a plurality of training corpora, uses the generalization ability of a machine learning model to separate a plurality of synonyms in the training corpora and removes them to obtain a corpus set, inputs the corpus set into a neural network model to obtain a training model, and tests the training model to obtain a test result. The processor then analyzes the test result according to a predefined rule to generate an analysis result, and generates a new version of the training corpora according to the analysis result.

Another aspect of the present invention provides a chat robot model training method. The method includes: recording, in a knowledge base, the association between a combination of an intent and an entity and an answer; analyzing the intent and entity in a plurality of training corpora; using the generalization ability of a machine learning model to separate a plurality of synonyms in the training corpora and removing them to obtain a corpus set; inputting the corpus set into a neural network model to obtain a training model and testing the training model to obtain a test result; and analyzing the test result according to a predefined rule to generate an analysis result, and generating a new version of the training corpora according to the analysis result.

The chat robot system and chat robot model training method of the present invention remove synonyms from the training corpora through the generalization ability of a machine learning model, reducing the size of the corpus set fed to the neural network model and thereby reducing computation and saving time. By analyzing the training model, a new version of the training corpora can be generated from the analysis result and fed back into the neural network model, optimizing the training model.

The following description is a preferred implementation of the invention; its purpose is to describe the basic spirit of the invention, not to limit it. The actual scope of the invention is defined by the claims that follow.

It must be understood that words such as "comprise" and "include" used in this specification indicate the existence of specific technical features, values, method steps, operations, elements, and/or components, but do not exclude the addition of further technical features, values, method steps, operations, elements, components, or any combination of the above.

Words such as "first", "second", and "third" in the claims modify claim elements; they do not indicate priority, precedence, that one element comes before another, or a chronological order of method steps, but only distinguish elements with the same name.

Please refer to FIGS. 1 and 2. FIG. 1 is a block diagram of a chat robot system 100 according to an embodiment of the present invention. FIG. 2 is a flowchart of a chat robot model training method 200 according to an embodiment of the present invention.

In one embodiment, as shown in FIG. 1, the chat robot system 100 includes a storage device 10 and a processor 20. In one embodiment, the storage device 10 is coupled to the processor 20.

In one embodiment, the chat robot system 100 may be a notebook computer, desktop computer, tablet, mobile phone, or other electronic device. In one embodiment, the chat robot system 100 includes a user interface and a screen; the user enters messages through the user interface, and the screen displays the user's input and the reply output by the processor 20. For example, after a user opens a webpage on a notebook computer and the webpage connects to the chat robot system 100, a message window opens to receive the user's input. If the user enters "where is the repair center in Tokyo?", the chat robot system 100 finds the answer by executing the chat robot model training method 200 and replies in the message window with the address of the repair center in Tokyo.

In one embodiment, the storage device 10 may be implemented as read-only memory, flash memory, a floppy disk, hard disk, optical disc, flash drive, magnetic tape, a database accessible over a network, or any storage medium with the same function that those skilled in the art can readily conceive of.

In one embodiment, the processor 20 may be implemented by an integrated circuit such as a micro controller, a microprocessor, a digital signal processor, an application-specific integrated circuit (ASIC), or a logic circuit.

Please refer to FIG. 2; the chat robot model training method 200 is described below.

In step 210, the storage device 10 stores a knowledge base that records the association between a combination of an intent (question type) and an entity and an answer.

In one embodiment, sentences entered by the user may correspond to the same or different intents. An intent is the operation the user wants to perform or the type of question; sentences can be classified into intents by the chat robot system 100 or manually. Examples of sentences and intents are shown in Table 1.

    Sentence                                         Intent
    hi                                               Greeting
    hello                                            Greeting
    hi there                                         Greeting
    where is the repair center in Tokyo              Find Repair Center
    I want to find a repair center in Taipei city    Find Repair Center
    i want to go to the acer repair center           Find Repair Center
    why my wifi is not working                       Troubleshooting
    I am not able to use my wifi                     Troubleshooting
    my wifi doesn't work                             Troubleshooting
    Launch the system backup                         Use application
    Use the system backup                            Use application
    Please open the Acer Recovery                    Use application

    Table 1

The chat robot system 100 can map multiple sentences into a high-dimensional vector space using neural algorithms, such as a word embedding algorithm or an intent classification algorithm, compute the distances between them, and finally classify the questions: sentences that lie close together are treated as the same intent. However, the invention is not limited to this; any method capable of classifying sentence intent may be adopted. By labeling the intent of each sentence, the chat robot system 100 can learn the meaning behind each sentence the user enters.
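The distance-based intent grouping described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: it substitutes a bag-of-words count vector for a learned word embedding and a nearest-neighbor lookup for the clustering step (both assumptions, since the patent does not fix a particular algorithm).

```python
from collections import Counter
from math import sqrt

def vectorize(sentence):
    # Bag-of-words counts as a stand-in for a learned word embedding.
    return Counter(sentence.lower().split())

def cosine(a, b):
    # Cosine similarity between two sparse count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = sqrt(sum(v * v for v in a.values()))
    nb = sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def classify_intent(sentence, labeled):
    # Assign the intent of the closest labeled sentence (nearest neighbor).
    vec = vectorize(sentence)
    best = max(labeled, key=lambda pair: cosine(vec, vectorize(pair[0])))
    return best[1]

labeled = [
    ("hi there", "Greeting"),
    ("where is the repair center in Tokyo", "Find Repair Center"),
    ("why my wifi is not working", "Troubleshooting"),
    ("Launch the system backup", "Use application"),
]

print(classify_intent("I want to find a repair center in Taipei city", labeled))
# → Find Repair Center
```

The new Taipei sentence shares "repair center in" with the labeled Tokyo sentence, so it lands in the same intent even though it never appeared verbatim.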

In one embodiment, the chat robot system 100 also needs the important words in a sentence, called entities (or keywords). For example, in the sentences "where is the repair center in Tokyo" and "where is the repair center in Taipei", the words "Tokyo" and "Taipei" are entities. Because the same intent may involve different entities, the chat robot system 100 learns to recognize the entities in a sentence.

In one embodiment, the chat robot system 100 may apply named entity recognition (NER) to identify entities. The algorithm used by this technique is a conditional random field (CRF), a discriminative probabilistic model commonly used to label or analyze sequence data such as natural language text, predicting the tags in a sentence. In one embodiment, the chat robot system 100 can determine an entity's tag from the sentence's context; for example, the word after the preposition "in" is usually a place, so the "Tokyo" following "in" is tagged as a location, and the location "Tokyo" is treated as an entity.
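The context rule above (the word after the preposition "in" is usually a place) can be sketched with a simple heuristic. This is an illustrative stand-in for the CRF tagger, which the patent names but does not detail:

```python
def tag_location(sentence):
    # Tag the token following "in" as a location entity,
    # per the context rule described in the text.
    tokens = sentence.split()
    for i, tok in enumerate(tokens[:-1]):
        if tok.lower() == "in":
            return tokens[i + 1]
    return None  # no "in <word>" pattern found

print(tag_location("where is the repair center in Tokyo"))  # → Tokyo
```

A real CRF would weigh many such context features jointly rather than apply one hard rule.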

With the above method, when the user enters "where is the repair center in Tokyo", the chat robot system 100 understands the semantics and reduces the sentence to the intent "find repair center" and the entity "Tokyo". The knowledge base can then record the association between the intent (for example, "find repair center") and the entity (for example, "Tokyo") and an answer (for example, "shinjuku" (Shinjuku)). In other words, the knowledge base treats the intent and entity as a set of indexes through which the answer can be associated (or looked up).
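The index role of the knowledge base can be illustrated as a plain mapping keyed by the (intent, entity) pair. The Taipei entry is an assumed example, not from the patent:

```python
# Knowledge base: (intent, entity) is the index, the answer is the value.
knowledge_base = {
    ("find repair center", "Tokyo"): "shinjuku",
    ("find repair center", "Taipei"): "Xinyi district",  # assumed example
}

def answer(intent, entity):
    # Look up the answer associated with the (intent, entity) index.
    return knowledge_base.get((intent, entity))

print(answer("find repair center", "Tokyo"))  # → shinjuku
```

A pair that has no recorded association, such as ("find repair center", "Beijing"), returns None, which is exactly the missing-answer situation analyzed later in step 450.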

In step 220, the processor 20 analyzes the intents and entities in a plurality of training corpora.

In one embodiment, the sentences in Table 1 can be regarded as training corpora, and the processor 20 analyzes the intent of each sentence through a word embedding algorithm or an intent classification algorithm.

In one embodiment, the processor 20 identifies entities through named entity recognition.

In step 230, the processor 20 uses the generalization ability of a machine learning model to separate a plurality of synonyms in the training corpora, and removes those synonyms from the training corpora to obtain a corpus set.

In one embodiment, the synonyms may be defined in advance by the user.

FIG. 3 is a schematic diagram of a chat robot model training method according to an embodiment of the present invention. Referring to FIG. 3, in the pre-processing stage 310, the processor 20 removes the synonyms 32 from the training corpus 30 to produce the corpus set 34 as data for subsequent training.

In one embodiment, the machine learning model may be implemented by logistic regression, linear regression, a nearest-neighbor algorithm, K-means clustering, a neural network, and so on.

In one embodiment, generalization refers to the machine learning model's ability to adapt to fresh samples (for example, the training corpus 30). The purpose of learning is to capture the regularities hidden behind the data; if a trained network can also give appropriate outputs for data outside the training corpus 30 that follow the same regularities, this ability is called generalization.

For example, in the sentences "where is the repair center in Tokyo" and "where is the repair center in Taipei", "Tokyo" and "Taipei" are entities, that is, important keywords tagged as locations. When a new piece of data such as "where is the repair center in Beijing" is entered, even though this data has never appeared before, the generalization ability of the machine learning model can still determine from the regularities (such as the sentence pattern, prepositions, or other rules) that "Beijing" is an entity tagged as a location.

Thus, even without a large training corpus 30, the generalization ability of the machine learning model can determine the entities in new data and their tags.

Because a good machine learning model can generalize to a certain degree, it can recognize certain words even without specific training on them; the generalization ability of machine learning can therefore be used to separate the synonyms 32 from the training corpus 30.

The synonyms 32 are, for example, "wifi", "wi-fi", and "wireless"; when these three words appear in a sentence, they represent the same meaning to a very high degree. Therefore, "wifi" can be kept as the keyword, while "wi-fi" and "wireless" are removed from the training corpus 30. The training corpus 30 with the synonyms 32 removed (only one keyword kept) is called the corpus set 34. In one embodiment, "wifi", "wi-fi", and "wireless" can be called a synonym set.

By removing the redundant synonyms 32, the data volume of the corpus set 34 drops, greatly reducing the computation needed to subsequently train on the corpus set 34.
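The synonym-removal pre-processing of stage 310 can be sketched as follows, assuming the user-defined synonym set {"wifi", "wi-fi", "wireless"} from the text, with "wifi" as the kept keyword. The corpus sentences are the Troubleshooting examples from Table 1, reworded with the variants:

```python
# Each kept keyword maps to the synonym variants that should collapse into it.
SYNONYM_SETS = {"wifi": {"wi-fi", "wireless"}}

def normalize(sentence):
    # Replace every synonym variant with its kept keyword.
    out = []
    for tok in sentence.lower().split():
        for keep, variants in SYNONYM_SETS.items():
            if tok in variants:
                tok = keep
        out.append(tok)
    return " ".join(out)

corpus = ["why my wi-fi is not working", "my wireless doesn't work"]
corpus_set = [normalize(s) for s in corpus]
print(corpus_set)
# → ['why my wifi is not working', "my wifi doesn't work"]
```

After normalization, only the single keyword "wifi" remains to be trained on, which is the data reduction the text describes.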

In step 240, the processor 20 inputs the corpus set 34 into a neural network model 36 to obtain a training model 38, and tests the training model 38 to obtain a test result 44.

In one embodiment, as shown in FIG. 3, in the training stage 320, the corpus set is input into the neural network model 36, which sequentially applies a convolution layer, ReLU layer, convolution layer, ReLU layer, pooling layer, ReLU layer, convolution layer, ReLU layer, pooling layer, and fully connected layer. However, the layers of the neural network model 36 can be adjusted per implementation and are not limited to this. After this series of operations, the neural network model 36 outputs a training model 38.

In one embodiment, the intents and entities of the test data 40 are known, as are the results corresponding to each intent and entity. In the testing stage 330, the test data 40 is input into the training model 38, which outputs the test result 44.

In one embodiment, the processor 20 inputs the test data 40 and the synonyms 32 into the training model 38 to test whether the training model 38 can correctly determine the synonyms 32 and obtain the answers corresponding to them; the training model 38 outputs the test result 44.

The processor 20 then analyzes the test result 44 to verify the accuracy of the training model 38.

In step 250, the processor 20 analyzes the test result 44 according to a predefined rule 46 to generate an analysis result 48, and generates a new version of the training corpus 50 according to the analysis result 48.

In one embodiment, as shown in FIG. 3, in the analysis stage 340, the processor 20 inputs the test result 44 into the analysis module 45, which may be implemented by a circuit, firmware, or software and stores the predefined rule 46. After the test result 44 passes through a series of judgments under the predefined rule 46, the analysis module 45 outputs the analysis result 48 and generates the new version of the training corpus 50 according to it.

In one embodiment, please refer to FIG. 4, a flowchart of a method 400 for analyzing the test result 44 by applying the predefined rule 46 according to an embodiment of the present invention.

In step 410, the processor 20 reads the test result 44.

In one embodiment, the test result 44 contains a test-one result and a test-two result. The predefined rule 46 defines test one as comparing whether the predicted result equals the expected label, and test two as using the intent and entity output by the training model 38 to look up the answer in the knowledge base. In one embodiment, the test result 44 is shown in Table 2.

    Sentence                                         Test one   Test two
    where is the repair center in Tokyo              PASS       PASS
    I want to find a repair center in Taipei city    PASS       FAIL
    why my wifi is not working                       FAIL       FAIL

    Table 2

A passed test is marked "PASS"; a detected error is marked "FAIL".

In one embodiment, if both the test-one result and the test-two result pass, the test result 44 contains no errors and no further steps are needed. If either of the two results contains an error, the method proceeds to step 420.

In step 420, the processor 20 analyzes the error type using the predefined rule 46.

In one embodiment, the error type includes a missing-synonym type, a missing-answer type, or another error type.

In one embodiment, if the predefined rule 46 determines that both the test-one result and the test-two result are errors, the training model 38 cannot recognize a predefined word, the word is not in the synonym set, and the intent and entity labeled by the training model 38 cannot find an associated answer in the knowledge base. The predefined rule 46 classifies this error as the missing-synonym type. For example, for the sentence "why my wifi is not working" in Table 2, both test one and test two are errors, meaning the training model 38 may be unable to recognize "wifi"; the word is also not in the synonym set, so the associated answer cannot be found in the knowledge base.

In one embodiment, if the predefined rule 46 determines that the test-one result passes but the test-two result is an error, the training model 38 can correctly identify the sentence's intent and entity but cannot use them as an index to look up the associated answer in the knowledge base. The predefined rule 46 classifies this error as the missing-answer type. For example, for the sentence "I want to find a repair center in Taipei city" in Table 2, test one passes and test two fails, meaning the training model 38 may be unable to use the intent and entity as an index to find the associated answer in the knowledge base.

In one embodiment, the other error types refer to unknown errors, such as insufficient neurons or layers in the neural network, or a fault in the algorithm, which require a data engineer's assistance to adjust.

The error types the processor 20 identifies with the above method can be regarded as the analysis result 48, and subsequent steps perform different operations for different error types, for example generating the new version of the training corpus 50 from the analysis result 48 and the corpus set 34.

In step 430, the processor 20 determines whether the error type is the missing-synonym type or the missing-answer type. When the processor 20 determines the error type to be the missing-synonym type, the method proceeds to step 440; when it determines the error type to be the missing-answer type, the method proceeds to step 450. When the processor 20 determines the error type to be another error type, the flow ends.
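The branching of steps 420~450 can be sketched as a single decision function. The handling of a test-one failure paired with a test-two pass is an assumption, since the patent does not describe that case; here it is grouped with the other error type:

```python
def error_type(test_one_result, test_two_result):
    # Both tests fail: the word is unrecognized and unindexed (step 440 path).
    if test_one_result == "FAIL" and test_two_result == "FAIL":
        return "missing synonym"
    # Label correct but no answer indexed (step 450 path).
    if test_one_result == "PASS" and test_two_result == "FAIL":
        return "missing answer"
    # Both pass: no error, no further steps needed.
    if test_one_result == "PASS" and test_two_result == "PASS":
        return None
    # Any remaining combination is treated as an unknown error (assumption).
    return "other error"

print(error_type("FAIL", "FAIL"))  # → missing synonym
print(error_type("PASS", "FAIL"))  # → missing answer
```

The returned string plays the role of the analysis result 48, which decides whether the corpus set or the knowledge base gets updated.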

In step 440, when the processor 20 finds that at least one error type in the test result 44 is the missing-synonym type, meaning a specific synonym (for example, "wifi") could not be analyzed by the training model 38, the processor 20 adds that specific synonym to the corpus set 34 to update it, and inputs the updated corpus set 34 (i.e., the new version of the training corpus 50) into the neural network model 36 to produce a new version of the training model.

In this way, the chat robot system 100 adds unrecognized synonyms to the corpus set 34 to prevent the error from recurring. By repeating stages 320~340 multiple times, the training model can be continuously optimized.

In step 450, when the processor 20 finds that the intent and entity in the test result 44 are judged correctly but at least one error type is the missing-answer type, the association between the combination of the intent (for example, fixing a broken "wifi") and the entity (for example, "wifi") and the answer (for example, restarting) has not been established. The processor 20 then uses the entity and intent as an index to look up the answer associated in the knowledge base (for example, the entity "wifi" is synonymous with "wireless", so the answer can be found using "wireless" and the intent as the index), and associates the entity and intent with the answer to update the knowledge base.
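Step 450 can be sketched as follows, assuming a synonym table that maps "wifi" to the "wireless" entry already indexed in the knowledge base; the answer text is illustrative, not from the patent:

```python
# Existing association, indexed under the synonymous entity "wireless".
knowledge_base = {("troubleshooting", "wireless"): "restart the device"}
synonyms = {"wifi": "wireless"}  # entity -> synonymous entity already indexed

def repair_missing_answer(intent, entity):
    # Direct hit: the association already exists.
    if (intent, entity) in knowledge_base:
        return knowledge_base[(intent, entity)]
    # Missing-answer case: look up via a synonymous entity, then
    # record the new (intent, entity) -> answer association.
    alias = synonyms.get(entity)
    if alias and (intent, alias) in knowledge_base:
        found = knowledge_base[(intent, alias)]
        knowledge_base[(intent, entity)] = found  # update the knowledge base
        return found
    return None

print(repair_missing_answer("troubleshooting", "wifi"))  # → restart the device
```

After the call, ("troubleshooting", "wifi") is indexed directly, so the same query no longer produces a missing-answer error.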

The chat robot system and chat robot model training method of the present invention remove synonyms from the training corpora through the generalization ability of a machine learning model, reducing the size of the corpus set fed to the neural network model and thereby reducing computation and saving time. By analyzing the training model, a new version of the training corpora can be generated from the analysis result and fed back into the neural network model, optimizing the training model.

The method of the present invention, or particular forms or portions thereof, may exist in the form of program code. The program code may be contained in physical media, such as floppy disks, optical discs, hard disks, or any other machine-readable (e.g., computer-readable) storage media, or in computer program products not limited to an external form, wherein, when the program code is loaded and executed by a machine such as a computer, the machine becomes a device for practicing the invention. The program code may also be transmitted through transmission media such as wires or cables, optical fiber, or any transmission form, wherein, when the program code is received, loaded, and executed by a machine such as a computer, the machine becomes a device for practicing the invention. When implemented on a general-purpose processing unit, the program code combined with the processing unit provides a unique device that operates like an application-specific logic circuit.

Although the present invention has been disclosed above by way of embodiments, they are not intended to limit the invention. Anyone skilled in the art may make various changes and modifications without departing from the spirit and scope of the invention; the scope of protection of the invention shall therefore be defined by the appended claims.

100: chatbot system; 10: storage device; 20: processor; 200: chatbot model training method; 210~250, 410~450: steps; 310: pre-processing stage; 30: training corpus; 32: synonyms; 34: corpus set; 320: training stage; 330: testing stage; 340: analysis stage; 36: neural network model; 38: training model; 40: test data; 44: test result; 45: analysis module; 46: predefined rules; 48: analysis result; 50: new version of training corpus; 400: method of applying predefined rules to analyze test results

FIG. 1 is a block diagram of a chatbot system according to an embodiment of the present invention. FIG. 2 is a flowchart of a chatbot model training method according to an embodiment of the present invention. FIG. 3 is a schematic diagram of a chatbot model training method according to an embodiment of the present invention. FIG. 4 is a flowchart of a method for analyzing test results by applying predefined rules according to an embodiment of the present invention.


Claims (6)

1. A chatbot system, comprising: a storage device configured to store a knowledge base, the knowledge base recording an association between a combination of an intent and an entity and an answer; and a processor configured to analyze the intent and the entity in a plurality of training corpora, use a generalization ability of a machine learning model to separate a plurality of synonyms in the training corpora and remove the synonyms from the training corpora to obtain a corpus set, input the corpus set into a neural network model to obtain a training model, and test the training model to obtain a test result; wherein the processor analyzes the test result according to a predefined rule to produce an analysis result, and generates a new version of the training corpus according to the analysis result; wherein the processor applies the predefined rule to analyze at least one error type in the test result, the at least one error type being regarded as the analysis result and comprising a lack-of-synonym type, a lack-of-answer type, or another error type; and wherein, when the processor determines that the at least one error type in the test result is the lack-of-synonym type, indicating that a specific synonym cannot be recognized by the training model, the processor adds the specific synonym to the corpus set to update the corpus set, and inputs the updated corpus set into the neural network model to produce the new version of the training model.

2. The chatbot system of claim 1, wherein, when the processor determines that the intent and the entity in the test result were identified correctly but the at least one error type is the lack-of-answer type, indicating that the association between the combination of the intent and the entity and the answer has not been established, the processor uses the entity and the intent as an index to look up the answer associated in the knowledge base, and associates the entity and the intent with the answer to update the knowledge base.

3. The chatbot system of claim 1, wherein the processor is further configured to input a plurality of test data and the synonyms into the training model to obtain the test result, and to analyze the test result to verify the training model.
4. A chatbot model training method, comprising: recording, in a knowledge base, an association between a combination of an intent and an entity and an answer; analyzing the intent and the entity in a plurality of training corpora; using a generalization ability of a machine learning model to separate a plurality of synonyms in the training corpora, and removing the synonyms from the training corpora to obtain a corpus set; inputting the corpus set into a neural network model to obtain a training model, and testing the training model to obtain a test result; analyzing the test result according to a predefined rule to produce an analysis result, and generating a new version of the training corpus according to the analysis result; applying the predefined rule to analyze at least one error type in the test result, the at least one error type being regarded as the analysis result and comprising a lack-of-synonym type, a lack-of-answer type, or another error type; and, when the at least one error type in the test result is the lack-of-synonym type, indicating that a specific synonym cannot be recognized by the training model, adding the specific synonym to the corpus set to update the corpus set, and inputting the updated corpus set into the neural network model to produce the new version of the training corpus.
5. The chatbot model training method of claim 4, further comprising: when the intent and the entity in the verification result are determined to be correct but the at least one error type is the lack-of-answer type, indicating that the association between the combination of the intent and the entity and the answer has not been established, using the entity and the intent as an index to look up the answer associated in the knowledge base, and associating the entity and the intent with the answer to update the knowledge base.

6. The chatbot model training method of claim 4, further comprising: inputting a plurality of test data and the synonyms into the training model to obtain the test result, and analyzing the test result to verify the training model.
TW109107231A 2020-03-05 2020-03-05 Chat robot system and chat robot model training method TWI745878B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW109107231A TWI745878B (en) 2020-03-05 2020-03-05 Chat robot system and chat robot model training method


Publications (2)

Publication Number Publication Date
TW202135045A TW202135045A (en) 2021-09-16
TWI745878B true TWI745878B (en) 2021-11-11

Family

ID=78777487

Family Applications (1)

Application Number Title Priority Date Filing Date
TW109107231A TWI745878B (en) 2020-03-05 2020-03-05 Chat robot system and chat robot model training method

Country Status (1)

Country Link
TW (1) TWI745878B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110301441A1 (en) * 2007-01-05 2011-12-08 Myskin, Inc. Analytic methods of tissue evaluation
CN103000052A (en) * 2011-09-16 2013-03-27 上海先先信息科技有限公司 Man-machine interactive spoken dialogue system and realizing method thereof
CN105393263A (en) * 2013-07-12 2016-03-09 微软技术许可有限责任公司 Feature completion in computer-human interactive learning
TW201741948A (en) * 2016-03-30 2017-12-01 Alibaba Group Services Ltd Resume assessment method and apparatus
CN109101217A (en) * 2013-03-15 2018-12-28 先进元素科技公司 Method and system for purposefully calculating


