TWI803093B - Semantic understanding system for rich-text, method and computer readable medium thereof - Google Patents


Info

Publication number
TWI803093B
TWI803093B
Authority
TW
Taiwan
Prior art keywords
sentence
text
named entity
vocabulary
similar
Prior art date
Application number
TW110146063A
Other languages
Chinese (zh)
Other versions
TW202324381A (en)
Inventor
黃至德
楊富丞
Original Assignee
中華電信股份有限公司
Priority date
Filing date
Publication date
Application filed by 中華電信股份有限公司
Priority to TW110146063A
Application granted
Publication of TWI803093B
Publication of TW202324381A


Abstract

The present invention provides a semantic understanding system and method for rich text. A potential named entity recognition module integrates multiple named entity recognition methods to find words that may be named entities. A similar sentence generation module combines non-overlapping words into similar sentences according to the sentence patterns supported by the system and the order in which the potential named entities appear in the sentence. A sentence intent selection module then picks the closest of these similar sentences to estimate the sentence intent and select a combination of named entities. The invention thereby avoids recognition failures caused by missing characters or typos in a sentence, improving the quality and precision of voice-control services. A computer-readable medium storing instructions for executing the method is also provided.

Description

Semantic understanding system and method for processing rich text, and computer-readable medium thereof

The present invention relates to voice-control semantic recognition technology, and more particularly to a semantic understanding system and method for processing rich text, and a computer-readable medium thereof.

Voice-control services are widely used in scenarios such as home-appliance control, streaming-media playback, and event booking, bringing convenience to daily life. As more and more voice-control services appear, accurately identifying user intent becomes increasingly important. Users often speak without deliberation, producing terse utterances that may lack a subject, verb, or object and thus cause recognition errors. Users may also forget long program titles and omit words: for the program 「經典101說給孩子聽的世界文學名著」, a user may utter only the fragment 「說給孩子聽的文學名著」, which the system cannot identify accurately. Program titles may also be homophonic, as in 「股癌」, 「談股論金」, and 「讀角戲」, so speech recognition may substitute homophonic characters into the sentence. Missing characters and typos therefore pose significant challenges to the natural-language-processing quality of voice-control services.

Furthermore, conventional natural-language-processing systems such as Dialogflow rely on fuzzy matching. When a program title in an utterance is missing characters, for example 「我要聽職涯」, where the title is missing 「99」, Dialogflow may mistake the verb 「聽」 for part of the title and wrongly trigger the program 「職享聽你說」; statistically, about ten percent of utterances with missing title characters lead to playback errors. When a title contains a homophone, for example 「我要聽Emily爆報」, the uncorrected homophone 「爆」 prevents the program 「Emily報報」 from playing; such cases likewise account for about ten percent. In addition, roughly another ten percent of programs cannot be recognized at all, whether due to missing characters or typos, and the system must ask the user again, as with 「我要聽大聯盟最新一集」 missing 「Hito」, or 「我要聽三金秀」 containing the homophone 「斤」. Conventional natural-language-processing systems thus show an error rate of about twenty percent when program titles contain missing or homophonic characters, leaving clear room for improvement.

In view of the above problems, it is a goal eagerly pursued by those skilled in the art to improve the voice-control recognition rate, so that sentence intent can still be accurately estimated and the correct voice-control service provided even when utterances are overly terse, words are omitted from long program titles, or homophonic titles introduce typos.

To solve the above problems of the prior art, the present invention discloses a semantic understanding system for processing rich text, comprising: a potential named entity recognition module, which has a plurality of named entity recognizers of different types and receives a text sentence converted from speech, each named entity recognizer computing and scoring the longest common subsymbol sequence between the text sentence and pre-stored vocabulary so as to select the highest-scoring words meeting a first threshold as potential named entities; a similar sentence generation module, which combines the words corresponding to the potential named entities without repetition to generate similar sentences and scores them, so as to select the highest-scoring similar sentences meeting a second threshold as approximate sentences; and a sentence intent selection module, which, according to the number of approximate sentences, provides parameters for triggering a corresponding service, generates a corresponding reply message, or generates an interactive message.

The present invention further discloses a semantic understanding method for processing rich text, executed by a computer device and comprising the following steps: having a potential named entity recognition module receive a text sentence converted from speech and use named entity recognizers of different types to compute and score the longest common subsymbol sequence between the text sentence and pre-stored vocabulary, so as to select the highest-scoring words meeting a first threshold as potential named entities; having a similar sentence generation module combine the words corresponding to the potential named entities without repetition to generate similar sentences and score them, so as to select the highest-scoring similar sentences meeting a second threshold as approximate sentences; and having a sentence intent selection module, according to the number of approximate sentences, provide parameters for triggering a corresponding service, generate a corresponding reply message, or generate an interactive message.

In the aforementioned system and method, when the named entity recognizer is a character named entity recognizer, it takes one Chinese character or one English word as a symbol unit and computes the matching rate between the symbols of the text sentence and the symbols of the vocabulary to obtain the longest common subsymbol sequence.

In the aforementioned system and method, when the named entity recognizer is a phonetic named entity recognizer, it takes the toneless Zhuyin (phonetic) symbols of one Chinese character or the stem of one English word as a symbol unit and computes the matching rate between the symbols of the text sentence and the symbols of the vocabulary to obtain the longest common subsymbol sequence.

In the aforementioned system and method, the potential named entity recognition module finds the longest common subsymbol sequence between the text sentence and the vocabulary using a sliding window, and the scoring compares the symbols of the text sentence with the symbols of the vocabulary, adding points for correct symbols and deducting points for incorrect ones, so that the highest-scoring words meeting the first threshold serve as the potential named entities.

In another embodiment, after finding a potential named entity, the potential named entity recognition module masks the common subsequence in the text sentence and then repeats the same procedure to find other potential named entities in the text sentence, until no further potential named entities are found.

In the aforementioned system and method, the similar sentence generation module combines non-overlapping words according to the sentence patterns supported by the system and the order of appearance of the potential named entities to generate the similar sentences, then computes the longest common subsymbol sequence between each similar sentence and the text sentence and scores it by comparing their symbols, adding points for correct symbols and deducting points for incorrect ones, so that the highest-scoring similar sentences meeting the second threshold serve as the approximate sentences.

In the aforementioned system and method, when there is exactly one approximate sentence, the sentence intent selection module generates the parameters that trigger the corresponding service through a back-end parameter delivery module; when there are zero or multiple approximate sentences, it generates the reply message or the interactive message through a back-end reply sentence generation module.

In the aforementioned system and method, the system further comprises a voice-control sentence pattern construction module, which, before semantic understanding is performed, marks out important vocabulary combinations from a plurality of training sentences while ignoring unimportant words, so as to construct the pre-stored vocabulary.

The present invention further discloses a computer-readable medium for use in a computing device or computer, storing instructions for executing the aforementioned semantic understanding method for processing rich text.

As described above, the semantic understanding system and method for processing rich text of the present invention fuse the results of several named entity (NE) recognition methods, covering a variety of situations synergistically and effectively reducing the named-entity error rate. Specifically, the present invention constructs the sentence patterns supported by the voice-control service in natural language, establishes a primary-selection mechanism to nominate various potential named entities, combines the potential named entities according to the sentence patterns to generate similar sentences, and finally applies an approximate-sentence selection formula to establish the sentence intent and select the named-entity combination. Compared with traditional models that first identify sentence intent and then fuzzily match potential named entities, the present invention avoids the preconceptions of a single model and fuzzy matching over wrongly framed spans, and thus effectively improves the named-entity recognition rate for sentences with missing characters or typos.

1: Semantic understanding system for processing rich text

11: Potential named entity recognition module

111: Character named entity recognizer

112: Phonetic named entity recognizer

12: Similar sentence generation module

13: Sentence intent selection module

14: Voice-control sentence pattern construction module

2: Speech-to-text module

3: Parameter delivery module

4: Reply sentence generation module

5: Service application programming interface

6: Text-to-speech module

301-305: Process

401-406: Process

S501-S503: Steps

FIG. 1 is a schematic architecture diagram of the semantic understanding system for processing rich text of the present invention.

FIG. 2 is a schematic architecture diagram of an application embodiment of the semantic understanding system for processing rich text of the present invention.

FIG. 3 is a flowchart of the primary selection of potential named entities by the potential named entity recognition module of the present invention.

FIG. 4 is a flowchart of the similar sentence generation module generating similar sentences and the sentence intent selection module finding approximate sentences in the present invention.

FIG. 5 is a step diagram of the semantic understanding method for processing rich text of the present invention.

The following describes the technical content of the present invention through specific embodiments; those skilled in the art can readily understand the advantages and effects of the present invention from the disclosure of this specification. The present invention may also be implemented or applied through other different embodiments.

FIG. 1 is a schematic architecture diagram of the semantic understanding system for processing rich text of the present invention. As shown, the semantic understanding system 1 for processing rich text comprises at least a potential named entity recognition module 11, a similar sentence generation module 12, and a sentence intent selection module 13.

The potential named entity recognition module 11 has a plurality of named entity recognizers of different types and receives the text sentence converted from the speech input by the user. Each named entity recognizer computes and scores the longest common subsymbol sequence between the text sentence and the pre-stored vocabulary, so as to select the highest-scoring words meeting a predetermined first threshold as potential named entities. In other words, the module integrates several named entity (NE) recognition methods for a primary selection of potential NEs: once the user's input speech has been converted into a text sentence, it is sent to the potential named entity recognition module 11 for analysis; each recognizer computes the longest common subsymbol sequence between the received text sentence and the pre-stored vocabulary; the text sentence is compared against that sequence for scoring; and the highest-scoring words meeting the predetermined first threshold are selected as potential named entities.

In one embodiment, the named entity recognizers include a character named entity recognizer and a phonetic named entity recognizer. The character named entity recognizer takes one Chinese character or one English word as a symbol unit and computes the matching rate between the symbols of the text sentence and the symbols of the vocabulary to obtain the longest common subsymbol sequence, while the phonetic named entity recognizer takes the toneless Zhuyin symbols of one Chinese character or the stem of one English word as a symbol unit and does the same. In other words, the character NE recognition method and the phonetic NE recognition method address missing characters and homophonic typos, respectively: the character method's symbol unit is one Chinese character or one English word, whereas the phonetic method's symbol unit is toneless Zhuyin or Pinyin for a Chinese character and the word stem for an English word. Once the text sentence and the pre-stored vocabulary have been converted into symbols, the primary selection of potential NEs can proceed under the same evaluation framework.
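As a concrete illustration of the two symbol-unit schemes, the sketch below tokenizes a sentence at the character level and at the phonetic level. The tiny Zhuyin table and the crude suffix-stripping stemmer are stand-ins for a real pronunciation dictionary and stemmer, not part of the patent.

```python
import re

def char_symbols(text):
    # Character-level units: one CJK character, or one run of Latin
    # letters treated as a single English word.
    return re.findall(r"[\u4e00-\u9fff]|[A-Za-z]+", text)

# Toy toneless-Zhuyin table standing in for a real pronunciation dictionary;
# note that 癌 (ai2) and 愛 (ai4) collapse to the same toneless symbol,
# which is exactly what lets homophonic typos match.
TOY_ZHUYIN = {"股": "ㄍㄨ", "癌": "ㄞ", "愛": "ㄞ", "我": "ㄨㄛ"}

def phonetic_symbols(text):
    # Phonetic units: toneless Zhuyin for CJK characters, a crude stem
    # (strip a trailing "s") for English words.
    out = []
    for sym in char_symbols(text):
        if sym.isascii():
            out.append(sym.lower().rstrip("s"))  # placeholder stemmer
        else:
            out.append(TOY_ZHUYIN.get(sym, sym))
    return out
```

Under this scheme, 「股癌」 and 「股愛」 produce identical phonetic symbol sequences, so the phonetic recognizer can still match a title whose characters were misrecognized as homophones.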

In one embodiment, the potential named entity recognition module 11 finds the longest common subsymbol sequence between the text sentence and the vocabulary using a sliding window, and the scoring compares the symbols of the text sentence with the symbols of the vocabulary, adding a point for each correct symbol and deducting a point for each incorrect one, so that the highest-scoring words meeting the predetermined first threshold serve as the potential named entities. Briefly, the top K words are first nominated according to the symbol matching rate between the text sentence and the pre-stored vocabulary, where K can be set according to actual needs; next, for each word, the longest common subsymbol sequence between the word and the text sentence is found with a sliding window, its length being at most the word length; finally, the word with the highest score that is greater than or equal to the predetermined first threshold is selected from the top K words as a potential named entity. The threshold can be adjusted according to the scoring results of the longest common subsymbol sequences to improve effectiveness.

The longest common subsymbol sequence is scored in the manner of a Longest Common Subsequence (LCS) score: a point is added for each correct symbol, and a point is deducted for each error such as an extra, missing, or substituted symbol. The LCS score thus considers not only edit distance but also the correctly matched portion, favoring longer words so that a merely locally optimal word is not picked.
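The sliding-window LCS scoring described above can be sketched as follows. The window-size range (one symbol shorter to one symbol longer than the word) and the treatment of a substitution as one unmatched symbol on each side are illustrative assumptions; the patent specifies only +1 per correct symbol and -1 per extra, missing, or substituted symbol.

```python
def lcs_len(a, b):
    # Classic dynamic-programming length of the longest common subsequence.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)]

def window_score(sentence, vocab):
    # Slide windows of roughly the vocabulary length over the sentence;
    # each window scores +1 per matched symbol and -1 per unmatched symbol
    # on either side. Returns the best score over all windows.
    w, best = len(vocab), float("-inf")
    for size in range(max(1, w - 1), w + 2):        # tolerate one missing/extra symbol
        for start in range(len(sentence) - size + 1):
            win = sentence[start:start + size]
            m = lcs_len(win, vocab)
            best = max(best, m - (size - m) - (w - m))
    return best
```

For example, `window_score(list("播放三隻小豬的故事"), list("三隻小豬"))` returns 4: the title matches exactly inside one window, so all four symbols score +1 and nothing is deducted.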

In another embodiment, after finding a potential named entity, the potential named entity recognition module 11 masks the common subsequence in the text sentence and then repeats the same procedure to find other potential named entities in the text sentence, until no further potential named entities are found. In short, after one potential named entity is found, the common subsymbol sequence in the text sentence can be masked, for example with the symbol ★, before the next search, so that other potential named entities can be sought; once no further potential named entity exists, the named-entity exploration results are aggregated.
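A minimal sketch of this mask-and-rescan loop follows, reusing LCS scoring. The exact-length window, the score weights, and the default threshold are illustrative assumptions, not values fixed by the patent.

```python
MASK = "★"  # the masking symbol used in the patent's example

def lcs_len(a, b):
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)]

def best_span(symbols, vocab):
    # Best window of exactly len(vocab) symbols: (score, start, end),
    # scoring +1 per matched symbol and -1 per miss on each side.
    w, best = len(vocab), (float("-inf"), 0, 0)
    for s in range(len(symbols) - w + 1):
        m = lcs_len(symbols[s:s + w], vocab)
        best = max(best, (m - 2 * (w - m), s, s + w))
    return best

def extract_entities(sentence, vocab_list, threshold=2):
    # Pick the best-scoring vocabulary word, mask its span, and rescan;
    # stop once no candidate reaches the threshold.
    symbols, found = list(sentence), []
    while True:
        (score, s, e), v = max((best_span(symbols, list(v)), v) for v in vocab_list)
        if score < threshold:
            return found
        found.append((v, s, e))
        symbols[s:e] = [MASK] * (e - s)  # masked span is skipped on later passes
```

Running `extract_entities("播放三隻小豬和小紅帽", ["三隻小豬", "小紅帽"])` finds 「三隻小豬」 first, masks it, then finds 「小紅帽」 on the second pass.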

Therefore, the potential named entity recognition module 11 integrates several named entity (NE) recognition methods for the primary selection of potential NEs, including the character and phonetic NE recognition methods that handle missing characters and typos in the text sentence, respectively. It should be noted that the module can also integrate conventional NE recognition methods to handle other situations, and is not limited to the recognizer types described above. In addition, the module aggregates the primary-selection results of the several NE methods, including each NE word and its start and end positions, for the similar sentence generation module 12 to construct similar sentences.

The similar sentence generation module 12 combines the words corresponding to the potential named entities without repetition to generate similar sentences and scores them, so that the highest-scoring similar sentences meeting the second threshold are selected as approximate sentences. In short, the module combines the words of the potential named entities to produce similar sentences: non-overlapping words are combined according to the sentence patterns predetermined by the system and the order in which the potential named entities appear in the text sentence; the longest common subsymbol sequence between each similar sentence and the original text sentence is then computed and scored; and the highest-scoring similar sentence meeting the second threshold is selected as an approximate sentence. The second threshold can likewise be adjusted in view of the scoring results of the longest common subsymbol sequences to improve effectiveness.

In one embodiment, the similar sentence generation module 12 combines non-overlapping words according to the sentence patterns supported by the system and the order of appearance of the potential named entities to generate the similar sentences, computes the longest common subsymbol sequence between each similar sentence and the text sentence, and scores it by comparing the symbols of the text sentence with those of the similar sentence, adding a point for each correct symbol and deducting a point for each incorrect one, so that the highest-scoring similar sentences meeting the second threshold serve as the approximate sentences.
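The combination step can be sketched as below. The entity tuples are assumed to arrive sorted by start position and tagged with illustrative slot types; the scoring reuses the same +1/-1 LCS formula as before.

```python
from itertools import combinations

def lcs_len(a, b):
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[len(a)][len(b)]

def generate_similar(sentence, entities, patterns, threshold=0):
    # entities: (slot_type, word, start, end), sorted by start position.
    # patterns: slot-type sequences the service supports, e.g. {("action", "title")}.
    best_score, best_cand = threshold - 1, None
    for r in range(1, len(entities) + 1):
        for combo in combinations(entities, r):         # preserves order of appearance
            spans = [(s, e) for _, _, s, e in combo]
            if any(e1 > s2 for (_, e1), (s2, _) in zip(spans, spans[1:])):
                continue                                # overlapping words cannot co-occur
            if tuple(t for t, _, _, _ in combo) not in patterns:
                continue                                # not a supported sentence pattern
            cand = "".join(w for _, w, _, _ in combo)
            m = lcs_len(cand, sentence)
            score = m - (len(cand) - m) - (len(sentence) - m)
            if score > best_score:
                best_score, best_cand = score, cand
    return best_cand
```

Given the competing titles 「三隻小豬」 and 「小紅帽」 proposed for the same span, the candidate sentence 「播放三隻小豬」 scores highest against the original utterance and is kept.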

The sentence intent selection module 13, according to the number of approximate sentences, provides parameters for triggering the corresponding service, generates a corresponding reply message, or generates an interactive message. In short, the module picks among the approximate sentences produced by the similar sentence generation module 12, estimates the sentence intent, and finally selects the named-entity combination. Specifically, when the number of approximate sentences is zero, the system replies that there is no corresponding service; when there is exactly one approximate sentence, its intent and its constituent named entities serve as the path for triggering the service and the parameters to be delivered; and when there is more than one, the named entities of the approximate sentences are fitted into an option template to obtain several question options, forming a multiple-choice question that asks the user for the intended choice.
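The three-way routing on the count of approximate sentences can be sketched as follows; the dictionary shapes, key names, and reply strings are illustrative assumptions rather than the patent's interface.

```python
def dispatch(approx_sentences):
    # approx_sentences: (intent, slots) pairs surviving the selection step.
    if not approx_sentences:                       # zero: no matching service
        return {"type": "reply", "text": "很抱歉，目前沒有對應的服務"}
    if len(approx_sentences) == 1:                 # exactly one: trigger the service
        intent, slots = approx_sentences[0]
        return {"type": "trigger", "intent": intent, "params": slots}
    options = [f"{i}. {'、'.join(slots.values())}"  # several: build a multiple-choice question
               for i, (intent, slots) in enumerate(approx_sentences, 1)]
    return {"type": "ask", "text": "請問您想要哪一個？\n" + "\n".join(options)}
```

A "trigger" result would be handed to the parameter delivery module, while "reply" and "ask" results would go to the reply sentence generation module, mirroring the division of labor described in the text.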

In addition, the semantic understanding system 1 for processing rich text further comprises a voice-control sentence pattern construction module 14 for marking out important vocabulary combinations from a plurality of training sentences in advance while ignoring unimportant words, so as to construct the pre-stored vocabulary. In short, a sentence pattern is composed of one or more words, and the order of the words does not affect the uniqueness of the pattern. When constructing voice-control sentence patterns in natural language, the present invention uses the voice-control sentence pattern construction module 14 to mark out the important words in the training sentences and ignore the unimportant ones; the result becomes the system's pre-stored vocabulary for later analysis and comparison by the potential named entity recognition module 11.
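The construction step can be sketched as follows. The slot-type annotations and the `None` tag for unimportant words are an assumed training format; a `frozenset` captures the point above that word order does not affect a pattern's identity.

```python
def build_patterns(train):
    # train: sentences given as (word, slot_type) pairs; slot_type None marks
    # unimportant words, which are simply ignored.
    vocab, patterns = {}, set()
    for sent in train:
        kept = [(w, t) for w, t in sent if t is not None]
        for w, t in kept:
            vocab.setdefault(t, set()).add(w)        # pre-stored vocabulary per slot type
        patterns.add(frozenset(t for _, t in kept))  # order-insensitive pattern identity
    return vocab, patterns

# 「播放故事三隻小豬」 and 「播放三隻小豬的故事」 yield one and the same pattern.
TRAIN = [
    [("播放", "action"), ("故事", None), ("三隻小豬", "title")],
    [("播放", "action"), ("三隻小豬", "title"), ("的", None), ("故事", None)],
]
vocab, patterns = build_patterns(TRAIN)
```

Both training sentences collapse to the single pattern {action, title}, and the surviving words become the pre-stored vocabulary used for matching.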

FIG. 2 is a schematic architecture diagram of an application embodiment of the semantic understanding system for processing rich text of the present invention. As shown, the potential named entity recognition module 11, similar sentence generation module 12, and sentence intent selection module 13 of the semantic understanding system 1 are the same as in FIG. 1. This embodiment further includes, connected to the system 1, a speech-to-text module 2, a parameter delivery module 3, a reply sentence generation module 4, a service application programming interface 5, and a text-to-speech module 6.

The semantic understanding system 1 for processing rich text is mainly used to process content-rich text and perform lightweight semantic understanding on it. It comprises the potential named entity recognition module 11, which integrates several named entity recognition methods for the primary selection of potential named entities; the similar sentence generation module 12, which combines the words of potential named entities to produce similar sentences; and the sentence intent selection module 13, which picks the approximate sentences. In actual operation, a typical voice-control service produces a text sentence through the external speech-to-text module 2 as input to the system 1. The potential named entity recognition module 11 distributes the text sentence to the character named entity recognizer 111, the phonetic named entity recognizer 112, or other NE recognition modules, and aggregates the several recognition results for the similar sentence generation module 12, which combines the potential named entities according to the sentence patterns and produces similar sentences. From these, the sentence intent selection module 13 picks the approximate sentences, estimates the sentence intent, and selects the named-entity combination.

When there is exactly one approximate sentence, the sentence intent selection module 13 generates the parameters that trigger the corresponding service through the back-end parameter delivery module 3; when there are zero or multiple approximate sentences, it generates the reply message or the interactive message through the back-end reply sentence generation module 4. Specifically, after the system 1 has selected the intent and NE combination, the external parameter delivery module 3 can produce parameters to trigger the back-end service, that is, the service application programming interface (API) 5 executes the corresponding service; alternatively, the reply sentence generation module 4 produces a reply sentence from the selected NE combination, and the text-to-speech module 6 then speaks it to interact with the user.

須說明者，於本實施例中，語音轉文字模組2、參數履交模組3、答覆語句生成模組4以及文字轉語音模組6等模組是在處理豐富文字之語意理解系統1外部，亦即，上述各模組根據不同目的，架設於不同設備、伺服器、系統中執行，在處理豐富文字之語意理解系統1解析完文字語句後，推得文字語句之意圖後，交由對應系統進行後續處理。然於其他應用實施例中，亦可將上述模組整合於處理豐富文字之語意理解系統1，成為一個從語音轉換、語言分析、到最後給予客戶回饋的一個整合系統服務。 It should be noted that, in this embodiment, the speech-to-text module 2, the parameter delivery module 3, the reply sentence generation module 4, and the text-to-speech module 6 are external to the semantic understanding system 1 for rich text; that is, according to their different purposes, these modules are deployed and executed on different devices, servers, and systems. After the semantic understanding system 1 for rich text has parsed the text sentence and deduced its intent, the result is handed over to the corresponding system for subsequent processing. In other application embodiments, however, the above modules can also be integrated into the semantic understanding system 1 for rich text, forming an integrated system service that covers speech conversion, language analysis, and finally feedback to the customer.

於處理豐富文字之語意理解系統1運作前，復包括預先進行預存詞彙的建構，可由處理豐富文字之語意理解系統1內之聲控句型建構模組(如圖1所示)進行詞彙建構，因為聲控語句是由數個詞彙組合而成，詞彙先後順序通常不影響語意，例如「播放故事三隻小豬」跟「播放三隻小豬的故事」語意相同，聲控語句向來簡略，用戶甚至省略動詞直呼節目名稱，例如三隻小豬，故聲控句型建構模組僅需匡列出重要詞彙組合，忽略不重要字詞，最後可形成預存之詞彙，以供潛在命名實體識別模組11於尋找潛在命名實體時比對之用。如下面表一所示，說明透過重要詞彙組合，進而推得支援句型、意圖和對應參數等訊息。 Before the semantic understanding system 1 for rich text operates, the pre-stored vocabulary is constructed in advance by the voice-control sentence pattern construction module inside the system (as shown in FIG. 1). Because a voice-control sentence is composed of several vocabularies whose order usually does not affect the semantics, for example, "播放故事三隻小豬" (play story Three Little Pigs) and "播放三隻小豬的故事" (play the story of the Three Little Pigs) have the same meaning, and because voice-control sentences are typically terse, with users even omitting the verb and saying only the program name, such as "Three Little Pigs", the voice-control sentence pattern construction module only needs to enumerate the important vocabulary combinations and ignore unimportant words, finally forming the pre-stored vocabulary for the potential named entity recognition module 11 to compare against when searching for potential named entities. Table 1 below illustrates how information such as supported sentence patterns, intents, and corresponding parameters is derived from the important vocabulary combinations.

Figure 110146063-A0101-12-0011-1

後續將針對潛在命名實體識別模組執行潛在命名實體之初選、相似語句生成模組生成相似語句以及語句意圖評選模組找出近似語句等流程進一步說明。 Follow-up will further explain the process of performing the primary selection of potential named entities by the potential named entity recognition module, generating similar sentences by the similar sentence generation module, and finding similar sentences by the sentence intent selection module.

圖3為本發明中潛在命名實體識別模組初選潛在命名實體之流程圖,係說明潛在命名實體識別模組整合數種命名實體(NE)識別法初選潛在NE的流程。 FIG. 3 is a flow chart of the primary selection of potential named entities by the potential named entity recognition module in the present invention, illustrating the process of the primary selection of potential NEs by the potential named entity recognition module integrating several named entity (NE) recognition methods.

於流程301,依文字語句與詞彙的符號匹配率,推舉Top K則詞彙。本流程係進行輸入語音經轉換後之文字語句與預存之詞彙的符號匹配率,其中,符號匹配率算式如下: In the process 301, Top K words are recommended according to the symbol matching rate between the text sentence and the vocabulary. This process is to carry out the symbol matching rate between the converted text sentences of the input voice and the pre-stored vocabulary, wherein, the symbol matching rate formula is as follows:

Figure 110146063-A0101-12-0011-2

其中，對於字符命名實體識別器來說，符號單位為一個中文字或一個英文單字，對於音符命名實體識別器來說，符號單位為一個中文字的注音符號不含聲調或一個英文單字詞幹，不論是字符命名實體識別器或音符命名實體識別器，匹配率算式有別於習知BM25(Best Match25)算式，屏除逆向文件頻率以免節目名稱存在常見字詞而被排序在後，後續得以較小範圍K以計算語句與詞彙之最長共同子符號序列。 Here, for the character named entity recognizer, the symbol unit is one Chinese character or one English word; for the phonetic-symbol named entity recognizer, the symbol unit is the Zhuyin phonetic symbols of one Chinese character without the tone, or the stem of one English word. For both recognizers, the matching-rate formula differs from the conventional BM25 (Best Match 25) formula in that the inverse document frequency term is removed, so that program names containing common words are not ranked low; a smaller range K can then be used for computing the longest common subsymbol sequence between the sentence and each vocabulary.
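The matching-rate formula itself appears above only as an image, so the following is a minimal sketch of process 301, assuming the score is essentially a term-frequency count with the IDF term removed, as the text describes. The function names and the example vocabularies are illustrative, not the patent's exact formula:

```python
from collections import Counter

def match_score(vocab, sentence):
    # Hypothetical TF-style score with the inverse-document-frequency term
    # removed: how many of the vocabulary's symbols (here, single Chinese
    # characters) also occur somewhere in the sentence.
    sent = Counter(sentence)
    return sum(min(n, sent[sym]) for sym, n in Counter(vocab).items())

def top_k(vocabs, sentence, k=2):
    # Process 301: rank the pre-stored vocabularies by match score, keep Top K.
    return sorted(vocabs, key=lambda v: match_score(v, sentence), reverse=True)[:k]

# the missing-character input sentence of the later worked example
print(top_k(["三隻小豬", "要聽", "蜘蛛"], "我要聽三隻豬"))  # ['三隻小豬', '要聽']
```

Under this reading, "三隻小豬" scores 3 (三/隻/豬 all appear) and "要聽" scores 2, matching the Top-2 ranking of the worked example below.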

於流程302，針對每條詞彙，以滑動視窗方式找出詞彙與文字語句的最長共同子符號序列(LCS)。具體來說，最長共同子符號序列之長度≦詞彙長度，其中，最長共同子符號序列之計分方式可為符號正確則加分(如一分)，符號錯誤(多/少/置換一個符號)則扣分(如一分)，因為LCS計分不只看編輯距離，還會參考正確加分部分，故會優先挑出較長的詞彙，可避免取出局部最佳的詞彙。 In process 302, for each vocabulary, the longest common subsymbol sequence (LCS) between the vocabulary and the text sentence is found in a sliding-window manner. Specifically, the length of the LCS is at most the vocabulary length, and the LCS can be scored by adding one point for each correct symbol and deducting one point for each symbol error (one extra, missing, or substituted symbol). Because LCS scoring considers not only the edit distance but also the correctly matched portion, longer vocabularies are preferred, which avoids picking a merely locally optimal vocabulary.
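The sliding-window scoring of process 302 can be sketched as follows. The patent's worked examples count matches and errors along an alignment; here the window score is approximated as LCS length minus Levenshtein edit distance, which reproduces the "三隻豬" and "聽三隻豬" windows of the later example, though it may differ from the patent's exact accounting on other windows. All names are illustrative:

```python
def lcs_len(a, b):
    # classic longest-common-subsequence dynamic program
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if x == y else max(dp[i][j + 1], dp[i + 1][j])
    return dp[-1][-1]

def edit_distance(a, b):
    # Levenshtein distance: one point per extra/missing/substituted symbol
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (x != y))
    return dp[-1]

def best_window(vocab, sentence):
    # slide a window one symbol at a time; score = matches - symbol errors
    best = (None, float("-inf"))
    for start in range(len(sentence)):
        window = sentence[start:start + len(vocab)]
        score = lcs_len(vocab, window) - edit_distance(vocab, window)
        if score > best[1]:
            best = ((start, start + len(window)), score)
    return best

print(best_window("三隻小豬", "我要聽三隻豬"))  # ((3, 6), 2)
```

The returned position (3, 6) and score 2 agree with the character-recognizer result for "三隻小豬" in the worked example below.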

於流程303,從K個詞彙中,挑選分數最高且≧α的詞彙作為潛在NE。本流程即進行詞彙評分後,選出分數最高且符合預設條件α的詞彙,以作為潛在命名實體,其中,α可視最長共同子符號序列評分結果調整門檻提高成效。 In the process 303, the vocabulary with the highest score and ≧α is selected from the K vocabulary as a potential NE. This process is to select the vocabulary with the highest score and meet the preset condition α as a potential named entity after lexical scoring, where α can adjust the threshold according to the scoring result of the longest common subsymbol sequence to improve the effect.

接著,判斷是否還存在候選命名實體(NE),若存在候選NE,進至流程304,若無候選NE,進至流程305。 Next, it is judged whether there is a candidate named entity (NE), if there is a candidate NE, go to process 304 , if there is no candidate NE, go to process 305 .

於流程304，以符號遮蔽文字語句中共同子序列(LCS)，再尋找其他潛在NE。當存在有其他候選NE時，可利用符號(例如★)遮蔽文字語句中共同子序列，接著，再尋找文字語句中其他潛在NE，亦即利用符號遮蔽文字語句中共同子序列後，再回到流程301重複進行前述流程。 In process 304, the common subsequence (LCS) in the text sentence is masked with a symbol, and other potential NEs are searched for. When other candidate NEs may exist, a symbol (for example, ★) is used to mask the common subsequence in the text sentence; after masking, the flow returns to process 301 and repeats the foregoing steps to find the remaining potential NEs.
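The masking step can be sketched as a simple span replacement; `mask_span` is an assumed helper name, and the spans below are the ones found in the later worked example:

```python
def mask_span(sentence, start, end, mask="★"):
    # Process 304: blank out the matched LCS span so later passes
    # cannot re-discover the same potential NE.
    return sentence[:start] + mask * (end - start) + sentence[end:]

s = "我要聽三隻豬"
s = mask_span(s, 3, 6)   # mask the LCS「三隻豬」found for vocabulary「三隻小豬」
print(s)                 # 我要聽★★★
s = mask_span(s, 1, 3)   # mask the LCS「要聽」found on the next pass
print(s)                 # 我★★★★★
```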

於流程305,彙整NE探勘結果。當不存在其他候選NE時,則可彙整NE探勘結果,其中復包含詞彙在語句中的開始與結束位置。 In the process 305, the NE exploration results are compiled. When there are no other candidate NEs, the NE exploration results can be compiled, which includes the start and end positions of the words in the sentence.

須說明者，除了上述字符命名實體(NE)識別器所執行之字符NE識別法以及音符命名實體(NE)識別器所執行之音符NE識別法外，亦可擴充習知詞性句型(part-of-speech Pattern，POS Pattern)、隱藏式馬爾科夫模型(Hidden Markov Model，HMM)、條件隨機域(Conditional Random Field，CRF)、循環神經網路(Recurrent Neural Network，RNN)或卷積神經網路(Convolutional Neural Network，CNN)等NE識別法至潛在命名實體識別模組，以涵蓋不同類型之文字語句。 It should be noted that, in addition to the character NE recognition performed by the character named entity (NE) recognizer and the phonetic-symbol NE recognition performed by the phonetic-symbol named entity (NE) recognizer, other NE recognition methods, such as part-of-speech patterns (POS Pattern), hidden Markov models (HMM), conditional random fields (CRF), recurrent neural networks (RNN), or convolutional neural networks (CNN), can be added to the potential named entity recognition module to cover different types of text sentences.

圖4為本發明中相似語句生成模組生成相似語句以及語句意圖評選模組找出近似語句之流程圖，係說明相似語句生成模組組合潛在NE詞彙以產生相似語句的流程，以及語句意圖評選模組挑選近似語句的流程。 FIG. 4 is a flow chart of the similar sentence generation module generating similar sentences and the sentence intent selection module finding the closest sentence in the present invention, illustrating the process by which the similar sentence generation module combines potential NE vocabularies to generate similar sentences, and the process by which the sentence intent selection module picks the closest sentence.

於流程401,根據系統支援句型跟文字語句潛在NE出現順序,組合不重疊詞彙生成相似語句。本流程即相似語句生成模組根據系統所支援之句型和文字語句中潛在NE出現順序,將不重疊詞彙作組合以生成相似語句。 In the process 401, according to the sequence of appearance of the system supported sentence pattern and the potential NE of the text sentence, non-overlapping words are combined to generate a similar sentence. This process is that the similar sentence generation module combines the non-overlapping words according to the sentence pattern supported by the system and the sequence of potential NE appearance in the text sentence to generate similar sentences.
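Process 401 can be sketched as follows, assuming each potential NE carries the span where it was found in the sentence; the slot names and the `ne_pool` structure are hypothetical:

```python
from itertools import product

def generate_similar(pattern, ne_pool):
    # Process 401 (sketch): for a supported pattern such as ("Listen", "Story"),
    # pick one candidate vocabulary per slot and keep only combinations whose
    # source spans do not overlap and follow the order of appearance.
    slots = [ne_pool.get(slot, []) for slot in pattern]
    results = []
    for combo in product(*slots):
        spans = [span for span, _ in combo]
        if all(a[1] <= b[0] for a, b in zip(spans, spans[1:])):
            results.append("".join(word for _, word in combo))
    return results

# hypothetical NE pool mined from「我要聽三隻豬」: (span in sentence, vocabulary)
ne_pool = {
    "Listen": [((1, 3), "要聽")],
    "Story":  [((3, 6), "三隻小豬")],
}
print(generate_similar(("Listen", "Story"), ne_pool))  # ['要聽三隻小豬']
```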

於流程402，計算每一則相似語句與原始語句最長共同子符號序列。本流程即針對每一個相似語句進行評分，也就是跟原始的文字語句進行符號比對，所採用之LCS計分方式如前所述，符號正確則加分(如一分)，而多/少/置換一個符號等符號錯誤情況，則扣分(如一分)。 In process 402, the longest common subsymbol sequence between each similar sentence and the original sentence is computed. That is, each similar sentence is scored by symbol comparison against the original text sentence, using the LCS scoring described above: one point is added for each correct symbol, and one point is deducted for each symbol error (one extra, missing, or substituted symbol).

於流程403,挑選分數最高且≧β的相似語句作為近似語句。本流程即在流程402對每一個相似語句評分後進行挑選,選分數最高且符合符合預設條件β的相似語句,作為近似語句,其中,β可視最長共同子符號序列評分結果調整門檻提高成效。 In the process 403, the similar sentence with the highest score and ≧β is selected as an approximate sentence. This process is to select after scoring each similar sentence in the process 402, select the similar sentence with the highest score and meet the preset condition β as an approximate sentence, where β can be adjusted according to the scoring result of the longest common subsymbol sequence to improve the effect.

之後,根據近似語句數量會有不同結果,當近似語句數量為1時,進到流程404,當近似語句數量為0時,進到流程405,當近似語句數量為多個(≧2)時,進到流程406。 Afterwards, there will be different results according to the number of approximate sentences. When the number of approximate sentences is 1, go to process 404. When the number of approximate sentences is 0, go to process 405. When the number of approximate sentences is multiple (≧2), Go to process 406 .

於流程404，近似語句意圖與組成近似語句的NE作為觸發服務的路徑與履交的參數。本流程即近似語句只有一則時，語句意圖評選模組根據近似語句意圖以及組成近似語句的命名實體，產生作為觸發服務的路徑與履交的參數。另外，組成近似語句的命名實體，可套入答覆語句樣板而得到開頭介紹語句。 In process 404, the intent of the closest sentence and the NEs composing it serve as the path for triggering the service and the parameters to be delivered. That is, when there is exactly one closest sentence, the sentence intent selection module generates, from the intent of the closest sentence and the named entities composing it, the service-triggering path and the parameters to be delivered. In addition, the named entities composing the closest sentence can be inserted into a reply sentence template to obtain an opening introduction sentence.

於流程405,答覆用戶聽不懂語句。本流程即近似語句不存在時,語句意圖評選模組產生答覆用戶聽不懂語句之對應回饋。 In the process 405, the user is answered that the sentence cannot be understood. This process means that when the similar sentence does not exist, the sentence intent selection module generates corresponding feedback to answer that the user does not understand the sentence.

於流程406,組成近似語句的NE套入選項樣板可得數筆問題選項,組成選擇題詢問用戶意圖。本流程即近似語句不只一則(大於等於2)時,語句意圖評選模組將組成近似語句的命名實體套入選項樣板,進而得到數筆問題選項以組成選擇題,藉以詢問用戶意圖。 In the process 406, the NE forming the approximate sentence is inserted into the option template to obtain several question options, and a multiple-choice question is formed to ask the user's intention. This process means that when there is more than one similar sentence (greater than or equal to 2), the sentence intention selection module will put the named entities that make up the similar sentence into the option template, and then obtain several question options to form multiple choice questions, so as to ask the user's intention.
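Process 406 can be sketched as follows; the option template string and the function name are purely illustrative, not the patent's actual wording:

```python
def build_question(ne_combos, template="您是說「{}」嗎？"):
    # Process 406 (sketch): drop each NE combination into a hypothetical
    # option template to form a multiple-choice question for the user.
    options = [template.format("".join(c)) for c in ne_combos]
    lines = [f"{i}. {opt}" for i, opt in enumerate(options, 1)]
    return "請問您想要哪一個？\n" + "\n".join(lines)

print(build_question([("要聽", "三隻小豬"), ("要聽", "蜘蛛")]))
```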

圖5為本發明之處理豐富文字之語意理解方法之步驟圖。本發明之處理豐富文字之語意理解方法可於例如個人電腦、伺服器或雲端設備之電腦設備執行,其中,本發明之處理豐富文字之語意理解方法包括以下步驟。 FIG. 5 is a step diagram of the semantic understanding method for processing rich text according to the present invention. The semantic understanding method for processing rich text of the present invention can be executed on computer equipment such as personal computers, servers or cloud devices, wherein the semantic understanding method for processing rich text of the present invention includes the following steps.

於步驟S501，潛在命名實體識別模組接收由用戶之輸入語音經轉換後之文字語句，以利用不同類型之命名實體識別器，計算該文字語句與預存之詞彙的最長共同子符號序列並進行計分，以挑選出分數最高且符合預定第一門檻值之詞彙作為潛在命名實體。本步驟即潛在命名實體識別模組接收由用戶之輸入語音經轉換後之文字語句，並利用不同類型之命名實體識別器分析文字語句與預存之詞彙兩者的最長共同子符號序列，之後針對符號位置進行計分，從中挑選出分數最高且符合預定第一門檻值之詞彙作為潛在命名實體。 In step S501, the potential named entity recognition module receives the text sentence converted from the user's input speech, uses different types of named entity recognizers to compute and score the longest common subsymbol sequence between the text sentence and each pre-stored vocabulary, and selects the vocabulary with the highest score that meets a predetermined first threshold as a potential named entity. In this step, the potential named entity recognition module receives the converted text sentence, analyzes the longest common subsymbol sequence between the sentence and each pre-stored vocabulary with different types of named entity recognizers, scores the symbol positions, and picks the highest-scoring vocabulary that meets the predetermined first threshold as a potential named entity.

於一實施例中，該命名實體識別器可為字符命名實體識別器或音符命名實體識別器。字符命名實體識別器係以一個中文字或一個英文單字為符號單位，對該文字語句之符號與該詞彙之符號進行匹配率計算，以得到最長共同子符號序列，而音符命名實體識別器係以一個中文字的注音符號不含聲調或一個英文單字詞幹為符號單位，藉以對該文字語句之符號與該詞彙之符號進行匹配率計算，而得到最長共同子符號序列。易言之，透過將文字語句與預存之詞彙經過符號轉換後，即能在同一套潛在NE評選框架下進行潛在NE初選。 In one embodiment, the named entity recognizer may be a character named entity recognizer or a phonetic-symbol named entity recognizer. The character named entity recognizer uses one Chinese character or one English word as the symbol unit and computes the matching rate between the symbols of the text sentence and the symbols of the vocabulary to obtain the longest common subsymbol sequence, while the phonetic-symbol named entity recognizer uses the Zhuyin phonetic symbols of one Chinese character without the tone, or the stem of one English word, as the symbol unit to compute the matching rate and obtain the longest common subsymbol sequence. In other words, once the text sentence and the pre-stored vocabulary are converted into symbols, the preliminary selection of potential NEs can be performed under the same potential-NE selection framework.
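The two symbol-unit conversions can be sketched as follows, under the assumption that 音符 here denotes toneless Zhuyin; the small `ZHUYIN` table is hand-made for this example only, where a real system would use a full pronunciation dictionary:

```python
# Illustrative toneless-Zhuyin table covering only this example.
ZHUYIN = {
    "我": "ㄨㄛ", "要": "ㄧㄠ", "聽": "ㄊㄧㄥ",
    "三": "ㄙㄢ", "隻": "ㄓ", "小": "ㄒㄧㄠ", "豬": "ㄓㄨ",
}

def char_symbols(text):
    # character recognizer: one Chinese character (or English word) per symbol
    return list(text)

def phonetic_symbols(text):
    # phonetic-symbol recognizer: the toneless Zhuyin of each character
    return [ZHUYIN[ch] for ch in text]

print(char_symbols("三隻豬"))      # ['三', '隻', '豬']
print(phonetic_symbols("三隻豬"))  # ['ㄙㄢ', 'ㄓ', 'ㄓㄨ']
```

Note that the phonetic channel maps「豬」and the first syllable of「蜘蛛」close together, which is how it covers typo cases that the character channel misses.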

於另一實施例中，潛在命名實體識別模組係以滑動視窗方式找出該文字語句與該詞彙的該最長共同子符號序列，藉由比對該文字語句之符號與該詞彙之符號，以於符號正確時加分(如一分)及符號錯誤時扣分(如一分)，最後選擇分數最高且符合預定第一門檻值之詞彙作為該潛在命名實體。 In another embodiment, the potential named entity recognition module finds the longest common subsymbol sequence between the text sentence and the vocabulary in a sliding-window manner, comparing the symbols of the sentence with the symbols of the vocabulary, adding one point for each correct symbol and deducting one point for each incorrect symbol, and finally selects the vocabulary with the highest score that meets the predetermined first threshold as the potential named entity.

另外，潛在命名實體識別模組於找出該潛在命名實體後，若發現還有其他候選NE時，係利用遮蔽文字語句中共同子序列方式，再以相同程序尋找該文字語句中的其他潛在命名實體，直到無找到其他潛在命名實體為止。 In addition, after finding a potential named entity, if other candidate NEs may still exist, the potential named entity recognition module masks the common subsequence in the text sentence and repeats the same procedure to find other potential named entities in the sentence, until no further potential named entity is found.

於步驟S502，相似語句生成模組將該潛在命名實體所對應之詞彙進行不重複之組合，以生成相似語句並進行計分，俾由其中挑選出分數最高且符合第二門檻值之相似語句作為近似語句。本步驟即相似語句生成模組根據系統預定之句型跟文字語句中潛在命名實體出現順序，組合不重疊詞彙，藉以生成相似語句，接著，計算每一個相似語句與原始文字語句的最長共同子符號序列，並對每一個相似語句進行評分，從中挑選出分數最高且符合第二門檻值的相似語句來作為近似語句。其中，評分方式包括比對該文字語句之符號與該相似語句之符號以進行計分，於符號正確時加分(如一分)及於符號錯誤時扣分(如一分)，以令分數最高且符合第二門檻值之相似語句作為該近似語句。 In step S502, the similar sentence generation module combines the vocabularies corresponding to the potential named entities without overlap to generate similar sentences, scores them, and selects the similar sentence with the highest score that meets a second threshold as the closest sentence. In this step, the similar sentence generation module combines non-overlapping vocabularies according to the sentence patterns predetermined by the system and the order in which the potential named entities appear in the text sentence, thereby generating similar sentences; it then computes the longest common subsymbol sequence between each similar sentence and the original text sentence and scores each similar sentence, selecting the one with the highest score that meets the second threshold as the closest sentence. The scoring compares the symbols of the text sentence with the symbols of the similar sentence, adding one point for each correct symbol and deducting one point for each incorrect symbol, so that the similar sentence with the highest score that meets the second threshold becomes the closest sentence.

於步驟S503，語句意圖評選模組依據該近似語句之數量，提供用於觸發對應服務之參數、產生對應之回覆訊息或是產生互動訊息。本步驟即語句意圖評選模組根據前一步驟產生之近似語句的數量，給予後段系統或平台執行對應回饋，包括於該近似語句僅有一個時，透過後端之參數履交模組以生成觸發該對應服務之參數，或是於該近似語句為零個或多個時，透過後端之答覆語句生成模組以產生該回覆訊息或該互動訊息。 In step S503, the sentence intent selection module, according to the number of closest sentences, provides the parameters for triggering the corresponding service, generates the corresponding reply message, or generates an interaction message. In this step, the sentence intent selection module gives the downstream system or platform the corresponding feedback based on the number of closest sentences produced in the previous step: when there is exactly one closest sentence, the back-end parameter delivery module generates the parameters that trigger the corresponding service; when there are zero closest sentences or more than one, the back-end reply sentence generation module produces the reply message or the interaction message.

於其他實施例中,於系統執行語意理解之運作前,係預先透過聲控句型建構模組由多個訓練語句中匡列出重要詞彙組合且忽略不重要字詞,以建構出該預存之詞彙。易言之,透過聲控句型建構模組從訓練語句中找出重要詞彙組合,成為預存之詞彙,以供系統分析使用。 In other embodiments, before the system executes the operation of semantic understanding, the voice-controlled sentence construction module is used to list important vocabulary combinations and ignore unimportant words in advance through the voice-controlled sentence construction module to construct the pre-stored vocabulary . In other words, through the voice-activated sentence construction module, important vocabulary combinations are found from the training sentences and become pre-stored vocabulary for system analysis.

以下以一具體實施例說明本發明。於本實施例中,係使用自然語言建構聲控服務支援的句型,使用二種命名實體(Named Entity,NE)識別方法,透過一套潛在NE初選機制推舉各種潛在NE,再依句型組合潛在NE生成相似語句,最後,透過近似語句評選公式,確立語句意圖,選定NE組合以供服務觸發參數履交或是答覆語句生成,本發明融合兩種NE識別結果,發揮綜效涵蓋缺字與錯別字情況,故能有效提升NE識別正確率。 The present invention is described below with a specific embodiment. In this embodiment, natural language is used to construct the sentence patterns supported by voice-activated services, and two named entity (Named Entity, NE) recognition methods are used to recommend various potential NEs through a potential NE primary selection mechanism, and then combined according to sentence patterns Potential NEs generate similar sentences. Finally, through similar sentence selection formulas, sentence intentions are established, and NE combinations are selected for service trigger parameter fulfillment or reply sentence generation. The present invention integrates the two NE recognition results, and exerts synergistic effects covering missing characters and typos, so it can effectively improve the accuracy of NE recognition.

首先,令聲控服務支援下面表二的句型,其中,{Story}包含「三隻小豬」等故事,{Song}包含「蜘蛛」等歌曲,{Listen}包含「要聽」等動詞。 First, let the voice control service support the sentence patterns in Table 2 below, where {Story} contains stories such as "Three Little Pigs", {Song} contains songs such as "Spider", and {Listen} contains verbs such as "I want to listen".

Figure 110146063-A0101-12-0017-3
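Table 2 is available only as an image. A plausible reconstruction of such a pattern registry, with illustrative intent and parameter names (the actual table's contents may differ), might look like:

```python
# Hypothetical reconstruction of Table 2: each supported sentence pattern
# maps to an intent and the slot whose NE becomes the service parameter.
PATTERNS = {
    ("Listen", "Story"): {"intent": "play_story", "param": "Story"},
    ("Listen", "Song"):  {"intent": "play_song",  "param": "Song"},
    ("Story",):          {"intent": "play_story", "param": "Story"},
    ("Song",):           {"intent": "play_song",  "param": "Song"},
}

VOCAB = {
    "Story":  ["三隻小豬"],  # stories
    "Song":   ["蜘蛛"],      # songs
    "Listen": ["要聽"],      # verbs
}

def intent_of(pattern):
    # look up the intent for a recognized slot sequence
    return PATTERNS[tuple(pattern)]["intent"]

print(intent_of(("Listen", "Story")))  # play_story
```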

當缺字語句「我要聽三隻豬」輸入本系統時,使用字符命名實體識別器尋找語句中潛在NE。首先,對應流程301,依語句與詞彙的匹配率推舉Top K則詞彙,於本實施例中,結果Top K詞彙依序為「三隻小豬」與「要聽」,即Top 2。接著,對應流程302,針對每一條詞彙,以滑動視窗方式(一個字一個字移位)找出詞彙與語句的最長共同子符號序列(LCS),其中LCS長度≦詞彙長度。 When the word-missing sentence "I want to listen to three pigs" is input into the system, a character named entity recognizer is used to find potential NEs in the sentence. First, corresponding to the process 301, the Top K vocabulary is recommended according to the matching rate between the sentence and the vocabulary. In this embodiment, the resulting Top K vocabulary is "Three Little Pigs" and "To Listen", namely Top 2. Next, corresponding to the process 302, for each vocabulary, the longest common subsymbol sequence (LCS) between the vocabulary and the sentence is found in a sliding window manner (shifting one word by one word), wherein the LCS length≦vocabulary length.

有關詞彙「三隻小豬」其字符LCS的計算過程如下。「我要聽三隻豬」為初始所提供之缺字語句,與詞彙「三隻小豬」進行字符比對,「我要聽三隻豬」字符編號依序為0-5。 The calculation process of the character LCS of the word "Three Little Pigs" is as follows. "I want to listen to three pigs" is the word-missing sentence provided initially, and the character comparison is performed with the vocabulary "three little pigs". The character numbers of "I want to listen to three pigs" are 0-5 in sequence.

將詞彙「三隻小豬」與文字語句之片段「我要聽三」比對,字符LCS分數=1-3=-2,其中,1表示有一個字符相同,3表示有三個字符不同,故字符LCS分數為兩者相減後為-2。 Comparing the vocabulary "Three Little Pigs" with the text segment "I want to listen to three", the character LCS score = 1-3=-2, where 1 means that there is one character that is the same, and 3 means that there are three characters that are different, so The character LCS score is -2 after subtracting the two.

將詞彙「三隻小豬」與文字語句之片段「要聽三隻」比對,字符LCS分數=2-2=0,其中,前面的2表示有二個字符相同,後面的2表示有二個字符不同,故字符LCS分數為兩者相減後為0。 Comparing the vocabulary "Three Little Pigs" with the fragment of the text sentence "To listen to three", the character LCS score = 2-2=0, wherein, the front 2 means that there are two characters that are the same, and the back 2 means that there are two characters are different, so the character LCS score is 0 after the subtraction of the two.

將詞彙「三隻小豬」與文字語句之片段「聽三隻豬」比對,字符LCS分數=3-2=1,其中,3表示有三個字符相同,2表示有二個字符不同(有文字置換情況),故字符LCS分數為兩者相減後為1。 Comparing the vocabulary "three little pigs" with the segment "listen to three pigs" in the text sentence, the character LCS score = 3-2=1, where 3 means that there are three characters that are the same, and 2 means that there are two characters that are different (there are text replacement), so the character LCS score is 1 after the subtraction of the two.

將詞彙「三隻小豬」與文字語句之片段「三隻豬」比對,字符LCS分數=3-1=2,其中,3表示有三個字符相同,1表示有一個字符不同(少字),故字符LCS分數為兩者相減後為2。 Compare the vocabulary "three little pigs" with the fragment "three pigs" of the text sentence, the character LCS score = 3-1=2, where 3 means that there are three characters that are the same, and 1 means that there is one character that is different (less characters) , so the character LCS score is 2 after subtracting the two.

綜上，「我要聽三隻豬」與詞彙「三隻小豬」作字符比對後，最終，結果流程302之LCS位置為(3,6)，分數2，也就是從編號3開始(「三」字所在)，到編號6為止(不含編號6)。 In summary, after the character comparison between "我要聽三隻豬" and the vocabulary "三隻小豬", the resulting LCS position of process 302 is (3,6) with a score of 2, that is, starting from index 3 (where the character "三" is located) and ending before index 6 (exclusive).

另外,有關另一個詞彙「要聽」其字符LCS的計算過程如下。同樣地,「我要聽三隻豬」為初始所提供之缺字語句,與詞彙「要聽」進行字符比對,「我要聽三隻豬」字符編號依序為0-5。 In addition, the calculation process of the character LCS of another word "want to listen" is as follows. Similarly, "I want to listen to three pigs" is the word-missing sentence provided initially, and the character comparison is performed with the vocabulary "Yao Ting", and the character numbers of "I want to listen to three pigs" are 0-5 in sequence.

將詞彙「要聽」與文字語句之片段「我要」比對,字符LCS分數=1-1=0,其中,前面的1表示有一個字符相同,後面的1表示有一個字符不同,故字符LCS分數為兩者相減後為0。 Comparing the vocabulary "want to listen" with the segment "I want" in the text sentence, the character LCS score = 1-1=0, where the preceding 1 indicates that there is a character that is the same, and the following 1 indicates that there is a character that is different, so the character The LCS score is 0 after subtracting the two.

將詞彙「要聽」與文字語句之片段「要聽」比對，字符LCS分數=2-0=2，其中，2表示有二個字符相同，後面的0表示沒有字符不同，故字符LCS分數為兩者相減後為2。 Comparing the vocabulary "要聽" with the text segment "要聽", the character LCS score = 2 - 0 = 2, where 2 means two characters match and 0 means no characters differ, so the score after subtraction is 2.

將詞彙「要聽」與文字語句之片段「聽三」比對,字符LCS分數=2-1=1,其中,2表示有二個字符相同,後面的1表示有一個字符不同,故字符LCS分數為兩者相減後為1。 Comparing the vocabulary "to listen" with the segment "listen three" of the text sentence, the character LCS score = 2-1=1, where 2 means that there are two characters that are the same, and the following 1 means that there is a character that is different, so the character LCS The score is 1 after subtracting the two.

故「我要聽三隻豬」與詞彙「要聽」作字符比對後，最終，流程302之結果LCS位置為(1,3)，分數2，也就是從編號1開始(「要」字所在)，到編號3為止(不含編號3)。 Therefore, after the character comparison between "我要聽三隻豬" and the vocabulary "要聽", the resulting LCS position of process 302 is (1,3) with a score of 2, that is, starting from index 1 (where the character "要" is located) and ending before index 3 (exclusive).

接下來,對應流程303,從前述Top 2詞彙挑選分數最高且≧1的詞彙作為潛在NE,結果找出潛在詞彙為「三隻小豬」,LCS為「三隻豬」且分數為2。之後,以符號★遮蔽語句中LCS,再尋找「我要聽★★★」的其他潛在NE。 Next, corresponding to the process 303, the vocabulary with the highest score and ≧1 is selected from the aforementioned Top 2 vocabulary as a potential NE. As a result, the potential vocabulary is found to be "three little pigs", the LCS is "three pigs" and the score is 2. After that, cover the LCS in the sentence with the symbol ★, and then search for other potential NEs of "I want to listen to ★★★".

再次執行流程301,依語句與詞彙的匹配率,推舉Top K則詞彙。結果Top K詞彙為「要聽」,接著,執行流程302,針對每一條詞彙,以滑動視窗方式找出該詞彙與語句片段的LCS,其中,語句片段長度≦詞彙長度。 The process 301 is executed again, and the Top K words are recommended according to the matching rate between the sentences and the words. As a result, the Top K vocabulary is "to be listened to". Then, the process 302 is executed, and for each vocabulary, the LCS between the vocabulary and the sentence segment is found in the form of a sliding window, wherein the sentence segment length≦vocabulary length.

關於詞彙「要聽」其字符LCS的計算過程如下。「我要聽★★★」為遮蔽後語句,與詞彙「要聽」進行字符比對,「我要聽★★★」字符編號依序為0-5。 The calculation process of the character LCS of the vocabulary "To Listen" is as follows. "I want to listen to ★★★" is a masked sentence, which is compared with the word "want to listen". The character numbers of "I want to listen to ★★★" are 0-5 in sequence.

將詞彙「要聽」與文字語句之片段「我要」比對,字符LCS分數=1-1=0;將詞彙「要聽」與文字語句之片段「要聽」比對,字符LCS分數=2-0=2;將詞彙「要聽」與文字語句之片段「聽★」比對,字符LCS分數=2-1=1。因此,「我要聽★★★」與詞彙「要聽」作字符比對後,結果流程302之LCS為「要聽」,分數2。 Compare the vocabulary "Yaolisten" with the fragment "I want" of the text sentence, the character LCS score = 1-1=0; compare the vocabulary "Yaolisten" with the fragment "Yaolisten" of the text sentence, the character LCS score = 2-0=2; compare the vocabulary "want to listen" with the segment "listen★" of the text sentence, the character LCS score=2-1=1. Therefore, after the character comparison of "I want to listen ★★★" and the word "YaoListen", the LCS of the result flow 302 is "YaoListen", with a score of 2.

再次，執行流程303，從Top 1詞彙挑選分數最高且≧1的詞彙作為潛在NE。結果流程303之潛在詞彙為「要聽」，LCS為「要聽」且分數為2。之後，執行流程304，以符號★遮蔽語句中LCS，再尋找「我★★★★★」其他潛在NE，結果不存在其他潛在NE，進入流程305，彙整字符命名實體識別器執行結果如下面表三。 Again, process 303 is executed: from the Top 1 vocabulary, the vocabulary with the highest score that is ≧1 is selected as a potential NE. The resulting potential vocabulary of process 303 is "要聽", with LCS "要聽" and a score of 2. Afterwards, process 304 is executed, masking the LCS in the sentence with the symbol ★, and other potential NEs are searched for in "我★★★★★". Since no other potential NE exists, the flow proceeds to process 305, and the execution results of the character named entity recognizer are compiled as shown in Table 3 below.

Figure 110146063-A0101-12-0019-4

Figure 110146063-A0101-12-0020-5

另外，音符命名實體識別器使用音符尋找語句中潛在NE，其中，音符單位為一個中文字的注音符號不含聲調或一個英文單字詞幹。因此，於本實施例中，語句「我要聽三隻豬」與詞彙經文字轉注音、注音轉音符之過程如下面表四。 In addition, the phonetic-symbol named entity recognizer uses phonetic symbols to find potential NEs in the sentence, where the symbol unit is the Zhuyin phonetic symbols of one Chinese character without the tone, or the stem of one English word. Therefore, in this embodiment, the sentence "我要聽三隻豬" and the vocabularies are first converted from text to Zhuyin and then from Zhuyin to symbol codes, as shown in Table 4 below.

Figure 110146063-A0101-12-0020-6

對應流程301,依語句與詞彙的匹配率,推舉Top K則詞彙。結果Top K詞彙依序為「三隻小豬」、「蜘蛛」與「要聽」。接著,對應流程302,針對每一條詞彙,以滑動視窗方式找出詞彙與語句的LCS,其中LCS長度≦詞彙長度。流程302比對過程如下。 Corresponding to the process 301, Top K words are recommended according to the matching rate between sentences and words. As a result, the Top K words were "Three Little Pigs", "Spider" and "To Listen". Next, corresponding to the process 302 , for each vocabulary, the LCS of the vocabulary and the sentence is found out in a sliding window manner, wherein the length of the LCS≦the length of the vocabulary. The comparison process in flow 302 is as follows.

A B C D E F為語句音符，其中，音符代碼為D E G F之詞彙「三隻小豬」與語句片段「三隻豬」比對，音符LCS位置(3,6)，分數3-1=2；音符代碼為E F之詞彙「蜘蛛」與語句片段「隻豬」比對，音符LCS位置(4,6)，分數2-0=2；音符代碼為B C之詞彙「要聽」與語句片段「要聽」比對，音符LCS位置(1,3)，分數2-0=2。 A B C D E F are the symbol codes of the sentence. Comparing the vocabulary "三隻小豬", whose symbol code is D E G F, with the sentence fragment "三隻豬" gives LCS position (3,6) and score 3-1=2; comparing the vocabulary "蜘蛛", whose symbol code is E F, with the sentence fragment "隻豬" gives LCS position (4,6) and score 2-0=2; comparing the vocabulary "要聽", whose symbol code is B C, with the sentence fragment "要聽" gives LCS position (1,3) and score 2-0=2.

接下來，對應流程303，從3個詞彙音符挑選分數最高且≧1的詞彙作為潛在NE，結果流程303之潛在詞彙為「三隻小豬」，LCS為「三隻豬」且分數為2。之後，對應流程304，以符號★遮蔽語句中LCS，再尋找「我要聽★★★」其他潛在NE。 Next, corresponding to process 303, the vocabulary with the highest score that is ≧1 is selected from the three vocabularies as a potential NE. The resulting potential vocabulary of process 303 is "三隻小豬", with LCS "三隻豬" and a score of 2. Afterwards, corresponding to process 304, the LCS in the sentence is masked with the symbol ★, and other potential NEs are searched for in "我要聽★★★".

再次對應流程301,依語句與詞彙的匹配率,推舉Top K則詞彙。結果本次流程301之Top K詞彙為「要聽」,接著,對應流程302,針對每一條詞彙,以滑動視窗方式找出詞彙與語句的LCS,其中LCS長度≦詞彙長度。 Corresponding to the process 301 again, Top K words are recommended according to the matching rate between sentences and words. As a result, the Top K vocabulary in the process 301 is "to listen to". Then, corresponding to the process 302, for each vocabulary, the LCS of the vocabulary and the sentence is found in the form of a sliding window, wherein the length of the LCS≦the length of the vocabulary.

同樣地，A B C D E F為語句音符，其中，音符代碼為B C之詞彙「要聽」與語句片段「要聽」比對，音符LCS位置(1,3)，分數2-0=2。 Similarly, A B C D E F are the sentence notes; the vocabulary "要聽" (note code B C) is compared with the sentence fragment "要聽": note LCS position (1,3), score 2-0=2.

再次執行流程303，結果潛在詞彙為「要聽」，LCS為「要聽」且分數為2。之後，執行流程304，以符號★遮蔽語句中LCS，再尋找「我★★★★★」其他潛在NE。結果不存在其他潛在NE，故進到流程305，彙整音符命名實體識別器執行結果如下面表五。 Flow 303 is executed again; the resulting potential vocabulary is "要聽", whose LCS is "要聽", with a score of 2. Afterwards, flow 304 masks the LCS in the sentence with the symbol ★ and searches "我★★★★★" for other potential NEs. Since no other potential NE exists, the method proceeds to flow 305, and the execution results of the note named entity recognizer are compiled as in Table 5 below.
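The whole loop of flows 301–304 (score each vocabulary, take the best one with score ≧ 1, mask its matched symbols with ★, repeat until nothing qualifies) can be approximated as below. This is an illustrative sketch under stated assumptions, not the patented implementation: ties are broken by vocabulary length, and only the symbols actually matched by the LCS are masked, mirroring how「三隻豬」becomes ★★★ while「要聽」survives for the second pass.

```python
def lcs_match(vocab, frag):
    """LCS length plus the fragment indices of the matched symbols."""
    m, n = len(vocab), len(frag)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m):
        for j in range(n):
            dp[i + 1][j + 1] = dp[i][j] + 1 if vocab[i] == frag[j] else max(dp[i + 1][j], dp[i][j + 1])
    idx, i, j = [], m, n
    while i and j:  # backtrack to recover the matched fragment positions
        if vocab[i - 1] == frag[j - 1] and dp[i][j] == dp[i - 1][j - 1] + 1:
            idx.append(j - 1); i -= 1; j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return dp[m][n], idx[::-1]

def best_match(vocab, sentence):
    """Best sliding-window match: (score, absolute matched indices)."""
    w, best = len(vocab), (float("-inf"), [])
    for start in range(max(1, len(sentence) - w + 1)):
        l, idx = lcs_match(vocab, sentence[start:start + w])
        score = l - (w - l)  # +1 per matched symbol, -1 per miss
        if score > best[0]:
            best = (score, [start + k for k in idx])
    return best

def find_potential_nes(vocabs, sentence, threshold=1):
    """Repeatedly take the best vocabulary (score >= threshold),
    mask its matched symbols with '★', then search again."""
    found = []
    while True:
        cands = [(best_match(v, sentence), v) for v in vocabs]
        (score, idx), vocab = max(cands, key=lambda c: (c[0][0], len(c[1])))
        if score < threshold or not idx:
            break
        found.append((vocab, score))
        hit = set(idx)
        sentence = "".join("★" if k in hit else ch for k, ch in enumerate(sentence))
    return found, sentence
```

Running it on the example notes extracts "DEGF" (「三隻小豬」) first, then "BC" (「要聽」), leaving the fully masked sentence "A★★★★★", in line with flow 305's compiled result.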

Figure 110146063-A0101-12-0021-7

於字符命名實體識別器以及音符命名實體識別器執行完後，彙整探勘結果(表三和表五)，最終得到如表五的結果，之後將用於相似語句生成與近似語句評選。對應流程401，根據系統支援的句型跟潛在NE出現順序，組合不重疊詞彙，生成相似語句。對應流程402，計算每一則相似語句與原始語句最長共同子符號序列，計分方式為符號正確則加一分，符號錯誤(多/少/置換一個符號)則扣一分。 After the character named entity recognizer and the note named entity recognizer finish, the mining results (Tables 3 and 5) are consolidated into the final result shown in Table 5, which is then used for similar sentence generation and approximate sentence selection. Corresponding to flow 401, non-overlapping vocabularies are combined according to the sentence patterns supported by the system and the order in which the potential NEs appear, generating similar sentences. Corresponding to flow 402, the longest common sub-symbol sequence between each similar sentence and the original sentence is computed; one point is added for each correct symbol, and one point is deducted for each symbol error (an extra, missing, or substituted symbol).

例如原始語句之字符「我要聽三隻豬」和音符「A B C D E F」，與每種相似語句評分過程如下面表六。 For example, the original sentence has characters "我要聽三隻豬" and notes "A B C D E F"; the scoring against each similar sentence is shown in Table 6 below.

Figure 110146063-A0101-12-0022-8
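The flow-402 scoring (add one point per correct symbol, deduct one per extra, missing, or substituted symbol) reads naturally as an LCS-based measure. The sketch below is one plausible interpretation, not the patent's exact formula; the note codes come from the example,「要聽三隻小豬」≈ "BCDEGF" against the original「我要聽三隻豬」≈ "ABCDEF".

```python
def lcs_len(a, b):
    # Dynamic-programming longest-common-subsequence length.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ca in enumerate(a):
        for j, cb in enumerate(b):
            dp[i + 1][j + 1] = dp[i][j] + 1 if ca == cb else max(dp[i + 1][j], dp[i][j + 1])
    return dp[len(a)][len(b)]

def similarity_score(candidate, original):
    """+1 for every symbol on the longest common subsequence,
    -1 for every symbol either side has outside of it."""
    l = lcs_len(candidate, original)
    return l - (len(candidate) - l) - (len(original) - l)

# 5 common symbols, one extra on each side: 5 - 1 - 1 = 3
print(similarity_score("BCDEGF", "ABCDEF"))  # → 3
```

Candidates scoring ≧ 0 are kept, and the highest-scoring one becomes the approximate sentence in flow 403.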

接下來，對應流程403，從所有相似語句挑分數≧0且最高分的近似語句，結果相似語句「要聽三隻小豬」分數最高，推選為唯一近似語句，從句型對照下面表七可知，語句意圖為listenStory，選定詞彙組合後，由流程404將NE詞彙填至對應的履行參數，連同意圖代碼後傳後端restful API觸發服務，例如：/listenStory?Book=三隻小豬。另外，根據選定NE詞彙置換開頭介紹樣板中的NE，可得開頭介紹語句，供串流撥放前節目名稱介紹「為您播放故事三隻小豬」。 Next, corresponding to flow 403, the approximate sentence with the highest score ≧ 0 is selected from all similar sentences. The similar sentence "要聽三隻小豬" scores highest and is chosen as the sole approximate sentence. From the sentence pattern comparison in Table 7 below, the sentence intent is listenStory. After the vocabulary combination is selected, flow 404 fills the NE vocabulary into the corresponding fulfillment parameters and passes them, together with the intent code, to the back-end RESTful API to trigger the service, for example /listenStory?Book=三隻小豬. In addition, substituting the selected NE vocabulary into the opening-introduction template yields the opening sentence that announces the program name before streaming: "為您播放故事三隻小豬" (playing the story Three Little Pigs for you).
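Flow 404's parameter filling and the opening-introduction substitution can be sketched as follows. The endpoint name `listenStory`, the parameter `Book`, and the template string come from the example above, while the helper names `build_service_call` and `fill_template` are hypothetical, introduced only for illustration.

```python
from urllib.parse import quote

def build_service_call(intent, params):
    """Form the back-end RESTful path from the intent code and
    the selected NE fulfillment parameters (values URL-encoded)."""
    query = "&".join(f"{k}={quote(v)}" for k, v in params.items())
    return f"/{intent}?{query}"

def fill_template(template, **nes):
    """Substitute selected NE vocabulary into an introduction template."""
    return template.format(**nes)

call = build_service_call("listenStory", {"Book": "三隻小豬"})
intro = fill_template("為您播放故事{Story}", Story="三隻小豬")
```

`call` is the percent-encoded equivalent of /listenStory?Book=三隻小豬, and `intro` is the announcement sentence「為您播放故事三隻小豬」.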

Figure 110146063-A0101-12-0022-9

假使「三隻小豬」既是{Story}也是{Song}，則語句「我要聽三隻豬」會有兩則同分的近似語句，「要聽三隻小豬」其意圖可能是listenStory或listenSong，此時需要跟用戶確認。對應流程406，組成近似語句的NE套入選項樣板可得數筆問題選項，組成選擇題詢問用戶意圖，問題選項如下面表八，答覆「您是要聽故事三隻小豬還是歌曲三隻小豬」選擇題以釐清用戶意圖。 If "三隻小豬" is both a {Story} and a {Song}, the sentence "我要聽三隻豬" yields two approximate sentences with the same score; the intent of "要聽三隻小豬" may be listenStory or listenSong, and the user must be asked to confirm. Corresponding to flow 406, the NEs that form the approximate sentences are slotted into the option template to obtain several question options, which are combined into a multiple-choice question about the user's intent. The question options are shown in Table 8 below, and the multiple-choice question "您是要聽故事三隻小豬還是歌曲三隻小豬" (Do you want to hear the story Three Little Pigs or the song Three Little Pigs?) is posed to clarify the user's intent.
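When two approximate sentences tie as above, the NE combinations can be slotted into an option template to form the clarifying question of flow 406. A minimal sketch; the template wording and the (type, name) option structure are illustrative assumptions, not the patent's exact data model.

```python
def build_clarifying_question(options, prefix="您是要聽", sep="還是"):
    """Join option phrases (e.g. 故事三隻小豬 / 歌曲三隻小豬)
    into a single multiple-choice question."""
    phrases = [f"{kind}{name}" for kind, name in options]
    return prefix + sep.join(phrases)

q = build_clarifying_question([("故事", "三隻小豬"), ("歌曲", "三隻小豬")])
print(q)  # → 您是要聽故事三隻小豬還是歌曲三隻小豬
```

The user's answer then resolves the intent to listenStory or listenSong before the service call is issued.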

Figure 110146063-A0101-12-0023-10

此外，本發明還揭示一種電腦可讀媒介，係應用於具有處理器(例如，CPU、GPU等)及/或記憶體的計算裝置或電腦中，且儲存有指令，並可利用此計算裝置或電腦透過處理器及/或記憶體執行此電腦可讀媒介，以於執行此電腦可讀媒介時執行上述之方法及各步驟。 In addition, the present invention also discloses a computer-readable medium, which is applied to a computing device or computer having a processor (for example, a CPU or GPU) and/or memory and which stores instructions, such that the computing device or computer can execute the computer-readable medium through the processor and/or memory, thereby carrying out the above-mentioned method and its steps.

本發明之模組、單元、裝置等包括微處理器及記憶體，而演算法、資料、程式等係儲存記憶體或晶片內，微處理器可從記憶體載入資料或演算法或程式進行資料分析或計算等處理，在此不予贅述。易言之，本發明之處理豐富文字之語意理解系統可於電子設備上執行，例如一般電腦、平板或是伺服器，在收到文字語句後執行分析與運算，故處理豐富文字之語意理解系統所進行程序，可透過軟體設計並架構在具有處理器、記憶體等元件之電子設備上，以於各類電子設備上運行；另外，亦可將處理豐富文字之語意理解系統之各模組或單元分別以獨立元件組成，例如設計為計算器、記憶體、儲存器或是具有處理單元的韌體，皆可成為實現本發明之組件，而命名實體識別器等相關組件，亦可選擇以軟體程式、硬體或韌體架構呈現。 The modules, units, and devices of the present invention include a microprocessor and memory, with algorithms, data, and programs stored in the memory or on a chip; the microprocessor can load data, algorithms, or programs from the memory for data analysis, computation, and similar processing, which will not be elaborated here. In other words, the semantic understanding system for rich text of the present invention can run on electronic equipment such as an ordinary computer, tablet, or server, performing analysis and computation upon receiving a text sentence. Its procedures can therefore be implemented in software and deployed on electronic devices equipped with a processor, memory, and other components, so as to run on all kinds of electronic equipment. Alternatively, the modules or units of the system can each be built from independent components, such as a calculator, memory, storage, or firmware with a processing unit, any of which can realize the components of the present invention; related components such as the named entity recognizers may likewise be presented as software programs, hardware, or firmware architectures.

綜上，本發明之處理豐富文字之語意理解系統及其方法，係用於將內容豐富文字轉換成輕量語意以便於分析理解，其中，潛在命名實體識別模組整合多種NE識別方法，尋找潛在NE詞彙，相似語句生成模組根據系統支援句型跟語句潛在NE出現順序，組合不重疊詞彙生成相似語句，語句意圖評選模組從數則相似語句挑出最接近原始語句，推估語句意圖，選定NE組合。易言之，對於用戶使用聲控服務往往不假思索，而有節目名稱過長而漏講部分字詞或是節目名稱諧音往往錯別字誤植等狀況，進而影響聲控辨識率，本發明提出一套框架構建聲控句型、評選數種Named Entity(NE)識別結果，推估語句意圖，發揮綜效因應語句缺字與錯別字，藉以提升聲控服務品質與準確度。故本發明具有以下功效。 In summary, the semantic understanding system and method for rich text of the present invention converts content-rich text into lightweight semantics for easier analysis and understanding. The potential named entity recognition module integrates multiple NE recognition methods to find potential NE vocabularies; the similar sentence generation module combines non-overlapping vocabularies into similar sentences according to the sentence patterns supported by the system and the order in which potential NEs appear; and the sentence intent selection module picks the sentence closest to the original from several similar sentences, estimates the sentence intent, and selects the NE combination. In other words, users often speak to voice-controlled services without deliberation, omitting part of an overly long program name or substituting homophones and wrong characters for it, which lowers the voice recognition rate. The present invention proposes a framework that constructs voice-command sentence patterns, evaluates several Named Entity (NE) recognition results, and estimates sentence intent, exploiting their synergy to cope with missing and mistaken characters, thereby improving the quality and accuracy of voice-controlled services. The present invention thus has the following effects.

首先，透過聲控句型建構模組框架及歸納重要詞彙，忽略不重要字詞，故不需要大量範例語句，不需要大量運算資源進行訓練。 First, by framing sentence patterns with the voice-command sentence pattern construction module and summarizing the important vocabulary while ignoring unimportant words, the approach requires neither a large number of example sentences nor large computing resources for training.

其次，命名實體(Named Entity，NE)評選框架融合多種NE識別結果，發揮綜效提升缺字或是錯別字詞彙之辨識率。 Second, the Named Entity (NE) selection framework fuses multiple NE recognition results, exploiting their synergy to raise the recognition rate for vocabularies with missing or mistaken characters.

上述實施例僅為例示性說明,而非用於限制本發明。任何熟習此項技藝之人士均可在不違背本發明之精神及範疇下,對上述實施例進行修飾與改變。因此,本發明之權利保護範圍係由本發明所附之申請專利範圍所定義,只要不影響本發明之效果及實施目的,應涵蓋於此公開技術內容中。 The above-mentioned embodiments are for illustrative purposes only, and are not intended to limit the present invention. Anyone skilled in the art can make modifications and changes to the above-mentioned embodiments without departing from the spirit and scope of the present invention. Therefore, the protection scope of the present invention is defined by the scope of patent application attached to the present invention, as long as it does not affect the effect and implementation purpose of the present invention, it should be included in this disclosed technical content.

1:處理豐富文字之語意理解系統 1: Semantic understanding system for processing rich text

11:潛在命名實體識別模組 11: Potential Named Entity Recognition Module

12:相似語句生成模組 12: Similar sentence generation module

13:語句意圖評選模組 13: Sentence intent selection module

14:聲控句型建構模組 14: Voice-activated sentence construction module

Claims (11)

一種處理豐富文字之語意理解系統,係包括:潛在命名實體識別模組,係具有多個不同類型之命名實體識別器且接收經語音轉換之文字語句,其中,字符命名實體識別器與音符命名實體識別器分別對該文字語句之符號與預存詞彙之符號進行匹配率計算,挑出匹配率較高的預存詞彙,再以詞彙長度為範圍、滑動視窗方式在該文字語句與該匹配率較高的預存詞彙間尋找最長共同子符號序列,並比對該文字語句之符號與該預存詞彙之符號以進行計分,以於符號正確時加分及符號錯誤時扣分,俾挑選出分數最高且符合第一門檻值之預存詞彙作為潛在命名實體,以及於找出一個潛在命名實體後,利用遮蔽該文字語句中共同子符號序列,再以相同程序尋找該文字語句中的其他潛在命名實體,直到找無其他潛在命名實體為止;相似語句生成模組,係用於依據系統所支援句型以及該文字語句中潛在命名實體出現順序,以範圍不重疊方式組合該潛在命名實體對應之預存詞彙,以生成相似語句,再比對每個該相似語句與該文字語句之符號相似與相異處,以於符號正確時加分及符號錯誤時扣分,俾由其中挑選出分數最高且符合第二門檻值之相似語句作為近似語句;以及語句意圖評選模組,係用於依據該近似語句之意圖與數量,推估該文字語句之意圖,選定正確的命名實體,以產生該意圖對應服務之參數與該文字語句之回覆訊息。 A semantic understanding system for processing rich text, including: a latent named entity recognition module, which has a plurality of different types of named entity recognizers and receives speech-converted text sentences, wherein, the character named entity recognizer and the musical note named entity The recognizer calculates the matching rate between the symbols of the text sentence and the symbols of the pre-stored vocabulary, picks out the pre-stored words with a high matching rate, and then uses the length of the vocabulary as the range and slides the window to compare the words between the text sentence and the words with a high matching rate. 
Find the longest common sub-symbol sequence between the pre-stored vocabulary, and compare the symbols of the text sentence with the symbols of the pre-stored vocabulary to score, so as to add points when the symbols are correct and deduct points when the symbols are wrong, so as to select the highest score and meet the The pre-stored vocabulary of the first threshold is used as a potential named entity, and after a potential named entity is found, the common sub-symbol sequence in the word sentence is masked, and then other potential named entities in the word sentence are searched for using the same procedure until finding Until there are no other potential named entities; the similar sentence generation module is used to combine the pre-stored vocabulary corresponding to the potential named entities in a non-overlapping manner according to the sentence patterns supported by the system and the order of appearance of the potential named entities in the text sentence to generate Similar sentences, and then compare the similarities and differences between the symbols of each similar sentence and the text sentence, so as to add points when the symbols are correct and deduct points when the symbols are wrong, so as to select the highest score and meet the second threshold Similar sentences are used as similar sentences; and the sentence intent selection module is used to estimate the intent of the literal sentence based on the intent and quantity of the similar sentence, select the correct named entity, and generate the parameters of the service corresponding to the intent and the The reply message of the text statement. 如請求項1所述之處理豐富文字之語意理解系統,其中,該字符命名實體識別器係以一個中文字或一個英文單字為符號單位。 The semantic understanding system for processing rich text as described in Claim 1, wherein the character named entity recognizer uses a Chinese character or an English single character as a symbol unit. 
如請求項1所述之處理豐富文字之語意理解系統,其中,該音符命名實體識別器係以一個中文字的注音符號不含音調或一個英文單字詞幹為符號單位。 The semantic understanding system for processing rich text as described in Claim 1, wherein the phonetic note named entity recognizer uses a Chinese phonetic symbol without tone or an English word stem as a symbol unit. 如請求項1所述之處理豐富文字之語意理解系統,其中,該語句意圖評選模組於該近似語句僅有一個時,透過後端之參數屢交模組以生成觸發該對應服務之參數,於該近似語句為多個時,透過後端之答覆語句生成模組以產生數筆問題選項,組成選擇題以詢問用戶意圖,於該近似語句為零個時,透過該答覆語句生成模組產生語句以說明無對應服務。 The semantic understanding system for processing rich text as described in Claim 1, wherein, when the sentence intent selection module has only one similar sentence, the parameters for triggering the corresponding service are generated by repeatedly submitting to the module through back-end parameters, When there are multiple similar sentences, use the answer sentence generation module at the back end to generate several question options to form multiple choice questions to ask the user's intent. When there are zero similar sentences, use the answer sentence generation module to generate statement to indicate that there is no corresponding service. 如請求項1所述之處理豐富文字之語意理解系統,復包括聲控句型建構模組,係用於預先由多個訓練語句中匡列出重要詞彙組合且忽略不重要字詞,以建構出該系統所支援句型與需預存的詞彙。 The semantic understanding system for processing rich text as described in Claim 1 further includes a voice-controlled sentence structure module, which is used to list important vocabulary combinations and ignore unimportant words in advance from multiple training sentences, so as to construct The system supports sentence patterns and pre-stored vocabulary. 
一種處理豐富文字之語意理解方法,係由電腦設備執行該方法,該方法包括以下步驟:由潛在命名實體識別模組接收經語音轉換之文字語句,利用字符命名實體識別器與音符命名實體識別器分別對該文字語句之符號與預存詞彙之符號進行匹配率計算,挑出匹配率較高的預存詞彙,再以詞彙長度為範圍、滑動視窗方式在該文字語句與該匹配率較高的預存詞彙間尋找最長共同子符號序列,並比對該文字語句之符號與該預存詞彙之符號以進行計分,以於符號正確時加分及符號錯誤時扣分,俾挑選出分數最高且符合第一門檻值之預存詞彙作為潛在命名實體,以及於找出一個潛在命名實體後,係利用遮蔽該文字語句中共同子符號序列,再以相同程序尋找該文字語句中的其他潛在命名實體,直到找無其他潛在命名實體為止; 由相似語句生成模組依據系統所支援句型以及該文字語句中潛在命名實體出現順序,以範圍不重疊方式組合該潛在命名實體之預存詞彙,以生成相似語句,再比對每個該相似語句與該文字語句之符號相似與相異處,以於符號正確時加分及符號錯誤時扣分,俾由其中挑選出分數最高且符合第二門檻值之相似語句作為近似語句;以及令語句意圖評選模組依據該近似語句之意圖與數量,推估該文字語句之意圖,選定正確的命名實體,以產生該意圖對應服務之參數與該文字語句之回覆訊息。 A semantic understanding method for processing rich text, which is executed by a computer device, the method includes the following steps: a latent named entity recognition module receives a speech-converted text statement, and uses a character named entity recognizer and a musical note named entity recognizer Calculate the matching rate between the symbols of the text sentence and the symbols of the pre-stored vocabulary, pick out the pre-stored vocabulary with a high matching rate, and then use the length of the vocabulary as the range and slide the window to compare the words between the text sentence and the pre-stored vocabulary with a high matching rate Find the longest common sub-symbol sequence between them, and compare the symbols of the text sentence with the symbols of the pre-stored vocabulary to score, so as to add points when the symbols are correct and deduct points when the symbols are wrong, so as to select the highest score and meet the first The pre-stored vocabulary of the threshold value is used as a potential named entity, and after a potential named entity is found, the common sub-symbol sequence in the word sentence is masked, and then other potential named entities in the word sentence are searched for using the same procedure until none is found. 
other potential named entities; The similar sentence generation module combines the pre-stored vocabulary of the potential named entities in a non-overlapping manner according to the sentence patterns supported by the system and the order of appearance of the potential named entities in the text sentence to generate similar sentences, and then compares each of the similar sentences For the similarities and differences between the symbols and the text sentences, points are added when the symbols are correct and points are deducted when the symbols are wrong, so that similar sentences with the highest score and meeting the second threshold can be selected as similar sentences; The selection module estimates the intent of the text sentence based on the intent and quantity of the similar sentence, selects the correct named entity, and generates the parameters of the service corresponding to the intent and the reply message of the text sentence. 如請求項6所述之處理豐富文字之語意理解方法,其中,該字符命名實體識別器係以一個中文字或一個英文單字為符號單位。 The semantic understanding method for processing rich text as described in Claim 6, wherein the character named entity recognizer uses a Chinese character or an English single character as a symbol unit. 如請求項6所述之處理豐富文字之語意理解方法,其中,該音符命名實體識別器係以一個中文字的注音符號不含音調或一個英文單字詞幹為符號單位。 The semantic understanding method for processing rich text as described in Claim 6, wherein the phonetic named entity recognizer uses a Chinese phonetic symbol without tones or an English word stem as a symbol unit. 
如請求項6所述之處理豐富文字之語意理解方法,其中,該語句意圖評選模組於該近似語句僅有一個時,透過後端之參數屢交模組以生成觸發該對應服務之參數,於該近似語句為多個時,透過後端之答覆語句生成模組以產生數筆問題選項,組成選擇題以詢問用戶意圖,於該近似語句為零個時,透過該答覆語句生成模組產生語句以說明無對應服務。 The semantic understanding method for processing rich text as described in Claim 6, wherein, when the sentence intent selection module has only one similar sentence, the parameters of the backend are repeatedly passed to the module to generate parameters that trigger the corresponding service, When there are multiple similar sentences, use the answer sentence generation module at the back end to generate several question options to form multiple choice questions to ask the user's intent. When there are zero similar sentences, use the answer sentence generation module to generate statement to indicate that there is no corresponding service. 如請求項6所述之處理豐富文字之語意理解方法,復包括於進行語意理解之前,利用聲控句型建構模組由多個訓練語句中匡列出重要詞彙組合且忽略不重要字詞,以建構出該系統所支援句型與需預存的詞彙。 The semantic understanding method for processing rich text as described in Claim 6 further includes, before performing semantic understanding, using a voice-controlled sentence pattern construction module to list important word combinations and ignore unimportant words from multiple training sentences, so as to Construct the sentence patterns supported by the system and the vocabulary to be stored in advance. 一種電腦可讀媒介,應用於計算裝置或電腦中,係儲存有指令,以執行如請求項6至10之任一者所述之處理豐富文字之語意理解方法。 A computer-readable medium, used in a computing device or a computer, stores instructions to execute the semantic understanding method for processing rich text as described in any one of claims 6 to 10.
TW110146063A 2021-12-09 2021-12-09 Semantic understanding system for rich-text, method and computer readable medium thereof TWI803093B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW110146063A TWI803093B (en) 2021-12-09 2021-12-09 Semantic understanding system for rich-text, method and computer readable medium thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW110146063A TWI803093B (en) 2021-12-09 2021-12-09 Semantic understanding system for rich-text, method and computer readable medium thereof

Publications (2)

Publication Number Publication Date
TWI803093B true TWI803093B (en) 2023-05-21
TW202324381A TW202324381A (en) 2023-06-16

Family

ID=87424481

Family Applications (1)

Application Number Title Priority Date Filing Date
TW110146063A TWI803093B (en) 2021-12-09 2021-12-09 Semantic understanding system for rich-text, method and computer readable medium thereof

Country Status (1)

Country Link
TW (1) TWI803093B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140324572A1 (en) * 2005-09-14 2014-10-30 Millennial Media, Inc. System For Targeting Advertising Content To A Plurality Of Mobile Communication Facilities
CN105260360A (en) * 2015-10-27 2016-01-20 小米科技有限责任公司 Named entity identification method and device
TW201835784A (en) * 2016-12-30 2018-10-01 美商英特爾公司 The internet of things
CN110134931A (en) * 2019-05-14 2019-08-16 北京字节跳动网络技术有限公司 Media title generation method, device, electronic equipment and readable medium


Also Published As

Publication number Publication date
TW202324381A (en) 2023-06-16

Similar Documents

Publication Publication Date Title
CN109635270B (en) Bidirectional probabilistic natural language rewrite and selection
US10410627B2 (en) Automatic language model update
US10152971B2 (en) System and method for advanced turn-taking for interactive spoken dialog systems
JP7204690B2 (en) Tailor interactive dialog applications based on author-provided content
WO2016067418A1 (en) Conversation control device and conversation control method
US11016968B1 (en) Mutation architecture for contextual data aggregator
Ostendorf et al. Speech segmentation and spoken document processing
US11093110B1 (en) Messaging feedback mechanism
US8849668B2 (en) Speech recognition apparatus and method
US20220083577A1 (en) Information processing apparatus, method and non-transitory computer readable medium
US11263852B2 (en) Method, electronic device, and computer readable storage medium for creating a vote
JP5231484B2 (en) Voice recognition apparatus, voice recognition method, program, and information processing apparatus for distributing program
TWI803093B (en) Semantic understanding system for rich-text, method and computer readable medium thereof
US6735560B1 (en) Method of identifying members of classes in a natural language understanding system
Misu et al. Dialogue strategy to clarify user’s queries for document retrieval system with speech interface
Wang et al. Voice search
CN111429886B (en) Voice recognition method and system
Wiggers Modelling context in automatic speech recognition
Attanayake Statistical language modelling and novel parsing techniques for enhanced creation and editing of mathematical e-content using spoken input
US20230186898A1 (en) Lattice Speech Corrections
US20230215441A1 (en) Providing prompts in speech recognition results in real time
Duta Natural language understanding and prediction: from formal grammars to large scale machine learning
Adhikary Intelligent Techniques to Accelerate Everyday Text Communication
US11900072B1 (en) Quick lookup for speech translation
Yoshino Spoken Dialogue System for Information Navigation based on Statistical Learning of Semantic and Dialogue Structure