TWI803093B - Semantic understanding system for rich-text, method and computer readable medium thereof - Google Patents
- Publication number
- TWI803093B (application TW110146063A)
- Authority
- TW
- Taiwan
- Prior art keywords
- sentence
- text
- named entity
- vocabulary
- similar
- Prior art date
Description
The present invention relates to the technology of voice-controlled semantic recognition, and more particularly to a semantic understanding system and method for processing rich text, and a computer-readable medium thereof.
Voice-controlled services are widely used in scenarios such as home-appliance control, streaming-media playback, and event booking, bringing convenience to daily life. As more and more voice-controlled services emerge, accurately identifying the user's meaning becomes increasingly important. For example, users often speak to voice services without much thought, so utterances are generally terse and sometimes lack a subject, verb, or object, causing recognition errors. Users may also be unable to remember an overly long program title and omit part of it; for instance, for the program title 「經典101說給孩子聽的世界文學名著」, a user may utter only the fragment 「說給孩子聽的文學名著」, which likewise prevents accurate recognition. Furthermore, program titles may involve homophones, e.g., 「股癌」, 「談股論金」, or 「讀角戲」, so speech recognition may transcribe a homophonous but wrong character into the sentence. Missing characters and typos in utterances therefore pose additional challenges to the natural-language-processing quality of voice-controlled services.
More specifically, conventional natural-language-processing systems such as Dialogflow rely on fuzzy matching. When the program title in an utterance is missing characters — e.g., the utterance 「我要聽職涯」, where the title is missing 「99」 — Dialogflow may mistake the verb 「聽」 for part of the title and wrongly trigger the program 「職享聽你說」; statistics show that about 10% of missing-character cases lead to playback errors. When the title involves a homophone — e.g., the utterance 「我要聽Emily爆報」, where the homophonous character 「爆」 is not corrected — the program 「Emily報報」 cannot be played, and similar cases likewise account for about 10%. In addition, whether due to missing characters or typos, roughly another 10% of programs may not be recognized at all, requiring the system to ask the user again — e.g., the utterance 「我要聽大聯盟最新一集」 missing 「Hito」, or 「我要聽三金秀」 with the homophone 「斤」. Thus, conventional natural-language-processing systems show an error rate of about 20% when program titles are missing characters or contain homophones, leaving a clear need for improvement.
In view of the above problems, how to improve the voice-recognition rate — in particular, how to accurately infer the intent of an utterance even when the utterance is overly terse, part of a long program title is omitted, or homophones in the title cause typos — and thereby provide the user with the correct voice-controlled service, is a goal eagerly pursued by those skilled in the art.
To solve the above problems of the prior art, the present invention discloses a semantic understanding system for processing rich text, comprising: a potential named-entity recognition module, which has a plurality of named-entity recognizers of different types and receives a text sentence converted from speech, each named-entity recognizer computing and scoring the longest common subsymbol sequence between the text sentence and pre-stored vocabulary, so as to select the highest-scoring vocabulary item meeting a first threshold as a potential named entity; a similar-sentence generation module, which combines the vocabulary items corresponding to the potential named entities without repetition to generate similar sentences and score them, so that the highest-scoring similar sentence meeting a second threshold is selected as an approximate sentence; and a sentence-intent selection module, which, according to the number of approximate sentences, provides parameters for triggering a corresponding service, generates a corresponding reply message, or generates an interactive message.
The present invention further discloses a semantic understanding method for processing rich text, executed by a computer device and comprising the following steps: having a potential named-entity recognition module receive a text sentence converted from speech and, using named-entity recognizers of different types, compute and score the longest common subsymbol sequence between the text sentence and pre-stored vocabulary, so as to select the highest-scoring vocabulary item meeting a first threshold as a potential named entity; having a similar-sentence generation module combine the vocabulary items corresponding to the potential named entities without repetition to generate similar sentences and score them, so that the highest-scoring similar sentence meeting a second threshold is selected as an approximate sentence; and having a sentence-intent selection module, according to the number of approximate sentences, provide parameters for triggering a corresponding service, generate a corresponding reply message, or generate an interactive message.
In the aforementioned system and method, when the named-entity recognizer is a character named-entity recognizer, it uses one Chinese character or one English word as the symbol unit, and computes the matching rate between the symbols of the text sentence and the symbols of the vocabulary item to obtain the longest common subsymbol sequence.
In the aforementioned system and method, when the named-entity recognizer is a phonetic-symbol named-entity recognizer, it uses the tone-less phonetic (zhuyin) symbols of one Chinese character, or the stem of one English word, as the symbol unit, and computes the matching rate between the symbols of the text sentence and the symbols of the vocabulary item to obtain the longest common subsymbol sequence.
In the aforementioned system and method, the potential named-entity recognition module finds the longest common subsymbol sequence between the text sentence and the vocabulary item in a sliding-window manner, and the scoring compares the symbols of the text sentence with the symbols of the vocabulary item, adding points for correct symbols and deducting points for wrong ones, so that the highest-scoring vocabulary item meeting the first threshold is taken as the potential named entity.
In another embodiment, after finding a potential named entity, the potential named-entity recognition module masks the common subsequence in the text sentence and then repeats the same procedure to find other potential named entities in the text sentence, until no further potential named entity is found.
In the aforementioned system and method, the similar-sentence generation module combines non-overlapping vocabulary items according to the sentence patterns supported by the system and the order of appearance of the potential named entities to generate the similar sentences, then computes the longest common subsymbol sequence between each similar sentence and the text sentence and scores it by comparing the symbols of the text sentence with those of the similar sentence, adding points for correct symbols and deducting points for wrong ones, so that the highest-scoring similar sentence meeting the second threshold is taken as the approximate sentence.
In the aforementioned system and method, when there is exactly one approximate sentence, the sentence-intent selection module generates the parameters that trigger the corresponding service through a back-end parameter delivery module; when there are zero or multiple approximate sentences, it generates the reply message or the interactive message through a back-end reply-sentence generation module.
In the aforementioned system and method, the semantic understanding method for processing rich text further comprises a voice-controlled sentence-pattern construction module, which, before semantic understanding is performed, lists important vocabulary combinations from a plurality of training sentences while ignoring unimportant words, so as to construct the pre-stored vocabulary.
The present invention further discloses a computer-readable medium for use in a computing device or computer, storing instructions for executing the aforementioned semantic understanding method for processing rich text.
As can be seen from the above, the semantic understanding system and method for processing rich text of the present invention fuse the recognition results of several named-entity (NE) recognizers, exploiting their synergy to cover various situations and effectively reduce the named-entity error rate. Specifically, the present invention constructs the sentence patterns supported by the voice-controlled service in natural language, establishes a primary-selection mechanism to nominate various potential named entities, combines the potential named entities according to the sentence patterns to generate similar sentences, and finally establishes the sentence intent and selects the named-entity combination through an approximate-sentence selection formula. Compared with traditional models that first recognize the sentence intent and then fuzzily match potential named entities, the present invention avoids the preconceptions of a single model and fuzzy matching over a wrongly framed range, and thus effectively improves the named-entity recognition rate for sentences with missing characters and typos.
1: Semantic understanding system for processing rich text
11: Potential named-entity recognition module
111: Character named-entity recognizer
112: Phonetic-symbol named-entity recognizer
12: Similar-sentence generation module
13: Sentence-intent selection module
14: Voice-controlled sentence-pattern construction module
2: Speech-to-text module
3: Parameter delivery module
4: Reply-sentence generation module
5: Service application programming interface (API)
6: Text-to-speech module
301-305: Flows
401-406: Flows
S501-S503: Steps
FIG. 1 is a schematic architecture diagram of the semantic understanding system for processing rich text of the present invention.
FIG. 2 is a schematic architecture diagram of an application embodiment of the semantic understanding system for processing rich text of the present invention.
FIG. 3 is a flowchart of the primary selection of potential named entities by the potential named-entity recognition module in the present invention.
FIG. 4 is a flowchart of the similar-sentence generation module generating similar sentences and the sentence-intent selection module finding approximate sentences in the present invention.
FIG. 5 is a step diagram of the semantic understanding method for processing rich text of the present invention.
The technical content of the present invention is described below through specific embodiments; those skilled in the art can readily appreciate the advantages and effects of the present invention from the disclosure of this specification. The present invention may also be implemented or applied through other different embodiments.
FIG. 1 is a schematic architecture diagram of the semantic understanding system for processing rich text of the present invention. As shown, the semantic understanding system 1 for processing rich text comprises at least a potential named-entity recognition module 11, a similar-sentence generation module 12, and a sentence-intent selection module 13.
The potential named-entity recognition module 11 has a plurality of named-entity recognizers of different types and receives the text sentence converted from the speech input by the user. Each named-entity recognizer computes and scores the longest common subsymbol sequence between the text sentence and a pre-stored vocabulary item, so as to select the highest-scoring vocabulary item meeting a predetermined first threshold as a potential named entity. In other words, the module 11 integrates several named-entity (NE) recognition methods for the primary selection of potential NEs: multiple recognizers of different types are preset in the module; after the user's input speech is converted into a text sentence, it is sent to the module for analysis; each recognizer computes the longest common subsymbol sequence between the received text sentence and the pre-stored vocabulary; the text sentence is then compared against the longest common subsymbol sequence for scoring; and the vocabulary item with the highest score meeting the predetermined first threshold is selected as a potential named entity.
In one embodiment, the named-entity recognizers include a character named-entity recognizer and a phonetic-symbol named-entity recognizer. The character named-entity recognizer uses one Chinese character or one English word as the symbol unit, and computes the matching rate between the symbols of the text sentence and the symbols of the vocabulary item to obtain the longest common subsymbol sequence; the phonetic-symbol named-entity recognizer uses the tone-less phonetic (zhuyin) symbols of one Chinese character, or the stem of one English word, as the symbol unit for the same computation. In other words, the character NE recognition method and the phonetic-symbol NE recognition method performed by these recognizers address missing characters and homophonic typos in the vocabulary, respectively: the symbol unit of the character NE method is one Chinese character or one English word, while for the phonetic-symbol NE method a Chinese character is represented by its zhuyin or pinyin symbols without tone and an English word by its stem. Therefore, once the text sentence and the pre-stored vocabulary have been converted into symbols, the primary selection of potential NEs can proceed under the same evaluation framework.
In one embodiment, the potential named-entity recognition module 11 finds the longest common subsymbol sequence between the text sentence and the vocabulary item in a sliding-window manner, and the scoring compares the symbols of the text sentence with those of the vocabulary item, adding a point (e.g., one point) when a symbol is correct and deducting a point (e.g., one point) when a symbol is wrong, so that the highest-scoring vocabulary item meeting the predetermined first threshold is taken as the potential named entity. In brief, the Top K vocabulary items are first nominated according to the symbol matching rate between the text sentence and the pre-stored vocabulary, where K can be set according to actual needs; then, for each vocabulary item, the longest common subsymbol sequence between the vocabulary item and the text sentence is found with a sliding window, its length being at most the vocabulary length; finally, from the Top K items, the one with the highest score that is greater than or equal to the predetermined first threshold is selected as the potential named entity, where the threshold can be adjusted according to the scoring results of the longest common subsymbol sequence to improve effectiveness.
In addition, the longest common subsymbol sequence is scored in the manner of the longest common subsequence (LCS): a correct symbol adds a point (e.g., one point), while a symbol error — one extra, missing, or substituted symbol — deducts a point (e.g., one point). Because LCS scoring considers not only the edit distance but also the correctly matched portion, longer vocabulary items are picked first, avoiding locally optimal choices.
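As a minimal sketch of this scoring rule (the exact point values are an assumption: one point per matched symbol, one point deducted per extra or missing symbol on either side), the LCS-based score can be computed as:

```python
def lcs_length(a, b):
    # Classic dynamic program for the longest common subsequence length.
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

def lcs_score(window, vocab):
    # +1 for each matched symbol; -1 for each unmatched symbol on either
    # side, which covers extra, missing, and substituted symbols.
    matched = lcs_length(window, vocab)
    errors = (len(window) - matched) + (len(vocab) - matched)
    return matched - errors
```

Under this rule the missing-character fragment 「三隻豬」 still scores 2 against the stored title 「三隻小豬」 (three matches, one missing symbol), so a longer correct title is preferred over a shorter partial match.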
In another embodiment, after finding a potential named entity, the potential named-entity recognition module 11 masks the common subsequence in the text sentence and repeats the same procedure to find other potential named entities, until no further potential named entity is found. In brief, after one potential named entity is found, if others may still exist, the common subsequence in the text sentence can be masked — for example with the symbol ★ — before the next search for potential named entities; if no further potential named entity exists, the named-entity exploration results are consolidated.
Therefore, the potential named-entity recognition module 11 integrates several named-entity (NE) recognition methods for the primary selection of potential NEs, among which the character and phonetic-symbol NE methods respectively handle missing characters and typos in the vocabulary of the text sentence. It should be noted that the module can also integrate conventional NE recognition methods to handle other situations and is not limited to the aforementioned recognizer types. In addition, the module consolidates the primary-selection results of the several NE methods — including each NE vocabulary item and its start and end positions — for the similar-sentence generation module 12 to construct similar sentences.
The similar-sentence generation module 12 combines the vocabulary items corresponding to the potential named entities without repetition to generate similar sentences and score them, so that the highest-scoring similar sentence meeting the second threshold is selected as the approximate sentence. In brief, the purpose of the module is to combine the vocabulary items of the potential named entities to produce similar sentences: non-overlapping vocabulary items are combined according to the system's predetermined sentence patterns and the order in which the potential named entities appear in the text sentence; then the longest common subsymbol sequence between each similar sentence and the original text sentence is computed and scored, and the highest-scoring similar sentence meeting the second threshold is selected as the approximate sentence, where the second threshold can likewise be adjusted according to the scoring results to improve effectiveness.
In one embodiment, the similar-sentence generation module 12 combines non-overlapping vocabulary items according to the sentence patterns supported by the system and the order of appearance of the potential named entities to generate the similar sentences, computes the longest common subsymbol sequence between each similar sentence and the text sentence, and scores it by comparing the symbols of the text sentence with those of the similar sentence, adding a point (e.g., one point) when a symbol is correct and deducting a point (e.g., one point) when a symbol is wrong, so that the highest-scoring similar sentence meeting the second threshold is taken as the approximate sentence.
The sentence-intent selection module 13 provides, according to the number of approximate sentences, the parameters for triggering the corresponding service, generates a corresponding reply message, or generates an interactive message. In brief, the purpose of the module is to evaluate the approximate sentences produced by the similar-sentence generation module 12, infer the sentence intent, and finally select the named-entity combination. Specifically, when the number of approximate sentences is zero, the system replies that there is no corresponding service; when there is exactly one approximate sentence, its intent and the named entities composing it serve as the path for triggering the service and the parameters to be delivered; when there is more than one approximate sentence, the named entities composing the approximate sentences are inserted into an option template to obtain several question options, which form a multiple-choice question asking the user's intent.
In addition, the semantic understanding system 1 for processing rich text further comprises a voice-controlled sentence-pattern construction module 14, which lists important vocabulary combinations from a plurality of training sentences in advance while ignoring unimportant words, so as to construct the pre-stored vocabulary. In brief, a sentence pattern is a combination of one or more vocabulary items, and the order of the items does not affect the uniqueness of the pattern. When constructing voice-controlled sentence patterns from natural language, the present invention uses the module 14 to frame the important vocabulary in the training sentences and ignore unimportant words, ultimately forming the vocabulary pre-stored by the system for later comparison when the potential named-entity recognition module 11 searches for potential named entities.
FIG. 2 is a schematic architecture diagram of an application embodiment of the semantic understanding system for processing rich text of the present invention. As shown, the potential named-entity recognition module 11, the similar-sentence generation module 12, and the sentence-intent selection module 13 of the system 1 are the same as in FIG. 1; this embodiment further comprises a speech-to-text module 2, a parameter delivery module 3, a reply-sentence generation module 4, a service application programming interface 5, and a text-to-speech module 6 connected to the system 1.
The semantic understanding system 1 for processing rich text is mainly used to process content-rich text and perform lightweight semantic understanding on it. It includes the potential named-entity recognition module 11 for integrating several named-entity recognition methods in the primary selection of potential named entities, the similar-sentence generation module 12 for combining the vocabulary of the potential named entities to produce similar sentences, and the sentence-intent selection module 13 for picking the approximate sentences. In actual operation, a general voice-controlled service produces a text sentence through the external speech-to-text module 2 as the input of the system 1; the potential named-entity recognition module 11 distributes the text sentence to the character named-entity recognizer 111, the phonetic-symbol named-entity recognizer 112, or other NE recognition modules and consolidates the several recognition results for the similar-sentence generation module 12, which combines the potential named entities according to the sentence patterns and produces similar sentences for the sentence-intent selection module 13 to pick the approximate sentences; after the sentence intent is inferred, the named-entity combination is selected.
When there is exactly one approximate sentence, the sentence-intent selection module 13 generates the parameters for triggering the corresponding service through the back-end parameter delivery module 3; when there are zero or multiple approximate sentences, the reply message or the interactive message is produced through the back-end reply-sentence generation module 4. Specifically, after the system 1 has selected the intent and NE combination, the external parameter delivery module 3 can produce parameters to trigger the back-end service, i.e., the corresponding service is executed by the service application programming interface (API) 5; alternatively, the reply-sentence generation module 4 produces a reply sentence according to the selected NE combination, which is then voiced through the text-to-speech module 6 to interact with the user.
It should be noted that in this embodiment the speech-to-text module 2, the parameter delivery module 3, the reply-sentence generation module 4, and the text-to-speech module 6 are external to the semantic understanding system 1; that is, depending on their purposes, these modules run on different devices, servers, or systems, and after the system 1 has parsed the text sentence and inferred its intent, the corresponding system performs the subsequent processing. In other application embodiments, however, the above modules may also be integrated into the system 1 to form one integrated system service spanning speech conversion, language analysis, and the final feedback to the customer.
Before the semantic understanding system 1 for processing rich text operates, the pre-stored vocabulary is constructed in advance, which can be done by the voice-controlled sentence-pattern construction module within the system (as shown in FIG. 1). A voice-controlled sentence is composed of several vocabulary items whose order usually does not affect the meaning — for example, 「播放故事三隻小豬」 and 「播放三隻小豬的故事」 mean the same — and voice-controlled sentences have always been terse, with users even omitting the verb and directly naming the program, e.g., 三隻小豬. The module therefore only needs to frame the important vocabulary combinations and ignore unimportant words, finally forming the pre-stored vocabulary for the potential named-entity recognition module 11 to compare against when searching for potential named entities. Table 1 below illustrates how supported sentence patterns, intents, and corresponding parameters are derived from the important vocabulary combinations.
The following further describes the flows in which the potential named-entity recognition module performs the primary selection of potential named entities, the similar-sentence generation module generates similar sentences, and the sentence-intent selection module finds the approximate sentences.
FIG. 3 is a flowchart of the primary selection of potential named entities by the potential named-entity recognition module in the present invention, illustrating how the module integrates several named-entity (NE) recognition methods in the primary selection of potential NEs.
In flow 301, the Top K vocabulary items are nominated according to the symbol matching rate between the text sentence and the vocabulary. This flow computes the symbol matching rate between the text sentence converted from the input speech and the pre-stored vocabulary. For the character named-entity recognizer the symbol unit is one Chinese character or one English word; for the phonetic-symbol named-entity recognizer it is the tone-less phonetic symbols of one Chinese character or the stem of one English word. For either recognizer, the matching-rate formula differs from the conventional BM25 (Best Match 25) formula in that the inverse document frequency is removed, so that program titles containing common words are not ranked last; the longest common subsymbol sequence between sentence and vocabulary can then be computed over a smaller range K.
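The exact matching-rate formula is not reproduced in this text; the sketch below is an assumption that keeps the stated property — BM25-style symbol matching with the inverse-document-frequency factor removed — scoring a vocabulary item by how many of its symbols occur in the sentence:

```python
from collections import Counter

def matching_rate(sentence_syms, vocab_syms):
    # Count how many of the vocabulary item's symbols also occur in the
    # sentence (with multiplicity), normalized by sentence length. No IDF
    # factor is applied, so titles made of common words are not demoted.
    sent = Counter(sentence_syms)
    hits = sum(min(c, sent[s]) for s, c in Counter(vocab_syms).items())
    return hits / len(sentence_syms)

def top_k(sentence_syms, vocabulary, k=2):
    # Nominate the Top K pre-stored vocabulary items by matching rate.
    ranked = sorted(vocabulary,
                    key=lambda v: matching_rate(sentence_syms, v),
                    reverse=True)
    return ranked[:k]
```

For the sentence 「我要聽三隻豬」 this nominates 「三隻小豬」 and 「要聽」 as the Top 2, in line with the worked example given later in the text.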
In flow 302, for each vocabulary item, the longest common subsymbol sequence (LCS) between the vocabulary item and the text sentence is found in a sliding-window manner. Specifically, the length of the longest common subsymbol sequence is at most the vocabulary length, and the scoring adds a point (e.g., one point) for each correct symbol and deducts a point (e.g., one point) for each symbol error (one extra, missing, or substituted symbol). Because LCS scoring considers not only the edit distance but also the correctly matched portion, longer vocabulary items are picked first, avoiding locally optimal choices.
In flow 303, from the K vocabulary items, the one with the highest score that is ≧ α is selected as the potential NE. That is, after the vocabulary is scored, the item with the highest score meeting the preset condition α is selected as the potential named entity, where the threshold α can be adjusted according to the scoring results of the longest common subsymbol sequence to improve effectiveness.
Next, it is determined whether candidate named entities (NEs) remain; if so, the process proceeds to flow 304, and if not, to flow 305.
In flow 304, the common subsequence (LCS) in the text sentence is masked with a symbol, and other potential NEs are searched for. When other candidate NEs exist, a symbol (e.g., ★) can be used to mask the common subsequence in the text sentence before searching for other potential NEs in it; that is, after masking, the process returns to flow 301 and repeats the preceding flows.
In flow 305, the NE exploration results are consolidated. When no other candidate NE exists, the NE exploration results — including the start and end positions of the vocabulary items in the sentence — are consolidated.
It should be noted that, besides the character NE recognition method performed by the above character named-entity (NE) recognizer and the phonetic-symbol NE recognition method performed by the phonetic-symbol named-entity (NE) recognizer, conventional NE recognition methods such as part-of-speech patterns (POS Pattern), hidden Markov models (HMM), conditional random fields (CRF), recurrent neural networks (RNN), or convolutional neural networks (CNN) can also be added to the potential named-entity recognition module to cover different types of text sentences.
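The iterative search of flows 301-305 can be combined into one loop roughly as follows; the threshold value, the exact scoring rule, and the tie-breaking order are illustrative assumptions rather than the patent's exact procedure:

```python
def lcs_positions(sentence, vocab):
    # DP for the longest common subsequence, returning the sentence
    # indices of the matched symbols.
    m, n = len(sentence), len(vocab)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if sentence[i - 1] == vocab[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    idx, i, j = [], m, n
    while i and j:
        if sentence[i - 1] == vocab[j - 1]:
            idx.append(i - 1); i -= 1; j -= 1
        elif dp[i - 1][j] >= dp[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return idx[::-1]

def find_potential_entities(sentence, vocabulary, alpha=2, mask="★"):
    # Pick the best-scoring vocabulary item, record it with its start and
    # end positions, mask its matched symbols with ★, and search again
    # until no candidate reaches the threshold alpha.
    chars, found = list(sentence), []
    while True:
        best = None
        for vocab in vocabulary:
            idx = lcs_positions(chars, vocab)
            score = len(idx) - (len(vocab) - len(idx))
            if idx and (best is None or score > best[0]):
                best = (score, vocab, idx)
        if best is None or best[0] < alpha:
            return found
        _, vocab, idx = best
        found.append((vocab, idx[0], idx[-1]))
        for i in idx:
            chars[i] = mask
```

On the missing-character sentence 「我要聽三隻豬」, this first extracts the title 「三隻小豬」 at positions 3-5, masks it, then extracts the verb 「要聽」 at positions 1-2.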
FIG. 4 is a flowchart of the similar-sentence generation module generating similar sentences and the sentence-intent selection module finding approximate sentences in the present invention, illustrating how the similar-sentence generation module combines potential NE vocabulary to produce similar sentences and how the sentence-intent selection module picks the approximate sentences.
In flow 401, non-overlapping vocabulary items are combined into similar sentences according to the sentence patterns supported by the system and the order in which the potential NEs appear in the text sentence. That is, the similar-sentence generation module combines non-overlapping vocabulary items according to the supported sentence patterns and the order of appearance of the potential NEs to generate the similar sentences.
In flow 402, the longest common subsymbol sequence between each similar sentence and the original sentence is computed. That is, each similar sentence is scored by symbol comparison with the original text sentence, using the LCS scoring described above: a correct symbol adds a point (e.g., one point), while a symbol error such as one extra, missing, or substituted symbol deducts a point (e.g., one point).
In flow 403, the similar sentence with the highest score that is ≧ β is selected as the approximate sentence. That is, after each similar sentence is scored in flow 402, the one with the highest score meeting the preset condition β is selected as the approximate sentence, where the threshold β can be adjusted according to the scoring results of the longest common subsymbol sequence to improve effectiveness.
Afterwards, the result depends on the number of approximate sentences: when the number is one, the process proceeds to flow 404; when zero, to flow 405; and when there are multiple (≧ 2), to flow 406.
In flow 404, the intent of the approximate sentence and the NEs composing it serve as the path for triggering the service and the parameters to be delivered. That is, when there is only one approximate sentence, the sentence-intent selection module produces, from the intent of the approximate sentence and the named entities composing it, the service-triggering path and the delivered parameters. In addition, the named entities composing the approximate sentence can be inserted into a reply-sentence template to obtain an introductory sentence.
In flow 405, the system replies that it did not understand the user's sentence. That is, when no approximate sentence exists, the sentence-intent selection module produces the corresponding feedback telling the user the sentence was not understood.
In flow 406, the NEs composing the approximate sentences are inserted into an option template to obtain several question options, which form a multiple-choice question asking the user's intent. That is, when there is more than one approximate sentence (two or more), the sentence-intent selection module inserts the named entities composing the approximate sentences into an option template, obtaining several question options that form a multiple-choice question to ask the user's intent.
FIG. 5 is a step diagram of the semantic understanding method for processing rich text of the present invention. The method can be executed on computer equipment such as a personal computer, a server, or a cloud device, and comprises the following steps.
In step S501, the potential named-entity recognition module receives the text sentence converted from the user's speech input and uses named-entity recognizers of different types to compute and score the longest common subsymbol sequence between the text sentence and the pre-stored vocabulary, so as to select the highest-scoring vocabulary item meeting the predetermined first threshold as a potential named entity. In this step, the module analyzes the longest common subsymbol sequence between the text sentence and the pre-stored vocabulary with the different recognizers, scores the symbol positions, and selects the highest-scoring vocabulary item meeting the predetermined first threshold as a potential named entity.
In one embodiment, the named-entity recognizer may be a character named-entity recognizer or a phonetic-symbol named-entity recognizer. The character named-entity recognizer uses one Chinese character or one English word as the symbol unit and computes the matching rate between the symbols of the text sentence and the symbols of the vocabulary item to obtain the longest common subsymbol sequence, while the phonetic-symbol named-entity recognizer uses the tone-less phonetic symbols of one Chinese character, or the stem of one English word, as the symbol unit for the same matching-rate computation. In other words, once the text sentence and the pre-stored vocabulary have been converted into symbols, the primary selection of potential NEs can proceed under the same evaluation framework.
In another embodiment, the potential named-entity recognition module finds the longest common subsymbol sequence between the text sentence and the vocabulary item in a sliding-window manner, comparing the symbols of the text sentence with those of the vocabulary item, adding a point (e.g., one point) when a symbol is correct and deducting a point (e.g., one point) when a symbol is wrong, and finally selects the highest-scoring vocabulary item meeting the predetermined first threshold as the potential named entity.
In addition, after finding a potential named entity, if other candidate NEs are found to remain, the potential named-entity recognition module masks the common subsequence in the text sentence and repeats the same procedure to find the other potential named entities, until no further potential named entity is found.
In step S502, the similar-sentence generation module combines the vocabulary items corresponding to the potential named entities without repetition to generate similar sentences and score them, selecting the highest-scoring similar sentence meeting the second threshold as the approximate sentence. In this step, the module combines non-overlapping vocabulary items according to the system's predetermined sentence patterns and the order in which the potential named entities appear in the text sentence to generate the similar sentences; it then computes the longest common subsymbol sequence between each similar sentence and the original text sentence and scores each one by comparing the symbols of the text sentence with those of the similar sentence — adding a point (e.g., one point) when a symbol is correct and deducting a point (e.g., one point) when a symbol is wrong — and selects the highest-scoring similar sentence meeting the second threshold as the approximate sentence.
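A sketch of this generation-and-selection step follows; the slot/pattern representation is a hypothetical encoding, and the scoring assumes one point per matched symbol and one point deducted per extra or missing symbol:

```python
from itertools import product

def lcs_length(a, b):
    # Longest common subsequence length by dynamic programming.
    m, n = len(a), len(b)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dp[i][j] = (dp[i - 1][j - 1] + 1 if a[i - 1] == b[j - 1]
                        else max(dp[i - 1][j], dp[i][j - 1]))
    return dp[m][n]

def generate_similar_sentences(patterns, slot_candidates):
    # patterns: supported slot sequences, e.g. [("Listen", "Story")];
    # slot_candidates: potential NEs found for each slot.
    sentences = []
    for pattern in patterns:
        pools = [slot_candidates.get(slot, []) for slot in pattern]
        for combo in product(*pools):
            sentences.append("".join(combo))
    return sentences

def pick_approximate(sentences, original, beta=0):
    # Score each similar sentence against the original sentence and keep
    # the best one if it clears the threshold beta.
    def score(cand):
        matched = lcs_length(cand, original)
        return matched - (len(cand) - matched) - (len(original) - matched)
    best = max(sentences, key=score, default=None)
    return best if best is not None and score(best) >= beta else None
```

For the missing-character input 「我要聽三隻豬」, the single combination 「要聽」 + 「三隻小豬」 yields the similar sentence 「要聽三隻小豬」, which survives the selection as the approximate sentence.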
In step S503, the sentence-intent selection module provides, according to the number of approximate sentences, the parameters for triggering the corresponding service, generates a corresponding reply message, or generates an interactive message. In this step, based on the number of approximate sentences produced in the previous step, the module gives the downstream system or platform the corresponding feedback: when there is exactly one approximate sentence, the parameters triggering the corresponding service are generated through the back-end parameter delivery module; when there are zero or multiple approximate sentences, the reply message or the interactive message is produced through the back-end reply-sentence generation module.
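The three-way dispatch of this step can be sketched as follows; the action names and data shapes are illustrative, not taken from the patent:

```python
def select_intent(approx_sentences):
    # approx_sentences: list of (intent, named_entities) pairs that
    # survived the approximate-sentence selection.
    if len(approx_sentences) == 0:
        # No corresponding service: reply that the sentence was not understood.
        return {"action": "reply", "message": "no matching service"}
    if len(approx_sentences) == 1:
        # Unique match: the intent is the trigger path, the entities are
        # the parameters to deliver to the back-end service.
        intent, entities = approx_sentences[0]
        return {"action": "trigger", "intent": intent, "params": entities}
    # Several matches: fill an option template with each entity combination
    # and ask the user a multiple-choice question.
    options = [entities for _, entities in approx_sentences]
    return {"action": "ask", "options": options}
```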
In other embodiments, before the system performs semantic understanding, the voice-controlled sentence-pattern construction module extracts important vocabulary combinations from a number of training sentences in advance, ignoring unimportant words, so as to build the pre-stored vocabulary. In other words, the voice-controlled sentence-pattern construction module finds the important vocabulary combinations in the training sentences, which become the pre-stored vocabulary used by the system for analysis.
A specific embodiment of the present invention is described below. In this embodiment, natural language is used to construct the sentence patterns supported by the voice-controlled service, and two named-entity (NE) recognition methods are used: a potential-NE preliminary-selection mechanism nominates various potential NEs, which are then combined according to the sentence patterns to generate similar sentences. Finally, an approximate-sentence selection formula establishes the sentence intent and selects the NE combination for service-trigger parameter fulfillment or reply-sentence generation. The present invention fuses the two NE recognition results, and the resulting synergy covers both missing-character and typo cases, so it can effectively improve NE recognition accuracy.
First, let the voice-controlled service support the sentence patterns in Table 2 below, where {Story} contains stories such as 「三隻小豬」 ("Three Little Pigs"), {Song} contains songs such as 「蜘蛛」 ("Spider"), and {Listen} contains verbs such as 「要聽」 ("want to listen").
When the missing-character sentence 「我要聽三隻豬」 ("I want to listen to three pigs") is input into the system, the character named-entity recognizer is used to find potential NEs in the sentence. First, in flow 301, the Top K vocabulary entries are nominated according to their matching rate with the sentence; in this embodiment the Top K entries are, in order, 「三隻小豬」 and 「要聽」, i.e., Top 2. Next, in flow 302, for each vocabulary entry a sliding window (shifted one character at a time) is used to find the longest common sub-symbol sequence (LCS) between the entry and the sentence, where the LCS length ≦ the vocabulary length.
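As a rough sketch of flows 301 and 302, the function below slides windows of at most the vocabulary's length over the sentence and scores each fragment by its longest-common-subsequence (LCS) length. The score formula (matching characters minus the entry's unmatched characters) is one plausible reading of the worked numbers that follow, not a formula quoted from the patent.

```python
def lcs_len(a, b):
    # classic dynamic-programming longest common subsequence length
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ca in enumerate(a, 1):
        for j, cb in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if ca == cb else max(dp[i-1][j], dp[i][j-1])
    return dp[len(a)][len(b)]

def best_window(sentence, vocab):
    """Slide windows of length <= len(vocab) one character at a time and
    return (best score, (start, end)) of the best-matching fragment."""
    best_score, best_span = float("-inf"), None
    for width in range(1, len(vocab) + 1):
        for start in range(len(sentence) - width + 1):
            match = lcs_len(sentence[start:start + width], vocab)
            score = match - (len(vocab) - match)  # same chars minus differing chars
            if score > best_score:
                best_score, best_span = score, (start, start + width)
    return best_score, best_span
```

Under this reading, matching 「三隻小豬」 against 「我要聽三隻豬」 yields score 2 at span (3, 6), agreeing with the flow-302 result described below.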
The character-LCS calculation for the vocabulary entry 「三隻小豬」 proceeds as follows. 「我要聽三隻豬」 is the missing-character sentence provided initially, and it is compared character by character with 「三隻小豬」; the characters of 「我要聽三隻豬」 are numbered 0 to 5 in order.
Comparing 「三隻小豬」 with the sentence fragment 「我要聽三」, the character LCS score = 1 - 3 = -2, where 1 means one character is the same and 3 means three characters are different, so the character LCS score is their difference, -2.
Comparing 「三隻小豬」 with the sentence fragment 「要聽三隻」, the character LCS score = 2 - 2 = 0, where the first 2 means two characters are the same and the second 2 means two characters are different, so the character LCS score is their difference, 0.
Comparing 「三隻小豬」 with the sentence fragment 「聽三隻豬」, the character LCS score = 3 - 2 = 1, where 3 means three characters are the same and 2 means two characters are different (a character substitution), so the character LCS score is their difference, 1.
Comparing 「三隻小豬」 with the sentence fragment 「三隻豬」, the character LCS score = 3 - 1 = 2, where 3 means three characters are the same and 1 means one character is different (a missing character), so the character LCS score is their difference, 2.
In summary, after the character comparison between 「我要聽三隻豬」 and the vocabulary entry 「三隻小豬」, the final flow-302 result is LCS position (3, 6) with score 2, i.e., starting at index 3 (where the character 「三」 is located) and excluding index 6 onward.
The character-LCS calculation for the other vocabulary entry 「要聽」 is as follows. Likewise, 「我要聽三隻豬」 is the missing-character sentence provided initially, compared character by character with 「要聽」; the characters of 「我要聽三隻豬」 are numbered 0 to 5 in order.
Comparing 「要聽」 with the sentence fragment 「我要」, the character LCS score = 1 - 1 = 0, where the first 1 means one character is the same and the second 1 means one character is different, so the character LCS score is their difference, 0.
Comparing 「要聽」 with the sentence fragment 「要聽」, the character LCS score = 2 - 0 = 2, where 2 means two characters are the same and 0 means no characters are different, so the character LCS score is their difference, 2.
Comparing 「要聽」 with the sentence fragment 「聽三」, the character LCS score = 2 - 1 = 1, where 2 means two characters are the same and 1 means one character is different, so the character LCS score is their difference, 1.
Therefore, after the character comparison between 「我要聽三隻豬」 and the vocabulary entry 「要聽」, the final flow-302 result is LCS position (1, 3) with score 2, i.e., starting at index 1 (where the character 「要」 is located) and excluding index 3 onward.
Next, in flow 303, the entry with the highest score that is also ≧ 1 is selected from the aforementioned Top 2 vocabulary entries as a potential NE; the result is the potential vocabulary entry 「三隻小豬」, whose LCS is 「三隻豬」 with score 2. Afterwards, the LCS in the sentence is masked with the symbol ★, and other potential NEs are sought in 「我要聽★★★」.
Flow 301 is executed again, nominating the Top K vocabulary entries according to their matching rate with the sentence; the resulting Top K entry is 「要聽」. Then flow 302 is executed: for each vocabulary entry, the sliding window finds the LCS between the entry and each sentence fragment, where the fragment length ≦ the vocabulary length.
The character-LCS calculation for the vocabulary entry 「要聽」 is as follows. 「我要聽★★★」 is the masked sentence, compared character by character with 「要聽」; the characters of 「我要聽★★★」 are numbered 0 to 5 in order.
Comparing 「要聽」 with the fragment 「我要」, the character LCS score = 1 - 1 = 0; comparing 「要聽」 with the fragment 「要聽」, the character LCS score = 2 - 0 = 2; comparing 「要聽」 with the fragment 「聽★」, the character LCS score = 2 - 1 = 1. Therefore, after the character comparison between 「我要聽★★★」 and 「要聽」, the flow-302 LCS is 「要聽」 with score 2.
Flow 303 is executed again, selecting from the Top 1 vocabulary entry the one with the highest score that is ≧ 1 as a potential NE; the result is the potential vocabulary entry 「要聽」, whose LCS is 「要聽」 with score 2. Afterwards, flow 304 masks the LCS in the sentence with the symbol ★ and searches 「我★★★★★」 for other potential NEs. Since none exist, the process enters flow 305, and the results of the character named-entity recognizer are compiled as in Table 3 below.
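The nominate-select-mask cycle of flows 301 through 305 can be sketched as a greedy loop. This is a sketch under assumptions: the lexicon, the score-≧-1 threshold from the text, and the LCS-based score (matches minus the entry's unmatched characters) are taken from the worked example above, while everything else is illustrative.

```python
def lcs_len(a, b):
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ca in enumerate(a, 1):
        for j, cb in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if ca == cb else max(dp[i-1][j], dp[i][j-1])
    return dp[len(a)][len(b)]

def extract_entities(sentence, lexicon):
    """Greedy flows 301-305: pick the best-scoring lexicon match (score >= 1),
    mask its span with ★, and repeat until nothing qualifies."""
    found = []
    while True:
        best = None  # (score, start, end, vocab)
        for vocab in lexicon:
            for width in range(1, len(vocab) + 1):
                for start in range(len(sentence) - width + 1):
                    match = lcs_len(sentence[start:start + width], vocab)
                    score = match - (len(vocab) - match)
                    if best is None or score > best[0]:
                        best = (score, start, start + width, vocab)
        if best is None or best[0] < 1:          # flow 305: nothing left to find
            return found, sentence
        score, start, end, vocab = best
        found.append((vocab, score))
        sentence = sentence[:start] + "★" * (end - start) + sentence[end:]
```

On the running example, the loop first extracts 「三隻小豬」 (score 2), masks the sentence to 「我要聽★★★」, then extracts 「要聽」 (score 2), and stops at 「我★★★★★」, matching Table 3.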
In addition, the phonetic-note named-entity recognizer uses phonetic notes to find potential NEs in the sentence, where one note unit is the zhuyin (注音) of one Chinese character without its tone, or the stem of one English word. Thus, in this embodiment, the sentence 「我要聽三隻豬」 and the vocabulary entries are converted from text to zhuyin and from zhuyin to notes as in Table 4 below.
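A minimal sketch of the note conversion: each character is reduced to a toneless syllable code so that homophones collide. The table below is an assumption that uses pinyin strings as stand-in codes (the patent derives note codes such as A to G from zhuyin); with the same LCS comparison, 「蜘蛛」 now matches the fragment 「隻豬」 that character matching missed.

```python
# Stand-in phonetic table (assumption): each character -> toneless syllable.
# The patent converts text -> zhuyin -> note codes; pinyin strings play the
# same role here, so 隻/蜘 and 豬/蛛 deliberately share codes.
PHONES = {"我": "wo", "要": "yao", "聽": "ting", "三": "san",
          "隻": "zhi", "小": "xiao", "豬": "zhu", "蜘": "zhi", "蛛": "zhu"}

def to_notes(text):
    # one note per character, tone discarded
    return [PHONES[ch] for ch in text]

def lcs_len(a, b):
    # works on strings (characters) and on lists of notes alike
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, ca in enumerate(a, 1):
        for j, cb in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if ca == cb else max(dp[i-1][j], dp[i][j-1])
    return dp[len(a)][len(b)]
```

Character matching gives 「蜘蛛」 vs 「隻豬」 an LCS of 0, while note matching gives 2, which is exactly why fusing the two recognizers covers homophone typos.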
In flow 301, the Top K vocabulary entries are nominated according to their matching rate with the sentence; the result is, in order, 「三隻小豬」, 「蜘蛛」, and 「要聽」. Next, in flow 302, for each vocabulary entry the sliding window finds the LCS between the entry and the sentence, where the LCS length ≦ the vocabulary length. The flow-302 comparisons proceed as follows.
A B C D E F are the sentence notes. The entry 「三隻小豬」, with note code D E G F, compared with the sentence fragment 「三隻豬」 gives note-LCS position (3, 6) and score 3 - 1 = 2; the entry 「蜘蛛」, with note code E F, compared with the fragment 「隻豬」 gives note-LCS position (4, 6) and score 2 - 0 = 2; the entry 「要聽」, with note code B C, compared with the fragment 「要聽」 gives note-LCS position (1, 3) and score 2 - 0 = 2.
Next, in flow 303, the entry with the highest score that is also ≧ 1 is selected from the three vocabulary-note candidates as a potential NE; the resulting potential vocabulary entry is 「三隻小豬」, whose LCS is 「三隻豬」 with score 2. Afterwards, in flow 304, the LCS in the sentence is masked with the symbol ★, and other potential NEs are sought in 「我要聽★★★」.
Flow 301 is applied again, nominating the Top K vocabulary entries by their matching rate with the sentence; this time the Top K entry is 「要聽」. Then, in flow 302, for each vocabulary entry the sliding window finds the LCS between the entry and the sentence, where the LCS length ≦ the vocabulary length.
Likewise, A B C D E F are the sentence notes; the entry 「要聽」, with note code B C, compared with the fragment 「要聽」 gives note-LCS position (1, 3) and score 2 - 0 = 2.
Flow 303 is executed again; the resulting potential vocabulary entry is 「要聽」, whose LCS is 「要聽」 with score 2. Afterwards, flow 304 masks the LCS in the sentence with the symbol ★ and searches 「我★★★★★」 for other potential NEs. Since none exist, the process enters flow 305, and the results of the phonetic-note named-entity recognizer are compiled as in Table 5 below.
After the character and phonetic-note named-entity recognizers have finished, the mining results (Tables 3 and 5) are consolidated, finally yielding the results as in Table 5, which are then used for similar-sentence generation and approximate-sentence selection. In flow 401, non-overlapping vocabulary entries are combined according to the sentence patterns supported by the system and the order in which the potential NEs appear, generating similar sentences. In flow 402, the longest common sub-symbol sequence between each similar sentence and the original sentence is computed; the scoring adds one point for each correct symbol and deducts one point for each wrong symbol (one extra, missing, or substituted symbol).
For example, with the original sentence's characters 「我要聽三隻豬」 and notes A B C D E F, the scoring of each similar sentence proceeds as in Table 6 below.
Next, in flow 403, the approximate sentence with the highest score that is ≧ 0 is chosen from all similar sentences; the similar sentence 「要聽三隻小豬」 has the highest score and is selected as the only approximate sentence. From the sentence-pattern comparison in Table 7 below, the sentence intent is listenStory. After the vocabulary combination is selected, flow 404 fills the NE vocabulary into the corresponding fulfillment parameters and passes them, together with the intent code, to the back-end RESTful API to trigger the service, for example: /listenStory?Book=三隻小豬. In addition, substituting the selected NE vocabulary into the opening-introduction template yields the opening sentence that introduces the program name before streaming playback: 「為您播放故事三隻小豬」 ("Playing the story Three Little Pigs for you").
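The fulfillment step of flow 404 can be sketched as follows; the parameter name `Book`, the path shape, and the intro wording follow the /listenStory example in the text, while the template table itself is an assumption.

```python
from urllib.parse import urlencode

# Assumed per-intent fulfillment templates, modeled on the /listenStory example.
TEMPLATES = {
    "listenStory": {"param": "Book", "intro": "為您播放故事{ne}"},
    "listenSong":  {"param": "Song", "intro": "為您播放歌曲{ne}"},
}

def fulfill(intent, ne):
    """Build the RESTful trigger path and the opening-introduction sentence."""
    t = TEMPLATES[intent]
    # urlencode percent-encodes the Chinese NE value for the query string
    path = f"/{intent}?{urlencode({t['param']: ne})}"
    return path, t["intro"].format(ne=ne)
```

For 「三隻小豬」 this produces a /listenStory?Book=... trigger path and the intro sentence 「為您播放故事三隻小豬」.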
If 「三隻小豬」 were both a {Story} and a {Song}, the sentence 「我要聽三隻豬」 would have two approximate sentences with the same score, and the intent of 「要聽三隻小豬」 could be either listenStory or listenSong; in that case the user must be asked to confirm. In flow 406, the NEs composing the approximate sentences are inserted into the option template to obtain several question options, forming a multiple-choice question that asks the user's intent. The question options are shown in Table 8 below, and the reply is the multiple-choice question 「您是要聽故事三隻小豬還是歌曲三隻小豬」 ("Do you want to listen to the story Three Little Pigs or the song Three Little Pigs?") to clarify the user's intent.
In addition, the present invention also discloses a computer-readable medium, applied in a computing device or computer having a processor (e.g., CPU, GPU) and/or memory, that stores instructions; the computing device or computer executes the computer-readable medium through the processor and/or memory, so that the above-described method and its steps are carried out when the computer-readable medium is executed.
The modules, units, devices, etc. of the present invention include a microprocessor and memory, while the algorithms, data, programs, etc. are stored in the memory or in a chip; the microprocessor can load the data, algorithms, or programs from memory for processing such as data analysis or computation, which is not elaborated here. In other words, the rich-text semantic understanding system of the present invention can run on electronic equipment, such as an ordinary computer, tablet, or server, performing analysis and computation after receiving a text sentence. The procedures carried out by the system can therefore be designed in software and built on electronic equipment having a processor, memory, and other components, so as to run on various kinds of electronic equipment. Alternatively, each module or unit of the system may be composed of independent components, such as a calculator, memory, storage, or firmware with a processing unit, any of which can serve as a component implementing the present invention; related components such as the named-entity recognizers may likewise be realized as software programs, hardware, or firmware.
In summary, the rich-text semantic understanding system and method of the present invention convert content-rich text into lightweight semantics for easier analysis and understanding. The potential named-entity recognition module integrates multiple NE recognition methods to find potential NE vocabulary; the similar-sentence generation module combines non-overlapping vocabulary into similar sentences according to the sentence patterns supported by the system and the order in which potential NEs appear in the sentence; and the sentence-intent selection module picks, from several similar sentences, the one closest to the original sentence, infers the sentence intent, and selects the NE combination. In other words, users tend to use voice-controlled services without much thought, overly long program names lead to omitted words, and homophonic program names are often transcribed with the wrong characters, all of which lower the voice-recognition rate. The present invention therefore proposes a framework that constructs voice-controlled sentence patterns, evaluates several Named Entity (NE) recognition results, and infers sentence intent, exploiting their synergy to handle missing characters and typos and thereby improve the quality and accuracy of voice-controlled services. The present invention thus has the following effects.
First, the voice-controlled sentence-pattern construction module builds the framework and distills the important vocabulary while ignoring unimportant words, so neither a large number of example sentences nor large computing resources are needed for training.
Second, the Named Entity (NE) selection framework fuses multiple NE recognition results, and the resulting synergy improves the recognition rate for vocabulary with missing or wrong characters.
The above embodiments are merely illustrative and are not intended to limit the present invention. Anyone skilled in the art may modify and alter the above embodiments without departing from the spirit and scope of the present invention. Therefore, the scope of protection of the present invention is defined by the appended claims, and anything that does not affect the effects and purposes of the present invention should be covered by the disclosed technical content.
1: semantic understanding system for processing rich text
11: potential named-entity recognition module
12: similar-sentence generation module
13: sentence-intent selection module
14: voice-controlled sentence-pattern construction module
Claims (11)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW110146063A TWI803093B (en) | 2021-12-09 | 2021-12-09 | Semantic understanding system for rich-text, method and computer readable medium thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
TWI803093B true TWI803093B (en) | 2023-05-21 |
TW202324381A TW202324381A (en) | 2023-06-16 |
Family
ID=87424481
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20140324572A1 (en) * | 2005-09-14 | 2014-10-30 | Millennial Media, Inc. | System For Targeting Advertising Content To A Plurality Of Mobile Communication Facilities |
CN105260360A (en) * | 2015-10-27 | 2016-01-20 | 小米科技有限责任公司 | Named entity identification method and device |
TW201835784A (en) * | 2016-12-30 | 2018-10-01 | 美商英特爾公司 | The internet of things |
CN110134931A (en) * | 2019-05-14 | 2019-08-16 | 北京字节跳动网络技术有限公司 | Media title generation method, device, electronic equipment and readable medium |