TW449734B - Keyword spotting method for Mandarin speech without using filler models
- Publication number
- TW449734B
- Authority
- TW
- Taiwan
- Prior art keywords
- model
- chinese
- scope
- patent application
- keyword extraction
- Prior art date
Landscapes
- Document Processing Apparatus (AREA)
Abstract
Description
V. Description of the invention

[Field of the invention]

The present invention relates to the technical field of speech recognition, and in particular to a Chinese keyword spotting method that does not use filler models.

[Background of the invention]

A typical speech recognition system can recognize only a limited vocabulary, and the speaker's input must exactly match one of the words in that vocabulary; if any other speech is mixed in, recognition errors are likely. For example, in a voice-operated train ticket booking system whose recognizable vocabulary consists of station names, a user who is asked for the destination must answer exactly "Taipei" (台北) and cannot say "I want to go to Taipei" (我要到台北); otherwise no correct recognition result is obtained.

Keyword spotting was developed to remedy this shortcoming. The user may speak freely, and the speech recognition system automatically extracts the likely keywords, which removes many inconvenient restrictions on speech recognition. In the conventional approach, shown in Figure 5, a keyword model (51) and a filler model (52) are first used to build a speech model (53) with the connection structure shown in the figure. The Viterbi algorithm then matches the speech signal (55) against the possible connection paths of the speech model (53), producing the matching path diagram shown in Figure 6. Once the maximum likelihood (ML) matching path (61) is found, it indicates which segments of the signal are filler speech and which segments are the pronunciation of which keyword. Each extracted keyword segment then undergoes verification, which decides whether to accept or reject it. The accepted segments, together with their recognition results, are the extracted keywords.
However, the conventional keyword spotting method described above requires building a filler model (also called a garbage model) (52). European patent EP0800158 A1 971008, "Word spotting", points out that to obtain good spotting performance the filler model must be built from the filler words that users commonly produce in the application. With regard to Chinese speech recognition, US patent USP5649057, "Speech recognition employing key word modeling and non-key word modeling", describes the same phenomenon.

Because the range of filler words varies with the task domain and with users' speaking habits, obtaining good spotting performance requires building the model from the filler vocabulary that actually occurs frequently in use. In other words, the filler model is task dependent and must be adjusted for each application, and every adjustment involves rebuilding the filler model. The drawback of the conventional method is therefore that whenever the application domain changes, the filler words users are likely to produce in that application must be collected and a suitable filler model built from them, and collecting this vocabulary is an extremely tedious process. The conventional keyword spotting method therefore needs improvement.

In view of this, the inventors sought a Chinese keyword spotting method without filler models that solves the above problems, and after repeated research and experiments completed the present invention.

[Summary of the invention]

The object of the present invention is to provide a Chinese keyword spotting method that extracts the keywords in a speech signal without requiring a filler model to be built for the application domain.

To achieve this object, the proposed method first exploits the initial-followed-by-final structure of Chinese syllables to find the possible syllable endpoints and syllable segments in the input speech signal, and constructs a syllable connection graph formed by connecting a plurality of syllable segments. Keywords are then recognized from the syllable connection graph according to their length in syllables, so that Chinese keywords can be extracted correctly without a filler model.

Because the design of the present invention is novel, industrially applicable, and effective, a patent application is filed in accordance with the law.

To help the examiners further understand the structure, features, and purpose of the present invention, the drawings and a detailed description of preferred embodiments follow.

[Brief description of the drawings]

Figure 1: the processing flow of the Chinese keyword spotting method without filler models of the present invention.
Figure 2: the generic initial and final models used by the method.
Figure 3: the maximum likelihood values obtained by matching the speech signal under hypotheses of different syllable counts.
Figure 4: the waveform of the speech signal "從新竹到桃園" ("from Hsinchu to Taoyuan") and the syllable connection graph formed by the method.
Figure 5: a conventional speech model built from a keyword model and a filler model.
Figure 6: the matching path diagram obtained by matching a speech signal against a speech model with the Viterbi algorithm.

[Description of reference numerals]

(21) initial model
(22) final model
(31) (42) (43) (44) syllable endpoint groups
(41) (55) speech signal waveforms
(45) syllable connection graph
(51) keyword model
(52) filler model
(53) speech model
(61) matching path

[Detailed description of the preferred embodiments]

In a preferred embodiment, shown in Figure 1, the method comprises two processing stages (S1, S2): finding the syllable segments of the speech signal, and keyword search. Step S1 finds the possible positions of the syllable endpoints in the speech signal to be recognized and builds the syllable connection graph from them. Step S2 searches the syllable connection graph for keywords, so that Chinese keywords are extracted correctly without a filler model.
Step S1 relies on the fact that a Chinese syllable is produced as an initial, a sound made when the airflow is obstructed by the articulators (for example, the "d" of 電, diàn), followed by a final, a sound made by vibration of the vocal cords (for example, "ian"). Given this regular initial-then-final structure, a speech model and a signal matching algorithm can be used to match initial and final segments in the speech signal and thereby locate syllable segments.

This stage therefore first builds a generic initial/final model, shown in Figure 2, in which an initial model (21) is followed by a final model (22). The model only has to tell whether a stretch of signal is an initial or a final; unlike a conventional recognizer, it does not need to identify which particular initial or final it is. The generic model can be built either by pooling the training data of all initials and of all finals and training one generic model for each class, or by merging existing initial-specific and final-specific models.
Once the generic initial/final model has been built, the speech model and the signal matching algorithm are used to match initial and final segments in the speech signal and so find the syllable endpoints. The matching algorithm is the Viterbi algorithm. As shown in Figure 3, if the input signal is assumed to contain at most L syllables, the matching computation yields the maximum likelihood values ML_1 through ML_L for syllable counts 1 through L, each of which corresponds to one group of syllable endpoints (31).

Because Viterbi segmentation can contain a few insertion or deletion errors, the true number of syllables in the input may differ by one or two from the syllable count with the highest maximum likelihood. To tolerate such errors, the N highest values among ML_1 through ML_L are selected rather than only the single highest. The syllable counts corresponding to these N values then include the true syllable count of the input with high probability. Besides the syllable counts, the computation also yields the positions of the syllable segments in the signal. Connecting these segments forms a syllable connection graph, which likewise contains the correctly connected syllable segments with high probability.

Figure 4 shows a concrete example. The waveform of the speech signal "從新竹到桃園" ("from Hsinchu to Taoyuan") (41) is matched with the Viterbi algorithm, and the three highest maximum likelihood values are selected. They correspond to syllable counts of 6, 7, and 8, which include the correct count of 6. Merging the three corresponding syllable endpoint groups (42, 43, 44) gives all possible syllable endpoints 0 through 9 and the syllable segments (each drawn as an arrow from its start point to its end point), and these segments connect to form the syllable connection graph (45).
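The per-syllable-count matching and the N-best selection described above can be illustrated with a small dynamic program. This is only a sketch under stated assumptions: it replaces the Viterbi match against the initial/final models with an abstract, hypothetical `segment_score(start, end)` function, and the names `best_segmentations`, `top_n_hypotheses`, and `toy_score` are illustrative rather than taken from the patent.

```python
import math


def best_segmentations(num_frames, max_syllables, segment_score):
    """For each syllable count, find the best split of frames [0, num_frames).

    Returns {count: (total_score, endpoints)}, one entry per feasible count.
    """
    # best[c][t]: best total score for covering frames [0, t) with exactly c segments.
    best = [[-math.inf] * (num_frames + 1) for _ in range(max_syllables + 1)]
    back = [[None] * (num_frames + 1) for _ in range(max_syllables + 1)]
    best[0][0] = 0.0
    for count in range(1, max_syllables + 1):
        for t in range(1, num_frames + 1):
            for s in range(t):
                if best[count - 1][s] == -math.inf:
                    continue
                candidate = best[count - 1][s] + segment_score(s, t)
                if candidate > best[count][t]:
                    best[count][t], back[count][t] = candidate, s

    results = {}
    for count in range(1, max_syllables + 1):
        if best[count][num_frames] == -math.inf:
            continue
        endpoints, t = [num_frames], num_frames
        for level in range(count, 0, -1):  # follow back-pointers to recover endpoints
            t = back[level][t]
            endpoints.append(t)
        results[count] = (best[count][num_frames], endpoints[::-1])
    return results


def top_n_hypotheses(results, n):
    """Return the n syllable counts with the highest scores (the N-best values)."""
    return sorted(results, key=lambda count: results[count][0], reverse=True)[:n]


def toy_score(start, end):
    # Hypothetical stand-in for an acoustic log-likelihood: favors 3-frame syllables.
    return -((end - start - 3) ** 2)


results = best_segmentations(num_frames=9, max_syllables=4, segment_score=toy_score)
print(results[3])                    # best 3-syllable split and its endpoint group
print(top_n_hypotheses(results, 3))  # syllable counts of the three best hypotheses
```

Keeping several syllable counts instead of one is what lets the later stages tolerate an inserted or deleted syllable: each retained count contributes its own endpoint group.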
第9頁 4497 3 4 五、發明說明(7) 二階段之關鍵詞搜尋之處理步驟(S2 ),亦即自字音連接 圖中尋找關鍵詞,以前述之範例說明,若候選之關鍵詞為 29個台灣的主要地名,而假設關鍵詞彙之内容均為2個字 音之台灣地名,則第四圓之字音連接圖(45 )即包含有! j 個含2字音的字音段洛0-2、〇-3、1-3、1_4、2-4、3_5、 4-6、4-7、5-8、5-9及6-9,其中字音段落1-4與段落5-9 即是「新竹」與「桃園」的發音,因此當以例如基於隱藏 式馬可夫模型(Hidden Markov Model,HMM)之中文語音 辨識器對此段落作辨識時,應可得到正確的結果,此外, 其餘的段落經由辨識也會各個得到一地名,因此,每個段 落辨識所得的结果只能視為產生關鍵詞的假定 (Hypotheses ),故經辨識後會有11個關鍵詞的假定,這 些假定需進一步的轉認以對辨識結果做出接受或拒絕之決 定,而最後被接受之確認結果即為萃取到的關鍵詞。對於 此範例而言’其辨識及確認之結果為: 段落 辨 識 結 果 確認 結果 0-2 厂 中 壢 J 拒 絕 0-3 厂 中 壢 J 拒 絕 1-3 厂 新 營 J 拒 絕 1-4 厂 新 竹 J 接 受 2-4 厂 新 竹 J 拒 絕 5-8 厂 桃 園 J 拒 絕 5-9 厂 桃 園 J 接 受Page 9 4497 3 4 V. Description of the invention (7) The processing step (S2) of the keyword search in the second stage, that is, searching for keywords from the phonetic connection diagram, using the example described above, if the candidate keyword is 29 This is the main place name in Taiwan, and assuming that the contents of the keyword collection are all 2 place names in Taiwan, the connection map (45) of the fourth circle contains it! j syllables with 2 syllables 0-2, 0-3, 1-3, 1_4, 2-4, 3_5, 4-6, 4-7, 5-8, 5-9, and 6-9, of which The phonetic paragraphs 1-4 and 5-9 are the pronunciations of "Hsinchu" and "Taoyuan". Therefore, when the Chinese speech recognizer based on the Hidden Markov Model (HMM) is used to recognize this paragraph, The correct result should be obtained. In addition, the remaining paragraphs will each get a place name through identification. Therefore, the result of each paragraph identification can only be regarded as a hypothesis of generating keywords (Hypotheses), so after identification, there will be 11 Keyword hypotheses. These hypotheses need to be further acknowledged to make an acceptance or rejection decision on the identification result, and the final accepted confirmation result is the extracted keywords. 
For this example, the recognition and verification results are:

Segment | Recognition result | Verification result |
---|---|---|
0-2 | 中壢 (Zhongli) | Rejected |
0-3 | 中壢 (Zhongli) | Rejected |
1-3 | 新營 (Xinying) | Rejected |
1-4 | 新竹 (Hsinchu) | Accepted |
2-4 | 新竹 (Hsinchu) | Rejected |
5-8 | 桃園 (Taoyuan) | Rejected |
5-9 | 桃園 (Taoyuan) | Accepted |
其中,段落卜4與段落5-9所獲得之確認分數較高,辨識結 果因而被接受’其餘的段落不管辨識結果為何將因低分而 被拒絕,因此,段落1 -4所對應之關鍵詞「新竹」與段落 5 - 9所對應之關鍵祠「桃園」即為萃取所得之關鍵詞。 又為增加中文關鍵詞萃取之速度,本發明之另一較佳 實施例係以縮減所假定之關鍵詞數目及候選之關鍵詞數目 來減少確認及辨識之次數’與前—實施例不同之處在於其 係依據每個字音段落所提供之資訊來過濾掉根本不可能^ 為關鍵詞的段落,當中較為可靠的資訊即為每個字音段落 的韻母辨識結果C中文語音中韻母辨識較可靠),因此, 若對每個字音段落取前F名韻母辨識的結果(一般辨識器 取F = 5至10可得到98%以上的正確率)作為候選韻母,則對 含k子音的關鍵祠而§ ,只要檢視每個字音的韻母是否有 包含在前F名韻母辨識結果’即可得知接續的k個字音段落 疋否有可能為遠關鍵詞’又對此一 k字音段落而言,原本 在辨識時需對每個含k字音的候選關鍵詞一一做比對’,'但 若有了每個字音前F名韻母的資訊’則只需對篩選剩下 韻母符合之候選關鍵詞做比對,但如果篩選後沒有剩下 何候選關鍵詞,即表示此段落各字音的前F名韻母都不 在於關鍵詞中,便可不必再作進一步之處理,因此可大 減少關鍵詞確認及辨識之次數而有效増進萃取之速度。: 以前一實施例之範例而言,可將原先的〗丨個字音段=诘 7J固’亦即確認的動作可由u次減至7次,而在辨識時間之 即省上,原先每個字音段落作辨識時需要對29個地名詞音Among them, paragraphs 4 and 5-9 have higher confirmation scores, and the recognition results are accepted. The remaining paragraphs will be rejected due to low scores regardless of the recognition results. Therefore, the keywords corresponding to paragraphs 1-4 The key shrine corresponding to "Hsinchu" and paragraphs 5-9, "Taoyuan", is the key word from the extraction. In order to increase the speed of Chinese keyword extraction, another preferred embodiment of the present invention is to reduce the number of confirmations and recognitions by reducing the number of assumed keywords and candidate keywords. The reason is that it is based on the information provided by each phonetic paragraph to filter out paragraphs that are impossible to use as keywords. The more reliable information is the final result of each phonetic paragraph. C Chinese phonetic vowel recognition is more reliable.) Therefore, if the result of the first F name vowel recognition is taken for each syllable paragraph (general recognizer takes F = 5 to 10 to obtain a 98% accuracy rate) as the candidate vowel, then for the key temples containing k consonants, §, As long as you check whether the finals of each vowel contain the recognition result of the first F vowels, you can know whether the subsequent k-character paragraphs are likely to be distant keywords. 
And for this k-character paragraph, it was originally identified When comparing each candidate key word with k vowels one by one, 'but if you have the information of the first F vowels of each vowel', you only need to screen the candidate keywords that the remaining vowels match. Comparison, but if there are no candidate keywords left after screening, it means that the first F vowels of each syllable of this paragraph are not in the keywords, and no further processing is required, so the keyword confirmation and The number of identifications and the speed of effective extraction. : For the example of the previous embodiment, the original 〖丨 character segment = J7Jsolid ', that is, the confirmation action can be reduced from u times to 7 times, and in the time of recognition, the original, each character sound When identifying paragraphs, you need to pronounce 29 place nouns.
第11頁 449734 五、發明說明(9) 做比對’用以本實施例之方法師選後母字音段落(如 ^的話)候選詞只剩1至2個需做比對,故可明顯加速關鍵 ’萃取之速度。 本發明之免用贅詞模型之中文關鍵詞萃取方法在實際 之試驗下確有極佳之表現,在以應用領域為電話中個人分 機號碼之查詢且關鍵詞為2 〇 〇個人名的條件下進行測試, 使用者在電話中只要說出如:『請轉XXX』或『請問χχχ的 &機號碼』則系統需萃取出人名而對應到分機號碼,習知 採用贅詞模型之關鍵詞萃取方法在〗5 2個測試語句中有j 〇 句萃取失敗,而在使用相同的辨識器之情況下,本發明之 方法有8句萃取失敗,因此,其確實能夠在不必針對應用 領域建立贅詞模型之要求下達成極佳之關鍵詞萃取效果。 _综上所陳,本發明無論就目的、手段及功效,在在均 顯不其週異於習知技術之特徵,為中文關鍵詞萃取方法之 λ计上的~大突破,懇請貴審查委員明察,早曰賜准專 利,俾嘉惠社會,實感德便。惟應注意的是,上述諸多實 施例僅係為了便於說明而舉例而已,本發明所主張之權利 fe圍自應以申請專利範圍所述為準,而非僅限於上述實施Page 11 449734 V. Description of the invention (9) Comparisons' The method used in this example is to select only one or two candidate words in the vowel section (such as ^) for comparison, so it can be significantly accelerated. The key 'extraction speed. The Chinese keyword extraction method of the free word model of the present invention has excellent performance under actual experiments. Under the condition that the application field is the query of the personal extension number in the telephone and the keywords are 2000 personal names For testing, the user only needs to say “Please transfer to XXX” or “Ask & phone number” on the phone, the system needs to extract the name of the person and correspond to the extension number. The keyword extraction using the redundant word model is known. The method failed to extract j 0 sentences in 5 of the 2 test sentences, and in the case of using the same recognizer, the method failed to extract 8 sentences. Therefore, it does not need to establish redundant words for the application field. Under the requirements of the model, it achieves an excellent keyword extraction effect. _In summary, the present invention, regardless of its purpose, means and efficacy, is different from the known technology in its characteristics. It is a breakthrough on the lambda meter of the Chinese keyword extraction method. Observing clearly, granting a quasi-patent as early as possible, and benefiting the society, I feel a sense of virtue. 
Note that the embodiments above are only examples given for ease of explanation. The scope of rights claimed by the present invention is defined by the claims and is not limited to the embodiments above.
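The finals-based pre-filter of the second embodiment can be sketched as follows. This is a minimal illustration, not the patent's implementation: the four-entry lexicon, the pinyin spelling of the finals, the top-F lists (F = 3 here), and the name `surviving_candidates` are all assumptions made for the example.

```python
def surviving_candidates(segment_finals, lexicon):
    """Return the keywords whose finals all appear in the top-F lists of the matching syllables.

    segment_finals: one set of top-F candidate finals per syllable of the segment.
    lexicon: mapping from keyword to a tuple of finals, one per syllable.
    """
    k = len(segment_finals)
    return [
        word
        for word, finals in lexicon.items()
        if len(finals) == k and all(f in top for f, top in zip(finals, segment_finals))
    ]


# Hypothetical lexicon: four two-syllable place names with their pinyin finals.
lexicon = {
    "新竹": ("in", "u"),    # xin zhu
    "桃園": ("ao", "üan"),  # tao yuan
    "中壢": ("ong", "i"),   # zhong li
    "新營": ("in", "ing"),  # xin ying
}

# Hypothetical top-F finals for the two syllables of one segment.
matching_segment = [{"in", "ing", "en"}, {"u", "ou", "o"}]
kept = surviving_candidates(matching_segment, lexicon)
print(kept)  # only these keywords still need full recognition

# A segment whose finals match no keyword is dropped without recognition or verification.
unmatched_segment = [{"a", "ai", "an"}, {"e", "ei", "en"}]
dropped = surviving_candidates(unmatched_segment, lexicon)
print(dropped)
```

The filter is deliberately cheap: it only does set lookups on finals, so the more expensive recognition and verification steps run on far fewer segment and keyword pairs.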
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW88115161A TW449734B (en) | 1999-09-03 | 1999-09-03 | Keyword spotting method for mandarin speech without using filler models |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW88115161A TW449734B (en) | 1999-09-03 | 1999-09-03 | Keyword spotting method for mandarin speech without using filler models |
Publications (1)
Publication Number | Publication Date |
---|---|
TW449734B (en) | 2001-08-11 |
Family
ID=21642165
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW88115161A TW449734B (en) | 1999-09-03 | 1999-09-03 | Keyword spotting method for mandarin speech without using filler models |
Country Status (1)
Country | Link |
---|---|
TW (1) | TW449734B (en) |
-
1999
- 1999-09-03 TW TW88115161A patent/TW449734B/en not_active IP Right Cessation
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR100769029B1 (en) | Method and system for voice recognition of names in multiple languages | |
US8380505B2 (en) | System for recognizing speech for searching a database | |
JP7200405B2 (en) | Context Bias for Speech Recognition | |
US20140244258A1 (en) | Speech recognition method of sentence having multiple instructions | |
JP3542026B2 (en) | Speech recognition system, speech recognition method, and computer-readable recording medium | |
JP2001005488A (en) | Voice interactive system | |
US20080133245A1 (en) | Methods for speech-to-speech translation | |
US20070016420A1 (en) | Dictionary lookup for mobile devices using spelling recognition | |
Novitasari et al. | Cross-lingual machine speech chain for javanese, sundanese, balinese, and bataks speech recognition and synthesis | |
US20170270923A1 (en) | Voice processing device and voice processing method | |
Mohanty et al. | Speaker identification using SVM during Oriya speech recognition | |
US20200372110A1 (en) | Method of creating a demographic based personalized pronunciation dictionary | |
JP2014164261A (en) | Information processor and information processing method | |
TW449734B (en) | Keyword spotting method for mandarin speech without using filler models | |
CN111429886B (en) | Voice recognition method and system | |
Vancha et al. | Word-level speech dataset creation for sourashtra and recognition system using kaldi | |
Zheng | A syllable-synchronous network search algorithm for word decoding in Chinese speech recognition | |
JP2002215184A (en) | Speech recognition device and program for the same | |
JP2001195081A (en) | Japanese dictation system | |
KR101095864B1 (en) | Apparatus and method for generating N-best hypothesis based on confusion matrix and confidence measure in speech recognition of connected Digits | |
Pranjol et al. | Bengali speech recognition: An overview | |
KR20030010979A (en) | Continuous speech recognization method utilizing meaning-word-based model and the apparatus | |
JP3881155B2 (en) | Speech recognition method and apparatus | |
JP2001188556A (en) | Method and device for voice recognition | |
JP2001013992A (en) | Voice understanding device |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
GD4A | Issue of patent certificate for granted invention patent | ||
MM4A | Annulment or lapse of patent due to non-payment of fees |