TW200538969A - Handwriting and voice input with automatic correction

Info

Publication number: TW200538969A
Application number: TW094103440A
Authority: TW
Inventors: Alex Robinson; Ethan Bradford; David Kay; Van Meurs Pim; James Stephanick
Original assignee: America Online Inc
Priority date: 2004-02-11
Filing date: 2005-02-03
Publication date: 2005-12-01
Also published as: KR100912753B1; AU2005211782A1; WO2005077098A3; BRPI0507577A; EP1714234A2; WO2005077098A8; WO2005077098B1; CA2556065C; CN1918578A; WO2005077098A2; KR20070090075A; CN1918578B; CA2556065A1; EP1714234A4; AU2005211782B2; JP2007524949A

Abstract

A hybrid approach to improving handwriting recognition and voice recognition in data processing systems is disclosed. In one embodiment, a front end is used to recognize strokes, characters and/or phonemes. The front end returns candidates with relative or absolute probabilities of matching the input. Based on linguistic characteristics of the language (e.g., whether the words being entered belong to an alphabetical or ideographic language, the frequency of the words and phrases being used, the likely part of speech of the word entered, the morphology of the language, or the context in which the word is entered), a back end combines the candidates determined by the front end from the inputs for words, matching them against known words and the probabilities of use of such words in the current context.

Description

[Technical Field]

The invention relates to the recognition of human language input using data processing systems, such as handwriting recognition and voice recognition on desktop computers, handheld computers, and personal data assistants.

[Prior Art]

Text entry on small devices is a challenging problem because of memory constraints, severe size restrictions of the form factor, and severe limits on the controls (buttons, menus, etc.) for entering and correcting text. Today's handheld computing devices that accept text input are becoming smaller still. Recent advances from portable computers, handheld computers, and personal data assistants to two-way paging, cellular telephones, and other portable wireless technologies have led to a demand for a small, portable, user-friendly user interface that accepts text input for composing documents and messages, such as for two-way messaging systems, and especially for systems that can both send and receive electronic mail (e-mail) or short messages.

For many years, portable computers have been getting smaller and smaller. One size-limiting component in the effort to produce a smaller portable computer has been the keyboard. If standard typewriter-size keys are used, the portable computer must be at least as large as the keyboard. Miniature keyboards have been used on portable computers, but their keys are too small to be manipulated easily or quickly with sufficient accuracy by a user. Incorporating a full-size keyboard in a portable computer also hinders true portable use of the computer. Most portable computers cannot be operated without being placed on a flat work surface so that the user can type with both hands. A user cannot easily use a portable computer while standing or moving.

Handwriting recognition is one approach that has been taken to solve the text input problem on small devices that have an electronically sensitive screen or pad that detects the motion of a finger or stylus. In the latest generation of small portable computers, called personal digital assistants (PDAs), companies have attempted to address this problem by incorporating handwriting recognition software in the PDA. A user may enter text directly by writing on a touch-sensitive panel or display screen.

This handwritten text is then converted into digital data by the recognition software. Typically, the user writes one character at a time and the PDA recognizes one character at a time. The writing on the touch-sensitive panel or display screen generates a stream of data input indicating the contact points. The handwriting recognition software analyzes the geometric characteristics of this data stream to determine a character that matches what the user is writing. The handwriting recognition software typically performs geometric pattern recognition to determine the handwritten characters.

Unfortunately, current handwriting recognition solutions have many problems. Handwriting recognition software is not very accurate even on powerful personal computers. On small devices, memory limitations further limit handwriting recognition accuracy. Individual handwriting styles may also differ from those used to train the handwriting software. For these reasons, many handwriting or "graffiti" products require the user to learn a very specific set of strokes for the individual letters.

These specific sets of strokes are designed to simplify the geometric pattern recognition process of the system and to increase the recognition rate. The strokes are often very different from the natural way in which the letter is written. The end result of the problems mentioned above is very low product adoption.

Voice recognition is another approach that has been taken to solve the text input problem. A voice recognition system typically includes a microphone to detect and record the voice input. The voice input is digitized and analyzed to extract a voice pattern. Voice recognition typically requires a powerful system to process the voice input. Some voice recognition systems with limited capability have been implemented on small devices, such as on cellular phones for voice-controlled operations. For voice-controlled operations, a device only needs to recognize a few commands. Even for such a limited scope of voice recognition, a small device typically does not have satisfactory recognition accuracy, because voice patterns vary among different users and under different circumstances.

It would be advantageous to develop a more practical system for processing human language input that is provided in a user-friendly fashion, such as a handwriting recognition system for handwriting written in a natural way or a voice recognition system for voice input spoken in a natural way, with improved accuracy and reduced computational requirements, such as reduced memory and processing power requirements.

[Summary of the Invention]

A hybrid approach to improving handwriting recognition and voice recognition in data processing systems is described herein. In one embodiment, a front end is used to recognize strokes, characters, syllables, and/or phonemes. The front end returns candidates with relative or absolute probabilities of matching the input. Based on linguistic characteristics of the language (e.g., whether the words being entered belong to an alphabetical or ideographic language, the frequency of the words and phrases being used, the likely part of speech of the word entered, the morphology of the language, or the context in which the word is entered), a back end combines the candidates determined by the front end from the inputs for words, matching them against known words and the probabilities of use of such words in the current context. The back end may use wildcards to select word candidates, use linguistic characteristics to predict a word to be completed or the entire next word, present word candidates for user selection, and/or provide added output to help the user, such as automatic accenting of characters, automatic capitalization, and automatic addition of punctuation and delimiters. In one embodiment, a single linguistic back end is used simultaneously for multiple input modalities, such as speech recognition, handwriting recognition, and keyboard input.

One embodiment of the invention comprises a method of processing language input on a data processing system. The method comprises: receiving a plurality of recognition results for a plurality of word components, respectively, for processing a user input of a word of a language; and determining one or more word candidates for the user input of the word from the plurality of recognition results and from data indicating the probability of usage of a list of words. At least one of the plurality of recognition results comprises a plurality of word component candidates and a plurality of probability indicators. The probability indicators indicate, relative to each other, the degrees of probability that the word component candidates match a portion of the user input.

In one embodiment, the word component candidates comprise one of a stroke from handwriting recognition, a character from handwriting recognition, and a phoneme from speech recognition. The language may be alphabetical or ideographic.

In one embodiment, determining the one or more word candidates comprises any of: eliminating a plurality of combinations of word component candidates of the plurality of recognition results; selecting a plurality of word candidates from a list of words of the language, the word candidates containing combinations of word component candidates of the plurality of recognition results; determining, from the plurality of recognition results and from the data indicating the probability of usage of a list of words, one or more likelihood indicators for the one or more word candidates to indicate their relative likelihood of matching the user input of the word; and sorting the one or more word candidates according to the one or more likelihood indicators.

In one embodiment, one candidate is automatically selected from the one or more word candidates and presented to the user. The automatic selection may be performed according to any of phrases in the language, word pairs in the language, and word trigrams in the language. It may also be performed according to any of the morphology of the language and the grammatical rules of the language, or according to the context in which the user input of the word is received.

In one embodiment, the method further comprises predicting a plurality of word candidates based on the automatically selected word, in anticipation of a user input of a next word.

In one embodiment, the method comprises presenting the one or more word candidates for user selection and receiving a user input to select one of the word candidates.

In one embodiment, one of the plurality of recognition results for a word component comprises an indication that any one of a set of word component candidates has an equal probability of matching a portion of the user input for the word. The data indicating the probability of usage of the list of words may comprise any of the frequencies of word usage in the language, the frequencies of word usage by a user, and the frequencies of word usage in a document.

In one embodiment, the method further comprises any of automatically accenting one or more characters, automatically capitalizing one or more characters, automatically adding one or more punctuation symbols, and automatically adding one or more delimiters.

One embodiment of the invention comprises a method for recognizing language input on a data processing system. The method comprises: processing a user input of a word of a language through pattern recognition to generate a plurality of recognition results for a plurality of word components, respectively; and determining one or more word candidates for the user input of the word from the plurality of recognition results and from data indicating the probability of usage of a list of words. At least one of the plurality of recognition results comprises a plurality of word component candidates and a plurality of probability indicators. The probability indicators indicate, relative to each other, the degrees of probability that the word component candidates match a portion of the user input.

The pattern recognition may include handwriting recognition, in which each of the word component candidates includes a stroke (e.g., for an ideographic language symbol or an alphabetical character) or a character (e.g., for an alphabetical language). The word may be an alphabetical word or an ideographic language symbol. The pattern recognition may include speech recognition, in which each of the word component candidates comprises a phoneme.

In one embodiment, one of the plurality of recognition results for a word component comprises an indication that any one of a set of word component candidates has an equal probability of matching a portion of the user input for the word. The set of word component candidates may comprise all alphabetic characters of the language.

The data indicating the probability of usage of the list of words may comprise any of the frequencies of word usage in the language, the frequencies of word usage by a user, and the frequencies of word usage in a document. It may comprise any of data representing the morphology of the language and data representing the grammatical rules of the language. It may also comprise data representing the context in which the user input of the word is received.

In one embodiment, the user input specifies only a portion of the complete set of word components for the word, and the system determines the word candidates.

In one embodiment, the one or more word candidates comprise some words formed from combinations of the word component candidates in the plurality of recognition results and some words that contain such combinations.

In one embodiment, the one or more word candidates comprise a plurality of word candidates. The method further comprises presenting the plurality of word candidates for selection and receiving a user input to select one of the plurality of word candidates.
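The equal-probability recognition result described above behaves like a wildcard for one position of the word. The following Python fragment is a minimal sketch of that idea, not the patented implementation: the dictionary representation, the function names `wildcard` and `match_probability`, and the sample probabilities are all assumptions made for illustration.

```python
import string

# Assumed alphabet for the illustration; a real system would use the
# character set of the language being entered.
ALPHABET = string.ascii_lowercase


def wildcard():
    """Recognition result in which every letter is an equally likely match."""
    p = 1.0 / len(ALPHABET)
    return {letter: p for letter in ALPHABET}


def match_probability(word, results):
    """Probability that `word` matches the per-position recognition results."""
    if len(word) != len(results):
        return 0.0
    probability = 1.0
    for letter, candidates in zip(word, results):
        # A letter that the front end did not propose contributes zero.
        probability *= candidates.get(letter, 0.0)
    return probability


# Hypothetical input: first letter ambiguous, second unreadable, third clear.
results = [{"c": 0.7, "e": 0.3}, wildcard(), {"t": 1.0}]
words = ["cat", "cot", "eat", "dog"]
matches = [w for w in words if match_probability(w, results) > 0]
print(matches)
```

Because the wildcard contributes the same factor to every word, the ranking among the surviving words is decided by the other positions and by whatever usage data the back end applies afterwards.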

In one embodiment, the method further comprises predicting one or more word candidates based on the selected word, in anticipation of a user input of a next word. In one embodiment, the plurality of word candidates are presented in order of their likelihood of matching the user input of the word.

In one embodiment, the method further comprises automatically selecting the most likely of the one or more word candidates as the recognized word for the user input of the word. In one embodiment, the method further comprises predicting one or more word candidates based on that most likely word, in anticipation of a user input of a next word.

In one embodiment, the method further comprises any of automatically accenting one or more characters, automatically capitalizing one or more characters, automatically adding one or more punctuation symbols, and automatically adding one or more delimiters.

In one embodiment, each of the plurality of recognition results comprises a plurality of probability indicators associated with a plurality of word component candidates, respectively, to indicate their relative likelihood of matching a portion of the user input.

[Embodiments]

Input methods such as handwriting recognition and speech recognition can be important alternatives to traditional keyboard-based input methods, especially for small devices such as handheld computers, personal data assistants, and cellular phones. Traditional handwriting and speech recognition systems face the difficulty of requiring more memory than is available on small electronic devices. The invention advances the art of text and speech input on these devices by using automatic correction to reduce the memory and processing power required by the handwriting or speech recognition engine.

The invention uses a hybrid approach to improve the handwriting recognition and voice recognition of data processing systems. In one embodiment, a front end recognizes strokes, characters, syllables, and/or phonemes and returns candidates with relative or absolute probabilities of matching the input. Instead of using the front end to select only one candidate, different candidates can be returned for further processing by a back end. The back end combines the candidates determined by the front end from the inputs for words, matching them against known words and the probabilities of use of such words in the current context. By combining the front end and the back end, the invention provides a system with an improved recognition rate that is more user friendly. An efficient implementation of handwriting and voice recognition input with low memory and CPU usage thus becomes feasible.

In this description, a "word" means any linguistic object, such as a string of one or more characters or symbols forming a word, word stem, prefix or suffix, syllable, phrase, abbreviation, chat slang, emoticon, user ID, URL, or ideographic character sequence.

In one embodiment of the invention, a front end is used to perform pattern recognition on the language input, such as handwriting or voice input. Many different techniques have been used to match the input against a number of target patterns, such as strokes, characters in handwriting, and phonemes in voice input.

Typically, an input matches a number of target patterns to different degrees. For example, a handwritten letter may look like the character "a", "c", "o", or "e". Currently available pattern recognition techniques can determine the likelihood that the handwritten letter is any one of these characters. However, a recognition system is typically forced to report only one match, so the character with the highest probability of matching is typically reported as the recognition result.

In one embodiment of the invention, instead of prematurely eliminating the other candidates to obtain a single match, which may be incorrect, a number of candidates are propagated to the back end as possible choices. The back end then uses the context to determine the more likely combinations of the candidates for the language input as a whole, such as a word, a phrase, a word pair, a word trigram, or a word that fits the context of a sentence (e.g., according to grammatical construction). For example, different word candidates can be determined from the combinations of the different character candidates for the word the user is trying to input. From the frequencies with which the words are used in the language and the relative or absolute probabilities of matching of the character candidates, the back end can determine the word the user is most likely inputting. This is in contrast to traditional methods, which provide a set of individually determined most likely characters that may not even make up a meaningful word.

Thus, the invention combines disambiguating word look-up software with a handwriting recognition (HR) engine or a speech recognition (SR) engine to provide a powerful solution to the persistent problem of text and speech input on small electronic devices, such as personal digital assistants, telephones, or any of the many specialized devices used in industry for entering text and data in the field.

In addition, the invention uses a single back-end engine to serve several input modalities (standard keyboard, handwriting, voice) effectively, with low memory and processor requirements.
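The contrast drawn above, between reporting the single best character per position and letting a back end weigh whole words, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the candidate probabilities, the tiny word list, and the scoring rule (product of the character probabilities multiplied by the word's usage frequency) are all hypothetical, since the text says only that the back end combines matching probabilities with usage frequencies.

```python
from itertools import product

# Hypothetical front-end output: one candidate list per written character,
# each candidate paired with its probability of matching that input.
recognition_results = [
    [("c", 0.40), ("e", 0.35), ("o", 0.25)],
    [("a", 0.40), ("o", 0.35), ("e", 0.25)],
    [("t", 0.90), ("l", 0.10)],
]

# Hypothetical word list with relative usage frequencies.
word_frequency = {"cat": 0.004, "eat": 0.006, "cot": 0.0005, "oat": 0.0003}


def rank_words(results, frequencies):
    """Score every known word the candidates can form; best match first."""
    scored = []
    for combo in product(*results):
        word = "".join(ch for ch, _ in combo)
        if word not in frequencies:
            continue  # this combination is not a known word, so drop it
        match_probability = 1.0
        for _, p in combo:
            match_probability *= p
        # Assumed scoring rule: matching probability times usage frequency.
        scored.append((word, match_probability * frequencies[word]))
    return sorted(scored, key=lambda item: item[1], reverse=True)


ranked = rank_words(recognition_results, word_frequency)

# What a front end forced to report one match per character would return.
best_by_character = "".join(
    max(candidates, key=lambda c: c[1])[0] for candidates in recognition_results
)

print([word for word, _ in ranked])
print(best_by_character)
```

With these sample numbers the character-by-character result is "cat", while the word-level ranking places "eat" first because its higher usage frequency outweighs the slightly lower matching probability of its first letter.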

FIG. 1 illustrates a diagram of a system for recognizing user input on a data processing system according to the invention.
After language input 101, such as handwriting or voice, is received at the pattern recognition engine 103, the pattern recognition engine 103 processes the input to provide word component candidates (e.g., characters, phonemes, or strokes) and their probabilities of matching the corresponding portions of the input 105. For example, an input for a character may match a list of character candidates, which causes ambiguity. In one embodiment, the ambiguity is tolerated at the front-end level and propagated to the linguistic disambiguation back end for further processing.

For example, a word-based disambiguation engine 107 checks the possible combinations of the characters against the word list 109 to generate word candidates and their associated probabilities of matching the user input 111. Because less frequently used words and unknown words (e.g., words not in the word list 109) are less likely to match the user input, such word candidates can be downgraded to a lower probability of matching, even though the result of the pattern recognition engine 103 alone would suggest a relatively high probability of matching. The word-based disambiguation engine 107 can eliminate some unlikely word candidates so that the user is not bothered with a huge list of choices. Alternatively, the word-based disambiguation engine may select the most likely word from the word candidates.

In one embodiment, if ambiguity exists in the output of the word-based disambiguation engine 107, a phrase-based disambiguation engine 113 further checks the result against the phrase list 115, which may include word bigrams, trigrams, and so on. One or more previously recognized words may be combined with the current word to match the phrases in the phrase list 115. The usage frequencies of the phrases can be used to modify the probabilities of matching of the word candidates, generating phrase candidates and their associated probabilities of matching 117. Even when there is no ambiguity, the phrase-based disambiguation engine may be used to predict the next word based on the previously recognized word and the phrase list 115.
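The phrase-based step described above can be illustrated with a short Python sketch. It is hypothetical throughout: the bigram counts, the function names `rerank` and `predict_next`, and the weighting `probability * (1 + count)` are assumptions chosen to keep the example small, and the resulting scores are unnormalized weights rather than true probabilities.

```python
# Hypothetical phrase list: word bigrams with usage counts.
bigram_counts = {
    ("let", "us"): 50,
    ("let", "me"): 80,
    ("eat", "me"): 1,
    ("eat", "out"): 20,
}


def rerank(previous_word, word_candidates, bigrams):
    """Reweight word candidates by how often each follows previous_word."""
    rescored = []
    for word, probability in word_candidates:
        count = bigrams.get((previous_word, word), 0)
        # Assumed weighting: unseen pairs keep their original score.
        rescored.append((word, probability * (1 + count)))
    return sorted(rescored, key=lambda item: item[1], reverse=True)


def predict_next(previous_word, bigrams, limit=3):
    """Suggest likely next words even when nothing has been entered yet."""
    followers = [
        (second, count)
        for (first, second), count in bigrams.items()
        if first == previous_word
    ]
    followers.sort(key=lambda item: item[1], reverse=True)
    return [word for word, _ in followers[:limit]]


# Word candidates from the word-based engine for the word after "let".
candidates = [("us", 0.50), ("me", 0.45), ("ms", 0.05)]
reranked = rerank("let", candidates, bigram_counts)

print([word for word, _ in reranked])
print(predict_next("let", bigram_counts))
```

Here "us" is the slightly better match on its own, but the more frequent phrase "let me" moves "me" to the top, and `predict_next` shows the same phrase list being used for next-word prediction when no ambiguity is involved.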
In one embodiment, if ambiguity exists in the output of the phrase-based disambiguation engine 113, a context and/or grammatical analysis 119 is performed to eliminate unlikely words and phrases. If the ambiguity cannot be resolved through the automated linguistic disambiguation process, the choices can be presented to the user for user selection 121. After the user selection, the word list 109 and the phrase list 115 may be updated to promote the words and phrases selected by the user and/or to add new words and phrases to the lists.

FIG. 2 is a block diagram of a data processing system for recognizing user input according to the invention. Although FIG. 2 illustrates various components of an example data processing system, it is understood that a data processing system according to one embodiment of the invention may in general include more or fewer components than those illustrated in FIG. 2. For example, some systems may not have voice recognition capability and may not need the components for processing sound. Some systems may have other functionality not illustrated in FIG. 2, such as communication circuitry in a cellular phone embodiment. FIG. 2 illustrates various components closely related to at least some features of the invention. A person skilled in the art will understand from this description that the arrangement of a data processing system according to the invention is not limited to the particular architecture illustrated in FIG. 2.

The display 203 is coupled to the processor 201 through appropriate interface circuitry. A handwriting input device 202, such as a touch screen, a mouse, or a digitizing pen, is coupled to the processor 201 to receive user input for handwriting recognition and/or other user input.
A voice input device 204, such as a microphone, is coupled to the processor 201 to receive user input for voice recognition and/or other sound input. Optionally, a sound output device 205, such as a speaker, is also coupled to the processor.

The processor 201 receives input from the input devices, such as the voice input device 204, and manages output to the display and the speaker. The processor 201 is coupled to a memory 210. The memory comprises a combination of temporary storage media, such as random access memory (RAM), and permanent storage media, such as read-only memory (ROM), floppy disks, hard disks, or CD-ROMs. The memory 210 contains all software routines and data necessary to govern system operation. The memory typically contains an operating system 211 and application programs 220. Examples of application programs include word processors, software dictionaries, and foreign language translators. Speech synthesis software may also be provided as an application program.

Preferably, the memory further contains a stroke/character recognition engine 212 for recognizing strokes and characters in the handwriting input and/or a phoneme recognition engine 213 for recognizing phonemes in the voice input. The phoneme recognition engine and the stroke/character recognition engine can use any technique known in the field to provide, for each stroke, character, or phoneme input, a list of candidates and their associated probabilities of matching. It is understood that the particular pattern recognition technique used in the front-end engine (e.g., the stroke/character recognition engine 212 or the phoneme recognition engine 213) is not germane to the invention.

In one embodiment of the invention, the memory 210 further includes a linguistic disambiguation back end, which may include one or more of a word-based disambiguation engine 216, a phrase-based disambiguation engine 217, a context-based disambiguation engine 218, a selection module 219, and other components such as a word list 214 and a phrase list 215.
In this embodiment, the context-based disambiguation engine applies contextual aspects of the user's actions to help disambiguate the input. For example, a vocabulary may be selected according to the user's location, such as whether the user is at the office or at home; the time of day, such as working hours or leisure time; the recipient; and so on.

In one embodiment of the invention, most of the components of the disambiguation back end are shared among different input modalities, such as handwriting recognition and speech recognition.

The word list 214 contains a list of known words in a language. The word list 214 may further include frequency-of-use information for the corresponding words in the language. In one embodiment, a word that is not in the word list 214 for the language is considered to have a frequency of zero. Alternatively, an unknown word may be assigned a very small frequency of use. With an assumed frequency of use for unknown words, known and unknown words can be processed in substantially the same way. The word list 214 can be used with the word-based disambiguation engine 216 to rank, eliminate, and/or select word candidates determined from the results of the pattern recognition front end (for example, the stroke/character recognition engine 212 or the phoneme recognition engine 213), and to predict words for word completion based on a portion of the user input. Similarly, the phrase list 215 may contain a list of phrases, each including two or more words, together with frequency-of-use information; the phrase list can be used by the phrase-based disambiguation engine 217 and can be used to predict words for phrase completion.
In one embodiment of the invention, each input sequence is processed with reference to one or more vocabulary modules, each of which contains one or more words together with information about each word, including the number of characters in the word and the relative frequency of occurrence of the word with respect to other words of the same length. Alternatively, information regarding the vocabulary module or modules of which a given word is a member is stored with each word. A module may also modify or generate words based on linguistic patterns, such as placing a diacritic mark on a particular syllable, or may generate or filter word candidates based on any other algorithm for interpreting the current input sequence and/or the surrounding context.

In one embodiment, each input sequence is processed by a pattern recognition front end to provide a sequence of lists of candidates, such as strokes, characters, syllables, or phonemes. Different combinations of the candidates yield different word candidates. The disambiguation back end combines the match probabilities of the candidates with the frequency of use of the word candidates to rank, eliminate, and/or select one or more words as alternatives for the user to choose from. Words with a higher frequency of use are highly likely candidates; unknown words or words with a lower frequency of use are less likely candidates. The selection module 219 selectively presents a number of highly likely words from which the user may select.

In another embodiment of the invention, the frequency of use of a word is further based on the usage of the user or on the usage of the word in a particular context, such as in a message or article the user is composing. Frequently used words thus become more likely.

In another embodiment, words are stored in each vocabulary module such that they are grouped into clusters or files consisting of words of the same length.
Each input sequence is first processed by searching the group of words whose length equals the number of inputs in the input sequence, and identifying the candidate words with the best matching scores. If fewer than a threshold number of candidate words are identified that have the same length as the input sequence, the system proceeds to compare the input sequence of N inputs with the first N letters of each word in the group of words of length N+1. This process continues, searching groups of progressively longer words and comparing the input sequence of N inputs with the first N letters of each word in each group, until the threshold number of candidate words is identified. Viable candidate words longer than the input sequence can then be offered to the user as possible interpretations of the input sequence, providing a form of word completion.

During the installation phase, or whenever a text message or other data is received, information files are scanned for words to be added to the lexicon. Methods for scanning such information files are known in the art. New words found during scanning are added to a vocabulary module as low-frequency words and are therefore placed at the end of the word lists with which they are associated. Depending on the number of times a given new word is detected during scanning, it is assigned a relatively higher priority by promoting it within its associated list, which increases the likelihood that the word will appear in the word selection list during information entry.

In another aspect of the invention, for each input sequence a word is constructed by identifying the word component candidate with the highest probability for each input and composing a word from the sequence of those candidates. This "exact type" word is then included in the list of word candidates and may optionally be presented in a specially designated field.
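The search over groups of same-length words described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the names `build_length_groups` and `find_candidates` are hypothetical, and the "best matching score" is simplified to a yes/no test of whether each of the first N letters is among the candidates for the corresponding input.

```python
from collections import defaultdict


def build_length_groups(words):
    """Group a vocabulary into lists of words that share the same length."""
    groups = defaultdict(list)
    for word in words:
        groups[len(word)].append(word)
    return groups


def find_candidates(input_sets, groups, threshold, max_length):
    """Search words of length N first, then N+1, N+2, ... until at least
    `threshold` candidates are found (assumed simplification: a word matches
    when each of its first N letters is a candidate for that input)."""
    n = len(input_sets)
    found = []
    for length in range(n, max_length + 1):
        for word in groups.get(length, []):
            if all(word[i] in input_sets[i] for i in range(n)):
                found.append(word)
        if len(found) >= threshold:
            break  # enough candidates; longer words are not searched
    return found


groups = build_length_groups(["of", "off", "oft", "often", "offer", "after", "at"])
inputs = [{"o", "a"}, {"f", "t"}]  # two ambiguous inputs
print(find_candidates(inputs, groups, threshold=3, max_length=5))
```

With a threshold of 3, the two-letter group yields only two matches, so the sketch goes on to the three-letter group; the longer words it returns act as word completions.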
The lexicon may have an appendix of offensive words, each paired with a similar word of an acceptable nature, so that entering an offensive word, even through exact input of the letters comprising it, yields only the associated acceptable word in the exact-type field and, where appropriate, as a suggestion in the word selection list. This feature filters out offensive words that might otherwise appear unintentionally once the user learns that it is possible to type more quickly by paying less attention to touching the keyboard at the precise location of each intended letter.
Thus, using techniques well known in the art, before the exact-typed string is displayed, the software routine responsible for displaying the word selection list compares the current exact-typed string with the appendix of offensive words and, if a match is found, replaces the displayed string with the associated acceptable word. Otherwise, even when an offensive word is treated as a very low-frequency word, it would still appear as the exact-typed word whenever each of its letters is contacted directly. Although this is analogous to accidentally typing an offensive word on a standard keyboard, the present invention is designed to tolerate less accurate input by the user. This feature can be enabled or disabled by the user, for example through a system menu option.

Those skilled in the art will appreciate that additional vocabulary modules can be enabled within the computer, for example vocabulary modules containing legal terms, medical terms, and other languages. Further, in some languages, such as Hindi, the vocabulary module may employ "templates" of valid sub-word sequences to determine which word component candidates are possible or appropriate given the preceding inputs and the word candidates being considered. Via a system menu, the user can configure the system so that the words of an additional vocabulary appear first or last in the list of possible words, for example with special coloring or highlighting, or the system may automatically switch the order of the words according to which vocabulary module supplied the immediately preceding selected word. Consequently, within the scope of the appended claims, it will be appreciated that the invention can be practiced otherwise than as specifically described herein.
According to another aspect of the invention, while a user is using the system, the lexicon is automatically adjusted by a promotion algorithm that runs each time the user selects a word and promotes that word within the lexicon by incrementally increasing the relative frequency associated with it. In one embodiment, the promotion algorithm increases the frequency value associated with the selected word by a relatively large increment, while decreasing the frequency values of the words that were passed over by a very small decrement. For a vocabulary module in which relative frequency information is indicated by the sequential order in which words appear in a list, promotion is carried out by moving the selected word upward by some fraction of its distance from the head of the list. The promotion algorithm preferably avoids moving the most commonly used words and the very infrequently used words far from their original locations. For example, words in the middle range of the list are promoted by the largest fraction with each selection. Words lying between the positions where the selected word started and finished in the promotion are effectively demoted by one position. The word list as a whole is conserved, so that information about the relative frequency of the words in the list is maintained and updated without increasing the storage required for the list.

The promotion algorithm both increases the frequency of selected words and, where appropriate, decreases the frequency of words that are not selected. For example, in a lexicon in which relative frequency information is indicated by the sequential order in which words appear in a list, a selected word that appears at position IDX in the list is moved to position (IDX/2). Correspondingly, the words at positions (IDX/2) down through (IDX - 1) are moved down one position in the list.
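The order-based promotion step can be illustrated with a short sketch. The function name `promote`, the use of zero-based list positions, and the rounding of IDX/2 by integer division are assumptions made for the example; the text above does not specify them.

```python
def promote(words, selected):
    """Move `selected` from position IDX to position IDX // 2.

    Assumptions for illustration: positions are zero-based and IDX/2 is
    rounded down. The words that sat between the new and old positions each
    shift down by one, and the list length is unchanged, so no extra storage
    is needed to keep the relative ordering.
    """
    idx = words.index(selected)
    target = idx // 2
    words.insert(target, words.pop(idx))
    return words


ranking = list("abcdefgh")   # earlier in the list means more frequent
promote(ranking, "g")        # "g" moves from position 6 to position 3
print(ranking)
```

Because a word near the head of the list moves only a step or two, while a word far down the list moves a long way, repeated selections let genuinely common words rise quickly without churning the top of the list.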
Words are demoted in the list when a sequence of contact points is processed, a word selection list is generated based on the calculated matching metric values, and one or more words appear in the list ahead of the word selected by the user. Words that appear higher in the selection list but are not selected may be presumed to have been assigned an inappropriately high frequency; that is, they appear too high in the list. Such a word, initially at position IDX, is demoted by, for example, moving it to position (IDX*2 + 1). Thus, the more frequent a word is considered to be, the less it is demoted, in the sense that it is moved by a smaller number of steps.

The promotion and demotion processes may be triggered only in response to an action by the user, or may be performed differently depending on the user's input. For example, words that appear higher in the selection list than the word intended by the user are demoted only when the user selects the intended word by clicking and dragging it with a stylus or mouse to the foremost position in the word selection list. Alternatively, a selected word that is manually dragged to a higher position in the selection list may be promoted by a larger-than-normal factor. For example, the promoted word is moved from position IDX to position (IDX/3). Many such variations will be evident to those skilled in the art.

According to another aspect of the invention, the front end can detect systematic errors and adapt its recognition based on feedback from the back end. As the user repeatedly enters words and selects them from the selection list, the difference between the ranking of the word component candidates and the intended word components contained in each selected word can be used to change the probabilities generated by the front end. Alternatively, the back end may maintain an independent adjustment value for one or more strokes, characters, syllables, or phonemes received from the front end.

FIGS. 3A and 3B illustrate an example of disambiguation of the output of handwriting recognition software according to the invention. One embodiment of the invention combines a handwriting recognition engine with a module that takes from the handwriting engine all of the possible matches associated with each letter entered by the user, and combines these probabilities with the probabilities of words in the language to predict the most likely word or words that the user is attempting to enter. Any technique known in the art can be used to determine the possible matches and the probabilities associated with them.

For example, the user may enter five characters in an attempt to enter the five-letter word "often". The user input might appear as illustrated in FIG. 3A. The handwriting recognition software gives the following character and character probability output for the strokes:

Stroke 1 (301): 'o' 60%, 'a' 24%, 'v' 12%, [fourth candidate illegible]
Stroke 2 (302): 't' 40%, 'f' 34%, [illegible] 20%, [illegible] 6%
Stroke 3 (303): 't' 50%, 'f' 42%, [illegible] 4%, 'i' 4%
Stroke 4 (304): 'c' 40%, 'e' 32%, 's' 15%, [illegible] 13%
Stroke 5 (305): 'n' 42%, 'r' 30%, 'm' 16%, 'v' 12%

For example, stroke 301 has a 60% probability of being 'o', stroke 302 has a 40% probability of being 't', stroke 303 has a 50% probability of being 't', stroke 304 has a 40% probability of being 'c', and stroke 305 has a 42% probability of being 'n'. Putting together the letters that the handwriting recognition software considers the closest matches to the user's strokes, the handwriting software module presents the user with the string 'ottcn', which is not what the user intended to enter. It is not even a word in English.

One embodiment of the invention uses a disambiguating word look-up module to find the best prediction based on these characters, the probabilities of matching associated with the characters, and the frequencies of use of words in English. In one embodiment of the invention, the combined handwriting module and disambiguation module predict that the most likely word is 'often', which is the word the user was trying to enter.

For example, as illustrated in FIG. 3B, a back-end tool receives all of the candidates and determines a list of possible words that includes: ottcn, attcn, oftcn, aftcn, otfcn, atfcn, offcn, affcn, otten, atten, often, aften, otfen, atfen, offen, affen, ottcr, attcr, oftcr, aftcr, otfcr, atfcr, offcr, affcr, otter, atter, ofter, after, otfer, atfer, offer, affer, and so on. The possible words may be constructed by selecting character candidates from the highest probability of matching to the lowest, as determined by the front end. When one or more highly likely words have been found, character candidates with lower probabilities need not be used.

To simplify the description, in FIG. 3A it is assumed that unknown words have a frequency of use of 0 and that known words, such as often, after, and offer, have a frequency of use of 1. In FIG. 3A, an indicator of matching for a word candidate is computed as the product of the frequency of use and the probabilities of matching of the character candidates used in the word. For example, in FIG. 3A the probabilities of matching for the characters 'o', 'f', 't', 'e', and 'n' are 0.6, 0.34, 0.5, 0.32, and 0.42 respectively, and the frequency of use of the word "often" is 1. The indicator of matching for the word "often" is therefore 0.6 x 0.34 x 0.5 x 0.32 x 0.42 x 1 = 0.0137. Similarly, the indicators for the words "after" and "offer" are 0.0039 and 0.0082 respectively. When the back-end tool selects the most likely word, "often" is selected. Note that the "indicators" of the words can be normalized for ranking the word candidates.

In one embodiment of the invention, one or more inputs are explicit, that is, associated with a single stroke, character, syllable, or phoneme, so that the probability of matching that character, etc., is equal to 100%. In another embodiment of the invention, an explicit input results in a special set of values from the recognition front end that causes the disambiguation back end to match only that exact character, etc., in the corresponding position of each word candidate. In another embodiment of the invention, explicit inputs are reserved for digits, appropriate diacritics and accent marks, and/or other delimiters, and for punctuation within and between words.
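The indicator computation for this example can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the function name `rank_candidates` is hypothetical, only the two most likely character candidates per stroke are used, and known words are given a frequency of 1 and unknown words a frequency of 0, as in the simplified example above.

```python
from itertools import product

# Two most likely character candidates per stroke, taken from the example.
STROKES = [
    {"o": 0.60, "a": 0.24},
    {"t": 0.40, "f": 0.34},
    {"t": 0.50, "f": 0.42},
    {"c": 0.40, "e": 0.32},
    {"n": 0.42, "r": 0.30},
]

# Simplifying assumption from the example: known words have frequency 1.
WORD_FREQUENCY = {"often": 1.0, "after": 1.0, "offer": 1.0}


def rank_candidates(strokes, word_frequency, unknown_frequency=0.0):
    """Score every character combination as frequency x product of match
    probabilities, and return the non-zero candidates, best first."""
    scored = []
    for letters in product(*(s.items() for s in strokes)):
        word = "".join(ch for ch, _ in letters)
        match = 1.0
        for _, probability in letters:
            match *= probability
        indicator = word_frequency.get(word, unknown_frequency) * match
        if indicator > 0:
            scored.append((word, indicator))
    return sorted(scored, key=lambda item: item[1], reverse=True)


for word, indicator in rank_candidates(STROKES, WORD_FREQUENCY):
    print(f"{word}: {indicator:.4f}")
# often: 0.0137
# offer: 0.0082
# after: 0.0039
```

Passing a small non-zero `unknown_frequency` instead of 0 would keep strings such as "ottcn" in the ranking at a low score, which corresponds to the alternative of assigning unknown words a very small frequency of use.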

FIGS. 4A-4C illustrate scenarios of handwriting recognition on a user interface according to the invention. As illustrated in FIG. 4A, the device 401 includes an area in which the user writes the handwriting input 407. An area 403 is provided to display the message or article the user is entering, for example in a web browser, a note-taking program, or an email program. The device contains a touch screen area for the user to write on.

After processing the user's handwriting input 407, the device provides a list of word candidates in area 409 for the user to select from. The word candidates are ordered by likelihood of matching. The device may choose to present only the first few most likely word candidates. The user may select a word from the list using a conventional method, or by using a numeric key corresponding to the position of the word. Alternatively, the user may use a voice command to select the word, for example by saying the selected word or the number corresponding to the position of the word in the list. In the preferred embodiment, the most likely word is automatically selected and displayed in area 403. Thus, no user selection is necessary if the user accepts the candidate, for example by starting to write the next word. If the user does select a different word, the device replaces the automatically selected candidate with the candidate selected by the user. In another embodiment, the most likely word is highlighted as the default, indicating the user's current selection of a word to be output or extended by a subsequent action, and a designated input changes the highlighting to another word candidate. In another embodiment, a designated input selects one syllable or word for correction or re-entry from a multi-syllable sequence or multi-word phrase that has been entered or predicted.

FIG. 4C illustrates a situation in which contextual and/or grammatical analysis further helps to resolve the ambiguity. For example, in FIG. 4C the user has already entered the words "It is an". From a grammatical analysis, the device anticipates a noun as the next word. The device therefore further adjusts the ranking of the word candidates to promote the candidates that are nouns, and the most likely word becomes "offer" instead of "often". However, because an adjective is also possible between the word "an" and the noun, the device still presents the other choices, such as "often" and "after", for user selection.

FIG. 5 is a flow chart of processing user input according to the invention. At step 501, the system receives handwriting input for a word. Step 503 then generates a list of character candidates, with probabilities of matching, for each character in the handwriting of the word. Step 505 determines a list of word candidates from the lists of character candidates. Step 507 combines frequency indicators of the word candidates with the probabilities of matching of the character candidates to determine probabilities of matching for the word candidates.
Step 509 eliminates a portion of the word candidates based on their probabilities of matching. Step 511 presents one or more word candidates for user selection. Although FIG. 5 is a flow chart for processing handwriting input, it is understood from this description that speech input can be processed in a similar way, with a speech recognition module generating phoneme candidates for each phoneme in the word.

Speech recognition technology for text and command input faces even worse memory and computer processing problems. In addition, adoption of speech recognition is very low because of the high error rate of current speech recognition systems and the effort required to make corrections. One embodiment of the invention combines a set of candidate phonemes, and the associated probabilities returned by a speech recognition engine, with a back end that uses these outputs together with the known probabilities of the words that can be formed from these phonemes. The system automatically corrects the speech recognition output.

In one embodiment of the invention, candidate words that match the input sequence are presented to the user in a word selection list on the display as each input is received. The word candidates are presented in the order determined by the matching likelihood calculated for each candidate word, so that the words deemed most likely according to the matching metric appear first in the list. Selecting one of the proposed interpretations of the input sequence terminates the input sequence, so that the next input starts a new input sequence.

In another aspect of the invention, only a single word candidate appears on the display, preferably at the insertion point for the text being generated. The word candidate displayed is the one deemed most likely according to the matching metric. By repeatedly activating a specially designated selection input, the user may replace the displayed word with alternative candidates, presented in the order determined by the matching probabilities. An input sequence is also terminated after one or more activations of the designated selection input, which effectively selects exactly one of the proposed interpretations of the sequence for actual output by the system, so that the next input starts a new input sequence.

A hybrid system according to the invention first performs pattern recognition, such as handwriting recognition or speech recognition, at a component level, such as strokes, characters, syllables, or phonemes, to provide results with ambiguities and associated probabilities of matching, and then performs disambiguation operations at an inter-component level, such as words, phrases, word pairs, and word trigrams. The characteristics of the language used by the system to resolve the ambiguity can be any of the following: the frequency of use of words in the language, the frequency of use of words by the individual user, the likely part of speech of the word entered, the morphology of the language, the context in which the word is entered, bi-grams (word pairs) or word trigrams, and any other language or context information that can be used to resolve the ambiguity.

The invention can be used with alphabetical languages, such as English and Spanish, in which the output of the handwriting recognition front end is letters or strokes and their associated probabilities. The handwriting disambiguation operation for an alphabetical language can be performed at the word level, where each word typically includes a plurality of letters. The invention can also be used with ideographic languages, such as Chinese and Japanese, in which the output of the handwriting recognition front end is strokes and their associated probabilities. The handwriting disambiguation operation for an ideographic language can be performed at the radical/component or character level. The disambiguation operation can further be performed at a higher level, such as phrases, bi-grams, and word trigrams.
Furthermore, the grammatical structure of the language can also be used in the disambiguation operation to select the best overall match for the input. The invention can also be used with phonetic or alphabetic representations of ideographic languages, in which case the disambiguation operation can be performed at the syllable, ideographic character, word, and/or phrase level. Similarly, the invention can be used for speech recognition, in which the output of the speech recognition front end comprises phonemes and their associated probabilities of matching. The phoneme candidates can be combined to select the best match for a word, phrase, bi-gram, word trigram, or idiom.

One embodiment of the invention also predicts word completions after the user has entered only a few strokes. For example, after the first few characters of a word have been recognized with high probability, the back end of the system can provide a list of words whose first few characters are the same as the matched characters. The user can select a word from the list to complete the input. Alternatively, an indication next to certain words in the list may signal to the user that completions based on that word can be displayed by means of a designated input applied to the list entry; the word list that then pops up shows only words incorporating that word, and may in turn indicate further completions.
Each of the first few characters may have only one high-probability candidate, which is used to select the list of words for completion. Alternatively, one or more of the first few characters may be ambiguous, so that several high-probability combinations of the first few characters can be used to select the list of words for completion. The list of words for completion can be ranked and displayed according to the likelihood of each being the word the user is trying to enter. For example, the words for completion can be ordered by the frequency with which the words are used, for example in the language, by the user, in the article the user is composing, or in a particular context such as a dialog box, and/or by their frequency of occurrence in phrases, bi-grams, word trigrams, idioms, and so on. When one or more words of a phrase, bi-gram, word trigram, or idiom immediately precede the word being processed, the frequency of occurrence of that phrase, bi-gram, word trigram, or idiom can be further combined with the frequency of the word in determining the ranking of the words for completion. Words that are not in any currently known phrase, bi-gram, word trigram, or idiom are treated as being in an unknown phrase that has a very low frequency of occurrence. Similarly, words that are not in the list of known words are treated as unknown words that have a very low frequency of occurrence. Thus, input for any word, or for the first portion of a word, can be processed to determine the most likely input.

In one embodiment of the invention, the back end continuously obtains the lists of candidates for each character, stroke, or phoneme recognized by the pattern recognition front end, and uses them to update and rank the list of words for completion. As the user provides more input, less likely words for completion are eliminated. The list of words for completion shrinks as the user provides more input, until no ambiguity remains or the user selects a word from the list.
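The ranking of words for completion from a partial, possibly ambiguous input can be sketched as follows. The function name `rank_completions`, the frequency values, and the `limit` parameter are hypothetical and chosen only for illustration; the scoring (word frequency multiplied by the match probabilities of the characters entered so far) is one plausible reading of the description above, not the definitive method.

```python
def rank_completions(prefix_strokes, word_frequency, limit=3):
    """Rank known words whose leading characters match the strokes so far.

    prefix_strokes: one dict per stroke, mapping candidate character to its
    probability of matching. Words are scored by frequency multiplied by the
    match probabilities of their leading characters (an assumed scoring).
    """
    ranked = []
    for word, frequency in word_frequency.items():
        if len(word) < len(prefix_strokes):
            continue  # word is too short to contain the entered prefix
        score = frequency
        for ch, candidates in zip(word, prefix_strokes):
            score *= candidates.get(ch, 0.0)
        if score > 0:
            ranked.append((word, score))
    ranked.sort(key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:limit]]


# Hypothetical frequency counts, for illustration only.
frequencies = {"often": 5, "offer": 3, "after": 4, "otter": 1, "apple": 9}
strokes = [{"o": 0.60, "a": 0.24}, {"t": 0.40, "f": 0.34}]
print(rank_completions(strokes, frequencies))  # ['often', 'offer', 'after']
```

Calling the function again each time a new stroke arrives re-scores the list, so words that no longer match any candidate drop out and the list shrinks as more input is provided.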
Furthermore, before the pattern recognition front end provides a list of candidates for the first input of the next word, the back end determines words for completion from one or more immediately preceding words and the known phrases, bigrams, trigrams, idioms, and so on, so as to determine a list of words for completing a phrase, bigram, trigram, or idiom. Thus, the present invention also predicts the entire next word based on the last word entered by the user.

In one embodiment of the present invention, the back end uses wild cards that represent any stroke, character, syllable, or phoneme with equal probability. The list of words for completion based on a portion of the input can be regarded as an example of using a wild card for one or more strokes, characters, or phonemes that the user is about to enter or that are about to be received from the pattern recognition front end.

In one embodiment of the present invention, the front end may fail to recognize a stroke, character, or phoneme. Rather than stopping the input process to force the user to re-enter the input, the front end can tolerate the result and send a wild card to the back end. At a high level, the back end can resolve the ambiguity without forcing the user to re-enter the input. This greatly improves the user-friendliness of the system.

In one embodiment of the present invention, the back end automatically replaces one or more inputs from the front end with wild cards. For example, when no likely word is found in the known word list, the back end can replace the most ambiguous input with a wild card to expand the set of candidate combinations. For example, a list with a large number of low-probability candidates can be replaced with a wild card.
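The wild card fallback described above can be sketched as follows. This is a hedged illustration only: the names `match_words` and `match_with_wildcard_fallback`, the use of `None` as the wild card, and the rule that the "most ambiguous" input is the position whose best candidate has the lowest probability are assumptions, not details given in the description. Each position is assumed to hold at least one candidate.

```python
WILDCARD = None  # assumed marker: matches any character with equal probability


def match_words(positions, known_words):
    """Return the known words consistent with the per-position candidates.

    positions: list whose items are either WILDCARD or a dict mapping
        candidate character -> probability of matching the input.
    """
    matches = []
    for word in known_words:
        if len(word) != len(positions):
            continue
        if all(pos is WILDCARD or ch in pos for ch, pos in zip(word, positions)):
            matches.append(word)
    return matches


def match_with_wildcard_fallback(positions, known_words):
    """Match normally; if nothing is found, relax the most ambiguous position."""
    found = match_words(positions, known_words)
    if found:
        return found, None
    # Assumption: the most ambiguous input is the one whose best candidate
    # has the lowest probability.
    idx = min(range(len(positions)), key=lambda i: max(positions[i].values()))
    relaxed = list(positions)
    relaxed[idx] = WILDCARD
    return match_words(relaxed, known_words), idx


words = ["cat", "cot", "dog", "cab"]
unclear = [{"c": 0.9}, {"x": 0.3, "z": 0.25}, {"t": 0.8}]
print(match_with_wildcard_fallback(unclear, words))  # (['cat', 'cot'], 1)
```

Here neither "cxt" nor "czt" is a known word, so the poorly recognized middle input is replaced by a wild card and the user is not asked to re-enter it.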
In one embodiment, the front end provides a list of candidates such that the likelihood that the input matches one of the candidates in the list is above a threshold. Thus, an ambiguous input has a large number of low-probability candidates. In other embodiments, the front end provides a list such that each candidate's likelihood of matching the input is above a threshold. Thus, an ambiguous input has a low likelihood of being any one of the candidates. Accordingly, the system implements wild cards, e.g., strokes that can match any letter, giving all letters the same probability, so as to handle cases where no likely word is found when wild cards are not used.

In one embodiment of the present invention, the back end constructs different candidate words from the combinations of candidate strokes, characters, or phonemes provided by the pattern recognition front end. For example, the candidate characters for each character input can be ordered by their likelihood of matching the input. Construction of candidate words starts from the characters most likely to match and extends to those less likely to match. When a number of candidate words have been found in the known word list, the characters less likely to match may not be used to construct further candidate words.

In one embodiment, the system displays the most likely word, or a list of all candidate words ordered by the computed likelihood. The system can automatically add output to assist the user. This includes, for example, automatically accenting characters, automatically capitalizing, and automatically adding punctuation and delimiters.

One aspect of the present invention provides a linguistic back end that is used simultaneously for multiple input modalities, such as speech recognition, handwriting recognition, and keyboard input on a hard keyboard or touch screen. In another embodiment of the present invention, a linguistic back end is used to disambiguate the candidate words.
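One way to construct candidate words starting from the most likely characters and extending to less likely ones is a best-first search over the candidate combinations. The sketch below is an assumption-laden illustration rather than the disclosed method: the name `best_first_words`, the `limit` parameter, and the use of the product of per-character probabilities as the combination score are all hypothetical choices.

```python
import heapq


def best_first_words(positions, known_words, limit=1):
    """Enumerate character combinations from most to least likely.

    positions: one list per input, each holding (character, probability)
        pairs sorted by descending probability.
    Stops once `limit` known words are found, so less likely characters
    are never expanded further than needed.
    """
    def prob(idx):
        p = 1.0
        for pos, i in zip(positions, idx):
            p *= pos[i][1]
        return p

    start = (0,) * len(positions)          # most likely candidate everywhere
    heap = [(-prob(start), start)]         # max-heap via negated probability
    seen = {start}
    found = []
    while heap and len(found) < limit:
        neg_p, idx = heapq.heappop(heap)
        word = "".join(pos[i][0] for pos, i in zip(positions, idx))
        if word in known_words:
            found.append((word, -neg_p))
        # Extend to the next, less likely candidate at each position in turn.
        for k in range(len(idx)):
            if idx[k] + 1 < len(positions[k]):
                nxt = idx[:k] + (idx[k] + 1,) + idx[k + 1:]
                if nxt not in seen:
                    seen.add(nxt)
                    heapq.heappush(heap, (-prob(nxt), nxt))
    return found


candidates = [
    [("c", 0.6), ("e", 0.4)],
    [("l", 0.5), ("a", 0.35), ("o", 0.15)],
    [("t", 0.9), ("f", 0.1)],
]
known = {"cat", "eat", "cot"}
print([w for w, _ in best_first_words(candidates, known, limit=2)])  # ['cat', 'eat']
```

Because each successor replaces one character with a less likely one, combinations leave the heap in non-increasing order of probability, which matches the description's ordering from most to least likely.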
After a back-end component combines the candidate inputs from the front end to determine candidate words and their likelihoods of matching, a linguistic back end is used to rank the candidate words according to linguistic characteristics. For example, the linguistic back end further combines the frequency of words (e.g., in the language, as used by the user, in a document the user is composing, or in the context in which the input is required) with the candidate words and their likelihoods of matching from the back-end component, in order to disambiguate the candidate words. The linguistic back end can also perform disambiguation based on bigrams, trigrams, phrases, and so on. Furthermore, the linguistic back end can perform disambiguation based on context, grammatical constructions, and so on. Because the task performed by the linguistic back end is the same for the various input methods, such as speech recognition, handwriting recognition, or keyboard input using a hard keyboard or touch screen, the linguistic back end can be shared among multiple input modalities. In one embodiment of the present invention, one linguistic back end serves multiple input modalities simultaneously, so that when a user combines different input modalities to provide an input, only a single linguistic back end is needed to support the mixed input mode.

In another aspect of the present invention, each input from a particular front end is treated as an explicit word component candidate, which is either recorded with a matching probability of 100% or treated as an explicit stroke, character, or syllable that the back end will use to match only those words that contain it in the corresponding position.

The present invention also encompasses a hybrid system that uses the sets of candidates, with their probabilities, from one or more recognition systems, and that resolves the ambiguity in those sets by using certain known characteristics of the language. Resolving the ambiguity in handwriting and speech recognition improves the recognition rate of the system and thereby its user-friendliness.
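A shared linguistic back end for mixed input can be sketched by having every front end deliver the same per-position structure, with an explicit input recorded at a matching probability of 100%. The names `explicit` and `rank_words`, the dictionary layout, and the scoring rule (product of the per-position probabilities multiplied by the word frequency) are assumptions made for this illustration only.

```python
def explicit(char):
    """An explicit input (e.g., a key press) recorded with 100% probability."""
    return {char: 1.0}


def rank_words(positions, word_freq):
    """Rank known words given per-position candidates from any front end.

    positions: list of dicts mapping candidate character -> probability.
    word_freq: known word -> frequency of use.
    Only words containing a candidate in every corresponding position survive.
    """
    scores = {}
    for word, freq in word_freq.items():
        if len(word) != len(positions):
            continue
        p = 1.0
        for ch, cands in zip(word, positions):
            p *= cands.get(ch, 0.0)
        if p > 0.0:
            scores[word] = p * freq
    return sorted(scores, key=lambda w: (-scores[w], w))


word_freq = {"bat": 10, "hat": 30, "hot": 20, "bot": 5}
handwriting = [{"b": 0.5, "h": 0.5}, {"a": 0.6, "o": 0.4}, {"t": 0.9}]
print(rank_words(handwriting, word_freq))  # ['hat', 'hot', 'bat', 'bot']

# Mixed mode: the first letter is typed explicitly, the rest is handwritten.
mixed = [explicit("b"), handwriting[1], handwriting[2]]
print(rank_words(mixed, word_freq))  # ['bat', 'bot']
```

The same `rank_words` function serves both calls, which is the point of sharing one back end: only the source of the per-position candidates changes.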

Although the invention is described herein with reference to the preferred embodiment, one skilled in the art will readily appreciate that other applications may be substituted for those set forth herein without departing from the spirit and scope of the present invention. Accordingly, the invention should only be limited by the claims included below.

[Brief description of the drawings]
Figure 1 illustrates a system for recognizing user input on a data processing system according to the present invention;
Figure 2 is a block diagram of a data processing system for recognizing user input according to the present invention;
Figures 3A and 3B illustrate an example of disambiguating the output of handwriting recognition software according to the present invention;

Figures 4A-4C illustrate a handwriting recognition scenario on a user interface according to the present invention; and
Figure 5A is a flow diagram of processing user input according to the present invention.

[Brief description of reference numerals]
101 language input
103 pattern recognition engine
105, 111 input
107, 113 disambiguation engine
109 word list
115 phrase list
117 match
119 analysis
121 user selection
201 processor

202 handwriting input device
203 display
204 voice input device
205 sound output device
210 memory
211 operating system
212 stroke/character recognition engine
213 phoneme recognition engine
214 word list
215 phrase list
216 word-based disambiguation engine
217 phrase-based disambiguation engine
218 context-based disambiguation engine
219 selection module
220 application
401 device
403, 405, 409 area
407 handwriting input
501, 503, 505, 507, 509, 511 step

Claims

Scope of patent application:
1. A method for recognizing language input in a data processing system, the method comprising at least the following steps:

processing a user input of a word of a language through pattern recognition to generate a plurality of recognition results for a plurality of word components respectively, at least one of the plurality of recognition results including a plurality of candidate word components and a plurality of likelihood indicators, the plurality of likelihood indicators indicating the degrees of likelihood, relative to each other, that the plurality of candidate word components match a portion of the user input; and
determining one or more candidate words for the user input of the word from the plurality of recognition results and from data indicating the likelihood of use of a word list.

2. The method according to item 1 of the scope of patent application, wherein the pattern recognition includes handwriting recognition.

3. The method according to item 2 of the scope of patent application, wherein each of the plurality of candidate word components includes a stroke; and the word includes an ideographic language symbol.

4. The method according to item 2 of the scope of patent application, wherein each of the plurality of candidate word components includes a character; and the word includes an alphabetic word.

5. The method according to item 1 of the scope of patent application, wherein the pattern recognition includes speech recognition; and each of the plurality of candidate word components includes a phoneme.

6. The method according to item 1 of the scope of patent application, wherein one of the plurality of recognition results for a word component includes an indication that any one of a set of candidate word components has an equal likelihood of matching a portion of the user input of the word; and the set of candidate word components includes the alphabetic characters of the language.

7. The method according to item 1 of the scope of patent application, wherein the data indicating the likelihood of use of the word list includes at least one of the following: the frequency of word usage in the language; the frequency of word usage by the user; and the frequency of word usage in a document.

8. The method according to item 1 of the scope of patent application, wherein the data indicating the likelihood of use of the word list includes at least one of the following: phrases in the language; word pairs in the language; and word trigrams in the language.

9. The method according to item 1 of the scope of patent application, wherein the data indicating the likelihood of use of the word list includes at least one of the following: data representing the morphology of the language; and data representing the grammatical rules of the language.

10. The method according to item 1 of the scope of patent application, wherein the data indicating the likelihood of use of the word list includes at least: data representing a context in which the user input of the word is received.

11. The method according to item 1 of the scope of patent application, wherein the user input specifies only a portion of a complete set of word components of the word.

12. The method according to item 1 of the scope of patent application, wherein the one or more candidate words include a portion of words formed from combinations of candidate word components in the plurality of recognition results, and a portion of words containing combinations of candidate word components in the plurality of recognition results.

13. The method according to item 1 of the scope of patent application, wherein the one or more candidate words include a plurality of candidate words; and the method further comprises the following steps: presenting the plurality of candidate words for selection; and receiving a user input to select one of the plurality of candidate words.

14. The method according to item 13 of the scope of patent application, further comprising the following step: predicting one or more candidate words, based on the selected one, in anticipation of a user input of a next word.

15. The method according to item 13 of the scope of patent application, wherein the plurality of candidate words are presented in order of the likelihood of matching the user input of the word.

16. The method according to item 1 of the scope of patent application, further comprising the following steps: automatically selecting a most likely one from the one or more candidate words as the recognized word for the user input of the word; and predicting one or more candidate words, based on the most likely one, in anticipation of a user input of a next word.

17. The method according to item 1 of the scope of patent application, further comprising the following steps: automatically accenting one or more characters; automatically capitalizing one or more characters; automatically adding one or more punctuation marks; and automatically adding one or more delimiters.

18. The method according to item 1 of the scope of patent application, wherein each of the plurality of recognition results includes a plurality of likelihood indicators respectively associated with a plurality of candidate word components, to indicate the relative likelihoods of matching a portion of the user input.

19. A machine-readable medium containing instruction data which, when executed on a data processing system, causes the system to perform a method for recognizing language input, the method comprising at least the following steps:

processing a user input of a word of a language by performing pattern recognition to generate a plurality of recognition results for a plurality of word components respectively, at least one of the plurality of recognition results including a plurality of candidate word components and a plurality of likelihood indicators, the plurality of likelihood indicators indicating the degrees of likelihood, relative to each other, that the plurality of candidate word components match a portion of the user input; and
determining one or more candidate words for the user input of the word from the plurality of recognition results and from data indicating the likelihood of use of a word list.

20. The medium according to item 19 of the scope of patent application, wherein the one or more candidate words include a plurality of candidate words; and the method further comprises the following steps: presenting the plurality of candidate words for selection; receiving a user input to select one of the plurality of candidate words; and predicting one or more candidate words, based on the selected one, in anticipation of a user input of a next word.

21. The medium according to item 19 of the scope of patent application, wherein the method further comprises the following steps: automatically selecting a most likely one from the one or more candidate words as the recognized word for the user input of the word; and predicting one or more candidate words, based on the most likely one, in anticipation of a user input of a next word.

22. A data processing system for recognizing language input, the system comprising at least:
processing means for processing a user input of a word of a language through pattern recognition to generate a plurality of recognition results for a plurality of word components respectively, at least one of the plurality of recognition results including a plurality of candidate word components and a plurality of likelihood indicators, the plurality of likelihood indicators indicating the degrees of likelihood, relative to each other, that the plurality of candidate word components match a portion of the user input; and
determining means for determining one or more candidate words for the user input of the word from the plurality of recognition results and from data indicating the likelihood of use of a word list.

23. The data processing system according to item 22 of the scope of patent application, wherein the one or more candidate words include a plurality of candidate words; and the system further comprises: presenting means for presenting the plurality of candidate words for selection; and receiving means for receiving a user input to select one of the plurality of candidate words; wherein the plurality of candidate words are presented in order of the likelihood of matching the user input of the word.

24. The data processing system according to item 22 of the scope of patent application, wherein each of the plurality of recognition results includes a plurality of likelihood indicators respectively associated with a plurality of candidate word components, to indicate the relative likelihoods of matching a portion of the user input.

25. The data processing system according to item 22 of the scope of patent application, further comprising means for any of the following: automatically accenting one or more characters; automatically capitalizing one or more characters; automatically adding one or more punctuation marks; and automatically adding one or more delimiters.

26. The data processing system according to item 22 of the scope of patent application, wherein selection of one of the plurality of candidate words causes the pattern recognition to adjust subsequent likelihood indicators for one or more word components of the selected candidate word.

27. A method for processing language input in a data processing system, the method comprising at least the following steps:
receiving a plurality of recognition results for a plurality of word components respectively, for processing a user input of a word of a language, at least one of the plurality of recognition results including a plurality of candidate word components and a plurality of likelihood indicators, the plurality of likelihood indicators indicating the degrees of likelihood, relative to each other, that the plurality of candidate word components match a portion of the user input; and
determining one or more candidate words for the user input of the word from the plurality of recognition results and from data indicating the likelihood of use of a word list.

28. The method according to item 27 of the scope of patent application, wherein the candidate word component includes at least any of the following: a stroke derived from handwriting recognition, speech recognition, or keyboard input; a character derived from handwriting recognition, speech recognition, or keyboard input; a phoneme derived from handwriting recognition, speech recognition, or keyboard input; and a syllable or other phonetic representation derived from handwriting recognition, speech recognition, or keyboard input.

29. The method according to item 27 of the scope of patent application, wherein the language is either alphabetic or ideographic.

30. The method according to item 27 of the scope of patent application, wherein the step of determining one or more candidate words further comprises the following step: eliminating a plurality of combinations of candidate word components of the plurality of recognition results.

31. The method according to item 30 of the scope of patent application, wherein the step of determining one or more candidate words further comprises the following step: selecting a plurality of candidate words from a word list of the language, each of the plurality of candidate words containing a combination of candidate word components in the plurality of recognition results.

32. The method according to item 31 of the scope of patent application, further comprising the following step: determining, from the plurality of recognition results and from data indicating the likelihood of use of a word list, one or more likelihood indicators for the one or more candidate words, to indicate the likelihood of matching the user input of the word.

33. The method according to item 32 of the scope of patent application, further comprising the following step: sorting the one or more candidate words according to the one or more likelihood indicators.

34. The method according to item 33 of the scope of patent application, further comprising the following step: automatically selecting a word from the one or more candidate words.

35. The method according to item 34 of the scope of patent application, wherein the step of automatically selecting is performed according to any of the following: phrases in the language; word pairs in the language; and word trigrams in the language.

36. The method according to item 34 of the scope of patent application, wherein the step of automatically selecting is performed according to any of the following: the morphology of the language; and the grammatical rules of the language.

37. The method according to item 34 of the scope of patent application, wherein the step of automatically selecting is performed according to the context in which the user input of the word is received.

38. The method according to item 34 of the scope of patent application, further comprising the following step: predicting a plurality of candidate words, based on the automatically selected word, in anticipation of a user input of a next word.

39. The method according to item 33 of the scope of patent application, further comprising the following steps: presenting the one or more candidate words for user selection; and receiving a user input to select a word from the plurality of candidate words.

40. The method according to item 39 of the scope of patent application, wherein the plurality of candidate words are presented in an order according to the one or more likelihood indicators.

41. The method according to item 39 of the scope of patent application, further comprising the following step: predicting a plurality of candidate words in anticipation of a user input of a next word.

42. The method according to item 27 of the scope of patent application, wherein one of the plurality of recognition results for a word component includes an indication that any one of a set of candidate word components has an equal likelihood of matching a portion of the user input of the word.

43. The method according to item 27 of the scope of patent application, wherein the data indicating the likelihood of use of the word list includes any of the following: the frequency of word usage in the language; the frequency of word usage by the user; and the frequency of word usage in a document.

44. The method according to item 27 of the scope of patent application, further comprising any of the following steps: automatically accenting one or more characters; automatically capitalizing one or more characters; automatically adding one or more punctuation marks; and automatically adding one or more delimiters.

45. A machine-readable medium containing instruction data which, when executed on a data processing system, causes the system to perform a method for recognizing language input, the method comprising at least the following steps:
receiving a plurality of recognition results for a plurality of word components respectively, for processing a user input of a word of a language, at least one of the plurality of recognition results including a plurality of candidate word components and a plurality of likelihood indicators, the plurality of likelihood indicators indicating the degrees of likelihood, relative to each other, that the plurality of candidate word components match a portion of the user input; and
determining one or more candidate words for the user input of the word from the plurality of recognition results and from data indicating the likelihood of use of a word list.

46. The medium according to item 45 of the scope of patent application, wherein the step of determining one or more candidate words includes the following steps: eliminating a plurality of combinations of candidate word components of the plurality of recognition results; and selecting a plurality of candidate words from a word list of the language, the plurality of candidate words containing combinations of candidate word components in the plurality of recognition results.

47. The medium according to item 46 of the scope of patent application, wherein the method further comprises the following steps: determining, from the plurality of recognition results and from data indicating the likelihood of use of a word list, one or more likelihood indicators for the one or more candidate words, to indicate the likelihood of matching the user input of the word; sorting the one or more candidate words according to the one or more likelihood indicators; automatically selecting one from the one or more candidate words; and