TW200825783A

TW200825783A - System and method for setting up common use glossary dictionary according to the behavior of input data from user

Info

Publication number: TW200825783A
Application number: TW95147263A
Authority: TW
Inventors: Chau-Cer Chiu; Hsiao-Min Han
Original assignee: Inventec Corp
Priority date: 2006-12-15
Filing date: 2006-12-15
Publication date: 2008-06-16

Abstract

A system and method for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits. The user's input is captured and analyzed, and the relevant databases are updated by classification category and priority; the most commonly used and most directly associated vocabulary is then presented for the user to select.

Description

IX. Description of the Invention

[Technical Field of the Invention]

The present invention relates to a system and method for building a dictionary of a user's commonly used vocabulary, and in particular to a system and method that build such a dictionary according to the user's data-input habits.

[Prior Art]

In the past, intelligent association for user input was limited to the association of simple vocabulary within an input method. The association prompts in most search engines likewise simply list every input that has appeared before. This makes the interface somewhat friendlier for the user, but it still falls short of an ideal, intelligent experience. The application interfaces into which users routinely type have not yet been adapted intelligently, even though the input these applications receive is a reliable source for analyzing the user's habits.
In addition, although methods have been devised for systematically organizing the collected data on users' input habits and for intelligently judging the priority order of the feedback, the feedback drawn from an ever-growing database is still not well targeted. It simply offers a huge set of associated vocabulary and cannot provide intelligent associations that closely match what the user needs.

Making such software more intelligent, practical, and precise remains the direction of effort. A system and method for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits has therefore become a topic of concern.

[Summary of the Invention]

The present invention provides a system for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits, comprising: a user data-input interface, providing an edit window or an input field in which the user enters data; an input-habit capture module, which extracts the corresponding words or phrases from the data entered in the edit window or input field according to a plurality of capture conditions; a semantic rule base, which establishes a semantic rule according to a semantic structure, a semantic syntax, a semantic category, and the capture conditions; an inference engine, which performs inference on the entered data according to the semantic rule and outputs a dictionary inference result; a user input-habit dictionary, which stores the dictionary inference results in the corresponding database of each semantic category according to the classification of the semantic categories; and an intelligent priority extraction module, which, according to the initial word of the user's input, extracts from the user input-habit dictionary a prioritized list related to that initial word for the user to select.
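The component chain described above (semantic rule base, inference engine, per-category habit dictionary) can be sketched roughly as follows. This is only an illustration under stated assumptions: the patent does not specify how a semantic rule is represented, so the `SemanticRule` predicates, the keyword tests, and all class and function names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable

# Category names mirror the databases named in the text. Everything else in
# this sketch (class names, rule predicates, sample phrases) is hypothetical.
CATEGORIES = ("common_sentence", "interest", "business", "new_word", "user_defined")


@dataclass
class SemanticRule:
    category: str                    # semantic category the rule assigns
    matches: Callable[[str], bool]   # hypothetical test on a captured phrase


class InferenceEngine:
    """Applies the rules in order and returns a 'dictionary inference result'."""

    def __init__(self, rules):
        self.rules = list(rules)

    def infer(self, phrase, default="common_sentence"):
        for rule in self.rules:
            if rule.matches(phrase):
                return rule.category, phrase
        return default, phrase


class HabitDictionary:
    """One frequency table per semantic category."""

    def __init__(self):
        self.databases = {c: defaultdict(int) for c in CATEGORIES}

    def store(self, category, phrase):
        self.databases[category][phrase] += 1


# Invented keyword rules, standing in for real semantic analysis.
rules = [
    SemanticRule("business", lambda p: "invoice" in p.lower()),
    SemanticRule("interest", lambda p: "party" in p.lower()),
]
engine = InferenceEngine(rules)
habits = HabitDictionary()
for phrase in ["Quarterly invoice", "birthday party", "see you tomorrow"]:
    habits.store(*engine.infer(phrase))
```

Keeping one frequency table per category is one plausible reading of "the corresponding database of each semantic category"; a real system would presumably persist these tables rather than hold them in memory.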
The present invention further provides a method for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits, comprising the following steps: providing an edit window or an input field in which the user enters data; extracting the corresponding words or phrases from the data entered in the edit window or input field according to a plurality of capture conditions; establishing a semantic rule according to a semantic structure, a semantic syntax, a semantic category, and the capture conditions; performing inference on the entered data according to the semantic rule and outputting a dictionary inference result; storing the dictionary inference results in the corresponding database of each semantic category according to the classification of the semantic categories; and, according to the initial word of the user's input, extracting from the corresponding databases of the semantic categories a prioritized list related to that initial word for the user to select.

Under the capture conditions, a word or phrase that has appeared or been looked up before is marked as "once", and a word or phrase that appears or is looked up frequently is marked as "multiple times". The user input-habit dictionary further comprises a common-sentence dictionary database, an interest-and-preference dictionary database, a business-communication dictionary database, a new-word dictionary database, and a user-defined database; the user-defined database is established according to a custom classification of the semantic categories.

The above description of the summary of the invention and the following description of the embodiments serve to demonstrate and explain the principles of the invention and to provide further explanation of the scope of the patent claims.
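The "once" versus "multiple times" capture condition can be illustrated with a small counting sketch. The patent does not say how many occurrences count as "frequent", so the `frequent_threshold` parameter and the function name `capture_marks` are assumptions made here for illustration.

```python
from collections import Counter


def capture_marks(observed, frequent_threshold=3):
    """Mark each observed word or phrase as 'once' or 'multiple'.

    `frequent_threshold` is an assumed cut-off: the source only distinguishes
    terms that have appeared before from terms that appear frequently.
    """
    counts = Counter(observed)
    return {
        term: ("multiple" if n >= frequent_threshold else "once")
        for term, n in counts.items()
    }


marks = capture_marks(["traffic", "trace", "traffic", "traffic", "Andy"])
print(marks)
```

With the assumed threshold of 3, "traffic" is marked "multiple" and the other two terms are marked "once".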
[Embodiments]

The present invention classifies and analyzes the information a user enters over a long period. Rather than building a database that merely memorizes input, it continually captures the user's input habits and, by establishing a set of conditions for capturing input content and for eliminating records, keeps the database content up to date and ranks it by priority. When the user later enters the same word again, the system offers the most commonly used and most direct intelligent associations.

From the data the user routinely enters, the invention takes the content that meets the capture conditions, analyzes and decomposes it with the system's built-in semantic rule base, and classifies it by type of language content, thereby building several input-habit dictionaries that match the habits of different users. When the user enters text again, the input is analyzed again with the semantic rules to determine what the user needs, and the words or phrases in the corresponding habit dictionary are ranked by frequency of use to produce intelligent associations that fit the user's needs. This set of user habit dictionaries also serves as the principal reference for analyzing the user's personal language behavior and preferences. By applying these dictionaries, the invention can provide more targeted and more accurate intelligent associations during everyday input. It also provides a more complete and accurate data source for a user's intelligent avatar, that is, for an artificial-intelligence emulation of the user's own language habits.

FIG. 1 is a block diagram of the system provided by the invention for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits. The system comprises: a user data-input interface 110, providing an edit window or an input field in which the user enters data, where the edit window or input field means any input box that can appear in a computer's user interface or ordinary word-processing software; an input-habit capture module 120, which extracts the corresponding words or phrases from the data entered in the edit window or input field according to a plurality of capture conditions, where a word or phrase that has appeared or been looked up before is marked as "once" and one that appears or is looked up frequently is marked as "multiple times", and these marks are recorded as the basis for judging the user's data-input habits; a semantic rule base 130, which establishes a semantic rule according to a semantic structure, a semantic syntax, a semantic category, and the capture conditions, where the semantic rule may be any one chosen from the combinations of these four elements and serves as the condition for the inference described next; an inference engine 140, which performs inference on the entered data according to the semantic rule and outputs a dictionary inference result; a user input-habit dictionary 150, which stores the dictionary inference results in the corresponding database of each semantic category according to the classification of the semantic categories, and which further comprises a common-sentence dictionary database 151, an interest-and-preference dictionary database 152, a business-communication dictionary database 153, a new-word dictionary database 154, and a user-defined database 155; and an intelligent priority extraction module 160, which, according to the initial word of the user's input, extracts from the user input-habit dictionary a prioritized list related to that initial word for the user to select.

FIG. 2 is a flowchart of the method provided by the invention for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits.
The flowchart includes the following steps: providing an edit window or an input field in which the user enters data (step 210); extracting the corresponding words or phrases from the data entered in the edit window or input field according to a plurality of capture conditions (step 220); establishing a semantic rule according to a semantic structure, a semantic syntax, a semantic category, and the capture conditions (step 230); performing inference on the entered data according to the semantic rule and outputting a dictionary inference result (step 240); storing the dictionary inference results in the corresponding database of each semantic category according to the classification of the semantic categories (step 250); and, according to the initial word of the user's input, extracting from the corresponding databases of the semantic categories a prioritized list related to that initial word for the user to select (step 260).

Under the capture conditions, a word or phrase that has appeared or been looked up before is marked as "once", and a word or phrase that appears or is looked up frequently is marked as "multiple times". The classification of the semantic categories further includes a common-sentence dictionary, an interest-and-preference dictionary, a business-communication dictionary, a new-word dictionary, and a user-defined category; the user-defined database is established according to a custom classification of the semantic categories.

An example illustrates the feasibility of the invention, in which the technique for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits is implemented in translation software. The collection of the user's input habits has two aspects. The first concerns information that is repeatedly entered, or stored, in the various application modules. The second concerns content judged to be less familiar to the user and in need of a reminder; this generally comes from the words entered in the translation software, for example the content recorded in the smart new-word notebook when the translation software is used.
The services built on these two aspects make repeated entry of known content convenient and provide reminders for unfamiliar content, with the aim of giving the intelligent engine a user-friendly behavior.

FIG. 3A is a schematic diagram of the first embodiment of the invention. When the user types "I want to go to Andy that day" in the input field, a pop-up window appears with the following associated content:

Andy. Wang
Andy's dog
Andy. Wang's birthday party
Day out with Andy

The entries are ordered mainly by how often the content recurs, and each entry comes from a different user input-habit dictionary 150. For example, "Andy. Wang" comes from the friend-list database; "Andy's dog" comes from the most frequently occurring related combination; and "Andy. Wang's birthday party" and "Day out with Andy" both come from the calendar database.

FIG. 3B is a schematic diagram of the second embodiment of the invention. When the user types "I don't want any more tra" in the input field, a pop-up window appears with the following associated content:

trace
traffic
trademark looks like a ship...

The entries are again ordered mainly by how often the content recurs, and each entry comes from a different user input-habit dictionary 150. For example, "trace" comes from the most frequently occurring word combination; "traffic" is a word that was once looked up in the translation software; and "trademark looks like a ship..." comes from content stored in the memo.

Although the invention has been disclosed above by the foregoing preferred embodiments, they are not intended to limit the invention. Changes and modifications made without departing from the spirit and scope of the invention fall within the scope of patent protection of the invention. For the scope of protection defined by the invention, refer to the appended claims.
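The prioritized lookup illustrated by the second embodiment (FIG. 3B) can be sketched as a prefix search over several habit dictionaries, ordered by recurrence count. The counts, the dictionary names, and the choice to treat the last whitespace-separated token as the "initial word" are assumptions made for this sketch; the patent gives no concrete values or tokenization rule.

```python
def suggest(prefix, dictionaries, limit=5):
    """Return (entry, source, count) tuples whose entry starts with `prefix`,
    most frequently recurring first (ties broken alphabetically)."""
    needle = prefix.lower()
    candidates = [
        (entry, source, count)
        for source, entries in dictionaries.items()
        for entry, count in entries.items()
        if entry.lower().startswith(needle)
    ]
    candidates.sort(key=lambda c: (-c[2], c[0]))
    return candidates[:limit]


# Hypothetical habit dictionaries with invented recurrence counts.
dictionaries = {
    "word_combinations": {"trace": 9, "Andy's dog": 4},
    "translation_lookups": {"traffic": 6},
    "memo": {"trademark looks like a ship...": 2},
}

typed = "I don't want any more tra"
prefix = typed.split()[-1]  # assumption: the fragment being typed is the last token
for entry, source, count in suggest(prefix, dictionaries):
    print(f"{entry}  (from {source}, seen {count} times)")
```

Each suggestion carries its source dictionary, which mirrors the text's point that entries in one pop-up list can originate from different databases.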

[Brief Description of the Drawings]

FIG. 1 is a block diagram of the system provided by the invention for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits;
FIG. 2 is a flowchart of the method provided by the invention for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits;
FIG. 3A is a schematic diagram of the first embodiment of the invention; and
FIG. 3B is a schematic diagram of the second embodiment of the invention.

[Description of Reference Numerals]

110: user data-input interface
120: input-habit capture module
130: semantic rule base
140: inference engine
150: user input-habit dictionary
151: common-sentence dictionary database
152: interest-and-preference dictionary database
153: business-communication dictionary database
154: new-word dictionary database
155: user-defined database
160: intelligent priority extraction module
Step 210: providing an edit window or an input field in which the user enters data
Step 220: extracting the corresponding words or phrases from the data entered in the edit window or input field according to a plurality of capture conditions
Step 230: establishing a semantic rule according to a semantic structure, a semantic syntax, a semantic category, and the capture conditions
Step 240: performing inference on the entered data according to the semantic rule and outputting a dictionary inference result
Step 250: storing the dictionary inference results in the corresponding database of each semantic category according to the classification of the semantic categories
Step 260: according to the initial word of the user's input, extracting from the corresponding databases of the semantic categories a prioritized list related to that initial word for the user to select

Claims

X. Scope of Patent Application:

1. A system for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits, comprising: a user data-input interface, providing an edit window or an input field in which the user enters data; an input-habit capture module, which extracts the corresponding words or phrases from the data entered in the edit window or the input field according to a plurality of capture conditions; a semantic rule base, which establishes a semantic rule according to a semantic structure, a semantic syntax, a semantic category, and the capture conditions; an inference engine, which performs inference on the entered data according to the semantic rule and outputs a dictionary inference result; a user input-habit dictionary, which stores the dictionary inference results in the corresponding database of each semantic category according to the classification of the semantic categories; and an intelligent priority extraction module, which, according to the initial word of the user's input, extracts from the user input-habit dictionary a prioritized list related to the initial word for the user to select.

2. The system for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits as described in claim 1, wherein the capture conditions mark a word or phrase that has appeared or been looked up before as "once".

3. The system for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits as described in claim 1, wherein the capture conditions mark a word or phrase that appears or is looked up frequently as "multiple times".

4. The system for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits as described in claim 1, wherein the user input-habit dictionary further comprises a common-sentence dictionary database, an interest-and-preference dictionary database, a business-communication dictionary database, a new-word dictionary database, and a user-defined database.

5. The system for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits as described in claim 4, wherein the user-defined database is established according to a custom classification of the semantic categories.

6. A method for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits, comprising the following steps: providing an edit window or an input field in which the user enters data; extracting the corresponding words or phrases from the data entered in the edit window or the input field according to a plurality of capture conditions; establishing a semantic rule according to a semantic structure, a semantic syntax, a semantic category, and the capture conditions; performing inference on the entered data according to the semantic rule and outputting a dictionary inference result; storing the dictionary inference results in the corresponding database of each semantic category according to the classification of the semantic categories; and, according to the initial word of the user's input, extracting from the corresponding databases of the semantic categories a prioritized list related to the initial word for the user to select.

7. The method for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits as described in claim 6, wherein the capture conditions mark a word or phrase that has appeared or been looked up before as "once".

8. The method for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits as described in claim 6, wherein the capture conditions mark a word or phrase that appears or is looked up frequently as "multiple times".

9. The method for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits as described in claim 6, wherein the classification of the semantic categories further includes a common-sentence dictionary, an interest-and-preference dictionary, a business-communication dictionary, a new-word dictionary, and a user-defined category.

10. The method for building a dictionary of a user's commonly used vocabulary according to the user's data-input habits as described in claim 9, wherein the user-defined category is established according to a custom classification of the semantic categories.