TW201220092A - Search system for sieving similar words in accordance with word classification and method thereof - Google Patents


Info

Publication number
TW201220092A
Authority
TW
Taiwan
Prior art keywords
word
result
words
category
target
Prior art date
Application number
TW99138342A
Other languages
Chinese (zh)
Inventor
Chau-Cer Chiu
Mei-Hua Yu
Original Assignee
Inventec Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Inventec Corp filed Critical Inventec Corp
Priority to TW99138342A priority Critical patent/TW201220092A/en
Publication of TW201220092A publication Critical patent/TW201220092A/en

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

A search system for sieving similar words in accordance with word classification, and a method thereof, are provided. The system uses a fuzzy query technique to obtain result words based on query data, filters the result words by the word classification of each result word, and displays the retained result words together with their interpretation data. The system and the method can thereby find the interpretation data of words that are synonymous with the query data, achieving the effect of providing data similar to the query data.

Description

VI. Description of the Invention:

[Technical Field of the Invention]

The invention relates to a search system and a method thereof, and in particular to a search system and a search method that filter similar words according to word category.

[Prior Art]

Users often encounter a word whose meaning they do not understand, and most of them respond by looking the word up in a dictionary. Traditionally this meant leafing through a paper dictionary. Paper dictionaries, however, are bulky, heavy, and inconvenient to update. With the spread of electronic products and the growth of the Internet, users have therefore moved from paper dictionaries to dictionary software, electronic translators, and online dictionary services that can look up the interpretation data of a word.

In current English dictionaries, for example a Chinese-English dictionary, when a user looks up an English word (the query data), the dictionary software or online dictionary service searches its word database for the interpretation data of the query data and displays it. If the word database holds no interpretation data for the query data, some dictionary software and online dictionary services use a spelling-correction function to offer other words close to the query data, so that the user can check for an input error or infer the meaning of the query data from the similar words.

This spelling-correction function, however, works only for English query data. Suppose the query data is not English, for example the Chinese word 「俄文」 (Russian language). If the word database has no interpretation data for 「俄文」 but does contain the synonymous word 「俄語」, current dictionary software, electronic translators, and online dictionary services cannot provide the interpretation data of 「俄語」 to the user.

In summary, the prior art has long had the problem that the interpretation data of a word synonymous with the query data cannot be found from the query data. An improved technical means is therefore needed to solve this problem.

[Summary of the Invention]

In view of the prior-art problem that the interpretation data of a word synonymous with the query data cannot be found from the query data, the invention discloses a search system and a search method that filter similar words according to word category, wherein:

The search system disclosed by the invention comprises at least: a storage module for storing a plurality of target words and the interpretation data corresponding to each target word, each target word belonging to at least one category; an input module for providing for the input of query data; a search module for performing a fuzzy search on the target words according to the query data, thereby finding result words; a word filtering module for filtering the result words according to their categories; and a display module for displaying the retained result words and the interpretation data corresponding to the retained result words.

The search method disclosed by the invention comprises at least the following steps: storing a plurality of target words and the interpretation data corresponding to each target word; classifying the target words into at least one category; providing for the input of query data; performing a fuzzy search on the target words according to the query data, thereby finding result words; filtering the result words according to their categories; and displaying the retained result words and the interpretation data corresponding to the retained result words.

The system and method disclosed by the invention are as described above. They differ from the prior art in that the invention obtains result words by performing a fuzzy search on the target words according to the query data, filters the result words according to the categories to which they belong, and displays the retained result words and their corresponding interpretation data. This solves the problem of the prior art and achieves the technical effect of providing, as far as possible, data close to the query data.

[Embodiment]

The features and implementation of the invention are described in detail below with reference to the drawings and an embodiment. The description is sufficient for any person skilled in the relevant art to readily understand the technical means by which the invention solves the technical problem and to implement them accordingly, thereby realizing the effects the invention can achieve.

The invention performs a fuzzy search according to the query data to find result words similar to the query data, and then filters the result words according to the interpretation data of the result words, thereby selecting interpretation data that may explain the query data. A result word is regarded as similar to the query word when only a few of the smallest units composing the two differ, when the two look alike, or even when the two are pronounced alike. The similarity between result word and query word referred to in the invention is not limited to these cases.

The query data referred to in the invention is a combination of a series of language units such as characters, digits, and symbols, but is not limited thereto. In general the query data is a complete word, although it may also be part of a complete word. The language unit depends on the language of the query data: when the query data is Chinese, the language unit is one Chinese character, and when the query data is English, the language unit is one English letter. The language unit referred to in the invention is not limited to these examples.

The interpretation data referred to in the invention is data that lets the user understand the meaning of a result word, including the pronunciation symbols, part of speech, explanatory text, and example sentences of the result word, but the invention is not limited thereto.

The operation of the system is first described with reference to Figure 1, the architecture diagram of the search system for filtering similar words according to word category. As shown in Figure 1, the system comprises a storage module 110, an input module 120, a search module 130, a word filtering module 150, and a display module 160.

The storage module 110 stores a plurality of target words and the interpretation data corresponding to each target word. A target word stored in the storage module 110 is a combination of a series of language units such as characters, digits, and symbols; in general, a target word is a complete word.

The storage module 110 also stores the categories to which the target words belong. The categories of a target word may be stored together with the target word when it is written into the storage module 110, or the target words may be classified by a classification module 190. The storage module 110 may use a database or a file to store the target words, the corresponding interpretation data, and the categories to which the target words belong, but the invention is not limited thereto.

The input module 120 provides for the input of query data. In general, the user operates an external input device such as a keyboard or a stylus to enter the query data, and the input device generates a corresponding input signal. On receiving the input signal, the input module 120 converts it into the corresponding query data and provides the query data to the subsequent modules.

The search module 130 performs a fuzzy search on the target words stored in the storage module 110 according to the query data provided by the input module 120, thereby finding result words. The search module 130 may find one result word or several. When it finds a result word, it may also retrieve the interpretation data corresponding to the result word and the categories to which the result word belongs.

To perform the fuzzy search, the search module 130 may start from the first language unit or the last language unit of the query data, delete one language unit or several consecutive language units, and search the storage module 110 for target words that contain the string remaining after the deletion. For example, when the query data is the English word "test", the search module 130 may start from the first language unit (the letter "t") and delete one and then two language units to obtain the strings "est" and "st", and start from the last language unit and delete one and then two language units to obtain "tes" and "te". It then searches the storage module 110 for target words containing "est", "st", "tes", or "te". The target words found are the result words. The fuzzy search referred to in the invention is not limited to this approach.

The search module 130 also searches the storage module 110 for a target word identical to the query data provided by the input module 120. A target word found in this way is likewise a result word.

The word filtering module 150 filters the result words found by the search module 130 according to the categories of the result words. It may filter according to the number of result words contained in each category across all result words found. For example, suppose the search module 130 finds six result words, the first through the sixth, where the first result word belongs to the first and second categories, the second and third result words both belong to the third category, the fourth result word belongs to the second and third categories, the fifth result word belongs to the second category, and the sixth result word belongs to the fourth category. The first category then contains one result word, the second category three, the third category three, and the fourth category one. The word filtering module 150 may filter out result words that belong only to categories containing few result words. Here it removes the sixth result word, which belongs only to the fourth category containing a single result word, and retains the first through fifth result words. Although the first category, to which the first result word belongs, also contains only one result word, the first result word also belongs to the second category, which contains three result words, so the first result word is retained. The way the word filtering module 150 filters result words is not limited to this example.

The word filtering module 150 may further filter the result words found by the search module 130 according to the interpretation data of the result words. For example, it may retain only the result words whose interpretation data contain identical or similar passages and delete the result words whose interpretation data contain no identical or similar passages. Methods of computing passage similarity are well known and are not described further.

The display module 160 displays the result words retained by the word filtering module 150 and the interpretation data corresponding to those result words.

The invention may further comprise a classification module 190, which classifies each target word into the corresponding categories according to specific keywords contained in the interpretation data of the target word. For example, target words whose interpretation data contain keywords such as 「電腦」 (computer) or 「網路」 (network) are classified into the "information" category. Words that begin with the name or abbreviation of a country and end with a character denoting a language (as in 「俄語」 or 「中文」) are classified into two categories, "language" and the name of that country. A further rule of the same form assigns other words that begin with the name or abbreviation of a country to another category and to the name of that country. Target words whose interpretation data contain emotion-related words are classified into the "emotion" category. The way the classification module 190 classifies target words is not limited to these examples; the classification module 190 may also classify target words according to their part of speech.

In addition, the input module 120 may provide for the input of the categories to which a target word belongs. That is, the categories of a target word may be entered through the input module 120 and stored by the storage module 110, so that the classification module 190 can classify the target word by the categories stored in the storage module 110.

The operation of the system and method is now explained with an embodiment, with reference to Figure 2, the flowchart of the search method for filtering similar words according to word category.

Whether the invention runs in dictionary software installed on a computer, in an electronic translator, or on a network server that provides a dictionary lookup service, the storage module 110 must first store the target words and the interpretation data corresponding to the target words (step 201), for example as a file or database containing the target words and interpretation data. The classification module 190 may then classify each target word according to specific keywords contained in its interpretation data in the storage module 110 or according to the part of speech of the target word, or the input module 120 may provide for the input of the categories of each target word, so that the target words are classified into the correct categories (step 210). The storage module 110 stores the categories to which the target words belong.

After the storage module 110 has stored the target words, the corresponding interpretation data, and the categories of the target words, the dictionary software, electronic translator, or network server incorporating the invention may present the user interface 300 shown in Figure 3, so that the user can enter query data in the input area 310 through the input module 120 and obtain the interpretation data corresponding to the query data. In this embodiment the query data that the user enters in the input area 310 is assumed to be 「俄文」, but the invention is not limited thereto.

After the input module 120 provides the query data (step 220), the search module 130 performs a fuzzy search on all target words in the storage module 110 according to the query data, thereby finding the result words corresponding to the query data together with their interpretation data and categories (step 230). In this embodiment the search module 130 is assumed to delete language units one at a time from the beginning and from the end of the query data until no further language unit can be deleted. Because the query data is 「俄文」, the search module 130 searches with two strings, 「俄」 and 「文」, and finds the target words that contain 「俄」 or 「文」 together with their interpretation data and categories. The target words found are, for example, 「俄頃」, 「俄語」, 「俄羅斯」, and 「中文」. These are the "result words" referred to in the invention.

In practice, before performing the fuzzy search (step 230), the search module 130 may, as in conventional data lookup, perform an exact search on all target words in the storage module 110 according to the query data to find a result word (step 240). A result word found in this way is identical to the query data, so the exact search is not described further.

If the exact search finds a result word and its interpretation data, the search module 130 may skip the fuzzy search, and the display module 160 may display the interpretation data of the result word directly. Alternatively, the search module 130 may proceed with the fuzzy search (step 230) regardless of whether the exact search found a result word and its interpretation data. The invention is not limited in this respect.

After the search module 130 has found the result words by fuzzy search (step 230), the word filtering module 150 filters the result words according to the categories to which they belong (step 250). In this embodiment, assume that the result word 「俄頃」 belongs to the category "time adverb"; 「俄語」 belongs to "noun", "language", and "Russia"; 「俄羅斯」 belongs to "noun", "country", and "Russia"; and 「中文」 belongs to "noun", "language", and "China". The word filtering module 150 then counts that "time adverb" contains only 「俄頃」; "noun" contains 「俄語」, 「俄羅斯」, and 「中文」; "language" contains 「俄語」 and 「中文」; "country" contains only 「俄羅斯」; "Russia" contains 「俄語」 and 「俄羅斯」; and "China" contains only 「中文」. The word filtering module 150 may delete the result words that belong to categories containing only one result word, or it may retain the result words that belong to categories containing two result words. In this embodiment, whether the module deletes result words in categories with too few result words or retains result words in categories with a given number of result words, 「俄頃」 is filtered out, while 「俄語」, 「俄羅斯」, and 「中文」 are retained.

After the word filtering module 150 has filtered the result words by category (step 250), the display module 160 displays the interpretation data of the retained result words in the interpretation display area 330 of the user interface 300 (step 270). Thus, even when the interpretation data corresponding to the query data entered by the user cannot be found directly, the invention can still obtain the interpretation data of synonymous words from the query data.

In the above embodiment, the word filtering module 150 may further filter the result words according to whether the interpretation data of different result words contain identical or similar passages. That is, the word filtering module 150 may retain the result words whose interpretation data contain identical or similar passages (step 260) and delete the result words whose interpretation data contain none. In this embodiment the interpretation data of 「俄語」 is "Russian (language)", that of 「俄羅斯」 is "Russia", and that of 「中文」 is "Chinese language; Chinese". If the word filtering module 150 determines only that the interpretation data of 「俄語」 is similar to that of 「俄羅斯」, it retains 「俄語」 and 「俄羅斯」 and deletes 「中文」, and the display module 160 displays the interpretation data of 「俄語」 and 「俄羅斯」 in the interpretation display area 330 (step 270). If the word filtering module 150 determines that the interpretation data of 「俄語」 is similar both to that of 「俄羅斯」 and to that of 「中文」, it may retain only the interpretation data of 「俄語」, and the display module 160 displays the interpretation data of 「俄語」 in the interpretation display area 330 (step 270).

In summary, the invention differs from the prior art in its technical means: after obtaining result words by performing a fuzzy search on the target words according to the query data, it filters the result words according to the categories to which they belong and displays the retained result words and their corresponding interpretation data. This technical means solves the prior-art problem that the interpretation data of a word synonymous with the query data cannot be found from the query data, and thereby achieves the technical effect of providing, as far as possible, data close to the query data.
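The deletion-based fuzzy search (step 230) and the category-count filter (step 250) of the embodiment can be illustrated with a minimal Python sketch. The function names `fuzzy_keys`, `fuzzy_search`, and `filter_by_category`, the `max_deleted` and `min_count` parameters, and the four-entry in-memory table are assumptions made for illustration; the specification does not prescribe a data structure, and a real word database would hold far more than the four words named in the embodiment.

```python
from collections import Counter

# Result-word categories as given in the embodiment (category names translated).
CATEGORIES = {
    "俄頃": {"time adverb"},
    "俄語": {"noun", "language", "Russia"},
    "俄羅斯": {"noun", "country", "Russia"},
    "中文": {"noun", "language", "China"},
}


def fuzzy_keys(query, max_deleted=None):
    """Strings left after deleting 1..k language units from the start or the end."""
    limit = len(query) - 1 if max_deleted is None else min(max_deleted, len(query) - 1)
    keys = []
    for k in range(1, limit + 1):
        for key in (query[k:], query[:-k]):  # delete from the start, then from the end
            if key not in keys:
                keys.append(key)
    return keys


def fuzzy_search(query, target_words, max_deleted=None):
    """Target words that contain at least one of the shortened query strings."""
    keys = fuzzy_keys(query, max_deleted)
    return [w for w in target_words if any(k in w for k in keys)]


def filter_by_category(results, categories, min_count=2):
    """Keep a result word if any of its categories holds at least min_count result words."""
    counts = Counter(c for w in results for c in categories[w])
    return [w for w in results if any(counts[c] >= min_count for c in categories[w])]


results = fuzzy_search("俄文", CATEGORIES)
retained = filter_by_category(results, CATEGORIES)
```

On the embodiment's data, the search returns all four words and the filter drops only 「俄頃」, whose single category "time adverb" contains no other result word. The threshold of two is one reading of the "too few" wording in the text, not a value the specification fixes.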

Furthermore, the search method for filtering similar words according to word category may be implemented in hardware, software, or a combination of hardware and software. It may be implemented in a centralized manner in one computer system or in a distributed manner in which different components are spread across several interconnected computer systems.

Although the embodiments of the invention are disclosed above, the content described is not intended to directly limit the scope of patent protection of the invention. Any modification or refinement in form or detail made by a person of ordinary skill in the art to which the invention belongs, without departing from the spirit and scope disclosed by the invention, falls within the scope of patent protection of the invention. The scope of patent protection of the invention remains as defined by the appended claims.

[Brief Description of the Drawings]

Figure 1 is an architecture diagram of the search system for filtering similar words according to word category proposed by the invention.

Figure 2 is a flowchart of the search method for filtering similar words according to word category proposed by the invention.

Figure 3 is a schematic diagram of the user interface according to an embodiment of the invention.

[Description of Main Component Symbols]

110 Storage module
120 Input module
130 Search module
150 Word filtering module
160 Display module
190 Classification module
300 User interface
310 Input area
330 Interpretation display area
Step 201 Store target words and the interpretation data corresponding to the target words
Step 210 Classify the target words
Step 220 Provide for the input of query data

Step 230 Perform a fuzzy search on the target words according to the query data, thereby finding result words
Step 240 Search the target words for a result word according to the query data
Step 250 Filter the result words according to the categories of the result words
Step 260 Retain the result words whose interpretation data contain identical or similar passages
Step 270 Display the retained result words and the interpretation data corresponding to the result words


Claims (1)

VII. Scope of patent application:

1. A search method for sieving similar words in accordance with word classification, the search method comprising at least the following steps:
storing a plurality of target words and interpretation data respectively corresponding to each of the target words;
classifying each of the target words into at least one category;
providing for input of a query data;
performing a fuzzy search on the target words according to the query data, thereby finding at least one result word;
filtering each of the result words according to the category to which it belongs; and
displaying each retained result word and the interpretation data corresponding to each retained result word.

2. The search method of claim 1, wherein, before the step of displaying each retained result word and the interpretation data corresponding to each retained result word, the search method further comprises a step of retaining the result words whose corresponding interpretation data contain the same or similar passages.

3. The search method of claim 1, wherein the method further comprises a step of searching the target words for a result word according to the query data.

4. The search method of claim 1, wherein the step of classifying the target words into the categories classifies each target word according to a keyword contained in its corresponding interpretation data, or classifies each target word according to its part of speech.

5. The search method of claim 1, wherein the step of filtering each of the result words according to its category filters each result word according to the number of result words contained in each category.

6. A search system for sieving similar words in accordance with word classification, the search system comprising at least:
a storage module for storing a plurality of target words and interpretation data respectively corresponding to each of the target words, each of the target words belonging to at least one category;
an input module for providing for input of a query data;
a search module for performing a fuzzy search on the target words according to the query data, thereby finding at least one result word;
a word filtering module for filtering each of the result words according to its category; and
a display module for displaying each retained result word and the interpretation data corresponding to each retained result word.

7. The search system of claim 6, wherein the search module is further configured to search the target words for a result word according to the query data.

8. The search system of claim 6, wherein the word filtering module is further configured to retain the result words whose corresponding interpretation data contain the same or similar passages.

9. The search system of claim 6, wherein the word filtering module filters each result word according to the number of result words contained in each category.

10. The search system of claim 6, wherein the search system further comprises a classification module for classifying each target word according to a keyword contained in its corresponding interpretation data, or for classifying each target word according to its part of speech.
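The classification described in claims 4 and 10 and the passage-based retention described in claims 2 and 8 might look something like the sketch below. The claims do not say how keywords map to categories or how "similar passages" are detected, so the `KEYWORD_CATEGORIES` table, the fallback from keyword to part of speech, and the "shared run of three consecutive words" test are all hypothetical choices made only for illustration.

```python
# Hypothetical keyword -> category table; the claims do not specify one.
KEYWORD_CATEGORIES = {"language": "language", "country": "place", "fruit": "food"}


def classify(interpretation, part_of_speech):
    """Classify a target word by a keyword in its interpretation data.

    Assumed behavior: if no keyword is found, the part of speech is used
    as the category instead.
    """
    text = interpretation.lower()
    for keyword, category in KEYWORD_CATEGORIES.items():
        if keyword in text:
            return category
    return part_of_speech


def shares_passage(a, b, min_words=3):
    """True if both texts contain the same run of min_words consecutive words."""
    words_a, words_b = a.lower().split(), b.lower().split()
    runs = {
        tuple(words_a[i:i + min_words])
        for i in range(len(words_a) - min_words + 1)
    }
    return any(
        tuple(words_b[j:j + min_words]) in runs
        for j in range(len(words_b) - min_words + 1)
    )


def retain_related(results):
    """Keep result words whose interpretation shares a passage with another result.

    `results` maps each result word to its interpretation data.
    """
    return [
        word for word, text in results.items()
        if any(shares_passage(text, other)
               for other_word, other in results.items() if other_word != word)
    ]


results = {
    "俄語": "The official language of Russia",
    "俄羅斯語": "Another name for the official language of Russia",
    "俄國": "A country in Europe and Asia",
}
print(classify(results["俄語"], "noun"))
print(retain_related(results))
```

A word-run comparison is used here only because it is easy to follow; any other measure of passage similarity could be substituted without changing the overall shape of the filter.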
TW99138342A 2010-11-08 2010-11-08 Search system for sieving similar words in accordance with word classification and method thereof TW201220092A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
TW99138342A TW201220092A (en) 2010-11-08 2010-11-08 Search system for sieving similar words in accordance with word classification and method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
TW99138342A TW201220092A (en) 2010-11-08 2010-11-08 Search system for sieving similar words in accordance with word classification and method thereof

Publications (1)

Publication Number Publication Date
TW201220092A true TW201220092A (en) 2012-05-16

Family

ID=46553056

Family Applications (1)

Application Number Title Priority Date Filing Date
TW99138342A TW201220092A (en) 2010-11-08 2010-11-08 Search system for sieving similar words in accordance with word classification and method thereof

Country Status (1)

Country Link
TW (1) TW201220092A (en)

Similar Documents

Publication Publication Date Title
US7783644B1 (en) Query-independent entity importance in books
US9323827B2 (en) Identifying key terms related to similar passages
US20110219018A1 (en) Digital media voice tags in social networks
US20140229159A1 (en) Document summarization using noun and sentence ranking
TW200842614A (en) Automatic disambiguation based on a reference resource
US20180004838A1 (en) System and method for language sensitive contextual searching
TW201510753A (en) Query suggestion templates
JP2010055618A (en) Method and system for providing search based on topic
JP5273735B2 (en) Text summarization method, apparatus and program
JP2004178123A (en) Information processor and program for executing information processor
JP3864687B2 (en) Information classification device
JP2005128872A (en) Document retrieving system and document retrieving program
KR20180015491A (en) Method and apparatus for storing log of access based on kewords
JP4525433B2 (en) Document aggregation device and program
JP4569179B2 (en) Document search device
TW201220092A (en) Search system for sieving similar words in accordance with word classification and method thereof
JP2006139484A (en) Information retrieval method, system therefor and computer program
CN111681776A (en) Medicine object relation analysis method and system based on medicine big data
JP2005234772A (en) Documentation management system and method
JP2005228033A (en) Document search device and method
TW201005557A (en) Translation system by words capturing and method thereof
JP4384736B2 (en) Image search device and computer-readable recording medium storing program for causing computer to function as each means of the device
JP6135327B2 (en) Information processing apparatus, document data organizing apparatus, document presentation method, and computer program
WO2016132558A1 (en) Information processing device and method, and program
JPS63175965A (en) Document processor