TWI734183B - Blacklist database retrieval system and retrieval method - Google Patents
- Publication number
- TWI734183B (TW108131255A)
- Authority
- TW
- Taiwan
- Prior art keywords
- character
- original
- target
- name
- characters
- Prior art date
Abstract
The blacklist database retrieval system executes a blacklist database retrieval method and includes a user interface module, which generates a target name string, and a condition expansion module, which receives the target name string, compares it against a plurality of original name strings in a name dictionary database, and stores the original name strings that satisfy an error-tolerant matching condition as an expanded name string set. The expanded name string set is then provided to a search engine, which searches the blacklist information database. By expanding the target name string into multiple similar name strings that satisfy the error-tolerant matching condition, a search can find all records that may belong to the target person, reducing the chance of missing a possible target person because of input errors, homophones or near-homophones, or discrepancies in English romanization.
Description
A retrieval system and retrieval method, and in particular a blacklist database retrieval system and retrieval method.
Financial institutions usually need to conduct some degree of background check on their customers and counterparties. In particular, they need to confirm whether a customer or counterparty has a poor credit record or related news coverage, and to assess the customer's credit information and identity background, as a basis for anti-money-laundering controls. Financial institutions generally screen against global blacklist databases and build retrieval systems so that, when needed, they can search the blacklist information database for information on a person.
The blacklist information database is generally a relational database. After the user enters the target name string to search for, the retrieval system compares it against the list of names in the blacklist information database, looks for name fields whose content matches the target name string, and, on finding a match, reads and outputs the related fields of that record for the user to browse. One person may be represented by several names, so the name fields may include a Chinese name, an English name, aliases, and variant English romanizations. Name retrieval often runs into the following problems. When the name fields in the blacklist database are incomplete, the target name string entered by the user may match none of the name fields for the target person, so the target person is missed. When the Chinese or English text entered by the user contains an error, the target person cannot be found either.
In other words, when the entered target name string does not exactly match the content of a name field, the retrieval system cannot find the target person. Although some existing retrieval systems offer partial or fuzzy matching, they still struggle to handle efficiently, and to hit, cases such as mixed Simplified and Traditional Chinese characters or different characters with the same pronunciation. In summary, existing blacklist database retrieval systems need further improvement.
Because existing blacklist database retrieval systems may fail to find the correct target when the database content is incomplete or the search condition contains a typo, the present invention provides a blacklist database retrieval system and retrieval method. The blacklist database retrieval system is connected to a blacklist information database and a name dictionary database. The name dictionary database contains a plurality of original name strings based on real names. The blacklist information database is a relational database: each record contains fields for a target person such as name or title, abbreviation, alias, date of birth or registration date, and related event data.
The retrieval system includes a user interface module, a condition expansion module, and a search engine. The user interface module provides a user interface that generates a target name string from user input. The condition expansion module is connected to the name dictionary database and receives the target name string. On receiving the target name string, the condition expansion module compares it with each original name string in the name dictionary database and stores each original name string that satisfies an error-tolerant matching condition in an expanded name string set. The search engine is connected to the condition expansion module to receive the expanded name string set, and determines whether the content of each of a plurality of name fields in the blacklist information database matches any original name string in the expanded name string set. When the content of a name field in the blacklist information database matches one of the original name strings in the expanded name string set, the search engine determines that the person record corresponding to that name field is a target result.
The blacklist database retrieval method is executed by the blacklist database retrieval system and includes the following steps:
Receive a target name string;
Compare the target name string in turn with a plurality of original name strings in a name dictionary database, determine whether each original name string satisfies an error-tolerant matching condition, and store the original name strings that satisfy the condition in an expanded name string set;
Search a blacklist information database according to the expanded name string set, and determine whether the content of each name field in the blacklist information database matches any original name string in the expanded name string set;
When the content of a name field in the blacklist information database matches one of the original name strings in the expanded name string set, determine that the person record corresponding to that name field is a target result.
In other words, the retrieval system of the present invention first passes the target name string entered by the user through the condition expansion module to find every name string that satisfies the error-tolerant matching condition with respect to the target name string, and then searches the blacklist information database with those name strings. The name dictionary database is a database of name strings covering all possible name romanizations. Because the single target name string entered by the user is expanded into the expanded name string set, a search of the blacklist information database can find all records that may belong to the target person, instead of missing a possible target person because of an input error in the target name string, homophones or near-homophones, or discrepancies in the English romanization of the name. In addition, because similar name strings are found only under a stated condition, the method also avoids search criteria so broad that they return too many irrelevant results or consume too many search engine resources.
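Referring to FIG. 1, the present invention provides a blacklist database retrieval system 10, which is connected to a blacklist information database 21 and a name dictionary database 22. The name dictionary database 22 contains a plurality of original name strings based on real names. The blacklist information database 21 is a relational database: each record contains fields for a target person such as name or title, abbreviation, alias, date of birth or registration date, and related event data. The blacklist database retrieval system 10 includes a user interface module 11, a condition expansion module 12, and a search engine 13. The user interface module 11 provides a user interface that generates a target name string from user input. The condition expansion module 12 is connected to the user interface module 11 to receive the target name string, and is also connected to the name dictionary database 22. On receiving the target name string, the condition expansion module 12 compares it with each original name string in the name dictionary database 22 and stores each original name string that satisfies an error-tolerant matching condition in an expanded name string set.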
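The search engine 13 is connected to the condition expansion module 12 to receive the expanded name string set, and searches the blacklist information database 21 with each original name string in the set. The search engine 13 determines whether the content of each of the name fields in the blacklist information database 21 matches any original name string in the expanded name string set. When the content of a name field in the blacklist information database 21 matches one of the original name strings in the expanded name string set, the search engine 13 determines that the person record corresponding to that name field is a target result. When the search engine 13 has finished comparing all name fields, it outputs all target results that match any name string in the expanded name string set to the user interface module 11 for the user to browse.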
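Referring to FIG. 2, the blacklist database retrieval method includes the following steps:
Receive a target name string (S201);
Compare the target name string in turn with a plurality of original name strings in a name dictionary database 22, determine whether each original name string satisfies an error-tolerant matching condition, and store the original name strings that satisfy the condition in an expanded name string set (S202);
Search a blacklist information database 21 according to the expanded name string set, and determine whether the content of each name field in the blacklist information database 21 matches one of the original name strings in the expanded name string set (S203);
When the content of a name field in the blacklist information database 21 matches one of the original name strings in the expanded name string set, determine that the person record corresponding to that name field is a target result (S204).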
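Steps S201 to S204 can be sketched as a small pipeline. This is only an illustration under stated assumptions: the function names, the record layout (a `names` list per record), and the placeholder matching condition `same_length_one_off` are hypothetical and are not taken from the patent, which defines its own character-by-character comparison later in the description.

```python
from typing import Callable, Dict, Iterable, List, Set


def expand_names(target: str, dictionary: Iterable[str],
                 is_match: Callable[[str, str], bool]) -> Set[str]:
    """S202: keep every dictionary name that satisfies the matching condition."""
    return {name for name in dictionary if is_match(target, name)}


def search_blacklist(expanded: Set[str], records: List[Dict]) -> List[Dict]:
    """S203/S204: a record is a target result if any of its name fields is in the set."""
    return [r for r in records if any(n in expanded for n in r["names"])]


def same_length_one_off(target: str, name: str) -> bool:
    """Placeholder condition (assumption): same length, at most one differing position."""
    return len(target) == len(name) and sum(a != b for a, b in zip(target, name)) <= 1


# Hypothetical sample data, for illustration only.
dictionary = ["JOHN", "JOGN", "JOAN", "MARY"]
records = [
    {"id": 1, "names": ["JOGN", "J. DOE"]},
    {"id": 2, "names": ["MARY"]},
]

expanded = expand_names("JOHN", dictionary, same_length_one_off)  # S201 + S202
results = search_blacklist(expanded, records)                     # S203 + S204
```

Passing the matching condition in as a parameter keeps the expansion step independent of how "similar" is defined, so the placeholder can be swapped for a stricter or looser comparison.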
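Because most existing international blacklist databases use romanized English names as the primary name key, when the text the user enters through the user interface module 11 is a Chinese string, the condition expansion module 12 converts the Chinese string into an English target name string using at least one Chinese-to-English romanization scheme, and then compares the target name string with the original name strings in the name dictionary database 22. The target name string consists of a plurality of characters; for example, the name string "JOHN" contains the four characters "J", "O", "H", and "N". When the text entered by the user contains more than one name string, for example "JOHN YANG", the condition expansion module 12 compares "JOHN" and "YANG" separately against the name dictionary database 22, produces two expanded name string sets, and provides them to the search engine 13, which searches on their intersection.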
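Preferably, the error-tolerant matching condition of the condition expansion module 12 includes an error-tolerance count. The error-tolerance count is preferably a built-in setting of the retrieval system, but the user may also adjust it at search time according to the desired search precision and number of results. When the condition expansion module 12 compares the target name string with the name dictionary database 22 to determine whether each original name string satisfies the error-tolerant matching condition, it checks, one original name string at a time, whether the number of characters that differ between that original name string and the target name string does not exceed the error-tolerance count. If so, the condition expansion module 12 determines that the original name string satisfies the error-tolerant matching condition.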
For example, suppose the error-tolerance count is 1 and the target name string is "JOHN". When the condition expansion module 12 finds the original name string "JOGN" in the name dictionary database 22, it determines that "JOGN" satisfies the error-tolerant matching condition and stores "JOGN" in the expanded name string set.
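Furthermore, when the condition expansion module 12 reads the name dictionary database 22, it compares only those original name strings whose length differs from the length of the target name string by no more than the error-tolerance count. For example, the target name string "JOHN" contains 4 characters; with an error-tolerance count of 1, the condition expansion module 12 retrieves from the name dictionary database 22 only the original name strings whose length differs from 4 by at most 1, that is, those with 3, 4, or 5 characters. This greatly reduces the number of original name strings the condition expansion module 12 has to compare: strings already known to fail the error-tolerant matching condition are never compared, which saves the module's resources and improves the efficiency of the retrieval system as a whole.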
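The length pre-filter described above can be illustrated with a minimal sketch. The function name `candidates_by_length` and the sample dictionary are hypothetical; only the rule itself (keep strings whose length is within the error-tolerance count of the target's length) comes from the description.

```python
from typing import Iterable, List


def candidates_by_length(target: str, dictionary: Iterable[str],
                         tolerance: int) -> List[str]:
    """Keep only names whose length is within `tolerance` of the target's length."""
    return [name for name in dictionary
            if abs(len(name) - len(target)) <= tolerance]


# Hypothetical sample dictionary, for illustration only.
sample = ["JO", "JON", "JOHN", "TJOHN", "JOHNNY"]
kept = candidates_by_length("JOHN", sample, tolerance=1)
```

With a target of 4 characters and a tolerance of 1, only the names of length 3, 4, or 5 survive, so the more expensive character-by-character comparison runs on fewer strings.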
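The following describes how the condition expansion module 12 compares the target name string with the name dictionary database 22. In a first preferred embodiment of the present invention, the condition expansion module 12 designates one character of the target name string as a first target character and, when the first target character is not the last character of the string, designates the character after it as a second target character. The condition expansion module 12 likewise designates one character of the original name string as a first original character and, when the first original character is not the last character of the string, designates the character after it as a second original character. The condition expansion module 12 stores a differing-character count, initialized to 0, and compares the first target character with the first original character. When the two are the same, the differing-character count is not updated. When they differ, the module compares the first target character with the second original character. When the first target character differs from the first original character but equals the second original character, the module increases the differing-character count by 1. When the first target character differs from both the first and the second original character, the module compares the second target character with the first original character. If the second target character also differs from the first original character, the module increases the differing-character count by 1; if the second target character equals the first original character, the module likewise increases the differing-character count by 1.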
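Referring to FIG. 3, the step in which the blacklist database retrieval system 10 compares the target name string with the plurality of original name strings in the name dictionary database 22 includes the following sub-steps:
Designate one character of the target name string as a first target character and, when the first target character is not the last character of the string, designate the character after it as a second target character; and
Designate one character of the original name string as a first original character and, when the first original character is not the last character of the string, designate the character after it as a second original character; and
Store a differing-character count, initialized to 0 (S301);
Compare the first target character with the first original character (S302);
When the first target character is the same as the first original character, do not update the differing-character count (S303);
When the first target character differs from the first original character, compare the first target character with the second original character (S304);
When the first target character differs from the first original character but equals the second original character, increase the differing-character count by 1 (S305);
When the first target character differs from both the first and the second original character, compare the second target character with the first original character (S306);
When the first target character differs from both the first and the second original character, and the second target character also differs from the first original character, increase the differing-character count by 1 (S307);
When the first target character differs from both the first and the second original character, and the second target character equals the first original character, increase the differing-character count by 1 (S308).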
Steps S305 and S308 cover the cases in which the first target character differs from the first original character and either the first target character equals the second original character (S305) or the second target character equals the first original character (S308); in these cases the target name string and the original name string have a single differing character. Step S307 covers the case in which the first target character differs from both the first and the second original character and the second target character also differs from the first original character; here the target name string and the original name string have a pair of differing characters. In all three cases, whether the difference is single or paired, the original name string is judged to have one differing character and the differing-character count is increased by 1.
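The decision made at one comparison position (steps S302 to S308) can be sketched as a small classifier. The function names and the string labels it returns are hypothetical, chosen here for readability; `None` stands for a second character that does not exist because the first character is the last one in its string.

```python
from typing import Optional


def classify_step(t1: str, t2: Optional[str],
                  o1: str, o2: Optional[str]) -> str:
    """Classify one comparison position; t1/t2 are the first/second target
    characters and o1/o2 the first/second original characters."""
    if t1 == o1:
        return "same"                       # S303: count unchanged
    if o2 is not None and t1 == o2:
        return "single_extra_in_original"   # S305: count + 1
    if t2 is not None and t2 == o1:
        return "single_extra_in_target"     # S308: count + 1
    return "paired"                         # S307: count + 1


def count_increment(kind: str) -> int:
    """Every kind of difference adds exactly 1 to the differing-character count."""
    return 0 if kind == "same" else 1
```

For instance, comparing target "JOHN" with original "TJOHN" at the first position gives `classify_step("J", "O", "T", "J")`, the first kind of single difference, because the original carries one extra character.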
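In a second preferred embodiment of the present invention, when the condition expansion module 12 determines that the first target character is the same as the first original character, or that the first target character differs from both the first and the second original character and the second target character also differs from the first original character, the module further determines whether the differing-character count is greater than the error-tolerance count. When the differing-character count is greater than the error-tolerance count, the module ends the comparison of the original name string with the target name string. When the differing-character count is less than or equal to the error-tolerance count, the module designates the second target character as the new first target character and the character after it as the new second target character, designates the second original character as the new first original character and the character after it as the new second original character, and repeats the step of comparing the first target character with the first original character, so as to continue with the remaining characters of the two strings.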
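In addition, when the condition expansion module 12 determines that the first target character differs from the first original character and equals the second original character, the module further determines whether the differing-character count is greater than the error-tolerance count. When the differing-character count is greater than the error-tolerance count, the module ends the comparison of the original name string with the target name string. When the differing-character count is less than or equal to the error-tolerance count, the module designates the second target character as the new first target character and the character after it as the new second target character, designates the character after the second original character as the new first original character and the character two positions after the second original character as the new second original character, and compares the updated first target character with the updated first original character.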
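In addition, when the condition expansion module 12 determines that the first target character differs from both the first and the second original character and that the second target character equals the first original character, the module further determines whether the differing-character count is greater than the error-tolerance count. When the differing-character count is greater than the error-tolerance count, the module ends the comparison of the original name string with the target name string. When the differing-character count is less than or equal to the error-tolerance count, the module designates the character after the second target character as the new first target character and the character two positions after the second target character as the new second target character, designates the second original character as the new first original character and, when the second original character is not the last character of the string, the character after it as the new second original character, and compares the updated first target character with the updated first original character.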
In other words, referring to FIG. 3 and FIG. 4 together, in this preferred embodiment the blacklist database retrieval method of the present invention further includes the following sub-steps:
When the first target character is the same as the first original character, or when the first target character differs from both the first and the second original character and the second target character also differs from the first original character, further perform the following steps:
Determine whether the differing-character count is greater than the error-tolerance count (S309);
When the differing-character count is greater than the error-tolerance count, end the comparison of the original name string with the target name string;
When the differing-character count is less than or equal to the error-tolerance count, designate the second target character as the new first target character and the character after it as the new second target character, and designate the second original character as the new first original character and the character after it as the new second original character (S310);
Perform the step of comparing the first target character with the first original character again (S302).
In addition, referring to FIG. 3 and FIG. 5 together, when the first target character differs from the first original character and equals the second original character, further perform the following steps:
Determine whether the differing-character count is greater than the error-tolerance count (S311);
When the differing-character count is greater than the error-tolerance count, end the comparison of the original name string with the target name string;
When the differing-character count is less than or equal to the error-tolerance count, designate the second target character as the new first target character and the character after it as the new second target character, and designate the character after the second original character as the new first original character and the character two positions after the second original character as the new second original character (S312);
Perform the step of comparing the first target character with the first original character again (S302).
In addition, referring to FIG. 3 and FIG. 6 together, when the first target character differs from both the first and the second original character and the second target character equals the first original character, further perform the following steps:
Determine whether the differing-character count is greater than the error-tolerance count (S313);
When the differing-character count is greater than the error-tolerance count, end the comparison of the original name string with the target name string;
When the differing-character count is less than or equal to the error-tolerance count, designate the character after the second target character as the new first target character and the character two positions after the second target character as the new second target character, and designate the second original character as the new first original character and, when the second original character is not the last character of the string, the character after it as the new second original character (S314);
Perform the step of comparing the first target character with the first original character again (S302).
In short, in the second preferred embodiment of the present invention, after the first target character and the first original character are found to be the same, or to form a paired difference, the original second target character is designated as the new first target character and the character after it as the new second target character, and the original second original character is designated as the new first original character and the character after it as the new second original character.
A single difference falls into one of two cases. In the first, the first target character differs from the first original character but equals the second original character. In the second, the first target character differs from the first original character, but the first original character equals the second target character.
In the first kind of single difference, the original second target character is designated as the new first target character and the character after it as the new second target character, while the character after the second original character is designated as the new first original character and the character two positions after the second original character as the new second original character. In the second kind of single difference, the character after the original second target character is designated as the new first target character and the character two positions after the original second target character as the new second target character, while the second original character is designated as the new first original character and the character after it as the new second original character.
After the new first and second target characters and the new first and second original characters are designated, the method returns to the step of comparing the first target character with the first original character, that is, step S302 is performed again.
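In a first example, the target name string is "JOHN" and one original name string in the name dictionary database 22 is "TJOHN". First, "J" of the target name string is set as the first target character and "O" as the second target character; "T" of the original name string is set as the first original character and "J" as the second original character. The condition expansion module 12 first finds that the first target character "J" differs from the first original character "T", then finds that the first target character "J" equals the second original character "J". The target name string and the original name string therefore have a single differing character, and this is the first kind of single difference described above. Accordingly, "O" of the target name string becomes the first target character and "H" the second target character, while "O" of the original name string becomes the first original character and "H" the second original character. In other words, the comparison continues from the character after "J" in the target name string and the character two positions after "T" in the original name string.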
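In a second example, the target name string is "AJOHN" and one original name string in the name dictionary database 22 is "JOHN". First, "A" of the target name string is set as the first target character and "J" as the second target character; "J" of the original name string is set as the first original character and "O" as the second original character. The condition expansion module 12 first finds that the first target character "A" differs from the first original character "J", then that the first target character "A" differs from the second original character "O", and then that the second target character "J" equals the first original character "J". The target name string and the original name string therefore have a single differing character, and this is the second kind of single difference described above. Accordingly, "O" of the target name string becomes the first target character and "H" the second target character, while "O" of the original name string becomes the first original character and "H" the second original character. In other words, the comparison continues from the character two positions after "A" in the target name string and the character after "J" in the original name string.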
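In a third example, the target name string is "JOHN" and one original name string in the name dictionary database 22 is "JOGN". The condition expansion module 12 designates "J" of the target name string as the first target character and "J" of the original name string as the first original character, and designates "O" of the target name string as the second target character and "O" of the original name string as the second original character; the first target character and the first original character are the same. The original second target character therefore becomes the new first target character and the original second original character the new first original character, that is, "O" of the target name string and "O" of the original name string. At this point "J" and "O" of the target name string have been found to match "J" and "O" of the original name string. On reaching the third character, "H" of the target name string is designated as the first target character and "N" as the second target character, while "G" of the original name string is designated as the first original character and "N" as the second original character. The condition expansion module 12 first finds that the first target character "H" differs from the first original character "G", then that the first target character "H" differs from the second original character "N", and then that the second target character "N" differs from the first original character "G". The target name string and the original name string are therefore judged to have a pair of differing characters. Accordingly, "N" of the target name string becomes the first target character and "N" of the original name string the first original character; that is, the comparison moves on to the character after "H" in the target name string and the character after "G" in the original name string. Because "N" is the last character of both strings, no second target character or second original character needs to be designated. "N" of the target name string and "N" of the original name string are found to be the same, and the comparison of the two strings ends with a total of one paired difference.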
Furthermore, the comparison may reach the end of the target name string or of the original name string. When, in the step of designating the new first and second target characters and the new first and second original characters, the new first target character or the new first original character is already the last character of its string, so that no new second target character or new second original character is available to designate, the new first target character and the new first original character are compared directly. If they differ, the differing-character count is increased by 1. The number of characters remaining in the string that has not yet been fully compared is then counted, the differing-character count is updated by that number, and the comparison ends.
Alternatively, when, in the step of designating the new first and second target characters and the new first and second original characters, no new first target character or new first original character is left to designate, the number of characters remaining in the other string that have not yet been compared is counted directly, the differing-character count is updated by that number, and the comparison ends.
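The comparison loop of the second preferred embodiment can be sketched as follows. This is one possible reading, not a definitive implementation: the function name `count_differences` and its return convention (`None` when the tolerance is exceeded) are hypothetical. One simplifying assumption is worth flagging: when only one of the two strings has reached its last character, the sketch still tries the one-character look-ahead into the other string, a choice made so that pairs such as "JOHN"/"JON" and "JOHN"/"JOHON" count as a single difference.

```python
from typing import Optional


def count_differences(target: str, original: str,
                      tolerance: int) -> Optional[int]:
    """Return the differing-character count, or None once it exceeds `tolerance`."""
    i = j = 0          # positions of the first target / first original character
    diff = 0           # differing-character count, initialized to 0 (S301)
    while i < len(target) and j < len(original):
        t1, o1 = target[i], original[j]
        t2 = target[i + 1] if i + 1 < len(target) else None
        o2 = original[j + 1] if j + 1 < len(original) else None
        if t1 == o1:                          # S303: same, advance both by one
            i, j = i + 1, j + 1
        elif o2 is not None and t1 == o2:     # S305: first kind of single difference
            diff += 1
            i, j = i + 1, j + 2               # S312
        elif t2 is not None and t2 == o1:     # S308: second kind of single difference
            diff += 1
            i, j = i + 2, j + 1               # S314
        else:                                 # S307: paired difference
            diff += 1
            i, j = i + 1, j + 1               # S310
        if diff > tolerance:                  # S309 / S311 / S313
            return None
    # One string is exhausted: leftover characters in the other all count.
    diff += (len(target) - i) + (len(original) - j)
    return diff if diff <= tolerance else None
```

Applied to the three worked examples with a tolerance of 1, the sketch reports one difference for "JOHN"/"TJOHN", "AJOHN"/"JOHN", and "JOHN"/"JOGN", and stops early for an unrelated pair such as "JOHN"/"MARY".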
Thus, when the condition expansion module 12 performs the expansion comparison using the target name string and the name dictionary database 22, taking the target name string "JOHN" as an example, the expanded name set it produces is: {JOHN*,WJOHN,TJOHN,PJOHN,MJOHN,LJOHN,KJOHN,JÓHN,JOÁN,JOHÉ,JOHON,JOHEN,JOHAN,IJOHN,HJOHN,FJOHN,EJOHN,CJOHN,BJOHN,AJOHN,YOHN,WOHN,SOHN,ROHN,POHN,NOHN,MOHN,LOHN,KOHN,JUHN,JOYN,JOWN,JOUN,JORN,JOON,JOIN,JOHT,JOHS,JOHR,JOHO,JOHM,JOHL,JOHG,JOHE,JOHD,JOHA,JOGN,JOEN,JOBN,JOAN,JEHN,JAHN,HOHN,GOHN,FOHN,DOHN,COHN,BOHN,OHN,JON,JOH}. Here JOHN* denotes all original name strings that begin with "JOHN".
Similarly, when the second target name string entered by the user is "YANG", the condition expansion module 12 performs the expansion comparison using this target name string and the name dictionary database 22, and the expanded name set it produces is:
{YANG*,ZYANG,YÊNG,YVANG,YUANG,YTANG,YRANG,YJANG,YIANG,YEANG,YAUNG,YAONG,YANNG,YANIG,YANEG,YANAG,YAING,YAANG,UYANG,SYANG,RYANG,OYANG,NYANG,LYANG,JYANG,IYANG,HYANG,GYANG,EYANG,AYANG,ZANG,YUNG,YONG,YLNG,YING,YENG,YARG,YAOG,YANY,YANX,YANW,YANV,YANU,YANT,YANS,YANR,YANQ,YANP,YANO,YANN,YANM,YANL,YANK,YANJ,YANI,YANH,YANF,YANE,YAND,YANC,YANB,YANA,YAAG,XANG,WANG,VANG,UANG,TANG,SANG,RANG,PANG,NANG,MANG,LANG,KANG,JANG,IANG,HANG,GANG,FANG,EANG,DANG,CANG,BANG,YNG,YAN,YAG,ANG}. Here YANG* denotes all original name strings that begin with "YANG".
When the search engine 13 receives the two expanded name string sets above, it can search the name fields of the blacklist information database 21 on the intersection of the two sets. The search expression is as follows: ((JOHN* or WJOHN or TJOHN or PJOHN or MJOHN or LJOHN or KJOHN or JÓHN or JOÁN or JOHÉ or JOHON or JOHEN or JOHAN or IJOHN or HJOHN or FJOHN or EJOHN or CJOHN or BJOHN or AJOHN or YOHN or WOHN or SOHN or ROHN or POHN or NOHN or MOHN or LOHN or KOHN or JUHN or JOYN or JOWN or JOUN or JORN or JOON or JOIN or JOHT or JOHS or JOHR or JOHO or JOHM or JOHL or JOHG or JOHE or JOHD or JOHA or JOGN or JOEN or JOBN or JOAN or JEHN or JAHN or HOHN or GOHN or FOHN or DOHN or COHN or BOHN or OHN or JON or JOH) and (YANG* or ZYANG or YÊNG or YVANG or YUANG or YTANG or YRANG or YJANG or YIANG or YEANG or YAUNG or YAONG or YANNG or YANIG or YANEG or YANAG or YAING or YAANG or UYANG or SYANG or RYANG or OYANG or NYANG or LYANG or JYANG or IYANG or HYANG or GYANG or EYANG or AYANG or ZANG or YUNG or YONG or YLNG or YING or YENG or YARG or YAOG or YANY or YANX or YANW or YANV or YANU or YANT or YANS or YANR or YANQ or YANP or YANO or YANN or YANM or YANL or YANK or YANJ or YANI or YANH or YANF or YANE or YAND or YANC or YANB or YANA or YAAG or XANG or WANG or VANG or UANG or TANG or SANG or RANG or PANG or NANG or MANG or LANG or KANG or JANG or IANG or HANG or GANG or FANG or EANG or DANG or CANG or BANG or YNG or YAN or YAG or ANG)).
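The search expression has a regular shape: the terms of each expanded set are joined with "or", and the resulting groups are joined with "and". A minimal sketch of assembling such an expression is shown below; the function name `build_query` is hypothetical, and the short term lists are abbreviated samples rather than the full sets.

```python
from typing import List, Sequence


def build_query(expanded_sets: Sequence[Sequence[str]]) -> str:
    """Join each set's terms with 'or', then join the groups with 'and'."""
    groups: List[str] = ["(" + " or ".join(terms) + ")" for terms in expanded_sets]
    return "(" + " and ".join(groups) + ")"


# Abbreviated sample sets, for illustration only.
query = build_query([["JOHN*", "JOGN", "JON"], ["YANG*", "WANG"]])
```

Here `query` is "((JOHN* or JOGN or JON) and (YANG* or WANG))", the same shape as the full expression for "JOHN YANG".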
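In this way, when the search target name entered by the user is "JOHN YANG", the search engine 13 searches with the expanded condition above and can find every person record whose name is within one error-tolerance character of "JOHN YANG". This ensures that all possible person records are found and presented to the user, and avoids missing a possible target person because of a typing error in the input, a transcription error in the blacklist information database 21, or similar causes.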
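Furthermore, when the search content entered by the user is a Chinese string, the condition expansion module 12 first converts the Chinese string into English strings using at least one Chinese-to-English romanization scheme, such as Romanization, Wade-Giles, Hanyu Pinyin, and Tongyong Pinyin, covering all common schemes, and then uses the resulting English string or strings as target name strings for condition expansion. Because there is no way to know which romanization scheme was used to register the name strings collected in the name fields of the blacklist information database 21, the entered Chinese string is first converted into English strings with several romanization schemes and condition expansion is then performed, so as to maximize the chance of finding all possible target person records.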
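Converting one Chinese string under several romanization schemes can be illustrated with a toy lookup. Everything here is an assumption for illustration: the table `ROMANIZATION_TABLES` is a hand-written, one-character sample (a real system would need full per-scheme conversion tables), and the function name `romanize_all` is hypothetical.

```python
from typing import Dict, Set

# Hand-written sample entries for a single character; not a complete table.
ROMANIZATION_TABLES: Dict[str, Dict[str, str]] = {
    "hanyu_pinyin": {"張": "ZHANG"},
    "wade_giles": {"張": "CHANG"},
    "tongyong_pinyin": {"張": "JHANG"},
}


def romanize_all(chinese: str, tables: Dict[str, Dict[str, str]]) -> Set[str]:
    """Return one English string per scheme that covers every character."""
    results: Set[str] = set()
    for table in tables.values():
        if all(ch in table for ch in chinese):
            results.add("".join(table[ch] for ch in chinese))
    return results


targets = romanize_all("張", ROMANIZATION_TABLES)
```

Each string in `targets` would then be treated as a separate target name string and expanded against the name dictionary database.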
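In a third preferred embodiment of the present invention, the user interface module 11 further receives a first date condition entered by the user, and the search engine 13 receives the first date condition. When the search engine 13 searches according to the expanded name string set and determines that the content of a name field in the blacklist information database 21 matches one of the original name strings in the expanded name string set, the search engine 13 further reads the content of a first date field corresponding to that name field. Only when the first date field is empty, or its content satisfies the first date condition, does the search engine 13 determine that the person record corresponding to the name field and the first date field is a target result.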
Referring to FIG. 7, in this preferred embodiment, step S201 of the blacklist database retrieval method also receives a first date condition along with the target name string (S401). When the content of a name field in the blacklist information database 21 is found to match one of the original name strings in the expanded name string set (S205), the content of a first date field corresponding to that name field is read, and it is determined whether the first date field is empty or its content satisfies the first date condition (S402). Only when the first date field is empty, or its content satisfies the first date condition, is the person record corresponding to the name field and the first date field determined to be a target result (S204).
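In other words, when a name field in the blacklist information database 21 matches one of the original name strings in the expanded name string set, the search engine 13 further checks the corresponding first date field against the first date condition. When the content of the first date field satisfies the first date condition, the person record is a target result. If the first date field has no content, meaning that the blacklist information database 21 has not collected the relevant first date for that record (for example, the year of the person's date of birth), the search engine 13 still determines that the record is a target result. This avoids failing to retrieve a record merely because part of its date data, such as the year or month, is missing from the blacklist information database 21.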
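The date check of the third embodiment (S402) can be sketched as a filter that lets a record through when its date field is empty or equals the condition. The function name, the record layout, and the use of a birth-year string as the date condition are assumptions made for illustration.

```python
from typing import Dict, List, Optional


def passes_date_condition(field_value: Optional[str], condition: str) -> bool:
    """S402: an empty date field passes; otherwise the value must match."""
    if not field_value:
        return True
    return field_value == condition


# Hypothetical records that already matched on name (S205).
name_matches: List[Dict] = [
    {"id": 1, "birth_year": "1970"},
    {"id": 2, "birth_year": None},     # date not collected: still a result
    {"id": 3, "birth_year": "1985"},
]
target_results = [r for r in name_matches
                  if passes_date_condition(r["birth_year"], "1970")]
```

Treating a missing date as a pass trades some precision for recall, which matches the stated goal of not losing a record only because its date data is incomplete.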
The above are only preferred embodiments of the present invention and do not limit the present invention in any form. Although the present invention has been disclosed above by way of preferred embodiments, these are not intended to limit it. Any person skilled in the art may, without departing from the scope of the technical solution of the present invention, use the technical content disclosed above to make minor changes or modifications that result in equivalent embodiments. Any simple modification, equivalent change, or refinement made to the above embodiments in accordance with the technical essence of the present invention, without departing from the content of its technical solution, still falls within the scope of the technical solution of the present invention.
10: Blacklist database retrieval system
11: User interface module
12: Condition expansion module
13: Search engine
21: Blacklist information database
22: Name dictionary database
FIG. 1 is a block diagram of the blacklist database retrieval system of the present invention.
FIG. 2 is a flowchart of the blacklist database retrieval method of the present invention.
FIG. 3 is a flowchart of the first preferred embodiment of the blacklist database retrieval method of the present invention.
FIG. 4 is a flowchart of one part of the second preferred embodiment of the blacklist database retrieval method of the present invention.
FIG. 5 is a flowchart of another part of the second preferred embodiment of the blacklist database retrieval method of the present invention.
FIG. 6 is a flowchart of a further part of the second preferred embodiment of the blacklist database retrieval method of the present invention.
FIG. 7 is a flowchart of the third preferred embodiment of the blacklist database retrieval method of the present invention.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
TW108131255A TWI734183B (en) | 2019-08-30 | 2019-08-30 | Blacklist database retrieval system and retrieval method |
Publications (2)
Publication Number | Publication Date |
---|---|
TW202109309A TW202109309A (en) | 2021-03-01 |
TWI734183B true TWI734183B (en) | 2021-07-21 |
Family
ID=76035219
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
TW108131255A TWI734183B (en) | 2019-08-30 | 2019-08-30 | Blacklist database retrieval system and retrieval method |
Country Status (1)
Country | Link |
---|---|
TW (1) | TWI734183B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
TW200521737A (en) * | 2003-12-22 | 2005-07-01 | Inventec Besta Co Ltd | Vocabulary search method and system |
WO2007016478A2 (en) * | 2005-07-29 | 2007-02-08 | Bit9, Inc. | Network security systems and methods |
US8219533B2 (en) * | 2007-08-29 | 2012-07-10 | Enpulz Llc | Search engine feedback for developing reliable whois database reference for restricted search operation |
CN102622379A (en) * | 2011-01-31 | 2012-08-01 | 北京千橡网景科技发展有限公司 | Real name detection method and equipment |