TWI237775B - Method and systems for screening Chinese address data - Google Patents

Method and systems for screening Chinese address data

Info

Publication number
TWI237775B
Authority
TW
Taiwan
Prior art keywords
database
chinese
item
scope
items
Prior art date
Application number
TW91110251A
Other languages
Chinese (zh)
Inventor
Mung Wah Marie Low
Xin Yu
Cheh Wooi Albert Tan
Zhen-Ming Xi
Loy Lee
Original Assignee
Dell Products Lp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CNB021017360A (CN100442275C)
Application filed by Dell Products Lp
Application granted
Publication of TWI237775B

Landscapes

  • Document Processing Apparatus (AREA)

Abstract

A system is proposed for identifying, in an order management system, orders addressed to individuals or organizations having Chinese addresses. The Chinese address items of the order management database and the Chinese address items of a database of proscribed individuals and organizations are each converted into respective databases in a common writing system, namely the Pin Yin transliteration standard. In the conversion process, Simplified Mandarin items in the order management database that can be converted in multiple ways are converted in each of those ways. The converted items in the two Pin Yin databases are then compared.

Description

Technical Field

The present invention relates to methods and systems for comparing two databases of Chinese-language items. In particular, the invention is used to compare data items such as the addresses of individuals and/or organizations.

Background Art

Several standards currently exist for writing Chinese text. Besides the traditional Chinese character set, which is still widely used in regions such as Taiwan and Hong Kong, the People's Republic of China writes text in simplified Chinese characters. Chinese can also be written in Roman letters, for example as Pinyin characters, or by other systems such as the romanization defined as ALA-LC.

Conversion between these standards is common. For example, a conventional order management system (SMARTS) requires the ordering party's address to be keyed in using Pinyin characters, which are then converted into the double-byte Simplified Chinese characters stored in the SMARTS database. Note that not all conversions are unambiguous. A single Simplified Chinese character may correspond (in Pinyin) to several sets of Roman letters. Similarly, a single set of Roman letters (in Pinyin) may correspond to several Simplified Chinese characters, each with a different meaning.

Because Chinese characters in different databases may be stored according to different standards, comparing corresponding items in different databases is difficult. For example, a second party has published a list of parties (the "DPL"), and all transactions with parties on that list are subject to special handling. The list is published in English (that is, conventional English text together with transliterations of Chinese text), and there is no indication that it will be translated into Simplified Chinese characters in the future. It is therefore difficult to compare the list with the names stored in an order management system such as SMARTS.

This difficulty creates the possibility that a supplier will mistakenly provide products to a party on the DPL without giving the transaction special handling. Such a mistake could damage the company's interests.

Summary of the Invention

The present invention seeks to address the above problem, and in particular to provide methods and systems for comparing two databases, both of which include Chinese text data items such as the addresses of entities (individuals or organizations), where the two databases use different Chinese writing systems for those items.

In general terms, the invention proposes that the Chinese text items of both databases are converted into a common standard, in particular the Pin Yin transliteration standard. During conversion, any item that can be converted in more than one way is converted in each of those ways. The items of the two converted databases can then be compared.

Specifically, a first aspect of the invention is a computer-implemented method for comparing two databases, each of which contains Chinese text data items specifying addresses, the method comprising:

for each database, converting any Chinese text data item that is not in a predefined common Chinese-language format into the common format, wherein any item, at least in the first database, that can be converted into the common format in multiple ways is converted in all of those ways; and

comparing the data items in the common format to identify Chinese text data items in the first database that correspond to Chinese text data items in the second database.

A second aspect of the invention is a computer system for comparing two databases, each of which includes Chinese text data items specifying addresses, the computer system comprising:

a first conversion unit for converting the Chinese text data items of the first database into a predefined common Chinese-language format, wherein any item of the first database that can be converted into the common format in multiple ways is converted in all of those ways, so as to generate items in the common format;

a second conversion unit for converting the Chinese text data items of the second database into the common Chinese-language format; and

a comparison unit for comparing the converted data items to identify Chinese text data items in the first database that correspond to Chinese text data items in the second database.

Note that if the data items in the second database are already in the common format, the second conversion unit can be omitted.
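The rule that an item convertible in several ways is converted in every one of those ways can be illustrated with a short Python sketch. This is only an illustration under stated assumptions, not the conversion code or conversion file of the described system: the `READINGS` table and the function name `pinyin_candidates` are hypothetical, tones are ignored, and the table covers only four characters.

```python
from itertools import product

# Hypothetical reading table (an assumption for illustration): each Simplified
# Chinese character maps to every toneless Pinyin syllable it can take.
READINGS = {
    "重": ["zhong", "chong"],
    "庆": ["qing"],
    "长": ["chang", "zhang"],
    "沙": ["sha"],
}

def pinyin_candidates(text):
    """Return every Pinyin string that the given text can be converted to."""
    # Characters missing from the table (digits, Latin letters) pass through unchanged.
    per_char = [READINGS.get(ch, [ch]) for ch in text]
    # The Cartesian product enumerates one candidate per combination of readings.
    return [" ".join(combo) for combo in product(*per_char)]

print(pinyin_candidates("重庆"))  # ['zhong qing', 'chong qing']
```

Each candidate would become its own item in the converted database, so the number of items grows with the product of the reading counts of the characters in the address.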

The common Chinese-language format is preferably Pinyin characters. The first database may be an order management system database that includes data items containing ordering and/or pick-up addresses, and the items in the first database may be in Simplified Chinese characters. The second database may be in English, or in a combination of conventional English text and Pinyin. For example, the second database may be some or all of a list of third parties published by a second party.

A "Chinese text data item" may be defined as an item in the Chinese language, for example in Chinese characters. Alternatively or additionally, a "Chinese text data item" may be defined as a data item that includes, or relates to, an address located within designated Chinese territory, such as the People's Republic of China and/or, optionally, any other territory in which the Chinese language is in common use for ordering and/or shipping (especially regions that use Simplified Chinese characters).

Note that, in addition to Chinese text data items, either database may include items that are not Chinese text data items. For example, the order management system may include data that has no connection with China. Similarly, the second database (for example, in the case that it is some or all of the DPL parties) may include items identifying entities whose addresses are not within the designated Chinese territory. Preferably, in either case, the conversion process converts only the Chinese items of each database, and the comparison determines only whether converted items of the first database correspond to converted items of the second database.

Brief Description of the Drawings

Further advantages and features of the invention will now be discussed in relation to an embodiment of the invention, described by way of example only, with reference to the following drawings:

Fig. 1 is a block diagram showing a method that is an embodiment of the invention;

Fig. 2 is a structural block diagram of a system that is an embodiment of the invention and that implements the method of Fig. 1;

Fig. 3 shows a window generated by the system of Fig. 2, which is used to form Simplified Chinese character addresses from Pinyin characters;

Fig. 4 shows a window generated by the system of Fig. 2, displaying an address stored in the system in Simplified Chinese characters;

Fig. 5, which consists of Figs. 5(a) to 5(c), shows the steps of the method of Fig. 1 for converting Simplified Chinese characters into Pinyin characters;

Fig. 6 shows the database of Pinyin characters generated from the DPL by the method of Fig. 1; and

Fig. 7 shows a window generated by the system of Fig. 2, displaying the result of comparing the two databases.

Detailed Description

Fig. 1 shows the steps of a method according to an embodiment of the invention, which is used to compare the addresses of potential recipients of goods with at least some of the addresses in the DPL. The method is implemented by the system shown in Fig. 2.

The system of Fig. 2 contains an order management system 100, such as the SMARTS system. It includes a database 110 for storing the shipping and/or pick-up addresses of individuals and/or companies that have placed orders or are to receive orders, and a data input device 120 that uses Pinyin characters to enter data into database 110. Only one data input device 120 is shown, but in practice there may be several such units.

The system also includes a second database 130, which stores the English-language DPL.

The system further includes a first conversion unit 140, which converts the Simplified Chinese data items of the first database 110 into Pinyin data items so as to form a first Pinyin database 150. This process does not delete database 110.

The system also includes a second conversion unit 160, which converts the English-language data of the second database 130 into Pinyin data items held in a second Pinyin database 170. This process does not delete the second database 130.

Finally, the system includes a comparison unit 180 for comparing the Pinyin items of the first and second Pinyin databases 150, 170, and an output unit 190 for notifying the operator of the system of any matches between items of the first and second Pinyin databases 150, 170. The matches are found by comparison unit 180.

The first two steps of the method of Fig. 1 (that is, the steps above the short dashed line in Fig. 1) are the well-known steps by which data is entered into the first database 110 of order management system 100. Specifically, in step 10 a user, such as a sales representative, uses data input device 120 to enter data such as ordering and shipping addresses into order management system 100. Fig. 3 shows a window generated and displayed to the user for this purpose. In step 20, order management system 100 converts the input data into double-byte Simplified Chinese characters, which form the items of the first database 110. When items are printed out from the first database, for example on shipping and order documents, they appear in Simplified Chinese characters. Fig. 4 shows a window output from the database, presenting ordering and mailing addresses in double-byte Simplified Chinese characters. Note that database 110 may also contain items that are not relevant to the invention. Such items, if they are already in a common language (for example, non-Chinese items), can be compared directly by well-known methods.

In step 30, the ordering and shipping data stored in the first database 110 as double-byte Simplified Chinese characters is converted by the first conversion unit 140 into Pinyin characters, so as to form the items of the first Pinyin database 150.

As noted above, a single Simplified Chinese character may correspond to several sets of Pinyin characters. Therefore, for each Simplified Chinese item of the first database 110, the first conversion unit 140 generates every set of Pinyin characters that can be derived from that item, and each of these sets forms an item in database 150. Nevertheless, we have confirmed that this "simplified" approach does not compromise the integrity of the screening process.

In particular, the conversion performed by conversion unit 140 in step 30 may be implemented using a conversion file, such as the default file of the Simplified Chinese version of the Microsoft Windows 98 operating system. On every PC on which that operating system is installed, the default file can be found at C:\Windows\system\winpyc〇m.

Fig. 5 shows an example of the process of step 30. The address shown in the window of Fig. 4, which belongs to order No. 402211081 in the first database, is reproduced in Fig. 5(a). Fig. 5(b) shows the various ways in which each Simplified Chinese character can be converted into Pinyin. Most characters have only one [...]

[...] address. For reference, the corresponding Simplified Chinese addresses are shown in the right-hand column of Fig. 6, although the invention does not require this column to be generated.

Although in principle it would be possible to convert every item of the DPL into Pinyin, the present embodiment converts only the addresses of the Chinese items of the DPL. In this context, "Chinese" items may be defined as items whose addresses are within the People's Republic of China and, optionally, other territories. This "simplified" approach greatly reduces the number of conversions, and therefore also the number of subsequent comparisons. In general, because the screening process is based on addresses, and the nature of an address is not "transformed", this approach does not reduce the integrity of the screening.

In step 50, the first and second Pinyin databases 150, 170 are compared to determine whether they contain matching items. This is done by automatically extracting matches between the Pinyin strings of the first database (for example, the strings shown in Fig. 5(c)) and the Pinyin strings of the second database (the "Pinyin address" column of Fig. 6).

Fig. 7 shows a window that comparison unit 180 may optionally display to the user, so that the user can decide how matches are to be handled. As shown, a possible match has been found between order No. 402211081 (shown in Figs. 4 and 5 and in the upper half of Fig. 7) and the entity PIN-YIN_4 in the table of Fig. 6 (shown in the lower half of Fig. 7). Note that the entity name in the DPL ("Beijing Aerospace Automatic Control Limited") differs from the name under which the order was placed ("DaLi Furniture (China) Ltd."). The present embodiment found this match on the basis of the address alone. By clicking the appropriate option in the window of Fig. 7 and then clicking "OK", the user can indicate how the match is to be handled.

If desired, step 50 may be carried out by the DPL compliance department of the organization that operates the order management system. The matches may be incorporated into a local DPL list, that is, a list of parties (which may differ from the DPL) with which the organization operating the order management system will not trade, at least not without a screening operation. The local DPL list may subsequently be attached to an export management system for export compliance screening, as well as for the generation of export and shipping documents.

Steps 30 and 40 thus create a common platform (Pinyin), so that compliance screening of the addresses of Chinese orders can be carried out in step 50.

The embodiment may operate in batch mode, in which multiple items of the first database 110 (for example, all the Chinese items of the first database) are converted into Pinyin items one after another (for example, as a continuous sequence) to form database 150, and each converted item of database 150 is then compared with the converted items of the second Pinyin database 170 (for example, one by one).

Alternatively, step 30 may be performed separately for individual items of the first database 110 (for example, each time a new item is added to the first database), and step 50 may be performed for the resulting items of database 150 by comparing each converted item with all the converted items of the second Pinyin database 170. If no match is found, the contents of database 150 may be discarded. In other words, in this variation the first Pinyin database 150 never needs to contain more Pinyin items than are derived from a single Simplified Chinese item of database 110.

The comparison in step 50 may be performed as described above. If any match is found, output unit 190 is used to notify the operator of the system, who can then give the corresponding order special handling.
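The address-only comparison of step 50, in the variation where one order is screened at a time, might look like the following Python sketch. It is a sketch under stated assumptions rather than the behaviour of comparison unit 180: the function names, the sample data and the choice of exact matching after normalization are all hypothetical, since the description says only that matches between Pinyin strings are extracted automatically.

```python
def normalize(pinyin):
    """Lower-case and drop spaces and punctuation (an assumed normalization)."""
    return "".join(ch for ch in pinyin.lower() if ch.isalnum())

def build_index(list_entries):
    """Map each normalized Pinyin address on the list to the entities at that address."""
    index = {}
    for entity, pinyin_address in list_entries:
        index.setdefault(normalize(pinyin_address), []).append(entity)
    return index

def screen_order(order_id, candidates, index):
    """Compare every Pinyin candidate of one order address with the indexed list."""
    matches = []
    for candidate in candidates:
        for entity in index.get(normalize(candidate), []):
            if (order_id, entity) not in matches:
                matches.append((order_id, entity))
    return matches

# Made-up sample data; neither the entity nor the address comes from the patent.
dpl_index = build_index([("Example Entity Ltd", "Chong Qing Lu 1 Hao")])
candidates = ["zhong qing lu 1 hao", "chong qing lu 1 hao"]
print(screen_order("ORDER-1", candidates, dpl_index))
# [('ORDER-1', 'Example Entity Ltd')]
```

Because the candidates for a single order are passed in and then dropped, nothing like a full first Pinyin database has to be kept, which mirrors the variation described above. Entity names play no part in the match, only addresses.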

List of Reference Numerals

10, 20, 30, 40, 50: Steps
100: Order management system
110: Database
120: Data input device
130: Second database
140: First conversion unit
150: First Pinyin database
160: Second conversion unit
170: Second Pinyin database
180: Comparison unit
190: Output unit


Claims (1)

Amended claims of patent application No. 91110251

1. A computer-implemented method for comparing two databases, each database containing Chinese text data items specifying addresses, the method comprising:

for each database, converting any Chinese text data item that is not in a predefined common Chinese-language format into the common format, wherein any item, at least in the first database, that can be converted into the common format in multiple ways is converted into the common format in all of those ways; and

comparing the data items in the common format to identify Chinese text data items in the first database that correspond to Chinese text data items in the second database.

2. The method of claim 1, wherein the common Chinese-language format is Pin Yin characters.

3. The method of claim 1 or 2, wherein the first database is an order management system database and includes data items containing corresponding shipping and/or ordering addresses.

4. The method of claim 1, wherein the Chinese text data items of the first database use Simplified Mandarin characters.

5. The method of claim 1, wherein the second database is a list of third parties published by a second party.

6. The method of claim 1, wherein data items of the second database that do not include an address within at least one designated Chinese territory are not converted into the common format.

7. The method of claim 1, wherein, when the comparison finds a match between Chinese text data items in the common format, the corresponding item of the first database is edited.

8. A computer system for comparing two databases, each database including Chinese text data items specifying addresses, the computer system comprising:

a first conversion unit for converting the Chinese text data items of the first database into a predefined common Chinese-language format, wherein any item of the first database that can be converted into the common format in multiple ways is converted in all of those ways, so as to generate items in the common format;

a second conversion unit for converting the Chinese text data items of the second database into the common Chinese-language format; and

a comparison unit for comparing the converted data items to identify Chinese text data items in the first database that correspond to Chinese text data items in the second database.

9. The system of claim 8, wherein the common Chinese-language format is Pinyin characters.

10. The system of claim 8 or 9, wherein the first database is an order management system database and includes Chinese text data items containing corresponding shipping and/or ordering addresses.

11. The system of claim 8, wherein the Chinese text data items of the first database use Simplified Chinese characters.

12. The system of claim 8, wherein the second database is a list of third parties published by a second party.

13. The system of claim 8, wherein the second conversion unit does not convert data items of the second database that do not include an address within at least one designated Chinese territory.

14. The system of claim 8, wherein, on detecting a match between Chinese text data items in the common format, the comparison unit operates an output device to send a signal, so that the corresponding item of the first database is edited.

15. The system of claim 8, wherein, on detecting a match between Chinese text data items in the common format, the comparison unit edits the corresponding item of the first database.
TW91110251A 2002-01-17 2002-05-16 Method and systems for screening Chinese address data TWI237775B (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CNB021017360A CN100442275C (en) 2002-01-17 2002-01-17 Method and system for indentifying Chinese address data
US10/080,815 US7065484B2 (en) 2002-01-17 2002-02-22 Method and systems for screening Chinese address data

Publications (1)

Publication Number Publication Date
TWI237775B (en) 2005-08-11

Family

ID=36929934

Family Applications (1)

Application Number Title Priority Date Filing Date
TW91110251A TWI237775B (en) 2002-01-17 2002-05-16 Method and systems for screening Chinese address data

Country Status (1)

Country Link
TW (1) TWI237775B (en)

Legal Events

Date Code Title Description
MK4A Expiration of patent term of an invention patent