JPH11288416A - Method and device for excluding error of automatic connection of different kind of data having address information - Google Patents

Method and device for excluding error of automatic connection of different kind of data having address information

Info

Publication number
JPH11288416A
Authority
JP
Japan
Prior art keywords
result
combining
connection
processing
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
JP10088338A
Other languages
Japanese (ja)
Other versions
JP3495253B2 (en)
Inventor
Tsuneo Yasuda
恒雄 安田
Hideyuki Tsuchiya
秀幸 土屋
Ayafumi Nunobiki
純史 布引
Kensaku Fujii
憲作 藤井
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nippon Telegraph and Telephone Corp
Original Assignee
Nippon Telegraph and Telephone Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nippon Telegraph and Telephone Corp
Priority to JP08833898A
Publication of JPH11288416A
Application granted
Publication of JP3495253B2
Anticipated expiration
Status: Expired - Fee Related

Abstract

PROBLEM TO BE SOLVED: To improve the reliability of connection by detecting and eliminating errors in automatic connection results with as few man-hours as possible when a computer automatically finds and connects the data of the same resident across two kinds of data having address information (town name, block number, house number, care-of line, resident name, and the like). SOLUTION: Two kinds of data 101 and 102 to be connected are connected by plural kinds of connection processing systems that differ in their connection evaluation methods (110 and 120). The connection results are collated with each other and the results that do not coincide are extracted (130), and erroneous connections are eliminated by treating the extracted results as the objects of the erroneous-connection judgment (140). The connection results of the plural connection processing systems, after the erroneous connections have been eliminated, are then combined to prepare a connected database 200 (150).

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to a method and an apparatus for efficiently detecting and eliminating the combination errors that arise when a computer automatically combines two types of data containing address-related information (town and chome, block number, house number, care-of line, resident name, etc.), that is, when it uses the address-related information to find and link (combine) the records of the same resident efficiently and without manual work.

[0002]

[Prior Art] For example, in order to combine customer data with detailed residential map data that holds resident information for each building, so that a building on the map can be identified and displayed from customer information such as a telephone number, different types of data must be combined on the basis of address information.

[0003] However, the address information in heterogeneous data sets that were created separately contains notational fluctuation, incompleteness, and inaccuracy in each data set. For example, when a telephone directory database (DB) and a commercially available digitized residential map DB are combined by exact match, only about 40% of the records are combined automatically (Chapter 6 of Reference 1). If the remaining data were combined manually, then combining the customer information listed in the classified telephone directory for the 23 wards of Tokyo (about one million entries) with the map, for example, would require an enormous number of man-hours.

[0004] [Reference 1]: Yasuda, Matsumura, Mizumachi, Karasawa, "Map guidance system using telephone and FAX," Proceedings of the 3rd Functional Graphic Information System Symposium, 1992/4, pp. 81-86.

Notational fluctuation in address information means, for example, that a name may be written in kanji or in katakana, that "株式会社〜" (corporation) may be written out in full or abbreviated as "(株)〜", and that text may be written in full-width or half-width characters. There are various other kinds of fluctuation as well.
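Two of the fluctuations listed above, character width and the "(株)" abbreviation, can be reduced mechanically before any comparison. The following is a minimal sketch only, not a step described in the patent: the function name `normalize_name` and the abbreviation table are hypothetical, and kanji versus katakana variation is deliberately left unhandled because it needs reading information.

```python
import unicodedata

# Hypothetical abbreviation table; only the "(株)" example comes from the text.
ABBREVIATIONS = {"(株)": "株式会社"}


def normalize_name(name: str) -> str:
    """Reduce width and abbreviation fluctuation in a name string."""
    # NFKC folds full-width ASCII to half-width and
    # half-width katakana to full-width katakana.
    text = unicodedata.normalize("NFKC", name)
    # Expand known abbreviations to their full form.
    for short, full in ABBREVIATIONS.items():
        text = text.replace(short, full)
    return text
```

Normalizing both data sets the same way makes an exact comparison less sensitive to width and abbreviation differences, but it does not remove the need for the tolerant matching methods discussed next.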

[0005] For this reason, the following method has been proposed for raising the accuracy of automatic combination by computer (Reference 1 above). The address information of the two DBs (town and chome, block number, house number) is not required to match strictly and is used only to narrow down the candidate records for comparison. The emphasis is placed on comparing resident information: morphological analysis by Japanese language processing is applied to the resident name, the proper-noun information that makes up the name is extracted, and the degree of match is judged. This method raised the automatic combination rate by computer to about 80% or more and made it possible to reduce manual work substantially.

[0006] There is also a method, described in Japanese Patent Application No. Hei 10-21617, "Method and apparatus for combination processing of heterogeneous data using address information," that evaluates the degree of character match according to the degree of address match and then combines the records. By varying the criterion for the character match of the resident name information according to the degree of address match, this method enables automatic combination based on fine-grained evaluation. Moreover, it treats a character string not as a set of words but simply as a set of characters and looks only at the presence of common characters. It is therefore robust against notational fluctuation, and because it needs no Japanese language processing it can be implemented as very simple processing.
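The character-set comparison with an address-dependent criterion can be sketched as follows. This is an illustration under stated assumptions, not the method of the cited application: the address levels, the threshold values, and the function names are all hypothetical, and the ratio is taken over the distinct characters of the first name.

```python
def char_match_ratio(name_a: str, name_b: str) -> float:
    """Fraction of the distinct characters of name_a that also occur in name_b."""
    chars_a = set(name_a)
    if not chars_a:
        return 0.0
    return len(chars_a & set(name_b)) / len(chars_a)


# Hypothetical criteria: the better the addresses agree,
# the lower the character match that is required.
THRESHOLDS = {"exact": 0.5, "same_block": 0.7, "nearby": 0.75}


def is_match(name_a: str, name_b: str, address_level: str) -> bool:
    """Judge a name pair against the criterion for the given address level."""
    return char_match_ratio(name_a, name_b) >= THRESHOLDS[address_level]
```

With these assumed thresholds, "田中医院" and "中井医院" share three of four characters (ratio 0.75) and would be accepted even for a merely nearby address, which shows how a purely character-based criterion can accept a wrong pair.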

[0007]

[Problems to Be Solved by the Invention] However, with any method that tolerates ambiguity in order to raise the combination rate, combination errors are unavoidable precisely because the combination is automatic. For example, "AA自動車レンタカー" (AA Motors Rent-a-Car) and "AAカーショップ" (AA Car Shop) may be regarded as the same and combined. A user in particular may notice the error only after looking at a map combined on the basis of this information and actually travelling to the site. Such situations should be eliminated from a service as far as possible, but conventionally the only way to eliminate these errors was to review the validity of the entire set of combination results manually, which, depending on the number of records, requires an unrealistically large number of man-hours.

[0008] An object of the present invention is to provide a means of improving the reliability of combination by detecting and eliminating errors in automatic combination results with as few man-hours as possible.

[0009]

[Means for Solving the Problems] FIG. 1 is a block diagram showing the main configuration of the present invention. The processing device 100 consists of a CPU and memory and comprises the following processing means: a plurality of combination processing units 110 and 120, a result matching classification unit 130, an erroneous result elimination unit 140, and a combination result merging unit 150.

[0010] The device has a plurality of types of combination processing units 110 and 120, which differ in the method used to combine attribute data for the same building on the basis of the address information (address, resident name, etc.) of the two databases (DBs) DBa 101 and DBb 102. Each unit executes its combination processing, after which processing passes to the result matching classification unit 130, which compares the combination results and classifies them into those that disagree and those that agree. Results that disagree are sent to the erroneous result elimination unit 140, where a person uses the display device 160 and the input device 170 to decide which of the two results to adopt, or whether to adopt neither. Erroneous combinations are thereby eliminated, and for records judged to be combined successfully the results of the two units are merged.

[0011] The combination result merging unit 150 puts this result together with the results that agreed in the processing of the result matching classification unit 130 and generates the final combined DB 200. The operation of the present invention is as follows.

[0012] In the processing device 100 configured as shown in FIG. 1, two types of combination processing units 110 and 120 that use different methods, for example one focusing on proper nouns and the other focusing only on the degree of character match, perform the processing, and their results are compared. For records whose name or address fluctuates widely and for which automatic combination is therefore prone to fail, each method tends to make its own characteristic combination errors. By having a person check only the records on which the two results disagree, errors in the automatic combination results can be detected very efficiently.

[0013] Although two types of combination processing units 110 and 120 are shown here, three or more types may of course be used.

[0014]

[Embodiments of the Invention] Next, an embodiment of the present invention will be described with reference to the drawings. FIG. 2 shows a concrete outline of the processing flow in the processing device of the present invention shown in FIG. 1.

[0015] As shown in FIG. 2, DBa 101 and DBb 102, the targets of the combination processing, each hold an address and other information for each name. The combination processing units 110 and 120 each access DBa 101 and DBb 102, read the records, perform combination processing by their own method, and output intermediate combination result a 103 and intermediate combination result b 104, respectively. As shown in FIG. 2, an intermediate combination result holds the combination result together with the names and addresses from both source databases, DBa 101 and DBb 102. The combination flag in this data is "1" when combination succeeded and "0" when it failed. When combination failed, the intermediate combination result contains only the DBa 101 information and no DBb 102 information.
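One way to picture an intermediate combination result record is the sketch below. The field names and the class name `IntermediateResult` are hypothetical, since the text describes the contents of the record (names, addresses, combination flag) but not its layout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IntermediateResult:
    """One record of an intermediate combination result (illustrative layout)."""
    a_name: str                      # name from DBa
    a_address: str                   # address from DBa
    flag: int                        # 1 = combination succeeded, 0 = failed
    b_name: Optional[str] = None     # name from DBb, present only on success
    b_address: Optional[str] = None  # address from DBb, present only on success

    def __post_init__(self) -> None:
        if self.flag not in (0, 1):
            raise ValueError("flag must be 0 or 1")
        if self.flag == 1 and (self.b_name is None or self.b_address is None):
            raise ValueError("a successful record needs the DBb information")
        if self.flag == 0 and (self.b_name is not None or self.b_address is not None):
            raise ValueError("a failed record must not carry DBb information")
```

The validation in `__post_init__` encodes the rule stated above: DBb information is present exactly when the flag is "1".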

[0016] The result matching classification unit 130 consists of a result comparison unit 131 and storage means for a combination result mismatch data group 105, a combination result match data group 106, and a combination failure data group 107. For intermediate combination results a 103 and b 104, the DBb 102 information combined with each DBa 101 record is compared record by record. If the combination results disagree, the record is stored in the combination result mismatch data group 105; if they agree, it is stored in the combination result match data group 106; and if both units failed to combine, it is stored in the combination failure data group 107.
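The three-way classification can be sketched as below. This is a simplified illustration: each intermediate result is reduced to a mapping from a DBa record key to the DBb record key it was combined with (`None` for a failed combination), and the function name `classify` is hypothetical. Treating a record combined by only one method as a mismatch is an assumption, chosen because the example figures later in the document count such records among the disagreements.

```python
from typing import Dict, Hashable, List, Optional, Tuple

Result = Dict[Hashable, Optional[Hashable]]


def classify(result_a: Result, result_b: Result) -> Tuple[List, List, List]:
    """Split DBa keys into (mismatched, matched, failed) groups."""
    mismatched: List = []
    matched: List = []
    failed: List = []
    for key, target_a in result_a.items():
        target_b = result_b.get(key)
        if target_a is None and target_b is None:
            failed.append(key)       # both methods failed to combine
        elif target_a == target_b:
            matched.append(key)      # both methods chose the same DBb record
        else:
            mismatched.append(key)   # different targets, or only one succeeded
    return mismatched, matched, failed
```

Only the first group needs human review; the second can go straight into the combined result.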

[0017] The erroneous result elimination unit 140 consists of a mismatch result selection unit 141 and storage means for a combination approval data group 108 and a combination denial data group 109. The mismatch result selection unit 141 displays the data read from the combination result mismatch data group 105 on a display device 160 such as a monitor. A person looks at the two combination results and the corresponding address and name information, judges the validity of each combination, decides whether one of them is acceptable (OK) or both are unacceptable (NG), and enters the decision through an input device 170 such as a keyboard. The mismatch result selection unit 141 sorts the record according to that decision: if one result is OK it is stored in the combination approval data group 108, and if both are NG it is stored in the combination denial data group 109.
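The sorting step performed after the reviewer's decision might look like the following sketch. The decision codes "A", "B", and "NG", the function name `route_decision`, and the use of dictionaries for the two data groups are assumptions made for illustration; the text specifies only that one accepted result goes to the approval group and a double rejection goes to the denial group.

```python
from typing import Dict, Optional, Tuple

Candidate = Optional[str]


def route_decision(
    key: str,
    cand_a: Candidate,
    cand_b: Candidate,
    decision: str,
    approved: Dict[str, str],
    denied: Dict[str, Tuple[Candidate, Candidate]],
) -> None:
    """Store one reviewed record in the approval or the denial group."""
    if decision == "A" and cand_a is not None:
        approved[key] = cand_a             # result of process A judged OK
    elif decision == "B" and cand_b is not None:
        approved[key] = cand_b             # result of process B judged OK
    elif decision == "NG":
        denied[key] = (cand_a, cand_b)     # both results judged NG
    else:
        raise ValueError(f"invalid decision {decision!r} for record {key!r}")
```

Keeping both rejected candidates in the denial group preserves a record of what was reviewed.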

[0018] FIG. 3 shows an example in which process A of the combination processing unit 110 and process B of the combination processing unit 120 produced different results and the reviewer adopted neither, judging the combination to have failed. Records are written in the form [(address) block number - lot number / name]. For the DBa 101 record [5-2 / 田中医院 (Tanaka Clinic)], process A of the combination processing unit 110 could not find a matching DBb 102 record at the same address, so it looked at the surrounding area, found [6-10 / 田中一郎 (Tanaka Ichiro)], and accepted it as the combination target because the important proper noun "田中" (Tanaka) matched.

[0019] Process B of the combination processing unit 120 likewise could not find a matching record at the same address, so it searched the surrounding area and found [5-12 / 中井医院 (Nakai Clinic)]. Because three of the four characters of 田中医院 matched, it judged the degree of match to be high and accepted the record as the combination target.

[0020] These results are stored in intermediate combination result a 103 and intermediate combination result b 104, respectively. The combination flag is "1" in both, indicating successful combination. The result matching classification unit 130 compares intermediate combination results a 103 and b 104 in the result comparison unit 131 and, because they do not agree, stores the record in the combination result mismatch data group 105.

[0021] The mismatch result selection unit 141 of the erroneous result elimination unit 140 displays these combination results, read from the combination result mismatch data group 105, on the display device 160 so that it can be determined whether they are erroneous. In this example, the person judging the results looked at both combination results, decided that combining with either one was too risky, and selected combination failure, so neither result was adopted. The result is therefore stored in the combination denial data group 109.

[0022] The combination result merging unit 150 compiles the results from the output of the erroneous result elimination unit 140 and the data groups of the result matching classification unit 130 and creates the combined DB 200.
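A minimal sketch of the final merge, assuming the matched results and the reviewer-approved results are each held as a mapping from a DBa key to the chosen DBb key. The function name `build_combined_db` is hypothetical, and how denied or failed records are represented in the combined DB is not specified in the text, so they are simply left out here.

```python
from typing import Dict


def build_combined_db(matched: Dict[str, str], approved: Dict[str, str]) -> Dict[str, str]:
    """Merge automatically agreed links with reviewer-approved links."""
    overlap = matched.keys() & approved.keys()
    if overlap:
        # A record cannot be both auto-matched and manually reviewed.
        raise ValueError(f"records present in both groups: {sorted(overlap)}")
    combined = dict(matched)
    combined.update(approved)
    return combined
```

The overlap check guards against a record reaching the merge through both paths, which would indicate a classification bug upstream.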

[0023]

[Example] Specifically, the telephone directory DB for the 23 wards of Tokyo (about one million entries) was combined with a residential map DB. When automatic combination was performed with the proper-noun comparison method based on Japanese language processing (process A) described in Reference 1 above (Yasuda, Matsumura, Mizumachi, Karasawa, "Map guidance system using telephone and FAX," Proceedings of the 3rd Functional Graphic Information System Symposium, 1992/4, pp. 81-86), the combination success rate was about 85% and the failure rate about 15%, and the erroneous combination rate estimated by sampling was 5% (combination reliability 95%).

[0024] In contrast, combination processing by the method proposed in Japanese Patent Application No. Hei 10-21617 (process B) gave about 90% success and about 10% failure. Comparing the two sets of results showed that the two methods agreed on about 75% of the records and disagreed on 20% (about 5% combined only by process A, about 10% combined only by process B, and 5% combined by both but with different results), while 5% were left uncombined.
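The reported percentages are mutually consistent, as the short check below shows. The variable names are illustrative, and the figures are the approximate values quoted in the text, treated here as exact for the arithmetic.

```python
# Shares of all records, in percent, as quoted in the text (approximate values).
both_agree = 75    # both methods combined with the same record
only_a = 5         # combined by process A only
only_b = 10        # combined by process B only
both_differ = 5    # combined by both, but with different records
neither = 5        # combined by neither method

total = both_agree + only_a + only_b + both_differ + neither
success_a = both_agree + only_a + both_differ   # records process A combined
success_b = both_agree + only_b + both_differ   # records process B combined
disagree = only_a + only_b + both_differ        # records where the results differ

print(total, success_a, success_b, disagree)
```

The categories add up to 100%, the per-method totals reproduce the 85% and 90% success rates, and the three disagreement categories add up to the stated 20%.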

[0025] Furthermore, only the records on which the two results disagreed were checked manually. As a result, for 2% one of the two methods could be judged correct and the record was combined successfully by selection, and 3% (60% of all errors) could be judged to be combination errors and eliminated.

[0026] If, for example, only process A were used, then to improve overall reliability by 3%, and assuming that the errors are distributed uniformly over the one million records, 600,000 combination results would have to be checked manually to eliminate the erroneous combinations. Moreover, because the reviewer sees only a single combination result, judging whether it is wrong requires consulting a residential map or similar source for information on the surrounding area and deciding whether a more suitable candidate exists. In the example of the 23 wards of Tokyo, the throughput was about 200 records per person-day.

[0027] In contrast, in the example in which the method of the present invention was used and only the records on which the two methods' results disagreed (5%) were checked manually, only 50,000 records (5%) had to be checked. In addition, the reviewer simply compared the results of the two methods and rejected only the records for which combining with either result was judged inappropriate (high risk of erroneous combination), which raised throughput to 3,000 records per person-day. As a result, the man-hours needed to eliminate 60% of all erroneous combinations were reduced to 1/180.
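The 1/180 figure follows directly from the record counts and throughputs given in the text. The check below uses exact fractions to avoid rounding; the variable names are illustrative.

```python
from fractions import Fraction

# Single-method review (process A only): figures from the text.
single_checked = 600_000   # records to check for a 3-point reliability gain
single_rate = 200          # records per person-day

# Two-method review: only the mismatched records are checked.
dual_checked = 50_000      # 5% of one million records
dual_rate = 3_000          # records per person-day

single_days = Fraction(single_checked, single_rate)   # 3000 person-days
dual_days = Fraction(dual_checked, dual_rate)         # 50/3, about 16.7 person-days
ratio = single_days / dual_days

print(single_days, float(dual_days), ratio)
```

The reduction comes from two independent factors: 12 times fewer records to check and 15 times higher throughput per reviewer, giving 12 × 15 = 180.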

[0028]

[Effects of the Invention] As described above, according to the present invention, checking only the portions where the results of a plurality of combination processing methods disagree makes it possible to eliminate automatic combination errors very efficiently and with few man-hours, and to raise the reliability of the combination results as a whole.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the main configuration of the present invention.

FIG. 2 is a diagram showing a concrete outline of the processing flow in the processing device of the present invention shown in FIG. 1.

FIG. 3 is a diagram showing an example in which the combination results of a plurality of combination processing units (process A and process B) differed and the reviewer adopted neither, treating the combination as failed.

[Explanation of symbols]

101 Database to be combined (DBa)
102 Database to be combined (DBb)
110 Combination processing unit (process A)
120 Combination processing unit (process B)
130 Result matching classification unit
140 Erroneous result elimination unit
150 Combination result merging unit
160 Display device
170 Input device
200 Combined database (DB)

Continuation of front page: (72) Inventor Kensaku Fujii, c/o Nippon Telegraph and Telephone Corporation, 3-19-2 Nishi-Shinjuku, Shinjuku-ku, Tokyo

Claims (4)

[Claims]

1. A method for eliminating errors in the automatic combination of heterogeneous data having address information, in a processing method in which a computer links two different types of data having address information, the method comprising: a step of combining the two types of data to be combined by a plurality of types of combination processing methods that differ in how they evaluate a combination; a step of comparing the combination results of the plurality of types of combination processing methods, extracting the results that disagree, and eliminating erroneous combinations by treating the extracted results as the objects of judgment as to whether a combination is erroneous; and a step of merging the combination results of the plurality of types of combination processing methods, from which the erroneous combinations have been eliminated, to obtain the desired combination result.
2. The method for eliminating errors in the automatic combination of heterogeneous data having address information according to claim 1, wherein the plurality of types of combination processing methods include a first method that applies Japanese language processing to elements of the address information such as resident names and building names, evaluates important words, and combines the data, and a second method that evaluates the degree of character match according to the degree of address match.
3. An error elimination processing apparatus for the automatic combination of heterogeneous data having address information, in a processing apparatus in which a computer links two different types of data having address information, the apparatus comprising: a plurality of combination processing units that combine the two types of data to be combined, each by a combination processing method that differs in how it evaluates a combination; a result matching classification unit that compares the combination results of the plurality of combination processing units and extracts the results that disagree; an erroneous result elimination unit that has an input/output interface for eliminating erroneous combinations, treating the results extracted by the result matching classification unit as the objects of judgment as to whether a combination is erroneous; and a combination result merging unit that merges the combination results of the plurality of combination processing units, from which the erroneous combinations have been eliminated, to obtain the desired combination result.
4. The error elimination processing apparatus for the automatic combination of heterogeneous data having address information according to claim 3, wherein the plurality of types of combination processing units include a first combination processing unit that applies Japanese language processing to elements of the address information such as resident names and building names, evaluates important words, and combines the data, and a second combination processing unit that evaluates the degree of character match according to the degree of address match.
JP08833898A 1998-04-01 1998-04-01 Error elimination method for automatic combination of heterogeneous data having address information and its processing apparatus Expired - Fee Related JP3495253B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP08833898A JP3495253B2 (en) 1998-04-01 1998-04-01 Error elimination method for automatic combination of heterogeneous data having address information and its processing apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
JP08833898A JP3495253B2 (en) 1998-04-01 1998-04-01 Error elimination method for automatic combination of heterogeneous data having address information and its processing apparatus

Publications (2)

Publication Number Publication Date
JPH11288416A true JPH11288416A (en) 1999-10-19
JP3495253B2 JP3495253B2 (en) 2004-02-09

Family

ID=13940087

Family Applications (1)

Application Number Title Priority Date Filing Date
JP08833898A Expired - Fee Related JP3495253B2 (en) 1998-04-01 1998-04-01 Error elimination method for automatic combination of heterogeneous data having address information and its processing apparatus

Country Status (1)

Country Link
JP (1) JP3495253B2 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2008122183A (en) * 2006-11-10 2008-05-29 Denso Corp Facility information processing apparatus and program

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH0962700A (en) * 1995-08-29 1997-03-07 Nippon Telegr & Teleph Corp <Ntt> Method and device for constructing dictionary
JPH09259141A (en) * 1996-03-26 1997-10-03 Hitachi Software Eng Co Ltd Map data linkage system

Also Published As

Publication number Publication date
JP3495253B2 (en) 2004-02-09

Similar Documents

Publication Publication Date Title
US9792324B2 (en) Method and system for uniquely identifying a person to the exclusion of all others
US6185583B1 (en) Parallel rule-based processing of forms
US8620937B2 (en) Real time data warehousing
JP3201945B2 (en) How to compare database tables
KR100850255B1 (en) Real time data warehousing
US7324998B2 (en) Document search methods and systems
US7664343B2 (en) Modified Levenshtein distance algorithm for coding
US20030182296A1 (en) Association candidate generating apparatus and method, association-establishing system, and computer-readable medium recording an association candidate generating program therein
US6694459B1 (en) Method and apparatus for testing a data retrieval system
CN111680110B (en) Data processing method, data processing device, BI system and medium
JP3495253B2 (en) Error elimination method for automatic combination of heterogeneous data having address information and its processing apparatus
JP2921522B1 (en) Database combining method and apparatus, and storage medium storing database combining program
JP3517345B2 (en) METHOD AND APPARATUS FOR JOINT PROCESSING OF DIFFERENT DATA WITH ADDRESS INFORMATION
US20070217679A1 (en) Abnormal pattern detection program for function call in source program
JP2002183441A (en) Method and device for confirming identity
CN113626558A (en) Intelligent recommendation-based field standardization method and system
CN110874326A (en) Test case generation method and device, computer equipment and storage medium
CN116846741B (en) Alarm convergence method, device, equipment and storage medium
CN113342816B (en) Catalog reporting method and device
WO2022180815A1 (en) Information processing program, information processing method, and information processing device
JP3057090B2 (en) Software component search method and software component search device
JP2001312419A (en) Software overlap degree evaluating device and recording medium with recorded software overlap degree evaluating program
CN114023445A (en) Automatic preliminary diagnosis system and method based on medical diagnosis database
JPH0243680A (en) Client credit information retrieving system
JP2000339403A (en) Method and device for recognizing personal information writedown medium, and recording medium

Legal Events

Date Code Title Description
FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20071121

Year of fee payment: 4

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20081121

Year of fee payment: 5

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20091121

Year of fee payment: 6

FPAY Renewal fee payment (event date is renewal date of database)

Free format text: PAYMENT UNTIL: 20101121

Year of fee payment: 7

LAPS Cancellation because of no payment of annual fees