JP2004086782A

JP2004086782A - Apparatus for supporting integration of heterogeneous database

Info

Publication number: JP2004086782A
Application number: JP2002249889A
Authority: JP
Inventors: Hiroshi Seki; 関　　　洋; Hiroki Sano; 佐野　広樹; Yasuo Yoshinari; 吉成　康男; Hiroyuki Yuji; 湯地　弘幸
Original assignee: Hitachi Ltd
Current assignee: Hitachi Ltd
Priority date: 2002-08-29
Filing date: 2002-08-29
Publication date: 2004-03-18

Abstract

PROBLEM TO BE SOLVED: To provide an apparatus that supports quickly associating the attribute items of tables held in a plurality of different databases. SOLUTION: A comparison mapping model generation apparatus 1200 for design databases compares the attribute values of tables belonging to two different databases 110c and 120c. The apparatus 1200 then generates a mapping model that associates the attribute items of the tables, starting from the pairs whose attribute values show the highest matching rate. COPYRIGHT: (C)2004,JPO

Description

【０００１】
【発明の属する技術分野】
本発明はプラントの設計・運転・保守などの分業化された分散設計業務の、より効率的な情報伝達と共有を支援するための異種データベース統合支援装置に関する。
【０００２】
【従来の技術】
従来技術として特開平１１−２８２８７８号公報に、複数のデータベース（以下、ＤＢともいう。）から関連付けの推定を行いつつ複数のデータベースの検索を実施する関連情報検索装置が開示されている。この従来技術の例は様々な外部データを利用した関連付け「推定」結果に基づいて検索し、検索内容に基づいて、「連想値」にしたがった関連を学習しつつ、データベース間の関連性に関するモデル化を実施していくものである。
【０００３】
【発明が解決しようとする課題】
コンカレント（同時並行）設計業務の複数の部署のデータベース整合性管理に関わるものであり、従来技術の例は複数ＤＢから関連付けの推定を行いつつ検索を実施する。したがって、従来技術の例は様々な外部データを利用した関連付け「推定」結果に基づくデータベース整合性管理を行う。このように従来技術の例は、データベース全体を分析して、属性値の一致数などに基づいて、ボトムアップにデータベース間の関係を作り上げるものでないから、信頼性の高いモデルを作り上げることが困難である。
【０００４】
コンカレント設計業務における設計データの整合性管理を実施するには、まず、最初に、コンカレント設計業務における別々の設計分野の異種ＤＢの対応関係モデル作成時にデータベース間の完全な対応関係を作り上げるものになっていなければならないが、従来技術の例では、ボトムアップにデータベース間の関係を作り上げるものでないから、異種データベース間で同じ値であるべきデータの整合性を管理する統合支援に従来技術の例を利用することが出来ない。
【０００５】
したがって、本発明の目的は、異種データベース間で同じ値であるべきデータの整合性を管理する統合支援に利用する異種データベース統合支援装置を提供することにある。
【０００６】
【課題を解決するための手段】
本発明は複数の異なるデータベースに属するテーブルの属性が有する属性値同士を比較することで、属性値同士の比較結果に基づいて、各テーブル内の属性項目同士を対応付けるマッピング・モデルを生成する手段を備えている。その比較結果を出す際には、前述の属性値同士を比較して一致度が高いものから各テーブル内の属性項目同士を対応付けるマッピング・モデルを生成する手段を有することが好ましい。
【０００７】
【発明の実施の形態】
原子力・火力などのプラントの設計業務は通常は分業化されており、いくつかの段階に分かれて設計業務が進行する。設計の初めの段階は「上流」と呼ばれ、後の段階は「下流」としばしば呼ばれる。原子力，火力などのプラントの設計部署においては、系統設計，配管三次元設計，機器設計などに関わる各データベースに設計結果を格納し、管理している。これらのデータベースは、データ量が膨大で、業務の性格上、各データベースにそれぞれのデータがコンカレントに蓄積されていく。すなわち、上流側の設計を待たずに下流側の設計がスタートしてしまうため、単にデータベース間のリンクを張るだけではデータベースの整合性管理ができない。これらの業務はコンカレントに進むものであり、上流の設計結果を待たずに、値の仮決めにより下流側の設計もスタートしている。
【０００８】
図２は一般的なプラント設計業務の種類と業務間の係わりを示すブロック図である。ここでは例として上流から下流に向けて、プラントの系統設計１０，配管設計２０，機器設計３０などの順番で業務の成果が移行する場合を示している。この図のようにその成果が順次流れれば、これらの業務の設計結果に関して差異はなく統一が保たれる。しかしながら、現実的にはシステムが膨大であること、全体の設計にかけられる時間を多くとることができないこと、業務に関わる人数が多いこと、などの理由から、ほとんどの場合コンカレントに仕事が進められる。
【０００９】
例えば、図２の例で言えば、系統設計の部門１０では配管計装線図を作成（１０ａ）し、また機器リストの作成（１０ｂ）などを行い、その結果を系統設計データベース（ＤＢ）１２０ｃに格納する。また配管設計の部門では配管空間３Ｄレイアウトなどを作成（２０ａ）して配管設計データベース（ＤＢ）１１０ｃに格納する。さらに、機器設計の部門では、機器詳細設計（３０ａ），発注品手配用仕様（３０ｂ）などを作成して機器設計データベース（ＤＢ）１３０ｃに格納する。それぞれの業務がコンカレントにデータをデータベースに蓄積しているから、整合性情報１から３、すなわち４００ａから４００ｃを確認し、設計結果の調整を設計者間でとって、ＤＢ間でのデータ転送により、それぞれの業務の成果を他の部署のデータベースに反映することが必須条件となる。
【００１０】
本発明の実施例では、それぞれの業務がコンカレントにデータをデータベースに蓄積している場合において、以下のように、それぞれの業務の成果を他の部署のデータベースに反映することを支援する。本発明の実施の形態に係わる異種データベース統合支援装置の構成について図１を用いて説明する。
【００１１】
本発明の異種データベース統合支援装置は設計データベース群１１００として管理されているデータベース，配管設計データベース１１０ｃ，系統設計データベース１２０ｃ，機器設計データベース１３０ｃなどを利用する。なお、データベースはこの３個以上の複数があってもよい。これらのデータベースに格納されているテーブルおよび属性，属性値を利用して、異なる二つのテーブルの間についての属性のマッピング（対応付け）を、設計データベース間比較マッピング・モデル生成装置１２００を用いて実施する。生成したマッピング・モデルはマッピング・モデル・データベース１３００に格納される。
【００１２】
すなわち配管設計データベース１１０ｃと系統設計データベース１２０ｃとの関係は１１０−１２０で、その逆の関係は１２０−１１０のマッピング・モデルとして格納される。また、系統設計データベース１２０ｃと機器設計データベース１３０ｃとの関係は１２０−１３０で、その逆の関係は１３０−１２０のマッピング・モデルとして格納される。逆の関係があるのはそれぞれのデータベースで利用している情報表現のコード体系の違いや単位系の違いに対応して、情報表現方法を変えるためである。
【００１３】
生成したマッピング・モデルについては管理・編集端末１０００を利用して表示する他に編集操作によってマッピング・モデルの内容を変更することができる。最終的に生成・編集されたマッピング・モデルは差異管理データサーバ６００に実装されているタグ付き文書変換装置３００（図３では３１０，３２０で表現される）で利用され、データベース間の整合性維持状況を表すタグ付き文書を生成する処理に利用される。
【００１４】
図３はデータベース間の整合性情報を管理するシステムの構成を示すものである。図３のデータベース統合システムは既存システムとして、配管設計システム１１０，系統設計システム１２０および機器設計システム１３０を有している場合の実施例である。この例では，配管設計システム１１０は、端末装置１１０ａ、と業務対応アプリケーションとしての配管設計アプリケーション（プログラム）１１０ｂおよび業務対応データベースとしての配管設計データベース１１０ｃから構成されている。
【００１５】
また、系統設計システム１２０は端末装置１２０ａ，業務対応アプリケーションとしての系統設計アプリケーション（プログラム）１２０ｂおよび業務対応ＤＢとしての系統設計データベース１２０ｃから構成されている。また、機器設計システム１３０は端末装置１３０ａ，業務対応アプリケーションとしての機器設計アプリケーション（プログラム）１３０ｂおよび業務対応ＤＢとしての機器設計データベース１３０ｃから構成されている。
【００１６】
各システムのデータベース１１０ｃ，１２０ｃおよび１３０ｃに、マッピング・モデル・データベース１３００に格納されているマッピング・モデル１１０−１２０および１２０−１３０を割り当てることによって、系統設計データベース１２０ｃを中心にして、配管設計データベース１１０ｃと機器設計データベース１３０ｃを対応付ける環境を作っている。
【００１７】
また、１１０ｃ，１２０ｃおよび１３０ｃの各データベースにおける変更検知処理装置２１０，２２０，２３０を通じて、データベースに変更があったかどうかを検知する。この変更検知結果およびデータベースの変更部分のデータについての信号２１０（ｉ），２２０（ｉ），２２０（ｉｉ），２３０（ｉ）を入力データとする差異データ管理サーバ６００で、タグ付き文書によってデータベース間の整合状態を管理する。
【００１８】
ここで、タグとはデータに対して意味付けをする情報、すなわち属性名称などから作った記述子のことを言う。差異データ管理サーバ６００では信号２１０（ｉ）または信号２２０（ｉ）のいずれかの信号が発生したときに、マッピング・モデル１１０−１２０を利用してタグ付き文書変換処理装置３１０が起動し、系統設計データベース−配管設計データベース１２０ｃと１１０ｃの整合性維持状況を表すタグ付き文書を生成し、タグ付き文書蓄積ファイルシステム４１０に蓄積する。タグ付き文書蓄積ファイルシステム４１０は系統設計−配管設計のデータベース間で対応状況に差異を検出した場合に、各業務アプリケーション１１０ｂ，１２０ｂにデータ修正要求４１０（ｉ）と４１０（ｉｉ）を出力する。
【００１９】
同様に、信号２２０（ｉｉ）または信号２３０（ｉ）のいずれかの信号が発生したときに、マッピング・モデル１２０−１３０を利用してタグ付き文書変換処理装置３２０が起動し、系統設計データベース−機器設計データベース１２０ｃと１３０ｃの整合性維持状況を表すタグ付き文書を生成し、タグ付き文書蓄積ファイルシステム４２０に蓄積する。タグ付き文書蓄積ファイルシステム４２０は系統設計−配管設計のデータベース間で対応状況に差異を検出した場合に各業務アプリケーション１２０ｂ，１３０ｂにデータ修正要求４２０（ｉ）と４２０（ｉｉ）を出力し、系統設計アプリケーション・プログラム１２０ｂあるいは機器設計アプリケーション・プログラム１３０ｂにより設計データの修正を行う。
【００２０】
また、既存システムとは別にネットワークに接続された端末装置７００と、これに実装されているブラウザにより、ユーザからの表示要求を表示要求受付装置５００で処理する。この際、表示要求受付装置５００が必要とするタグ付き文書はタグ付き文書蓄積ファイルシステム４１０または４２０から選択し、表示形態も予め用意したスタイル設定ファイル５１０から選択して、表示画面を生成する。
【００２１】
図３の例では系統設計データベース１２０ｃと配管設計データベース１１０ｃ，系統設計データベース１２０ｃと機器設計データベース１３０ｃとの間のデータベース間の差異を管理するシステムを表現しているが、配管設計データベース１１０ｃと機器設計データベース１３０ｃとの間の差異関係も同様の構成で確認できるように構成することができる。
【００２２】
図４は系統設計データベースと配管設計データベースの対応付けの関係を示す図である。図４（ａ）は実際のデータベース内容の一例を示すものである。例えば、配管設計データベース１１０ｃのテーブルと系統設計データベース１２０ｃのテーブルを対応付けた例である。例えば、配管番号とＬＣ，ＬＤが１１０−１２０−１ｃ，最高使用圧力とＬＧが１１０−１２０−２ｃ，最高使用温度とＬＨが１１０−１２０−３ｃという関係で各行ごとに対応が付けられている。この関係は図４（ｂ）のようなモデルで表されるようなものである。図４（ｂ）は（ａ）の対応関係を図式的に表示したものである。
【００２３】
すなわち、系統設計＿配管設計という内容で比較すべき属性項目集合ＲＯＷが複数行存在する。複数行存在するという情報はＲＯＷの上についている＊で表している。＊がなければ、一つ存在するという意味になる。そしてＲＯＷの構成要素として、配管番号１１０−１２０−３，最高使用圧力１１０−１２０−４，最高使用温度１１０−１２０−５などといった情報がある。さらに、配管番号１１０−１２０−１の下位情報として系統設計側の配管番号、配管設計側のＬＤとＬＣを「−（ハイフン）」で結びつけた内容が対応付けられる、などといった階層構成になっている。この図４（ｂ）で表される対応付けの関係を利用して実際のデータベースの属性値を対応付ける。
【００２４】
図５は配管設計データベース１１０ｃ，系統設計データベース１２０ｃの対応付けをＷ３Ｃ（Ｗｏｒｌｄ　Ｗｉｄｅ　Ｗｅｂ　Ｃｏｎｓｏｒｔｉｕｍ、ｈｔｔｐ：／／ｗｗｗ．ｗ３ｃ．ｏｒｇ）で国際的に標準化されているタグ付き文書規格、すなわちＸＭＬに従う構文であるＸＭＬ　Ｓｃｈｅｍａに従った場合の、マッピング・モデル１１０−１２０の記述例を示すものである。この例では系統設計＿配管設計という名称で最上位の要素を定めて、比較すべき属性項目集合をＲＯＷとして定めると、ＲＯＷの中には配管番号，最高使用圧力などといったものがある。
【００２５】
さらに配管番号として比較する情報は系統設計データベース側の配管番号という属性項目，配管設計データベース側のＬＣ＿ＬＤという属性項目があるといったように階層的に対応関係の情報を展開して記述する。また、系統設計データベース側のテーブル名を＄ｔａｂｌｅＮａｍｅ１、配管設計データベース側のテーブル名を＄ｔａｂｌｅＮａｍｅ２としておくと、実際のテーブル内の属性項目はそれぞれ、＄ｔａｂｌｅＮａｍｅ１．配管番号，ＣＯＮＣＡＴ（＄ｔａｂｌｅＮａｍｅ２．ＬＤ，ＣＯＮＣＡＴ（‘−’，＄ｔａｂｌｅＮａｍｅ２．ＬＣ））で表現するような情報で表される。ここで、ＣＯＮＣＡＴ（‘−’，＄ｔａｂｌｅＮａｍｅ２．ＬＣ）は‘−’と＄ｔａｂｌｅＮａｍｅ２．ＬＣの文字列を結合する関数を意味している。
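As a rough illustration of the expression in paragraph 0025, the following Python sketch builds the comparison key on the piping design side by joining LD and LC with a hyphen, mirroring CONCAT($tableName2.LD, CONCAT('-', $tableName2.LC)). The row layout (plain dictionaries) and the function name `piping_key` are assumptions made only for this example.

```python
def piping_key(row: dict) -> str:
    """Join LD and LC with '-' in the same way as CONCAT(LD, CONCAT('-', LC))."""
    return row["LD"] + "-" + row["LC"]


# Hypothetical rows; the values follow the "RHR-001" example used in the text.
system_row = {"配管番号": "RHR-001"}
piping_row = {"LD": "RHR", "LC": "001"}

# The system design side stores the piping number as a single attribute,
# so the two sides become comparable once LD and LC are concatenated.
keys_match = system_row["配管番号"] == piping_key(piping_row)
```

Here `keys_match` is true because both sides resolve to the same string.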
【００２６】
図６は図５に示すマッピング・モデルを使って生成したタグ付き文書の例を示す。タグ付き文書の一例としてＸＭＬでデータベース間の対応関係を表すと図６のように表され、この対応付けは図４（ｂ）または図５で表すマッピングデータを用いて、図４（ａ）の実際の属性値を用いて実施される。すなわち、＜系統設計＿配管設計＞と＜／系統設計＿配管設計＞というタグで、情報の先頭と最後をはさむことで対応関係全体を表している。
【００２７】
＜系統設計＿配管設計＞の中にはプラントコード，更新時刻，データ数のパラメータとして、それぞれ、“Ａ”，“２００１／１２／１７　１８：１４：５６”，“１００”を表記する。＜系統設計＿配管設計＞＜／系統設計＿配管設計＞のタグの内側に＜ＲＯＷ＞＜／ＲＯＷ＞タグで実際のデータの、対応関係の集合を表す。
【００２８】
＜ＲＯＷ＞タグの中にはｎｕｍ＝“０”といったデータの通し番号を付ける。＜ＲＯＷ＞タグの内側には＜配管番号＞＜／配管番号＞，＜最高使用圧力＞＜／最高使用圧力＞などといった対応付ける属性の意味情報を表す。各タグの中には、例えば＜配管番号＿系統設計＞＜／配管番号＿系統設計＞，＜ＬＣ＿ＬＤ＿配管設計＞＜／ＬＣ＿ＬＤ＿系統設計＞などで属性値をはさんで、実際のデータベースの固有の情報をＲＨＲ−００１というように具体的に記述する。
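The document structure described in paragraphs 0026 to 0028 can be sketched with Python's standard `xml.etree.ElementTree` module. This is only an illustration: the figure itself is not reproduced here, so writing the plant code, update time, and data count as XML attributes, and using ASCII underscores in the element names, are assumptions. The text gives mismatched names for the piping-side opening and closing tags; the sketch uses the opening name.

```python
import xml.etree.ElementTree as ET


def build_tagged_document(pairs, plant_code, updated_at):
    """Build a root element holding one ROW per compared piping number."""
    root = ET.Element("系統設計_配管設計", {
        "プラントコード": plant_code,
        "更新時刻": updated_at,
        "データ数": str(len(pairs)),
    })
    for num, (system_value, piping_value) in enumerate(pairs):
        # Each ROW carries a serial number starting at 0.
        row = ET.SubElement(root, "ROW", {"num": str(num)})
        item = ET.SubElement(row, "配管番号")
        ET.SubElement(item, "配管番号_系統設計").text = system_value
        ET.SubElement(item, "LC_LD_配管設計").text = piping_value
    return root


doc = build_tagged_document([("RHR-001", "RHR-001")], "A", "2001/12/17 18:14:56")
xml_text = ET.tostring(doc, encoding="unicode")
```

Other compared attributes, such as the maximum operating pressure, would be added to each ROW in the same way.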
【００２９】
図７は図６のタグ付き文書を用いて生成した表示画面の一例で、対応関係生成処理により生成されたタグ付き文書を用いて生成した画面の例を示す図である。一般的にタグ付き文書に関しては表示変換のためのプログラムが公開・市販されており、容易に入手可能になっている。図７では、図６に示すようなタグ付き文書を表形式の表示に変換する処理を表示変換プログラムで実施している。ウィンドウ７００−ａには配管データ比較対応関係を表示する。例えば、プルダウンメニュー７０１からプラント名として、Ａプラントを選択し、系統として、ツリーメニューからＥ１１残留熱除去系７０２を選択してから、対応関係表示ボタン７０３を押すと、右下のウィンドウ７０７に各属性間の対応関係が例えば最高使用圧力とＬＧ、最高使用温度とＬＨのように配管番号順に表示される。これは図６に示すようなタグ付き文書をもとに生成している。
【００３０】
図８は２つの設計データベースの属性値同士を比較し、マッピング・モデルを生成する処理の流れの概要を示す図である。この処理は設計データベース間比較マッピング・モデル生成装置１２００上における処理を説明するものであり、管理・編集端末１０００からのデータベース・マッピング・モデル作成指示信号の出力によって実行される。
【００３１】
ここでは、まず、管理・編集端末１０００からのデータベース関連パラメータの指定情報をまずプログラムが利用しているメモリに格納する（ＳＴＥＰ１０）。データベース関連パラメータとしては、例えばテーブル名称やそのテーブルの主キー情報などがある。ここでは仮にデータベース１（ＤＢ１）とデータベース２（ＤＢ２）に格納されているテーブルに含まれる属性同士を対応付けるものとする。
【００３２】
次に、ＤＢ１とＤＢ２から選択した各テーブルの主キーのドメイン情報を収集する（ＳＴＥＰ２０）。ここで主キーとはテーブルに含まれる行方向のデータを特定する属性を意味するものであり、ドメイン情報とはある特定の属性の列方向に関わる属性値についての集合のことを意味する。すなわち、主キーのドメイン情報とは行方向のデータを特定する属性に対応する属性値の集合のことになる。
【００３３】
次に、ＤＢ１とＤＢ２から選択した各テーブルの主キーの共通ドメイン情報を収集する（ＳＴＥＰ３０）。主キーの共通ドメイン情報とはＤＢ１とＤＢ２で行方向のデータを特定する属性に対応する共通の属性値の集合である。
【００３４】
次に、ＤＢ１とＤＢ２から選択した各テーブルの属性名称を収集する（ＳＴＥＰ４０）。これは、選択した２つのテーブルについてそれぞれ、列方向を特定する属性の名称を収集することになる。
【００３５】
次に、ＤＢ１とＤＢ２から選択した各テーブルの間でＳＴＥＰ４０で収集した属性名称ごとに、属性値が一致するかどうか判定し、一致数・不一致数を計数する（ＳＴＥＰ５０）。
【００３６】
次に、ＳＴＥＰ５０で得られた属性値の一致数・不一致数から計算した一致指数を基にＤＢ１とＤＢ２から選択した各テーブルの間を対応付けるマッピング・モデルを生成する（ＳＴＥＰ６０）。
【００３７】
最後に、マッピング・モデルを図１に示したような管理・編集端末１０００にマッピング・モデル作成結果を表示する（ＳＴＥＰ７０）ことで処理を終了する。
【００３８】
以下、図８に示す処理の詳細を図とフローチャートを用いて説明する。
【００３９】
図９はマッピング・モデル生成処理のためのデータを入力するシステムの表示画面例を示す図である。これは管理・編集端末１０００に処理開始前の必須データを与える入力パラメータ設定画面１０００−ａとして利用される。入力パラメータ設定画面１０００−ａにはＤＢ１とＤＢ２に分けて情報を設定する項目がある（１０００−ａ−１，１０００−ａ−２）。
【００４０】
設定項目としてはテーブル名１０００−ａ−３，主キー属性名１０００−ａ−４，区切り文字１０００−ａ−５，検索条件１０００−ａ−６，生成条件１０００−ａ−７がある。ここでは、例えば、ＤＢ１とＤＢ２のそれぞれのテーブル名として「系統設計」と「配管設計」を、主キー属性名として「配管番号」と「ＬＤ」および「ＬＣ」を、図４に示すようなテーブルで「ＲＨＲ−００１」のような配管番号の属性値を区切る文字列１０００−ａ−５を「−」として設定する。これにより、「ＲＨＲ−００１」から「ＲＨＲ」と「００１」を抽出して、「配管番号」と「ＬＤ」，「ＬＣ」を比較するための情報を生成することができる。また、検索条件１０００−ａ−６は、ある特定の属性値のときに比較したい場合に使用することができ、空白でもかまわない。検索条件１０００−ａ−６が空白のときには、テーブルに含まれる全属性値を用いて比較することになる。さらに、生成条件１０００−ａ−７は例えばＤＢ１とＤＢ２のテーブル「系統設計」，「配管設計」それぞれに対して「“系統略称＝＄ｓｙｓｔｅｍＮａｍｅＡｂｂ”」，「“ｌａ＝＄ｐｌａｎｔＮａｍｅ　ａｎｄ　ｎｔ＝＄ａｒｅａ　ａｎｄ　ｌｄ＝＄ｓｙｓｔｅｍＮａｍｅＡｂｂ”」と設定する。これは、タグ付き文書変換装置３００において二つのデータベースのテーブルからタグ付き文書を生成する際に利用する外部条件を指定するものであり、＄ｓｙｓｔｅｍＮａｍｅＡｂｂとして系統略称を、＄ｐｌａｎｔＮａｍｅとしてプラント名を、＄ａｒｅａとして建屋を、外部から変数として与えるのに利用する。ここで、「系統略称」，「ｌａ」，「ｎｔ」，「ｌｄ」はそれぞれのテーブルに含まれる属性の名称である。
【００４１】
データベースの設定に依存しない共通の設定項目として保存ディレクトリ名１０００−ａ−８，比較レコード名１０００−ａ−９，コード変換の指定１０００−ａ−１０および属性値比較方法１０００−ａ−１１がある。ここでは、設計データベース間比較マッピング・モデル生成処理により作成されたマッピング・モデルを保存するコンピュータ上のディレクトリ名１０００−ａ−８を「Ｃ：￥ＸＭＬＳｃｈｅｍａ」、属性値を比較するためのデータ数（レコード数とも言う）１０００−ａ−９を「１００」、単位系や独自のコード体系の変換があるかどうかの指定１０００−ａ−１０を「ｙｅｓ」、属性値の比較方法１０００−ａ−１１を「主キー−属性値一致」として指定している。これらの指定項目を入力したあと、モデル生成ボタン１０００−ａ−１２を押すことによって、図８のＳＴＥＰ１０以降の処理が開始される。リセットボタン１０００−ａ−１３を押すことによって、入力パラメータ設定画面１０００−ａですでに入力した内容を消去することができる。図８に示すＤＢ関連パラメータ指定処理ＳＴＥＰ１０では図９で指定した内容をプログラム上で確保したメモリに記憶させる処理を実施することになる。
【００４２】
図１０はＤＢ１とＤＢ２の主キーのドメイン情報を収集する処理ＳＴＥＰ２０の詳細な処理の流れを示す図である。まず、ＳＴＥＰ１０においてＤＢ関連パラメータが設定されたあと、本処理が実行される。
【００４３】
処理においては、まず、データベース番号として１を設定する（ＳＴＥＰ２０−１）。
【００４４】
次に、図９に示すような入力画面とＳＴＥＰ１０の処理で設定したＤＢ１についての主キーの属性名称とテーブル名称を設定する（ＳＴＥＰ２０−３）。
【００４５】
そして、該テーブルにおける主キーの属性値を全数検索して、一時的にメモリに格納する（ＳＴＥＰ２０−４）。
【００４６】
この後、あらかじめカウント数を０にしておき（ＳＴＥＰ２０−６）、データ番号が１の主キー属性値から始めて（ＳＴＥＰ２０−５およびＳＴＥＰ２０−７，Ｙｅｓ）、属性値が空ではないかどうか判定する。もし、属性値が空ではないとき（ＳＴＥＰ２０−８，Ｙｅｓ）、カウント数を１増やし（ＳＴＥＰ２０−９）、データ番号に対応付けた主キードメイン情報の集合、すなわち主キードメイン情報［カウント数］に空ではない主キーの属性値を格納する（ＳＴＥＰ２０−１０）。ＳＴＥＰ２０−９およびＳＴＥＰ２０−１０は属性値が空のときは実施しない（ＳＴＥＰ２０−８，Ｎｏ）。これらの処理の後、データ番号を１増やし（ＳＴＥＰ２０−１１）、データ番号がＳＴＥＰ２０−４で検索した結果得られたデータの全件数に達するまで（ＳＴＥＰ２０−７，Ｎｏとなるまで）処理を繰り返す。
【００４７】
データベース番号１について処理が終了するとデータベース番号を１増やして、すなわちＤＢ２についてＳＴＥＰ２０−３からＳＴＥＰ２０−１２を繰り返して実施する。
【００４８】
以上の処理によりＤＢ１とＤＢ２のそれぞれのデータベースに含まれる、指定したテーブルについてのそれぞれの主キーに関する属性値についてのドメイン情報を収集する。
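The collection of primary-key domain information (STEP 20) can be sketched as follows. This is a minimal illustration, assuming each table is available in memory as a list of dictionaries; the function name `collect_key_domain` is hypothetical.

```python
def collect_key_domain(rows, key_attr):
    """Return the non-empty primary-key values of a table, in row order."""
    domain = []
    for row in rows:
        value = row.get(key_attr)
        # Empty key values are skipped, as in STEP 20-8.
        if value is not None and value != "":
            domain.append(value)
    return domain


# Hypothetical table contents.
system_rows = [{"配管番号": "RHR-001"}, {"配管番号": ""}, {"配管番号": "RHR-002"}]
domain1 = collect_key_domain(system_rows, "配管番号")
```

The same function would be called once per database, giving one domain list for DB1 and one for DB2.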
【００４９】
図１１は主キーの共通ドメイン情報収集の流れを示す図である。この図は図８におけるＳＴＥＰ３０の処理を詳細に説明したものである。
【００５０】
まず、図１０に示す処理に得られた各データベースに対応する主キードメイン情報を主キードメインの配列に格納する。すなわち、ＤＢ１に関わる主キードメインの配列の内容を主キードメイン１という名称の配列に、ＤＢ２に関わる主キードメインの配列の内容を主キードメイン２という名称の配列に格納する。そして、共通ドメインデータ番号を０にしておく（ＳＴＥＰ３０−１）。
【００５１】
次に、主キードメイン１の配列の要素を順にサーチしていくためのデータ番号１を１にする（ＳＴＥＰ３０−２）。データ番号１が主キードメイン１のデータ数、すなわち主キードメイン１の配列の最大要素数以内であるとき（ＳＴＥＰ３０−３，Ｙｅｓ）、以下の処理を繰り返す。
【００５２】
まず、主キードメイン２の配列の要素を順にサーチしていくためのデータ番号２を１にする（ＳＴＥＰ３０−４）。そして、データ番号２が主キードメイン２のデータ数、すなわち主キードメイン２の配列の最大要素数以内であるとき（ＳＴＥＰ３０−５，Ｙｅｓ）、さらに、以下の処理を繰り返す。
【００５３】
ここでは、データ番号１に対応する主キードメイン１の値、すなわち主キードメイン１［データ番号１］で表される値とデータ番号２に対応する主キードメイン２の値、すなわち主キードメイン２［データ番号２］で表される値を比較し、それらが一致する場合（ＳＴＥＰ３０−６，Ｙｅｓ）、共通ドメインデータ番号を１増やし（ＳＴＥＰ３０−７）、共通ドメイン［共通ドメインデータ番号］に主キードメイン１［データ番号１］，主キードメイン２［データ番号２］を格納する（ＳＴＥＰ３０−８）。
【００５４】
これらの処理の終了後、データ番号２を１増やし（ＳＴＥＰ３０−９）、ＳＴＥＰ３０−５からＳＴＥＰ３０−９までの処理をデータ番号２が主キードメイン２の最大データ数に達するまで（ＳＴＥＰ３０−５，Ｎｏ）、繰り返す。これらの処理の終了後、データ番号１を増やし（ＳＴＥＰ３０−１０）、ＳＴＥＰ３０−３からＳＴＥＰ３０−１０までの処理をデータ番号１が主キードメイン１の最大データ数に達するまで（ＳＴＥＰ３０−３，Ｎｏ）、繰り返す。
【００５５】
以上の処理により共通ドメイン［共通ドメインデータ番号］に、ＤＢ１とＤＢ２で共通する主キードメイン情報を収集する。
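The nested loop of STEP 30 can be sketched as below. The function name and the list-based representation are assumptions for illustration; the logic follows the flow described above, storing the matching key from each side as a pair.

```python
def collect_common_domain(domain1, domain2):
    """Return the (key1, key2) pairs whose primary-key values are equal."""
    common = []
    for key1 in domain1:          # loop over data number 1
        for key2 in domain2:      # loop over data number 2
            if key1 == key2:
                common.append((key1, key2))
    return common


common = collect_common_domain(
    ["RHR-001", "RHR-002", "RHR-003"],
    ["RHR-002", "RHR-003", "RHR-004"],
)
```

When the key values are unique within each table, a set intersection would give the same keys with less work; the nested loop is kept here only to stay close to the flowchart.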
【００５６】
図１２はデータベースの属性名称収集処理の流れを示す図である。この図は図８におけるＳＴＥＰ４０の処理を詳細に説明したものである。
【００５７】
処理においては、まず、データベース番号として１を設定する（ＳＴＥＰ４０−１）。次に、テーブル名にデータベース番号１に対応したテーブル名を設定する（ＳＴＥＰ４０−３）。次に、データベース側で持っているテーブル名に属する属性の名称のデータベース辞書を参照することによって、テーブル名に対応した属性名称のリストを検索する（ＳＴＥＰ４０−４）。
【００５８】
次に、検索の結果得られた属性名称リストを属性名称リスト［１］に格納する（ＳＴＥＰ４０−５）。
【００５９】
次に、データベース番号を１増やしてデータベース番号が２のデータベースについてＳＴＥＰ４０−２からＳＴＥＰ４０−６までの処理を繰り返して実施する。これにより、属性名称リスト［２］にデータベース番号２の属性名称のリストを格納する。そして、ＳＴＥＰ４０−２，Ｎｏの条件が成立して終了することによって、次のＳＴＥＰ５０に属性名称リストを渡す。
【００６０】
以上の処理により、ＤＢ１とＤＢ２の各データベースで指定したテーブルの属性名称を収集する。
【００６１】
図１３は主キーを利用した属性値の一致情報収集処理の流れを示す図である。この図は図８におけるＳＴＥＰ５０の処理を詳細に説明したものである。処理においては、まず、共通ドメインデータ番号として１を設定する（ＳＴＥＰ５０−１）。また、属性値の一致数と不一致数を見るための属性番号１と属性番号２の対応関係と非対応関係を表す一致カウント［属性番号１］［属性番号２］と不一致カウント［属性番号１］［属性番号２］としての初期値０を設定する（ＳＴＥＰ５０−２）。図では簡略表記として一致カウント［］［］，不一致カウント［］［］で表している。
【００６２】
次に、以下の処理を共通ドメインデータ番号がＳＴＥＰ３０で収集した共通ドメイン情報の最大数になるまで（ＳＴＥＰ５０−３，Ｎｏとなるまで）、繰り返す。ここで、生成された一致カウント［］［］，不一致カウント［］［］の情報はＳＴＥＰ６０で利用する。
【００６３】
共通ドメインデータ番号が共通ドメイン情報の最大数以内であるとき（ＳＴＥＰ５０−３，Ｙｅｓ）、まず、ＤＢ１で指定したテーブルに格納される属性の順番を表す属性番号１に１を設定する（ＳＴＥＰ５０−４）。そして、ＳＴＥＰ５０−５からＳＴＥＰ５０−１３までの処理を属性番号１がＤＢ１で指定したテーブルに格納される属性の最大数になるまで（ＳＴＥＰ５０−５，Ｎｏとなるまで）、繰り返す。ＳＴＥＰ５０−５，Ｎｏとなれば、共通ドメインデータ番号を１増やして（ＳＴＥＰ５０−１５）、ＳＴＥＰ５０−３からＳＴＥＰ５０−１５までの処理を繰り返す。
【００６４】
ＳＴＥＰ５０−５からＳＴＥＰ５０−１４までの繰り返し処理では、まず、共通ドメイン［共通ドメインデータ番号］と属性番号１で特定されるＤＢ１側の属性値をデータ１に割り当てる。また、データベースの属性値のコード体系や単位系を変換するために、コード変換の指定を図９に示すような入力画面で指定した場合には、コード変換フラグをＹｅｓにして、得られた属性値に対してコード変換を施したものを比較用の値として利用する（ＳＴＥＰ５０−６）。
【００６５】
次に、ＤＢ２で指定したテーブルに格納される属性の順番を表す属性番号２に１を設定する（ＳＴＥＰ５０−７）。ＳＴＥＰ５０−８からＳＴＥＰ５０−１３までの処理を属性番号２がＤＢ２で指定したテーブルに格納される属性の最大数になるまで（ＳＴＥＰ５０−８，Ｎｏとなるまで）、繰り返す。ＳＴＥＰ５０−８，Ｎｏとなれば、属性番号１を１増やして（ＳＴＥＰ５０−１４）、ＳＴＥＰ５０−５からＳＴＥＰ５０−１４までの処理を繰り返す。
【００６６】
ＳＴＥＰ５０−８からＳＴＥＰ５０−１３までの繰り返し処理では、共通ドメイン［共通ドメインデータ番号］と属性番号２で特定されるＤＢ２側の属性値をデータ２に割り当てる。また、データベースの属性値のコード体系や単位系を変換するために、コード変換の指定を図９に示すような入力画面で指定した場合には、コード変換フラグをＹｅｓにして、得られた属性値に対してコード変換を施したものを比較用の値として利用する（ＳＴＥＰ５０−９）。
【００６７】
次に、データ１とデータ２をある評価関数（ここでは、一致関数と名付けた関数）により一致すると判定された場合（ＳＴＥＰ５０−１０，Ｙｅｓ）、一致カウント［属性番号１］［属性番号２］を１増やす（ＳＴＥＰ５０−１１）。また、一致関数の評価で不一致であると判定された場合（ＳＴＥＰ５０−１０，Ｎｏ）、不一致カウント［属性番号１］［属性番号２］を１増やす（ＳＴＥＰ５０−１２）。ＳＴＥＰ５０−１１またはＳＴＥＰ５０−１２の処理後、属性番号２を１増やして（ＳＴＥＰ５０−１３）、ＳＴＥＰ５０−８からＳＴＥＰ５０−１３までの処理を繰り返す。
【００６８】
以上の処理により、ＤＢ１とＤＢ２より選択した各テーブルに属する属性の属性値同士の一致度および不一致度を、二次元配列の形式の情報として、一致カウント［属性番号１］［属性番号２］，不一致カウント［属性番号１］［属性番号２］のように生成する。
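The counting loop of STEP 50 can be sketched as follows. Several points are assumptions made for illustration: each table is a dictionary keyed by primary-key value, the match function defaults to simple equality, code conversion is omitted, and pairs with an empty value are skipped (the flowchart text does not state this, but the later remark on the match index says its denominator counts only non-empty values).

```python
def count_matches(table1, table2, common_keys, attrs1, attrs2, matches=None):
    """Count matches and mismatches for every (attribute 1, attribute 2) pair."""
    if matches is None:
        matches = lambda a, b: a == b
    match_count = [[0] * len(attrs2) for _ in attrs1]
    mismatch_count = [[0] * len(attrs2) for _ in attrs1]
    for key in common_keys:
        row1, row2 = table1[key], table2[key]
        for i, name1 in enumerate(attrs1):
            value1 = row1.get(name1)
            for j, name2 in enumerate(attrs2):
                value2 = row2.get(name2)
                if value1 is None or value2 is None:
                    continue  # assumption: empty values are not counted
                if matches(value1, value2):
                    match_count[i][j] += 1
                else:
                    mismatch_count[i][j] += 1
    return match_count, mismatch_count


# Hypothetical attribute values.
t1 = {"RHR-001": {"最高使用圧力": "8.62", "最高使用温度": "302"},
      "RHR-002": {"最高使用圧力": "3.45", "最高使用温度": "182"}}
t2 = {"RHR-001": {"LG": "8.62", "LH": "302"},
      "RHR-002": {"LG": "3.45", "LH": "182"}}
match, mismatch = count_matches(
    t1, t2, ["RHR-001", "RHR-002"], ["最高使用圧力", "最高使用温度"], ["LG", "LH"])
```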
【００６９】
図１４は主キーを利用した属性値の一致情報収集処理により得られた属性一致度に関する情報をグラフ形式で示した例である。例えば、ＤＢ１とＤＢ２に関して設定したテーブル名をそれぞれ「系統設計」，「配管設計」とすると、主キーの属性を除いて比較した結果を図１４では示している。
【００７０】
系統設計テーブルに含まれる属性の名称を最高使用圧力，最高使用温度、および耐震クラスとして、配管設計テーブルに含まれる属性の名称をＬＧ，ＬＨ、およびＬＩとすると、最高使用圧力−ＬＧ，最高使用温度−ＬＨ、および、耐震クラス−ＬＩの一致指数が大きいことが評価の結果わかったことがグラフより読み取ることができる。その他の属性間の対応関係は一致指数が非常に低くなっている。
【００７１】
ここで、一致指数は、例えば、一致カウント［属性番号１］［属性番号２］／（一致カウント［属性番号１］［属性番号２］＋不一致カウント［属性番号１］［属性番号２］）で表されるもので、分母は属性値が空ではないものの属性の総数を表す。このような一致指数に対して、例えば閾値を０．７　と設定するとそれ以上のものを属性同士が対応していると判定し、閾値より小さいものについての属性間は対応していないものと判定することができる。
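The match index and threshold test in paragraph 0071 amount to a simple ratio. The sketch below assumes the counts are plain integers and returns 0.0 when nothing was compared, a choice the text does not specify.

```python
def match_index(match_count, mismatch_count):
    """match / (match + mismatch); 0.0 when no values were compared."""
    total = match_count + mismatch_count
    return match_count / total if total else 0.0


def is_corresponding(match_count, mismatch_count, threshold=0.7):
    """Attributes correspond when the index is at or above the threshold."""
    return match_index(match_count, mismatch_count) >= threshold


# Hypothetical counts: 85 matching and 15 non-matching values.
index = match_index(85, 15)
```

With these counts the index is 0.85, which is above the example threshold of 0.7, so the two attributes would be judged to correspond.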
【００７２】
図２２はＤＢ１とＤＢ２に含まれる各テーブルの属性間の対応関係を表すマッピング・モデルを生成する処理の流れを示す図である。この図は図８におけるＳＴＥＰ６０の処理を詳細に説明したものである。ここでは、ＳＴＥＰ１０からＳＴＥＰ５０までの処理結果に基づき、例えば、図５に示すようなマッピング・モデルの雛型を生成する。
【００７３】
処理においては、まず、ＤＢ１とＤＢ２から選択したテーブル名からマッピング・モデルのルート要素、つまり、図６で示すようなタグ付き文書の一番先頭となる要素を生成する（ＳＴＥＰ６０−１）。図５，図６の例ではルート要素は「系統設計＿配管設計」で記述される要素となる。次に、各テーブルの行に対応する行属性要素「ＲＯＷ」を生成して、ルート要素に子要素として追加する（ＳＴＥＰ６０−２）。ここでは、「系統設計＿配管設計」の下位に「ＲＯＷ」を追加する。次に、主キー属性要素を生成し、ＲＯＷに子要素として追加する（ＳＴＥＰ６０−３）。
【００７４】
さらに、ＤＢ１のテーブルの主キー要素，データベース接続情報を生成し、主キー属性要素に子要素として追加する。この処理はＤＢ２についても実施する（ＳＴＥＰ６０−４，５）。次に、属性番号１に最初に１を設定した後、ＳＴＥＰ６０−６からＳＴＥＰ６０−１４までの処理を属性番号１に関して繰り返す（ＳＴＥＰ６０−６，Ｙｅｓ）。さらに、属性番号１に関する繰り返し処理の中では、属性番号２に１を設定した後、ＳＴＥＰ６０−７からＳＴＥＰ６０−１３までの処理を属性番号２に関して繰り返す（ＳＴＥＰ６０−７，Ｙｅｓ）。
【００７５】
属性番号２に関する繰り返し処理の中では、図１３に関する説明で述べた一致指数［属性番号１］［属性番号２］が予め定めた閾値以上で、マッピング・モデル生成用に未選択の場合（ＳＴＥＰ６０−８，Ｙｅｓ）、属性間の対応関係を表す要素を生成する。もし、閾値より小さいか、マッピング・モデル生成用に属性が選択済みの場合（ＳＴＥＰ６０−８，Ｎｏ）、属性値間の対応関係を表す要素を生成することはしない。
【００７６】
属性間の対応関係を表す要素を生成する処理はＳＴＥＰ６０−９からＳＴＥＰ６０−１２で実施する。ここでは、代表属性名を生成し、ＲＯＷに子要素として追加する（ＳＴＥＰ６０−９）。なお、代表属性名として利用する属性名はＤＢ１かＤＢ２のいずれかのテーブルに属する属性名を設定するものとし、どちらを使用するかは予め定めておくことにする。ここでは、ＤＢ１のテーブルに属する属性名を選択している。
【００７７】
次に、ＤＢ１から選択したテーブルの属性名で要素を生成し、かつ、データベースへの接続情報も生成して、代表属性名の子要素として追加する。この処理はＤＢ２についても実施する（ＳＴＥＰ６０−１０，１１）。次に、属性名１［属性番号１］，属性名２［属性番号２］についてはマッピング・モデル用の要素生成に使用済みであることを表すフラグを設定する（ＳＴＥＰ６０−１２）。次に、属性番号２を１増やした（ＳＴＥＰ６０−１３）後、ＳＴＥＰ６０−７からＳＴＥＰ６０−１３までの処理を繰り返す。すべての属性に関して、対応付けられる属性をマッピング・モデル中に生成した後、対応付けることが不可能な属性の要素群を生成し、ＲＯＷに子要素として追加する（ＳＴＥＰ６０−１５）。
【００７８】
以上の処理により、属性値の一致指数の評価により、対応付けられる属性に関してはマッピング・モデル中に対応関係を表す要素群を表現し、対応付けることが不可能な属性に関してはマッピング・モデル中に対応付け不可能であった要素群を生成して、マッピング・モデルとしての雛型を生成することができる。
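The pairing logic of STEP 60 can be sketched as follows. It is a simplification: the result is returned as plain Python lists rather than as XML Schema elements with database connection information, and the function name and threshold default are assumptions. As in the flowchart, a pair is accepted the first time its match index reaches the threshold while neither attribute has been used.

```python
def build_mapping(attrs1, attrs2, index, threshold=0.7):
    """Pair attributes by match index; return (pairs, unmatched1, unmatched2)."""
    used1, used2 = set(), set()
    pairs = []
    for i, name1 in enumerate(attrs1):
        for j, name2 in enumerate(attrs2):
            if index[i][j] >= threshold and i not in used1 and j not in used2:
                pairs.append((name1, name2))
                used1.add(i)   # flag both attributes as used (STEP 60-12)
                used2.add(j)
    unmatched1 = [n for i, n in enumerate(attrs1) if i not in used1]
    unmatched2 = [n for j, n in enumerate(attrs2) if j not in used2]
    return pairs, unmatched1, unmatched2


# Hypothetical match indices; "備考" is an invented attribute with no partner.
attrs1 = ["最高使用圧力", "最高使用温度", "耐震クラス", "備考"]
attrs2 = ["LG", "LH", "LI"]
index = [[0.95, 0.02, 0.00],
         [0.01, 0.90, 0.00],
         [0.00, 0.00, 0.88],
         [0.10, 0.05, 0.00]]
pairs, unmatched1, unmatched2 = build_mapping(attrs1, attrs2, index)
```

The unmatched lists correspond to the group of attributes that could not be associated automatically and are left for manual editing.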
【００７９】
図２３は生成したマッピング・モデルの雛型に関して、編集中の画面例を示す図である。これは図８のマッピング・モデル表示処理ＳＴＥＰ７０により生成する。
【００８０】
この図に示す画面は管理・編集端末１０００に表示して、ユーザが操作を加えるものである。この画面例１０００−ｃにおいては表示枠１０００−ｃ−１の中に系統設計＿配管設計をルート要素として、その下にＲＯＷという要素が追加されている。
【００８１】
ＲＯＷの下位には、ＳＴＥＰ１０からＳＴＥＰ６０までの処理により生成された自動対応付け部分と自動対応付け不可だった不一致部分が接続される。
【００８２】
不一致部分に関してはユーザの編集操作１０００−ｃ−ａ１により自動対応付け部分に要素を移動して、手動でデータベース間の要素対応付けを図ることができる。また、自動対応付けの結果に関しても、ユーザの判断で対応すべきでないと思われる要素については削除することができる。
【００８３】
このような画面を用いて、編集した後に、マッピング・モデルをマッピング・モデル・データベース１３００に「修正結果保存ボタン」１０００−ｃ−２を押して保存する。
【００８４】
以上のような構成・処理により、二つのテーブルの持つ属性値の一致／不一致の関係から、異なるデータベースを対応付けるモデルを自動的に作成できる。
【００８５】
次に、上記の実施の形態から処理・構成を変更した場合における実施の形態について、別の図を用いて説明する。
【００８６】
図１５は図１３の主キーを利用した属性値の一致情報収集処理ＳＴＥＰ５０に変わるものであり、属性の度数分布を利用した属性値の一致情報収集処理の流れを示す図である。この図は図８におけるＳＴＥＰ５０の処理を入れ替えて利用するものなので、処理番号をＳＴＥＰ５０ａとする。
【００８７】
処理においては、まず、属性番号１に１を設定する（ＳＴＥＰ５０ａ−１）。
【００８８】
属性番号１がテーブル１の属性数（属性数１）以下（ＳＴＥＰ５０ａ−２，Ｙｅｓ）であれば、レコード番号に１を設定する（ＳＴＥＰ５０ａ−３）。レコード番号が最大レコード数以内（ＳＴＥＰ５０ａ−４，Ｙｅｓ）であれば、レコード番号と属性番号１で特徴付けられるデータ１の配列（データ１［レコード番号］［属性番号１］）に、コード変換フラグ，レコード番号，属性番号１で特徴付けられるデータベース１のテーブル１における属性値を代入する（ＳＴＥＰ５０ａ−５）。次に、レコード番号を１増やし（ＳＴＥＰ５０ａ−６）、ＳＴＥＰ５０ａ−４からＳＴＥＰ５０ａ−６の処理を繰り返す。
【００８９】
もし、レコード番号が最大レコード数を超えた場合（ＳＴＥＰ５０ａ−４，Ｎｏ）、属性番号１を１増やし（ＳＴＥＰ５０ａ−７）、ＳＴＥＰ５０ａ−２からＳＴＥＰ５０ａ−７の処理を繰り返す。
【００９０】
属性番号１がテーブル１の属性数の最大値を超えた場合（ＳＴＥＰ５０ａ−２，Ｎｏ）、いくつかのレコードの値から求めたデータ１［］［属性番号１］からヒストグラム生成関数を利用することにより、属性番号１ごとのヒストグラム１［属性番号１］、すなわち、テーブル１における属性ごとの属性値の分布を表すヒストグラムを生成する（ＳＴＥＰ５０ａ−８）。
【００９１】
次に、ＳＴＥＰ５０ａ−１からＳＴＥＰ５０ａ−８までと同等の処理をテーブル２の属性に関しても繰り返し、テーブル２における属性ごとの属性値の分布を表すヒストグラム２を生成する（ＳＴＥＰ５０ａ−１００）。最後に、ヒストグラム１とヒストグラム２をそれぞれの属性数１と属性数２にわたって面積の差を比較する（ＳＴＥＰ５０ａ−２００）。
【００９２】
図１６は属性の度数分布（ヒストグラム）についての比較方法の一例を示す図である。例えば、ＤＢ１の系統設計テーブルに関する度数分布を（ａ）最高使用圧力，（ｂ）最高使用温度，（ｃ）耐震クラスの属性について、属性値ごとに示している。また、同様にＤＢ２の配管設計テーブルについても（ｄ）ＬＧ，（ｅ）ＬＨ，（ｆ）ＬＩの属性について、属性値ごとに示している。なお、（ｄ）についてはコード変換により単位系を変えた場合の度数分布を示している。
【００９３】
これらの度数分布における軸方向のデータの型と範囲を度数分布同士で比較し、比較可能なものについては縦軸方向の度数の差を面積差として求め、（ｇ）のように度数分布比較グラフを作成する。このグラフ（ｇ）は図１４に示すものと同等のものであるが縦軸の一致指数は１−（規格化された度数分布の区分ごとの面積の差の総和）となっている。この一致指数が、ある閾値を超えた場合には、一致する属性の対が求められる。また、グラフ（ｇ）では耐震クラスとＬＩのような文字列型の属性値の分布のその他の数値型の属性値の分布は比較不可能なので、一致指数を０に設定している。
【００９４】
このように度数分布を用いた属性値同士の比較により、比較対象のテーブルの主キーが不明な場合でも属性間の一致関係を抽出することができる。
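The frequency-distribution comparison can be sketched as below. The text defines the match index as 1 minus the sum of the area differences of the normalized distributions but does not state the normalization. This sketch assumes each histogram is scaled to sum to 1 and the summed absolute difference is halved, which keeps the index between 0 and 1. The binning scheme and function names are also assumptions.

```python
def histogram(values, low, high, bins):
    """Normalized frequency distribution of numeric values over [low, high]."""
    counts = [0] * bins
    width = (high - low) / bins
    for v in values:
        if low <= v <= high:
            k = min(int((v - low) / width), bins - 1)
            counts[k] += 1
    total = sum(counts)
    return [c / total for c in counts] if total else counts


def histogram_match_index(h1, h2):
    """1 minus half the summed absolute difference between two histograms."""
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(h1, h2))


# Hypothetical pressure values from the two tables, already in the same unit.
h_system = histogram([1.0, 1.2, 8.6, 8.7], 0.0, 10.0, 5)
h_piping = histogram([1.1, 1.3, 8.5, 8.8], 0.0, 10.0, 5)
same_shape = histogram_match_index(h_system, h_piping)
```

As in the text, attributes whose value types cannot be compared (for example a string attribute against a numeric one) would simply be given an index of 0 rather than passed to this function.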
【００９５】
次に、図１７と図１８は図９のＳＴＥＰ４０とＳＴＥＰ５０の間に挿入可能な処理であり、データ変換処理として機能する。図１７は指定した属性値を分解した場合における属性値の一致情報収集処理の流れを示す図である。
【００９６】
処理においてはまず、データベース番号に１を設定する（ＳＴＥＰ４００−１）。データベース番号が２以下である場合（ＳＴＥＰ４００−２，Ｙｅｓ）、以下の処理（ＳＴＥＰ４００−３からＳＴＥＰ４００−１６）を繰り返す。
【００９７】
処理ではまず、共通ドメインデータ番号に１を設定する（ＳＴＥＰ４００−３）。共通ドメインデータ番号が最大となる共通ドメイン数以下であるならば（ＳＴＥＰ４００−４，Ｙｅｓ）、属性番号に１を設定する（ＳＴＥＰ４００−５）。
【００９８】
次に、属性番号がデータベース番号に応じた属性数（属性数［データベース番号］）以下ならば、すなわち、ＤＢ１またはＤＢ２の各テーブル中に含まれる属性数以下ならば（ＳＴＥＰ４００−６，Ｙｅｓ）、データベース番号，共通ドメインデータ番号，属性番号で特定する属性値（属性値［データベース番号］［共通ドメインデータ番号］［属性番号］）に、データベース番号，共通ドメイン［共通ドメインデータ番号］，属性番号で取得することのできる属性値を属性値取得関数で代入する（ＳＴＥＰ４００−７）。
【００９９】
さらに、分割番号として１を設定する（ＳＴＥＰ４００−８）。
【０１００】
図９の１０００−ａ−５のように示した“−”のような区切り文字があるか、予め属性値の切り出し指定がある場合（ＳＴＥＰ４００−９，Ｙｅｓ）、データベース番号，共通ドメインデータ番号，属性番号，分割番号で特定する属性値（属性値［データベース番号］［共通ドメインデータ番号］［属性番号］［分割番号］）に、属性値［データベース番号］［共通ドメインデータ番号］［属性番号］に区切り文字で指定される文字切り出し条件を基にして、文字切り出し関数を作用させて文字列を切り出す（ＳＴＥＰ４００−１０）。これは、例えば、文字列“ＲＨＲ−００１”に対して区切り文字列“−”という条件で“ＲＨＲ”を切り出すようなものである。
【０１０１】
そして、属性名称として、属性名称［データベース番号］［共通ドメインデータ番号］［属性番号］［分割番号］を定義し、これに、もとの属性名称（属性名称［データベース番号］［共通ドメインデータ番号］［属性番号］）と分割番号を連結した文字列を新たに生成する（ＳＴＥＰ４００−１１）。これは、例えば、“配管番号”という属性名称に分割番号の“１”を追加して、新たに“配管番号１”を生成するようなものである。
【０１０２】
この後に、分割番号を１増やし（ＳＴＥＰ４００−１２）、区切り文字がなくなるまで（ＳＴＥＰ４００−９，Ｎｏ）繰り返す。区切り文字がなくなれば、属性番号を１増やして（ＳＴＥＰ４００−１３）、属性番号が最大の属性数になるまで（ＳＴＥＰ４００−６，Ｎｏ）、ＳＴＥＰ４００−６からＳＴＥＰ４００−１３までの処理を繰り返す。
【０１０３】
属性番号が最大の属性になると、共通ドメインデータ番号を１増やし、共通ドメインデータ番号が最大の共通ドメイン数になるまで（ＳＴＥＰ４００−４，Ｎｏ）、ＳＴＥＰ４００−４からＳＴＥＰ４００−１４までの処理を繰り返す。共通ドメインデータ番号が最大の共通ドメイン数になれば、データベース番号を１増やし（ＳＴＥＰ４００−１６）、ＤＢ２についてＳＴＥＰ４００−３からＳＴＥＰ４００−１６の処理を繰り返す。
【０１０４】
ＤＢ２についての処理が終了した段階で、次の処理ＳＴＥＰ４０００にデータ変換後のデータ、すなわち、属性値［データベース番号］［共通ドメインデータ番号］［属性番号］［分割番号］と属性名称［データベース番号］［共通ドメインデータ番号］［属性番号］［分割番号］を渡す。
【０１０５】
以上の処理により、たとえ、一つの属性の情報が複数の構造からなる情報から構成されていたとしても、その属性値を分解して、他のデータベースに含まれる情報と比較可能なように情報を分解して取り出すことができる。
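The splitting step (STEP 400-9 to 400-12) can be sketched as follows, assuming the attribute value is a string and the delimiter is the one entered on the parameter screen. The function name is hypothetical.

```python
def split_attribute(name, value, delimiter="-"):
    """Split a value on the delimiter and name each part name1, name2, ..."""
    parts = value.split(delimiter)
    return {f"{name}{n}": part for n, part in enumerate(parts, start=1)}


pieces = split_attribute("配管番号", "RHR-001")
```

Here `pieces` maps "配管番号1" to "RHR" and "配管番号2" to "001", so each part can be compared separately with attributes such as LD and LC on the other side.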
【０１０６】
図１８は属性のデータ格納率に応じて属性値比較対象となる属性を選択する処理の流れを示す図である。ここでは、本図で表される処理をＳＴＥＰ４０００としてある。なお、処理の流れの中では、データベース番号，共通ドメインデータ番号，属性番号，分割番号を各条件判断分岐の前に１に初期化する処理の表現を省略してある。
【０１０７】
処理では、まずデータベース番号が初期値１になっているので、データベース番号が２以下であるという条件を満たし（ＳＴＥＰ４０００−１，Ｙｅｓ）、共通ドメインデータ番号，属性番号，分割番号おのおのについてその最大数になるまで（ＳＴＥＰ４０００−２，３，４がＮｏとなるまで）繰り返し処理を実行する。
【０１０８】
処理の最も内側のループでは、データベース番号，共通ドメインデータ番号，属性番号，分割番号で特徴付けられる属性値［データベース番号］［共通ドメインデータ番号］［属性番号］［分割番号］を属性値という一時変数に代入し（ＳＴＥＰ４０００−５）、属性値がＮＵＬＬでなければ、すなわち、空でなければ（ＳＴＥＰ４０００−６，Ｙｅｓ）、データベース番号，共通ドメインデータ番号，属性番号，分割番号で特徴付けられるカウント［データベース番号］［共通ドメインデータ番号］［属性番号］［分割番号］を１増やす（ＳＴＥＰ４０００−７）。なお、カウント［データベース番号］［共通ドメインデータ番号］［属性番号］［分割番号］は処理開始直前にゼロクリアされているものとする。この後、分割番号を１増やし（ＳＴＥＰ４０００−９）、処理を繰り返す。
【０１０９】
以上の一連の処理（ＳＴＥＰ４０００−１からＳＴＥＰ４０００−１２）までの処理が終了したあと、属性データ格納率［データベース番号］［共通ドメインデータ番号］［属性番号］［分割番号］をカウント［データベース番号］［共通ドメインデータ番号］［属性番号］［分割番号］／共通ドメイン数から求める（ＳＴＥＰ４０００−１３）。
【０１１０】
以上の処理による属性データ格納率を属性値同士の対応付けの際、属性値を比較すべきかどうかの基準に用いれば、すなわち、属性データ格納率が少ないものは属性値同士の対応付けには利用しないということにすれば、比較的、データの格納率が高いもの同士の属性の対応付けが可能になり、属性対応付けの精度を向上させることができる。
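The data storage rate of STEP 4000 can be sketched as below. Each attribute is assumed to be given as a list holding one value per common-domain key, with `None` for empty values. The cut-off of 0.5 is purely illustrative, since the text does not give a value.

```python
def storage_rate(values):
    """Fraction of non-empty values among the common-domain records."""
    if not values:
        return 0.0
    filled = sum(1 for v in values if v is not None)
    return filled / len(values)


def select_attributes(columns, min_rate=0.5):
    """Keep only the attributes whose storage rate reaches min_rate."""
    return [name for name, values in columns.items()
            if storage_rate(values) >= min_rate]


# Hypothetical columns, one value per common-domain key.
columns = {
    "最高使用圧力": ["8.62", "3.45", None, "1.00"],
    "備考": [None, None, None, "temporary"],
}
selected = select_attributes(columns)
```

Sparsely filled attributes such as the second column are dropped before the attribute values are compared.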
【０１１１】
図１９は属性値プールの例を示す図である。これは、過去の属性対応付けの際に、代表的な属性値について、対応する正しい属性名があらかじめ求められたものの結果を格納したものであり、例えば、図１３のＳＴＥＰ５０−６やＳＴＥＰ５０−９でデータ１やデータ２を求めた場合に一致関数を利用しなくても、即座に代表属性名を求めるために利用できるものである。すなわち、過去の属性対応付けの結果を新規の属性対応付けに利用することができ、属性対応付けの精度の向上に寄与することができる。
【０１１２】
図２０はマッピングデータを生成処理のため変数を入力するための画面を用いて、複数テーブルを結合するためのデータを入力する画面例を示す図である。これは、図９に示す入力パラメータ設定画面１０００−ａとほぼ同等のものであり、テーブル名の入力フィールドが複数に増えた（１０００−ａ−３１から３２）のと、テーブル接続条件指定ボタン１０００−ａ−３４と３５が増えたものである。
【０１１３】
テーブル接続条件指定ボタン１０００−ａ−３４を画面上でクリックすると、テーブル接続条件設定ダイアログ１０００−ｂが開く。ここで、マスタテーブル１０００−ｂ−１と接続テーブル１０００−ｂ−２を指定し、属性接続条件を１０００−ｂ−３、その他の条件を１０００−ｂ−４で指定してＯＫボタン１０００−ｂ−５で確定するものである。画面例ではＤＢ１のプラント管理テーブルと系統設計テーブルをそれぞれのテーブルのプラントコードで接続し、その他の条件として、プラント管理のプラント名を“Ａプラント”に制限したものを表している。なお、テーブル接続条件指定ボタン１０００−ａ−３５をクリックするとＤＢ２についてのテーブル接続条件を設定できる。
【０１１４】
図２１は設計データベースの対応付けに使用する結合テーブルの例である。例では、ＤＢ１のプラント管理テーブルと系統設計テーブルをプラントコードという属性で、プラントコードがＡの場合に接続した場合を示している。このようにすることによって、複数のテーブルを一つのテーブルとして表現することができるので、図８に示すようなデータベース対応付けモデル作成の処理が複数のテーブルの場合にも適用することができる。
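The table join described for FIG. 20 and FIG. 21 can be sketched as a plain nested-loop join. The in-memory row format, the function name, and the sample rows are assumptions; in practice the join would more likely be delegated to the database itself.

```python
def join_tables(master_rows, joined_rows, key, condition=lambda row: True):
    """Join two tables on a shared attribute, filtering the master rows."""
    result = []
    for m in master_rows:
        if not condition(m):
            continue
        for j in joined_rows:
            if m[key] == j[key]:
                result.append({**m, **j})  # one combined row per matching pair
    return result


# Hypothetical contents of the plant management and system design tables.
plants = [{"プラントコード": "A", "プラント名": "Aプラント"},
          {"プラントコード": "B", "プラント名": "Bプラント"}]
systems = [{"プラントコード": "A", "配管番号": "RHR-001"},
           {"プラントコード": "B", "配管番号": "RHR-101"}]
combined = join_tables(plants, systems, "プラントコード",
                       condition=lambda r: r["プラント名"] == "Aプラント")
```

The combined rows can then be treated as a single table by the mapping-model generation process.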
【０１１５】
以上説明したように、本発明の実施例によれば、複数のテーブルの持つ属性値の一致／不一致の関係から、異なるデータベースを対応付けるモデルを自動的に作成できる。
【０１１６】
また、度数分布を用いた属性値同士の比較手段を備えるものにあっては、比較対象のテーブルの主キーが不明な場合でも属性間の一致関係を抽出することができる。
【０１１７】
更に、属性値同士の比較の際に、あらかじめ作成済みの属性値のコード変換関数を利用して、比較相手のデータベースで利用しているコード体系または単位系に合わせる処理手段を有するものにあっては、データベースで利用している単位系，コード体系が異なる場合でも属性値同士の比較ができる。
【０１１８】
また、属性値同士の比較の際に、あらかじめ外部から規定した属性について属性値の内容を複数の要素に分割し、分割した要素を用いて、比較対象データベースのテーブルの属性が有する属性値と比較する手段を有するものにあっては、たとえ、一つの属性の情報が複数の構造からなる情報から構成されていたとしても、その属性値を分解して、他のデータベースに含まれる情報と比較可能なように情報を分解して取り出すことができる。
【０１１９】
また、属性値同士の比較の前に、格納されているデータ量が少ない属性については比較対象の属性から除去する手段を有するものにあっては、属性データ格納率が少ないものは属性値同士の対応付けには利用しないということにすれば、比較的、データの格納率が高いもの同士の属性の対応付けが可能になり、属性対応付けの精度を向上させることができる。
【０１２０】
また、属性値の集合がどのような設計意味情報に対応付けられるかを示す情報を利用して、各テーブルの属性ごとに、属性値の情報から、その属性が対応付けられる設計意味情報を判定し、同一の設計意味情報に対応付けられる属性同士が対応付けられる属性として判定する手段を有するものにあっては、過去の属性対応付けの結果を新規の属性対応付けに利用することができ、属性対応付けの精度の向上に寄与することができる。
【０１２１】
また、比較対象の少なくとも一方のテーブルを複数のテーブルの結合からなる一つのテーブルとして扱って比較する手段を有するものにあっては、複数のテーブルを一つのテーブルとして表現することができるので、データベース対応付けモデル作成の処理が複数のテーブルの場合にも適用することができる。
【０１２２】
【発明の効果】
以上説明したように、本発明によれば、異なるデータベースの複数のテーブルの持つ属性値を比較した結果から、異なるデータベースを対応付けるためのモデルを作成できる。
【図面の簡単な説明】
【図１】本発明の実施の形態に係わる設計データベースをマッピングするモデルを生成するシステムの構成図である。
【図２】一般的なプラント設計業務の種類と業務間の係わりを示す図である。
【図３】データベース統合システムの構成図の一例である。
【図４】系統設計データベースと配管設計データベースの対応付けの関係を示す図である。
【図５】系統設計データベースと配管設計データベースを対応付けるモデルをＸＭＬ　Ｓｃｈｅｍａ形式で示す図である。
【図６】対応関係生成処理により生成されたタグ付き文書の例を示す図である。
【図７】対応関係生成処理により作成されたタグ付き文書を用いて構築した画面の例を示す図である。
【図８】２つの設計データベースの属性値同士を比較し、マッピング・モデルを生成する処理の流れの概要を示す図である。
【図９】マッピング・モデル生成処理のためのデータを入力するシステムの表示画面例を示す図である。
【図１０】主キーのドメイン情報収集処理の流れを示す図である。
【図１１】主キーの共通ドメイン情報収集の流れを示す図である。
【図１２】データベースの属性名称収集処理の流れを示す図である。
【図１３】主キーを利用した属性値の一致情報収集処理の流れを示す図である。
【図１４】主キーを利用した属性値の一致情報収集処理により得られた属性一致度に関するグラフを示す図である。
【図１５】属性の度数分布を利用した属性値の一致情報収集処理の流れを示す図である。
【図１６】属性の度数分布についての比較方法の一例を示す図である。
【図１７】指定した属性値を分解した場合における属性値の一致情報収集処理の流れを示す図である。
【図１８】属性のデータ格納率に応じて属性値比較対象となる属性を選択する処理の流れを示す図である。
【図１９】属性値プールの例を示す図である。
【図２０】マッピングデータを生成処理のため変数を入力するための画面を用いて、複数テーブルを結合するためのデータを入力する画面例を示す図である。
【図２１】設計データベースの対応付けに使用する結合テーブルの例である。
【図２２】マッピング・モデル生成処理の流れを示す図である。
【図２３】生成したマッピング・モデルの雛型に関して、編集中の画面例を示す図である。
【符号の説明】
１１０−１２０，１２０−１１０，１２０−１３０，１３０−１２０…マッピング・モデル、１１０ｃ〜１３０ｃ…各設計アプリケーションが利用するデータベース、３００…タグ付き文書変換装置、１０００…管理・編集端末、１１００…設計データベース群、１２００…設計データベース間比較マッピング・モデル生成装置、１３００…マッピング・モデル・データベース。
[0001]
[Technical Field of the Invention]
The present invention relates to a heterogeneous database integration support device for supporting more efficient information transmission and sharing in a distributed design task such as plant design, operation, and maintenance.
[0002]
[Prior art]
As prior art, Japanese Patent Application Laid-Open No. 11-282878 discloses a related-information search device that searches a plurality of databases (hereinafter also referred to as DBs) while estimating the associations among them. In this prior art, a search is performed on the basis of associations "estimated" from various external data, and the relationships between the databases are progressively modeled by learning, from the search content, associations weighted by an "association value".
[0003]
[Problems to be solved by the invention]
The present invention concerns consistency management among the databases of multiple departments engaged in concurrent (simultaneous, parallel) design work. The prior art searches while estimating associations across a plurality of DBs, so any database consistency management built on it would rely on associations "estimated" from various external data. Because the prior art does not analyze the databases as a whole and build up the relationships between them from the bottom up, based on, for example, the number of matching attribute values, it is difficult to construct a highly reliable model with it.
[0004]
To manage the consistency of design data in concurrent design work, a complete correspondence between the databases must first be established when the correspondence model for the heterogeneous DBs of the different design fields is created. Because the prior art does not build up the relationships between databases from the bottom up, it cannot be used for integration support that manages the consistency of data that should have the same value across heterogeneous databases.
[0005]
Therefore, an object of the present invention is to provide a heterogeneous database integration support apparatus for use in integration support that manages the consistency of data that should have the same value across heterogeneous databases.
[0006]
[Means for Solving the Problems]
The present invention comprises means for comparing the attribute values held by the attributes of tables belonging to a plurality of different databases and, based on the results of that comparison, generating a mapping model that associates the attribute items of the tables with one another. Preferably, the means compares the attribute values and generates the mapping model by associating attribute items in descending order of their degree of match.
[0007]
[Embodiments of the Invention]
Design work for plants such as nuclear and thermal power plants is usually divided among specialized groups and proceeds in several stages. The early stages of design are often called "upstream" and the later stages "downstream". The design departments for such plants store and manage their design results in separate databases for system design, three-dimensional piping design, equipment design, and so on. These databases hold enormous amounts of data, and, owing to the nature of the work, data accumulates in each of them concurrently. That is, downstream design starts from provisionally decided values without waiting for the upstream design results, so simply linking the databases is not enough to manage their consistency.
[0008]
FIG. 2 is a block diagram showing the types of general plant design work and the relationships among them. As an example, it shows the results of the work passing from upstream to downstream in the order of plant system design 10, piping design 20, and equipment design 30. If the results flowed strictly in this order, the design results of these tasks would remain consistent with one another. In practice, however, the work almost always proceeds concurrently, because the system is enormous, little time can be allotted to the overall design, and many people are involved.
[0009]
For example, in FIG. 2, the system design department 10 creates piping and instrumentation diagrams (10a), equipment lists (10b), and the like, and stores the results in the system design database (DB) 120c. The piping design department creates 3D piping layouts and the like (20a) and stores them in the piping design database (DB) 110c. The equipment design department creates detailed equipment designs (30a), specifications for ordering purchased items (30b), and the like, and stores them in the equipment design database (DB) 130c. Because each task accumulates data in its database concurrently, it is essential to check the consistency information 1 to 3 (400a to 400c), coordinate the design results among the designers, and reflect the results of each task in the other departments' databases by transferring data between the DBs.
[0010]
The embodiment of the present invention supports reflecting the results of each task in the other departments' databases, as described below, in the situation where the tasks accumulate data in their databases concurrently. The configuration of the heterogeneous database integration support apparatus according to the embodiment will be described with reference to FIG. 1.
[0011]
The heterogeneous database integration support apparatus of the present invention uses the databases managed as the design database group 1100, such as the piping design database 110c, the system design database 120c, and the equipment design database 130c. There may be more than these three databases. Using the tables, attributes, and attribute values stored in these databases, the design database comparison mapping model generation apparatus 1200 maps (associates) the attributes of two different tables. The generated mapping model is stored in the mapping model database 1300.
[0012]
Specifically, the relationship between the piping design database 110c and the system design database 120c is stored as mapping model 110-120, and the reverse relationship as mapping model 120-110. Likewise, the relationship between the system design database 120c and the equipment design database 130c is stored as mapping model 120-130, and the reverse relationship as mapping model 130-120. Both directions are kept so that the representation of information can be converted to suit the code system and unit system used by each database.
[0013]
The generated mapping model can be displayed on the management/editing terminal 1000, and its content can also be changed there through editing operations. The finally generated and edited mapping model is used by the tagged document conversion device 300 (shown as 310 and 320 in FIG. 3) implemented on the difference management data server 600 to generate tagged documents that indicate the state of consistency maintenance between the databases.
[0014]
FIG. 3 shows the configuration of a system for managing consistency information between databases. The database integration system in FIG. 3 is an embodiment in which a piping design system 110, a system design system 120, and a device design system 130 are provided as existing systems. In this example, the piping design system 110 consists of a terminal device 110a, a piping design application (program) 110b as a business application, and a piping design database 110c as a business database.
[0015]
The system design system 120 consists of a terminal device 120a, a system design application (program) 120b as a business application, and a system design database 120c as a business database. The device design system 130 consists of a terminal device 130a, a device design application (program) 130b as a business application, and a device design database 130c as a business database.
[0016]
By applying the mapping models 110-120 and 120-130 stored in the mapping model database 1300 to the databases 110c, 120c and 130c of the respective systems, an environment is provided in which the piping design database 110c and the device design database 130c are associated with each other, with the system design database 120c at the center.
[0017]
Further, the change detection processing devices 210, 220 and 230 detect whether the databases 110c, 120c and 130c have been changed. The difference data management server 600 receives, as input data, the change detection results and the signals 210 (i), 220 (i), 220 (ii) and 230 (i) carrying the data of the changed parts of the databases, and manages the consistency between the databases in the form of tagged documents.
[0018]
Here, a tag refers to information that gives meaning to data, that is, a descriptor created from attribute names and the like. When either the signal 210 (i) or the signal 220 (i) arrives at the difference data management server 600, the tagged document conversion processing device 310 is activated using the mapping model 110-120, generates a tagged document representing the consistency maintenance status between the system design database 120c and the piping design database 110c, and stores the generated document in the tagged document storage file system 410. When a difference is detected in the correspondence status between the system design and piping design databases, the tagged document storage file system 410 outputs data correction requests 410 (i) and 410 (ii) to the business applications 110b and 120b.
[0019]
Similarly, when either the signal 220 (ii) or the signal 230 (i) is generated, the tagged document conversion processing device 320 is activated using the mapping model 120-130, and a tagged document representing the consistency maintenance status between the system design database 120c and the device design database 130c is generated and stored in the tagged document storage file system 420. When a difference is detected in the correspondence status between the system design and device design databases, the tagged document storage file system 420 outputs data correction requests 420 (i) and 420 (ii) to the business applications 120b and 130b. The design data is then corrected by the system design application program 120b or the device design application program 130b.
[0020]
A display request from a user is issued from a browser on a terminal device 700, which is connected to the network separately from the existing systems, and is processed by the display request receiving device 500. The display request receiving device 500 selects the required tagged document from the tagged document storage file system 410 or 420, and selects a display form from the style setting file 510 prepared in advance, to generate a display screen.
[0021]
The example of FIG. 3 represents a system that manages the differences between the system design database 120c and the piping design database 110c, and between the system design database 120c and the device design database 130c. Differences between other pairs of databases can be confirmed with a similar configuration.
[0022]
FIG. 4 is a diagram showing the relationship between the system design database and the piping design database. FIG. 4A shows an example of actual database contents, in which a table in the piping design database 110c is associated with a table in the system design database 120c. For example, in each row, the piping number is associated with LC and LD by the relationship 110-120-1c, the maximum operating pressure with LG by 110-120-2c, and the maximum operating temperature with LH by 110-120-3c. This relationship is represented as a model in FIG. 4B, which schematically shows the correspondence in FIG. 4A.
[0023]
That is, under the content "system design_pipe design" there are a plurality of rows of the attribute item set ROW to be compared. The * attached to ROW indicates that there are a plurality of rows; if there is no *, there is exactly one. The constituent elements of ROW include information such as a piping number 110-120-3, a maximum operating pressure 110-120-4, and a maximum operating temperature 110-120-5. Furthermore, a hierarchical structure is established in which, as lower-level information of the piping number 110-120-1, the piping number on the system design side is associated with the content obtained by joining LD and LC on the piping design side with "-" (hyphen). The attribute values of the actual databases are associated using the association relationship shown in FIG. 4B.
[0024]
FIG. 5 shows an example description of the mapping model 110-120, which expresses the correspondence between the piping design database 110c and the system design database 120c in XML Schema, that is, in a syntax conforming to XML, the tagged document standard internationally standardized by the W3C (World Wide Web Consortium, http://www.w3c.org). In this example, the top-level element is defined as "system design_pipe design" and the attribute item set to be compared is defined as ROW; the ROW includes a piping number, a maximum operating pressure, and the like.
[0025]
Further, the information to be compared as the piping number is described by hierarchically expanding the corresponding information, namely the attribute item called piping number on the system design database side and the attribute item called LC_LD on the piping design database side. If the table name on the system design database side is $tableName1 and the table name on the piping design database side is $tableName2, the attribute items in the actual tables are represented by expressions such as $tableName1.piping number and CONCAT($tableName2.LD, CONCAT('-', $tableName2.LC)). Here, CONCAT('-', $tableName2.LC) means a function that joins the character string "-" and the character string $tableName2.LC.
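The nested CONCAT expression above can be illustrated with a short Python sketch. The helper names `concat` and `piping_design_key` and the two sample rows are assumptions made for illustration; only the attribute names LD and LC, the hyphen, and the value RHR-001 come from the description.

```python
def concat(a: str, b: str) -> str:
    """Two-argument string join, mirroring CONCAT in the mapping model."""
    return a + b


def piping_design_key(row: dict) -> str:
    """Evaluate CONCAT(LD, CONCAT('-', LC)) for one piping design row."""
    return concat(row["LD"], concat("-", row["LC"]))


# Hypothetical rows from the two tables.
system_design_row = {"piping_number": "RHR-001"}
piping_design_row = {"LD": "RHR", "LC": "001"}

# The composed key on the piping design side is compared with the
# piping number on the system design side.
keys_correspond = piping_design_key(piping_design_row) == system_design_row["piping_number"]
print(piping_design_key(piping_design_row), keys_correspond)
```

Composing the key on the piping design side keeps the system design value untouched, which is one way to read the hierarchy in the mapping model.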
[0026]
FIG. 6 shows an example of a tagged document generated using the mapping model. In this tagged document, the correspondence between the databases is expressed in XML, as shown in FIG. 6, by applying the mapping information to actual attribute values. The tags <system design_pipe design> and </system design_pipe design> are placed at the beginning and end of the information and enclose the entire correspondence.
[0027]
In <system design_pipe design>, "A", "2001/12/17 18:14:56", and "100" are described as parameters for the plant code, the update time, and the number of data, respectively. Each <ROW></ROW> tag pair inside the <system design_pipe design></system design_pipe design> tags represents one set of correspondences of actual data.
[0028]
In the <ROW> tag, a data serial number such as num = "0" is assigned. Inside the <ROW> tag, the semantic information of the attributes to be associated is represented by tags such as <piping number></piping number> and <maximum operating pressure></maximum operating pressure>. Within each of these, tags such as <pipe number_system design></pipe number_system design> and <LC_LD_pipe design></LC_LD_pipe design> hold the concrete information, for example RHR-001.
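The nesting described above (root element, ROW with a serial number, one element per compared concept, one child per database) can be sketched with Python's standard `xml.etree.ElementTree`. This is only an illustration: the element and attribute names are ASCII stand-ins chosen here because XML names cannot contain spaces, and the pressure value is a made-up sample.

```python
import xml.etree.ElementTree as ET


def build_tagged_document(rows, plant_code, update_time):
    """Build a tagged document: root -> ROW -> concept -> per-database value."""
    root = ET.Element(
        "system_design_piping_design",
        {"plant_code": plant_code, "update_time": update_time, "count": str(len(rows))},
    )
    for num, row in enumerate(rows):
        row_el = ET.SubElement(root, "ROW", {"num": str(num)})
        for concept, values in row.items():
            concept_el = ET.SubElement(row_el, concept)
            for element_name, value in values.items():
                ET.SubElement(concept_el, element_name).text = value
    return root


# Hypothetical data for one corresponding pair of records.
rows = [{
    "piping_number": {
        "piping_number_system_design": "RHR-001",
        "LC_LD_piping_design": "RHR-001",
    },
    "maximum_operating_pressure": {
        "maximum_operating_pressure_system_design": "8.62",
        "LG_piping_design": "8.62",
    },
}]

document = build_tagged_document(rows, "A", "2001/12/17 18:14:56")
print(ET.tostring(document, encoding="unicode"))
```

A difference between the two child values under one concept element is what a consistency check on such a document would look for.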
[0029]
FIG. 7 shows an example of a display screen generated from the tagged document of FIG. 6, that is, from a tagged document produced by the correspondence generation processing. Programs for converting tagged documents for display are generally published or sold and are easy to obtain. In FIG. 7, such a display conversion program converts a tagged document like that of FIG. 6 into a table-format display. The window 700-a displays the piping data comparison correspondence. For example, when plant A is selected as the plant name from the pull-down menu 701, the E11 residual heat removal system 702 is selected as the system from the tree menu, and the correspondence display button 703 is pressed, the lower right window 707 displays the correspondence between the attributes in order of piping number, for example the maximum operating pressure and LG, and the maximum operating temperature and LH. This display is generated from a tagged document as shown in FIG. 6.
[0030]
FIG. 8 is a diagram showing an outline of the flow of the process that compares the attribute values of two design databases with each other and generates a mapping model. This processing runs on the design database comparison mapping model generation apparatus 1200 and is started when the management / editing terminal 1000 outputs a database mapping model creation instruction signal.
[0031]
First, the database-related parameters specified from the management / editing terminal 1000 are stored in the memory used by the program (STEP 10). Examples of the database-related parameters include a table name and the primary key information of the table. Here, it is assumed that attributes included in tables stored in database 1 (DB1) and database 2 (DB2) are to be associated with each other.
[0032]
Next, domain information of the primary key of each table selected from DB1 and DB2 is collected (STEP 20). Here, the primary key means an attribute that specifies data in the row direction included in the table, and the domain information means a set of attribute values related to the column direction of a specific attribute. That is, the domain information of the primary key is a set of attribute values corresponding to an attribute that specifies data in the row direction.
[0033]
Next, common domain information of the primary key of each table selected from DB1 and DB2 is collected (STEP 30). The common domain information of the primary key is the set of attribute values that DB1 and DB2 have in common for the attributes that specify data in the row direction.
[0034]
Next, the attribute names of each table selected from DB1 and DB2 are collected (STEP 40). This means that for each of the two selected tables, the names of the attributes specifying the column direction are collected.
[0035]
Next, for each attribute name collected in STEP 40 between the tables selected from DB1 and DB2, it is determined whether or not the attribute values match, and the number of matches / mismatches is counted (STEP 50).
[0036]
Next, a mapping model for associating the tables selected from DB1 and DB2 is generated based on the matching index calculated from the number of matches and the number of mismatches of the attribute values obtained in STEP 50 (STEP 60).
[0037]
Finally, the processing is terminated by displaying the mapping model creation result on the management / editing terminal 1000 as shown in FIG. 1 (STEP 70).
[0038]
Hereinafter, the details of the processing illustrated in FIG. 8 will be described with reference to the drawings and the flowchart.
[0039]
FIG. 9 is a diagram showing an example of a display screen of the system for inputting data for the mapping model generation processing. This is used as an input parameter setting screen 1000-a for providing the management / editing terminal 1000 with essential data before the processing is started. The input parameter setting screen 1000-a has items for setting information separately for DB1 and DB2 (1000-a-1, 1000-a-2).
[0040]
The setting items include a table name 1000-a-3, a primary key attribute name 1000-a-4, a delimiter 1000-a-5, a search condition 1000-a-6, and a generation condition 1000-a-7. Here, for example, as shown in FIG. 4, "system design" and "piping design" are used as the table names of DB1 and DB2, and "piping number" and "LD", "LC" are used as the primary key attribute names. The delimiter 1000-a-5, the character string that separates the parts of a piping number attribute value such as "RHR-001", is set to "-". This makes it possible to extract "RHR" and "001" from "RHR-001" and to generate the information needed to compare "piping number" with "LD" and "LC". The search condition 1000-a-6 is used when the comparison should be restricted to specific attribute values, and may be left blank. When the search condition 1000-a-6 is blank, the comparison uses all the attribute values included in the table. The generation condition 1000-a-7 is specified for each of the tables "system design" and "piping design" of DB1 and DB2, for example as "system abbreviation = $systemNameAbb" and "la = $plantName and nt = $area and ld = $systemNameAbb". It specifies an external condition used when the tagged document conversion device 300 generates a tagged document from the two database tables. The system abbreviation, the plant name, and the building are supplied externally as the variables $systemNameAbb, $plantName, and $area, respectively. Here, "system abbreviation", "la", "nt", and "ld" are names of attributes included in the respective tables.
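The roles of the delimiter and of the generation condition with externally supplied variables can be sketched as follows. The function names, the use of Python's `string.Template` for the variable substitution, and the sample value "R1" for the building are assumptions made for illustration; the description does not say how the apparatus implements either step.

```python
from string import Template


def split_key(value: str, delimiter: str) -> tuple:
    """Split a composite key such as 'RHR-001' into its parts."""
    return tuple(value.split(delimiter))


def key_matches(piping_number: str, ld: str, lc: str, delimiter: str = "-") -> bool:
    """Compare a system design piping number with the LD and LC parts."""
    return split_key(piping_number, delimiter) == (ld, lc)


# Generation condition for the piping design table, with variables
# to be supplied from outside when a tagged document is generated.
generation_condition = Template("la = $plantName and nt = $area and ld = $systemNameAbb")
condition = generation_condition.substitute(plantName="A", area="R1", systemNameAbb="RHR")

print(split_key("RHR-001", "-"))
print(condition)
```

In a real system the substituted values would have to be passed as bound query parameters rather than pasted into the condition text, to avoid injection problems.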
[0041]
Common setting items that do not depend on the individual database include a storage directory name 1000-a-8, a number of comparison records 1000-a-9, a code conversion specification 1000-a-10, and an attribute value comparison method 1000-a-11. Here, the directory name 1000-a-8 on the computer in which the mapping model created by the mapping model generation process is stored is set to "C:\XMLSchema", the number of records used for comparing attribute values 1000-a-9 is set to "100", the code conversion specification 1000-a-10, which indicates whether a unit system or a unique code system is converted, is set to "yes", and the attribute value comparison method 1000-a-11 is set to "primary key-attribute value match". After these items are entered, pressing the model generation button 1000-a-12 starts the processing from STEP 10 onward in FIG. 8. Pressing the reset button 1000-a-13 clears the contents already entered on the input parameter setting screen 1000-a. The DB-related parameter designation process STEP 10 shown in FIG. 8 stores the contents specified in FIG. 9 in memory allocated by the program.
[0042]
FIG. 10 is a diagram showing the detailed flow of STEP 20, which collects the domain information of the primary keys of DB1 and DB2. This process is executed after the DB-related parameters have been set in STEP 10.
[0043]
In the process, first, 1 is set as the database number (STEP 20-1).
[0044]
Next, the table name and the primary key attribute name of DB1, which were entered on the input screen shown in FIG. 9 and stored in STEP 10, are set (STEP 20-3).
[0045]
Then, all the attribute values of the primary key in the table are searched and temporarily stored in the memory (STEP 20-4).
[0046]
Thereafter, the data number is set to 1 (STEP 20-5) and the count is set to 0 (STEP 20-6). While the data number is within the total number of data (STEP 20-7, Yes), it is determined whether the primary key attribute value for that data number is not empty (STEP 20-8). If the attribute value is not empty (STEP 20-8, Yes), the count is incremented by 1 (STEP 20-9), and the non-empty primary key attribute value is stored in the primary key domain information array, that is, in primary key domain information [count] (STEP 20-10). STEP 20-9 and STEP 20-10 are skipped when the attribute value is empty (STEP 20-8, No). After these processes, the data number is incremented by 1 (STEP 20-11), and the processing is repeated until the data number exceeds the total number of data obtained by the search in STEP 20-4 (until STEP 20-7, No).
[0047]
When the process is completed for database number 1, the database number is incremented by 1 and STEP 20-3 to STEP 20-12 are repeated for DB2.
[0048]
Through the above processing, the domain information on the attribute values related to the respective primary keys of the designated table, which are included in the respective databases of DB1 and DB2, is collected.
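The collection loop of STEP 20 can be sketched as below. This is a minimal illustration in which the search result of STEP 20-4 is assumed to be already available as a Python list; the function name and the sample values are hypothetical.

```python
def collect_primary_key_domain(primary_key_values):
    """Keep only the non-empty primary key values, in search order (STEP 20-5 to 20-11)."""
    domain = []
    for value in primary_key_values:           # one pass per data number
        if value is not None and value != "":  # STEP 20-8: skip empty values
            domain.append(value)               # STEP 20-9 and STEP 20-10
    return domain


# Hypothetical search results for the primary keys of DB1 and DB2.
primary_key_domain_1 = collect_primary_key_domain(["RHR-001", "", "RHR-002", None, "RHR-003"])
primary_key_domain_2 = collect_primary_key_domain(["RHR-002", "RHR-003", "RHR-004"])
print(primary_key_domain_1, primary_key_domain_2)
```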
[0049]
FIG. 11 is a diagram showing the flow of collecting the primary key common domain information. This figure explains the processing of STEP 30 in FIG. 8 in detail.
[0050]
First, the primary key domain information corresponding to each database obtained in the processing shown in FIG. 10 is stored in the primary key domain array. That is, the contents of the array of primary key domains relating to DB1 are stored in an array named primary key domain 1, and the contents of the array of primary key domains relating to DB2 are stored in an array named primary key domain 2. Then, the common domain data number is set to 0 (STEP 30-1).
[0051]
Next, the data number 1 for sequentially searching the elements of the array of the primary key domain 1 is set to 1 (STEP 30-2). When the data number 1 is within the number of data of the primary key domain 1, that is, the maximum number of elements of the array of the primary key domain 1 (STEP 30-3, Yes), the following processing is repeated.
[0052]
First, the data number 2 for sequentially searching the elements of the array of the primary key domain 2 is set to 1 (STEP 30-4). When the data number 2 is within the number of data of the primary key domain 2, that is, the maximum number of elements of the array of the primary key domain 2 (STEP 30-5, Yes), the following processing is further repeated.
[0053]
Here, the value of primary key domain 1 corresponding to data number 1, that is, primary key domain 1 [data number 1], is compared with the value of primary key domain 2 corresponding to data number 2, that is, primary key domain 2 [data number 2]. If they match (STEP 30-6, Yes), the common domain data number is incremented by 1 (STEP 30-7), and the values of primary key domain 1 [data number 1] and primary key domain 2 [data number 2] are stored in common domain [common domain data number] (STEP 30-8).
[0054]
After these processes are completed, the data number 2 is incremented by 1 (STEP 30-9), and the processes from STEP 30-5 to STEP 30-9 are repeated until the data number 2 exceeds the number of data of the primary key domain 2 (until STEP 30-5, No). After that, the data number 1 is incremented by 1 (STEP 30-10), and the processes from STEP 30-3 to STEP 30-10 are repeated until the data number 1 exceeds the number of data of the primary key domain 1 (until STEP 30-3, No).
[0055]
Through the above processing, primary key domain information common to DB1 and DB2 is collected in the common domain [common domain data number].
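The double loop of STEP 30 can be sketched as follows. The function name and sample values are hypothetical; the structure follows the flow described above, in which each matching pair of values is stored in the common domain.

```python
def collect_common_domain(primary_key_domain_1, primary_key_domain_2):
    """Nested-loop comparison of two primary key domains (STEP 30-2 to 30-10)."""
    common_domain = []
    for value_1 in primary_key_domain_1:        # loop over data number 1
        for value_2 in primary_key_domain_2:    # loop over data number 2
            if value_1 == value_2:              # STEP 30-6
                common_domain.append((value_1, value_2))  # STEP 30-7 and 30-8
    return common_domain


common_domain = collect_common_domain(
    ["RHR-001", "RHR-002", "RHR-003"],
    ["RHR-002", "RHR-003", "RHR-004"],
)
print(common_domain)
```

The nested loop costs time proportional to the product of the two domain sizes. When the keys are unique and hashable, a set intersection would give the same keys faster, at the cost of departing from the step numbering of the flow.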
[0056]
FIG. 12 is a diagram showing the flow of the attribute name collection processing of the database. This figure explains the processing of STEP 40 in FIG. 8 in detail.
[0057]
In the process, first, 1 is set as the database number (STEP 40-1). Next, the table name corresponding to the database number 1 is set as the table name (STEP 40-3). Next, the list of attribute names belonging to that table is retrieved by consulting the data dictionary of attribute names that the database itself maintains for each table (STEP 40-4).
[0058]
Next, the attribute name list obtained as a result of the search is stored in the attribute name list [1] (STEP 40-5).
[0059]
Next, the database number is incremented by one, and the processing from STEP 40-2 to STEP 40-6 is repeatedly performed for the database with the database number 2. As a result, the attribute name list of the database number 2 is stored in the attribute name list [2]. Then, when the condition of STEP 40-2, No is satisfied and the processing ends, the attribute name list is passed to the next STEP 50.
[0060]
Through the above processing, the attribute names of the tables specified in the databases DB1 and DB2 are collected.
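How attribute names can be read from a database's own dictionary is sketched below using SQLite, purely as an assumption: the description does not name a database product, and other systems expose this information differently (for example through catalog views). The table and column names are illustrative.

```python
import sqlite3


def collect_attribute_names(conn: sqlite3.Connection, table_name: str) -> list:
    """Return the column names of a table from SQLite's schema information."""
    if not table_name.isidentifier():
        raise ValueError(f"unexpected table name: {table_name!r}")
    # PRAGMA table_info yields one row per column; index 1 holds the column name.
    cursor = conn.execute(f"PRAGMA table_info({table_name})")
    return [row[1] for row in cursor.fetchall()]


# Hypothetical tables standing in for the selected tables of DB1 and DB2.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE system_design (piping_number TEXT, max_operating_pressure REAL)")
conn.execute("CREATE TABLE piping_design (LD TEXT, LC TEXT, LG REAL)")

attribute_name_list = {
    1: collect_attribute_names(conn, "system_design"),
    2: collect_attribute_names(conn, "piping_design"),
}
print(attribute_name_list)
```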
[0061]
FIG. 13 is a diagram showing the flow of attribute value coincidence information collection processing using a primary key. This figure explains the processing of STEP 50 in FIG. 8 in detail. In the process, first, 1 is set as the common domain data number (STEP 50-1). Also, to record the number of matches and the number of mismatches of attribute values between attribute number 1 and attribute number 2, an initial value of 0 is set in match count [attribute number 1] [attribute number 2] and mismatch count [attribute number 1] [attribute number 2] (STEP 50-2). In the figure, these are abbreviated as match count [] [] and mismatch count [] [].
[0062]
Next, the following processing is repeated until the common domain data number reaches the maximum number of the common domain information collected in STEP 30 (until STEP 50-3, No). Here, the information of the generated match count [] [] and mismatch count [] [] is used in STEP60.
[0063]
When the common domain data number is within the maximum number of the common domain information (STEP 50-3, Yes), first, 1 is set to the attribute number 1, which represents the order of the attributes stored in the table specified in DB1 (STEP 50-4). Then, the processing from STEP 50-5 to STEP 50-14 is repeated until the attribute number 1 exceeds the number of attributes stored in the table specified in DB1 (until STEP 50-5, No). When the result of STEP 50-5 is No, the common domain data number is incremented by 1 (STEP 50-15), and the processing from STEP 50-3 to STEP 50-15 is repeated.
[0064]
In the repetitive processing from STEP 50-5 to STEP 50-14, first, the attribute value on the DB1 side specified by common domain [common domain data number] and the attribute number 1 is assigned to data 1. When code conversion has been specified on the input screen shown in FIG. 9 in order to convert the code system or unit system of the attribute values of the database, the code conversion flag is set to Yes, and the value obtained by applying code conversion to the retrieved attribute value is used as the value for comparison (STEP 50-6).
[0065]
Next, 1 is set to the attribute number 2 representing the order of the attributes stored in the table specified in DB2 (STEP 50-7). The processes from STEP50-8 to STEP50-13 are repeated until the attribute number 2 reaches the maximum number of attributes stored in the table specified in DB2 (until STEP50-8, No). If the determination in STEP50-8 is No, the attribute number 1 is incremented by 1 (STEP50-14), and the processing from STEP50-5 to STEP50-14 is repeated.
[0066]
In the repetitive processing from STEP 50-8 to STEP 50-13, the attribute value on the DB2 side specified by common domain [common domain data number] and the attribute number 2 is assigned to data 2. When code conversion has been specified on the input screen shown in FIG. 9 in order to convert the code system or unit system of the attribute values of the database, the code conversion flag is set to Yes, and the value obtained by applying code conversion to the retrieved attribute value is used as the value for comparison (STEP 50-9).
[0067]
Next, when an evaluation function (here called the matching function) determines that data 1 and data 2 match (STEP 50-10, Yes), match count [attribute number 1] [attribute number 2] is increased by 1 (STEP 50-11). When the matching function determines that they do not match (STEP 50-10, No), mismatch count [attribute number 1] [attribute number 2] is increased by 1 (STEP 50-12). After the processing in STEP 50-11 or STEP 50-12, the attribute number 2 is incremented by 1 (STEP 50-13), and the processing from STEP 50-8 to STEP 50-13 is repeated.
[0068]
By the above processing, the degree of match and the degree of mismatch between the attribute values of the attributes belonging to the tables selected from DB1 and DB2 are generated as two-dimensional arrays, namely match count [attribute number 1] [attribute number 2] and mismatch count [attribute number 1] [attribute number 2].
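The triple loop of STEP 50 can be sketched as below. This is a simplified illustration: each table is assumed to be already loaded as a dictionary keyed by primary key value, the matching function defaults to plain equality, and code conversion is omitted. All names and sample values are hypothetical.

```python
def count_matches(rows_1, rows_2, common_keys, attributes_1, attributes_2,
                  matches=lambda a, b: a == b):
    """Count matches and mismatches for every pair of attributes (STEP 50)."""
    match_count = [[0] * len(attributes_2) for _ in attributes_1]
    mismatch_count = [[0] * len(attributes_2) for _ in attributes_1]
    for key in common_keys:                           # common domain data number
        for i, name_1 in enumerate(attributes_1):     # attribute number 1
            data_1 = rows_1[key][name_1]
            for j, name_2 in enumerate(attributes_2):  # attribute number 2
                data_2 = rows_2[key][name_2]
                if matches(data_1, data_2):
                    match_count[i][j] += 1
                else:
                    mismatch_count[i][j] += 1
    return match_count, mismatch_count


# Hypothetical records sharing two primary key values.
rows_1 = {"RHR-001": {"max_pressure": 8.6, "max_temp": 302},
          "RHR-002": {"max_pressure": 1.0, "max_temp": 100}}
rows_2 = {"RHR-001": {"LG": 8.6, "LH": 302},
          "RHR-002": {"LG": 1.0, "LH": 100}}

match_count, mismatch_count = count_matches(
    rows_1, rows_2, ["RHR-001", "RHR-002"], ["max_pressure", "max_temp"], ["LG", "LH"])
print(match_count, mismatch_count)
```

Passing the matching function as a parameter leaves room for tolerance-based comparison of numeric values or for comparison after unit conversion.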
[0069]
FIG. 14 is an example showing, in a graph format, information on the attribute matching degree obtained by the attribute value matching information collection processing using the primary key. For example, assuming that the table names set for DB1 and DB2 are "system design" and "piping design", respectively, FIG. 14 shows a comparison result excluding the attribute of the primary key.
[0070]
If the names of the attributes included in the system design table are the maximum operating pressure, the maximum operating temperature, and the seismic class, and the names of the attributes included in the piping design table are LG, LH, and LI, the graph shows that the evaluation found large matching indexes for the pairs maximum operating pressure-LG, maximum operating temperature-LH, and seismic class-LI. The matching indexes for the other attribute pairs are very low.
[0071]
Here, the match index is calculated, for example, as match count [attribute number 1] [attribute number 2] / (match count [attribute number 1] [attribute number 2] + mismatch count [attribute number 1] [attribute number 2]). The denominator represents the total number of compared attribute values that are not empty. If a threshold value of, for example, 0.7 is set for this match index, a pair of attributes whose index is greater than the threshold can be judged to correspond to each other, and a pair whose index is smaller than the threshold can be judged not to correspond.
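The match index and the threshold test can be written out as a short sketch. The function names are hypothetical, and returning 0.0 when no values were compared is an assumption added here to avoid division by zero; the description does not address that case.

```python
def match_index(match_count: int, mismatch_count: int) -> float:
    """match / (match + mismatch), or 0.0 when nothing was compared (assumption)."""
    total = match_count + mismatch_count
    return match_count / total if total else 0.0


def corresponds(match_count: int, mismatch_count: int, threshold: float = 0.7) -> bool:
    """Judge a pair of attributes as corresponding when the index exceeds the threshold."""
    return match_index(match_count, mismatch_count) > threshold


# Example: 95 matches and 5 mismatches over 100 compared records.
print(match_index(95, 5), corresponds(95, 5))
```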
[0072]
FIG. 22 is a diagram showing a flow of processing for generating a mapping model representing a correspondence between attributes of each table included in DB1 and DB2. This figure explains the processing of STEP 60 in FIG. 8 in detail. Here, for example, a template of a mapping model as shown in FIG. 5 is generated based on the processing results from STEP10 to STEP50.
[0073]
In the processing, first, the root element of the mapping model, that is, the first element of the tagged document as shown in FIG. 6, is generated from the table names selected from DB1 and DB2 (STEP 60-1). In the examples of FIGS. 5 and 6, the root element is the element described as "system design_piping design". Next, a row attribute element "ROW" corresponding to the rows of each table is generated and added as a child element to the root element (STEP 60-2). Here, "ROW" is added below "system design_piping design". Next, a primary key attribute element is generated and added as a child element to ROW (STEP 60-3).
[0074]
Further, a primary key element of the table of DB1 and its database connection information are generated and added as a child element to the primary key attribute element. This process is also performed for DB2 (STEP 60-4, STEP 60-5). Next, the attribute number 1 is first set to 1, and the processing from STEP 60-6 to STEP 60-14 is repeated for the attribute number 1 (STEP 60-6, Yes). Within this repetition, the attribute number 2 is set to 1, and the processing from STEP 60-7 to STEP 60-13 is repeated for the attribute number 2 (STEP 60-7, Yes).
[0075]
In the iterative process for attribute number 2, when match index [attribute number 1] [attribute number 2], described in connection with FIG. 13, is equal to or greater than a predetermined threshold value and neither attribute has yet been selected for mapping model generation (STEP 60-8, Yes), an element representing the correspondence between the attributes is generated. If the index is smaller than the threshold value, or an attribute has already been selected for mapping model generation (STEP 60-8, No), no element representing the correspondence is generated.
[0076]
The process of generating an element representing the correspondence between attributes is performed in STEP60-9 to STEP60-12. Here, a representative attribute name is generated and added to the ROW as a child element (STEP 60-9). The attribute name used as the representative attribute name is set to an attribute name belonging to one of the tables DB1 and DB2, and which one to use is determined in advance. Here, the attribute name belonging to the table of DB1 is selected.
[0077]
Next, an element is generated with the attribute name of the table selected from DB1, together with connection information to the database, and is added as a child element of the representative attribute name. This process is also performed for DB2 (STEP 60-10, STEP 60-11). Next, for attribute name 1 [attribute number 1] and attribute name 2 [attribute number 2], a flag is set to indicate that the attribute name has been used to generate an element of the mapping model (STEP 60-12). Next, after the attribute number 2 is incremented by 1 (STEP 60-13), the processing from STEP 60-7 to STEP 60-13 is repeated. After the elements for all attributes that can be associated have been generated in the mapping model, an element group for the attributes that could not be associated is generated and added to the ROW as child elements (STEP 60-15).
[0078]
According to the above processing, the matching index of the attribute values is evaluated; for attributes that can be associated, an element group expressing the correspondence is written into the mapping model, and for attributes that cannot be associated, an element group of unassociated attributes is generated. In this way, a template of the mapping model is produced.
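The selection logic of STEP 60-6 to STEP 60-15 can be sketched as below. This is a simplified illustration that returns plain Python structures instead of XML Schema elements and omits the primary key and connection information; the function name, the dictionary keys, and the index values are hypothetical.

```python
def generate_mapping(attributes_1, attributes_2, index, threshold=0.7):
    """Pair attributes whose match index reaches the threshold, each attribute at most once."""
    used_1, used_2 = set(), set()      # "already used" flags of STEP 60-12
    associated = []
    for i, name_1 in enumerate(attributes_1):
        for j, name_2 in enumerate(attributes_2):
            if index[i][j] >= threshold and i not in used_1 and j not in used_2:
                # The DB1 attribute name serves as the representative name.
                associated.append({"representative": name_1, "db1": name_1, "db2": name_2})
                used_1.add(i)
                used_2.add(j)
    unassociated = {
        "db1": [n for i, n in enumerate(attributes_1) if i not in used_1],
        "db2": [n for j, n in enumerate(attributes_2) if j not in used_2],
    }
    return associated, unassociated


# Hypothetical match index table (rows: DB1 attributes, columns: DB2 attributes).
attributes_1 = ["max_operating_pressure", "max_operating_temperature", "seismic_class"]
attributes_2 = ["LG", "LH", "LI", "LJ"]
index = [[0.95, 0.02, 0.00, 0.00],
         [0.01, 0.90, 0.00, 0.00],
         [0.00, 0.00, 0.85, 0.10]]

associated, unassociated = generate_mapping(attributes_1, attributes_2, index)
print(associated, unassociated)
```

This sketch pairs attributes in attribute-number order, as in the flow described above. Sorting the candidate pairs by index first would be one way to give priority to the highest match index.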
[0079]
FIG. 23 is a diagram showing an example of the screen used to edit the generated mapping model template. This screen is produced by the mapping model display processing STEP 70.
[0080]
The screen shown in this figure is displayed on the management / editing terminal 1000, and is operated by the user. In this screen example 1000-c, a system design_piping design is set as a root element in the display frame 1000-c-1, and an element called ROW is added below the root element.
[0081]
Below the ROW, an automatic association portion generated by the processing from STEP 10 to STEP 60 and a mismatched portion that could not be automatically associated are connected.
[0082]
An element in the mismatched portion can be moved to the automatically associated portion by the user's editing operation 1000-c-a1, so that elements can be associated between the databases manually. Elements in the automatic association result that the user judges should not be associated can also be deleted.
[0083]
After editing using such a screen, the mapping model is saved in the mapping model database 1300 by pressing a “correction result save button” 1000-c-2.
[0084]
With the above configuration and processing, a model for associating different databases can be automatically created based on the matching / mismatching of the attribute values of the two tables.
[0085]
Next, an embodiment in a case where processing and configuration are changed from the above embodiment will be described with reference to another drawing.
[0086]
FIG. 15 is a diagram showing a flow of the attribute value match information collection process using the frequency distribution of attributes, which replaces the attribute value match information collection process STEP 50 using the primary key of FIG. Because this process is used in place of STEP 50 in FIG. 8, its process number is set to STEP 50a.
[0087]
In the process, first, 1 is set to the attribute number 1 (STEP 50a-1).
[0088]
If the attribute number 1 is equal to or less than the number of attributes of the table 1 (the number of attributes 1) (STEP 50a-2, Yes), the record number is set to 1 (STEP 50a-3). If the record number is within the maximum number of records (STEP 50a-4, Yes), the attribute value in table 1 of database 1, obtained using the code conversion flag, the record number, and the attribute number 1, is assigned to the element of the array data 1 identified by the record number and the attribute number 1 (data 1 [record number] [attribute number 1]) (STEP 50a-5). Next, the record number is incremented by 1 (STEP 50a-6), and the processing from STEP 50a-4 to STEP 50a-6 is repeated.
[0089]
If the record number exceeds the maximum number of records (STEP50a-4, No), the attribute number 1 is incremented by 1 (STEP50a-7), and the processing from STEP50a-2 to STEP50a-7 is repeated.
[0090]
When the attribute number 1 exceeds the number of attributes in the table 1 (STEP 50a-2, No), a histogram generation function is applied to data 1 [] [attribute number 1], obtained from the record values, to generate histogram 1 [attribute number 1] for each attribute number 1, that is, a histogram representing the distribution of attribute values for each attribute in the table 1 (STEP 50a-8).
[0091]
Next, the same processing as in STEP 50a-1 to STEP 50a-8 is repeated for the attributes of table 2, and histogram 2, representing the distribution of attribute values for each attribute in table 2, is generated (STEP 50a-100). Finally, the area differences between histogram 1 and histogram 2 are compared over the attributes of table 1 (the number of attributes 1) and the attributes of table 2 (the number of attributes 2) (STEP 50a-200).
[0092]
FIG. 16 is a diagram illustrating an example of a comparison method for the frequency distributions (histograms) of attributes. For example, for the system design table of DB1, frequency distributions of attribute values are shown for the attributes (a) maximum operating pressure, (b) maximum operating temperature, and (c) seismic class. Similarly, for the piping design table of DB2, frequency distributions of attribute values are shown for the attributes (d) LG, (e) LH, and (f) LI. Here, (d) shows the frequency distribution after the unit system has been changed by code conversion.
[0093]
The type and range of the data on the horizontal axis are compared between these frequency distributions, and for those that can be compared, the difference in frequency along the vertical axis is obtained as an area difference, and a frequency distribution comparison graph as shown in (g) is created. This graph (g) is of the same kind as that shown in FIG. 14, but the match index on the vertical axis is 1 - (sum of the area differences for each section of the normalized frequency distributions). If the match index exceeds a certain threshold, the attributes are determined to be a matching attribute pair. In graph (g), the match index is set to 0 where the distribution of a character-string attribute, such as the seismic class or LI, cannot be compared with the distribution of a numeric attribute.
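The match index described above can be illustrated with a short sketch. The function names, the explicit bin edges, and the rule of returning 0 unless both attributes are numeric are assumptions made for this illustration; the text only states that string and numeric distributions cannot be compared. The index is computed literally as 1 - (sum of per-section differences), so two completely disjoint distributions yield a negative value, which does not affect a threshold test.

```python
def normalized_histogram(values, edges):
    """Relative frequency of values in each section [edges[i], edges[i+1])."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        for i in range(len(counts)):
            is_last = i == len(counts) - 1
            # The last section also includes its upper edge.
            if edges[i] <= v < edges[i + 1] or (is_last and v == edges[-1]):
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def match_index(values1, values2, edges):
    """1 - (sum of differences between the normalized frequency distributions)."""
    numeric1 = all(isinstance(v, (int, float)) for v in values1)
    numeric2 = all(isinstance(v, (int, float)) for v in values2)
    if not (numeric1 and numeric2):
        # Distributions that cannot be compared get a match index of 0.
        return 0.0
    h1 = normalized_histogram(values1, edges)
    h2 = normalized_histogram(values2, edges)
    return 1.0 - sum(abs(a - b) for a, b in zip(h1, h2))


def is_matching_pair(values1, values2, edges, threshold):
    """A pair matches when its match index exceeds the threshold."""
    return match_index(values1, values2, edges) > threshold
```

Because only the distributions are compared, no record-by-record alignment between the two tables is required.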
[0094]
As described above, by comparing attribute values using the frequency distribution, it is possible to extract a matching relationship between attributes even when the primary key of the comparison target table is unknown.
[0095]
Next, FIGS. 17 and 18 show processes that can be inserted between STEP 40 and STEP 50 in FIG. 9 and function as data conversion processes. FIG. 17 is a diagram showing a flow of attribute value coincidence information collection processing when a designated attribute value is decomposed.
[0096]
In the process, first, 1 is set to the database number (STEP 400-1). When the database number is 2 or less (STEP400-2, Yes), the following processing (STEP400-3 to STEP400-16) is repeated.
[0097]
In the process, first, 1 is set to the common domain data number (STEP 400-3). If the common domain data number is equal to or less than the maximum number of common domains (STEP 400-4, Yes), 1 is set to the attribute number (STEP 400-5).
[0098]
Next, if the attribute number is equal to or less than the number of attributes corresponding to the database number (number of attributes [database number]), that is, if the attribute number does not exceed the number of attributes included in the table of DB1 or DB2 (STEP 400-6, Yes), the attribute value obtained by an attribute value acquisition function from the database number, the common domain [common domain data number], and the attribute number is assigned to the attribute value identified by the database number, common domain data number, and attribute number (attribute value [database number] [common domain data number] [attribute number]) (STEP 400-7).
[0099]
Further, 1 is set as the division number (STEP 400-8).
[0100]
If the attribute value contains a delimiter such as "-", shown as 1000-a-5 in FIG. 9, or if an attribute value cutout has been designated in advance (STEP 400-9, Yes), a character string is cut out of attribute value [database number] [common domain data number] [attribute number] by a character cutout function, based on the character cutout condition specified by the delimiter, and is assigned to the attribute value identified by the database number, common domain data number, attribute number, and division number (attribute value [database number] [common domain data number] [attribute number] [division number]) (STEP 400-10). For example, "RHR" is cut out from the character string "RHR-001" under the condition of the delimiter character string "-".
[0101]
Then, attribute name [database number] [common domain data number] [attribute number] [division number] is newly generated by appending the division number to the original attribute name (attribute name [database number] [common domain data number] [attribute number]) (STEP 400-11). For example, a new attribute name "pipe number 1" is generated by appending the division number "1" to the attribute name "pipe number".
[0102]
Thereafter, the division number is incremented by 1 (STEP 400-12), and the process is repeated until there is no delimiter (STEP 400-9, No). If there is no delimiter, the attribute number is incremented by 1 (STEP400-13), and the processing from STEP400-6 to STEP400-13 is repeated until the attribute number reaches the maximum attribute number (STEP400-6, No).
[0103]
When the attribute number exceeds the maximum number of attributes, the common domain data number is incremented by 1, and the processing from STEP 400-4 to STEP 400-14 is repeated until the common domain data number exceeds the maximum number of common domains (STEP 400-4, No). When the common domain data number exceeds the maximum number of common domains, the database number is incremented by 1 (STEP 400-16), and the processing from STEP 400-3 to STEP 400-16 is repeated for DB2.
[0104]
When the process for DB2 is completed, the converted data, that is, attribute value [database number] [common domain data number] [attribute number] [division number] and attribute name [database number] [common domain data number] [attribute number] [division number], are passed to the next process, STEP 4000.
[0105]
By the above processing, even if the information of one attribute is composed of a plurality of structural parts, the attribute value can be decomposed and its parts extracted so that they can be compared with the information contained in another database.
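The decomposition of a single attribute value can be sketched as follows, using the "RHR-001" and "pipe number" examples from the text. The function name `split_attribute` and the choice to return a dictionary keyed by the new attribute names are assumptions made for this illustration.

```python
def split_attribute(name, value, delimiter="-"):
    """Cut an attribute value at the delimiter and name each part.

    Returns a dict mapping each new attribute name (original name plus
    division number) to the corresponding part of the value. A value
    without the delimiter is returned unchanged under its original name.
    """
    if value is None or delimiter not in value:
        return {name: value}
    parts = value.split(delimiter)
    return {f"{name} {number}": part for number, part in enumerate(parts, start=1)}


print(split_attribute("pipe number", "RHR-001"))
```

Each generated part can then be treated as an ordinary attribute in the subsequent comparison.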
[0106]
FIG. 18 is a diagram illustrating a flow of a process of selecting an attribute to be compared with an attribute value according to the data storage ratio of the attribute. Here, the processing shown in this figure is set as STEP4000. Note that in the processing flow, the expression of the processing for initializing the database number, the common domain data number, the attribute number, and the division number to 1 before each condition determination branch is omitted.
[0107]
In the process, first, since the database number has the initial value 1, the condition that the database number is 2 or less is satisfied (STEP 4000-1, Yes), and the processing is repeated until the common domain data number, the attribute number, and the division number each reach their maximum (until STEP 4000-2, 3, and 4 become No).
[0108]
In the innermost loop of the process, the attribute value identified by the database number, common domain data number, attribute number, and division number (attribute value [database number] [common domain data number] [attribute number] [division number]) is referenced (STEP 4000-5). When the attribute value is not NULL, that is, when it is not empty (STEP 4000-6, Yes), the count identified by the same four numbers (count [database number] [common domain data number] [attribute number] [division number]) is incremented by 1 (STEP 4000-7). It is assumed that the counts have been cleared to zero immediately before the start of processing. Thereafter, the division number is incremented by 1 (STEP 4000-9), and the process is repeated.
[0109]
After the above series of processing (STEP 4000-1 to STEP 4000-12) is completed, the attribute data storage rate [database number] [common domain data number] [attribute number] [division number] is obtained as count [database number] [common domain data number] [attribute number] [division number] / number of common domains (STEP 4000-13).
[0110]
The attribute data storage rate obtained by the above processing is used, when associating attributes, as a criterion for determining whether their attribute values should be compared. If attributes with low data storage rates are not used for association, attributes with relatively high data storage rates can be associated with each other, and the accuracy of attribute association can be improved.
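A minimal sketch of the storage-rate calculation follows, assuming that each record corresponds to one common domain value and is held as a dictionary. The function names, the treatment of both `None` and the empty string as empty, and the cutoff parameter `min_rate` are assumptions introduced here; the text does not specify a particular cutoff.

```python
def storage_rates(records, attribute_names):
    """Storage rate per attribute: non-empty count / number of records."""
    total = len(records)
    rates = {}
    for name in attribute_names:
        # Count the records in which this attribute actually holds a value.
        count = sum(1 for record in records if record.get(name) not in (None, ""))
        rates[name] = count / total if total else 0.0
    return rates


def select_attributes(rates, min_rate):
    """Keep only attributes whose storage rate is at least min_rate."""
    return [name for name, rate in rates.items() if rate >= min_rate]


# Hypothetical sample records for illustration only.
records = [
    {"pressure": 8.6, "remarks": None},
    {"pressure": 7.0, "remarks": ""},
    {"pressure": None, "remarks": "check"},
    {"pressure": 1.2, "remarks": None},
]
rates = storage_rates(records, ["pressure", "remarks"])
print(rates, select_attributes(rates, 0.5))
```

Filtering before the comparison step keeps sparsely filled attributes from producing accidental matches on a handful of values.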
[0111]
FIG. 19 is a diagram illustrating an example of the attribute value pool. The pool stores, for each representative attribute value, the correct attribute name obtained in past attribute association. For example, when obtaining data 1 or data 2 in STEP 50-6 and STEP 50-9 in FIG., the pool can be used to obtain a representative attribute name immediately without using a matching function. That is, the results of past attribute association can be used for new attribute association, which can contribute to improved accuracy of attribute association.
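One way to picture the use of such a pool is the lookup below. The dictionary layout, the function name `lookup_representative_name`, the majority-vote rule, and the sample pool entries are all assumptions made for illustration; the actual structure of the pool in FIG. 19 may differ.

```python
from collections import Counter


def lookup_representative_name(values, pool):
    """Return the attribute name most often associated with the given values.

    pool maps a representative attribute value to the attribute name that
    was confirmed for it in a past association. Returns None when none of
    the values is found in the pool.
    """
    votes = Counter(pool[v] for v in values if v in pool)
    if not votes:
        return None
    return votes.most_common(1)[0][0]


# Hypothetical pool contents for illustration only.
pool = {"RHR": "system code", "As": "seismic class"}
print(lookup_representative_name(["RHR", "RHR", "XYZ"], pool))
```

When the lookup returns None, the ordinary value-matching process would still be needed for that attribute.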
[0112]
FIG. 20 is a diagram illustrating an example of a screen for inputting data for combining a plurality of tables using a screen for inputting variables for a process of generating mapping data. This is almost the same as the input parameter setting screen 1000-a shown in FIG. 9, except that the number of input fields for the table name is increased to a plurality (1000-a-31 to 32) and table connection condition designation buttons 1000-a-34 and 35 are added.
[0113]
Clicking the table connection condition designation button 1000-a-34 on the screen opens a table connection condition setting dialog 1000-b. Here, a master table 1000-b-1 and a connection table 1000-b-2 are designated, an attribute connection condition is designated in 1000-b-3, other conditions are designated in 1000-b-4, and the settings are confirmed with the OK button 1000-b-5. In the screen example, the plant management table and the system design table of DB1 are connected by the plant code of each table, and as another condition, the plant name of the plant management table is limited to "A plant". When the user clicks the table connection condition designation button 1000-a-35, the table connection condition for DB2 can be set.
[0114]
FIG. 21 is an example of a join table used for associating a design database. The example shows a case in which the plant management table and the system design table in DB1 are joined on the plant code attribute when the plant code is A. By doing so, a plurality of tables can be represented as one table, so that the process of creating a database association model as shown in FIG. 8 can also be applied to a case involving a plurality of tables.
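The joining of a master table and a connection table can be sketched as follows, with tables held as lists of dictionaries. The function name `join_tables`, the column names, and the sample rows are assumptions for illustration; the join key and the "A plant" condition follow the screen example described above.

```python
def join_tables(master, connected, key, condition=lambda row: True):
    """Join two tables on a shared key, keeping master rows that satisfy condition."""
    # Index the connection table by the join key for direct lookup.
    index = {}
    for row in connected:
        index.setdefault(row[key], []).append(row)
    joined = []
    for master_row in master:
        if not condition(master_row):
            continue
        for connected_row in index.get(master_row[key], []):
            merged = dict(master_row)
            merged.update(connected_row)
            joined.append(merged)
    return joined


# Hypothetical sample tables for illustration only.
plant_management = [
    {"plant_code": "A", "plant_name": "A plant"},
    {"plant_code": "B", "plant_name": "B plant"},
]
system_design = [
    {"plant_code": "A", "system": "RHR"},
    {"plant_code": "B", "system": "XYZ"},
]
joined = join_tables(
    plant_management,
    system_design,
    "plant_code",
    condition=lambda row: row["plant_name"] == "A plant",
)
print(joined)
```

The joined result has the shape of a single table, so it can be passed to the attribute comparison in the same way as an ordinary table.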
[0115]
As described above, according to the embodiment of the present invention, a model for associating different databases can be automatically created based on the matching / mismatching of the attribute values of a plurality of tables.
[0116]
Further, in the apparatus provided with means for comparing attribute values using a frequency distribution, it is possible to extract a matching relationship between attributes even when the primary key of the table to be compared is unknown.
[0117]
Further, an apparatus provided with processing means that, when comparing attribute values, uses a code conversion function for attribute values, created in advance, to match the code system or unit system used in the database of the comparison partner can compare attribute values even when the unit systems and code systems used in the databases differ.
[0118]
Also, in an apparatus provided with means that, when comparing attribute values, divides the content of the attribute value into a plurality of elements for an attribute specified in advance from outside and uses the divided elements for comparison with the attribute values of the attributes of the table of the comparison target database, even if the information of one attribute is composed of a plurality of structural parts, the attribute value can be decomposed and its parts extracted so that they can be compared with the information contained in the other database.
[0119]
In addition, in an apparatus provided with means for removing attributes having a small amount of stored data from the attributes to be compared before attribute values are compared, attributes with low data storage rates are not used for association, so attributes having relatively high data storage rates can be associated with each other, and the accuracy of attribute association can be improved.
[0120]
Also, in an apparatus having means that determines, for each attribute of each table, the design semantic information with which the attribute is associated from its attribute value information, using information indicating which design semantic information each set of attribute values is associated with, and that determines attributes associated with the same design semantic information to be attributes to be associated with each other, the results of past attribute association can be used for new attribute association, which can contribute to improved accuracy of attribute association.
[0121]
Further, in an apparatus having means for treating at least one table to be compared as one table formed by joining a plurality of tables, a plurality of tables can be expressed as one table, so the process of creating a database association model can also be applied to a case involving a plurality of tables.
[0122]
[Effect of the invention]
As described above, according to the present invention, a model for associating different databases can be created from a result of comparing attribute values of a plurality of tables of different databases.
[Brief description of the drawings]
FIG. 1 is a configuration diagram of a system that generates a model that maps a design database according to an embodiment of the present invention.
FIG. 2 is a diagram showing types of general plant design operations and relationships between the operations.
FIG. 3 is an example of a configuration diagram of a database integration system.
FIG. 4 is a diagram showing the relationship between the system design database and the piping design database.
FIG. 5 is a diagram showing, in XML Schema format, a model for associating a system design database with a piping design database.
FIG. 6 is a diagram illustrating an example of a tagged document generated by the correspondence generation processing.
FIG. 7 is a diagram illustrating an example of a screen constructed using a tagged document created by a correspondence generation process.
FIG. 8 is a diagram illustrating an outline of a flow of a process of comparing attribute values of two design databases with each other and generating a mapping model.
FIG. 9 is a diagram showing an example of a display screen of a system for inputting data for mapping / model generation processing.
FIG. 10 is a diagram showing a flow of a primary key domain information collecting process.
FIG. 11 is a diagram showing a flow of collecting primary key common domain information.
FIG. 12 is a diagram showing a flow of a process of collecting attribute names of a database.
FIG. 13 is a diagram showing a flow of attribute value coincidence information collection processing using a primary key.
FIG. 14 is a diagram showing a graph relating to the attribute coincidence obtained by the attribute value coincidence information collection processing using the primary key.
FIG. 15 is a diagram showing a flow of attribute value coincidence information collection processing using the frequency distribution of attributes.
FIG. 16 is a diagram illustrating an example of a comparison method for the frequency distribution of attributes.
FIG. 17 is a diagram showing a flow of attribute value coincidence information collection processing when a designated attribute value is decomposed.
FIG. 18 is a diagram showing a flow of a process of selecting an attribute to be compared with an attribute value according to the data storage ratio of the attribute.
FIG. 19 is a diagram illustrating an example of an attribute value pool.
FIG. 20 is a diagram showing an example of a screen for inputting data for combining a plurality of tables using a screen for inputting variables for a process of generating mapping data.
FIG. 21 is an example of a join table used for associating a design database.
FIG. 22 is a diagram showing a flow of a mapping model generation process.
FIG. 23 is a diagram illustrating an example of a screen being edited with respect to a generated mapping model template.
[Explanation of symbols]
110-120, 120-110, 120-130, 130-120: mapping model; 110c-130c: database used by each design application; 300: tagged document conversion device; 1000: management / editing terminal; 1100: design database group; 1200: design database comparison mapping model generation device; 1300: mapping model database.

Claims

1. A heterogeneous database integration support device comprising: means for comparing attribute values of attributes of tables belonging to different databases; means for associating attribute items in the tables with each other based on the result of the comparison between the attribute values; and means for creating a mapping model that associates the attribute items between the different databases with each other.

2. The heterogeneous database integration support device according to claim 1, wherein the means for comparing the attribute values has means for calculating the degree of coincidence between the attribute values in each of the tables, and the associating means has means for extracting, from the calculation result of the calculating means, attribute pairs whose degree of coincidence is equal to or greater than a predefined threshold value.

3. The heterogeneous database integration support device according to claim 1, wherein the means for comparing the attribute values has means for taking a frequency distribution of attribute values for each attribute and comparing the shapes of the frequency distributions, and the associating means has means for extracting, as a matching attribute pair, the pair having the smallest shape difference in the comparison result produced by the means for comparing the shapes of the frequency distributions.

4. The heterogeneous database integration support device according to claim 2 or 3, comprising processing means that, when comparing attribute values, uses a code conversion function for attribute values, created in advance, to match the code system or unit system used in the database of the comparison partner.

5. The heterogeneous database integration support device according to claim 2 or 3, having means that, when comparing attribute values, divides the content of the attribute value into a plurality of elements for an attribute specified in advance from outside and uses the divided elements for comparison with the attribute values of the attributes of the table of the database to be compared.

6. The heterogeneous database integration support device according to claim 1, further comprising means for removing attributes having a small amount of stored data from the attributes to be compared before the attribute values are compared.

7. A heterogeneous database integration support device comprising means that determines, for each attribute of each table, the design semantic information with which the attribute is associated from its attribute value information, using information indicating which design semantic information each set of attribute values is associated with, and that determines attributes associated with the same design semantic information to be attributes to be associated with each other.

8. The heterogeneous database integration support device according to claim 1, further comprising means for treating at least one table to be compared as one table formed by joining a plurality of tables and performing the comparison on that table.