JP2010218386A - Method of matching data between business models and computer program

Info

Publication number: JP2010218386A
Application number: JP2009066307A
Authority: JP
Inventors: Hiroyuki Kobayashi; 宏至小林
Original assignee: Hitachi Software Engineering Co Ltd
Current assignee: Hitachi Software Engineering Co Ltd
Priority date: 2009-03-18
Filing date: 2009-03-18
Publication date: 2010-09-30
Anticipated expiration: 2029-03-18
Also published as: JP5145275B2

Abstract

PROBLEM TO BE SOLVED: To perform data matching between business models, each composed of a data model and a process model.

SOLUTION: The data model and the process model of a first business model are integrated into one graph, based on the relations between their elements, to generate a first extended process graph as data. Similarly, the data model and the process model of a second business model are integrated into one graph, based on the relations between their elements, to generate a second extended process graph as data. The elements of the first extended process graph are then paired with the elements of the second extended process graph, and a pairwise connection graph that describes the relations between the pairs by labels is generated as data. A similarity propagation graph, in which a similarity and a propagation coefficient are set between pairs, is generated as data, and the similarity of each pair is calculated by iterative computation. Pairs with high similarity, selected by filtering, are then presented on a display device.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to a signal processing technique for data matching between a plurality of business models. One example is a technique that uses signal processing to match a business model used in a particular company against a business model used as an industry reference.

Industry groups are currently standardizing business models in order to curb investment in IT infrastructure and to enable interconnection. In this specification, a business model that an industry group has defined as a reference is called a "reference business model". Examples of reference business models include NGOSS (New Generation Operations Systems and Software) in the telecom field, HL7 (Health Level Seven) in the health care field, and ACORD (Association for Cooperative Operations Research and Development).

However, when a reference business model is deployed within a company, matching it against the business models that already exist in the company becomes a problem. In general, a business model consists of a data model, which represents the data to be exchanged, and a process model, which represents the processing and its procedure. Taken together, the two models can contain tens of thousands of items to be matched. At present, no technique for matching two business models has been established, so the matching work is done by hand and the large number of man-hours it requires is a problem.

Matching between the metamodels (schemas) of data models is a known technique for automating the matching of a plurality of business models. FIG. 1 shows a specific example of a metamodel 101 of a reference data model corresponding to an order slip (FIG. 1A) and a specific example of a metamodel 102 of the corresponding existing data model (FIG. 1B). A data model is generally given as a UML class diagram, a data flow diagram, or the like. In the following description, data models are written in UML notation, as shown in FIG. 1. Existing methods thus generally perform matching at the metamodel level. Existing matching techniques use the similarity of class and attribute names and types, as well as relations between classes such as inheritance, aggregation, and dependency.

JP 2007-188343 A

Melnik, "Generic Model Management", Springer, 2004, pp. 117-131

However, no mechanism has yet been realized that integrates, by data matching, an entire business model composed of a data model and a process model. In particular, the metamodel-level matching method described above for data models cannot be applied as it is to process models, because the process models to be matched are themselves created from a common metamodel.

FIG. 2 shows a specific example of a reference process model 210 for a purchasing operation (FIG. 2B) and a specific example of the corresponding existing process model 220 (FIG. 2C). A process model is written as a graph, for example in BPMN (Business Process Modeling Notation) or as a UML activity diagram. In the following description, process models are written in BPMN notation. The metamodel 201 of the process model is shown in FIG. 2A. The metamodel 201 is expressed by the nodes (Node) and edges (Edge) that make up a graph. As noted above, the reference process model 210 and the existing process model 220 are both created from the metamodel 201.

Process models therefore have to be matched at a level different from the level at which data models are matched (that is, the metamodel level). In addition, matching at the process model level has to deal with an ambiguity peculiar to process models: two process models that are semantically the same can still be written with structurally different notations.

To realize data matching at the business model level, the inventor therefore provides the following mechanism, which can treat the data model and the process model on an equal footing and can resolve the ambiguity of the process model.

First, a data processing unit integrates the data model and the process model of a first business model into one graph, based on the relations between their elements, and generates a first extended process graph as data. Similarly, the data processing unit integrates the data model and the process model of a second business model into one graph, based on the relations between their elements, and generates a second extended process graph as data. The data processing unit then forms pairs from the elements of the first extended process graph and the elements of the second extended process graph, and generates, as data, a pairwise connection graph that describes the relations between the pairs by labels. Next, the data processing unit generates, as data, a similarity propagation graph in which a similarity and a propagation coefficient are set between pairs, and calculates the similarity of each pair by iterative computation. Finally, the data processing unit presents the pairs with high similarity, selected by filtering, on a display device.

According to the present invention, the man-hours required for matching can be greatly reduced compared with conventional methods. In addition, the extended process graph proposed by the inventor allows the data model and the process model to be handled in a unified manner. As a result, two business models can be matched in their entirety, which improves matching accuracy.

FIG. 1 is a diagram illustrating a specific example of a data model.
FIG. 2 is a diagram illustrating a specific example of a process model.
FIG. 3 is a diagram showing an outline of the system configuration of the matching system according to the embodiment.
FIG. 4 is a flowchart showing an outline of the matching process according to the embodiment.
FIG. 5 is a flowchart showing an example of the execution procedure of the matching process according to the embodiment.
FIG. 6 is a diagram showing an example of the normalization of a process model.
FIG. 7 is a diagram illustrating a specific example of an extended process graph.
FIG. 8 is a diagram illustrating a specific example of a pairwise connection graph.
FIG. 9 is a diagram illustrating a specific example of a similarity propagation graph.

An embodiment of a matching system that executes data matching between business models is described below.

(1) Embodiment
(1-1) Configuration of the Matching System
FIG. 3 shows a configuration example of a matching system that executes matching between business models. The matching system according to the embodiment consists of a computer 300, an input device 320, and a display device 330. The computer 300 consists of a CPU 301 that executes data operations, a ROM 302, a RAM 303, a hard disk drive 306, a CPU bus 312 that transfers data between these devices, and interfaces 309 to 311 that couple these devices to the CPU bus 312.

The RAM 303 provides at least an execution area for an inter-business-model matching program 304, which causes the CPU 301 to execute the computation, and a work area 305 that stores data generated temporarily during computation. The storage area of the hard disk drive 306 provides at least a program storage unit 307, which stores the inter-business-model matching program 304, and a data storage unit 308, which stores the data of the reference business model and the existing business model.

The description of the embodiment assumes that the matching process is realized as one function of a program executed on the computer 300, but it can also be realized as a device dedicated to the matching process. In that case, the processing function can be implemented as an ASIC or a dedicated processing board. These hardware configurations correspond to the "data processing unit" in the claims.

The inter-business-model matching program 304 may be written to the hard disk drive 306 from a recording medium on which the program is recorded, or through a broadcast signal, a communication signal, or the like that contains the program.

In this embodiment the inter-business-model matching program 304 is written to the hard disk drive 306, but the recording medium may be a flexible disk, a CD-ROM, a DVD-ROM, a semiconductor memory, or any other computer-readable storage medium.

(1-2) Matching Operation Between Business Models
(a) Outline of Operation
FIG. 4 shows an outline of the data matching process executed through the inter-business-model matching program 304. The process is started by a user operation on the input device 320 (step 401). When the computer 300 detects the operation, it reads the data of the reference business model and the data of the existing business model that were designated as matching targets from the data storage unit 308 of the hard disk drive 306. The data of the reference business model consists of the metamodel 101, which corresponds to the reference data model, and the reference process model 210. The data of the existing business model consists of the metamodel 102, which corresponds to the existing data model, and the existing process model 220. The computer 300 stores the data it has read in the work area 305 of the RAM 303 (step 402).

Following the processing procedure described in the inter-business-model matching program 304, the computer 300 executes the matching computation on the business model data stored in the work area 305 (step 403). The processing procedure is described in detail later. The results of the matching computation are stored in the work area 305.

The computer 300 displays the calculation results stored in the work area 305 on the display device 330 (step 404). This ends the processing of the inter-business-model matching program 304 (step 405).

The user who is shown the calculation results on the display device 330 then begins checking the candidate matches that were presented as likely. This check only needs to cover the matching results with high confidence, so the user's work is greatly reduced.

(b) Detailed Operation of the Matching Computation
Next, the detailed operation of the matching computation executed in step 403 is described. FIG. 5 shows an outline of the processing procedure executed in the matching computation.

(Step 502)
After the matching computation starts, the computer 300 executes an initial matching process. In this process, the computer 300 compares the metamodel 101 of the reference data model with the metamodel 102 of the existing data model, element by element, as character strings, and likewise compares the reference process model 210 with the existing process model 220, element by element, as character strings. In the example of FIG. 2, the task "verification" of the reference process model 210 and the task "verification" of the existing process model 220 have the same name. In this case, the computer 300 determines that these elements match.
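The initial matching of step 502 can be sketched as an exact string comparison of element names. This is a minimal illustration under assumptions: the function name `initial_matches`, the dictionary layout, and the task names other than "verification" are hypothetical; only the element identifiers RT6 and AT5 for "verification" come from the description of FIG. 7.

```python
def initial_matches(reference_names, existing_names):
    """Return (reference_id, existing_id) pairs whose names are identical strings."""
    # Index the existing model by name so each reference element is looked up once.
    by_name = {}
    for elem_id, name in existing_names.items():
        by_name.setdefault(name, []).append(elem_id)
    return [
        (ref_id, ex_id)
        for ref_id, name in reference_names.items()
        for ex_id in by_name.get(name, [])
    ]


# Hypothetical element names; only "verification" (RT6 / AT5) appears in the text.
reference = {"RT5": "ordering", "RT6": "verification"}
existing = {"AT4": "purchase", "AT5": "verification"}
print(initial_matches(reference, existing))
```

Exact comparison is the simplest reading of "compares as character strings"; a real system might normalize case or whitespace first, which the text does not specify.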

(Step 503)
Next, the computer 300 examines the reference process model 210 and the existing process model 220, searches for parts that have the same meaning but can be written with several different structures, and rewrites them into one predetermined structure. In other words, it normalizes the process models. FIG. 6 shows examples of process model normalization. FIG. 6A is an example in which a process model 600 that contains a return edge (back edge) is rewritten into a process model 601 that uses a loop construct. FIG. 6B is an example in which a process model 610, whose consecutive tasks do not exchange data with one another, is rewritten into a process model 611 that executes them in parallel.

Normalizing the process models in advance in this way converts differences in how a model structure is written into a unified representation, which improves matching accuracy. In addition, removing return edges during normalization fixes the order of processing, which makes it possible to establish correspondences between elements and to label edges on the extended process graph described later.
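As a rough illustration of the first half of this normalization, the sketch below only locates return edges with a depth-first search; rewriting them into a loop construct, as in FIG. 6A, is not shown. The function name `find_back_edges`, the adjacency-dictionary format, and the node names A to D are assumptions made for the example.

```python
def find_back_edges(successors, start):
    """Depth-first search; an edge that leads to a node still being visited is a return edge."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current DFS path, finished
    colour = {}
    back_edges = []

    def visit(node):
        colour[node] = GREY
        for nxt in successors.get(node, []):
            state = colour.get(nxt, WHITE)
            if state == GREY:
                back_edges.append((node, nxt))
            elif state == WHITE:
                visit(nxt)
        colour[node] = BLACK

    visit(start)
    return back_edges


# Hypothetical flow: A -> B -> C -> D, with C returning to B.
flow = {"A": ["B"], "B": ["C"], "C": ["B", "D"], "D": []}
print(find_back_edges(flow, "A"))
```

Once the return edges are known, the remaining edges form an acyclic flow whose order is fixed, which is the property the later labeling step relies on.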

(Step 504)
The computer 300 then generates, for each business model, a graph in which the data model is integrated with the process model (hereinafter called an "extended process graph"). That is, the computer 300 creates an extended process graph 710 (FIG. 7A) corresponding to the reference business model, based on the relations between the elements of the metamodel 101 of the reference data model and the reference process model 210. Similarly, it creates an extended process graph 720 (FIG. 7B) corresponding to the existing business model, based on the relations between the elements of the metamodel 102 of the existing data model and the existing process model 220.

The extended process graph G_ep integrates the process model and the data model, and is defined as shown in equation (1) below.

The extended process graph G_ep is defined by a node set V_ep and an edge set E_ep. The node set V_ep is the union of the node set V^process of the process model and the node set V^data of the data model. When one or more virtual nodes are defined on an edge between the process model and the data model, as shown in FIG. 7, the set of these nodes is also included in V_ep. The edge set E_ep is the union of the edge set E^process of the process model, the edge set E^data of the data model, and the set E^cross of edges between the process model and the data model. In this specification, for both the process model and the data model, an edge (v_s, l, v_t) is defined by the combination of a start node v_s, an end node v_t, and a label l.
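Equation (1) itself is not reproduced in this text. Based only on the definitions in the preceding paragraph, it can plausibly be reconstructed as follows. This is a reconstruction, not the original formula, and the symbol V^{virtual} for the optional virtual nodes is an assumed name.

```latex
G_{ep} = (V_{ep},\, E_{ep}), \qquad
V_{ep} = V^{process} \cup V^{data} \;(\cup\; V^{virtual}), \qquad
E_{ep} = E^{process} \cup E^{data} \cup E^{cross},
\qquad
E_{ep} \ni (v_s,\, l,\, v_t), \quad v_s, v_t \in V_{ep}
```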

In the extended process graph G_ep of this embodiment, edges that come from the data model, edges that come from the process model, and edges that connect the data model and the process model are given different labels so that they can be distinguished in the data. In the extended process graphs 710 and 720, for example, edges that come from the process model are labeled L1 or L2, edges that come from the data model are labeled O1, and edges that connect the process model and the data model are labeled C1. Other edges are given labels that express the relation between elements, such as attribute and type.

In addition, in the extended process graph G_ep of this embodiment, the edges upstream and downstream (in process order) of an element matched by the initial matching of step 502 are given different labels so that they can be distinguished in the data. In this embodiment, the task "verification" of the reference process model 210 and the task "verification" of the existing process model 220 match in step 502. In the extended process graph 710 shown in FIG. 7A, edges upstream of the element RT6, which corresponds to the task "verification", are therefore labeled L1, and edges downstream of RT6 are labeled L2. Similarly, in the extended process graph 720 shown in FIG. 7B, edges upstream of the element AT5, which corresponds to the task "verification", are labeled L1, and edges downstream of AT5 are labeled L2. This prevents elements after RT6 in the extended process graph 710 of the reference business model from being matched with elements before AT5 in the extended process graph 720 of the existing business model.
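The upstream/downstream labeling can be sketched as follows for the simplest case of a linear task sequence. The function name `label_process_edges` and the node identifiers other than RT6 are hypothetical, and treating the process as a plain chain is an assumption; the graphs in FIG. 7 may branch.

```python
def label_process_edges(ordered_nodes, anchor):
    """Label consecutive process edges L1 upstream of the anchor node and L2 downstream of it."""
    anchor_pos = ordered_nodes.index(anchor)
    edges = []
    for pos in range(len(ordered_nodes) - 1):
        # The edge that ends at the anchor still counts as upstream (L1).
        label = "L1" if pos < anchor_pos else "L2"
        edges.append((ordered_nodes[pos], label, ordered_nodes[pos + 1]))
    return edges


# RT6 is the element matched in step 502; the other identifiers are illustrative.
sequence = ["RT4", "RT5", "RT6", "RT7"]
for edge in label_process_edges(sequence, "RT6"):
    print(edge)
```

Because the pairwise connection graph only combines edges with equal labels, an L2 edge of one model can never be combined with an L1 edge of the other, which is how the labeling keeps matches from crossing the anchor.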

In the extended process graph G_ep of this embodiment, node elements that come from the data model and node elements that come from the process model are also given different labels so that they can be distinguished in the data. The extended process graph G_ep of the reference business model and the extended process graph G_ep of the existing business model likewise use different labels. Furthermore, node elements are given serial numbers in order from the upstream side within the model they come from. This makes the positions of the elements of the extended process graphs 710 and 720 within the process explicit, so that when the pairwise connection graph described later is generated, pairs of elements from the extended process graph 710 and the extended process graph 720 can be created without excess or omission.

(Step 505)
The computer 300 then creates a pairwise connection graph, which expresses as a graph the connection relations of all conceivable pairs formed from the elements of the extended process graph 710 and the elements of the extended process graph 720 generated in step 504. FIG. 8 shows part of the pairwise connection graph created from the extended process graphs of FIG. 7. In FIG. 8, the pairwise connection graph is created by pairing those nodes of the extended process graphs whose labels match. Subgraphs 810 (FIG. 8A), 820 (FIG. 8B), 830 (FIG. 8C), and 840 (FIG. 8D) are examples in which the edge label is L1, L2, activityType, and C1, respectively.

The pairwise connection graph G_pc is a graph for examining the relations between the nodes of the extended process graph G_ep(reff), which corresponds to the reference business model, and the nodes of the extended process graph G_ep(AsIs), which corresponds to the existing business model. It is defined as shown in equation (2) below. Each element of the node set V_pc is a pair consisting of an element of the node set V_ep(reff) of the extended process graph of the reference business model and an element of the node set V_ep(AsIs) of the extended process graph of the existing business model. The edge set E_pc is built from those edges of the edge set E_ep(reff) of the extended process graph of the reference business model and the edge set E_ep(AsIs) of the extended process graph of the existing business model whose labels l are equal.
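Equation (2) is not reproduced in this text. The sketch below shows one way to build a pairwise connection graph that is consistent with the prose definition: two edges with the same label, one from each extended process graph, yield one edge between the pair of their start nodes and the pair of their end nodes. The function name, the tuple layout, and the identifiers RT5, AT4, RD1, and AD1 are assumptions for illustration.

```python
def pairwise_connection_graph(ref_edges, asis_edges):
    """Combine equally labeled edges (start, label, end) into edges between node pairs."""
    pc_edges = []
    for ref_start, ref_label, ref_end in ref_edges:
        for asis_start, asis_label, asis_end in asis_edges:
            if ref_label == asis_label:
                pc_edges.append(
                    ((ref_start, asis_start), ref_label, (ref_end, asis_end))
                )
    # The node set is every pair that occurs at either end of some edge.
    pc_nodes = sorted({pair for start, _, end in pc_edges for pair in (start, end)})
    return pc_nodes, pc_edges


# Hypothetical fragments of the graphs 710 and 720.
ref_edges = [("RT5", "L1", "RT6"), ("RT6", "C1", "RD1")]
asis_edges = [("AT4", "L1", "AT5"), ("AT5", "C1", "AD1")]
nodes, edges = pairwise_connection_graph(ref_edges, asis_edges)
print(nodes)
print(edges)
```

This version enumerates every label-compatible combination; the order-based pruning described in the following paragraphs would reduce the number of pairs further.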

To reduce the amount of computation, pairs of nodes are created with the execution order of the normalized process models taken into account. In this embodiment, the nodes of the extended process graph 710 that come from the process model are numbered i in order, and the nodes of the extended process graph 720 that come from the process model are numbered j in order.

Consider the case where node v_i0(reff) of the extended process graph 710, which corresponds to the reference business model, and node v_j0(AsIs) of the extended process graph 720, which corresponds to the existing business model, are paired to generate a new node (v_i0(reff), v_j0(AsIs)) of the pairwise connection graph. In this case, the next node v_{i0+1}(reff) of the extended process graph 710 is paired only with node v_{j0+1}(AsIs) of the extended process graph 720 and the nodes that follow it. For the extended process graphs of FIG. 7, for example, this prevents a pair such as (RE1, AT3), in which the execution order of the original process models is reversed, from being created after the pair (RT2, AT2), and so reduces the amount of computation.

(Step 506)
The computer 300 then creates a similarity propagation graph based on the pairwise connection graph (FIG. 8) created in step 505. FIG. 9 shows similarity propagation graphs 910 (FIG. 9A), 920 (FIG. 9B), 930 (FIG. 9C), and 940 (FIG. 9D), which correspond to the pairwise connection graphs 810, 820, 830, and 840, respectively. The similarity propagation graph 940 in particular links the process model and the data model. Its presence ensures that similarity propagates in both directions between node pairs of the process models and node pairs of the data models.

The similarity propagation graph G_sp expresses, for each node (v_reff, v_AsIs) ∈ V_pc of the pairwise connection graph G_pc, the similarity between the paired elements v_reff and v_AsIs and how that similarity propagates to adjacent nodes. It is defined by equation (3) below. As equation (3) shows, the similarity propagation graph G_sp is defined by a node set V_sp, an edge set E_sp, a similarity function σ, a set of similarities Σ, a propagation function ω, and a set of propagation coefficients Ω. Whereas the edges of the pairwise connection graph G_pc are unidirectional, the edge set E_sp of the similarity propagation graph G_sp consists of bidirectional edges.

Each element of the set Σ of pair similarities takes a real value from 0 to 1. There are several possible ways to assign the initial similarities: for example, setting every initial value to 1, or, when both members of a pair are text, deriving the similarity between the texts from a thesaurus, an edit distance, or the like and using it as the initial value. Likewise, each element of the set Ω of propagation coefficients takes a real value from 0 to 1. The initial values are assigned so that the propagation coefficients of all edges that start at a given node sum to 1. There are several possible ways to assign the initial value of an individual edge: for example, giving equal values to all edges that start at the node in question, or giving different values depending on whether the edge comes from the process model or from the data model.
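The "equal values" option for the propagation coefficients can be sketched as below: every edge leaving a node receives the same weight, so the weights per start node sum to 1. The function name `equal_propagation_coefficients` and the pair names p1 to p3 are hypothetical, and edges are assumed to be given as unique directed (source, target) tuples.

```python
from collections import defaultdict


def equal_propagation_coefficients(edges):
    """Give each outgoing edge of a node the same weight so the weights sum to 1 per node."""
    outgoing = defaultdict(list)
    for source, target in edges:
        outgoing[source].append(target)
    return {
        (source, target): 1.0 / len(targets)
        for source, targets in outgoing.items()
        for target in targets
    }


# Hypothetical directed edges between three node pairs of the propagation graph.
sp_edges = [("p1", "p2"), ("p1", "p3"), ("p2", "p1")]
omega = equal_propagation_coefficients(sp_edges)
print(omega)
```

The alternative mentioned in the text, weighting process-model edges and data-model edges differently, would replace the uniform `1.0 / len(targets)` with per-label weights that are then rescaled to sum to 1.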

(Step 507)
Next, the computer 300 calculates the similarity of each pair in the created similarity propagation graph (for example, 910 to 940) based on formula (4), and repeats the calculation until the result converges. That is, the computer 300 repeats the calculation until the difference between the value of the similarity function σ^(k+1) and the value of σ^k for a given pair is equal to or less than a given threshold. Here, k is the number of iterations. The second term of formula (4) is the similarity component contributed by nodes located on the downstream side of the process relative to the node being calculated, and the third term is the similarity component contributed by nodes located on the upstream side of the process.
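The repeat-until-converged structure of this step can be illustrated with a short sketch. Formula (4) is not reproduced in this text, so the update rule below is an assumption: each pair keeps its current value, adds the similarity flowing in over every incoming edge weighted by that edge's propagation coefficient, and the result is rescaled so the largest value is 1. Because the edges are bidirectional, the incoming edges cover both the downstream and the upstream neighbors. The names `propagate`, `sigma0`, `omega`, `tol`, and `max_iter` are hypothetical.

```python
def propagate(sigma0, omega, tol=1e-6, max_iter=1000):
    """Iterate pair similarities until they stop changing.

    sigma0: {pair: initial similarity in [0, 1]}
    omega:  {(src_pair, dst_pair): propagation coefficient}; every pair
            named in omega must also be a key of sigma0.
    Returns (similarities, iterations_used).
    """
    sigma = dict(sigma0)
    for k in range(1, max_iter + 1):
        # Assumed update rule: current value plus weighted inflow.
        raw = dict(sigma)
        for (src, dst), w in omega.items():
            raw[dst] += w * sigma[src]
        # Rescale so values stay within [0, 1].
        top = max(raw.values()) or 1.0
        new = {pair: value / top for pair, value in raw.items()}
        # Convergence test: largest change of any pair between rounds.
        delta = max(abs(new[pair] - sigma[pair]) for pair in sigma)
        sigma = new
        if delta <= tol:
            return sigma, k
    return sigma, max_iter


# Hypothetical three-pair chain; outgoing coefficients of each pair sum to 1.
a = ("Order", "Purchase order")
b = ("Approve", "Authorize")
c = ("Ship", "Deliver")
sigma0 = {a: 0.9, b: 0.2, c: 0.4}
omega = {(a, b): 1.0, (b, a): 0.5, (b, c): 0.5, (c, b): 1.0}
similarities, rounds = propagate(sigma0, omega)
```

The `max_iter` cap is a safeguard added for the sketch; the text itself only specifies stopping when the change falls to or below the threshold.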

(Step 508)
Thereafter, the computer 300 compares the similarity of each pair calculated in step 507 with a threshold and executes a filtering process that excludes pairs whose similarity is smaller than the threshold. The threshold is adjusted as needed in consideration of the user's work effort, the number of pairs remaining after filtering, and the like. The input device 320 is used to adjust the threshold.
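The filtering in this step amounts to a threshold comparison, sketched below. The function name and the dictionary input are assumptions, and sorting the surviving pairs from most to least similar is a convenience added for presentation rather than something the text requires.

```python
def filter_pairs(similarities, threshold):
    """Drop pairs whose similarity is smaller than the threshold.

    similarities: {pair: similarity}. Returns a list of (pair, similarity)
    tuples, ordered from most to least similar.
    """
    kept = [(pair, s) for pair, s in similarities.items() if s >= threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)
```

Raising the threshold shortens the list the user must review, and lowering it keeps more candidate pairs, which is the trade-off the threshold adjustment addresses.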

(1-3) Summary
As described above, the notational variations that appear in each process model to be matched are normalized into a notation that eliminates return edges and ambiguity in ordering, which makes the order relationships in the process model clear.

In addition, an extended process graph is created that expresses the process model for which normalization has been completed and the metamodel of the data model on a single graph, and each edge connecting nodes is given a unique label based on its origin. Similarly, in the initial matching, each edge is given a different label depending on whether it lies before or after, in process order, the node element for which a match was confirmed in the comparison between process models. This makes it possible, taking the processing order into account, to clearly delimit the matching range when creating the pair-wise connection graph from the extended process graphs. As a result, the pair search range of the computer 300 is kept from expanding unnecessarily, and the amount of computation is reduced accordingly.

In addition, in this embodiment, when the similarity propagation graph is created, the similarity between texts is obtained using a thesaurus or an edit distance for nodes that are text pairs and is given as the initial value of the similarity between the pairs, which increases the reliability of the calculated similarities. Also, when the similarity propagation graph is created, changing the initial value of the propagation coefficient depending on whether an edge is derived from a highly reliable model increases the reliability of the calculated similarities. In particular, when the process model and the data model constituting the business models to be matched differ in reliability, strengthening the influence of the propagation coefficients from the more reliable model increases the reliability of the similarity finally calculated for each pair.

(2) Other Embodiments
In the embodiment described above, data matching between business models was described using purchasing operations as a specific example. Needless to say, however, the business models to be matched are not limited to this.

In the description of the embodiment, for convenience, one of the business models (data model, process model) to be matched was called the reference model and the other business model (data model, process model) was called the existing model. However, one of the models to be matched need not be a reference business model formulated by an industry organization.

In the embodiment described above, the process model normalization process (step 503) is executed after the initial matching (step 502). However, the execution order may be reversed.

300 ... Computer
301 ... CPU
302 ... ROM
303 ... RAM
306 ... HD
320 ... Input device
330 ... Display device

Claims

1. A data matching method between business models, comprising:
a process in which a data processing unit reads, from a storage device, data of the data model and the process model corresponding to a first business model and data of the data model and the process model corresponding to a second business model;
a process in which the data processing unit compares, on the data, the character string of each element constituting the process model corresponding to the first business model with the character string of each element constituting the process model corresponding to the second business model, and detects elements for which a matching relationship is found;
a process in which the data processing unit normalizes the model notation of the process models corresponding to the first and second business models into a notation using a preset model notation;
a process in which the data processing unit generates, on the data, a first extended process graph in which the data model and the process model corresponding to the first business model are integrated into one graph based on the relationships between their elements;
a process in which the data processing unit generates, on the data, a second extended process graph in which the data model and the process model corresponding to the second business model are integrated into one graph based on the relationships between their elements;
a process in which the data processing unit forms pairs of the elements of the first extended process graph and the elements of the second extended process graph, and generates, on the data, a pair-wise connection graph describing the relationships between the pairs with labels;
a process in which the data processing unit generates, on the data, a similarity propagation graph in which the similarity and propagation coefficient between pairs are set;
a process in which the data processing unit calculates the similarity between the pairs by iterative operation; and
a process in which the data processing unit filters the calculation results and presents pairs with high similarity on a display device.

2. The matching method according to claim 1, wherein, in the process of normalizing the process model, the data processing unit rewrites the notation into a loop notation when a return edge exists in the process model.

3. The matching method according to claim 1 or 2, wherein, in the process of normalizing the process model, the data processing unit rewrites the notation into a parallel execution notation when consecutively executed processes are described that have no data relationship with each other.

4. The matching method according to a preceding claim, wherein, in the process of generating the extended process graph, the data processing unit attaches different labels to edges derived from the data model and to edges derived from the process model.

5. The matching method according to claim 1, wherein, when an element having a matching character string is detected between the process model corresponding to the first business model and the process model corresponding to the second business model, the data processing unit executing the process of generating the extended process graph attaches different labels to the upstream side and the downstream side of the element for which the match is detected.

6. The matching method according to claim 1, wherein, in the process of generating the pair-wise connection graph, the data processing unit creates pairs in consideration of the order of the normalized process model.

7. The matching method according to any one of claims 1 to 6, wherein, in the process of generating the similarity propagation graph, the data processing unit obtains the similarity between texts using a thesaurus or an edit distance for a node that is a text pair, and sets it as the initial value of the similarity between the pair.

8. The matching method according to any one of claims 1 to 7, wherein, in the process of generating the similarity propagation graph, the data processing unit changes the initial value of the propagation coefficient of an edge according to whether the edge is derived from the process model or from the data model.

9. A computer program that causes a computer to execute:
a process of reading, from a storage device, data of the data model and the process model corresponding to a first business model and data of the data model and the process model corresponding to a second business model;
a process of comparing, on the data, the character string of each element constituting the process model corresponding to the first business model with the character string of each element constituting the process model corresponding to the second business model, and detecting elements for which a matching relationship is found;
a process of normalizing the model notation of the process models corresponding to the first and second business models into a notation using a preset model notation;
a process of generating, on the data, a first extended process graph in which the data model and the process model corresponding to the first business model are integrated into one graph based on the relationships between their elements;
a process of generating, on the data, a second extended process graph in which the data model and the process model corresponding to the second business model are integrated into one graph based on the relationships between their elements;
a process of forming pairs of the elements of the first extended process graph and the elements of the second extended process graph, and generating, on the data, a pair-wise connection graph describing the relationships between the pairs with labels;
a process of generating, on the data, a similarity propagation graph in which the similarity and propagation coefficient between pairs are set;
a process of calculating the similarity between the pairs by iterative operation; and
a process of filtering the calculation results and presenting pairs with high similarity on a display device.