JP5090481B2

JP5090481B2 - Data modeling method, apparatus and program

Info

Publication number: JP5090481B2
Application number: JP2010017302A
Authority: JP
Inventors: 俊文榎本; 伸幸小林; 源吾鈴木; 雅司山室; 展郎谷口
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2010-01-28
Filing date: 2010-01-28
Publication date: 2012-12-05
Anticipated expiration: 2030-01-28
Also published as: JP2011154653A

Description

The present invention relates to a data modeling method, apparatus, and program, and in particular to a data modeling method, apparatus, and program for designing a concrete XML data structure when designing a system that handles structured data such as a large volume of XML data.

When designing an information system that must manage more than a certain amount of data, a database management system (DBMS) is used. For this purpose, the data structures managed by the system are normally designed in a task called data modeling.

In data modeling, the following three types of data model are designed in sequence for the DBMS used by the system.

(1) Conceptual data model:
An abstract data model that does not depend on the DBMS and describes the semantic aspects of information.

(2) Logical data model:
A data model that depends on the model targeted by the DBMS. In the case of a relational database (RDB), for example, it is a relational model composed of tables, columns, and so on, obtained by transforming the conceptual data model.

(3) Physical data model:
A data model that determines the physical internal structure of the database. It is obtained by transforming the logical data model with performance in mind.

Designing these data models is called conceptual design, logical design, and physical design, respectively.

In general, entity-relationship diagrams (ER diagrams) (see, for example, Non-Patent Document 1) or class diagrams of the Unified Modeling Language (UML) are used as data modeling techniques. Conventionally, a relational database (RDB) has been used as the DBMS.

Support tools are available for these techniques. Some can draw conceptual, logical, and physical data models as ER diagrams and automatically output an RDB schema (DDL for table and column definitions) from them.

Data modeling for RDB, the conventional method, is described below using ER diagrams.

<Conceptual design>
First, in conceptual design, entities (managed objects such as events, people, and things), relationships (associations between entities), and attributes (data items of entities) are identified. As an example, FIG. 18 shows the conceptual data model of a system for managing school students as an ER diagram. Such an ER diagram can be drawn with a support tool.

The notation used is IDEF1X. In FIG. 18, units of information such as "class", "student", "club activity", and "exam results" are entities. The lines connecting entities represent relationships, and a black dot marks the "N" side of a 1:N (one-to-many) relationship between entities. In general, the "1" side is called the "parent (entity)" and the "N" side the "child (entity)". Accordingly, the relationship between "class" and "student" is 1:N, and the relationship between "student" and "exam results" is also 1:N. The relationship between "club activity" and "student", which has black dots on both ends, is M:N (many-to-many). The items written inside an entity are its attributes; the attribute (or group of attributes) above the dividing line is called the "primary key", an item that uniquely identifies the information.

Because a conceptual data model is required to be in a normal form that avoids data redundancy and inconsistency, actual design work completes the model by repeatedly creating and revising it through steps such as identifying entities, identifying relationships, and normalizing (bringing the model into normal form).

FIG. 19 shows the same data model as a UML class diagram. Because a class diagram can describe almost the same information as an ER diagram, it is also often used for data modeling.

<Logical design>
Next, in logical design, the conceptual data model is transformed so as to satisfy the constraints of the relational model (table model). For example, the relational model cannot express an M:N relationship directly; a new intermediate entity must be created, with a 1:N relationship from each of the original entities to it.

In an RDB, an entity corresponds to a table and an attribute to a column. A relationship is expressed by adding an attribute called a foreign key. FIG. 20 shows an example of an ER diagram obtained by transforming FIG. 18 for RDB. An entity "club activity-student relation" has been added between club activity and student, which are in an M:N relationship. An attribute marked "(FK)" is a foreign key attribute. For example, the student's class code (FK) expresses the relationship by holding the same value as the class code that is the primary key of class.
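The transformation described above can be illustrated with a minimal sketch using Python's standard sqlite3 module. The table and column names (school_class, club_student, and so on) and the sample rows are assumptions made for illustration; they are not taken from FIG. 20.

```python
import sqlite3

# Hypothetical tables modelled on the school example in the text.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE school_class (class_code TEXT PRIMARY KEY, class_name TEXT);
CREATE TABLE student (
    student_id TEXT PRIMARY KEY,
    name TEXT,
    class_code TEXT REFERENCES school_class(class_code)  -- foreign key (FK)
);
CREATE TABLE club (club_code TEXT PRIMARY KEY, club_name TEXT);
-- Intermediate entity that resolves the M:N relationship into two 1:N ones.
CREATE TABLE club_student (
    club_code TEXT REFERENCES club(club_code),
    student_id TEXT REFERENCES student(student_id),
    PRIMARY KEY (club_code, student_id)
);
""")
conn.execute("INSERT INTO school_class VALUES ('C1', 'Class 1-A')")
conn.execute("INSERT INTO student VALUES ('S1', 'Alice', 'C1')")
conn.executemany("INSERT INTO club VALUES (?, ?)",
                 [("K1", "Tennis"), ("K2", "Chess")])
conn.executemany("INSERT INTO club_student VALUES (?, ?)",
                 [("K1", "S1"), ("K2", "S1")])

# Listing one student's clubs requires a join through the intermediate table.
clubs_of_s1 = [row[0] for row in conn.execute(
    "SELECT c.club_name FROM club c "
    "JOIN club_student cs ON cs.club_code = c.club_code "
    "WHERE cs.student_id = ? ORDER BY c.club_name", ("S1",))]
```

The join in the last query is the kind of operation whose cost motivates the physical design discussion that follows.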

Some support tools create such a logical data model (semi-)automatically from the conceptual data model of FIG. 18. (The process is called semi-automatic because newly added entities receive provisional names and so still need editing.)

<Physical design>
Physical design is performed mainly to improve performance. In an RDB, the operation that is generally expensive is the join.

For example, if "student" and "exam results" in FIG. 20 are frequently retrieved together, a join between the two tables "student" and "exam results" is required, and retrieval may take a long time. In physical design, therefore, the normal form is sometimes deliberately broken and "exam results" is included in the "student" entity (= table), that is, the tables are merged, so as to choose a data model with better performance. FIG. 21 shows an example. In this case, a further restriction is imposed: at most three exam results can be held.

After the ER diagram of the logical data model has been edited into that of the physical data model with a support tool, the RDB schema can be output automatically.

<XML and XMLDB>
XML (Extensible Markup Language) is a markup language; data described in XML is structured, and the structure itself carries meaning. FIG. 22 shows XML data and its structure. As shown there, XML data follows a tree-structure model, and each vertex of the tree is called a node. In particular, the root node is called the root element, a value (the described content) is called a text node, an item written inside a tag is called an attribute (hereinafter "XML attribute", to distinguish it from an attribute of the data model), and a node that is neither a text node nor an XML attribute is called an element. The data as a whole is called an XML document.

A database that has functions for managing such XML data is an XMLDB.

Compared with the relational model of an RDB, its strength is high expressive power; conversely, and for that very reason, its weakness is that it is difficult to handle.

Non-Patent Document 1: Chen, Peter P. (1976). "The Entity-Relationship Model - Toward a Unified View of Data", ACM Transactions on Database Systems, Vol. 1, No. 1, 1976, pp. 9-36.

In conventional RDB data modeling, the normal form is sometimes broken during physical design to improve performance. Breaking the normal form degrades maintainability, for example by making data consistency harder to ensure. In addition, although an XMLDB is more expressive than an RDB, it is difficult to handle.

The present invention has been made in view of the above, and its object is to provide a data modeling method, apparatus, and program that can both maintain the normal form and improve performance in physical design.

FIG. 1 is a diagram showing the principle configuration of the present invention.

The present invention (claim 1) is a data modeling apparatus that is connected to a user terminal 300 and designs the XML data structure of a database (XMLDB) storing XML structured documents, the apparatus comprising:
logical model design means 110 for defining a logical XML corresponding to each entity of an entity-relationship diagram (ER diagram) representing a logical data model, and storing it, as logical XML definition information together with the information of the ER diagram, in logical XML definition information storage means;
access pattern input means 121 for acquiring an input access pattern of an application and storing it in access pattern storage means 140;
XML collection design means 122 for storing entities in physical data model storage means 150, based on the logical XML definition information in the logical XML definition information storage means 130 and the access pattern in the access pattern storage means 140, in a basic format that expresses a relationship by a foreign key when each entity is accessed by individual update processing, and in an integrated format that expresses a plurality of entities as a single integrated tree structure when the plurality of entities are accessed together;
XML document format design means 123 for acquiring an entity from the physical data model storage means 150, extracting groups of attributes that are accessed simultaneously based on the access pattern to the attributes of the entity, hierarchizing them, and updating the contents of the physical data model storage means as a physical data model; and
XML schema generation means 124 for acquiring the physical data model from the physical data model storage means 150, creating an XML schema corresponding to the physical data model, and storing it in XML schema storage means 160.

The XML collection design means 122 of the present invention (claim 2) includes means for selecting either the basic format or the integrated format based on any of the following, obtained from the access pattern in the access pattern storage means 140: the frequency of access to each entity by each application, the frequency of reference or update, or the number of target data items matching a condition.
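As a rough illustration of such a selection, the following sketch compares how often two related entities are accessed together with how often the child entity is updated on its own. The record layout (EntityAccess), the function name choose_format, and the comparison rule are all assumptions made for illustration; the text here does not fix a concrete decision rule.

```python
from dataclasses import dataclass

@dataclass
class EntityAccess:
    """One hypothetical access-pattern record for an application."""
    application: str
    entities: frozenset   # entities touched in one operation
    kind: str             # "read" or "update"
    frequency: int        # accesses per unit time (unit is assumed)

def choose_format(parent, child, patterns):
    """Return "integrated" or "basic" for a parent-child relationship."""
    together = sum(p.frequency for p in patterns
                   if {parent, child} <= p.entities)
    child_only_updates = sum(p.frequency for p in patterns
                             if p.kind == "update" and p.entities == {child})
    # Assumed rule: integrate only when joint access outweighs
    # individual updates of the child entity.
    return "integrated" if together > child_only_updates else "basic"

patterns = [
    EntityAccess("report", frozenset({"student", "exam_result"}), "read", 100),
    EntityAccess("grading", frozenset({"exam_result"}), "update", 10),
]
selected = choose_format("student", "exam_result", patterns)
```

With these sample figures, joint reads dominate, so the sketch selects the integrated format; reversing the two frequencies would select the basic format.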

The XML document format design means 123 of the present invention (claim 3) groups the extracted attribute groups according to the access pattern to the attributes.

The XML document format design means 123 of the present invention (claim 4) includes means for extracting groups of attributes that are accessed simultaneously from the access pattern to the attributes and, when an attribute group extracted from one application conflicts or overlaps with that of another application and one completely includes the other, performing hierarchical grouping and setting a group name.
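The inclusion check behind hierarchical grouping can be sketched as follows. The function name nest_groups, the group names, and the attribute names are hypothetical, and the sketch covers only the case of complete inclusion; it leaves partially overlapping groups flat.

```python
def nest_groups(groups):
    """groups maps a group name to the set of attributes accessed together.

    Returns a dict mapping each group name to the name of its smallest
    strictly enclosing group, or None if no other group contains it.
    """
    parent = {}
    for name, attrs in groups.items():
        # Candidate parents are groups that strictly contain this one.
        enclosing = [(len(other), other_name)
                     for other_name, other in groups.items()
                     if attrs < other]
        parent[name] = min(enclosing)[1] if enclosing else None
    return parent

# Hypothetical attribute groups extracted from three applications.
groups = {
    "profile": {"name", "address", "phone"},
    "contact": {"address", "phone"},
    "score": {"math", "english"},
}
hierarchy = nest_groups(groups)
```

In this example "contact" is completely included in "profile" and would become a nested intermediate node, while "score" stays a separate top-level group.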

FIG. 2 is a diagram for explaining the principle of the present invention.

The present invention (claim 5) is a data modeling method for designing the XML data structure of a database (XMLDB) that stores XML (Extensible Markup Language) structured documents, performed in an apparatus that is connected to a user terminal and has XML definition information storage means for storing logical XML definition information, access pattern storage means, physical data model storage means, XML schema storage means, logical model design means, access pattern input means, XML collection design means, XML document format design means, and XML schema generation means, the method comprising:
a logical model design step (step 1) in which the logical model design means defines a logical XML corresponding to each entity of an entity-relationship diagram (ER diagram) representing a logical data model, and stores it, as logical XML definition information together with the information of the ER diagram, in the logical XML definition information storage means;
an access pattern input step (step 2) in which the access pattern input means acquires an input access pattern of an application and stores it in the access pattern storage means;
an XML collection design step (step 3) in which, when a set of entities having a relationship is specified from the ER diagram, the XML collection design means stores the entities in the physical data model storage means, based on the logical XML definition information in the logical XML definition information storage means and the access pattern in the access pattern storage means, in a basic format that expresses a relationship by a foreign key when each entity is accessed by individual update processing, and in an integrated format that expresses a plurality of entities as a single integrated tree structure when the plurality of entities are accessed together;
an XML document format design step (step 4) in which the XML document format design means acquires an entity from the physical data model storage means, extracts groups of attributes that are accessed simultaneously based on the access pattern to the attributes of the entity, hierarchizes them, and updates the contents of the physical data model storage means as a physical data model; and
an XML schema generation step (step 5) in which the XML schema generation means acquires the physical data model from the physical data model storage means, creates an XML schema corresponding to the physical data model, and stores it in the XML schema storage means.

In the present invention (claim 6), the XML collection design step selects either the basic format or the integrated format based on any of the following, obtained from the access pattern in the access pattern storage means: the frequency of access to each entity by each application, the frequency of reference or update, or the number of target data items matching a condition.

In the present invention (claim 7), the XML document format design step groups the extracted attribute groups according to the access pattern to the attributes.

In the present invention (claim 8), the XML document format design step extracts groups of attributes that are accessed simultaneously from the access pattern to the attributes and, when an attribute group extracted from one application conflicts or overlaps with that of another application and one completely includes the other, performs hierarchical grouping and sets a group name.

The present invention (claim 9) is a data modeling program for causing a computer to function as each of the means constituting the data modeling apparatus according to any one of claims 1 to 4.

As described above, according to the present invention, in data modeling for XMLDB, conceptual design applies the conventional techniques for relational databases (RDB) as they are; logical design is simplified by exploiting the expressive power of the XML tree-structure model; and physical design chooses between two ways of expressing relationships between entities, namely primary key and foreign key attributes or the structure within a single XML document, by selecting the XML collection format (basic format or integrated format) in consideration of application behavior (access patterns). Performance can thereby be improved while the normal form is maintained. Furthermore, a highly convenient data structure can be designed.

FIG. 1 is a diagram showing the principle configuration of the present invention.
FIG. 2 is a diagram for explaining the principle of the present invention.
FIG. 3 is a diagram comparing the method of the present invention with the conventional method.
FIG. 4 is a configuration diagram of the modeling apparatus in one embodiment of the present invention.
FIG. 5 is an example of expressing an M:N relationship with a tree-structure model in one embodiment of the present invention.
FIG. 6 is an example of the two ways of expressing a relationship in one embodiment of the present invention.
FIG. 7 is a diagram showing the extension of the ER diagram notation (notation for the integrated format) in one embodiment of the present invention.
FIG. 8 is a flowchart of the operation of the XML collection design unit in one embodiment of the present invention.
FIG. 9 is a diagram showing the grouping notation and its correspondence to the XML representation in one embodiment of the present invention.
FIG. 10 is a screen image of the modeling apparatus in one example of the present invention.
FIG. 11 is an example of logical XML in one example of the present invention.
FIG. 12 is an example of access patterns to entities in one example of the present invention.
FIG. 13 is an example of access patterns to attributes in one example of the present invention.
FIG. 14 is an example of the ER diagram after XML collection design in one example of the present invention.
FIG. 15 is an example of the ER diagram of the physical data model after grouping in one example of the present invention.
FIG. 16 is an example of XML data (XML collections and their data structures) in one example of the present invention.
FIG. 17 is an example of the XML schema in one example of the present invention.
FIG. 18 is an example of the ER diagram of a conceptual data model.
FIG. 19 is an example of a UML class diagram.
FIG. 20 is an example of the ER diagram of a logical data model for RDB.
FIG. 21 is an example of the ER diagram of a physical data model for RDB.
FIG. 22 is an example of XML data and its structure.

Embodiments of the present invention are described below with reference to the drawings.

Because conceptual design is independent of the DBMS, the conventional method is used as is, and methods for logical design and physical design are proposed. In particular, for physical design, a method is proposed that both maintains the normal form and improves performance.

A method of automating the process with a support tool is also proposed.

First, FIG. 3 shows the overall procedure in comparison with the conventional procedure for RDB. In the figure, "Design of logical XML", "Design of XML collections", and "Design of XML document format", enclosed in bold lines, are the parts proposed by the present invention and are described below. The remaining parts use the same methods as before, and their description is omitted.

The following describes data modeling for XMLDB according to the present invention.

FIG. 4 shows the configuration of the modeling apparatus in one embodiment of the present invention.

The modeling apparatus 100 shown in the figure is connected to a user terminal 300 and comprises a concept/logical model management unit 110, a physical model management unit 120, a concept/logical model data (logical XML definition information) storage unit 130, an access pattern storage unit 140, a physical model data storage unit 150, and an XML schema storage unit 160.

The physical model management unit 120 has an access pattern input unit 121, an XML collection design unit 122, an XML document format design unit 123, and an XML schema generation unit 124.

The concept/logical model management unit 110 stores ER diagrams and UML class diagrams drawn by the user on the terminal 300 in the concept/logical model data storage unit 130 as concept/logical model data (logical XML definition information).

The access pattern input unit 121 of the physical model management unit 120 stores access patterns entered by the user on the terminal 300 in the access pattern storage unit 140. The input may also take the form of passing a CSV (Comma Separated Values) file.
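A CSV-based input of this kind might be read as in the following sketch, which uses Python's standard csv module. The column layout (application, entity, operation, frequency) and the function name load_access_patterns are assumptions; the text does not specify the file format.

```python
import csv
import io

# Hypothetical CSV layout; the source does not define the columns.
csv_text = """application,entity,operation,frequency
report,student,read,100
report,exam_result,read,100
grading,exam_result,update,10
"""

def load_access_patterns(text):
    """Parse CSV text into a list of dicts with an integer frequency."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        row["frequency"] = int(row["frequency"])
        rows.append(row)
    return rows

access_patterns = load_access_patterns(csv_text)
```

In an actual tool the parsed rows would be written to the access pattern storage unit 140; here they are simply kept in a list.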

When the user selects execution from the terminal 300 (for example, by pressing a button), the XML collection design unit 122 of the physical model management unit 120 designs the XML collections (types of XML document) by referring to the information in the concept/logical model data storage unit 130 and the access pattern storage unit 140. This may include interaction with the user as necessary. The XML collection design result is stored in the physical model data storage unit 150 as physical model data.

When the user directly specifies (draws) the grouping of attribute groups on the terminal 300 and selects execution (for example, by pressing a button), the XML document format design unit 123 of the physical model management unit 120 designs the XML document format, grouping the parts not directly specified by the user based on the data in the access pattern storage unit 140 and the physical model data storage unit 150. The grouping may include interaction with the terminal 300 as necessary. The design result is stored in (used to update) the physical model data storage unit 150.

When the user selects execution from the terminal 300 (for example, by pressing a button), the XML schema generation unit 124 of the physical model management unit 120 creates an XML schema based on the data in the physical model data storage unit 150 and stores it in the XML schema storage unit 160.

The following describes, along the flow of FIG. 3, logical design for XMLDB and then physical design for XMLDB.

<Logical design> (design of logical XML)
This logical design is performed by the concept/logical model management unit 110.

First, in logical design for RDB, the ER diagram had to be transformed (normalized to the relational model) because of the restrictions of the relational model. For XMLDB, no transformation is needed, thanks to the high expressive power of the tree-structure model. Specifically, even an M:N relationship can be expressed by holding the foreign key attribute as a repeated item.

For example, the M:N relationship between club activities and students in FIG. 18 can be expressed with the tree-structure data model shown in FIG. 5. An intermediate node "club activity list" is created in the student entity, and it holds multiple foreign keys "club activity code (FK)", corresponding to the primary key of club activity, as repeated elements.
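A minimal sketch of such a structure, built with Python's standard xml.etree.ElementTree module, is shown below. The element names (student, club_list, club_code) are assumed English stand-ins for the nodes of FIG. 5, and build_student is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

def build_student(student_id, name, club_codes):
    """Build a student document holding club foreign keys as repeated elements."""
    student = ET.Element("student")
    ET.SubElement(student, "student_id").text = student_id
    ET.SubElement(student, "name").text = name
    club_list = ET.SubElement(student, "club_list")  # intermediate node
    for code in club_codes:
        # One repeated element per foreign key to the club activity entity.
        ET.SubElement(club_list, "club_code").text = code
    return student

doc = build_student("S1", "Alice", ["K1", "K2"])
xml_string = ET.tostring(doc, encoding="unicode")
club_codes = [e.text for e in doc.findall("club_list/club_code")]
```

No intermediate entity is needed: the student document itself carries as many club activity codes as the student has club memberships.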

Next, the concept/logical model management unit 110 defines the "logical XML" corresponding to the ER diagram and stores it in the concept/logical model data storage unit 130. In an RDB, an entity corresponds to a logical table, an attribute to a logical column, and a relationship to a logical column serving as a foreign key attribute. For logical XML, the correspondence is as follows.

・Entity: corresponds to one logical XML.
・Attribute: corresponds to a node in the logical XML (whether it is expressed as an element or as an XML attribute is not specified).
・Relationship: no particular expression format is specified (the concrete data design of relationships is done in physical design, so nothing beyond retaining the information is done here).

Therefore, conventional support tools for data modeling techniques can be used as they are (it suffices that they can draw the conceptual data model).

＜物理設計＞
次に、物理モデル管理部１２０における、XMLＤＢ向けの物理設計について記すが、まずはその基礎となる考え方について説明する。 <Physical design>
Next, physical design for XMLDB in the physical model management unit 120 will be described. First, the basic concept will be described.

・リレーションシップの表現は、２種類（基本形式と統合形式）の表現方法から選択できる。・ Relationship can be selected from two types (basic format and integrated format).

−主キー、外部キー属性の追加（ＲＤＢと同様の方法）；
−XMLドキュメント内の木構造による表現（１エンティティ＝１種類のXMLドキュメントである必要はない。複数エンティティを１種類のXMLドキュメントで表現することもできる）。 -Add primary key and foreign key attributes (same method as RDB);
-Representation by tree structure in XML document (one entity does not have to be one kind of XML document. Multiple entities can be represented by one kind of XML document).

・エンティティ内の属性は、中間ノードを追加し、階層化を行うこともできる。 -Attributes in entities can be layered by adding intermediate nodes.

FIG. 6 shows an example of the two ways of representing a relationship. The left side is the conventional foreign key representation (basic format): the data is divided into two kinds of XML documents, one per entity, and the relationship is maintained through the “student ID”. The right side is an example of the XML tree structure representation (integrated format): the “student” entity and the “test result” entity are integrated into one kind of XML document and related as parent and child nodes of the tree. The “test result” entity is a repeated item because “student” and “test result” have a 1:N relationship.

As described above, a relationship can be represented in two ways. Hereinafter, the foreign key representation is called the “basic format” and the tree structure representation is called the “integrated format”. A kind of XML document is hereinafter called an “XML collection”. That is, two entities that have a relationship become two XML collections in the basic format (left side of FIG. 6) and correspond to one XML collection in the integrated format (right side of FIG. 6).
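The difference between the two formats can be sketched as follows. This is only an illustration: the element names (`student`, `test_result`, `student_id`, `subject`, `score`) are assumptions, since the actual tags of FIG. 6 are not reproduced in the text, and Python's standard `xml.etree.ElementTree` module stands in for whatever XMLDB is used.

```python
import xml.etree.ElementTree as ET


def basic_format(student_id, name, scores):
    """Basic format: two XML collections linked by a foreign key."""
    student = ET.Element("student")
    ET.SubElement(student, "student_id").text = student_id
    ET.SubElement(student, "name").text = name
    results = []
    for subject, score in scores:
        result = ET.Element("test_result")
        # The foreign key keeps the relationship to the student document.
        ET.SubElement(result, "student_id").text = student_id
        ET.SubElement(result, "subject").text = subject
        ET.SubElement(result, "score").text = str(score)
        results.append(result)
    return student, results


def integrated_format(student_id, name, scores):
    """Integrated format: one XML collection, children nested in the parent."""
    student = ET.Element("student")
    ET.SubElement(student, "student_id").text = student_id
    ET.SubElement(student, "name").text = name
    for subject, score in scores:
        # The 1:N relationship becomes a repeated child node; no foreign key.
        result = ET.SubElement(student, "test_result")
        ET.SubElement(result, "subject").text = subject
        ET.SubElement(result, "score").text = str(score)
    return student
```

In the basic format each test result carries its own copy of the key, whereas in the integrated format the parent-child position in the tree carries the relationship.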

In the integrated format, a 1:N relationship is basically represented by centering on the parent entity on the “1” side and expressing the child entities on the “N” side as repeated nodes. Furthermore, if a child entity itself has a 1:N relationship with another entity, the integrated format can be applied in a nested manner. The important point is that, in this case too, the data does not become redundant and can be represented without breaking the normal form.

However, the integrated format cannot be applied without limit: it is applicable only within the range that satisfies the following conditions, although that range is wider than what a conventional RDB allows.

[Conditions]
・ When a child entity has multiple parent entities, the relationship to only one parent is represented in the integrated format, and the remaining relationships are represented in the basic format.

・ M:N relationships are represented in the basic format (not in the integrated format).
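A support tool could check these two conditions mechanically. The following is a minimal sketch under assumed data structures: the dictionary keys `parent`, `child`, `cardinality`, and `format` are hypothetical and are not defined by the specification.

```python
from collections import Counter


def find_violations(relationships):
    """Return messages for relationships that break the two conditions.

    Each relationship is a dict with the hypothetical keys 'parent',
    'child', 'cardinality' ('1:N' or 'M:N') and 'format'
    ('basic' or 'integrated').
    """
    problems = []
    # Condition 2: an M:N relationship must use the basic format.
    for r in relationships:
        if r["cardinality"] == "M:N" and r["format"] == "integrated":
            problems.append(
                f"{r['parent']}-{r['child']}: M:N must use the basic format"
            )
    # Condition 1: a child may be integrated into at most one parent.
    integrated_parents = Counter(
        r["child"] for r in relationships if r["format"] == "integrated"
    )
    for child, count in integrated_parents.items():
        if count > 1:
            problems.append(
                f"{child}: integrated into {count} parents, only one is allowed"
            )
    return problems
```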

For the explanation that follows, the ER diagram notation is extended: as shown in FIG. 7, a group of entities (strictly speaking, a group of relationships) enclosed by a dotted line indicates that the integrated format is applied.

A support tool for the modeling technique needs to provide a new notation for the integrated format. Besides enclosing entities with a dotted line, various other notations are of course conceivable, such as changing the style of the relationship line.

<XML collection design>
Next, the processing of the XML collection design unit 122 is described.

As stated above, XML collections are determined by how relationships are represented. An XML collection corresponds to a table in an RDB and, as with RDB tables, is designed with performance in mind.

The basic format and the integrated format differ in the performance that can be expected of them; in general:

・ When the basic format has the performance advantage:
When update access is made to each entity individually. It is particularly advantageous for inserting or deleting a parent entity, inserting a child entity, and updating a foreign key.

・ When the integrated format has the performance advantage:
When both entities are accessed together. It is advantageous for joined retrieval (reference), batch insertion, and batch deletion.

That is, because the integrated format manages the two entities as one unit, it favors access to both together. On the other hand, under update access, a change to one entity may not stay local and may also trigger a change to the other, in which case the basic format is advantageous.

Accordingly, the XML collection design unit 122 designs the XML collections by selecting the advantageous format in light of the behavior of the business applications (APs), that is, which entities each AP accesses, how often, and in what way.

This organized description of AP behavior is called the “access pattern to entities” here. It is created through the access pattern input unit 121 in an “access pattern organization” task that precedes the XML collection design, and is stored in the access pattern storage unit 140.

Access pattern organization is also carried out in the conventional method, and information such as the following is organized:
・ identification of the APs;
・ for each AP, the access frequency to each entity, the access type (reference/update), and the number of hits (the number of target data items that match the condition);
・ the number of data items and their distribution for each entity.

FIG. 8 shows the XML collection design procedure, which takes as input the ER diagram of the logical data model and the access patterns to entities. This procedure can also be (semi-)automated by a support tool.

The operation of FIG. 8 is described below.

When the user presses the “XML collection design” button on the screen image, the system scans all relationships: the XML collection design unit 122 selects a pair of entities that have a relationship from the physical data model storage unit 150 (step 101). The XML collection design unit 122 refers to the concept/logical model data storage unit 130 to determine whether the pair selected in step 101 is in an M:N relationship; if it is (step 102, Yes), the “basic format” is selected (step 107). If it is not (step 102, No), the unit refers to the access pattern storage unit 140 to determine whether any AP accesses both entities together; if there is none (step 103, No), the “basic format” is selected (step 107). If there is such an AP (step 103, Yes), the unit determines whether any AP makes update access to the entities individually; if there is none (step 104, No), the “integrated format” is selected (step 106). If there is (step 104, Yes), both groups of APs are judged comprehensively: if integration is estimated to be advantageous (step 105, Yes), the “integrated format” is selected (step 106); otherwise (step 105, No), the “basic format” is selected (step 107).

For the “comprehensive judgment of the AP groups” in step 105, one conceivable automatic method is to fix the ratio between update and reference costs in advance, compute the sum of access frequency × cost for each AP group, and select the format that favors the group with the larger sum.
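One possible form of this frequency × cost comparison is sketched below. The numeric weights are assumptions chosen only for illustration: the specification gives neither the mapping of high/medium/low to numbers nor the update/reference cost ratio.

```python
# Hypothetical weights; the specification leaves the actual values open.
FREQUENCY = {"high": 3, "medium": 2, "low": 1}
COST = {"reference": 1.0, "update": 2.0}


def weighted_load(accesses):
    """Sum of access frequency x cost over (frequency, kind) pairs."""
    return sum(FREQUENCY[freq] * COST[kind] for freq, kind in accesses)


def integration_is_advantageous(joint_accesses, individual_updates):
    """Step 105 sketch: favour the AP group with the larger weighted load.

    joint_accesses: accesses by APs that touch both entities together.
    individual_updates: accesses by APs that update one entity on its own.
    """
    return weighted_load(joint_accesses) > weighted_load(individual_updates)
```

For example, a single high-frequency joint reference (3 × 1.0) outweighs a single low-frequency individual update (1 × 2.0), so integration would be chosen under these assumed weights.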

Once the above processing has been carried out for every pair of entities (step 108, Yes), if, within the range where the “integrated format” was selected, a child entity has relationships to multiple parent entities, the single most advantageous parent entity is selected and the other relationships are changed to the “basic format” (step 109).
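The flow of steps 101 to 109 can be sketched as follows. The `Relationship` class and its fields are hypothetical; in particular `integration_score` is an assumed stand-in for the comprehensive judgment of steps 105 and 109, which the specification leaves to a formula, a rule, or the designer.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    parent: str
    child: str
    many_to_many: bool = False
    joint_access: bool = False       # some AP accesses both entities together
    individual_update: bool = False  # some AP updates one entity on its own
    integration_score: float = 0.0   # assumed metric; > 0 favours integration


def choose_format(r):
    if r.many_to_many:               # step 102
        return "basic"               # step 107
    if not r.joint_access:           # step 103
        return "basic"               # step 107
    if not r.individual_update:      # step 104
        return "integrated"          # step 106
    # Step 105: comprehensive judgment, reduced here to the assumed score.
    return "integrated" if r.integration_score > 0 else "basic"


def design_collections(relationships):
    # Steps 101-108: decide a format for every pair of related entities.
    chosen = {r: choose_format(r) for r in relationships}
    # Step 109: keep only the most advantageous parent per child.
    best = {}
    for r, fmt in chosen.items():
        if fmt == "integrated":
            current = best.get(r.child)
            if current is None or r.integration_score > current.integration_score:
                best[r.child] = r
    for r, fmt in list(chosen.items()):
        if fmt == "integrated" and best[r.child] is not r:
            chosen[r] = "basic"
    return chosen
```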

In step 105, when the group of APs that access the two entities together conflicts with the group of APs that make update access to them individually, the effect of each choice on the performance of both groups must be examined, with reference to the access pattern storage unit 140, before a decision is made. Access frequency is an important indicator here, but the access pattern information as a whole, such as the number of data items and the number of hits (the number of target data items that match the condition), should also be considered. The access frequency is taken from the access pattern storage unit 140. It may be expressed as a level such as high, medium, or low, or as the access count itself.

In step 109, integrating a child into multiple parent entities would break the normal form, so a single parent entity has to be selected. As in step 105, this requires a comprehensive examination based on access frequency and other indicators.

Conceivable methods for (semi-)automating steps 105 and 109 in a support tool are:
・ defining specific formulas or rules that take the various indicators as input and making the judgment automatically;
・ presenting the indicators, calculation results, and so on to the user (designer) and asking the user to make the judgment.

With the above procedure, XML collections are designed in the “integrated format” where an AP accesses multiple entities together and in the “basic format” where update access is made to entities individually. This improves performance while keeping the data model in normal form, so that both performance and maintainability are achieved.

Although outside the scope of the present invention, if further performance improvement is to be given priority, the design may additionally be denormalized in the same way as for a conventional RDB.

<Design of XML document format>
Next, the XML document format design processing performed by the XML document format design unit 123 is described.

To exploit the expressive power of the XML tree structure model and obtain a more convenient data structure, the XML document format design unit 123 arranges the attributes within an entity into a hierarchy. This is also decided from the AP access patterns in the access pattern storage unit 140, but from more detailed information: the access patterns to the attributes within an entity. These summarize, for each entity, how often each AP accesses which attributes. This information, too, has conventionally been produced by the “access pattern organization” task: for each attribute, the APs that access it, the access type (reference/update), and so on are organized and stored in the access pattern storage unit 140.

The design procedure for the XML document format is as follows. The XML document format design unit 123 performs the following processing for each entity.

(1) The XML document format design unit 123 refers to the access pattern storage unit 140 and extracts, from the access patterns to attributes, the groups of attributes that are accessed at the same time.

(2) If an attribute group extracted in (1) for one AP does not conflict with those of other APs, the XML document format design unit 123 groups it as it is (sets a group name and determines the member attributes).

If the ranges of the extracted attribute groups conflict or overlap, the grouping is decided comprehensively from the following viewpoints, and group names are set:
・ if one group completely contains another, perform hierarchical grouping;
・ give priority to insertion, update, and deletion over reference;
・ consider which to prioritize based on processing cost, such as access frequency.
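Steps (1) and (2), together with the complete-inclusion case, can be sketched as follows. This is a simplified illustration under stated assumptions: the input is one set of attribute names per AP, larger groups are considered first (an arbitrary choice made here, not taken from the specification), and partially overlapping groups are only reported so that the designer can apply the update-priority and cost viewpoints by hand.

```python
def group_attributes(accessed_together):
    """Nest attribute groups that are accessed together.

    accessed_together: iterable of attribute-name sets, one per AP.
    Returns (nesting, conflicts): nesting maps each accepted group to its
    smallest enclosing group (or None); conflicts lists partially
    overlapping groups left for the designer to resolve.
    """
    candidates = sorted(
        {frozenset(attrs) for attrs in accessed_together if len(attrs) > 1},
        key=len,
        reverse=True,
    )
    accepted, conflicts = [], []
    for group in candidates:
        # Accept a group only if it is contained in, or disjoint from,
        # every group accepted so far.
        if all(group <= other or group.isdisjoint(other) for other in accepted):
            accepted.append(group)
        else:
            conflicts.append(group)
    nesting = {}
    for group in accepted:
        parents = [other for other in accepted if group < other]
        nesting[group] = min(parents, key=len) if parents else None
    return nesting, conflicts
```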

(3) If a structure that is clearly more convenient for the internal processing of an AP exists, further grouping is performed. Whether such a structure is clear can be judged as follows: for example, when an AP displays “address”, “name”, “age”, and “occupation” on the screen and treats them differently, showing “name” and “age” like a heading and the rest in small type, grouping each set separately can be judged to be more convenient.

Group names should preferably differ from all entity names and from the attribute names within the entity.

FIG. 9 shows the extended ER diagram notation for grouping (left side) and an example of the data structure in XML representation (right side). In the ER diagram notation, the XML document format design unit 123 extends the notation so that the group name is written and the attributes belonging to the group are shown by indentation.

As a (semi-)automated method using a support tool, a notation representing grouping, as in FIG. 9, is added; however, because the above design procedure is difficult to automate completely, a conceivable method is to present the various indicators on the designer's terminal 300 and ask the user to make the judgment.

In the XML representation, it is natural to express a group as an intermediate node (an intermediate element that is neither the root element nor a lowest-level element) whose element name is the group name.

In a conventional RDB, performance is improved by issuing queries that list only the items (attributes within an entity) required by the AP (the SQL “SELECT” clause), which keeps data transfer to the necessary minimum. The same method can be used to improve performance in an XMLDB; in addition, by exploiting the tree-structured data model and performing the grouping described above, a group name can be specified instead of listing individual attributes, which not only improves performance but also improves maintainability through simpler queries.
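The simplification can be illustrated with a small document. The element names are assumptions, and the path expressions of Python's `xml.etree.ElementTree` stand in here for the XPath or XQuery that an actual XMLDB would accept.

```python
import xml.etree.ElementTree as ET

# Hypothetical "student" document with a "name" group as intermediate node.
doc = ET.fromstring(
    "<student>"
    "<student_id>S001</student_id>"
    "<name><family_name>Yamada</family_name><given_name>Taro</given_name></name>"
    "<birthday>1995-04-01</birthday>"
    "</student>"
)

# Without relying on the group: every required attribute is listed.
enumerated = [doc.findtext(".//family_name"), doc.findtext(".//given_name")]

# With the group: one path on the group name returns all of its members.
grouped = [child.text for child in doc.find("name")]
```

If an attribute is later added to the group, the grouped query picks it up without being rewritten, which is the maintainability benefit described above.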

<Creation of XML schema>
Finally, the XML schema creation processing in the XML schema generation unit 124 is described.

The final XML representation corresponding to the ER diagram is specified. On the premise that each XML collection corresponds to one kind of XML document and that the logical XML rules apply, the items that are still missing are defined:
・ the representation of relationships, specifically the representation of key attributes in the basic format and the structural representation in the integrated format;
・ the foreign key representation of M:N relationships.

Based on this, an XML schema corresponding to the physical data model in the physical data model storage unit 150 is created and stored in the XML schema storage unit 160.

As with the creation of a conventional RDB schema, the XML schema can be generated automatically by a support tool from the definition of the XML representation.

The present invention itself does not prescribe a specific XML representation. A representation that preserves all the information and is easy to access and manipulate should be examined and decided on.

In the following, as an example, a design based on the ER diagram is shown for the conceptual data model of FIG. 18.

FIG. 10 is a screen image of the modeling apparatus according to an embodiment of the present invention.

<Logical design> (design of logical XML)
First, the logical design in the concept/logical model management unit 110 is described.

In this logical design, the ER diagram is not transformed.

The logical XML corresponding to the ER diagram representation is defined as follows.

・ Entity:
The root element is the entity element, and its element name (tag name) is the entity name.

・ Attribute:
An element directly under the entity element. The element name (tag name) is the attribute name, and the value is stored in a text node.
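The two rules above map directly onto a small generator. The following sketch uses Python's standard `xml.etree.ElementTree`; the function name and the sample entity are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def logical_xml(entity_name, attributes):
    """Build the logical XML of one entity instance.

    The root element is named after the entity; each attribute becomes a
    child element whose text node holds the value.
    """
    root = ET.Element(entity_name)
    for attribute_name, value in attributes.items():
        ET.SubElement(root, attribute_name).text = str(value)
    return root


example = ET.tostring(
    logical_xml("student", {"student_id": "S001", "family_name": "Yamada"}),
    encoding="unicode",
)
```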

FIG. 11 is an example of the logical XML of an embodiment of the present invention; it shows the logical XML that corresponds to the ER diagram of FIG. 18 under the above definition. Relationships are omitted in the figure.

In the screen image shown in FIG. 10, when the user selects the “concept/logical model” tab, an ER diagram can be drawn, and the above logical XML definition information is held in the concept/logical model data storage unit 130.

<Physical design>
First, the access pattern input unit 121 of the physical model management unit 120 acquires the access patterns to entities shown in FIG. 12, which result from the access pattern organization task.

In this example, five APs are listed on the vertical axis and the access to each entity is summarized in table form; the table shows the access frequency of each AP (high/medium/low) and the access type as C (create), B (reference), U (update), or D (delete).

Suppose also that the pattern shown in FIG. 13 is obtained as the attribute access pattern of the student entity. In that figure, the four APs that access the student entity are listed on the vertical axis in the same table form.

In the screen image shown in FIG. 10, the “physical model” tab is selected, the “input access pattern” button is pressed, and the access patterns are entered. The access pattern input unit 121 then stores the access patterns to entities shown in FIG. 12 and the attribute access pattern shown in FIG. 13 in the access pattern storage unit 140.

<XML collection design>
The following shows an example in which the XML collection design unit 122 designs the collections for the ER diagram of FIG. 18 by the procedure shown in FIG. 8, using the access patterns of FIG. 12 stored in the access pattern storage unit 140.

Decisions in case of conflict are made by referring to the AP access frequencies stored in the access pattern storage unit 140.

[1] “Basic format” selected:
When the pair “class (parent) and student (child)” is selected in step 101 of FIG. 8, the relationship is not M:N (step 102, No); the class list acquisition AP refers to both entities together (step 103, Yes); and the student information change AP updates only the student (child) (step 104, Yes). Because the access frequencies satisfy “class list acquisition AP < student information change AP” (step 105, No), the “basic format” is selected (step 107).

[2] “Basic format” selected:
When the pair “club activity (parent) and student (child)” is selected in step 101 of FIG. 8, the relationship is M:N (step 102, Yes), so the “basic format” is selected (step 107).

[3] “Integrated format” selected:
When the pair “student (parent) and test result (child)” is selected in step 101 of FIG. 8, the relationship is not M:N (step 102, No); the AP that acquires students' family and given names together with test results refers to both entities together (step 103, Yes); and the student information change AP updates only the student (parent) while the test result registration AP creates only test results (child) (step 104, Yes). Because the access frequencies satisfy “family/given name and test result acquisition AP > student information change AP + test result registration AP” (step 105, Yes), the “integrated format” is selected (step 106).

It is assumed that step 108 finds all pairs processed and that no child has the “integrated format” selected for more than one parent.

Pressing the “XML collection design” button on the screen image of the modeling apparatus shown in FIG. 10 automatically reflects the above result in the ER diagram drawing. The ER diagram therefore becomes as shown in FIG. 14: “student” and “test result” are enclosed by the dotted line of the “integrated format”, and the “basic format” foreign keys for “class” and “club activity” are added to “student”.

<Design of XML document format>
Next, a specific example for the XML document format design unit 123 is shown.

The access pattern of FIG. 13 in the access pattern storage unit 140 shows that, in the “student” entity, the family name and the given name are referred to together by three APs. The XML document format design unit 123 therefore groups the family name and the given name under the group “name” and stores the result in the physical data model storage unit 150.

For example, when the user selects the “student” entity in the screen image shown in FIG. 10 and presses the “XML document format design” button, the XML document format design unit 123 can refer to the physical data model storage unit 150 and display only the per-attribute access patterns of the “student” entity on the terminal 300 to support the user's judgment.

FIG. 15 shows the ER diagram of the final data model after the XML document format design unit 123 has performed grouping on the example of FIG. 14.

<Generation of XML schema>
Finally, a specific example for the XML schema generation unit 124 is shown.

The correspondence between the ER diagram and the XML representation is specified as follows, in addition to the logical XML rules:
・ the name of the topmost parent entity in an XML collection is used as the root element name;
・ a foreign key attribute in the basic format is, like an ordinary attribute, an element whose element name (tag name) is the attribute name and whose value is stored in a text node;
・ for a child entity in the integrated format, an intermediate node named “entity name + "list"” is created directly under the parent entity element, and the child entity is expressed as a repeated element;
・ for a foreign key attribute of an M:N relationship, an intermediate node named “attribute name + "list"” is created, and the foreign key attribute is expressed as a repeated element.
Normally, physical names distinct from the logical names are defined for the various names and the physical names are used in the final data; in this embodiment, however, the physical names are set to be the same as the logical names.
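Applied to the “student” collection of this example, the rules above might produce a document built as in the following sketch. All element names (`student`, `class_id`, `club_id`, `test_result`, and the `_list` suffix) are assumptions, as is the placement of the repeated elements inside the list node; the actual structure is the one shown in FIG. 16, which is not reproduced here.

```python
import xml.etree.ElementTree as ET


def student_document(student, test_results, club_ids):
    """Sketch of one document of the integrated 'student' XML collection."""
    # Topmost parent entity name becomes the root element name.
    root = ET.Element("student")
    # Ordinary attributes and basic-format foreign keys (e.g. class_id)
    # are plain child elements with the value in a text node.
    for attribute_name, value in student.items():
        ET.SubElement(root, attribute_name).text = str(value)
    # M:N foreign key: "attribute name + list" node with repeated keys.
    clubs = ET.SubElement(root, "club_id_list")
    for club_id in club_ids:
        ET.SubElement(clubs, "club_id").text = club_id
    # Integrated child entity: "entity name + list" node with repeated children.
    results = ET.SubElement(root, "test_result_list")
    for test_result in test_results:
        node = ET.SubElement(results, "test_result")
        for attribute_name, value in test_result.items():
            ET.SubElement(node, attribute_name).text = str(value)
    return root
```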

FIG. 16 shows an image of the XML data (the data structures of the three XML collections) to which the above rules are applied.

The above rules are assumed to be stored in the physical data model storage unit 150. In the support tool (FIG. 10), when the “create schema” button is pressed from the terminal 300, an XML schema such as that shown in FIG. 17 (a description example in XML Schema) can be generated automatically from the rules in the physical data model storage unit 150, including the logical XML rules.

The operations of the concept/logical model management unit 110 and the physical model management unit 120 described above can be implemented as a program, which can be installed and executed on a computer used as the data modeling apparatus or distributed via a network.

The program can also be stored on a hard disk or on a portable storage medium such as a flexible disk or CD-ROM, and installed on a computer or distributed.

The present invention is not limited to the above embodiments and examples, and various modifications and applications are possible within the scope of the claims.

100 Modeling apparatus
110 Concept/logical model design means, concept/logical model management unit
120 Physical model management unit
121 Access pattern input means, access pattern input unit
122 XML collection design means, XML collection design unit
123 XML document format design means, XML document format design unit
124 XML schema generation means, XML schema generation unit
130 Logical XML definition information storage means, concept/logical model data storage unit
140 Access pattern storage means, access pattern storage unit
150 Physical data model storage means, physical data model storage unit
160 XML schema storage means, XML schema storage unit
300 Terminal

Claims

A data modeling apparatus that is connected to a user terminal and designs the XML data structure of a database (XMLDB) storing XML (Extensible Markup Language) structured documents, the apparatus comprising:
logical model design means for defining a logical XML corresponding to each entity of an entity-relationship diagram (ER diagram) representing a logical data model, and storing the definitions, together with the information of the ER diagram, in logical XML definition information storage means as logical XML definition information;
access pattern input means for acquiring input application access patterns and storing them in access pattern storage means;
XML collection design means for, when a set of related entities is designated from the ER diagram, storing the entities in physical data model storage means as a physical data model, based on the logical XML definition information in the logical XML definition information storage means and the access patterns in the access pattern storage means, in a basic format that expresses the relationship with a foreign key when the entities are accessed individually by update processing performed for each entity, and in an integrated format that expresses the relationship as a tree structure integrating the plurality of entities when the plurality of entities are accessed together;
XML document format design means for acquiring an entity from the physical data model storage means, extracting groups of attributes that are accessed simultaneously based on the access patterns to the attributes of the entity, arranging them hierarchically, and updating the contents of the physical data model storage means; and
XML schema generation means for acquiring the physical data model from the physical data model storage means and creating an XML schema corresponding to the physical data model.
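
The difference between the basic format and the integrated format recited above can be illustrated with a minimal sketch. The `Order` and `OrderLine` entities, their attributes, and the helper names are hypothetical and chosen only for illustration; they do not appear in the specification.

```python
import xml.etree.ElementTree as ET

# Hypothetical entities for illustration; names are not from the patent.
order = {"orderId": "O-1", "customer": "Alice"}
lines = [
    {"lineId": "L-1", "orderId": "O-1", "item": "pen"},
    {"lineId": "L-2", "orderId": "O-1", "item": "ink"},
]

def to_element(tag, record):
    """Build an element whose child elements hold the record's attributes."""
    elem = ET.Element(tag)
    for name, value in record.items():
        ET.SubElement(elem, name).text = value
    return elem

def basic_format(order, lines):
    """One document per entity; the relationship is kept as a foreign key (orderId)."""
    return [to_element("Order", order)] + [to_element("OrderLine", l) for l in lines]

def integrated_format(order, lines):
    """One tree per parent entity; children are nested, so no foreign key is stored."""
    root = to_element("Order", order)
    for l in lines:
        child = {k: v for k, v in l.items() if k != "orderId"}
        root.append(to_element("OrderLine", child))
    return root

basic_docs = basic_format(order, lines)
integrated_doc = integrated_format(order, lines)
```

In the basic format each `OrderLine` document can be updated on its own, whereas the integrated format returns an order and its lines in a single document.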

The data modeling apparatus according to claim 1, wherein the XML collection design means includes means for selecting either the basic format or the integrated format based on, from the access patterns in the access pattern storage means, the frequency of access to each entity by each application, the frequency of reference or update, or the number of target data items that match the conditions.
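
One way such a selection could work is sketched below. The claim does not fix a decision rule, so the rule used here (compare the total frequency of accesses that touch all related entities together with that of accesses that touch only some of them) is an assumption, as are the `AccessPattern` record and the `choose_format` function.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AccessPattern:
    app: str              # application name
    entities: frozenset   # entities touched by one access
    kind: str             # "reference" or "update"
    frequency: int        # accesses per unit time (the unit is an assumption)

def choose_format(patterns, related):
    """Return "integrated" or "basic" for a set of related entities.

    Assumed rule: compare the total frequency of accesses that touch all
    related entities together with that of accesses touching only some.
    """
    related = frozenset(related)
    joint = sum(p.frequency for p in patterns if related <= p.entities)
    individual = sum(
        p.frequency for p in patterns
        if p.entities & related and not related <= p.entities
    )
    return "integrated" if joint > individual else "basic"

# Hypothetical access patterns for two applications.
patterns = [
    AccessPattern("billing", frozenset({"Order", "OrderLine"}), "reference", 100),
    AccessPattern("stock", frozenset({"OrderLine"}), "update", 20),
]
selected = choose_format(patterns, {"Order", "OrderLine"})
```

A real design would likely also weigh reference against update frequency and the number of matching data items, which this sketch ignores.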

The data modeling apparatus according to claim 1, wherein the XML document format design means includes means for grouping the extracted attribute groups according to the access patterns to the attributes.

The data modeling apparatus according to claim 3, wherein the XML document format design means includes means for extracting attribute groups that are accessed simultaneously from the access patterns to the attributes and, when an attribute group extracted for one application conflicts or overlaps with that of another application and the two are in a complete inclusion relationship, performing hierarchical grouping and setting a group name.
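
The hierarchical grouping under a complete inclusion relationship can be sketched as follows. This is an illustrative reading, not the claimed implementation: the function `build_hierarchy`, the group names, and the attribute names are hypothetical, and groups that only partially overlap are simply left unnested.

```python
def build_hierarchy(attribute_groups):
    """Nest each attribute group under the smallest group that strictly contains it.

    attribute_groups maps a group name to the set of attributes accessed together.
    Returns a dict mapping each group name to its parent group name, or None.
    """
    parents = {}
    for name, attrs in attribute_groups.items():
        containers = [
            (len(other_attrs), other)
            for other, other_attrs in attribute_groups.items()
            if attrs < other_attrs  # strict subset, i.e. complete inclusion
        ]
        parents[name] = min(containers)[1] if containers else None
    return parents

# Hypothetical attribute groups extracted from three applications.
groups = {
    "customer": {"name", "zip", "city", "street"},
    "address": {"zip", "city", "street"},
    "postal": {"zip"},
}
hierarchy = build_hierarchy(groups)
```

Here `postal` nests under `address`, which nests under `customer`, so the resulting XML document could carry an address element inside the customer element.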

A data modeling method for designing the XML data structure of a database (XMLDB) storing XML (Extensible Markup Language) structured documents, performed in an apparatus that is connected to a user terminal and has logical XML definition information storage means for storing logical XML definition information, access pattern storage means, physical data model storage means, XML schema storage means, logical model design means, access pattern input means, XML collection design means, XML document format design means, and XML schema generation means, the method comprising:
a logical model design step in which the logical model design means defines a logical XML corresponding to each entity of an entity-relationship diagram (ER diagram) representing a logical data model, and stores the definitions, together with the information of the ER diagram, in the logical XML definition information storage means as logical XML definition information;
an access pattern input step in which the access pattern input means acquires input application access patterns and stores them in the access pattern storage means;
an XML collection design step in which, when a set of related entities is designated from the ER diagram, the XML collection design means stores the entities in the physical data model storage means, based on the logical XML definition information in the logical XML definition information storage means and the access patterns in the access pattern storage means, in a basic format that expresses the relationship with a foreign key when the entities are accessed individually by update processing performed for each entity, and in an integrated format that expresses the relationship as a tree structure integrating the plurality of entities when the plurality of entities are accessed together;
an XML document format design step in which the XML document format design means acquires an entity from the physical data model storage means, extracts groups of attributes that are accessed simultaneously based on the access patterns to the attributes of the entity, arranges them hierarchically, and updates the contents of the physical data model storage means as a physical data model; and
an XML schema generation step in which the XML schema generation means acquires the physical data model from the physical data model storage means, creates an XML schema corresponding to the physical data model, and stores the XML schema in the XML schema storage means.

The data modeling method according to claim 5, wherein, in the XML collection design step, either the basic format or the integrated format is selected based on, from the access patterns in the access pattern storage means, the frequency of access to each entity by each application, the frequency of reference or update, or the number of target data items that match the conditions.

The data modeling method according to claim 5, wherein, in the XML document format design step, the extracted attribute groups are grouped according to the access patterns to the attributes.

The data modeling method according to claim 7, wherein, in the XML document format design step, attribute groups that are accessed simultaneously are extracted from the access patterns to the attributes and, when an attribute group extracted for one application conflicts or overlaps with that of another application and the two are in a complete inclusion relationship, hierarchical grouping is performed and a group name is set.

A data modeling program for causing a computer to function as each means constituting the data modeling apparatus according to any one of claims 1 to 4.