JP6244521B2

JP6244521B2 - Database processing program, database processing method, and database processing apparatus

Info

Publication number: JP6244521B2
Application number: JP2015212905A
Authority: JP
Inventors: 一樹有田; 成典高橋; リン．アール．ヤマシタ
Original assignee: DBE., INC.
Current assignee: DBE., INC.
Priority date: 2015-10-29
Filing date: 2015-10-29
Publication date: 2017-12-13
Anticipated expiration: 2035-10-29
Also published as: JP2017084162A; WO2017073714A1

Description

The present invention relates to a database processing program, a database processing method, and a database processing apparatus that visualize the schema in a database and simplify database design and construction.

Among database (hereinafter, DB) types, the relational DB has become mainstream because it is easy to handle. Moreover, in a DB that has grown bloated through years of added data, defining a new schema or changing the existing one is often very difficult. The designers of different DBs often give different names to attributes that have the same meaning, so integrating multiple DBs built by different designers requires an enormous amount of labor.

Naturally, a relational schema alone cannot adequately describe the relationships between attributes (data, values). A hierarchical DB can make the parent-child (branch-leaf) relationships between attributes explicit, but it becomes verbose when describing complex relationships, such as a dependency (branch-leaf) relationship that also exists between children. A network DB can make such complex relationships explicit, including dependencies between child data as described above, but it is difficult to modify when a schema is added or changed.

Various techniques have been proposed for DB visualization (e.g., Patent Document 1), but entity-relationship (ER) diagrams are still widely used. An ER diagram does not clearly convey the importance of individual attributes or the relationships between attributes, so it is difficult for anyone other than an engineer to grasp intuitively.

Patent Document 1: JP 2014-029597 A

Even for engineers with database expertise, optimizing the schema of a DB that contains a huge amount of data is very difficult. Users without such expertise have all the more avoided DB design and construction as being far too difficult.

The present invention has been made in view of these circumstances, and its object is to provide a database processing program, a database processing method, and a database processing apparatus that enable even a user without expertise in DBs or in the technology for handling them to grasp a DB schema intuitively.

A database processing program according to the present invention causes a computer that communicates with a database to execute processing on the database, and causes the computer to execute: an analysis step of analyzing the relations of the database; a step of associating an object with each of a plurality of attributes included in the analyzed relations; and a drawing step of arranging and drawing the associated objects along the outline of a predetermined shape.

The database processing program according to the present invention accepts selection and placement operations on the drawn objects, and further causes the computer to execute: a step of accepting a placement of the objects in accordance with the operation; and a step of changing the relations of the database in accordance with the accepted placement.

In the database processing program according to the present invention, when the analysis finds that the database is divided into a plurality of relations, the drawing step places as many predetermined shapes as there are relations adjacent to one another, arranges the objects corresponding to attributes shared between relations at the points of adjacency, and arranges the objects of the attributes belonging to each relation on the outline other than at the points of adjacency.

In the database processing program according to the present invention, the analysis step includes a step of identifying a hierarchical relationship among the plurality of relations, and the drawing step draws a higher-level relation at the center of the screen.

In the database processing program according to the present invention, the size at which each predetermined shape is drawn, on which the objects corresponding to the attributes belonging to the respective relation are arranged, indicates the degree of variation within the database of the attribute values of the attributes belonging to that relation.

In the database processing program according to the present invention, the analysis step includes: a step of decomposing the database into individual attributes; a step of calculating, for each attribute, the degree of variation of its attribute values within the database; a step of ranking the attributes based on the calculated degree; a step of extracting candidate keys based on the ranking; a step of identifying functional dependencies between the decomposed attributes with reference to the candidate keys; a step of creating a decision tree between the attributes based on the identified dependencies; a step of dividing the database into a plurality of relations based on the created decision tree; and a step of creating a hierarchical network based on the decision tree.

In the database processing program according to the present invention, the analysis step further includes a step of generating a complex network by linking attributes that are not in a direct parent-child relationship within the hierarchical network.

A database processing method according to the present invention is performed by a computer that executes processing on a database, in which the computer analyzes the relations of the database, associates an object with each of a plurality of attributes included in the analyzed relations, and creates screen information in which the associated objects are arranged and drawn along the outline of a predetermined shape.

In the database processing method according to the present invention, the computer analyzes the database by executing: a step of decomposing the database into individual attributes; a step of calculating, for each attribute, the degree of variation of its attribute values within the database; a step of ranking the attributes based on the calculated degree; a step of extracting candidate keys based on the ranking; a step of identifying functional dependencies between the decomposed attributes with reference to the candidate keys; a step of creating a decision tree between the attributes based on the identified dependencies; a step of dividing the database into a plurality of relations based on the created decision tree; and a step of creating a hierarchical network based on the decision tree.

A database processing apparatus according to the present invention includes means for communicating with a database and with an external device, and transmits the results of processing on the database to the external device. The apparatus includes: analysis means for analyzing the relations of the database; and means for associating an object with each of a plurality of attributes included in the analyzed relations and creating screen information in which the associated objects are arranged and drawn along the outline of a predetermined shape.

In the database processing apparatus according to the present invention, the analysis means includes: means for decomposing the database into individual attributes; means for calculating, for each attribute, the degree of variation of its attribute values within the database; means for ranking the attributes based on the calculated degree; means for extracting candidate keys based on the ranking; means for identifying functional dependencies between the decomposed attributes with reference to the candidate keys; means for creating a decision tree between the attributes based on the identified dependencies; means for dividing the database into a plurality of relations based on the created decision tree; and means for creating a hierarchical network based on the decision tree.

In the present invention, the schema is drawn so that the objects corresponding to the attributes in the DB are arranged, for each relation (table) to which the attributes belong, along the outline of a predetermined figure.
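The arrangement of attribute objects along an outline can be sketched as follows. This is only an illustration, assuming the predetermined figure is a circle and that objects are spaced evenly; the function name `layout_on_circle`, its parameters, and the sample attribute names are hypothetical and do not come from the patent.

```python
import math

def layout_on_circle(attributes, center=(0.0, 0.0), radius=1.0):
    """Place one object per attribute at evenly spaced points on a circle.

    Assumption: the "predetermined figure" is a circle. The patent text
    does not fix the shape or the spacing rule.
    """
    n = len(attributes)
    positions = {}
    for i, name in enumerate(attributes):
        angle = 2.0 * math.pi * i / n
        positions[name] = (
            center[0] + radius * math.cos(angle),
            center[1] + radius * math.sin(angle),
        )
    return positions

# Hypothetical attributes of a single relation (table).
positions = layout_on_circle(["order_id", "customer", "amount", "date"])
```

Each relation would get its own circle, so a drawing routine could call this once per table with a different center and radius.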

In the present invention, operations that change the on-screen placement of each drawn object can be accepted, and the relations of the actual database are changed in accordance with the placement information of the objects after the operation.

In the present invention, when the database is divided into a plurality of relations, the schema is drawn with the objects of each relation arranged along a predetermined figure, and with the objects corresponding to attributes shared by multiple relations placed at the points of contact between the figures.

In the present invention, when the database is divided into a plurality of relations, the hierarchical relationship among them is identified, and the objects corresponding to a higher-level relation are drawn at the center.

In the present invention, when the database is divided into a plurality of relations, the differences in size among the outlines on which the attribute objects of each relation are arranged reflect the differences in the variation of attribute values within each relation. The more candidate keys a relation contains, and the more likely it is to connect outward to further relations, the larger the shape with which it is drawn.

In the present invention, the relational schema for the data underlying the DB is automatically optimized and constructed from a hierarchical network, which is created from a decision tree based on the functional dependencies between attributes. This yields a schema that is fully normalized to third normal form.

In the present invention, a complex network is further generated from the hierarchical network, and the schema is constructed based on the complex network.

According to the present invention, the relation schema for the relation data underlying the DB is analyzed, and the schema is visualized so that the object corresponding to each attribute is arranged, for each table to which the attribute belongs, along the outline of a predetermined figure. As a result, even a user without knowledge of how to access a DB can intuitively grasp the analyzed DB schema.

Furthermore, because operations that change the placement of objects are accepted, the user can change the schema in the database through the intuitive operation of moving objects on the screen. As a result, even a user without expertise in DBs or in the technology for handling them can easily design and construct a DB.

Further, according to the present invention, the relation schema is analyzed and appropriately generated in an automatic, mechanical manner, and is constructed in a form optimized by normalization followed by denormalization.

Block diagram showing the configuration of the information processing system according to Embodiment 1.
Block diagram showing the internal configuration of the DB processing device, the terminal device, and the DBIF providing device according to Embodiment 1.
Explanatory diagram showing an example of the interface displayed on the browser screen of the terminal device according to Embodiment 1.
Flowchart showing an example of the processing procedure executed by the DB processing device.
Explanatory diagram showing an example of the data (table) transmitted from the terminal device.
Explanatory diagram showing an example of the table definition reception screen.
Explanatory diagram showing an example of a decision tree.
Explanatory diagram showing an example of a hierarchical network structure.
Explanatory diagram showing an overview of complex network generation.
Explanatory diagram showing an example of a complex network structure.
Explanatory diagram showing an example of a drawn schema.
Explanatory diagram showing an example of the drawing after the schema is changed.
Explanatory diagram showing the constructed DB.
Block diagram showing the configuration of the information processing system according to Embodiment 2.
Flowchart showing an example of the processing procedure executed by the DB processing device according to Embodiment 2.

Hereinafter, the present invention is described specifically with reference to the drawings showing its embodiments. The embodiments below are examples, and the present invention is of course not limited to the following configurations.

(Embodiment 1)
FIG. 1 is a block diagram showing the configuration of the information processing system according to Embodiment 1. The information processing system of Embodiment 1 includes a DB processing device 1, a plurality of terminal devices 2, a DB interface providing device (hereinafter, DBIF providing device) 3, and a DB group 4. The DB processing device 1 is connected to the DB group 4 and the DBIF providing device 3 either directly or via a network N that includes the Internet. The DBIF providing device 3 is connected to the network N and can communicate with the plurality of terminal devices 2, which are likewise connected to the network N.

The DB group 4 stores various data organized into DBs. The DB group 4 may consist of a plurality of DBs connected via a network, and includes a management DB that records information indicating the relationships between the DBs and between the tables within each DB.

The DB processing device 1 is a device that automatically analyzes and models the data definitions in the DB group 4 and optimizes the data schema.

The DBIF providing device 3 is a server device that provides the terminal devices 2 with a Web-based interface to the DB group 4. The DBIF providing device 3 accepts operation instructions for the DB group 4 from a terminal device 2 and forwards them to the DB processing device 1, and sends a screen showing the result of processing in the DB processing device 1 back to the terminal device 2.

The terminal device 2 is a personal computer (PC) used by a user. Any information terminal capable of running a Web browser program may be used as the terminal device 2, such as a desktop PC, a notebook (laptop) PC, a tablet, or a smartphone.

FIG. 2 is a block diagram showing the internal configuration of the DB processing device 1, the terminal device 2, and the DBIF providing device 3 according to Embodiment 1. The DB processing device 1 includes a control unit 10, a storage unit 11, a temporary storage unit 12, and a communication unit 13.

The control unit 10 uses a CPU (Central Processing Unit). The control unit 10 reads and executes the DB modeling program 1P stored in the storage unit 11, thereby providing a modeling function for the DB group 4. The temporary storage unit 12 uses RAM such as DRAM (Dynamic Random Access Memory) and temporarily stores information generated by the processing of the control unit 10.

The storage unit 11 uses a nonvolatile storage medium such as a hard disk. The storage unit 11 stores the DB modeling program 1P described above.

The communication unit 13 is connected so that it can communicate with both the DBIF providing device 3 and the DB group 4.

The terminal device 2 includes a control unit 20, a storage unit 21, a temporary storage unit 22, a display unit 23, an operation unit 24, and a communication unit 25.

The control unit 20 uses a CPU. The control unit 20 reads and executes the various programs stored in the storage unit 21, thereby causing the terminal device 2 to provide a variety of functions. The temporary storage unit 22 uses RAM such as DRAM and temporarily stores information generated by the processing of the control unit 20.

The storage unit 21 uses a nonvolatile storage medium such as a hard disk or flash memory, and stores a browser program 211. The browser program 211 causes the control unit 20 to provide a Web browser function that receives HTML-based Web pages transmitted from the DBIF providing device 3 and displays the text, images, or video they contain on the display unit 23.

The display unit 23 uses, for example, a liquid crystal display; other types of display may of course be used. The control unit 20 causes the display unit 23 to display various screens containing text or images. The operation unit 24 uses input interfaces such as a keyboard and a pointing device. The control unit 20 controls each component based on the operation information entered through the input interfaces. The display unit 23 and the operation unit 24 may be integrated as a display with a built-in touch panel.

The communication unit 25 is connected to the network N, which includes the Internet, and sends and receives information to and from other devices via the network N. The control unit 20 uses the communication unit 25 to send and receive Web pages and the like, including the operation screens and the screens showing processing results that are transmitted from the DBIF providing device 3.

The DBIF providing device 3 includes a control unit 30, a storage unit 31, a temporary storage unit 32, and a communication unit 33.

The control unit 30 uses a CPU, and reads and executes the Web server program and the DBIF providing program stored in the storage unit 31, thereby providing a Web server function and an access interface function for the DB group 4. Specifically, the control unit 30 provides the plurality of terminal devices 2, via the network N, with Web pages that implement the access interface to the DB group 4. The temporary storage unit 32 uses RAM such as DRAM and temporarily stores information generated by the processing of the control unit 30.

The storage unit 31 uses a nonvolatile storage medium such as a hard disk and stores the DBIF providing program described above. The storage unit 31 also stores the resources that make up the access interface provided to the terminal devices 2 (source files describing each page in HTML and other languages, still images, and so on).

The communication unit 33 is connected to the network N and sends and receives information to and from the terminal devices 2 via the network N. The communication unit 33 is also connected so that it can communicate with the DB processing device 1.

In the information processing system configured in this way, the Web pages that implement the access interface provided by the DBIF providing device 3 are displayed by the browser function of the terminal device 2. In response to operations on the browser screen displayed on the terminal device 2, the DBIF providing device 3 accepts access to the DB group 4, the DB processing device 1 executes the processing corresponding to that access, and a screen showing the processing result is sent from the DBIF providing device 3 to the terminal device 2.

First, the interface information transmitted from the DBIF providing device 3 is described concretely with reference to an example of the browser screen that it produces. FIG. 3 is an explanatory diagram showing an example of the interface 231 displayed on the browser screen of the terminal device 2 according to Embodiment 1. When a user uses the terminal device 2 to log in to the Web service provided by the operator that manages the DBIF providing device 3, the top page displays a selectable list of the DBs to which the logged-in account (user) has access rights and of the local data for that account.

As shown in FIG. 3, the interface 231 includes buttons for creating new local data, copying, deleting, and uploading to the DB group 4. In the example of FIG. 3, which concerns business management data, the sales data (table) for September 2015 is displayed as selectable local data. When the user performs the operation to upload the September 2015 sales data, the local data is sent from the terminal device 2 via the DBIF providing device 3 to the DB processing device 1, together with information (an SQL statement) instructing that it be added to the user's DB in the DB group 4. The processing executed in the DB processing device 1 at this time is described below with reference to a flowchart and explanatory diagrams.

FIG. 4 is a flowchart showing an example of the processing procedure executed by the DB processing device 1. The control unit 10 of the DB processing device 1 executes the following processing based on the DB modeling program 1P stored in the storage unit 11.

The control unit 10 of the DB processing device 1 receives, via the communication unit 13, a relation (table) to be newly added to the DB group 4 (step S101). The control unit 10 decomposes the received table into its individual attributes (rows) (step S102).

For each decomposed attribute, the control unit 10 calculates the variation of the attribute values (numbers, character strings, and so on) as entropy (step S103). Here, greater variation corresponds to fewer repeated values within the attribute (row).
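The per-attribute variation of step S103 could be computed, for example, as Shannon entropy over the value frequencies. The text only says "entropy", so the choice of Shannon entropy with base-2 logarithms is an assumption, and the function name `attribute_entropy` and the sample table are hypothetical.

```python
import math
from collections import Counter

def attribute_entropy(values):
    """Shannon entropy (in bits) of the value distribution of one attribute.

    Assumption: the patent does not specify the entropy formula; this
    uses H = -sum(p * log2(p)) over the relative frequency p of each value.
    """
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical table, stored as attribute name -> list of values.
table = {
    "order_id": [1, 2, 3, 4],                    # all distinct
    "customer": ["A", "A", "B", "B"],            # two values, evenly split
    "region": ["East", "East", "East", "East"],  # a single repeated value
}
entropies = {name: attribute_entropy(col) for name, col in table.items()}
```

An attribute whose values never repeat reaches the maximum entropy for its length, which matches the description that more variation means fewer duplicates.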

The control unit 10 ranks the decomposed attributes in order of entropy (step S104) and extracts the attribute with the largest entropy, that is, the greatest variation (step S105).

From the attributes extracted in step S105, the control unit 10 further extracts as candidate keys those that do not violate the constraints on a key (step S106). The constraints include being non-NULL, field length (value), and the like.
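Steps S104 to S106 can be sketched as a ranking followed by a constraint filter. This is an illustration under stated assumptions: Shannon entropy is used for the ranking, NULL is modeled as Python `None`, and the field-length constraint is modeled as a hypothetical `max_len` limit on the string form of each value. None of these details are specified in the text.

```python
import math
from collections import Counter

def entropy(values):
    """Shannon entropy in bits (an assumed measure of variation)."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def candidate_keys(table, max_len=64):
    """Rank attributes by entropy, then keep top-ranked ones that satisfy
    the key constraints. `max_len` is a hypothetical field-length limit."""
    ranked = sorted(table, key=lambda a: entropy(table[a]), reverse=True)
    if not ranked:
        return [], []
    top = entropy(table[ranked[0]])
    keys = []
    for name in ranked:
        column = table[name]
        if not math.isclose(entropy(column), top):
            break  # only attributes tied for the largest entropy qualify
        if any(v is None for v in column):
            continue  # violates the non-NULL constraint
        if any(len(str(v)) > max_len for v in column):
            continue  # violates the assumed field-length constraint
        keys.append(name)
    return ranked, keys

# Hypothetical table: "memo" varies as much as "order_id" but contains NULL.
table = {
    "order_id": [1, 2, 3, 4],
    "memo": ["a", None, "b", "c"],
    "customer": ["A", "A", "B", "B"],
}
ranked, keys = candidate_keys(table)
```

In this sample, `memo` ties with `order_id` on entropy but is rejected because it contains a NULL, leaving `order_id` as the only candidate key.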

The control unit 10 calculates the functional dependencies between the decomposed attributes (step S107). Specifically, in step S107 the control unit 10 identifies, for each pairing of a candidate key with another attribute, the relationship between the candidate key or keys extracted in step S106 and the other attributes (non-key attributes) that they uniquely determine. In doing so, the control unit 10 gives priority to identifying functional dependencies between attributes with high variation (likely to be candidate keys) and attributes with low variation (likely to be non-key attributes).
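A functional dependency X → Y holds in a table when rows that agree on X always agree on Y. The check below is a minimal sketch of that standard definition applied to the sample data; it is not the procedure of step S107 itself, and the function name `holds` and the sample rows are hypothetical.

```python
def holds(rows, lhs, rhs):
    """Return True if the attributes in `lhs` functionally determine `rhs`.

    `rows` is a list of dicts; `lhs` is a list of attribute names and
    `rhs` is a single attribute name.
    """
    seen = {}
    for row in rows:
        key = tuple(row[a] for a in lhs)
        value = row[rhs]
        # The first value seen for a key is recorded; a later mismatch
        # means the dependency is violated.
        if seen.setdefault(key, value) != value:
            return False
    return True

# Hypothetical rows of an uploaded table.
rows = [
    {"order_id": 1, "customer": "A", "city": "Osaka"},
    {"order_id": 2, "customer": "A", "city": "Osaka"},
    {"order_id": 3, "customer": "B", "city": "Kobe"},
]

# Dependencies from the candidate key to every other attribute.
fds = [
    (key, attr)
    for key in ["order_id"]
    for attr in rows[0]
    if attr != key and holds(rows, [key], attr)
]
```

A dependency observed in sample data is only evidence, since later rows can violate it; a real system would need to revalidate as data is added.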

Based on the functional dependencies obtained in step S107, the control unit 10 creates a decision tree between the attributes in the table received in step S101 (step S108). This determines the primary key and the hierarchy among the attributes.

The control unit 10 divides the table into a plurality of tables based on the created decision tree (step S109). In detail, in step S109 the control unit 10 extracts each relationship as a bipartite graph (n-partite graph) based on the decision tree created in step S108, finds a maximum matching for each extracted bipartite graph, and performs a topological sort using the maximum matching to divide the table into a plurality of tables (Dulmage-Mendelsohn decomposition). Through step S109, the received table is normalized as a relational DB up to third normal form.
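The maximum-matching step that the Dulmage-Mendelsohn decomposition builds on can be sketched with the classic augmenting-path method (Kuhn's algorithm). This covers only the matching, not the subsequent topological sort or table split, and the graph encoding and attribute names are hypothetical.

```python
def maximum_matching(adj):
    """Maximum matching of a bipartite graph by augmenting paths.

    `adj` maps each left vertex to the list of right vertices it is
    connected to. Returns a dict of matched left -> right vertices.
    """
    match_right = {}  # right vertex -> left vertex currently matched to it

    def try_assign(u, visited):
        for v in adj[u]:
            if v in visited:
                continue
            visited.add(v)
            # Take v if it is free, or if its current partner can move.
            if v not in match_right or try_assign(match_right[v], visited):
                match_right[v] = u
                return True
        return False

    for u in adj:
        try_assign(u, set())
    return {u: v for v, u in match_right.items()}

# Hypothetical bipartite graph: determining attributes on the left,
# dependent attributes on the right.
adj = {
    "order_id": ["amount", "customer_name"],
    "product_id": ["amount"],
    "customer_id": ["customer_name"],
}
matching = maximum_matching(adj)
```

Here `order_id` first takes `amount`, then gives it up to `product_id` along an augmenting path and takes `customer_name` instead, which leaves `customer_id` unmatched because only two right-hand vertices exist.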

The control unit 10 creates a hierarchical network structure from the tables divided in step S109 (step S110). The control unit 10 then generates a complex network from the hierarchical network (step S111).

In step S111, the control unit 10 calculates the average inter-vertex distance of the hierarchical network, treating each attribute as a vertex. It then generates links between vertices (attributes) so that the average inter-vertex distance over the entire hierarchical network (all relations) becomes smaller. Links are generated under the following constraints: only adjacent levels of the hierarchical network are linked (no level is skipped), and vertices with many outgoing links (hubs), that is, attributes serving as foreign keys, are not connected to one another. In other words, the control unit 10 generates a link from a given vertex to another attribute that lies at a distance equal to or greater than the calculated average inter-vertex distance and that is subordinate to a hub (that is, a non-hub attribute). Specifically, the control unit 10 extracts a list of pairs of unconnected attributes that satisfy the constraints and calculates, for each pair in the list, the average inter-vertex distance and the total number of outgoing links that would result from connecting it. From the list, the control unit 10 identifies the attribute pair that yields the smallest average inter-vertex distance without increasing the total number of outgoing links, and connects that pair. Step S111 partially relaxes the normalization completed in step S109, making it possible to automatically create a DB with a strong small-world property and a weak scale-free property. A strong small-world property combined with a weak scale-free property reduces join operations between tables and thereby avoids degraded processing performance, while yielding a DB that does not suffer anomalies on update.
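The link selection in step S111 can be sketched as follows. This is a simplified illustration, not the procedure of the specification: it computes the average inter-vertex distance by breadth-first search on an undirected graph and tries each unconnected, non-hub pair in adjacent levels, but it omits the distance threshold and the outgoing-link total. The graph, the level assignment, and the choice to treat the primary key A as a hub are all assumptions made for the example.

```python
from collections import deque
from itertools import combinations

def average_distance(adj):
    """Mean shortest-path length over all vertex pairs of a connected graph."""
    total, pairs = 0, 0
    for src in adj:
        dist = {src: 0}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def best_link(adj, level, hubs):
    """Return (score, u, v) for the candidate link that minimises the average distance."""
    best = None
    for u, v in combinations(sorted(adj), 2):
        if v in adj[u] or u in hubs or v in hubs:
            continue  # already connected, or touches a hub
        if abs(level[u] - level[v]) != 1:
            continue  # same level, or a level would be skipped
        trial = {k: set(n) for k, n in adj.items()}
        trial[u].add(v)
        trial[v].add(u)
        score = average_distance(trial)
        if best is None or score < best[0]:
            best = (score, u, v)
    return best

# Hypothetical network: table {A, B, C, D} keyed by A; table {C, E, F} keyed by C.
adj = {
    "A": {"B", "C", "D"},
    "B": {"A"}, "C": {"A", "E", "F"}, "D": {"A"},
    "E": {"C"}, "F": {"C"},
}
level = {"A": 0, "B": 0, "C": 0, "D": 0, "E": 1, "F": 1}
hubs = {"A", "C"}  # assumption: the primary key A is excluded like a hub

base = average_distance(adj)
score, u, v = best_link(adj, level, hubs)
```

In this example the first qualifying pair that lowers the average distance is chosen; ties are broken by alphabetical order, which is an arbitrary choice of the sketch.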

Based on the complex network generated in step S111, the control unit 10 draws the schema of the input table (step S112). The control unit 10 transmits the information obtained by the drawing to the terminal device 2 via the DBIF providing device 3 (step S113).

In step S112, the control unit 10 places the object corresponding to the primary key attribute (the most upstream attribute) of the hierarchical network structure at the center of the drawing area, and draws the objects corresponding to the other attributes in the same level arranged in a circle. The radius of the circle is set based on the sum of the entropies (degrees of variation) of the attributes included in the table. The control unit 10 handles the attributes of lower levels in the same way, working outward in the drawing area: for each decomposed table, the objects corresponding to the attributes belonging to that table are arranged in a circle. The average inter-vertex distance calculated in step S111 may also be displayed. Every object is card-shaped, and the control unit 10 creates the screen information representing the schema so that the objects are drawn on the display unit 23 of the terminal device 2 in a form that can be selected and moved by means of JavaScript (registered trademark).

The arrangement of objects is not limited to a circle; an ellipse or any other shape may be used as long as its size can express the sum of the entropies. The larger the shape (for example, the circle), the more candidate keys the table contains, indicating that circles corresponding to further tables may be connected to its outside.
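The circular arrangement can be illustrated with a short sketch. It assumes the radius is simply proportional to the sum of the attribute entropies and that cards are spaced evenly on the circumference; the function name, the `scale` factor, and the entropy values are hypothetical.

```python
import math

def circle_layout(attributes, entropies, center=(0.0, 0.0), scale=10.0):
    """Place one card per attribute evenly on a circle sized by total entropy."""
    radius = scale * sum(entropies[a] for a in attributes)
    cx, cy = center
    positions = {}
    for i, attr in enumerate(attributes):
        angle = 2 * math.pi * i / len(attributes)
        positions[attr] = (cx + radius * math.cos(angle),
                           cy + radius * math.sin(angle))
    return radius, positions

# Hypothetical entropies (bits) for the attributes of one table.
entropies = {"sales_id": 2.0, "customer_id": 1.5, "order_date": 1.0, "quantity": 0.5}
radius, positions = circle_layout(list(entropies), entropies)
```

A table whose attributes vary more receives a larger circle, which matches the description that larger shapes indicate tables with more candidate keys.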

The control unit 10 further accepts operations on the schema screen drawn on the display unit 23 of the terminal device 2, and receives information indicating the accepted operations via the DBIF providing device 3 (step S114). In accordance with the accepted operations, the control unit 10 constructs a DB in the DB group 4 from the table input in step S101, based on the functional dependencies, the complex network structure, and the like obtained in steps S107 to S110 (step S115), and ends the processing.

The processing procedure shown in the flowchart of FIG. 4 is described below with a specific example. FIG. 5 is an explanatory diagram showing an example of the data (table) transmitted from the terminal device 2. The example in FIG. 5 is the sales data shown in FIG. 3, with the actual values of each sale (record), such as order date, customer number, customer name, prefecture, address, customer type, product number, category name, product name, and the name of the employee in charge of the sale, arranged in tabular form. The sales data also includes the unit price, quantity, and amount, which are omitted from FIG. 5. The sales data is to be added to the business data, one of the DBs to which the user has access rights. The business data additionally has a relation (table) for employees whose attributes include employee name, height, weight, birthday, e-mail, year of joining, salary information, and affiliated department information.

When the upload operation for the table shown in FIG. 5 is performed on the terminal device 2, the control unit 10 of the DB processing device 1 inputs the table (S101) and decomposes it, together with the business data to which it is to be added, by attribute (row) (S102). At this point the control unit 10 transmits, via the DBIF providing device 3 to the terminal device 2, the screen information for a screen that accepts from the user the table definition of the newly input table. FIG. 6 is an explanatory diagram showing an example of the table definition acceptance screen. As shown in FIG. 6, the control unit 10 identifies the leading character string of each row of the uploaded table as an attribute name, and for each decomposed attribute the user can select the type of attribute value (numeric or character string), the length, whether empty (NULL) values are permitted, and so on.

In addition to the information accepted on the table definition acceptance screen shown in FIG. 6, other reference information for identifying relationships among the attributes of the uploaded table may be accepted. For example, when a plurality of tables are uploaded, the DB processing device 1 may accept, at the terminal device 2 via the DBIF providing device 3, a selection of whether the tables are related. A designation of the primary key among the plurality of decomposed attributes shown in FIG. 6 may also be accepted, as may an instruction to deliberately add an ID attribute corresponding to an attribute.

The control unit 10 of the DB processing device 1 extracts candidate key attributes from the plurality of attributes (S106). At this point, for attributes with high entropy that need an ID attribute (where one is needed to make records unique), the control unit 10 automatically adds the ID attribute. For example, when the attributes of the input sales data are order date, customer number, name, prefecture, address, customer type, product number, category name, product name, unit price, quantity, amount, and employee name, records containing multiple product numbers are put into first normal form and a sales ID attribute is added. The sales ID attribute becomes the attribute with the greatest variation. A customer ID, product ID, and employee ID are added in the same way, either automatically or at the user's instruction. After adding the ID attributes, the control unit 10 calculates the functional dependencies (S107) and creates the decision tree (S108). In calculating functional dependencies, the control unit 10 takes one attribute with high entropy and determines whether its attribute value uniquely determines the attribute value of another attribute; if so, the other attribute is identified as functionally dependent on it. In other words, the control unit 10 identifies relationships in which, whenever the attribute values of the one attribute are the same, the attribute values of the other attribute are also the same.

FIG. 7 is an explanatory diagram showing an example of the decision tree. With the sales ID attribute, the attribute with the greatest variation, as the most upstream node, a tree structure such as that shown in FIG. 7 is identified based on the functional dependencies.

Next, the control unit 10 divides the table in step S109 according to the decision tree shown in FIG. 7 and creates a hierarchical network structure. FIG. 8 is an explanatory diagram showing an example of the hierarchical network structure. As shown in FIG. 8, based on the decision tree, the attributes of the same level, namely the most upstream sales ID together with quantity, customer ID, order date, product ID, and employee ID, are separated out as one table. Likewise, the attributes of the same level consisting of the customer ID, which also belongs to the table above, together with address, prefecture ID, customer type ID, (customer) name, and customer number are separated out as one table. The attributes of the same level consisting of the product ID, which also belongs to the table containing the sales ID, together with product name, product number, product category ID, and unit price are separated out as one table. Further, the attributes of the same level consisting of the employee ID, which also belongs to the table containing the sales ID, together with employee name, height, weight, e-mail, year of joining, birthday, blood type, affiliation ID, and salary ID already exist as an existing table. Attributes belonging to the same table are grouped, as indicated by the broken lines. Related tables are thus processed (updated) together.
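The grouping shown in FIG. 8 can be approximated with a small sketch. It does not implement the Dulmage-Mendelsohn decomposition named for step S109; it only groups each key attribute with its direct dependents, given a tree expressed as a child-to-parent mapping. The mapping below is a hypothetical subset of the attributes in the example, with assumed identifier names.

```python
from collections import defaultdict

def split_into_tables(parent):
    """Group each key attribute with the attributes that depend on it directly."""
    dependents = defaultdict(list)
    for attr, key in parent.items():
        if key is not None:
            dependents[key].append(attr)
    return {key: [key] + deps for key, deps in dependents.items()}

# Hypothetical decision tree: attribute -> attribute it depends on (None = root).
parent = {
    "sales_id": None,
    "quantity": "sales_id",
    "customer_id": "sales_id",
    "order_date": "sales_id",
    "product_id": "sales_id",
    "customer_name": "customer_id",
    "address": "customer_id",
    "product_name": "product_id",
    "unit_price": "product_id",
}
tables = split_into_tables(parent)

# Keys that also appear as dependents of another key act as foreign keys.
foreign_keys = sorted(k for k in tables if parent[k] is not None)
```

Each entry of `tables` corresponds to one circle of attributes in the hierarchical network, and `foreign_keys` lists the attributes shared between a table and the table one level below it.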

The control unit 10 then generates a complex network from the hierarchical network (S111). FIG. 9 is an explanatory diagram outlining complex network generation. When a hierarchical network structure such as that shown in FIG. 9 exists, the average inter-vertex distance is calculated first. In the example of FIG. 9, the distance from the primary key A to each of the vertices B, C, and D in the same table is 1; the distance between any two of the non-key attributes B, C, and D, any two of E, F, and G, or any two of H, I, and J is 2, since each path passes through the key; and the distance from vertex A to vertex K is 3. The average inter-vertex distance is obtained by calculating these distances for all pairs of vertices and taking the mean. The control unit 10 extracts, from the pairs of vertices, those that are not yet connected, and from these extracts the pairs that are not in the same level and are not two or more levels apart. For example, the pair of vertices B and C is in the same level (table), and the pair of vertices E and I is two levels apart, so neither is extracted. Then the pairs whose inter-vertex distance is greater than the average inter-vertex distance and that join non-hub vertices (vertices other than C and G) are extracted. In the example of FIG. 9, combinations such as any of the vertices B, C, and E with vertex F or vertex H are extracted. Among the extracted pairs, the vertices of those that satisfy a predetermined criterion are connected by a dependency (link). The predetermined criterion may concern the inter-vertex distance, whether the vertices belong to a higher or lower level, whether the types of attribute values (character string or numeric) at the vertices (attributes) are the same, and so on. For example, when an extracted pair includes a vertex in a higher level and the attribute values of the attributes corresponding to the vertices are numeric, the pair is judged likely to be used in frequently performed calculations, and a dependency (link) is established. A new link such as that shown in FIG. 9 is thereby connected.

FIG. 10 is an explanatory diagram showing an example of the complex network structure. The example in FIG. 10 is generated from the hierarchical network of FIG. 8. As shown in FIG. 10, among the unconnected pairs, "unit price" is connected preferentially as a non-hub vertex (attribute) in an adjacent level where both attributes of the pair are numeric. As a result, unit price and quantity are included in the same table, which is expected to reduce the processing load of table join operations.

The schema of the relations is then drawn based on the complex network generated as shown in FIG. 10 (S112). FIG. 11 is an explanatory diagram showing an example of schema drawing. When local data is uploaded through the interface 231 shown in FIG. 3, the "modeling deck" function presents, as shown in FIG. 11, a screen on which the schema of the DB into which the uploaded data is integrated is drawn. The drawing information for these schemas is created by the control unit 10 of the DB processing device 1 and transmitted to the terminal device 2 via the DBIF providing device 3. As shown in FIG. 11, in the schema drawing example of the present embodiment, for each divided table of the complex network structure shown in FIG. 10, the attributes included in the table are arranged in a circle as corresponding cards (objects) 232. The size of the circle 233 corresponding to each table corresponds to the sum of the entropies (degrees of variation) of its attributes.

In the schema drawing shown in FIG. 11, each table is displayed as a circle 233 and its attributes are placed as cards 232 on the circumference of the circle 233. In the example of FIG. 11, as can be seen from the hierarchical network structure of FIG. 8 (FIG. 10), the primary key is "sales ID", and the attributes directly dependent on the primary key (customer ID, order date, product ID, quantity, and employee ID) are drawn as the first-dimension relation (table) as close to the center of the "modeling deck" as possible. A table corresponding to a circle 233 adjacent to the outside of the circle 233 of the first-dimension table is in the second dimension. A table corresponding to a circle 233 adjacent further outward to the circle 233 of a second-dimension table is in the third dimension, and a table corresponding to a circle 233 adjacent still further outward is in the fourth dimension. In the example of FIG. 11, the customer table, product table, and employee table are in the second dimension; the prefecture table, customer type table, category table, affiliation table, and salary table are in the third dimension; and the department table located outside the affiliation table is in the fourth dimension.

As shown in FIG. 11, the first-dimension table and a second-dimension table, a second-dimension table and a third-dimension table, and a third-dimension table and the fourth-dimension table are each adjacent by sharing, as a point of contact, the attribute that serves as the foreign key.

Each card 232 corresponding to an attribute in FIG. 11 can be selected with the operation unit 24; when the user clicks a card 232 with the operation unit 24, the information on that attribute is displayed. That is, the control unit 10 of the DB processing device 1 associates the information on each attribute with the schema drawing information and transmits both to the terminal device 2. The circle 233 corresponding to each table can also be selected. When the user clicks a circle 233 with the operation unit 24, the information on the corresponding table is displayed. The name of each table may be determined automatically by the control unit 10 of the DB processing device 1 using a word common to its attribute names, or the user may specify it by selecting the circle 233 on the "modeling deck" shown in FIG. 11 and editing the table information displayed at that time.

Each card 232 on the "modeling deck" shown in FIG. 11 is configured so that it can be selected on the screen of the display unit 23. The user can drag a card 232 with the operation unit 24 to move its position (level) within a table. The user can also grab a circle 233 with the operation unit 24, move it, and merge it into another circle 233.

FIG. 12 is an explanatory diagram showing a drawing example after the schema has been changed. FIG. 12 shows a display example in which the card 232 corresponding to the "prefecture name" attribute in FIG. 11 has been moved into the circle 233 corresponding to the "customer" table. This operation adds the "prefecture name" attribute to the "customer" table as well, so that the table is denormalized by the user's operation. In addition, on the "modeling deck" shown in FIGS. 11 and 12, a new table can be placed (added) by dragging and dropping the circular icon 234. The content of the operations accepted in this way is transmitted from the terminal device 2 to the DB processing device 1 via the DBIF providing device 3 in a data format such as JSON, and is accepted (S114).

Further, an "OK" button 235 is displayed on the "modeling deck" shown in FIGS. 11 and 12. Pressing the "OK" button constructs a DB for the input table using the schema drawn on the "modeling deck" (S115).

FIG. 13 is an explanatory diagram showing the constructed DB. As shown in FIG. 13, a relational DB is constructed. The DB is constructed by first normalizing to the third normal form and then partially denormalizing (shown by bold lines), both through the complex network generation and according to the user operation shown in FIG. 12.

In this way, a relational schema is generated mechanically and automatically with the user doing little more than uploading a self-made tabular relation (table) to the DB processing device 1 via the DBIF providing device 3. Even a user without expert knowledge of DBs and the techniques for handling them can therefore update a DB easily.

Furthermore, the schema is visualized and drawn as shown in FIG. 11 by a method different from a conventional ER diagram. The user can thus intuitively grasp the DB schema of the relation that the user uploaded; even a user with no knowledge of how to access a DB can intuitively grasp the automatically generated DB schema.

In addition, because each attribute and table in the visualized schema is rendered as an object that accepts selection and movement operations, the user can adjust the structure within the database through the intuitive operation of moving objects on the screen, as shown in FIG. 12. Even a user without expert knowledge of DBs and the techniques for handling them can therefore update a DB easily.

In Embodiment 1, the relation to be uploaded is analyzed when the user performs an operation to upload local data. However, the target of analysis is not limited to relations being uploaded and may be an existing DB already stored in the DB group 4. This makes it possible to visualize the schema of an existing DB and, further, to change that schema intuitively.

(Embodiment 2)
FIG. 14 is a block diagram showing the configuration of the information processing system in Embodiment 2. The configuration is the same as in Embodiment 1 except that the DB processing device 1 performs processing that refers to learning data for the DB group 4. Configurations common to Embodiment 1 are therefore given the same reference numerals and their detailed description is omitted. The DB group 4 in Embodiment 2 includes learning data 41 in addition to the various data converted into DBs (actual data) and the management DB that records information indicating the relationships among the DBs and among the tables within the DBs.

The learning data 41 has dictionary data prepared in advance for DB attribute names, including a synonym dictionary, a co-occurrence dictionary, and the like, and corpora containing attribute names and relations (tables) of attribute names. The synonym dictionary in the dictionary data states, for example, that "売上" (sales), "売り上げ" (sales, alternate spelling), "注文" (order), "セールス" (sales, loanword), and "Sales" are synonyms. The co-occurrence dictionary states, for example, that "address", "prefecture", and "telephone number" are attribute names that occur in the same table. A corpus is a collection of tables created in the past, either by designers or through learning from SQL, each treated as a sentence consisting of a combination of primary key and non-key attributes. The learning data includes a plurality of kinds of corpora, classified for example into a first corpus containing "order date / product name / quantity / prefecture name", "order date and time / product number / quantity / unit price / amount", "order number / product number / quantity / amount / address", and the like, and a second corpus containing "customer name / customer number / customer type", "customer number / customer name / customer type / address", "customer name / customer type / prefecture / age", and the like.

The corpora of the learning data 41 may be based on initial data created in advance and then successively aggregated and learned as users add DBs. The learning data 41 may also be created separately for each user account or account group and learned to match each user's way of thinking.

FIG. 15 is a flowchart showing an example of the processing procedure executed by the DB processing device 1 in Embodiment 2. Steps in FIG. 15 that are common to the procedure shown in the flowchart of FIG. 4 for Embodiment 1 are given the same step numbers and their detailed description is omitted.

Before generating the complex network in S111, the control unit 10 of the DB processing device 1 extracts, from the corpora of the learning data 41, the kind of corpus that approximates the attributes of the relations normalized by dividing the table in S109 (step S1101). In step S1101, the control unit 10 first calculates the term frequency-inverse document frequency (TF-IDF) for the relation and the TF-IDF for each corpus. The control unit 10 then refers to the synonym dictionary data to unify synonymous attribute names across the relation and each corpus. It then uses the cosine distance (cosine similarity function) to determine which corpus the relation approximates, and identifies that corpus as the approximate corpus.
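The corpus matching of step S1101 can be sketched as follows. The sketch makes several assumptions that the specification does not state: each corpus is flattened into one list of attribute names, synonyms are unified before weighting rather than after, and a smoothed IDF is used. The synonym table and the English identifiers are hypothetical stand-ins for the attribute names in the example.

```python
import math
from collections import Counter

# Hypothetical synonym dictionary: variant name -> canonical name.
SYNONYMS = {"order_datetime": "order_date", "order_number": "sales_id"}

def normalise(names):
    """Replace each attribute name with its canonical synonym."""
    return [SYNONYMS.get(n, n) for n in names]

def tfidf_vectors(documents):
    """Map each document name to a {term: TF-IDF weight} vector."""
    n = len(documents)
    df = Counter(t for doc in documents.values() for t in set(doc))
    vectors = {}
    for name, doc in documents.items():
        tf = Counter(doc)
        vectors[name] = {
            t: (c / len(doc)) * (math.log((1 + n) / (1 + df[t])) + 1)
            for t, c in tf.items()
        }
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    na = math.sqrt(sum(w * w for w in a.values()))
    nb = math.sqrt(sum(w * w for w in b.values()))
    return dot / (na * nb) if na and nb else 0.0

docs = {
    "relation": normalise(["sales_id", "order_date", "product_id",
                           "quantity", "customer_id", "employee_id"]),
    "corpus1": normalise(["order_date", "product_name", "quantity", "prefecture_name",
                          "order_datetime", "product_number", "quantity", "unit_price",
                          "amount", "order_number", "product_number", "quantity",
                          "amount", "address"]),
    "corpus2": normalise(["customer_name", "customer_number", "customer_type",
                          "customer_number", "customer_name", "customer_type", "address",
                          "customer_name", "customer_type", "prefecture", "age"]),
}
vectors = tfidf_vectors(docs)
sims = {name: cosine(vectors["relation"], vectors[name]) for name in ("corpus1", "corpus2")}
best = max(sims, key=sims.get)
```

With these inputs the relation shares order date, quantity, and sales ID with the first corpus and nothing with the second, so the first corpus is selected, in line with the example given for step S1101.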

In step S1101, for example, when one of the relations obtained by dividing the user's relation is "sales ID / order date / product ID / quantity / customer ID / employee ID", the control unit 10 can extract the first corpus described above as the approximate corpus.

From the extracted approximate corpus, the control unit 10 extracts attribute names that form co-occurrence pairs with the attribute names included in the relations obtained by the division in S109 (step S1102). The attributes extracted as co-occurrence pairs are non-key attributes that are unconnected and not present in the level (table) (words that are not duplicated across different levels). In addition, the average of all co-occurrence rates in the approximate corpus may be calculated and a filter applied according to the difference (distance) from that average. That is, the control unit 10 may extract only attribute names whose co-occurrence rate is approximately equal to the average, or may refer to the co-occurrence rates of attribute names previously chosen in the selection operation described later and extract similar attribute names.

As an example of step S1102, when "prefecture name" appears with high frequency in the extracted first corpus, the control unit 10 extracts "prefecture name" as a co-occurrence pair. In relations created in the past, designers have sometimes, as a matter of experience, added a "prefecture name" attribute to a table containing the order date, product ID, quantity, and so on, so that a later join operation, for example when classifying sales by region, could be omitted. Extracting co-occurrence pairs from the corpus in this way makes it possible to create a DB that achieves the same omission of join operations.
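The extraction and filtering of step S1102 can be sketched as follows. The text does not define how the co-occurrence rate is computed, so this sketch assumes one: the share of past tables overlapping the relation that also contain the candidate. The function name, the `tolerance` parameter, and the sample tables are hypothetical, and "not already in the hierarchy" is simplified to "not already in the relation".

```python
def cooccurrence_candidates(relation, corpus, key_attrs, tolerance=0.25):
    """Return (attribute, rate) pairs for candidate co-occurring attributes.

    relation  : attribute names of the divided relation
    corpus    : list of sets, each the attribute names of one past table
    key_attrs : attribute names treated as keys (excluded from candidates)
    tolerance : maximum allowed distance between a rate and the mean rate
    """
    rel = set(relation)
    # Only past tables that share at least one attribute with the relation count.
    overlapping = [table for table in corpus if table & rel]
    if not overlapping:
        return []
    # Candidates: non-key attributes that the relation does not contain yet.
    candidates = set().union(*overlapping) - rel - set(key_attrs)
    rates = {
        cand: sum(1 for table in overlapping if cand in table) / len(overlapping)
        for cand in candidates
    }
    if not rates:
        return []
    mean_rate = sum(rates.values()) / len(rates)
    # Filter by the difference (distance) from the mean co-occurrence rate.
    kept = [(c, r) for c, r in rates.items() if abs(r - mean_rate) <= tolerance]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)

# Hypothetical past tables forming the approximate corpus.
corpus = [
    {"sales_id", "order_date", "product_id", "quantity", "prefecture_name"},
    {"sales_id", "order_date", "quantity", "customer_id", "prefecture_name"},
    {"sales_id", "product_id", "quantity", "employee_id", "memo"},
]
relation = ["sales_id", "order_date", "product_id", "quantity",
            "customer_id", "employee_id"]
keys = {"sales_id", "product_id", "customer_id", "employee_id"}
result = cooccurrence_candidates(relation, corpus, keys)
```

Here "prefecture_name" appears in two of the three overlapping tables and "memo" in one, so "prefecture_name" ranks first. Narrowing `tolerance` removes candidates whose rate lies far from the mean, which corresponds to the filter described above.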

The control unit 10 compiles the extracted attribute names into a list, transmits it to the terminal device 2, and accepts via the display unit 23 a selection of which attribute names to use (step S1103). When generating the complex network in step S111, the control unit 10 then denormalizes using the attribute names selected in step S1103. Specifically, in a relation with a one-to-many dependency, the selected (non-key) attribute is absorbed into the child side (higher-dimension side) or into the parent side (lower-dimension side); alternatively, a one-to-one dependency (for example, "prefecture ID" and "prefecture name", or "customer type ID" and "customer type name", in FIG. 10) is merged.

In step S1103, the control unit 10 may instead, without accepting a selection from the user, assign priorities to the list and connect the top several attributes as co-occurrence pairs. In the example above, "prefecture name" is absorbed into the first-dimension relation that contains the sales ID.
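The automatic variant of step S1103 followed by denormalization can be sketched as follows. This is an illustration only: the helper names, the scores, and the sample rows are hypothetical, and the sketch assumes that "prefecture name" is held in a customer table linked to the sales table by "customer_id", which the text does not state.

```python
def top_candidates(scored, k):
    """Rank (attribute, score) pairs by descending score and keep the first k names."""
    ranked = sorted(scored, key=lambda pair: pair[1], reverse=True)
    return [name for name, _ in ranked[:k]]

def absorb_into_child(child_rows, parent_rows, key, attr):
    """Copy a non-key attribute from the one side into every row of the many side."""
    lookup = {row[key]: row[attr] for row in parent_rows}
    # Each child row gains the attribute, so later queries need no join for it.
    return [{**row, attr: lookup.get(row[key])} for row in child_rows]

# Hypothetical normalized relations: one customer has many sales.
customers = [
    {"customer_id": 1, "prefecture_name": "Tokyo"},
    {"customer_id": 2, "prefecture_name": "Osaka"},
]
sales = [
    {"sales_id": 10, "customer_id": 1, "quantity": 3},
    {"sales_id": 11, "customer_id": 2, "quantity": 1},
    {"sales_id": 12, "customer_id": 1, "quantity": 5},
]

# Priorities assigned without user input; the scores are made up for the sketch.
chosen = top_candidates([("memo", 0.33), ("prefecture_name", 0.67)], k=1)
wide_sales = absorb_into_child(sales, customers, "customer_id", chosen[0])
```

After this step each sales row carries its own "prefecture_name", so classifying sales by region no longer requires joining against the customer table, at the cost of storing the value redundantly.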

The control unit 10 stores the relation changed on the basis of the operation accepted in S114 separately in the learning data 41, in association with the user account (step S1141), and constructs the DB (S115).

In this way, simply by uploading a tabular relation (table) that they created to the DB processing apparatus 1 via the DBIF providing device 3, the user can obtain a relational schema that is normalized so that no update anomalies occur and that also incorporates denormalization in consideration of the load of arithmetic processing and the like. That is, even a user without expert knowledge of DBs and the technology for handling them can easily update a DB.

The control unit 10 may also refer to the learning data when calculating the functional dependencies between attributes in step S107. This makes it possible to use the relationships between attributes in previously created DBs as collective intelligence and to construct a DB that is considered appropriate on empirical grounds.
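For reference, a functional dependency between attributes can be tested directly against the stored rows. The following is a generic sketch of such a check, not the calculation of step S107 itself; the function name and sample rows are hypothetical, and the use of learning data as a further hint is not shown.

```python
def holds(rows, lhs, rhs):
    """Return True if the attributes in lhs functionally determine those in rhs.

    The dependency lhs -> rhs holds when rows that agree on every lhs
    attribute also agree on every rhs attribute.
    """
    seen = {}
    for row in rows:
        left = tuple(row[a] for a in lhs)
        right = tuple(row[a] for a in rhs)
        # setdefault stores the first rhs value seen for each lhs value.
        if seen.setdefault(left, right) != right:
            return False
    return True

# Hypothetical rows from an uploaded table.
rows = [
    {"sales_id": 10, "prefecture_id": 13, "prefecture_name": "Tokyo"},
    {"sales_id": 11, "prefecture_id": 27, "prefecture_name": "Osaka"},
    {"sales_id": 12, "prefecture_id": 13, "prefecture_name": "Tokyo"},
]
id_determines_name = holds(rows, ["prefecture_id"], ["prefecture_name"])
name_determines_sale = holds(rows, ["prefecture_name"], ["sales_id"])
```

In this sample, "prefecture_id" determines "prefecture_name", whereas "prefecture_name" does not determine "sales_id" because two different sales share the same prefecture. A check on data alone can only show that a dependency is not contradicted by the current rows, which is one reason knowledge from previously created DBs can be a useful complement.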

By referring to the synonym dictionary, co-occurrence dictionary, and corpora included in the learning data 41, it also becomes possible to analyze comprehensively DBs that were entered and constructed by different designers. Attributes defined with the same meaning may be given different attribute names, and attributes defined with the same name may have different logical meanings within a DB. The processing based on the learning data in the DB processing apparatus 1 according to Embodiment 2 makes it possible to integrate different DBs and thus to analyze big data. Furthermore, the integrated DB serves as collective intelligence for the relation (table) added next.

Furthermore, the schema is visualized and drawn as shown in FIGS. 11 and 12, using a method different from a conventional ER diagram. This allows users to grasp intuitively the DB schema of the relation they uploaded. Even a user with no knowledge of how to access a DB can thus intuitively grasp the automatically generated DB schema.

The embodiments disclosed herein should be considered illustrative in all respects and not restrictive. The scope of the present invention is indicated not by the meanings described above but by the claims, and is intended to include all modifications within the meaning and scope equivalent to the claims.

Description of Symbols
1 DB processing apparatus (central apparatus)
10 Control unit
11 Storage unit
13 Communication unit
1P DB processing program
2 Terminal device (external device)
20 Control unit
23 Display unit
231 Interface
3 DBIF providing device

Claims

A database processing program for causing a computer that communicates with a database to execute processing on the database, the program causing the computer to execute:
an analysis step of analyzing a relation of the database;
a step of associating an object with each of a plurality of attributes in the analyzed relation; and
a drawing step of arranging and drawing the associated objects along the outline of a predetermined shape.

The database processing program according to claim 1, wherein selection and placement operations on the drawn objects are accepted, and the program further causes the computer to execute:
a step of accepting an arrangement of the objects according to the operations; and
a step of changing the relation of the database according to the accepted arrangement.

The database processing program according to claim 1 or 2, wherein, when the analysis finds that the database is divided into a plurality of relations, the drawing step draws predetermined shapes, corresponding in number to the divided relations, adjacent to one another, arranges objects corresponding to attributes common to the relations at the adjacent portions, and draws objects for the attributes belonging to each of the plurality of relations on the outline other than the adjacent portions.

The database processing program according to claim 3, wherein the analysis step includes a step of identifying a hierarchical relationship among the plurality of relations, and
the drawing step draws a higher-level relation so that it is placed at the center of the screen.

The database processing program according to claim 3, wherein the size at which each predetermined shape, on which the objects corresponding to the attributes belonging to each of the plurality of relations are arranged, is drawn indicates the degree of variation, within the database, of the attribute values of the attributes belonging to that relation.

The database processing program according to any one of claims 1 to 5, wherein the analysis step includes the steps of:
decomposing the database by attribute;
calculating, for each attribute, the magnitude of variation, within the database, of the attribute values of that attribute;
assigning a rank to each attribute based on the calculated magnitude;
extracting candidate keys based on the assigned ranks;
identifying functional dependencies between the decomposed attributes based on the candidate keys;
creating a decision tree between attributes based on the identified functional dependencies;
dividing the database into a plurality of relations based on the created decision tree; and
creating a hierarchical network based on the decision tree.

The database processing program according to claim 6, wherein the analysis step further includes a step of generating a complex network in which attributes that are not in a direct parent-child relationship in the hierarchical network are linked.

A database processing method performed by a computer that executes processing on a database, wherein the computer:
analyzes a relation of the database;
associates an object with each of a plurality of attributes in the analyzed relation; and
creates screen information in which the associated objects are arranged and drawn along the outline of a predetermined shape.

The database processing method according to claim 8, wherein the computer analyzes the database by executing the steps of:
decomposing the database by attribute;
calculating, for each attribute, the magnitude of variation, within the database, of the attribute values of that attribute;
assigning a rank to each attribute based on the calculated magnitude;
extracting candidate keys based on the assigned ranks;
identifying functional dependencies between the decomposed attributes based on the candidate keys;
creating a decision tree between attributes based on the identified functional dependencies;
dividing the database into a plurality of relations based on the created decision tree; and
creating a hierarchical network based on the decision tree.

A database processing apparatus comprising means for communicating with each of a database and an external apparatus and for transmitting results of processing on the database to the external apparatus, the database processing apparatus comprising:
analysis means for analyzing a relation of the database; and
means for associating an object with each of a plurality of attributes included in the analyzed relation and creating screen information in which the associated objects are arranged and drawn along the outline of a predetermined shape.

The database processing apparatus according to claim 10, wherein the analysis means includes:
means for decomposing the database by attribute;
means for calculating, for each attribute, the magnitude of variation, within the database, of the attribute values of that attribute;
means for assigning a rank to each attribute based on the calculated magnitude;
means for extracting candidate keys based on the assigned ranks;
means for identifying functional dependencies between the decomposed attributes based on the candidate keys;
means for creating a decision tree between attributes based on the identified functional dependencies;
means for dividing the database into a plurality of relations based on the created decision tree; and
means for creating a hierarchical network based on the decision tree.