JP6633009B2 - Table data analysis program

Info

Publication number: JP6633009B2
Application number: JP2017016994A
Authority: JP
Inventors: 神 明夫; 井上 雅之; 田中 弘一; 田端 啓一; 堀川 桂太郎
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2017-02-01
Filing date: 2017-02-01
Publication date: 2020-01-22
Anticipated expiration: 2037-02-01
Also published as: JP2018124828A

Description

The present invention relates to a table data analysis program.

Conventionally, when a new system is developed where no existing system is in place, there has been a technique in which the specifications of the system are shared among stakeholders in a unified notation based on a conceptual data model and are progressively clarified (see, for example, Patent Document 1).

Such a technique uses the conceptual data model in a top-down flow: logical design is performed first, physical design follows, and the result is finally written up as a detailed design document and implemented accordingly.

On the other hand, when a new system is developed by modifying a current system that is already in operation, the specifications (mechanisms and structure) of the current system must be understood in a bottom-up flow.

Patent Document 1: JP 2011-154653 A

However, with the conventional method described above, it has been difficult to automatically convert the specifications of the current system (hereinafter referred to as the “current specifications”) as they are into a conceptual data model in order to understand them in a bottom-up flow.

Methods for understanding the current specifications include reading existing documents (for example, system specifications, system design documents, user manuals, and maintenance and operation manuals), interviewing system users, and analyzing the program source code of the system.

However, such methods require a comprehensive judgment based on a wide variety of collected information, which is difficult for anyone other than a veteran engineer with extensive past knowledge. They also involve a large amount of work. A technique is therefore desired that lowers the required skill level so that many developers can do the work, and that reduces the amount of work.

In addition, documents such as specifications may have been lost, or, when the current system has been in operation for a long time, the system itself may have been modified and reworked many times without the documents being brought up to date. In such cases, it is difficult to extract the specifications from the documents.

With the method of interviewing system users, the information obtained is limited to what the system users know.

Furthermore, although analyzing the program source code makes it possible to analyze the business rules expressed in the source code, it is difficult to detect local rules (easily overlooked minor rules) that are known only to the business staff who use the system.

The present invention has been made in view of the above points, and an object thereof is to support the understanding of the specifications of a computer system.

To solve the above problem, a table data analysis program causes a computer to function as: a classification unit that classifies, among the columns in first table data, a column including values of a plurality of types into columns for the respective types, and classifies each row of the first table data according to the types of the values included in the row; a processing unit that processes the first table data based on the classification result of the classification unit to generate second table data; and an analysis unit that analyzes, in the first table data, the unit of the repetition pattern of types in a column including values of a plurality of types, and adds to the second table data information that distinguishes rows including the first type in the unit from rows including a type other than the first type in the unit.

The understanding of the specifications of a computer system can be supported.

FIG. 1 is a diagram illustrating an example of the hardware configuration of the analyzer 10 according to the first embodiment.
FIG. 2 is a diagram illustrating an example of the functional configuration of the analyzer 10 according to the first embodiment.
FIG. 3 is a flowchart illustrating an example of the processing procedure executed by the analyzer 10 according to the first embodiment.
FIG. 4 is a diagram illustrating an example of table data.
FIG. 5 is a diagram illustrating an example of table data in which the mixture has been eliminated and labels have been assigned.
FIG. 6 is a diagram illustrating an example of the functional configuration of the analyzer 10 according to the second embodiment.
FIG. 7 is a flowchart illustrating an example of the processing procedure executed by the analyzer 10 according to the second embodiment.
FIG. 8 is a diagram illustrating an example of table data according to the second embodiment.
FIG. 9 is a diagram illustrating an example of table data after correction according to the second embodiment.
FIG. 10 is a diagram illustrating an example of manual correction of table data according to the second embodiment.
FIG. 11 is a diagram illustrating an example of the functional configuration of the analyzer 10 according to the third embodiment.
FIG. 12 is a flowchart illustrating an example of the processing procedure executed by the analyzer 10 according to the third embodiment.
FIG. 13 is a diagram illustrating an example of a row or column corresponding to noise.
FIG. 14 is a diagram illustrating an example of the functional configuration of the analyzer 10 according to the fourth embodiment.
FIG. 15 is a flowchart illustrating an example of the processing procedure executed by the analyzer 10 according to the fourth embodiment.
FIG. 16 is a first diagram for explaining the analysis of the unit of the multi-layout structure.
FIG. 17 is a second diagram for explaining the analysis of the unit of the multi-layout structure.
FIG. 18 is a third diagram for explaining the analysis of the unit of the multi-layout structure.
FIG. 19 is a diagram illustrating an example of table data in which the unit of the multi-layout structure has been analyzed.
FIG. 20 is a diagram illustrating an example of the functional configuration of the analyzer 10 according to the fifth embodiment.
FIG. 21 is a flowchart illustrating an example of the processing procedure executed by the analyzer 10 according to the fifth embodiment.
FIG. 22 is a diagram illustrating a first example of detecting a singular point.
FIG. 23 is a diagram illustrating a second example of detecting a singular point.
FIG. 24 is a diagram illustrating a third example of detecting a singular point.
FIG. 25 is a diagram illustrating an example of the functional configuration of the analyzer 10 according to the sixth embodiment.
FIG. 26 is a flowchart illustrating an example of the processing procedure executed by the analyzer 10 according to the sixth embodiment.
FIG. 27 is a diagram illustrating an example of a plurality of table data input in the sixth embodiment.
FIG. 28 is a diagram illustrating an example of a plurality of table data for which a relational structure has been estimated in the sixth embodiment.
FIG. 29 is a diagram illustrating an example of the functional configuration of the analyzer 10 according to the seventh embodiment.
FIG. 30 is a diagram illustrating a first example of a conceptual data model diagram.
FIG. 31 is a diagram illustrating a second example of a conceptual data model diagram.
FIG. 32 is a diagram illustrating an example of table data in which a column has been added for each singular point.
FIG. 33 is a diagram illustrating a third example of a conceptual data model diagram.

Hereinafter, embodiments of the present invention will be described with reference to the drawings. FIG. 1 is a diagram illustrating an example of the hardware configuration of the analyzer 10 according to the first embodiment. The analyzer 10 in FIG. 1 includes a drive device 100, an auxiliary storage device 102, a memory device 103, a CPU 104, an interface device 105, a display device 106, an input device 107, and the like, which are connected to one another by a bus B.

A program that realizes the processing in the analyzer 10 is provided by a recording medium 101 such as a CD-ROM. When the recording medium 101 storing the program is set in the drive device 100, the program is installed from the recording medium 101 into the auxiliary storage device 102 via the drive device 100. However, the program does not necessarily have to be installed from the recording medium 101, and may be downloaded from another computer via a network. The auxiliary storage device 102 stores the installed program as well as necessary files, data, and the like.

When an instruction to start the program is given, the memory device 103 reads the program from the auxiliary storage device 102 and stores it. The CPU 104 realizes the functions of the analyzer 10 in accordance with the program stored in the memory device 103. The interface device 105 is used as an interface for connecting to a network. The display device 106 displays a GUI (Graphical User Interface) and the like provided by the program. The input device 107 includes a keyboard, a mouse, and the like, and is used to input various operation instructions.

FIG. 2 is a diagram illustrating an example of the functional configuration of the analyzer 10 according to the first embodiment. In FIG. 2, the analyzer 10 includes an input unit 11, a classification unit 12, a processing unit 13, and the like. Each of these units is realized by processing that one or more programs installed in the analyzer 10 cause the CPU 104 to execute.

The input unit 11 inputs a file (hereinafter referred to as a “table data file”) that stores data (hereinafter referred to as “table data”) obtained by converting into a text format the database data (hereinafter referred to as “DB store data”) of the computer system whose specifications are to be analyzed. The DB store data has a tabular structure.

The classification unit 12 classifies, among the columns in the table data, a column including values of a plurality of types into columns for the respective types, and classifies each row of the table data according to the types of the values included in the row.

The processing unit 13 processes the input table data based on the classification result of the classification unit 12, thereby generating table data in which the classification result is reflected.

The processing procedure executed by the analyzer 10 will be described below. FIG. 3 is a flowchart illustrating an example of the processing procedure executed by the analyzer 10 according to the first embodiment.

In step S100, the input unit 11 reads the table data stored in the table data file specified by the user.

FIG. 4 is a diagram illustrating an example of the table data. As shown in FIG. 4, the table data has a tabular structure.

Next, the classification unit 12 determines whether data of different types are mixed in the column direction or the row direction of the table data (S110). Specifically, the classification unit 12 first determines, for each column, the type of each value included in the column. The type of a value can be determined using, for example, a “classification filter based on a regular expression”. A classification filter is prepared for each type, and the type of each value can be determined by applying each classification filter to the value. For example, a value that matches the classification filter expressing “telephone number” is determined to be of the type “telephone number”. In this way, a column that includes values of a plurality of types is determined to be a column in which data of different types are mixed.

In the example of FIG. 4, it is determined that two types of data, “address” and “alphanumeric”, are mixed in the second column. That is, the value in the second column is an address in the first row, but is alphanumeric in the second and fourth rows. The first, third, fourth, and fifth columns, on the other hand, are each determined to be a column of a single type: “name”, “mail address”, “date”, and “number”, respectively.
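The determination in step S110 can be sketched as follows. This is a minimal illustration, not the implementation described in the publication: the regular expressions, the order in which the filters are tried, the function names, and the sample rows are all assumptions made for this example.

```python
import re

# Hypothetical classification filters, one regular expression per type.
# The patterns and their order are assumptions made for this sketch.
FILTERS = {
    "mail address": re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+"),
    "date": re.compile(r"\d{4}/\d{1,2}/\d{1,2}"),
    "number": re.compile(r"\d+"),
    "alphanumeric": re.compile(r"[A-Za-z0-9]+"),
    "address": re.compile(r".+[都道府県].+"),
}

def classify_value(value):
    """Return the first type whose filter matches the whole value."""
    for type_name, pattern in FILTERS.items():
        if pattern.fullmatch(value):
            return type_name
    return "unknown"

def column_types(rows):
    """Map each column index to the set of types found in that column."""
    types = {}
    for row in rows:
        for index, value in enumerate(row):
            if value:  # empty cells carry no type information
                types.setdefault(index, set()).add(classify_value(value))
    return types

def mixed_columns(rows):
    """Columns that hold more than one type are treated as mixed."""
    return [i for i, t in sorted(column_types(rows).items()) if len(t) > 1]

# Sample data invented for this sketch (not the data of FIG. 4).
rows = [
    ["東京都千代田区1-1", "taro@example.com", "2017/02/01", "100"],
    ["A12B", "", "", "200"],
    ["大阪府大阪市2-2", "jiro@example.com", "2017/02/02", "300"],
    ["C34D", "", "", "400"],
]
print(mixed_columns(rows))
```

With these sample rows only the first column is reported as mixed, since it holds both addresses and alphanumeric codes. The filter order matters in this sketch: "number" is tried before "alphanumeric" because a string of digits would match both.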

For the row direction, after the determination in the column direction has been performed, whether rows are of the same or different types is determined by whether the combinations of column types constituting the rows are the same or different.

In a database such as an RDB (Relational Database), a table including columns or rows in which two or more types of data are mixed, as shown in FIG. 4, is unlikely to be constructed. In mainframe systems, generally called legacy systems, however, table information in the format shown in FIG. 4 may exist, for example for the purpose of reducing storage capacity.

When a plurality of types are mixed in the column direction or the row direction (Yes in S110), the processing unit 13 classifies the column in which the plurality of types are mixed by type, thereby eliminating the mixture of types (S111). That is, the processing unit 13 processes the table data by classifying (dividing) a column in which a plurality of types are mixed into columns for the respective types.

Next, the processing unit 13 assigns a label to each of the classified columns and rows (S130).

FIG. 5 is a diagram illustrating an example of table data in which the mixture has been eliminated and labels have been assigned. In FIG. 5, each column is assigned, as a label, the type determined for that column (“name”, “address”, “alphanumeric”, “mail address”, “date”, “numerical value”). In the initial state of FIG. 4, “address” and “alphanumeric” belonged to the same column, whereas in FIG. 5 they have been classified (divided) into different columns by step S111.
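The column split of step S111 can be illustrated with a short sketch. The function names, the toy classifier, and the sample rows below are assumptions made for this example and do not appear in the publication.

```python
def split_mixed_column(rows, column, classify):
    """Replace one mixed column with one column per type found in it.

    `classify` is a hypothetical callable that maps a cell value to a type name.
    Returns (labels, new_rows); `labels` names the columns that replace
    `column`, in order of first appearance.
    """
    labels = []
    for row in rows:
        type_name = classify(row[column])
        if type_name not in labels:
            labels.append(type_name)
    new_rows = []
    for row in rows:
        type_name = classify(row[column])
        # Keep the value under its own type and leave the sibling columns empty.
        cells = [row[column] if label == type_name else "" for label in labels]
        new_rows.append(row[:column] + cells + row[column + 1:])
    return labels, new_rows

def toy_classify(value):
    # Toy rule for this sketch only: ASCII alphanumerics versus everything else.
    return "alphanumeric" if value.isascii() and value.isalnum() else "address"

# Sample data invented for this sketch.
rows = [
    ["山田太郎", "東京都千代田区1-1", "100"],
    ["", "A12B", "200"],
]
labels, new_rows = split_mixed_column(rows, 1, toy_classify)
print(labels)
print(new_rows)
```

Here the second column is replaced by an "address" column and an "alphanumeric" column, and each value stays in its original row.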

In FIG. 5, each row is also assigned “★” or “○” as a label. “★” is the label assigned to rows that include “name”, “address”, “mail address”, “date”, and “number”. “○” is the label assigned to rows that include “alphanumeric” and “number”. It is sufficient that a common label is assigned to rows of the same type, and symbols or character strings other than “★” and “○” may be used as labels.
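Row labeling by the combination of column types can be sketched as below. This is an illustrative assumption about how such labels might be computed: the function name, the fallback label format, and the sample rows are not from the publication.

```python
def label_rows(rows, column_labels, symbols=("★", "○")):
    """Give each row a label determined by which labeled columns it fills."""
    symbol_of = {}  # maps a combination of column labels to its row label
    labels = []
    for row in rows:
        signature = tuple(
            label for label, value in zip(column_labels, row) if value != ""
        )
        if signature not in symbol_of:
            n = len(symbol_of)
            # Fall back to a generated label when the symbols run out.
            symbol_of[signature] = symbols[n] if n < len(symbols) else f"type{n + 1}"
        labels.append(symbol_of[signature])
    return labels

# Sample data invented for this sketch.
column_labels = ["name", "address", "alphanumeric", "mail address", "date", "number"]
rows = [
    ["山田太郎", "東京都千代田区1-1", "", "taro@example.com", "2017/02/01", "100"],
    ["", "", "A12B", "", "", "200"],
    ["", "", "C34D", "", "", "300"],
    ["佐藤花子", "大阪府大阪市2-2", "", "hanako@example.com", "2017/02/02", "400"],
]
row_labels = label_rows(rows, column_labels)
print(row_labels)
```

Rows that fill the same set of columns receive the same label, which matches the statement that any symbol or character string may serve as a label as long as rows of the same type share it.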

The processing unit 13 may display the labeled table data on the display device 106, for example in a table format as shown in FIG. 5.

As described above, according to the first embodiment, table data including a column in which different types of data are mixed can be converted into table data in which the columns are classified by type. As a result, the meaning of the structure of table data that was hard to understand can be made clearer. That is, the understanding of the specifications of a computer system such as a current system can be supported. For example, the design burden of new system design and the like can be reduced, and highly skilled persons are no longer required for data analysis and the like.

Next, a second embodiment will be described. The description of the second embodiment covers the points that differ from the first embodiment. Points not specifically mentioned in the second embodiment may be the same as in the first embodiment.

FIG. 6 is a diagram illustrating an example of the functional configuration of the analyzer 10 according to the second embodiment. In FIG. 6, the same parts as in FIG. 2 are denoted by the same reference numerals, and their description is omitted. In FIG. 6, the analyzer 10 further includes a classification support unit 14. The classification support unit 14 is realized by processing that one or more programs installed in the analyzer 10 cause the CPU 104 to execute.

The classification support unit 14 supports further classification, for example by manual work of the user, of table data that has been automatically processed (mixture eliminated and labels assigned) as shown in FIG. 5.

FIG. 7 is a flowchart illustrating an example of the processing procedure executed by the analyzer 10 according to the second embodiment. In FIG. 7, the same steps as in FIG. 3 are denoted by the same step numbers, and their description is omitted.

Following step S130, the classification support unit 14 determines whether the current table data needs to be corrected (S140). The current table data is the table data after execution of step S111 if step S111 has been executed, and the table data as input in step S100 if step S111 has not been executed. Whether correction is needed may be determined based on the presence or absence of input by the user. For example, the user may input whether the table data displayed by the processing unit 13 needs to be corrected.

Next, the classification support unit 14 determines whether a new classification filter has been input in order to correct the table data (S141). Here, assume that the table data displayed after execution up to step S130 is as shown in FIG. 8.

FIG. 8 is a diagram illustrating an example of table data according to the second embodiment. In FIG. 8, the value of “address” in the row whose “name” is “佐藤誠” (Makoto Sato) is “大阪府芸術文化管理財団” (Osaka Prefectural Arts and Cultural Management Foundation). That is, FIG. 8 shows an example in which “大阪府芸術文化管理財団” has been erroneously classified as an “address”.

In this case, the user can, for example, define a classification filter that classifies character strings ending in “財団” (foundation) as company names, and add it as one of the filters used in step S111. The newly added classification filter may correspond to an existing type or to a new type.

When a new classification filter is input (Yes in S140), step S100 and the subsequent steps are executed again using that classification filter together with the existing classification filters. As a result, the table data in FIG. 8 is corrected as shown in FIG. 9.

FIG. 9 is a diagram illustrating an example of table data after correction according to the second embodiment. In FIG. 9, a “company name” column has been added immediately to the right of the “address” column, and “大阪府芸術文化管理財団” has been moved to the “company name” column.

With this method, for example, when a mixture of types that the assumed classification filters could not fully classify is found, correct classification can be achieved by adding a further classification filter.
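The effect of adding a user-defined filter can be sketched as follows. The filter list, the patterns, and in particular the choice to try the new filter before the existing ones are assumptions made for this example; the publication does not state how precedence between filters is decided.

```python
import re

# Hypothetical filter list, tried in order; the pattern is an assumption.
filters = [("address", re.compile(r".+[都道府県].+"))]

def classify(value):
    """Return the type of the first filter that matches the whole value."""
    for type_name, pattern in filters:
        if pattern.fullmatch(value):
            return type_name
    return "unknown"

# With only the address filter, the foundation name is misclassified.
before = classify("大阪府芸術文化管理財団")

# User-defined filter: strings ending in "財団" are company names.
# It is inserted ahead of the address filter so that it takes precedence
# (an assumption of this sketch).
filters.insert(0, ("company name", re.compile(r".+財団")))

after = classify("大阪府芸術文化管理財団")
print(before, "->", after)
```

An ordinary address such as "大阪府大阪市2-2" is still classified as "address" after the new filter is added, because it does not end in "財団".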

On the other hand, when the classification filters cannot complete the classification (No in S141), the classification support unit 14 accepts from the user a direct correction instruction for eliminating the mixture by manual correction. For example, the user selects the addition of a new column and the values to be classified into that column. In this case, the classification support unit 14 adds the new column to the table data and moves the selected values to that column (S141).

FIG. 10 is a diagram illustrating an example of manual correction of table data according to the second embodiment. (1) of FIG. 10 shows an example in which, as a result of extracting names from a certain column of the table data with a classification filter capable of extracting names, “director” and “office head” were erroneously selected as “name” data, while “senior research fellow”, “chief researcher”, “section manager in charge”, “chief examiner”, and the like were selected as “other than name” data. Here, “other than name” data means data that was not selected as “name”, and is not intended to imply that a type called “other than name” exists. For convenience, the data shown in FIG. 10 is different from the data in FIG. 4.

In this case, as shown in (2), the user inputs an instruction to add a new column labeled “position” to the table data and to move the values that correspond to positions, such as “director”, “office head”, “senior research fellow”, “chief researcher”, “section manager in charge”, and “chief examiner”, to that column.

In this way, for example, when the user notices an error in the result of automatic classification by the classification filters, the user can arrive at the correct classification result by manual correction.
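The manual correction described above, adding a column and moving user-selected values into it, can be sketched as follows. The function name, its parameters, and the sample values are hypothetical and chosen only for illustration.

```python
def move_to_new_column(rows, labels, source, new_label, selected):
    """Add a column `new_label` and move the selected values of `source` into it."""
    src = labels.index(source)
    new_labels = labels + [new_label]
    new_rows = []
    for row in rows:
        row = list(row)  # work on a copy so the input rows stay unchanged
        if row[src] in selected:
            row.append(row[src])  # the value goes to the new column
            row[src] = ""         # and is cleared from the source column
        else:
            row.append("")
        new_rows.append(row)
    return new_labels, new_rows

# Sample data invented for this sketch: two titles wrongly picked up as names.
labels = ["name"]
rows = [["Director"], ["Taro Yamada"], ["Office head"]]
new_labels, new_rows = move_to_new_column(
    rows, labels, "name", "position", {"Director", "Office head"}
)
print(new_labels)
print(new_rows)
```

In an interactive tool the `selected` set would come from the user's selection on the displayed table; here it is passed in directly.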

Next, a third embodiment will be described. The description of the third embodiment covers the points that differ from the second embodiment. Points not specifically mentioned in the third embodiment may be the same as in the second embodiment.

FIG. 11 is a diagram illustrating an example of the functional configuration of the analyzer 10 according to the third embodiment. In FIG. 11, the same parts as in FIG. 6 are denoted by the same reference numerals, and their description is omitted. In FIG. 11, the analyzer 10 further includes a noise removing unit 15. The noise removing unit 15 is realized by processing that one or more programs installed in the analyzer 10 cause the CPU 104 to execute.

When the noise removing unit 15 detects, in a row or a column of the table data, data that appears to be noise (data not used in actual operation, such as test data), it deletes (removes) that row or column.

FIG. 12 is a flowchart illustrating an example of the processing procedure executed by the analyzer 10 according to the third embodiment. In FIG. 12, the same steps as in FIG. 7 are denoted by the same step numbers, and their description is omitted.

Following step S110 or step S111, the noise removing unit 15 determines whether the table data contains a row or column corresponding to noise (S120). Whether a row or column corresponds to noise may be determined, for example, by whether it contains any of the keywords stored in advance in the auxiliary storage device 102. A row or column may be determined to be noise when it contains any one of the keywords even once, or when it contains a certain keyword at or above a predetermined ratio. In the latter case, the predetermined ratio may be stored in the auxiliary storage device 102 together with the keyword as a rule for identifying noise targets.

FIG. 13 is a diagram illustrating an example of a row or column corresponding to noise. For example, if there is a rule that treats as noise any column in which at least 80% of all values contain the keyword “test”, column c1 in FIG. 13 corresponds to noise, because column c1 contains “test” in 16 of its 20 rows.

If there is a rule that treats as noise any row containing “旅費太郎” (“Travel Expense Taro”) even once, the rows enclosed by the rectangle r1 correspond to noise.

Even if there were a rule that treats as noise any column in which at least 80% of all values contain the keyword “旅費太郎”, column c2 would not correspond to noise, because the proportion of “旅費太郎” in column c2 is 50%.
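The two kinds of rule discussed above (a ratio threshold for columns and a single occurrence for rows) can be sketched as follows. The function names and the sample table are assumptions made for this example; the sample merely reproduces the proportions mentioned in the text (16 of 20 and 10 of 20) and is not the data of FIG. 13.

```python
def noisy_columns(rows, keyword, min_ratio):
    """Indices of columns where at least `min_ratio` of the cells contain `keyword`."""
    total = len(rows)
    hits = []
    for col in range(len(rows[0])):
        count = sum(1 for row in rows if keyword in row[col])
        if count / total >= min_ratio:
            hits.append(col)
    return hits

def noisy_rows(rows, keyword):
    """Indices of rows in which any cell contains `keyword`."""
    return [i for i, row in enumerate(rows) if any(keyword in cell for cell in row)]

def drop(rows, row_ids=(), col_ids=()):
    """Return a copy of the table without the given rows and columns."""
    return [
        [cell for c, cell in enumerate(row) if c not in col_ids]
        for r, row in enumerate(rows)
        if r not in row_ids
    ]

# Invented sample: column 0 contains "test" in 16 of 20 rows,
# column 1 contains "旅費太郎" in 10 of 20 rows.
rows = [
    [f"test{i}" if i < 16 else f"data{i}", "旅費太郎" if i % 2 == 0 else "山田花子"]
    for i in range(20)
]

print(noisy_columns(rows, "test", 0.8))      # column 0 reaches the 80% threshold
print(noisy_columns(rows, "旅費太郎", 0.8))  # 50% is below the threshold
cleaned = drop(rows, col_ids=noisy_columns(rows, "test", 0.8))
```

Whether a rule applies to rows or to columns, and which threshold it uses, would be read from the stored rules; in this sketch they are simply passed as arguments.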

When there is a row or column corresponding to noise (Yes in S120), the noise removing unit 15 deletes that row or column (S121).

As described above, according to the third embodiment, the processing from step S130 onward can be executed with the rows or columns corresponding to noise deleted. The accuracy of the processing from step S130 onward can therefore be improved, and the processing can be made more efficient.

Next, a fourth embodiment will be described. The description of the fourth embodiment covers the points that differ from the third embodiment. Points not specifically mentioned in the fourth embodiment may be the same as in the third embodiment.

図１４は、第４の実施の形態における分析装置１０の機能構成例を示す図である。図１４中、図１１と同一部分には同一符号を付し、その説明は省略する。図１４において、分析装置１０は、更に、マルチレイアウト解析部１６を有する。マルチレイアウト解析部１６は、分析装置１０にインストールされた１以上のプログラムが、ＣＰＵ１０４に実行させる処理により実現される。 FIG. 14 is a diagram illustrating an example of the functional configuration of the analysis device 10 according to the fourth embodiment. In FIG. 14, the same parts as those in FIG. 11 are denoted by the same reference numerals, and their description is omitted. In FIG. 14, the analysis device 10 further has a multi-layout analysis unit 16. The multi-layout analysis unit 16 is realized by processing that one or more programs installed in the analysis device 10 cause the CPU 104 to execute.

マルチレイアウト解析部１６は、テーブルデータ内におけるマルチレイアウト構造の存在を検出すると共に、当該マルチレイアウト構造の単位を解析する。マルチレイアウト構造とは、図４に示したように、１つのテーブルデータ内において、複数種別が混在した列を含む構造をいう。「マルチレイアウト構造の単位」とは、複数の種別の値を含む列における種別の繰り返しのパタンの単位をいう。 The multi-layout analysis unit 16 detects the presence of a multi-layout structure in the table data and analyzes a unit of the multi-layout structure. As shown in FIG. 4, the multi-layout structure refers to a structure that includes columns in which a plurality of types are mixed in one table data. The “unit of the multi-layout structure” refers to a unit of a pattern of repeating a type in a column including values of a plurality of types.

マルチレイアウト構造の検出は、通常、専門知識の豊富な高スキル者が手動で分析し、検出することで行われるが、それではごく限られた特定の人にしか検出できず、広く手法を広めることができない。また、高スキル者の手作業に委ねられるため、高コストとなり普及が阻害される。そこで、本実施の形態では、マルチレイアウト解析部１６がマルチレイアウト構造を自動で検出する。 A multi-layout structure is usually detected through manual analysis by highly skilled people with extensive expertise. However, this means that only a very limited number of people can perform the detection, and the technique cannot be spread widely. In addition, relying on the manual work of highly skilled people is costly, which hinders adoption. In the present embodiment, therefore, the multi-layout analysis unit 16 detects the multi-layout structure automatically.

図１５は、第４の実施の形態において分析装置１０が実行する処理手順の一例を説明するためのフローチャートである。図１５中、図１２と同一ステップには同一ステップ番号を付し、その説明は省略する。 FIG. 15 is a flowchart illustrating an example of a processing procedure executed by the analysis device 10 according to the fourth embodiment. In FIG. 15, the same steps as those in FIG. 12 are denoted by the same step numbers, and their description is omitted.

ステップＳ１４０又はステップＳ１４２に続いて、マルチレイアウト解析部１６は、テーブルデータ内にマルチレイアウト構造が存在するか否かを判定する（Ｓ１５０）。マルチレイアウト構造が存在する場合（Ｓ１５０でＹｅｓ）、マルチレイアウト解析部１６は、マルチレイアウト構造の単位を解析する（Ｓ１５１）。 Subsequent to step S140 or step S142, the multi-layout analysis unit 16 determines whether a multi-layout structure exists in the table data (S150). If a multi-layout structure exists (Yes in S150), the multi-layout analysis unit 16 analyzes a unit of the multi-layout structure (S151).

図１６は、マルチレイアウト構造の単位の解析を説明するための第１の図である。図１６の（１）には、列＃１において、住所と氏名とが交互に出現し、他の種別は単一である例が示されている。すなわち、（１）のテーブルデータは、住所と氏名との繰り返しのパタンが２行ごとであり、２行単位の周期性を有する。なお、厳密には、ステップＳ１５１の時点において、列＃１は、ステップＳ１１１の作用により、２つの列に分類されている。したがって、ステップＳ１５１では、当初同じ列であった列の集合ごとに解析が行われる。なお、列＃１〜列＃４は、各列のラベルを抽象的に示す記号である。 FIG. 16 is a first diagram for explaining the analysis of the unit of the multi-layout structure. (1) of FIG. 16 shows an example in which addresses and names appear alternately in column #1, while each of the other columns contains a single type. That is, in the table data of (1), the pattern of address and name repeats every two rows, giving a periodicity in units of two rows. Strictly speaking, at the time of step S151, column #1 has already been split into two columns by step S111. Therefore, in step S151, the analysis is performed for each set of columns that originally formed the same column. Note that column #1 to column #4 are symbols that abstractly denote the label of each column.

この場合の解析結果は（２）に示される通りである。すなわち、マルチレイアウト解析部１６は、２行をマルチレイアウト構造の単位として判定する。また、マルチレイアウト解析部１６は、２行のうちの先頭の行（「住所」を含む行）を、マルチレイアウト構造の「ヘッド構造（ヘッド行）」（主要情報構造）であると判定し、それ以外の行（「氏名」を含む行）を、マルチレイアウト構造の「ボディ構造（ボディ行）」（補助情報構造）であると判定する。 The analysis result in this case is as shown in (2). That is, the multi-layout analysis unit 16 determines two rows as a unit of the multi-layout structure. Further, the multi-layout analysis unit 16 determines that the first row (the row including the “address”) of the two rows is the “head structure (head row)” (main information structure) of the multi-layout structure, The other rows (rows including “name”) are determined to be the “body structure (body row)” (auxiliary information structure) of the multi-layout structure.

すなわち、マルチレイアウト構造に規則的な周期性が有る場合には、当該周期がマルチレイアウト構造の単位として判定される。 That is, when the multi-layout structure has regular periodicity, the period is determined as a unit of the multi-layout structure.

また、図１７は、マルチレイアウト構造の単位の解析を説明するための第２の図である。図１７の（１）には、列＃１のみならず、列＃３（厳密には、列＃３から分類された列の集合）にも周期性が有る例が示されている。但し、列＃１の周期性の単位は２であるのに対し、列＃３の周期性の単位は４である。 FIG. 17 is a second diagram for explaining the analysis of the unit of the multi-layout structure. (1) of FIG. 17 shows an example in which not only column #1 but also column #3 (strictly, the set of columns split from column #3) has periodicity. However, the unit of periodicity of column #1 is 2, whereas the unit of periodicity of column #3 is 4.

この場合の解析結果は（２）に示される通りである。すなわち、マルチレイアウト解析部１６は、各列の周期性（本例では２と４）の最小公倍数である４を全体の行の周期性とみなし、これをもってマルチレイアウト構造の単位と判定する。また、マルチレイアウト解析部１６は、マルチレイアウト構造の単位ごとに、先頭の行をヘッド行と判定し、それ以外の行（図１７では２〜３行目）をボディ行と判定する。 The analysis result in this case is as shown in (2). That is, the multi-layout analysis unit 16 regards 4 which is the least common multiple of the periodicity (2 and 4 in this example) of each column as the periodicity of the entire row, and determines this as the unit of the multi-layout structure. In addition, the multi-layout analysis unit 16 determines, for each unit of the multi-layout structure, the first row as a head row and the other rows (the second to third rows in FIG. 17) as body rows.
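
For the strictly periodic case, the unit can be sketched as the least common multiple of the per-column periods. The helper names below (`smallest_period`, `unit_size`, `label_rows`) and the representation of each column as a list of type labels per row are assumptions for illustration; how the type labels are obtained is outside this sketch. A column with no repetition at all would return its own length as its period, so this sketch only makes sense when every column is genuinely periodic.

```python
from functools import reduce
from math import gcd


def smallest_period(labels):
    """Smallest p such that labels[i] == labels[i + p] for every valid i."""
    n = len(labels)
    for p in range(1, n + 1):
        if all(labels[i] == labels[i + p] for i in range(n - p)):
            return p
    return n


def unit_size(columns):
    """Least common multiple of the periods of all columns."""
    periods = [smallest_period(col) for col in columns]
    return reduce(lambda a, b: a * b // gcd(a, b), periods, 1)


def label_rows(n_rows, unit):
    """The first row of each unit is the head row; the rest are body rows."""
    return ["HEAD" if i % unit == 0 else "BODY" for i in range(n_rows)]


# Column #1 alternates every 2 rows, column #3 repeats every 4 rows.
col1 = ["address", "name"] * 4
col2 = ["x"] * 8
col3 = ["a", "a", "b", "b"] * 2
unit = unit_size([col1, col2, col3])
```

Here the per-column periods are 2, 1, and 4, so `unit` is their least common multiple.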

また、図１８は、マルチレイアウト構造の単位の解析を説明するための第３の図である。図１８の（１）には、列＃１のみにおいて種別（種別Ａ及び種別Ｂ）が混在しており、その他の列では種別が混在していない例が示されている。但し、列＃１は、一定周期ではないが、種別Ａと種別Ｂが繰り返し現れる構造になっている。 FIG. 18 is a third diagram for explaining the analysis of the unit of the multi-layout structure. (1) of FIG. 18 shows an example in which types (type A and type B) are mixed only in column #1 and are not mixed in the other columns. Although the period is not constant, column #1 has a structure in which type A and type B appear repeatedly.

この場合の解析結果は（２）に示される通りである。すなわち、マルチレイアウト解析部１６は、１〜３行、４〜８行をそれぞれマルチレイアウト構造の単位として判定する。また、マルチレイアウト解析部１６は、マルチレイアウト構造の単位ごとに、先頭の行をヘッド行と判定し、それ以外の行をボディ行と判定する。 The analysis result in this case is as shown in (2). That is, the multi-layout analysis unit 16 determines each of the first to third rows and the fourth to eighth rows as a unit of the multi-layout structure. In addition, the multi-layout analysis unit 16 determines, for each unit of the multi-layout structure, the first row as a head row and the other rows as body rows.

なお、マルチレイアウト構造の検出及びマルチレイアウト構造の単位の解析は、例えば、各行における各列の種別の集合をベクトルとし、ベクトルのパタンが存在することを検出し（例えば、ベクトルＡとベクトルＢ）、ベクトルＡの複数回出現とベクトルＢの複数回出現が繰り返されるパタンを見出し、これらの繰り返しの単位をマルチレイアウト構造の単位として判定することで行われてもよい。 Note that the detection of the multi-layout structure and the analysis of its unit may be performed, for example, by treating the set of types of the columns in each row as a vector, detecting that vector patterns exist (for example, vector A and vector B), finding a pattern in which multiple occurrences of vector A and multiple occurrences of vector B are repeated, and determining the unit of this repetition as the unit of the multi-layout structure.
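
The vector-based approach can be sketched as follows, including the case where the period is not constant. This is one possible reading, under stated assumptions: each row is summarised by a hashable "signature" (the vector of column types), the first row is assumed to be a head row, and a new unit is assumed to start whenever the signature returns to the head signature after at least one different row. The names `has_multi_layout` and `split_units` are hypothetical.

```python
def has_multi_layout(signatures):
    """A multi-layout structure is suspected when rows do not all share one signature."""
    return len(set(signatures)) > 1


def split_units(signatures):
    """Split rows into units, returned as (start, end) index pairs with end exclusive.

    A unit starts at row 0 and at every row whose signature equals the head
    signature while the previous row's signature does not.
    """
    if not signatures:
        return []
    head = signatures[0]
    units, start = [], 0
    for i in range(1, len(signatures)):
        if signatures[i] == head and signatures[i - 1] != head:
            units.append((start, i))
            start = i
    units.append((start, len(signatures)))
    return units


# Signatures for eight rows: type A, then runs of type B of varying length.
signatures = ["A", "B", "B", "A", "B", "B", "B", "B"]
units = split_units(signatures)
```

For these eight rows the sketch yields two units covering rows 1 to 3 and rows 4 to 8 (as zero-based half-open ranges), with the first row of each unit acting as the head row.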

なお、マルチレイアウト解析部１６は、マルチレイアウト構造の単位の解析結果を、例えば、図１９に示されるようにテーブルデータに反映してもよい。 Note that the multi-layout analysis unit 16 may reflect the analysis result of the unit of the multi-layout structure in the table data, for example, as shown in FIG. 19.

図１９は、マルチレイアウト構造の単位が解析されたテーブルデータの例を示す図である。図１９に示されるテーブルデータは、「マルチレイアウトフラグ」の列を含む。また、各列のラベルが、「ＨＥＡＤの場合」及び「ＢＯＤＹの場合」に分類されている。なお、図１９のテーブルデータは、便宜上、図４のテーブルデータとは異なるテーブルデータである。 FIG. 19 is a diagram illustrating an example of table data in which units of the multi-layout structure are analyzed. The table data shown in FIG. 19 includes a column of “multi-layout flag”. Further, the labels of the respective columns are classified into “in the case of HEAD” and “in the case of BODY”. The table data in FIG. 19 is different from the table data in FIG. 4 for convenience.

「マルチレイアウトフラグ」の列は、各行が、ヘッド行であるのかボディ行であるのかを示す列である。すなわち、ヘッド行における当該列の値は「ＨＥＡＤ」であり、ボディ行における当該列の値は「ＢＯＤＹ」である。 The column of “multi-layout flag” is a column indicating whether each row is a head row or a body row. That is, the value of the column in the head row is “HEAD”, and the value of the column in the body row is “BODY”.

また、「ＨＥＡＤの場合」の行は、ヘッド行における各列のラベルを示し、「ＢＯＤＹの場合」の行は、ボディ行における各列のラベルを示す。 Also, the row of “in the case of HEAD” indicates the label of each column in the head row, and the row of “in the case of BODY” indicates the label of each column in the body row.
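
Adding the “multi-layout flag” column can be sketched as below. The function name `add_layout_flag`, the placement of the flag as the first cell of each row, and the `(start, end)` unit format (end exclusive) are assumptions for illustration only.

```python
def add_layout_flag(rows, units):
    """Prepend "HEAD" or "BODY" to every row according to its position in a unit."""
    flagged = []
    for start, end in units:
        for i in range(start, end):
            flag = "HEAD" if i == start else "BODY"
            flagged.append([flag] + list(rows[i]))
    return flagged


rows = [["a"], ["b"], ["c"], ["d"], ["e"]]
flagged = add_layout_flag(rows, [(0, 2), (2, 5)])
```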

上述したように、第４の実施の形態によれば、古い現行システムにありがちな、テーブルデータ内におけるマルチレイアウト構造を自動的に検出することができ、当該マルチレイアウト構造の単位を判定（推定）することができる。その結果、テーブル構造の明確性を向上させることができる。 As described above, according to the fourth embodiment, a multi-layout structure in table data, which is common in old current systems, can be detected automatically, and the unit of the multi-layout structure can be determined (estimated). As a result, the clarity of the table structure can be improved.

なお、第４の実施の形態は、第１の実施の形態のみ又は第２の実施の形態のみと組み合わされてもよい。 Note that the fourth embodiment may be combined with only the first embodiment or only with the second embodiment.

次に、第５の実施の形態について説明する。第５の実施の形態では第４の実施の形態と異なる点について説明する。第５の実施の形態において特に言及されない点については、第４の実施の形態と同様でもよい。 Next, a fifth embodiment will be described. In the fifth embodiment, points different from the fourth embodiment will be described. What is not particularly mentioned in the fifth embodiment may be the same as the fourth embodiment.

現行システムを新システムへ移行する場合、現行システム（及びそれを用いた業務）に存在する重要なビジネスルールの検出が漏れてしまい、新システムの開発の下流工程（主にテスト工程）で問題が発見され、開発の手戻りとなることが問題となっている。この問題を解消するために、現行システムの保有するビジネスルール（特に重要なルール）を漏れなく検出する必要があるが、この作業は、現状、経験豊富な高スキル者が現行システムの各種ドキュメントを読み理解したり、現行システムの業務担当者からヒアリングしたり、更に最終手段としては現行システムのソースコードを解析するなどして検出しており、非常に手間と稼動がかかり、そのわりには抜け漏れも発生している。 When migrating a current system to a new system, important business rules that exist in the current system (and in the operations that use it) are sometimes missed, so that problems are found only in the downstream processes of new-system development (mainly the test process), causing rework. To solve this problem, the business rules held by the current system (especially the important ones) need to be detected without omission. At present, however, this work is done by experienced, highly skilled people who read and understand the various documents of the current system, interview the staff in charge of operations, and, as a last resort, analyze the source code of the current system. This takes a great deal of time and effort, and omissions still occur.

そこで、第５の実施の形態では、誰でも簡単に現行システムの持つ重要なビジネスルールを検出するため、現行システムの保有するＤＢストアデータに着目し、ＤＢストアデータのみを入力情報として、重要なビジネスルールを発見する例について説明する。 Therefore, the fifth embodiment describes an example in which, so that anyone can easily detect the important business rules of the current system, attention is paid to the DB store data held by the current system and important business rules are discovered using only the DB store data as input information.

図２０は、第５の実施の形態における分析装置１０の機能構成例を示す図である。図２０中、図１４と同一部分には同一符号を付し、その説明は省略する。図２０において、分析装置１０は、更に、特異点検出部１７を有する。特異点検出部１７は、分析装置１０にインストールされた１以上のプログラムが、ＣＰＵ１０４に実行させる処理により実現される。 FIG. 20 is a diagram illustrating an example of the functional configuration of the analysis device 10 according to the fifth embodiment. In FIG. 20, the same parts as those in FIG. 14 are denoted by the same reference numerals, and their description is omitted. In FIG. 20, the analysis device 10 further has a singular point detection unit 17. The singular point detection unit 17 is realized by processing that one or more programs installed in the analysis device 10 cause the CPU 104 to execute.

特異点検出部１７は、ステップＳ１５１までにおいて明らかにされた（推定された）、テーブルデータ内の関係構造に基づいて、数字列ごとに、異端な値（特異点）を検出する。数字列とは、数字のみを値として含む列（すなわち、数値を値とする列）をいう。特異点検出部１７は、特異点を検出することにより、同一のジャンルに属する業務の中で、メジャーな作業に潜むマイナーな作業の兆候を発見し、そこからマイナーな業務ルールを推定することに寄与する。同一のジャンルに属する業務は、１テーブル内の１列に相当すると考え、その中で特異点となる値を検出することによってマイナーなルールの兆候を検出する。 The singular point detection unit 17 detects, for each numeric column, outlying values (singular points) based on the relational structure in the table data that has been clarified (estimated) up to step S151. A numeric column is a column that contains only digits as values (that is, a column whose values are numerical). By detecting singular points, the singular point detection unit 17 helps to find signs of minor tasks hidden among major tasks in operations belonging to the same genre, and to estimate minor business rules from them. Operations belonging to the same genre are considered to correspond to one column in one table, and signs of minor rules are detected by detecting values that are singular points within that column.

図２１は、第５の実施の形態において分析装置１０が実行する処理手順の一例を説明するためのフローチャートである。図２１中、図１５と同一ステップには同一ステップ番号を付し、その説明は省略する。 FIG. 21 is a flowchart illustrating an example of a processing procedure performed by the analyzer 10 according to the fifth embodiment. In FIG. 21, the same steps as those in FIG. 15 are denoted by the same step numbers, and a description thereof will be omitted.

ステップＳ１５０又はステップＳ１５１に続いて、特異点検出部１７は、テーブルデータ内に１以上の数字列が有るか否かを判定する（Ｓ１６０）。数字列の判定は、例えば、各列に含まれている値を確認することによって行われる。図１９に示したテーブルデータであれば、ヘッド行及びボディ行の「年月日」の列、ボディ行の「商品番号」の列、ボディ行の「番号」の列、ヘッド行の「電話番号」の列が数字列に該当する。 Subsequent to step S150 or step S151, the singular point detection unit 17 determines whether the table data contains one or more numeric columns (S160). Whether a column is a numeric column is determined, for example, by checking the values contained in each column. In the table data shown in FIG. 19, the “date” column of the head rows and body rows, the “product number” column of the body rows, the “number” column of the body rows, and the “telephone number” column of the head rows are numeric columns.
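
The check in S160 can be sketched as a digits-only test on each column. The sketch assumes that cells are strings, that empty cells (for example, head-only columns in body rows) are ignored, and that only ASCII digits count; `is_numeric_column` and `numeric_columns` are hypothetical names. Note that under this test a telephone number with a leading zero still qualifies, which matches the example in the text.

```python
import re

_DIGITS_ONLY = re.compile(r"[0-9]+")


def is_numeric_column(values):
    """True when the column has at least one value and every value is digits only."""
    non_empty = [v for v in values if v != ""]
    return bool(non_empty) and all(_DIGITS_ONLY.fullmatch(v) for v in non_empty)


def numeric_columns(table):
    """table: dict mapping a column label to its list of string values."""
    return [name for name, values in table.items() if is_numeric_column(values)]


table = {
    "date": ["20160105", "20160110", ""],
    "name": ["Taro", "", "Hanako"],
    "phone": ["0312345678", "", "0456789012"],
}
found = numeric_columns(table)
```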

１以上の数字列が有る場合（Ｓ１６０でＹｅｓ）、特異点検出部１７は、数字列ごとに、特異点の検出を試みる（Ｓ１６１）。例えば、特異点検出部１７は、数字列ごとに、マルチレイアウト構造の各単位について、数字が示す数値の最小値、最大値及び平均値を求め、最小値、最大値、又は平均値において、他の大多数の単位における値とは異なる傾向を示す値（特異点）があれば、当該値を検出する。 If there is one or more numeric columns (Yes in S160), the singular point detection unit 17 attempts to detect singular points for each numeric column (S161). For example, for each numeric column, the singular point detection unit 17 obtains the minimum, maximum, and average of the numerical values for each unit of the multi-layout structure and, if any minimum, maximum, or average shows a tendency different from the values in the great majority of other units, detects that value as a singular point.

図２２は、特異点の第１の検出例を示す図である。図２２では、図１９に示したテーブルデータにおいて、「年月日」の列と、ボディ行の「番号」の列について、マルチレイアウト構造の単位ごとに、最小値（最古年月日）、最大値（最新年月日）、及び平均値（平均年月日）が算出された例が示されている。 FIG. 22 is a diagram illustrating a first example of singular point detection. FIG. 22 shows an example in which, for the “date” column and the “number” column of the body rows in the table data shown in FIG. 19, the minimum value (oldest date), the maximum value (latest date), and the average value (average date) are calculated for each unit of the multi-layout structure.

図２２の例では、「年月日」の列の最初のマルチレイアウト構造の単位の最古年月日（「１００００１０１」）が、同じ列の他のマルチレイアウト構造の単位の最古年月日から乖離していることが分かる。この場合、特異点検出部１７は、「年月日」の列の最初のマルチレイアウト構造の単位の最古年月日（「１００００１０１」）を特異点として検出し、当該最古年月日が格納されているセルの位置情報を出力する。 In the example of FIG. 22, the oldest date (“10000101”) of the first unit of the multi-layout structure in the “date” column can be seen to deviate from the oldest dates of the other units in the same column. In this case, the singular point detection unit 17 detects the oldest date (“10000101”) of the first unit in the “date” column as a singular point and outputs the position information of the cell in which that date is stored.
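
The per-unit statistics and the comparison across units can be sketched as follows. The source does not say how "a tendency different from the majority" is measured; this sketch substitutes a common robust test (distance from the median, scaled by the median absolute deviation) with an arbitrary threshold of 5, and both choices are assumptions. Dates are treated as the plain numbers their digits spell out (YYYYMMDD), each unit is assumed to contain at least one value, and all function names are hypothetical.

```python
from statistics import mean, median


def unit_stats(column, units):
    """Minimum, maximum, and mean of the numeric values in each unit."""
    stats = []
    for start, end in units:
        nums = [int(v) for v in column[start:end] if v != ""]
        stats.append({"min": min(nums), "max": max(nums), "mean": mean(nums)})
    return stats


def outlier_indices(values, threshold=5.0):
    """Indices of values far from the median, measured in units of the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [i for i, v in enumerate(values) if v != med]
    return [i for i, v in enumerate(values) if abs(v - med) / mad > threshold]


def detect_singular_units(column, units, threshold=5.0):
    """For each statistic, list the units whose value departs from the majority."""
    stats = unit_stats(column, units)
    return {
        key: outlier_indices([s[key] for s in stats], threshold)
        for key in ("min", "max", "mean")
    }


dates = ["10000101", "20160105", "20160110", "20160112",
         "20160201", "20160203", "20160215", "20160220"]
units = [(0, 2), (2, 4), (4, 6), (6, 8)]
result = detect_singular_units(dates, units)
```

In this made-up data the first unit stands out on its minimum (and therefore also on its mean) but not on its maximum, which is the kind of signal the text describes for the value “10000101”.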

ユーザは、この特異点となる値がなぜ大多数の値と異なるのかを調査し、原因となるビジネスルールを、その特異点が持つ意味を知っていると思われる現場の業務担当者等に対するヒアリング等によって発見する。その結果、例えば、図２２の例によれば、ユーザは、１０００年１月１日という値は他の年月日（発送完了日）とは違って返納処理を行なったというローカルルールであるといった、見落としやすいビジネスルールを発見することができる。 The user investigates why the singular value differs from the majority of values, and discovers the business rule behind it by, for example, interviewing on-site operations staff who are likely to know what the singular point means. As a result, in the example of FIG. 22, the user can discover an easily overlooked business rule, such as a local rule that the value January 1, 1000, unlike the other dates (shipping completion dates), indicates that return processing was performed.

又は、特異点の検出は次のように行われてもよい。図２３は、特異点の第２の検出例を示す図である。図２３では、「年月日」の列の値が、当該列内で最も古い年月日（図２３では、１行目の年月日）を基準日として、当該基準日からの積算日の数値列に変換されている。この場合、特異点検出部１７は、当該数値列内において、他とは大きくかけ離れた値を検出する。図２３の例では、０のみが他とは大きくかけ離れていることが検出される。その結果、上記したようなローカルルールの発見を支援することができる。 Alternatively, singular points may be detected as follows. FIG. 23 is a diagram illustrating a second example of singular point detection. In FIG. 23, the values in the “date” column are converted into a numerical column of cumulative days from a base date, the base date being the oldest date in the column (in FIG. 23, the date in the first row). In this case, the singular point detection unit 17 detects values in the numerical column that are far removed from the others. In the example of FIG. 23, only 0 is detected as being far removed from the others. As a result, discovery of a local rule such as the one described above can be supported.
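
The conversion to cumulative days can be sketched with the standard `datetime` module. The sketch assumes every value is a valid eight-digit YYYYMMDD calendar date (a sentinel such as “99999999” would raise `ValueError` and would need separate handling), and it again substitutes a median-based test with an arbitrary threshold for "far removed from the others"; the function names are hypothetical.

```python
from datetime import date
from statistics import median


def to_elapsed_days(yyyymmdd_values):
    """Convert YYYYMMDD strings to days elapsed since the oldest date in the column."""
    dates = [date(int(v[:4]), int(v[4:6]), int(v[6:8])) for v in yyyymmdd_values]
    base = min(dates)
    return [(d - base).days for d in dates]


def far_from_others(values, threshold=5.0):
    """Indices of values far from the median, measured in units of the MAD."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return [i for i, v in enumerate(values) if v != med]
    return [i for i, v in enumerate(values) if abs(v - med) / mad > threshold]


column = ["10000101", "20160105", "20160110", "20160112", "20160115"]
days = to_elapsed_days(column)
singular = far_from_others(days)
```

Because the base date is the oldest date, the oldest entry maps to 0 while the remaining entries cluster at several hundred thousand days, so only the first entry is reported.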

また、特異点の検出は次のように行われてもよい。図２４は、特異点の第３の検出例を示す図である。図２４では、列＃１及び列＃３が数字列であるとする。 Singular points may also be detected as follows. FIG. 24 is a diagram illustrating a third example of singular point detection. In FIG. 24, it is assumed that column #1 and column #3 are numeric columns.

特異点検出部１７は、各数字列について、分散値を算出する（１）。図２４では、列＃１についての分散値が０．１であり、列＃３についての分散値が０．００１であったとする。この場合、列＃１の分散値が最大であるため、特異点検出部１７は、列＃１の中に特異点が有るだろうと推定し（２）、列＃１の値をクラスタリング手法によって分類する（３）。クラスタリング手法としては、例えば、Ｋ−ｍｅａｎｓ法等の公知の手法が用いられればよい。特異点検出部１７は、クラスタリングの結果、相対的に要素数が少ないクラスタに属する値を、特異点として検出する（４）。 The singular point detection unit 17 calculates a variance for each numeric column (1). In FIG. 24, assume that the variance for column #1 is 0.1 and the variance for column #3 is 0.001. Since the variance of column #1 is the largest, the singular point detection unit 17 estimates that column #1 is likely to contain a singular point (2), and classifies the values of column #1 by a clustering method (3). Any known method, such as the K-means method, may be used as the clustering method. The singular point detection unit 17 then detects, as singular points, values belonging to clusters with relatively few elements in the clustering result (4).
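
Steps (1) to (4) can be sketched in plain Python as below. Several details are assumptions, since the source leaves them open: the population variance is used and columns are compared without rescaling, the number of clusters is fixed at two, the K-means centers start at evenly spaced points between the minimum and maximum so that the result is deterministic, and "relatively few elements" is read as "the smallest cluster". The names `kmeans_1d` and `detect_singular_values` are hypothetical.

```python
from collections import Counter
from statistics import pvariance


def kmeans_1d(values, k=2, iterations=100):
    """Plain one-dimensional K-means; returns the cluster index of each value."""
    lo, hi = min(values), max(values)
    centers = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    assignment = None
    for _ in range(iterations):
        new_assignment = [
            min(range(k), key=lambda j: abs(v - centers[j])) for v in values
        ]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        for j in range(k):
            members = [v for v, a in zip(values, assignment) if a == j]
            if members:
                centers[j] = sum(members) / len(members)
    return assignment


def detect_singular_values(columns):
    """columns: dict mapping a column label to a list of numbers.

    Returns (label of the column with the largest variance,
             indices of the values in its smallest cluster).
    """
    target = max(columns, key=lambda name: pvariance(columns[name]))
    assignment = kmeans_1d(columns[target])
    sizes = Counter(assignment)
    if len(sizes) < 2:
        return target, []  # everything fell into one cluster
    smallest = min(sizes, key=sizes.get)
    return target, [i for i, a in enumerate(assignment) if a == smallest]


columns = {
    "col1": [1.0, 1.1, 0.9, 1.0, 5.0, 1.0],
    "col3": [0.50, 0.52, 0.48, 0.50, 0.51, 0.49],
}
target, singular = detect_singular_values(columns)
```

In this made-up data `col1` has by far the larger variance, and the value 5.0 ends up alone in its cluster, so its index is reported as the singular point.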

ユーザは、当該特異点に基づいて、上述したようなローカルルールを発見することができる。 The user can find a local rule as described above based on the singularity.

上述したように、第５の実施の形態によれば、ＤＢストアデータ（テーブルデータ）から特異点を検出し、当該特異点をユーザに通知することができる。ユーザは、当該特異点の原因を調査することでビジネスルールを発見することができる。すなわち、従来法のソースコード解析などの手法では、そこに実装されているルールしか抽出できないため、業務の現場担当者のみが知っているような見落としがちなマイナー業務のビジネスルールを検出することは困難であった。本実施の形態によれば、現行の作業結果を保持しているＤＢストアデータから業務ルール等を抽出するため、このようなマイナーなビジネスルールも検出できる。したがって、高度のスキルを要することなく、重要なビジネスルールの発見を可能とすることができる。 As described above, according to the fifth embodiment, singular points can be detected from DB store data (table data) and reported to the user. The user can discover business rules by investigating the cause of each singular point. Conventional techniques such as source code analysis can extract only the rules implemented in the code, which makes it difficult to detect easily overlooked business rules for minor operations that are known only to on-site staff. According to the present embodiment, business rules and the like are extracted from the DB store data, which holds the results of actual work, so such minor business rules can also be detected. Important business rules can therefore be discovered without requiring advanced skills.

なお、第５の実施の形態は、第４の実施の形態以外の各実施の形態とのみ組み合わされてもよい。 Note that the fifth embodiment may be combined only with each embodiment other than the fourth embodiment.

次に、第６の実施の形態について説明する。第６の実施の形態では第５の実施の形態と異なる点について説明する。第６の実施の形態において特に言及されない点については、第５の実施の形態と同様でもよい。 Next, a sixth embodiment will be described. In the sixth embodiment, points different from the fifth embodiment will be described. What is not particularly mentioned in the sixth embodiment may be the same as the fifth embodiment.

図２５は、第６の実施の形態における分析装置１０の機能構成例を示す図である。図２５中、図２０と同一部分には同一符号を付し、その説明は省略する。図２５において、分析装置１０は、更に、関係性推定部１８を有する。関係性推定部１８は、分析装置１０にインストールされた１以上のプログラムが、ＣＰＵ１０４に実行させる処理により実現される。 FIG. 25 is a diagram illustrating an example of the functional configuration of the analysis device 10 according to the sixth embodiment. In FIG. 25, the same parts as those in FIG. 20 are denoted by the same reference numerals, and their description is omitted. In FIG. 25, the analysis device 10 further has a relationship estimating unit 18. The relationship estimating unit 18 is realized by processing that one or more programs installed in the analysis device 10 cause the CPU 104 to execute.

関係性推定部１８は、複数のテーブルデータ（複数のテーブル）が入力された場合に、テーブル間の関係性（参照関係）を推定する。 When a plurality of table data (a plurality of tables) are input, the relationship estimating unit 18 estimates the relationship (reference relationship) between the tables.

図２６は、第６の実施の形態において分析装置１０が実行する処理手順の一例を説明するためのフローチャートである。図２６中、図２１と同一ステップには同一ステップ番号を付し、その説明は省略する。 FIG. 26 is a flowchart illustrating an example of a processing procedure executed by the analysis device 10 according to the sixth embodiment. In FIG. 26, the same steps as those in FIG. 21 are denoted by the same step numbers, and their description is omitted.

第６の実施の形態では、ステップＳ１００〜Ｓ１６０又はＳ１６１までが、複数のテーブルデータについて実行される。 In the sixth embodiment, steps S100 to S160 or S161 are executed for a plurality of table data.

図２７は、第６の実施の形態において入力される複数のテーブルデータの例を示す図である。図２７には、テーブルデータＴ１〜Ｔ３の３つのテーブルデータが示されている。 FIG. 27 is a diagram illustrating an example of a plurality of table data input in the sixth embodiment. FIG. 27 shows three table data T1 to T3.

テーブルデータＴ１〜Ｔ３のそれぞれについて、ステップＳ１００〜Ｓ１６０又はＳ１６１までが実行されると、各テーブルデータは、例えば、図２８に示される状態になる。 When steps S100 to S160 or S161 are executed for each of the table data T1 to T3, each table data is in a state shown in FIG. 28, for example.

図２８は、第６の実施の形態において関係構造が推定された複数のテーブルデータの例を示す図である。図２８では、テーブルデータＴ１〜Ｔ３の各列が、カラム１〜７、カラム１１〜１３、又はカラム２１〜２３に分類されている。特に、テーブルデータＴ１については、図２７における２番目の列が、カラム２−１及びカラム２−２に分類されている。なお、各列のラベル及び各行に対するラベルは省略されている。 FIG. 28 is a diagram illustrating an example of a plurality of table data for which a relational structure has been estimated in the sixth embodiment. In FIG. 28, each column of the table data T1 to T3 is classified into columns 1 to 7, columns 11 to 13, or columns 21 to 23. In particular, regarding the table data T1, the second column in FIG. 27 is classified into a column 2-1 and a column 2-2. Note that labels for each column and labels for each row are omitted.

続いて、関係性推定部１８は、各テーブルデータ間の関係性を推定する（Ｓ１７０）。例えば、テーブルデータ間の関係性は、例えば、一方のテーブルデータのいずれかの列に含まれている全ての値が、他方のテーブルデータのいずれかの列に含まれているか否かにより判定される。一方のテーブルデータのいずれかの列に含まれている全ての値が、他方のテーブルデータのいずれかの列に含まれている場合、当該２つのテーブルデータ間（厳密には当該２つの列の間）には参照関係が有ると判定される。この場合、参照関係にあると判定された２つの列のうち、値の重複の有る列から値の重複の無い列への方向が、参照の方向とされてもよい。値の重複の無い列は、当該列を含むテーブルデータにおいてキーとなる値を格納している列である可能性が推定されるからである。 Subsequently, the relationship estimating unit 18 estimates the relationships between the table data (S170). For example, the relationship between two table data is determined by whether all values contained in some column of one table data are contained in some column of the other table data. If so, it is determined that there is a reference relationship between the two table data (strictly, between the two columns). In this case, of the two columns determined to be in a reference relationship, the direction from the column with duplicate values to the column without duplicate values may be taken as the direction of reference. This is because a column without duplicate values is presumed likely to be a column storing key values in the table data that contains it.

例えば、テーブルデータＴ１のカラム２−２の全ての値は、テーブルデータＴ２カラム１１に含まれている。また、テーブルデータＴ１のカラム２−２には値の重複が有るのに対し、テーブルデータＴ２のカラム１１には値の重複が無い。したがって、テーブルデータＴ１のカラム２−２は、テーブルデータＴ２カラム１１を参照していると判定される。また、テーブルデータＴ１のカラム７の全ての値は、テーブルデータＴ３のカラム２１に含まれている。また、テーブルデータＴ１のカラム７には値の重複が有るのに対し、テーブルデータＴ３のカラム２１には値の重複が無い。したがって、テーブルデータＴ１のカラム７は、テーブルデータＴ３のカラム２１を参照していると判定される。 For example, all values in column 2-2 of table data T1 are included in table data T2 column 11. Further, while there is a value overlap in the column 2-2 of the table data T1, there is no value overlap in the column 11 of the table data T2. Therefore, it is determined that the column 2-2 of the table data T1 refers to the table data T2 column 11. All the values in column 7 of table data T1 are included in column 21 of table data T3. Further, while there is a value overlap in column 7 of table data T1, there is no value overlap in column 21 of table data T3. Therefore, it is determined that the column 7 of the table data T1 refers to the column 21 of the table data T3.
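
The containment test and the direction rule can be sketched as follows. The sketch is stricter than the text: it reports a reference only when the referencing column contains duplicate values and the referenced column contains none, whereas the source merely says the direction "may" be chosen that way. Empty cells are ignored, and the table layout (a dict of column lists per table) and the name `infer_references` are assumptions for illustration.

```python
def infer_references(tables):
    """tables: dict table name -> dict column name -> list of values.

    Returns a list of (source table, source column, target table, target column),
    meaning the source column is estimated to refer to the target column.
    """
    references = []
    for src_table, src_columns in tables.items():
        for src_col, src_values in src_columns.items():
            src_non_empty = [v for v in src_values if v != ""]
            src_set = set(src_non_empty)
            # The referencing side must be non-empty and contain duplicates.
            if not src_set or len(src_set) == len(src_non_empty):
                continue
            for dst_table, dst_columns in tables.items():
                if dst_table == src_table:
                    continue
                for dst_col, dst_values in dst_columns.items():
                    dst_non_empty = [v for v in dst_values if v != ""]
                    dst_set = set(dst_non_empty)
                    unique = len(dst_set) == len(dst_non_empty)
                    if unique and src_set <= dst_set:
                        references.append((src_table, src_col, dst_table, dst_col))
    return references


tables = {
    "T1": {"2-2": ["A", "B", "A", "C"], "7": ["x", "x", "y", "y"]},
    "T2": {"11": ["A", "B", "C", "D"]},
    "T3": {"21": ["x", "y", "z"]},
}
refs = infer_references(tables)
```

With this made-up data, column 2-2 of T1 is estimated to refer to column 11 of T2, and column 7 of T1 to column 21 of T3, mirroring the example in the text.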

関係性推定部１８は、判定結果を示す情報を表示装置１０６に表示してもよい。 The relationship estimating unit 18 may display information indicating the determination result on the display device 106.

上述したように、第６の実施の形態によれば、テーブルデータ同士の関係構造を明確化することができる。テーブル間の関係構造の明確化により、不明だったシステムの仕様やビジネスルールの発見等の容易化を期待することができる。 As described above, according to the sixth embodiment, the relationship structure between table data can be clarified. By clarifying the relational structure between the tables, it is possible to expect easier discovery of unknown system specifications and business rules.

なお、第６の実施の形態は、第５の実施の形態以外の各実施の形態とのみ組み合わされてもよい。 Note that the sixth embodiment may be combined only with each embodiment other than the fifth embodiment.

次に、第７の実施の形態について説明する。第７の実施の形態では第６の実施の形態と異なる点について説明する。第７の実施の形態において特に言及されない点については、第６の実施の形態と同様でもよい。 Next, a seventh embodiment will be described. In the seventh embodiment, points different from the sixth embodiment will be described. What is not particularly mentioned in the seventh embodiment may be the same as in the sixth embodiment.

図２９は、第７の実施の形態における分析装置１０の機能構成例を示す図である。図２９中、図２５と同一部分には同一符号を付し、その説明は省略する。図２９において、分析装置１０は、更に、モデル生成部１９を有する。モデル生成部１９は、分析装置１０にインストールされた１以上のプログラムが、ＣＰＵ１０４に実行させる処理により実現される。 FIG. 29 is a diagram illustrating an example of the functional configuration of the analysis device 10 according to the seventh embodiment. In FIG. 29, the same parts as those in FIG. 25 are denoted by the same reference numerals, and their description is omitted. In FIG. 29, the analysis device 10 further has a model generation unit 19. The model generation unit 19 is realized by processing that one or more programs installed in the analysis device 10 cause the CPU 104 to execute.

モデル生成部１９は、ステップＳ１６１以前の処理結果について、概念データモデルを生成する。現行システム上にあるＤＢストアデータはシステムが複雑かつ大規模になるほどデータ量が大量となり、テーブルデータ数も多くなる。その場合、テーブルデータ同士の関係を構造として理解することが難しくなるため、何らかの方法で図形によって表すと理解がしやすい。そこで、モデル生成部１９は、１テーブルデータを１概念とし、１テーブルデータ内の各列のラベルを当該概念の属性として概念データモデル図（クラス図）を生成して、分かりにくいデータの構造の理解の向上に寄与する。なお、モデル生成部１９は、他の各部と並行して処理を実行してもよい。 The model generation unit 19 generates a conceptual data model from the processing results up to step S161. The more complex and large-scale the system, the larger the volume of DB store data in the current system and the greater the number of table data. In that case it becomes difficult to understand the relationships between table data as a structure, so representing them graphically in some way makes them easier to understand. The model generation unit 19 therefore treats one table data as one concept, generates a conceptual data model diagram (class diagram) with the labels of the columns in that table data as attributes of the concept, and thereby contributes to a better understanding of data structures that are hard to grasp. Note that the model generation unit 19 may execute its processing in parallel with the other units.

例えば、モデル生成部１９は、図２７に示した３種類のテーブルデータが入力されると、入力された状態における各テーブルデータを１個ずつの箱（概念又はクラス）として表し、各テーブル内のカラム（列構造）をクラスの属性として自動変換することで得られる概念データモデル図を表示装置１０６に表示する。 For example, when the three types of table data shown in FIG. 27 are input, the model generation unit 19 expresses each table data in the input state as one box (concept or class), and A conceptual data model diagram obtained by automatically converting a column (column structure) as a class attribute is displayed on the display device 106.

図３０は、概念データモデル図の第１の例を示す図である。図３０において、クラス１は、図２７のテーブルデータＴ１に基づくクラスである。クラス２は、テーブルデータＴ２に基づくクラスである。クラス３は、テーブルデータＴ３に基づくクラスである。各クラスは、対応するテーブルデータが有する列に対応する属性を有する。 FIG. 30 is a diagram illustrating a first example of a conceptual data model diagram. In FIG. 30, class 1 is a class based on the table data T1 in FIG. Class 2 is a class based on the table data T2. Class 3 is a class based on the table data T3. Each class has attributes corresponding to the columns of the corresponding table data.

また、各テーブルデータについてステップＳ１７０までが実行された時点において、モデル生成部１９は、概念データモデル図を図３１に示されるように更新してもよい。 In addition, at the time when the steps up to step S170 are executed for each table data, the model generating unit 19 may update the conceptual data model diagram as shown in FIG.

図３１は、概念データモデル図の第２の例を示す図である。図３１では、クラス１が、クラス１Ｈ及びクラス１Ｂを集約することが示されている。クラス１Ｈは、テーブルＴ１についてステップＳ１５１が実行されることにより解析される、マルチレイアウト構造のヘッド行に対応するクラスである。すなわち、クラス１Ｈは、図２８のテーブルＴ１において、カラム１、カラム２−１、カラム３、カラム４、カラム５、カラム６及びカラム７に値を含む行に対応するクラスである。一方、クラス１Ｂは、テーブルＴ１について図２６のステップＳ１５１が実行されることにより解析される、マルチレイアウト構造のボディ行に対応するクラスである。すなわち、クラス１Ｂは、図２８のテーブルＴ１においてカラム２−２、カラム４及びカラム５に値を含む行に対応するクラスである。なお、クラス１は、複数のマルチレイアウト構造の単位を含む。したがって、クラス１とクラス１Ｈとの多重度は、１対多であり、当該多重度がクラス１とクラス１Ｈとの関係線に付与されている。同様に、クラス１とクラス１Ｂと多重度は、１対多であり、当該多重度がクラス１とクラス１Ｂとの関係線に付与されている。 FIG. 31 is a diagram illustrating a second example of the conceptual data model diagram. FIG. 31 shows that class 1 aggregates class 1H and class 1B. The class 1H is a class corresponding to the head row having the multi-layout structure, which is analyzed by performing the step S151 on the table T1. That is, the class 1H is a class corresponding to a row including values in column 1, column 2-1, column 3, column 4, column 5, column 6, and column 7 in the table T1 of FIG. On the other hand, the class 1B is a class corresponding to a body row having a multi-layout structure, which is analyzed by executing step S151 in FIG. 26 for the table T1. That is, the class 1B is a class corresponding to a row including values in the columns 2-2, 4 and 5 in the table T1 of FIG. Class 1 includes a plurality of units of a multi-layout structure. Therefore, the multiplicity between class 1 and class 1H is one-to-many, and the multiplicity is assigned to the relationship line between class 1 and class 1H. Similarly, the multiplicity between the class 1 and the class 1B is one-to-many, and the multiplicity is given to the relationship line between the class 1 and the class 1B.

マルチレイアウト構造のヘッド行及びボディ行が、概念データモデル上で分離して表示されることにより、当該マルチレイアウト構造の把握を容易とすることができる。 Since the head row and the body row of the multi-layout structure are displayed separately on the conceptual data model, it is possible to easily grasp the multi-layout structure.

また、図３１では、クラス１Ｈとクラス３とが関係線で接続されており、クラス１Ｂとクラス２とが関係線で接続されている。関係線の矢印の方向は、当該関係線に係るクラス間の参照方向に従う。これは、テーブルデータＴ１〜Ｔ３についての図２６のステップＳ１７０の実行結果に基づく。すなわち、ステップＳ１７０では、テーブルデータＴ１のカラム７に係る列が、テーブルデータ３のカラム１に係る列を参照していることが推定される。また、テーブルデータＴ１のカラム２−２に係る列が、テーブルデータ２のカラム１に係る列を参照していることが推定される。なお、図３１では、矢印の元の概念が矢印の先の概念を参照していることを示す。なお、図３１に示されるように、各関係線には、参照関係を有する列のラベル等が付記されてもよい。 In FIG. 31, class 1H and class 3 are connected by a relation line, and class 1B and class 2 are connected by a relation line. The direction of the arrow on a relation line follows the direction of reference between the classes it connects. This is based on the result of executing step S170 in FIG. 26 for the table data T1 to T3. That is, in step S170, it is estimated that the column for column 7 of table data T1 refers to the column for column 1 of table data 3, and that the column for column 2-2 of table data T1 refers to the column for column 1 of table data 2. In FIG. 31, the concept at the tail of an arrow refers to the concept at the head of the arrow. As shown in FIG. 31, each relation line may be annotated with, for example, the labels of the columns that have the reference relationship.
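
The mapping from analysed tables to a class diagram can be sketched as below: one concept per table (or per head/body layout), one attribute per column label, aggregation lines with a one-to-many multiplicity, and labelled reference arrows. The text output uses PlantUML class-diagram notation purely as an illustrative choice; the source does not name any diagram format, and the `Concept` type, the `render` function, and the class and attribute names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One box in the diagram: a table, or the head/body part of a table."""
    name: str
    attributes: list = field(default_factory=list)


def render(concepts, aggregations=(), references=()):
    """Return the diagram as text.

    aggregations -- iterable of (whole, part); drawn as one-to-many
    references   -- iterable of (source, target, label); arrow from source to target
    """
    lines = ["@startuml"]
    for concept in concepts:
        lines.append(f"class {concept.name} {{")
        lines.extend(f"  {attribute}" for attribute in concept.attributes)
        lines.append("}")
    for whole, part in aggregations:
        lines.append(f'{whole} "1" o-- "*" {part}')
    for source, target, label in references:
        lines.append(f"{source} --> {target} : {label}")
    lines.append("@enduml")
    return "\n".join(lines)


concepts = [
    Concept("Class1"),
    Concept("Class1H", ["column1", "column2_1", "column7"]),
    Concept("Class1B", ["column2_2", "column4", "column5"]),
    Concept("Class2", ["column11", "column12", "column13"]),
    Concept("Class3", ["column21", "column22", "column23"]),
]
diagram = render(
    concepts,
    aggregations=[("Class1", "Class1H"), ("Class1", "Class1B")],
    references=[("Class1H", "Class3", "column7"), ("Class1B", "Class2", "column2_2")],
)
```

Keeping the head and body layouts as separate concepts, as in this sketch, is what lets each reference arrow start from the layout that actually holds the referencing column.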

Furthermore, the result of executing step S161 in FIG. 26 may be reflected in the conceptual data model diagram. In this case, in step S161, the singular point detection unit 17 adds a column to the table data for each detected singular point and moves that singular point into the added column. As a result, table data T1 in FIG. 28 is updated, for example, as shown in FIG. 32.

FIG. 32 is a diagram illustrating an example of table data in which a column has been added for each singular point. In FIG. 32, column 4 of table data T1 is divided into columns 4-1, 4-2, and 4-3. Column 4-2 is the column to which the singular point "10000101" contained in column 4 has been moved. Column 4-3 is the column to which the singular point "99999999" contained in column 4 has been moved.
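The column split described for FIG. 32 can be sketched as follows. This is a minimal sketch that assumes the singular points have already been detected and are passed in as a list. The function name `split_singular_points`, the row-as-dict representation, and the ordinary value "20170201" are illustrative assumptions, not taken from the figures.

```python
def split_singular_points(rows, column, singular_values):
    """Move each singular value of `column` into its own added column.

    rows is a list of dicts. Ordinary values go to "<column>-1" and the
    k-th singular value (k starting at 0) goes to "<column>-<k+2>".
    Cells that a row does not use are left as None.
    """
    targets = {v: f"{column}-{i + 2}" for i, v in enumerate(singular_values)}
    new_columns = [f"{column}-1", *targets.values()]
    result = []
    for row in rows:
        # Copy every other column unchanged, then add the split columns.
        new_row = {k: v for k, v in row.items() if k != column}
        new_row.update({name: None for name in new_columns})
        value = row.get(column)
        if value is not None:
            new_row[targets.get(value, f"{column}-1")] = value
        result.append(new_row)
    return result


rows = [
    {"column 4": "20170201"},  # hypothetical ordinary value
    {"column 4": "10000101"},  # singular point, moves to column 4-2
    {"column 4": "99999999"},  # singular point, moves to column 4-3
]
updated = split_singular_points(rows, "column 4", ["10000101", "99999999"])
```

In this sketch the split columns are appended after the remaining columns; a real implementation would presumably keep them at the position of the original column.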

For such table data T1, the model generation unit 19 may generate a conceptual data model diagram such as the one shown in FIG. 33.

FIG. 33 is a diagram illustrating a third example of the conceptual data model diagram. In FIG. 33, the columns corresponding to the singular points (columns 4-2 and 4-3) are also shown explicitly as attributes of class 1H. In this way, the conceptual data model structure can present the singular point information (the existence of singular points) in an easily understandable manner.
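One way to carry the singular point columns into a class description is to flag them on the attribute list. The sketch below is purely illustrative: `class_from_table` and the `singular_point` flag are hypothetical, and the full attribute list for class 1H (including column 4-1) is an assumption derived from the head-row columns, not read from FIG. 33.

```python
def class_from_table(class_name, columns, singular_columns=()):
    """Describe a table as a class whose attributes are its columns.

    Columns that hold a moved singular point are flagged so that a
    renderer could show them distinctly.
    """
    singular = set(singular_columns)
    attributes = [
        {"name": col, "singular_point": col in singular} for col in columns
    ]
    return {"class": class_name, "attributes": attributes}


class_1h = class_from_table(
    "class 1H",
    ["column 1", "column 2-1", "column 3", "column 4-1", "column 4-2",
     "column 4-3", "column 5", "column 6", "column 7"],
    singular_columns=["column 4-2", "column 4-3"],
)
flagged = [a["name"] for a in class_1h["attributes"] if a["singular_point"]]
```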

As described above, according to the seventh embodiment, the result of clarifying the structure within and between table data is automatically converted into and expressed as a conceptual data model, which makes the structure of the table data easier to understand.

Note that the seventh embodiment may be combined only with the embodiments other than the sixth embodiment.

Further, according to each of the above embodiments, using the DB store data as the input information eliminates the need for the skill of comprehensively judging various kinds of input information.

Further, according to each of the above embodiments, information on the specifications of the current system is extracted from the DB store data. Even a non-expert can therefore estimate the specifications of the current system by the single method of analyzing only the DB store data, without examining various documents, conducting interviews, or analyzing programs. Moreover, because the structure of the system is expressed by a conceptual data model, the person estimating the specifications of the current system can understand it easily.

In each of the above embodiments, the concepts of column and row may be interchanged. That is, a column may be treated as a row, and a row may be treated as a column.

In each of the above embodiments, the program installed in the analysis device 10 is an example of the table data analysis program. The table data of the embodiments is an example of the table data recited in the claims. The classification support unit 14 is an example of the receiving unit. The multi-layout analysis unit 16 is an example of the analyzing unit. The singular point detection unit 17 is an example of the detecting unit. The model generation unit 19 is an example of the generation unit. The noise removal unit 15 is an example of the deletion unit.

Although embodiments of the present invention have been described in detail above, the present invention is not limited to these specific embodiments, and various modifications and changes are possible within the scope of the gist of the present invention as set forth in the claims.

Reference Signs List
10 analysis device
11 input unit
12 classification unit
13 processing unit
14 classification support unit
15 noise removal unit
16 multi-layout analysis unit
17 singular point detection unit
18 relationship estimation unit
19 model generation unit
100 drive device
101 recording medium
102 auxiliary storage device
103 memory device
104 CPU
105 interface device
106 display device
107 input device
B bus

Claims

1. A table data analysis program for causing a computer to function as:
a classification unit that classifies, among the columns in first table data, each column including a plurality of types of values into columns for each type, and classifies each row of the first table data according to the types of the values included in the row;
a processing unit that processes the first table data based on a classification result by the classification unit to generate second table data; and
an analyzing unit that analyzes a unit of repetition of a pattern of the types in the column including a plurality of types of values in the first table data, and adds, to the second table data, information that distinguishes rows including the head type in the unit from rows including a type other than the head type in the unit.

2. The table data analysis program according to claim 1, wherein the classification unit determines the type of each value in the first table data based on classification information for determining types, and, when it is determined that the classification result reflected in the second table data needs correction and classification information is further added, classifies each column including a plurality of types of values into columns for each type based on the added classification information.

3. The table data analysis program according to claim 1, further causing the computer to function as a receiving unit that receives a correction by a user to the classification result.

4. The table data analysis program according to any one of claims 1 to 3, further causing the computer to function as a deletion unit that deletes, from among the columns and rows in the second table data, any column or row that matches a predetermined rule.

5. The table data analysis program according to any one of claims 1 to 4, further causing the computer to function as a detecting unit that detects, for a column of numerical values in the second table data, a numerical value in the set of numerical values included in the column that shows a tendency different from that of the other numerical values.

6. The table data analysis program according to any one of claims 1 to 5, further causing the computer to function as a generation unit that generates a conceptual data model diagram in which the second table data is represented as a class and the columns of the second table data are represented as attributes.