JP2012173745A

JP2012173745A - Database analysis device and database analysis program

Info

Publication number: JP2012173745A
Application number: JP2011031645A
Authority: JP
Inventors: Kiyoto Kawachi; 清人河内; Shoji Sakurai; 鐘治桜井
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 2011-02-17
Filing date: 2011-02-17
Publication date: 2012-09-10

Abstract

PROBLEM TO BE SOLVED: To extract description information for each column of a database from a Web application program that uses the database, and to provide that information.
SOLUTION: A database analysis device 100 includes a Web application input part 110 and a specification processing execution part 1010. The Web application input part 110 inputs a Web application 101 that uses the database to be analyzed, Web application setting data 102 used to configure the Web application 101, and a Web page template 103. On the basis of the Web application 101, the Web application setting data 102, and the Web page template 103, the specification processing execution part 1010 specifies an input form component that is used to create an input form displayed on a Web page created by the Web application 101 and that is associated with an input item name corresponding to a column of a table in the database to be analyzed, and then specifies the input item name corresponding to the specified input form component.

Description

The present invention relates to a database analysis device and a database analysis program that analyze a Web application using a database in order to extract description information for the tables and columns of that database.

Commercially available data modeling tools very commonly provide, as a database analysis function, the ability to output a list of tables and columns (for example, Non-Patent Document 1). Patent Document 1 is one known technique for obtaining information that cannot be extracted from the database alone by analyzing a program that uses the database. Patent Document 1 discloses a method that examines the data types of the variables in which data read from the database is stored, and the comparisons against constant values performed in the program, in order to extract information such as the data type, upper limit, and lower limit of each data item in the database, or whether the item is used as a loop exit condition.

Patent Document 1: International Publication No. WO 2009/011057

Non-Patent Document 1: IBM InfoSphere Data Architect, http://www-06.ibm.com/software/jp/data/optim/data-architect/function.html

Developing a data linkage system across multiple databases, for example one that reflects order amounts entered in a materials system database in an accounting system database, requires identifying the tables and columns that must be linked between the databases. Conventional techniques, however, extract only the tables stored in a database, a list of the names (physical names) of the columns in each table, and their value ranges. To learn the meaning (purpose) of the data stored in each column, it was necessary to search the specifications using the column and table names. This investigation takes time, and it does not yield correct information when the specifications are incomplete or out of date.

An object of the present invention is to make it easier for a user to infer the meaning of each database column by extracting description information for each column from a Web application program that uses the database (hereinafter sometimes called simply a Web application) and providing that information.

The database analysis device of the present invention comprises:
an input unit that inputs a Web application program that uses a database to be analyzed, a Web application setting file describing predetermined Web application setting information used to configure the Web application program, and a Web page template indicating the template information used when the Web application program generates a Web page;
a specifying process execution unit that specifies, on the basis of the Web application program, the Web application setting file, and the Web page template, an input form component used to generate an input form displayed on a Web page generated by the Web application program input to the input unit, and that specifies, by analyzing the Web page template, the input item name that corresponds to the specified input form component and to a column of a table in the database to be analyzed; and
an output unit that outputs information including at least the input item name specified by the specifying process execution unit, as description information of the column that is updated by input data entered into the input form displayed on the Web page.

According to the present invention, description information for each column can be extracted from a Web application program that uses the database and provided to the user.

FIG. 1 is a configuration diagram of the database analysis device 100 of Embodiment 1.
FIG. 2 is a diagram showing an outline of the Web application program 101 to the Web page template 103 of Embodiment 1.
FIG. 3 is a flowchart showing the processing operations of the Web application input unit 110 to the processing flow analysis unit 140 of Embodiment 1.
FIG. 4 is a diagram explaining the operation of the processing flow analysis unit 140 of Embodiment 1.
FIG. 5 is a flowchart showing the operation of the input item name specifying unit 150 of Embodiment 1.
FIG. 6 is a diagram explaining the operation of the input item name specifying unit 150 of Embodiment 1.
FIG. 7 is a flowchart specifically explaining the content of S204 in FIG. 5.
FIG. 8 is a flowchart specifically explaining the content of S204 in FIG. 5.
FIG. 9 is a flowchart specifically explaining the content of S204 in FIG. 5.
FIG. 10 is a diagram showing an example to which the input item name specification method of the database analysis device 100 of Embodiment 1 is applied.
FIG. 11 is a diagram explaining the operation of the output item name specifying unit 160 of Embodiment 1.
FIG. 12 is a diagram explaining the operation of the output item name specifying unit 160 of Embodiment 1.
FIG. 13 is a diagram showing the analysis result output by the description information recording unit 170 of Embodiment 1.
FIG. 14 is a flowchart of the processing flow analysis unit 140 of Embodiment 2.
FIG. 15 is another flowchart of the processing flow analysis unit 140 of Embodiment 2.
FIG. 16 is a configuration diagram of the database analysis device 300 of Embodiment 3.
FIG. 17 is a diagram showing the appearance of the database analysis device of Embodiment 4.
FIG. 18 is a diagram showing the hardware configuration of the database analysis device of Embodiment 4.

Embodiment 1.
FIG. 1 is a configuration diagram of the database analysis device 100. As shown in FIG. 1, the database analysis device 100 includes a Web application input unit 110 (input unit), a page processing function specifying unit 120, a DB call function specifying unit 130, a processing flow analysis unit 140, an input item name specifying unit 150, an output item name specifying unit 160, a description information recording unit 170, and an analysis result output unit 180 (output unit). The page processing function specifying unit 120 to the output item name specifying unit 160 constitute a specifying process execution unit 1010. In the following, a database is sometimes written as "DB".

The Web application input unit 110 reads into the device the Web application program 101 to be analyzed, the Web application setting data 102 (Web application setting file), which is the configuration data of the Web application, and the Web page template 103.

FIG. 2 is a diagram showing an outline of the Web application program 101 to the Web page template 103.
(1) The Web application program 101 is input in the form of source code.
(2) The Web application setting data 102 describes:
(a) a list of the URLs accepted by the Web application program 101;
(b) the name of the function invoked when each URL is called (hereinafter, the entry point); and
(c) the identifier of the Web page template to be used for the result of processing the URL.
Although Embodiment 1 speaks of a "function name" for convenience, it need not literally be a function name. For example, in a Web application written in Java (registered trademark), once the class invoked when the URL is called is known, the function (method) that is called is the one defined by the standard.
(3) The Web page template 103 is template data used when the Web application program 101 generates a Web page; it is a technique used in ordinary Web applications, such as JSP (registered trademark). The Web application program 101 reads the Web page template 103 and returns to the browser data in which the parameter portions have been replaced with values generated at run time.

(1) As shown in FIG. 2, the page processing function specifying unit 120 specifies, from the Web application setting data 102 that was read in, the entry point that processes each page of the Web application program 101.
(2) As shown in FIG. 2, the DB call function specifying unit 130 specifies the database call functions present in the Web application program 101.
(3) As shown in FIG. 2, the processing flow analysis unit 140 analyzes the processing flow in the Web application program 101 from each entry point to the DB call functions. It specifies the database tables and columns that are updated or referenced by the processing started from the entry point, and it also specifies the "variable groups" in the program that relate to the input and output data of each column.

(Variable groups)
Here, the "variable group" related to input data is the group of variables that affect the value stored in each column when a DB call function updates the database.
Similarly, the variable group related to output data is the group of variables that are affected by the value corresponding to each column in the search result when a DB call function searches the database.

The input item name specifying unit 150 specifies, among the input-data-related variable group for each column, the variables whose values are entered through a form of the Web application program 101 being analyzed. It then analyzes the Web page template 103 in which the form components that supply input to the specified variables are placed, extracts the title of the output Web page and the input item name corresponding to each form component, and records them in the description information recording unit 170 as description information for each column of the tables updated by the series of processes started from the entry point.

Similarly, the output item name specifying unit 160 specifies, among the variable group related to the output data from each column, the variables that are output to a Web page. It then analyzes the Web page template to which each specified variable is output, extracts the title of the Web page and the output item name that describes the output data, and records them in the description information recording unit 170 as description information for each column of the tables referenced by the series of processes started from the entry point.

The analysis result output unit 180 outputs the description information recorded by the description information recording unit 170 as the analysis result 104.

FIG. 3 shows the processing operations of the Web application input unit 110 to the processing flow analysis unit 140. The operation is described next with reference to FIG. 3.
(1) First, the Web application input unit 110 reads the Web application program 101, the Web application setting data 102, and the Web page template 103 into the device (S11).
(2) Next, the page processing function specifying unit 120 uses the information in the input Web application setting data 102 to create a list of the URL of each page, the entry point of each page, and the Web page template displayed after the function is executed (S12).
(3) Next, the DB call function specifying unit 130 searches the Web application program and specifies the positions at which DB call functions appear (S13). Here, a "DB call function" is a built-in function provided by a platform such as an OS or a framework for issuing SQL statements to a database. It takes an SQL statement as an argument (see FIG. 4, described later) and returns an execution result (such as the data obtained by a search). The names of the DB call functions are assumed to be defined in advance.
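Steps S12 and S13 can be pictured with a small sketch. It is only an illustration under stated assumptions: the dictionary layout of the setting data, the DB call function names `execute_query` and `execute_update`, and the sample source text are all hypothetical, and a plain text search stands in for whatever source analysis an actual implementation would use.

```python
import re

# Hypothetical setting data: URL -> entry point and template identifier.
WEB_APP_SETTINGS = {
    "/order/new": {"entry_point": "create_order", "template": "order_done.tpl"},
    "/order/list": {"entry_point": "list_orders", "template": "order_list.tpl"},
}

# DB call function names are assumed to be defined in advance (hypothetical names).
DB_CALL_FUNCTIONS = ("execute_query", "execute_update")


def build_page_list(settings):
    """S12: list (URL, entry point, template) for every page, sorted by URL."""
    return [(url, s["entry_point"], s["template"]) for url, s in sorted(settings.items())]


def find_db_calls(source, names=DB_CALL_FUNCTIONS):
    """S13: return (line number, function name) for each DB call site found."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, names)) + r")\s*\(")
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in pattern.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


SAMPLE_SOURCE = '''def create_order(form):
    sql = "INSERT INTO orders (qty) VALUES (" + form.get_qty() + ")"
    execute_update(sql)
'''

pages = build_page_list(WEB_APP_SETTINGS)
call_sites = find_db_calls(SAMPLE_SOURCE)
```

The two results correspond to the inputs that the processing flow analysis unit 140 receives: the per-page entry points and the list of DB call sites.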

The processing flow analysis unit 140 receives the entry point of each page specified by the page processing function specifying unit 120 and the list of DB call functions in the program specified by the DB call function specifying unit 130, and determines whether each page entry point in the Web application program can call a DB call function (S14-1).
FIG. 4 is a diagram explaining the operation of the processing flow analysis unit 140. If such a call is possible, the processing flow analysis unit 140 tracks, as shown in FIG. 4, the string operations (assignment, concatenation, replacement, and so on) performed from the entry point, and determines what strings can be stored in the string variable that is passed as the SQL statement argument of the DB call function (S14-2). Because parts of the string stored in a variable may be indeterminate owing to program branches or externally supplied values, what is determined in general is a pattern representing all the strings the variable can hold.
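The idea of a "pattern representing all the strings a variable can hold" can be sketched as follows. This is a deliberate simplification, not the method of the embodiment: each value is modeled as a finite set of part sequences, so loops are not handled, whereas the text later points to a context-free grammar based analysis. The helper names and the `{name}` placeholder notation are assumptions made for this illustration.

```python
from itertools import product


def lit(text):
    """A string constant appearing in the program."""
    return {(("lit", text),)}


def ext(name):
    """A value supplied from outside (for example form input); contents unknown."""
    return {(("var", name),)}


def concat(a, b):
    """String concatenation: every alternative of a followed by every alternative of b."""
    return {x + y for x, y in product(a, b)}


def merge(a, b):
    """At a control-flow join the variable may hold either branch's value."""
    return a | b


def render(value):
    """Show each alternative, writing unknown parts as {name} placeholders."""
    return sorted(
        "".join(text if kind == "lit" else "{" + text + "}" for kind, text in seq)
        for seq in value
    )


sql = concat(lit("UPDATE orders SET qty="), ext("qty"))
then_branch = concat(concat(sql, lit(" WHERE id=")), ext("order_id"))
else_branch = concat(sql, lit(" WHERE status='open'"))
patterns = render(merge(then_branch, else_branch))
```

Here `patterns` holds two alternatives, one per branch, and each keeps track of which external variable feeds which part of the SQL text.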

(Extraction of the command string)
Once the processing flow analysis unit 140 has determined the pattern of strings assigned to the SQL statement argument, it compares that pattern with predefined SQL grammar data and extracts the string that corresponds to the command in the SQL grammar (S14-3).

(For update commands)
If the command is one that updates the contents of the database, such as INSERT or UPDATE, the processing flow analysis unit 140 likewise compares the pattern with the SQL grammar to specify the table name and column names, and further specifies the group of variables that affect the value assigned to each column (S14-4).
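For a single INSERT pattern, S14-3 and S14-4 can be sketched as below. The sketch assumes that the string pattern has already been rendered with `{name}` placeholders marking the variables that feed each position (a notation invented for this example), and it uses a regular expression in place of real SQL grammar data. Splitting the value list on commas is naive and would fail on literals that contain commas.

```python
import re

INSERT_RE = re.compile(
    r"^\s*INSERT\s+INTO\s+(\w+)\s*\(([^)]*)\)\s*VALUES\s*\((.*)\)\s*$",
    re.IGNORECASE | re.DOTALL,
)
PLACEHOLDER_RE = re.compile(r"\{(\w+)\}")


def command_of(pattern):
    """S14-3 (simplified): take the first word of the statement as the command."""
    words = pattern.split()
    return words[0].upper() if words else ""


def analyze_insert(pattern):
    """S14-4 (simplified): return (table, {column: [variables]}) or None."""
    match = INSERT_RE.match(pattern)
    if match is None:
        return None
    table = match.group(1)
    columns = [c.strip() for c in match.group(2).split(",")]
    values = [v.strip() for v in match.group(3).split(",")]  # naive split
    if len(columns) != len(values):
        return None
    return table, {col: PLACEHOLDER_RE.findall(val) for col, val in zip(columns, values)}


result = analyze_insert("INSERT INTO orders (qty, url) VALUES ({qty}, '{ref_url}')")
```

The returned mapping is the per-column variable group that the later steps compare against the variables fed by form components.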

In Embodiment 1, the method disclosed in the reference below can be used for determining the strings and comparing them with the SQL grammar in the processing flow analysis unit 140. Details are omitted, but the method disclosed there generates a control flow graph starting at the entrance of the entry point and expresses the string operations that appear on the control flow as a context-free grammar whose nonterminal symbols are the variables that hold strings, thereby determining the "grammar" that the value of each string variable must satisfy. By examining this grammar, it is possible to determine the pattern of strings generated in any string variable and the group of variables that affect a particular part of a string.

<Reference>
G. Wassermann and Z. Su. Sound and Precise Analysis of Web Applications for Injection Vulnerabilities. In Conference on Programming Language Design and Implementation (PLDI), 2007.

(For reference commands)
Returning to the description of the operation: when the command is one that references the contents of the database, such as SELECT, the processing flow analysis unit 140 obtains the table name as before, then specifies the strings passed as column names to the function that retrieves SQL search results, and further specifies the group of variables affected by the return value of that function (S14-5). To find the variables affected by the return value of the SQL search result retrieval function, start at the point in the program where that function returns and trace the assignment statements whose right-hand side contains the variable to which the return value was first assigned.
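The assignment tracing described for S14-5 amounts to a forward propagation over assignments. The following is a minimal sketch for straight-line code only: it assumes the assignments have already been extracted in program order as (target, source variables) pairs, which is a representation chosen for this example, and it does not model branches, loops, or a variable being overwritten with an unrelated value.

```python
def affected_variables(assignments, seed):
    """Return the variables influenced by `seed`.

    assignments: list of (target, [source variable names]) in program order.
    seed: the variable that first receives the search result's return value.
    """
    affected = {seed}
    for target, sources in assignments:
        # A target becomes affected when any right-hand-side variable already is.
        if any(source in affected for source in sources):
            affected.add(target)
    return affected


# Hypothetical program fragment, already reduced to assignments:
#   price = row            (row holds the value fetched for a column)
#   label = title
#   total = price + tax
assignments = [("price", ["row"]), ("label", ["title"]), ("total", ["price", "tax"])]
output_group = affected_variables(assignments, "row")
```

In this fragment `price` and `total` end up in the output-data variable group, while `label` does not.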

The results of the analysis by the processing flow analysis unit 140 (the input and output variable groups) are next associated with input item names and output item names on Web pages by the input item name specifying unit 150 and the output item name specifying unit 160. The operation of the input item name specifying unit 150 is described first, with reference to FIGS. 5 and 6.
FIGS. 5 and 6 are a flowchart and a conceptual diagram showing the operation of the input item name specifying unit 150. The description refers mainly to FIG. 6.

(Operation of the input item name specifying unit 150)
The input item name specifying unit 150 first specifies the variables to which values entered through a form on a Web page are assigned (step 201). This can easily be done by searching the control graph from the entry point for functions that obtain input values from form components and specifying the variables to which their return values are assigned. The form component from which an input value comes can also easily be determined from the function name or the like. For example, in Struts, which is widely used as a Web application development framework, the input value from a form component named "user" on a Web page template is obtained with a function named getUser(), so removing "get" from the function name yields the name of the corresponding form component.
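The naming rule in the Struts example can be written as a one-line transformation. The sketch below assumes the usual JavaBeans convention that the first letter after "get" is lowercased, which is what turns getUser into "user"; the function name `form_part_name` is made up for this illustration.

```python
def form_part_name(getter_name):
    """Derive a form component name from a getter name, e.g. getUser -> user.

    Returns None when the name does not look like a getter.
    """
    if not getter_name.startswith("get") or len(getter_name) == 3:
        return None
    rest = getter_name[3:]
    # Assumption: JavaBeans-style naming, so the first letter is lowercased.
    return rest[0].lower() + rest[1:]


name = form_part_name("getUser")
```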

Next, the input item name specifying unit 150 specifies, among the variables to which input values from form components are assigned, those that affect a database update (also called update-affecting variables) (step 202). It then specifies the table name and column name affected by each such variable by comparison with the processing result of the processing flow analysis unit 140 (the variable group specified in S14-4) (step 203). This identifies the form component that supplies a value to each update-affecting variable, so the unit specifies the input item name corresponding to that form component by analyzing the Web page template 103 (step 204) and stores the result in the description information recording unit 170 (step 205).

The Web page template 103 selected for analysis is the one that contains a form whose action attribute is the URL corresponding to the entry point currently being analyzed. The processing by which the input item name specifying unit 150 specifies an input item name from the Web page template (step 204) is described in detail with reference to FIGS. 7 to 9. The operations in FIGS. 7, 8, and 9 are performed by the input item name specifying unit 150.

First, the text enclosed between <title> and </title> in the template is stored as the page name of the Web page generated from that template (step 301). Next, it is checked whether the form component is contained in a table element in the HTML of the Web page template (step 302). If it is not enclosed in a table element, the word at the beginning of the line containing the form component is regarded as the input item name (step 314).
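Step 301 can be illustrated with Python's standard `html.parser` module. This is a sketch that treats the template as plain HTML; a real JSP template may contain directives that an HTML parser does not understand, and the class and function names here are invented for the example.

```python
from html.parser import HTMLParser


class TitleParser(HTMLParser):
    """Collect the text that appears between <title> and </title>."""

    def __init__(self):
        super().__init__()
        self._in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def page_name(template_html):
    """Step 301 (simplified): return the template's title as the page name."""
    parser = TitleParser()
    parser.feed(template_html)
    parser.close()
    return parser.title.strip()


name = page_name("<html><head><title> Order entry </title></head><body></body></html>")
```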

If the form component is enclosed in a table element, the input item name corresponding to it must be found among the strings written in the cells of the table. In this embodiment the input item name is specified as follows.

First, among the cells of the table (the ranges enclosed by <td> and </td>), each cell whose rowspan attribute is 2 or more is split row by row, so that the table is transformed into one in which cells holding the same value (text, a form component, or the area in which a value is stored when it is output on the Web page template, hereinafter the Web application output value storage area) are lined up for as many rows as the rowspan (step 303).
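The rowspan expansion of step 303 can be sketched as follows. The table is assumed to be already parsed into rows of (value, rowspan) pairs, which is a representation chosen for this example; colspan is ignored here for brevity, and the table is assumed to be well formed.

```python
def expand_rowspans(rows):
    """Step 303 (simplified): copy each rowspan >= 2 cell into the rows it spans.

    rows: list of rows, each a list of (value, rowspan) pairs.
    Returns a list of rows of plain values with one cell per grid position.
    """
    out = [[] for _ in rows]
    pending = {}  # (row index, column index) -> value carried down from above
    for r, row in enumerate(rows):
        col = 0
        cells = iter(row)
        cell = next(cells, None)
        while cell is not None or (r, col) in pending:
            if (r, col) in pending:
                # This grid position is covered by a cell from an earlier row.
                out[r].append(pending.pop((r, col)))
            else:
                value, rowspan = cell
                out[r].append(value)
                for k in range(1, rowspan):
                    if r + k < len(rows):
                        pending[(r + k, col)] = value
                cell = next(cells, None)
            col += 1
    return out


table = [
    [("Order details", 2), ("Quantity", 1)],
    [("Reference URL", 1)],
]
expanded = expand_rowspans(table)
```

After expansion every row has the same number of cells, which is what lets the later steps compare rows column by column.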

Next, L is set to 1 (step 304), and the following processing is repeated until L equals the number of table rows (step 305).

First, starting from row L, as many consecutive rows containing no form component as possible are taken (step 307). These are called heading candidate rows. If there are no heading candidate rows, processing proceeds to step 353. Let N be the number of heading candidate rows taken. For the block of m rows (1 ≤ m ≤ N − s + 1) starting at the s-th row (1 ≤ s ≤ N) from the bottom of the heading candidate rows, the maximum number of times r that the same structure appears repeatedly in the rows that follow is obtained (steps 309 to 313 and 320 to 333). A block is judged to have the same structure when both of the following conditions hold.

(1) The colspan attribute value of each cell is the same as that of the corresponding cell in the heading candidate rows (a cell without colspan is treated as colspan = 1) (step 322).
(2) The same form component (or text, or Web application output value storage area) as in the cell matched the first time appears (steps 323, 324). The text itself may differ.
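One way to read the two conditions is sketched below for a fixed heading block of m rows. The representation is an assumption made for this example: each cell is a (kind, colspan) pair, where kind is a label such as "text" or "input". The first repetition only has to match the heading block's colspans, and every later repetition must also show the same kinds of cell content as that first repetition.

```python
def _colspans_match(block_a, block_b):
    """Condition (1): same number of cells per row and equal colspans."""
    return all(
        len(row_a) == len(row_b)
        and all(a[1] == b[1] for a, b in zip(row_a, row_b))
        for row_a, row_b in zip(block_a, block_b)
    )


def _kinds_match(block_a, block_b):
    """Condition (2): same kind of content in every cell (text may differ)."""
    return all(
        a[0] == b[0]
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def count_repetitions(heading_block, following_rows):
    """Return r, the number of consecutive repetitions after the heading block."""
    m = len(heading_block)
    if m == 0:
        return 0
    first = None
    r = 0
    for start in range(0, len(following_rows) - m + 1, m):
        block = following_rows[start:start + m]
        if not _colspans_match(heading_block, block):
            break
        if first is None:
            first = block  # later repetitions are compared with this one
        elif not _kinds_match(first, block):
            break
        r += 1
    return r


heading = [[("text", 1), ("text", 1)]]
body = [
    [("input", 1), ("input", 1)],
    [("input", 1), ("input", 1)],
    [("text", 2)],
]
r = count_repetitions(heading, body)
```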

For every s and m, r is obtained by the above processing, and the m that maximizes the product of m and r is taken as the number of heading rows (steps 328, 329). If several values of m give the same m·r, the largest m is taken as the number of heading rows. For the corresponding s, the s − 1 heading candidate rows counted from the bottom are regarded as text unrelated to the headings, such as annotations. If r is 0 for every s and m (step 340), processing proceeds to step 351.
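The selection rule for the number of heading rows is small enough to state directly in code. The sketch assumes the values of r have already been computed and collected in a dictionary keyed by (s, m); that container and the function name are choices made for this example. The text does not say how to break a tie between candidates that share both m·r and m, so the sketch simply keeps the first such candidate.

```python
def choose_heading_rows(candidates):
    """Pick (s, m, r) maximizing m * r, preferring the larger m on ties.

    candidates: dict mapping (s, m) -> r.
    Returns None when r is 0 for every (s, m), i.e. no repeated structure.
    """
    viable = [(m * r, m, s, r) for (s, m), r in candidates.items() if r > 0]
    if not viable:
        return None
    _, m, s, r = max(viable, key=lambda item: (item[0], item[1]))
    return s, m, r


# m = 1 repeated 4 times and m = 2 repeated twice both give m * r = 4,
# so the larger m wins.
choice = choose_heading_rows({(1, 1): 4, (1, 2): 2, (2, 1): 1})
```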

Finally, every form component included in the repeated structure is given an input item name by the method shown in steps 341 to 350. The item name is assigned in step 347 as follows. First, the input item name W is set to the empty string. If the same cell begins with a text W0, then W = W0. Next, the text W1 of the corresponding cell among the heading candidate rows (row (L − 1) + k + N − (M + S) + 1, column col) is appended, separated by a suitable delimiter such as a slash. After that, the texts of the heading candidate rows that do not appear in the repeated structure and that belong to the same column are appended in order. If the same text appears consecutively, it is appended only once and the remaining occurrences are simply ignored.
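The joining rule, a delimiter between parts with consecutive duplicates collapsed, can be sketched as below. The order in which the heading texts are supplied is left to the caller, since the sketch makes no claim about the exact order used in the figures; the function name and the English labels are illustrative only.

```python
def build_item_name(parts, delimiter="/"):
    """Join label parts, skipping empty parts and consecutive duplicates."""
    kept = []
    for part in parts:
        if not part:
            continue  # nothing to add, e.g. no text W0 in the cell itself
        if kept and kept[-1] == part:
            continue  # the same text in a row is added only once
        kept.append(part)
    return delimiter.join(kept)


# A heading that spanned two rows shows up twice after rowspan expansion.
item_name = build_item_name(["", "Order details", "Order details", "Reference URL"])
```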

After the above processing, L is set to L + N + R·M and control returns to the top of the loop (step 352). Steps 353 to 358 handle the case where no heading candidate rows were found in the table. In that case, each row is regarded as representing an independent input item. The input item name is assigned in step 356 as follows. First, the input item name W is set to the empty string. If the same cell begins with a text W0, then W = W0. Next, the cells to the left of the cell are scanned in order, and the content W1 of the first cell that contains only text is appended, separated by a suitable delimiter such as a slash. Scanning then continues further to the left, appending strings in order until a cell containing a form component appears or the first column is reached.

The above processing is repeated until a row in which all cells are text is reached again (step 355) or until the entire table has been processed (step 354). L is then updated to i (step 358), and control returns to the top of the loop.

When the loop ends, an input item name has been determined for each form part on the Web page template. The input item name of the form part being targeted is therefore stored in the explanation information recording unit together with the database table name and column name corresponding to that form part and the page name of the template (step 315).

As an example, when this method of the database analysis device 100 is applied to input item name specification for the input form shown in FIG. 10, the form part 401 in FIG. 10 is given the input item name "Order details/Reference URL". Similarly, the form part 402 is given "Order details/Quantity", the form part 403 is given "Delivery date", and the form part 404 is given "Special notes".

(Operation of output item name specifying unit 160)
Next, the operation of the output item name specifying unit 160 will be described with reference to FIGS. 11 and 12.
FIGS. 11 and 12 are a flowchart and a conceptual diagram showing the operation of the output item name specifying unit 160.
(1) First, the output item name specifying unit 160 specifies the variables that are output on a Web page (output variables) (S21). This can be done easily by searching the control graph from the entry point for functions that set output values for the Web page and identifying the variables passed to them as arguments. The identifier under which the output data is referenced on the Web page template can also be determined easily from the function name and similar information. For example, in struts, data referenced as "price" on a Web page template is passed through a function named setPrice(), so removing "set" from the function name yields the corresponding output data name (variable).
(2) Next, by comparison with the processing result of the processing flow analysis unit 140, the unit identifies, among the output variables output to each Web page, the variables affected by database search results (also called search-affected variables) and the column names involved (S22). This identifies the Web application output value storage area (storage area) for each search-affected variable (S23). The output item name specifying unit 160 then specifies the output item name corresponding to the identified storage area by analyzing the Web page template (S24). The Web page template selected for analysis is the one to which the execution result of the entry point currently under analysis is output. That is, the output item name corresponding to each output data item is specified by exactly the same method as in the input item name specifying unit 150 (FIGS. 7 to 9) (S24). In this case, in the description of the input item name specifying unit 150 (FIGS. 7 to 9), "form part" is read as "Web application output value storage area", and "output value storage area from the Web application" is read as "form part".
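The derivation of an output data name from a setter name in (1) can be sketched as below. The function name `output_name_from_setter` is hypothetical. Lowercasing the first remaining letter, so that setPrice gives "price", follows the usual JavaBeans property naming convention and is an assumption here, since the text only says that "set" is removed.

```python
def output_name_from_setter(function_name):
    """Return the output data name implied by a setter, or None."""
    prefix = "set"
    if not function_name.startswith(prefix) or len(function_name) == len(prefix):
        return None  # not a setter in the assumed naming style
    remainder = function_name[len(prefix):]
    # Assumed convention: the property name starts with a lowercase letter.
    return remainder[0].lower() + remainder[1:]


print(output_name_from_setter("setPrice"))
```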

When analysis of all entry points of the Web application program 101 is complete, the analysis result output unit 180 outputs the analysis results accumulated in the explanation information recording unit 170 to the user.
FIG. 13 shows the analysis result output from the explanation information recording unit 170. As shown in FIG. 13, the result consists of a table name 501, a column name 502, and, for each column, an input page name 503, an input item name 504, an output page name 505, and an output item name 506. In the table, no output page name or output item name is listed for column x2 of table2, which means that the data of that column never appeared on a Web page.
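One row of the result in FIG. 13 could be represented as in the following sketch. The class name `ColumnDescription`, the field names, and the sample page and item values are hypothetical; only the six kinds of information and the table2/x2 case come from the description.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ColumnDescription:
    """Hypothetical record for one row of the analysis result."""
    table_name: str
    column_name: str
    input_page: Optional[str] = None
    input_item: Optional[str] = None
    output_page: Optional[str] = None
    output_item: Optional[str] = None


def appears_on_page(record):
    """True when the column's data was found on some output page."""
    return record.output_page is not None and record.output_item is not None


# Column x2 of table2 has input information but no output information.
# The page and item names below are illustrative values only.
x2 = ColumnDescription("table2", "x2", input_page="order.jsp", input_item="Quantity")
print(appears_on_page(x2))
```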

(1) As described above, for the input/output data of the Web application that correspond to columns updated or referenced by database accesses occurring in the Web application, the corresponding input/output item names are identified by analyzing the Web page template. Explanation information that lets a user infer the meaning of each database column can therefore be output.
(2) When an input/output item appears inside a table on the Web page template, the cell that serves as its heading within the table is identified, so the item name can be identified correctly even for items that appear in tables.
(3) In identifying headings, attention is paid to the repetitive structure of the table, so headings spanning multiple rows can be handled.
(4) Cells merged across multiple rows are split row by row in advance, so tables containing multiple headings at different hierarchy levels can be handled.
(5) By identifying the largest possible repeated structure in the rows below the heading candidate rows, a heading of headings that does not appear in the rows containing the actual input/output items can be identified.
(6) When no repetitive structure is found in the identification process, heading candidate rows are removed one at a time from the bottom and the identification is retried, so tables in which rows unrelated to the headings, such as cautionary notes, are inserted can be handled.
(7) When no repetitive structure is found, the cells to the left of each form part are scanned and the first run of consecutive text cells found is regarded as the heading, so tables with headings laid out horizontally can be handled.
(8) Analysis is restarted separately from the point where a repetitive structure breaks off, so headings can be identified even in complex tables with multiple structures.
(9) When a heading consisting of multiple rows or multiple columns is identified, the heading texts are concatenated with a delimiter, yielding hierarchical input/output item names.

In Embodiment 1, the input/output items related to database references and updates are identified first, and the item names for those items are then specified on the Web page template. It is of course also possible to compute item names for all items on the Web page template to be analyzed as soon as it is determined that a database reference or update occurs.

When PHP is used as the Web application language, code is executed in order from the first line of the PHP script invoked for each URL, so no function needs to be identified. Furthermore, in PHP the processing program and the Web page template that displays the processing result are written in the same file, and the correspondence between a URL and the PHP script it invokes can be determined from the directory in which the PHP files are stored. As a result, for PHP the information that would otherwise be written in the setting data can be extracted automatically.

Embodiment 2.
Embodiment 2 will be described with reference to FIGS. 14 and 15. Embodiment 2 concerns the processing flow analysis unit 140, which is the operating entity in FIGS. 14 and 15. In Embodiment 1 above, the static analysis technique of the first method (FIG. 14) is applied to the processing flow analysis unit 140. The following describes a "second method" (FIG. 15) that speeds up the analysis of the first method by omitting processing that is unnecessary for identifying SQL statements.

First, an outline of the first method will be described with reference to FIG. 14. In the first method, the Web application program is analyzed and a control flow graph starting from the entry point is generated (step 601). A control flow graph is a graph representation of the execution paths of the instructions in a program. Next, for each character string operation (assignment, concatenation, replacement, and so on) on the control flow graph, a generation rule in the sense of formal language theory is recorded (steps 602 to 609).
(1) When the string operation is an assignment, a generation rule A → X is generated in which the name of the assigned variable is the non-terminal symbol and the source variable or string constant is on the right side (X is a variable name or a string constant) (steps 603, 608).
(2) When the string operation is a concatenation, a generation rule A → XY is generated in which the name of the variable receiving the concatenated string is the non-terminal symbol and the concatenated variables or string constants are on the right side (X and Y are variable names or string constants) (steps 604, 607).
(3) In other cases, that is, for an operation that modifies a character string, such as string replacement, the operation is regarded as transforming the generation rules of the string given as its input. The generation rules recorded for the input variable are transformed by a finite automaton called a transducer, which reflects the content of the string operation, and then a generation rule A ← X is generated (X is the input variable to the transformation function, or a string constant) (steps 605, 606).
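Cases (1) and (2) above can be illustrated with a small sketch that records generation rules and enumerates the strings they derive. Everything named here (`record_rule`, `expand`, the tagged-tuple symbol format, the sample variable names) is a hypothetical illustration, and the sketch assumes the recorded rules contain no cycles; loops in real programs would require a proper grammar representation rather than enumeration.

```python
from itertools import product


def record_rule(rules, target, right_side):
    """Record a generation rule target -> right_side.

    Each symbol is ("lit", text) for a string constant or
    ("var", name) for a variable, i.e. a non-terminal symbol.
    """
    rules.setdefault(target, []).append(list(right_side))


def expand(rules, symbol):
    """Return every string derivable from symbol (acyclic rules only)."""
    kind, value = symbol
    if kind == "lit":
        return {value}
    strings = set()
    for right_side in rules.get(value, []):
        choices = [expand(rules, part) for part in right_side]
        for combo in product(*choices):
            strings.add("".join(combo))
    return strings


rules = {}
# Two assignments reaching the same variable on different paths: A -> X.
record_rule(rules, "table", [("lit", "orders")])
record_rule(rules, "table", [("lit", "items")])
# One concatenation: A -> XY.
record_rule(rules, "sql", [("lit", "select * from "), ("var", "table")])
print(sorted(expand(rules, ("var", "sql"))))
```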

By repeating the above processing, a "grammar" can be defined for every string variable on the control flow graph. Finally, the grammar whose start symbol is the non-terminal symbol corresponding to the variable passed as an argument to the DB call function is compared with the grammar of SQL statements, which identifies the strings corresponding to the SQL command, the column names, and the table name. In addition, when the command is one that updates the database, such as insert or update, the group of variables related to the values written to each column of the database is identified (step 610).

In the first method, the processing corresponding to (3) above, that is, the grammar transformation by a transducer, has a high computational cost. Embodiment 2 therefore speeds up the analysis by keeping the processing of (3) to the necessary minimum. The processing of the second method in Embodiment 2 will be described with reference to FIG. 15.

The second method is the same as the first in that a control flow graph is generated and a generation rule is recorded for each string operation on the graph (steps 701 to 704, 707, and 708). It differs from the first method only when the string operation falls under (3). Whereas the first method transforms the generation rules for the variable given as input, the second method instead records a special generation rule A ← [dummy] ([dummy] is a notation of convenience; any identifiable symbol will do) (step 705). The information needed for the grammar transformation that would otherwise have been performed, namely the input variable name and the transducer required for the transformation, is recorded along with it (step 706).

After that, as in the first method, the grammar whose start symbol is the non-terminal symbol corresponding to the variable passed as an argument to the DB call function is compared with the grammar of SQL statements (step 710). At this point, it is checked whether [dummy] appears at a position where the grammar requires a column name or a table name (step 711). If it does, the input variable name and the transducer corresponding to that [dummy] are retrieved, the grammar transformation is performed (an example of restoring the regular generation rule), and the comparison with the SQL grammar is carried out again (step 712). This is repeated, and the processing ends once all positions corresponding to column names and table names have been obtained. Finally, when the command is one that updates the database, such as insert or update, the group of variables related to the values written to each column of the database is identified. By using the information that [dummy] depends on X, this can be done in the same way as in the first method even if the written values contain [dummy].
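The deferred transformation of the second method can be sketched as follows. This is a simplified illustration, not the method of the embodiment: a regular expression covering one INSERT statement shape stands in for the comparison with the SQL grammar, a plain Python function stands in for the transducer, and a concrete string stands in for the grammar. The names `defer` and `extract_names` are hypothetical.

```python
import re

# Simplified stand-in for the SQL grammar: one INSERT statement shape only.
INSERT_RE = re.compile(
    r"insert\s+into\s+(\S+)\s*\(([^)]*)\)\s*values\s*\((.*)\)\s*$",
    re.IGNORECASE,
)


def defer(deferred, source, transform):
    """Record a string transformation without running it; return a dummy."""
    key = f"[dummy{len(deferred)}]"
    deferred[key] = (source, transform)
    return key


def extract_names(sql, deferred):
    """Return (table, columns, number of transformations actually run)."""
    applied = 0
    while True:
        match = INSERT_RE.match(sql)
        if match is None:
            raise ValueError("unsupported statement shape")
        table, columns = match.group(1), match.group(2)
        # Only dummies in table-name or column-name positions matter.
        pending = [key for key in deferred if key in table or key in columns]
        if not pending:
            names = [c.strip() for c in columns.split(",")]
            return table, names, applied
        for key in pending:
            source, transform = deferred[key]
            sql = sql.replace(key, transform(source))  # restore the real text
            applied += 1


deferred = {}
table = defer(deferred, "Orders", str.lower)
value = defer(deferred, "O'Brien", lambda s: s.replace("'", "''"))
sql = "insert into " + table + " (name) values ('" + value + "')"
print(extract_names(sql, deferred))
```

In this sketch the dummy in the value position is never resolved, which mirrors why the second method avoids most of the transformation cost.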

As described above, a provisional generation rule is recorded for each string transformation operation, and the grammar transformation is performed only for the parts needed to extract column names and table names, so the processing flow analysis unit can be made faster. If string transformations were applied to every table name and column name, the second method would have to perform as many grammar transformations as the first. In general, however, string transformations are applied to the data recorded in the database or to search strings, and they are rarely applied to the names of the tables and columns being accessed. In most cases, therefore, applying the second method allows faster processing than the first method.

Embodiment 3.
Embodiment 3 will be described with reference to FIG. 16. In Embodiments 1 and 2 above, the Web application setting data must be in a form that the page processing function specifying unit 120 can analyze. In Embodiment 3, a framework specifying unit 310 identifies the framework that was used to develop the target Web application. Based on this identification, setting data that the page processing function specifying unit 120 can analyze is generated and input to it, so that various formats of Web application setting data can be supported. This embodiment is described below.

FIG. 16 is a configuration diagram of the database analysis device 300 according to Embodiment 3. The database analysis device 300 is the database analysis device 100 of Embodiment 1 shown in FIG. 1 with a framework specifying unit 310, a framework specifying rule storage unit 320, and a setting data generation function storage unit 330 added.

The framework specifying unit 310 examines the files in the directory given by the Web application storage directory name 105-3 entered by the user, and identifies the name of the framework used to develop the Web application from the names of the stored files and similar information. The rules stored in the framework specifying rule storage unit 320 are used for this identification. Once the framework is identified, the framework specifying unit 310 calls, from among the setting data generation functions 331 held in the setting data generation function storage unit 330, the one that corresponds to the identified framework. The setting data generation function 331 extracts the information needed for the setting data from the files in the Web application storage directory in a manner specific to each framework, and inputs it to the page processing function specifying unit 120.

As described above, the framework specifying unit 310 examines the Web application storage directory and identifies the framework. The setting data generation function 331 then extracts information from the files in the Web application storage directory in the manner specific to that framework and generates the Web application setting data, which has the effect that various frameworks can be supported.
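Rule-based framework identification from file names could look like the sketch below. The rule table, the function names, and the marker files are illustrative assumptions; the description does not state what the rules in the framework specifying rule storage unit 320 actually contain. `struts-config.xml` is used as a marker because it is the conventional configuration file name of Struts 1.

```python
import os

# Hypothetical rules: (marker file name, framework name), checked in order.
FRAMEWORK_RULES = [
    ("struts-config.xml", "struts"),
    ("index.php", "php"),
]


def identify_framework(file_names, rules=FRAMEWORK_RULES):
    """Return the first framework whose marker file is present, else None."""
    present = set(file_names)
    for marker, framework in rules:
        if marker in present:
            return framework
    return None


def list_file_names(app_dir):
    """Collect the names of all files under the application directory."""
    names = []
    for _root, _dirs, files in os.walk(app_dir):
        names.extend(files)
    return names


print(identify_framework(["web.xml", "struts-config.xml"]))
```

A caller would pass `list_file_names(app_dir)` to `identify_framework` and then dispatch to a framework-specific setting data generator keyed by the returned name.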

Embodiment 4.
Embodiment 4 will be described with reference to FIGS. 17 and 18. Embodiment 4 describes the hardware configuration of the database analysis devices 100 and 300, which are computers.
FIG. 17 is a diagram showing an example of the appearance of the database analysis devices 100 and 300.
FIG. 18 is a diagram showing an example of the hardware resources of the database analysis devices 100 and 300.
Since the database analysis device 100 and the database analysis device 300 are similar in this respect, the database analysis device 100 is described as an example.

In FIG. 17, which shows the appearance, the database analysis device 100 includes hardware resources such as a system unit 830, a display device 813 having a CRT (Cathode Ray Tube) or LCD (liquid crystal display) screen, a keyboard 814 (K/B), a mouse 815, and a compact disk device 818 (CDD: Compact Disk Drive), which are connected by cables and signal lines. The system unit 830 is connected to a network.

In FIG. 18, which shows the hardware resources, the database analysis device 100 includes a CPU 810 (Central Processing Unit) that executes programs. The CPU 810 is connected via a bus 825 to a ROM (Read Only Memory) 811, a RAM (Random Access Memory) 812, the display device 813, the keyboard 814, the mouse 815, a communication board 816, the CDD 818, and a magnetic disk device 820, and controls these hardware devices. A storage device such as an optical disk device or a flash memory may be used instead of the magnetic disk device 820.

The RAM 812 is an example of a volatile memory. Storage media such as the ROM 811, the CDD 818, and the magnetic disk device 820 are examples of nonvolatile memories. These are examples of a "storage device", or of a memory unit, a storage unit, a recording unit, or a buffer. The communication board 816, the keyboard 814, and the like are examples of an input unit or input device. The communication board 816, the display device 813, and the like are examples of an output unit or output device. The communication board 816 is connected to the network.

The magnetic disk device 820 stores an operating system 821 (OS), a window system 822, a program group 823, and a file group 824. The programs in the program group 823 are executed by the CPU 810, the operating system 821, and the window system 822.

The program group 823 stores programs that execute the functions described as "~ unit" in the description of the above embodiments. The programs are read and executed by the CPU 810.

The file group 824 stores, as items of "~ file" or "~ database", the information described in the above embodiments as "determination result of ~", "calculation result of ~", "extraction result of ~", "generation result of ~", or "processing result of ~", as well as data, signal values, variable values, and parameters. A "~ file" or "~ database" is stored on a recording medium such as a disk or a memory. The information, data, signal values, variable values, and parameters stored on such a medium are read into main memory or cache memory by the CPU 810 via a read/write circuit and are used in CPU operations such as extraction, search, reference, comparison, operation, calculation, processing, output, printing, and display. During these CPU operations, the information, data, signal values, variable values, and parameters are temporarily held in main memory, cache memory, or buffer memory.

In the description of the above embodiments, data and signal values are recorded on recording media such as the memory of the RAM 812, the compact disk of the CDD 818, the magnetic disk of the magnetic disk device 820, and other media such as optical disks, mini disks, and DVDs (Digital Versatile Disks). Data and signals are transmitted online via the bus 825, signal lines, cables, and other transmission media.

In the description of the above embodiments, what is described as a "~ unit" may also be a "~ means", or a "~ step", "~ procedure", or "~ process". That is, what is described as a "~ unit" may be implemented by software alone, by a combination of software and hardware, or further in combination with firmware. Firmware and software are stored as programs on recording media such as magnetic disks, flexible disks, optical disks, compact disks, mini disks, and DVDs. A program is read and executed by the CPU 810. In other words, the program causes a computer to function as the "~ unit" described above, or causes a computer to execute the procedure or method of the "~ unit" described above.

The above embodiments have described a database analysis device, but the operation of each "~ unit" of the database analysis device can also be understood as a database analysis program that causes a computer to execute that operation. Alternatively, the operation of each "~ unit" of the database analysis device can be understood as a database analysis method.

以上の実施の形態では以下を特徴とするデータベース解析装置を説明した。
（１）データベース解析装置は、解析対象のデータベースを使用するＷｅｂアプリケーションのソースコードおよび設定ファイル、Ｗｅｂページテンプレートを入力とし、Ｗｅｂアプリケーションの各ページ上の入力フォーム部品を特定し、同フォーム部品に対応する入力項目名を特定して、同入力データによって更新されるデータベースカラムの説明情報として出力する。
（２）データベース解析装置は、さらに、データベースのあるカラムから得られたデータが出力されるＷｅｂページ上の出力項目を特定し、同項目に対応する出力項目名を特定して、当該カラムの説明情報として出力する。
（３）データベース解析装置は、入力／出力項目名の特定を行うにあたり、入力／出力項目がテーブルに含まれていた場合には、テーブルの各項目に対する見出しに相当する部分を、テーブル内の繰り返し構造から特定し、特定された見出しを入力／出力項目名とする。さらにデータベース解析装置は、以下の（ａ）〜（ｅ）を特徴とする。
（ａ）データベース解析装置は、複数行にわたり連結されているセルを、あらかじめ行毎に分解しておくことで、階層の異なる複数の見出しを含んだ表にも対応できる。
（ｂ）データベース解析装置は、見出し候補行の下の行からなるべく大きな構造の繰り返しを特定することで、実際の入出力項目を含んだ行には現れない、見出しの見出しを特定することができる。
（ｃ）データベース解析装置は、繰り返し構造を特定する処理において、繰り返し構造が見出せなかった場合、見出し候補行の下から順に取り除いて再び繰り返し構造の特定処理を実施することで、注意書き等、見出しとは無関係な行が挿入されている場合に対応する。
（ｄ）データベース解析装置は、繰り返し構造が現れなかった行に対しては左側のセルを走査していき、最初に見つかる連続したテキストセルを見出しとみなす。
（ｅ）データベース解析装置は、繰り返し構造が途切れた箇所から別途解析を再開することで、複数の構造を持った複雑な表に対しても見出しを特定することができる。
（ｆ）データベース解析装置は、複数行、又は複数列からなる見出しが特定された場合には、見出しのテキストを区切り文字で連結することで、階層化された入出力項目名とする。
（４）データベース解析装置は、Ｗｅｂアプリケーションプログラム内の文字列操作を静的解析で追跡し、ＤＢ呼び出し関数に渡されるＳＱＬ文を特定する際に、解析に時間のかかる文字列変換に対する追跡処理は省略し、テーブル名、カラム名を特定するために最低限必要な箇所のみ上記追跡処理を実施する。
（５）データベース解析装置は、Ｗｅｂアプリケーションが使用している開発フレームワークを特定することで、Ｗｅｂアプリケーション格納ディレクトリから、解析に必要なＷｅｂアプリケーション設定データを抽出する。 In the above embodiment, the database analysis device having the following features has been described.
(1) The database analysis device takes as input the source code and setting file of a Web application that uses the database to be analyzed, and the Web page template, identifies input form parts on each page of the Web application, and supports the form parts The input item name to be specified is specified and output as the description information of the database column updated by the input data.
(2) The database analysis device further specifies an output item on the Web page to which data obtained from a certain column of the database is output, specifies an output item name corresponding to the item, and describes the column. Output as information.
(3) When the input / output item name is specified in the table when the input / output item name is specified, the database analyzer repeats the portion corresponding to the heading for each item of the table in the table. It is specified from the structure, and the specified heading is used as the input / output item name. Furthermore, the database analysis device is characterized by the following (a) to (e).
(A) The database analysis device can handle a table containing headings at several hierarchical levels by first splitting every cell that spans multiple rows into one cell per row.
(B) The database analysis device can specify a heading that does not appear in the row containing the actual input/output item by finding the largest possible repeated structure in the rows below the heading candidate rows.
(C) When no repeated structure is found during the repeated-structure identification process, the database analysis device removes heading candidate rows one at a time from the bottom and runs the identification process again. This handles the case where a row unrelated to the headings, such as a cautionary note, has been inserted.
(D) For a row in which no repeated structure appears, the database analysis device scans the cells to the left and regards the first run of consecutive text cells it finds as the heading.
(E) The database analysis device can specify headings even for a complex table containing several structures by restarting the analysis separately from the point where the repeated structure is interrupted.
(F) When a heading consisting of multiple rows or multiple columns is specified, the database analysis device concatenates the heading texts with a delimiter to obtain a hierarchical input/output item name.
(4) The database analysis device tracks string operations in the Web application program by static analysis. When specifying the SQL statement passed to the DB call function, it omits the tracking process for string conversions that are time-consuming to analyze, and performs that tracking only at the minimum number of places needed to specify the table name and column names.
(5) The database analysis device identifies the development framework used by the Web application and thereby extracts the Web application setting data necessary for the analysis from the Web application storage directory.
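
Features (3)(A) and (3)(F) can be illustrated with a minimal sketch. It is not the implementation of the embodiment: the cell model `(text, rowspan)`, the function names `expand_rowspans` and `hierarchical_name`, and the `/` delimiter are all assumptions introduced here for illustration. The sketch first copies each row-spanning cell into every row it covers, then joins the heading texts found to the left of an input item into one hierarchical item name.

```python
from typing import Dict, List, Tuple

# Hypothetical cell model: (text, rowspan). The source does not prescribe one.
Cell = Tuple[str, int]


def expand_rowspans(rows: List[List[Cell]]) -> List[List[str]]:
    """Copy each cell that spans several rows into every row it covers."""
    grid: List[List[str]] = []
    pending: Dict[int, Tuple[str, int]] = {}  # column -> (text, rows still to fill)
    for row in rows:
        queue = list(row)
        out: List[str] = []
        col = 0
        # Continue while this row has cells left or a spanning cell from an
        # earlier row still has to be placed at this column or further right.
        while queue or any(c >= col for c in pending):
            if col in pending:
                text, left = pending.pop(col)
                if left > 1:
                    pending[col] = (text, left - 1)
            elif queue:
                text, span = queue.pop(0)
                if span > 1:
                    pending[col] = (text, span - 1)
            else:
                text = ""  # gap before a spanning cell further to the right
            out.append(text)
            col += 1
        grid.append(out)
    return grid


def hierarchical_name(row: List[str], item_col: int, delimiter: str = "/") -> str:
    """Join the non-empty heading texts left of the item (delimiter is assumed)."""
    parts = [text.strip() for text in row[:item_col] if text.strip()]
    return delimiter.join(parts)


# Hypothetical form table: "Address" spans two rows of sub-headings.
form_table = [
    [("Address", 2), ("Postal code", 1), ("<input zip>", 1)],
    [("Street", 1), ("<input street>", 1)],
]
grid = expand_rowspans(form_table)
names = [hierarchical_name(row, item_col=2) for row in grid]
print(names)  # ['Address/Postal code', 'Address/Street']
```

After the expansion every row carries its full chain of headings, so the name for each input item can be read from that row alone. Column-spanning cells and the repeated-structure search of (B) to (E) are deliberately left out of this sketch.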

１００，３００データベース解析装置、１０１Ｗｅｂアプリケーションプログラム、１０２Ｗｅｂアプリ設定データ、１０３Ｗｅｂページテンプレート、１０４解析結果、１０５−３Ｗｅｂアプリ格納ディレクトリ名、１１０Ｗｅｂアプリケーション入力部、１２０ページ処理関数特定部、１３０ＤＢ呼び出し関数特定部、１４０処理フロー解析部、１５０入力項目名特定部、１６０出力項目名特定部、１７０説明情報記録部、１８０解析結果出力部、１０１０特定処理実行部、３１０フレームワーク特定部、３２０フレームワーク特定ルール蓄積部、３３０設定データ生成機能蓄積部、３３１設定データ生成機能。 100, 300 Database analysis device, 101 Web application program, 102 Web application setting data, 103 Web page template, 104 Analysis result, 105-3 Web application storage directory name, 110 Web application input unit, 120 Page processing function identification unit, 130 DB call function identification unit, 140 processing flow analysis unit, 150 input item name identification unit, 160 output item name identification unit, 170 description information recording unit, 180 analysis result output unit, 1010 identification process execution unit, 310 framework identification unit, 320 framework specific rule storage unit, 330 setting data generation function storage unit, 331 setting data generation function.

Claims

A database analysis device comprising:
an input unit that receives as input a Web application program that uses a database to be analyzed, a Web application setting file describing predetermined Web application setting information used to configure the Web application program, and a Web page template indicating template information used when the Web application program generates a Web page;
a specifying process execution unit that, based on the Web application program, the Web application setting file, and the Web page template input to the input unit, specifies an input form component that is used to generate an input form displayed on a Web page generated by the Web application program and that is associated with an input item name corresponding to a column of a table of the database to be analyzed, and that specifies the input item name corresponding to the specified input form component by analyzing the Web page template; and
an output unit that outputs information including at least the input item name specified by the specifying process execution unit as description information of the column updated by input data entered into the input form displayed on the Web page.

The database analysis device according to claim 1, wherein
the specifying process execution unit, based on the Web application program, the Web application setting file, and the Web page template, specifies a storage area of an output item in the Web page to which data acquired from a column of a table of the database is output, the storage area being associated with an output item name corresponding to the column from which the data is acquired, and specifies the output item name corresponding to the specified storage area by analyzing the Web page template, and
the output unit outputs information including at least the output item name specified by the specifying process execution unit as description information of the column from which the data is acquired.

The database analysis device according to claim 2, wherein
the specifying process execution unit, when the input item is contained in a table, extracts a plurality of heading candidates that may correspond to the heading for the input item of the table, determines whether a repeated structure appears among the extracted heading candidates, and, when it determines that a repeated structure appears, specifies the heading for the input item of the table from the plurality of heading candidates based on the repeated structure and adopts the specified heading as the input item name.

The database analysis device according to claim 3, wherein
the specifying process execution unit, when the output item is contained in a table, extracts a plurality of heading candidates that may correspond to the heading for the output item of the table, determines whether a repeated structure appears among the extracted heading candidates, and, when it determines that a repeated structure appears, specifies the heading for the output item of the table from the plurality of heading candidates based on the repeated structure and adopts the specified heading as the output item name.

The database analysis device according to claim 3, wherein
the specifying process execution unit, when it determines that no repeated structure appears, executes a removal process that removes at least one heading candidate from the plurality of heading candidates based on a predetermined criterion, and determines whether a repeated structure appears among the heading candidates remaining after the removal process.

The database analysis device according to any one of claims 1 to 5, wherein
the specifying process execution unit executes an SQL statement specifying process that tracks string operations in the Web application program by static analysis and specifies the SQL statement passed to a database call function; when it determines, in the SQL statement specifying process, that a string operation that is neither an assignment operation nor a concatenation operation is performed, records for that string operation a temporary generation rule that differs from the normal generation rule and can be restored to the normal generation rule; and restores the temporary generation rule to the normal generation rule only when the normal generation rule is needed to specify a table name and a column name based on the SQL statement.

The database analysis device according to claim 1, wherein
the input unit receives storage directory information indicating the storage directory of the Web application program, and
the database analysis device further comprises
a unit that specifies the framework used by the Web application program by referring to the storage directory information input to the input unit, and extracts the Web application setting information from the storage directory of the Web application program.

A database analysis program that causes a computer to execute:
a process of receiving as input a Web application program that uses a database to be analyzed, a Web application setting file describing predetermined Web application setting information used to configure the Web application program, and a Web page template indicating template information used when the Web application program generates a Web page;
a process of specifying, based on the input Web application program, Web application setting file, and Web page template, an input form component that is used to generate an input form displayed on a Web page generated by the Web application program and that is associated with an input item name corresponding to a column of a table of the database to be analyzed, and of specifying the input item name corresponding to the specified input form component by analyzing the Web page template; and
a process of outputting information including at least the specified input item name as description information of the column updated by input data entered into the input form displayed on the Web page.