JP6568021B2

JP6568021B2 - Logical relationship recognition apparatus, logical relationship recognition method, and logical relationship recognition program

Info

Publication number: JP6568021B2
Application number: JP2016138932A
Authority: JP
Inventors: 郁子高木; 光一山田; 勉丸山
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2016-07-13
Filing date: 2016-07-13
Publication date: 2019-08-28
Anticipated expiration: 2036-07-13
Also published as: JP2018010489A

Description

The present invention relates to a logical relationship recognition apparatus, a logical relationship recognition method, and a logical relationship recognition program.

Operation support systems (OpS) have been introduced to handle business processes in many companies. Work that an OpS cannot support, however, is run in each organization's own way, and forms are used to circulate information in that work. Because a form is semi-structured data, the meaning of each value (its logical relationship) is understandable to a person but difficult for a machine to interpret, so using the data on forms imposes a heavy manual workload. To address this, techniques are known that automatically recognize, from the arrangement of ruled-line frames, the logical relationships between the item names of a form and the logical relationships between item names and item values (see, for example, Patent Document 1 and Non-Patent Documents 1 to 3).

Patent Document 1: JP 2016-004417 A

Non-Patent Document 1: Takagi et al., "Examination of Extraction Techniques for Cross-sectional Data Manipulation Techniques for Electronic Forms", IEICE Technical Report, vol. 114, no. 150, pp. 1-6 (2014)
Non-Patent Document 2: Takagi et al., "Investigation of Visual Representation of Forms for Linking Electronic Form Data", Proceedings of the IEICE Communications Society Conference (2015)
Non-Patent Document 3: Takagi et al., "Examination of Data Structure Conversion Method of Electronic Forms Using Visual Expression", IEICE Technical Report, vol. 115, no. 409, pp. 25-30 (2016)

However, the conventional techniques have a problem. When a nested structure contains, or a form mixes, vertical enumerations, horizontal enumerations, combined vertical and horizontal enumerations, vertical tables, horizontal tables, orthogonal tables, nested enumerations, nested vertical tables, nested horizontal tables, or nested orthogonal tables, the logical relationships between the item names of the form and the logical relationships between item names and item values may not be recognized accurately.

For example, a table made up of 4 rows × 3 columns of ruled-line frames may be a vertical table, in which the three frames in the top row are item names; a horizontal table, in which the four frames in the leftmost column are item names; or an orthogonal table, in which both the three frames in the top row and the four frames in the leftmost column are item names. In such a case, it is difficult for the conventional techniques to accurately recognize the logical relationships between item names and the logical relationships between item names and item values.

For example, in a vertical enumeration, several ruled-line frames of the same width that correspond to item names may be arranged in the vertical direction. Each item name also has an item value, and the frame corresponding to the item value has the same width as the frame corresponding to the item name. In such a case, the conventional techniques cannot determine which frames are item names, so it is difficult to accurately recognize the logical relationships between item names and the logical relationships between item names and item values.

Furthermore, when a nested structure contains, for example, enumerations and tables that are themselves expressed as nested structures, it is difficult for the conventional techniques to accurately recognize the logical relationships between item names and item values.

The logical relationship recognition apparatus of the present invention comprises:
an extraction unit that, based on a graph in which information on regions representing item names or item values of a form is represented as nodes and adjacency relationships between the nodes are represented as edges, extracts, from the nodes, nodes that satisfy a preset condition as item name nodes, which are nodes of regions representing item names;
a table classification unit that, based on the edges, extracts from the item name nodes an origin node, which is a node serving as the origin of a table, and classifies the table starting at the origin node as one of an orthogonal table, which has item names at both its top edge and its left edge, a vertical table, which has item names at its top edge, and a horizontal table, which has item names at its left edge;
an orthogonal table acquisition unit that acquires the vertical logical relationships between the item name nodes in the orthogonal table, the horizontal logical relationships between the item name nodes in the orthogonal table, the vertical logical relationships in the orthogonal table between the item name nodes and item value nodes, which are nodes other than the item name nodes, and the horizontal logical relationships in the orthogonal table between the item name nodes and the item value nodes;
a vertical table acquisition unit that acquires the vertical logical relationships between the item name nodes in the vertical table and the vertical logical relationships between the item name nodes and the item value nodes in the vertical table;
a horizontal table acquisition unit that acquires the horizontal logical relationships between the item name nodes in the horizontal table and the horizontal logical relationships between the item name nodes and the item value nodes in the horizontal table;
a first specifying unit that specifies the tables to be synthesized by excluding, from the orthogonal table, the vertical table, and the horizontal table, any table that satisfies a predetermined condition indicating that the table is inconsistent;
a first deletion unit that, when a plurality of nodes are adjacent to one node in a predetermined direction, deletes the edges representing the adjacency relationships between the one node and the plurality of nodes;
a second deletion unit that, for an item name node to which an item value node, which is a node of a region representing an item value, is adjacent in a predetermined direction, deletes the edge representing the adjacency relationship between that item name node and the item value node;
a first acquisition unit that acquires the logical relationships between the item name nodes and the item value nodes based on the graph from which edges have been deleted by the first deletion unit and the second deletion unit;
a second acquisition unit that acquires the inclusion relationships between the item name nodes based on the graph from which edges have been deleted by the first deletion unit;
a second specifying unit that specifies the logical relationships of the enumerations to be synthesized by excluding, from the logical relationships acquired by the first acquisition unit, the logical relationships concerning nodes included in the tables to be synthesized;
a first synthesis unit that creates first tree-structure data based on those inclusion relationships acquired by the second acquisition unit that are included in neither the logical relationships of the tables to be synthesized nor the logical relationships of the enumerations to be synthesized;
a second synthesis unit that creates tree-structure data based on the logical relationships of the tables to be synthesized and creates second tree-structure data by combining that tree-structure data with the first tree-structure data; and
a third synthesis unit that creates tree-structure data based on the logical relationships of the enumerations to be synthesized and creates third tree-structure data by combining that tree-structure data with the second tree-structure data.
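The behavior attributed to the first deletion unit can be illustrated with a minimal Python sketch. It assumes the graph is held as a dictionary that maps each node name to its neighbors per direction; the function name `delete_multi_adjacent_edges`, the direction labels, and the dictionary layout are hypothetical and are not taken from the specification, which does not state here which direction is the predetermined one.

```python
def delete_multi_adjacent_edges(edges, direction):
    """Return a copy of the graph in which every node that has two or more
    neighbors in `direction` loses all of its edges in that direction.

    edges: {node: {direction_label: [neighbor, ...]}}  (assumed layout)
    """
    pruned = {}
    for node, by_direction in edges.items():
        pruned[node] = {}
        for label, neighbors in by_direction.items():
            if label == direction and len(neighbors) > 1:
                # One node faces several nodes: drop all of these edges.
                pruned[node][label] = []
            else:
                pruned[node][label] = list(neighbors)
    return pruned


# Hypothetical graph: frame A spans two frames (C and D) below it.
graph = {
    "A": {"right": ["B"], "below": ["C", "D"]},
    "C": {"right": ["D"], "below": []},
}
pruned = delete_multi_adjacent_edges(graph, "below")
```

Returning a copy rather than editing in place keeps the original graph available, which matters because the second acquisition unit is described as working on the graph after only the first deletion.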

The logical relationship recognition method of the present invention is executed by a logical relationship recognition apparatus and comprises:
an extraction step of, based on a graph in which information on regions representing item names or item values of a form is represented as nodes and adjacency relationships between the nodes are represented as edges, extracting, from the nodes, nodes that satisfy a preset condition as item name nodes, which are nodes of regions representing item names;
a table classification step of, based on the edges, extracting from the item name nodes an origin node, which is a node serving as the origin of a table, and classifying the table starting at the origin node as one of an orthogonal table, which has item names at both its top edge and its left edge, a vertical table, which has item names at its top edge, and a horizontal table, which has item names at its left edge;
an orthogonal table acquisition step of acquiring the vertical logical relationships between the item name nodes in the orthogonal table, the horizontal logical relationships between the item name nodes in the orthogonal table, the vertical logical relationships in the orthogonal table between the item name nodes and item value nodes, which are nodes other than the item name nodes, and the horizontal logical relationships in the orthogonal table between the item name nodes and the item value nodes;
a vertical table acquisition step of acquiring the vertical logical relationships between the item name nodes in the vertical table and the vertical logical relationships between the item name nodes and the item value nodes in the vertical table;
a horizontal table acquisition step of acquiring the horizontal logical relationships between the item name nodes in the horizontal table and the horizontal logical relationships between the item name nodes and the item value nodes in the horizontal table;
a first specifying step of specifying the tables to be synthesized by excluding, from the orthogonal table, the vertical table, and the horizontal table, any table that satisfies a predetermined condition indicating that the table is inconsistent;
a first deletion step of, when a plurality of nodes are adjacent to one node in a predetermined direction, deleting the edges representing the adjacency relationships between the one node and the plurality of nodes;
a second deletion step of, for an item name node to which an item value node, which is a node of a region representing an item value, is adjacent in a predetermined direction, deleting the edge representing the adjacency relationship between that item name node and the item value node;
a first acquisition step of acquiring the logical relationships between the item name nodes and the item value nodes based on the graph from which edges have been deleted in the first deletion step and the second deletion step;
a second acquisition step of acquiring the inclusion relationships between the item name nodes based on the graph from which edges have been deleted in the first deletion step;
a second specifying step of specifying the logical relationships of the enumerations to be synthesized by excluding, from the logical relationships acquired in the first acquisition step, the logical relationships concerning nodes included in the tables to be synthesized;
a first synthesis step of creating first tree-structure data based on those inclusion relationships acquired in the second acquisition step that are included in neither the logical relationships of the tables to be synthesized nor the logical relationships of the enumerations to be synthesized;
a second synthesis step of creating tree-structure data based on the logical relationships of the tables to be synthesized and creating second tree-structure data by combining that tree-structure data with the first tree-structure data; and
a third synthesis step of creating tree-structure data based on the logical relationships of the enumerations to be synthesized and creating third tree-structure data by combining that tree-structure data with the second tree-structure data.

The logical relationship recognition program of the present invention causes a computer to execute:
an extraction step of, based on a graph in which information on regions representing item names or item values of a form is represented as nodes and adjacency relationships between the nodes are represented as edges, extracting, from the nodes, nodes that satisfy a preset condition as item name nodes, which are nodes of regions representing item names;
a table classification step of, based on the edges, extracting from the item name nodes an origin node, which is a node serving as the origin of a table, and classifying the table starting at the origin node as one of an orthogonal table, which has item names at both its top edge and its left edge, a vertical table, which has item names at its top edge, and a horizontal table, which has item names at its left edge;
an orthogonal table acquisition step of acquiring the vertical logical relationships between the item name nodes in the orthogonal table, the horizontal logical relationships between the item name nodes in the orthogonal table, the vertical logical relationships in the orthogonal table between the item name nodes and item value nodes, which are nodes other than the item name nodes, and the horizontal logical relationships in the orthogonal table between the item name nodes and the item value nodes;
a vertical table acquisition step of acquiring the vertical logical relationships between the item name nodes in the vertical table and the vertical logical relationships between the item name nodes and the item value nodes in the vertical table;
a horizontal table acquisition step of acquiring the horizontal logical relationships between the item name nodes in the horizontal table and the horizontal logical relationships between the item name nodes and the item value nodes in the horizontal table;
a first specifying step of specifying the tables to be synthesized by excluding, from the orthogonal table, the vertical table, and the horizontal table, any table that satisfies a predetermined condition indicating that the table is inconsistent;
a first deletion step of, when a plurality of nodes are adjacent to one node in a predetermined direction, deleting the edges representing the adjacency relationships between the one node and the plurality of nodes;
a second deletion step of, for an item name node to which an item value node, which is a node of a region representing an item value, is adjacent in a predetermined direction, deleting the edge representing the adjacency relationship between that item name node and the item value node;
a first acquisition step of acquiring the logical relationships between the item name nodes and the item value nodes based on the graph from which edges have been deleted in the first deletion step and the second deletion step;
a second acquisition step of acquiring the inclusion relationships between the item name nodes based on the graph from which edges have been deleted in the first deletion step;
a second specifying step of specifying the logical relationships of the enumerations to be synthesized by excluding, from the logical relationships acquired in the first acquisition step, the logical relationships concerning nodes included in the tables to be synthesized;
a first synthesis step of creating first tree-structure data based on those inclusion relationships acquired in the second acquisition step that are included in neither the logical relationships of the tables to be synthesized nor the logical relationships of the enumerations to be synthesized;
a second synthesis step of creating tree-structure data based on the logical relationships of the tables to be synthesized and creating second tree-structure data by combining that tree-structure data with the first tree-structure data; and
a third synthesis step of creating tree-structure data based on the logical relationships of the enumerations to be synthesized and creating third tree-structure data by combining that tree-structure data with the second tree-structure data.

According to the present invention, the logical relationships between the item names of a form and the logical relationships between item names and item values can be recognized accurately even when a nested structure contains, or a form mixes, vertical enumerations, horizontal enumerations, combined vertical and horizontal enumerations, vertical tables, horizontal tables, orthogonal tables, nested enumerations, nested vertical tables, nested horizontal tables, or nested orthogonal tables.

FIG. 1 is a diagram for explaining an overview of the logical relationship recognition process.
FIG. 2 is a diagram showing an example of a form in which vertical tables are included in a horizontal arrangement.
FIG. 3 is a diagram showing an example of a form in which vertical tables are included in a horizontal arrangement.
FIG. 4 is a diagram showing an example of a form in which vertical tables are included in a vertical arrangement.
FIG. 5 is a diagram showing an example of a form in which an orthogonal table is included in a nested structure.
FIG. 6 is a diagram showing an example of the configuration of the logical relationship recognition apparatus according to the first embodiment.
FIG. 7 is a diagram for explaining the acquisition of adjacent nodes.
FIG. 8 is an example of a table list of a horizontal table.
FIG. 9 is an example of a table list of an orthogonal table.
FIG. 10 is an example of a table list of a horizontal table.
FIG. 11 is an example of an enumeration list.
FIG. 12 is a diagram for explaining the inclusion graph.
FIG. 13 is a diagram for explaining the inclusion graph.
FIG. 14 is an example of tree-structure data.
FIG. 15 is a flowchart showing the flow of the process of generating a form graph.
FIG. 16 is a flowchart showing the flow of the process of acquiring nodes.
FIG. 17 is a flowchart showing the flow of the process of generating nodes.
FIG. 18 is a flowchart showing the flow of the process of acquiring adjacency relationships.
FIG. 19 is a flowchart showing the flow of the process of acquiring a group of adjacent nodes.
FIG. 20 is a flowchart showing the flow of the process of generating a style graph.
FIG. 21 is a flowchart showing the flow of processing by the extraction unit.
FIG. 22 is a flowchart showing the flow of processing by the analysis unit.
FIG. 23 is a flowchart showing the flow of processing by the table classification unit.
FIG. 24 is a flowchart showing the flow of processing by the vertical table acquisition unit.
FIG. 25 is a flowchart showing the flow of processing by the horizontal table acquisition unit.
FIG. 26 is a flowchart showing the flow of processing by the orthogonal table acquisition unit.
FIG. 27 is a flowchart showing the flow of the process of acquiring vertical logical relationships.
FIG. 28 is a flowchart showing the flow of the process of acquiring horizontal logical relationships.
FIG. 29 is a flowchart showing the flow of processing by the table matching unit.
FIG. 30 is a flowchart showing the flow of processing by the first deletion unit.
FIG. 31 is a flowchart showing the flow of processing by the second deletion unit.
FIG. 32 is a flowchart showing the flow of processing by the enumeration classification unit.
FIG. 33 is a flowchart showing the flow of processing by the vertical enumeration acquisition unit.
FIG. 34 is a flowchart showing the flow of processing by the horizontal enumeration acquisition unit.
FIG. 35 is a flowchart showing the flow of processing by the inclusion relationship acquisition unit.
FIG. 36 is a flowchart showing the flow of the process of acquiring right-side inclusion relationships.
FIG. 37 is a flowchart showing the flow of the process of acquiring lower-side inclusion relationships.
FIG. 38 is a flowchart showing the flow of processing by the inclusion graph generation unit.
FIG. 39 is a flowchart showing the flow of processing by the enumeration matching unit.
FIG. 40 is a flowchart showing the flow of processing by the inter-item-name synthesis unit.
FIG. 41 is a flowchart showing the flow of processing by the table synthesis unit.
FIG. 42 is a flowchart showing the flow of processing by the enumeration synthesis unit.
FIG. 43 is a flowchart showing the flow of processing by the adding unit.
FIG. 44 is a diagram for explaining other embodiments.
FIG. 45 is a diagram showing an example of a computer that implements the logical relationship recognition apparatus by executing a program.

Embodiments of the logical relationship recognition apparatus, logical relationship recognition method, and logical relationship recognition program according to the present application are described in detail below with reference to the drawings. The present invention is not limited by these embodiments.

First, an overview of the logical relationship recognition process performed by a logical relationship recognition system that includes the logical relationship recognition apparatus is described with reference to FIG. 1. FIG. 1 is a diagram for explaining an overview of the logical relationship recognition process. As shown in FIG. 1, the logical relationship recognition system first reads a tabular form from a PC or the like (step S1). The data read at this point is not limited to forms; any semi-structured data will do, such as a Web GUI, a system GUI, or a table structure in an image. Next, the logical relationship recognition system acquires information on the ruled-line frames from the form it has read (step S2). The logical relationship recognition system may also use OCR (optical character recognition) to acquire the written content from an image of a paper form read by a scanner or the like (step S3).

The logical relationship recognition system then generates a style graph based on ruled-line frame information and item name definition information (step S4). The ruled-line frame information includes visual information such as the coordinates of each ruled-line frame, the character string inside the frame, the fill color of the frame, and the type, thickness, and color of the ruled lines. The item name definition information contains the conditions used to judge a ruled-line frame to be an item name.

For example, item name definition information that treats ruled-line frames colored yellow as item names is written as follows.
    if {node:{color:#FFFF00}} then item
A definition that treats ruled-line frames whose character string is not empty as item names is written as follows.
    if {node:{!string:null}} then item
A definition that treats ruled-line frames whose color is not white and whose character string is not empty as item names is written as follows.
    if {node:{!color:white},{!string:null}} then item
A definition that treats ruled-line frames whose character string is "y1" as item names is written as follows.
    if {node:{string:"y1"}} then item
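To make the meaning of these definitions concrete, the following Python sketch evaluates them against a single ruled-line frame. It is an illustration under stated assumptions, not the evaluation procedure of the specification: the frame is assumed to be a dictionary with `color` and `string` keys, `null` is assumed to mean an empty or missing value, values are compared as literal strings, and the names `parse_rule` and `is_item_name` are hypothetical.

```python
import re

# One condition has the form {key:value} or {!key:value}; "!" negates it.
# The outer {node:...} wrapper is skipped because it contains nested braces.
CONDITION = re.compile(r'\{(!?)(\w+):([^{}]*)\}')


def parse_rule(rule):
    """Return the conditions of a rule as (negated, key, expected) tuples."""
    conditions = []
    for bang, key, value in CONDITION.findall(rule):
        value = value.strip().strip('"')
        expected = None if value == "null" else value
        conditions.append((bang == "!", key, expected))
    return conditions


def is_item_name(frame, rule):
    """True if the frame satisfies every condition of the rule."""
    for negated, key, expected in parse_rule(rule):
        actual = frame.get(key) or None  # assumption: "" and missing mean null
        matches = (actual == expected)
        if matches == negated:  # plain condition failed, or negated one matched
            return False
    return True


# Hypothetical frames used only for illustration.
yellow_header = {"color": "#FFFF00", "string": "y1"}
blank_white = {"color": "white", "string": ""}
```

With these assumptions, `yellow_header` satisfies all four definitions shown above and `blank_white` satisfies none of them.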

A style graph is a graph that, for each of the styles contained in a form, represents ruled-line frames and the like as nodes and the adjacency relationships between the nodes as edges. In the subsequent processing, the logical relationship recognition system handles the styles of the form as graph-format data. A style includes tables. The following description covers the case in which logical relationships are recognized based on the style graph, but a form graph, which represents the entire form as nodes and edges, may be used instead.
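A graph of this kind can be sketched in Python as follows. The sketch assumes that each ruled-line frame is an axis-aligned rectangle with integer coordinates, that the y axis grows downward, and that two frames are adjacent when they share a border segment of non-zero length. The `Node` class, its fields, and `build_graph` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    x0: int  # left
    y0: int  # top
    x1: int  # right
    y1: int  # bottom
    right: list = field(default_factory=list)  # names of frames adjacent on the right
    below: list = field(default_factory=list)  # names of frames adjacent below


def overlaps(a0, a1, b0, b1):
    """True if the open intervals (a0, a1) and (b0, b1) intersect."""
    return a0 < b1 and b0 < a1


def build_graph(nodes):
    """Fill in the right/below edges of every node and index nodes by name."""
    for a in nodes:
        for b in nodes:
            if a is b:
                continue
            if a.x1 == b.x0 and overlaps(a.y0, a.y1, b.y0, b.y1):
                a.right.append(b.name)
            if a.y1 == b.y0 and overlaps(a.x0, a.x1, b.x0, b.x1):
                a.below.append(b.name)
    return {n.name: n for n in nodes}


# Hypothetical layout: frame A spans the full width above frames C and D.
frames = [Node("A", 0, 0, 2, 1), Node("C", 0, 1, 1, 2), Node("D", 1, 1, 2, 2)]
graph = build_graph(frames)
```

Storing only rightward and downward edges is a simplification made here; leftward and upward adjacency can be derived by reversing them if needed.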

Next, the logical relationship recognition system recognizes logical relationships from the tabular structure (step S5), converts the recognition result into a data structure of a predetermined format (XML, YAML, JSON, etc.) (step S6), and stores it in the DB. At this time, either a link path to the data structure may be stored in the DB, or the data may be stored in accordance with a predefined schema.

Next, the structure of the forms targeted by the logical relationship recognition processing of the logical relationship recognition system will be described with reference to FIGS. 2 to 5. The reference signs in FIGS. 2 to 5 denote nodes. In the following description, the nodes indicated by these signs are sometimes called simply nodes, and sometimes called item name nodes or item value nodes. In FIGS. 2 to 5, shaded portions represent item names and unshaded portions represent item values.

Here, a vertical table is, for example, a table in which item names are adjacent to each other, and item names are adjacent to item values, in the vertical direction. A horizontal table is, for example, a table in which item names are adjacent to each other, and item names are adjacent to item values, in the horizontal direction. An orthogonal table is, for example, a table in which item names are adjacent to each other, and item names are adjacent to item values, in both the vertical and horizontal directions. A vertical enumeration has multiple logical relationships, that is, relationships between item names and between item names and item values, in the vertical direction. A horizontal enumeration has multiple logical relationships in the horizontal direction.

FIG. 2 is a diagram illustrating an example of a form in which vertical tables are included in a horizontal arrangement. In FIG. 2, for example, the horizontal arrangement starting from node a1 contains vertical tables starting from nodes a2, a3, and a4.

FIG. 3 is also a diagram illustrating an example of a form in which vertical tables are included in a horizontal arrangement. In FIG. 3, for example, the horizontal arrangement starting from b1 contains vertical tables starting from b8, b9, and b10. The structure of FIG. 3 is a horizontal enumeration because it has multiple logical relationships in the horizontal direction.

FIG. 4 is a diagram illustrating an example of a form in which vertical tables are included in a vertical arrangement. In FIG. 4, for example, the vertical arrangement starting from c1 contains vertical tables starting from c6, c7, c8, and c9. The structure of FIG. 4 is a vertical enumeration because it has multiple logical relationships in the vertical direction.

FIG. 5 is a diagram illustrating an example of a form in which an orthogonal table is included in a nested structure. In FIG. 5, for example, an orthogonal table is nested within the horizontal arrangements starting from d1, d6, d8, d36, d38, and d40.

[Configuration of First Embodiment]
Next, the configuration of the logical relationship recognition apparatus according to the first embodiment will be described with reference to FIG. 6. FIG. 6 is a diagram illustrating an example of the configuration of the logical relationship recognition apparatus according to the first embodiment. As illustrated in FIG. 6, the logical relationship recognition apparatus 10 includes a control unit 20 and a storage unit 30.

The control unit 20 controls the entire logical relationship recognition apparatus 10. The control unit 20 is, for example, an electronic circuit such as a CPU (Central Processing Unit) or MPU (Micro Processing Unit), or an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or FPGA (Field Programmable Gate Array). The control unit 20 has an internal memory for storing programs that define various processing procedures and control data, and executes each process using the internal memory. The control unit 20 also functions as various processing units when the various programs run. For example, the control unit 20 includes an extraction unit 201, an analysis unit 202, a table classification unit 203, a vertical table acquisition unit 204, a horizontal table acquisition unit 205, an orthogonal table acquisition unit 206, a table matching unit 207, a first deletion unit 208, a second deletion unit 209, an enumeration classification unit 210, a vertical enumeration acquisition unit 211, a horizontal enumeration acquisition unit 212, an enumeration matching unit 213, an inclusion relationship acquisition unit 214, an inclusion graph generation unit 215, an inter-item-name synthesis unit 216, a table synthesis unit 217, an enumeration synthesis unit 218, and an addition unit 219.

The storage unit 30 is a storage device such as an HDD (Hard Disk Drive), an SSD (Solid State Drive), or an optical disk. The storage unit 30 may also be a rewritable semiconductor memory such as a RAM (Random Access Memory), a flash memory, or an NVSRAM (Non Volatile Static Random Access Memory). The storage unit 30 stores the OS (Operating System) and the various programs executed by the logical relationship recognition apparatus 10. The storage unit 30 further stores various information used in executing the programs. The storage unit 30 also stores, for example, a table list 301, an enumeration list 302, and an inclusion graph 303.

Here, the logical relationship recognition processing performed by the logical relationship recognition apparatus 10 will be described together with the details of each unit of the apparatus. The extraction unit 201 works on a style graph, that is, a graph in which information about each region representing an item name or item value of a form is represented as a node and the adjacency relationships between nodes are represented as edges. Based on this graph, the extraction unit 201 extracts the nodes that satisfy a preset condition as item name nodes, which are the nodes of regions representing item names.

FIG. 7 is a diagram for explaining the acquisition of adjacent nodes. Consider the case where there are two nodes, node 1 and node 2. The nodes are regarded as adjacent when a side of one node contains a side of the other node, or when a side of one node extends beyond a side of the other node.

For example, in examples (a) to (c) of FIG. 7, node 2 is determined to be adjacent to the left side of node 1. In examples (d) to (f) of FIG. 7, node 2 is determined to be adjacent to the right side of node 1. In examples (g) to (i) of FIG. 7, node 2 is determined to be adjacent to the upper side of node 1. In examples (j) to (l) of FIG. 7, node 2 is determined to be adjacent to the lower side of node 1.
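The adjacency determination can be illustrated with axis-aligned rectangles. This is only a sketch under assumptions, not the patented procedure: it treats two frames as adjacent when they share a boundary line and their extents along that boundary overlap, and it reads "node 2 is adjacent to the left side of node 1" as node 2 touching the left side of node 1. The `Frame` type and `side_of` function are hypothetical names, and integer coordinates are assumed so that exact comparison is safe.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Frame:
    """Ruled line frame as a rectangle; y grows downward, so bottom > top."""
    left: int
    top: int
    right: int
    bottom: int


def _overlaps(a0: int, a1: int, b0: int, b1: int) -> bool:
    """True when intervals [a0, a1] and [b0, b1] share more than a point."""
    return min(a1, b1) > max(a0, b0)


def side_of(node1: Frame, node2: Frame) -> Optional[str]:
    """Return the side of node1 that node2 touches, or None if not adjacent."""
    rows_overlap = _overlaps(node1.top, node1.bottom, node2.top, node2.bottom)
    cols_overlap = _overlaps(node1.left, node1.right, node2.left, node2.right)
    if rows_overlap and node2.right == node1.left:
        return "left"
    if rows_overlap and node2.left == node1.right:
        return "right"
    if cols_overlap and node2.bottom == node1.top:
        return "upper"
    if cols_overlap and node2.top == node1.bottom:
        return "lower"
    return None
```

Frames that touch only at a corner are not reported as adjacent in this sketch, because their shared extent is a single point.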

[Table acquisition processing]
First, the processing of the table classification unit 203, the vertical table acquisition unit 204, the horizontal table acquisition unit 205, the orthogonal table acquisition unit 206, and the table matching unit 207 will be described together with the table acquisition processing. Based on the edges of the style graph, the table classification unit 203 extracts, from the item name nodes, a starting node, which is the node at which a table starts. It then classifies the table starting from that node as one of the following: an orthogonal table, which has item names at both its top edge and its left edge; a vertical table, which has item names at its top edge; or a horizontal table, which has item names at its left edge.

For example, the table classification unit 203 takes the upper-left node, that is, the starting node, as the target node and performs the classification according to whether predetermined conditions are satisfied. Hereinafter, the relationship between the target node and a node reached by following the nodes adjacent on the right side is called "connected to the right side", and the relationship between the target node and a node reached by following the nodes adjacent on the lower side is called "connected to the lower side". The table classification unit 203 first defines the node group connected to the right side of the starting node as rights and the node group connected to the lower side of the starting node as bottoms. It then determines whether the following condition 1 and condition 2 are satisfied for rights and bottoms.
(Condition 1) All of rights are item names, and the height of rights is the same as the height of the starting node.
(Condition 2) All of bottoms are item names, and the width of bottoms is the same as the width of the starting node.

For example, in the example of FIG. 5, when node d1 is the starting node, the table classification unit 203 sets nodes d2 and d4 as rights. When node d9 is the starting node, the table classification unit 203 sets nodes d10, d11, and d12 as rights, and nodes d13, d14, d17, d20, d21, d24, d27, d30, and d33 as bottoms.

When both condition 1 and condition 2 are satisfied, the table classification unit 203 classifies the table starting from the starting node as an orthogonal table. When condition 1 is satisfied and condition 2 is not, it classifies the table as a vertical table. When condition 2 is satisfied and condition 1 is not, it classifies the table as a horizontal table.
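The classification by condition 1 and condition 2 can be sketched as follows. Several points are assumptions made for illustration: the `Node` record and `classify_table` function are hypothetical names, "the height of rights" and "the width of bottoms" are interpreted per node, an empty node group is treated as not satisfying its condition, and the case where neither condition holds returns `None` because the text does not assign it a class.

```python
from typing import NamedTuple, Optional, Sequence


class Node(NamedTuple):
    name: str
    is_item_name: bool
    width: int
    height: int


def classify_table(start: Node,
                   rights: Sequence[Node],
                   bottoms: Sequence[Node]) -> Optional[str]:
    """Classify the table that starts at `start`.

    rights:  nodes connected to the right side of the starting node.
    bottoms: nodes connected to the lower side of the starting node.
    """
    # Condition 1: all rights are item names with the starting node's height.
    cond1 = bool(rights) and all(
        n.is_item_name and n.height == start.height for n in rights)
    # Condition 2: all bottoms are item names with the starting node's width.
    cond2 = bool(bottoms) and all(
        n.is_item_name and n.width == start.width for n in bottoms)

    if cond1 and cond2:
        return "orthogonal"
    if cond1:
        return "vertical"
    if cond2:
        return "horizontal"
    return None  # not classified as a table by these conditions
```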

Next, the orthogonal table acquisition unit 206 acquires, for an orthogonal table, the vertical logical relationships between item name nodes, the horizontal logical relationships between item name nodes, the vertical logical relationships between item name nodes and item value nodes (nodes other than item name nodes), and the horizontal logical relationships between item name nodes and item value nodes. The vertical table acquisition unit 204 acquires, for a vertical table, the vertical logical relationships between item name nodes and the vertical logical relationships between item name nodes and item value nodes. The horizontal table acquisition unit 205 acquires, for a horizontal table, the horizontal logical relationships between item name nodes and the horizontal logical relationships between item name nodes and item value nodes.

The orthogonal table acquisition unit 206 acquires, as a horizontal logical relationship, the combination of an item name and the item names or item values on its right side, and acquires, as a vertical logical relationship, the combination of an item name and the item names or item values on its lower side. The horizontal table acquisition unit 205 acquires, as a horizontal logical relationship, the combination of an item name and the item names or item values on its right side. The vertical table acquisition unit 204 acquires, as a vertical logical relationship, the combination of an item name and the item names or item values on its lower side.

The vertical table acquisition unit 204 acquires the logical relationships as a list and stores the acquired list in the storage unit 30 as the table list 301. Because the processing of the extraction unit 201 and the table classification unit 203 has already established that the table is a vertical table and that nodes a3 and a4 are item names, the vertical table acquisition unit 204 does not acquire an erroneous logical relationship such as one that makes node a4 a horizontal child of node a3.

The vertical table acquisition unit 204 also sets, for example, the uppermost node group among the node group connected to the right side of the upper-left node of the table as least_tops, and acquires the logical relationships starting from least_tops. The horizontal table acquisition unit 205 acquires the logical relationships as a list and stores the acquired list in the storage unit 30 as the table list 301.

The horizontal table acquisition unit 205 also sets, for example, the leftmost node group among the node group connected to the lower side of the upper-left node of the table as least_lefts, and acquires the logical relationships starting from least_lefts. The orthogonal table acquisition unit 206 acquires the logical relationships as a list and stores the acquired list in the storage unit 30 as the table list 301.

The orthogonal table acquisition unit 206 also sets, for example, the uppermost node group among the node group connected to the right side of the upper-left node of the table as least_tops, and the leftmost node group among the node group connected to the lower side of the upper-left node of the table as least_lefts. The orthogonal table acquisition unit 206 then acquires the logical relationships starting from least_tops and least_lefts.

FIG. 8 is an example of a table list of a horizontal table. As shown in FIG. 8, the table list first states that the classification of the table is "horizontal table". For example, row No. 1 of FIG. 8 indicates that the horizontal children of node d8 are nodes d9, d13, d20, d27, d30, and d33. Row No. 2 of FIG. 8 indicates that the horizontal children of node d9 are nodes d10 and d11. Row No. 3 of FIG. 8 indicates that the horizontal child of node d11 is node d12. Row No. 4 of FIG. 8 indicates that the horizontal children of node d13 are nodes d14 and d17.

Here, as shown in FIG. 5, nodes d10, d11, and d12 are item name nodes, so each must have item value nodes as children in the vertical direction. However, row No. 3 of the table list of FIG. 8 states that the horizontal child of node d11 is node d12. In addition, the table list contains no children for node d10. For this reason, the table matching unit 207 excludes an erroneous table list such as that of FIG. 8.

FIG. 9 is an example of a table list of an orthogonal table. FIG. 10 is an example of a table list of a horizontal table. The table lists of FIGS. 9 and 10 are not erroneous and are therefore not excluded by the table matching unit 207.

The table matching unit 207 identifies, from among the orthogonal tables, vertical tables, and horizontal tables, the tables that remain after excluding those that satisfy a predetermined condition indicating an inconsistent table. For example, the table matching unit 207 excludes orthogonal tables having an item name node with no item value node on either its lower side or its right side, vertical tables having an item name node with no item value node on its lower side, and horizontal tables having an item name node with no item value node on its right side. Specifically, the table matching unit 207 deletes from the table list 301 any table list in which there is an item name node that has no item value node as a child node. Alternatively, it deletes any table list in which there is an item name node that has an item name node as a child node. The table matching unit 207 is an example of a first specifying unit. For example, the table matching unit 207 deletes the table list of FIG. 8 and does not delete the table lists of FIGS. 9 and 10.

[Enumeration acquisition processing]
Next, the processing of the first deletion unit 208, the second deletion unit 209, the enumeration classification unit 210, the vertical enumeration acquisition unit 211, the horizontal enumeration acquisition unit 212, the inclusion relationship acquisition unit 214, and the inclusion graph generation unit 215 will be described together with the enumeration acquisition processing.

When multiple nodes are adjacent to one node in a predetermined direction, the first deletion unit 208 deletes the edges representing the adjacency relationships between that one node and the multiple nodes. The first deletion unit 208 may delete these edges when multiple nodes are adjacent to the left side or the upper side of the one node.

The second deletion unit 209 deletes, for each item name node that has an item value node (a node of a region representing an item value) adjacent to it in a predetermined direction, the edge representing the adjacency relationship between that item name node and the item value node. The second deletion unit 209 may delete this edge for item name nodes that have an item value node adjacent on their left side or upper side.
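The two edge deletion steps can be sketched on an adjacency-set representation of the style graph. The representation (`neighbours[node][direction]` as a set), the function names, and the choice to pass the direction as a parameter are all assumptions for illustration; the text only states that the direction is predetermined.

```python
from typing import Dict, Set

# Hypothetical representation: graph[node][direction] is the set of nodes
# adjacent to `node` on that side.
Graph = Dict[str, Dict[str, Set[str]]]
OPPOSITE = {"left": "right", "right": "left", "upper": "lower", "lower": "upper"}


def _remove_edge(graph: Graph, a: str, direction: str, b: str) -> None:
    """Delete the edge a-b, recorded from both endpoints."""
    graph.get(a, {}).get(direction, set()).discard(b)
    graph.get(b, {}).get(OPPOSITE[direction], set()).discard(a)


def delete_multi_adjacent(graph: Graph, direction: str) -> None:
    """First deletion: drop the edges of any node that has two or more
    neighbours in `direction`."""
    for node in list(graph):
        neighbours = set(graph[node].get(direction, ()))
        if len(neighbours) > 1:
            for other in neighbours:
                _remove_edge(graph, node, direction, other)


def delete_name_value_edges(graph: Graph, item_names: Set[str],
                            direction: str) -> None:
    """Second deletion: drop edges from an item name node to any item value
    node adjacent to it in `direction`."""
    for node in item_names:
        for other in set(graph.get(node, {}).get(direction, ())):
            if other not in item_names:
                _remove_edge(graph, node, direction, other)
```

Both functions copy the neighbour set before iterating, so removing edges during the loop does not disturb the traversal.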

Then, the vertical enumeration acquisition unit 211 and the horizontal enumeration acquisition unit 212 acquire the logical relationships between item name nodes and item value nodes based on the graph from which edges have been deleted by the first deletion unit 208 and the second deletion unit 209. Specifically, the vertical enumeration acquisition unit 211 and the horizontal enumeration acquisition unit 212 acquire the logical relationships as a list such as that shown in FIG. 11, and store the acquired list in the storage unit 30 as the enumeration list 302. FIG. 11 is an example of an enumeration list. Before the processing by the vertical enumeration acquisition unit 211 and the horizontal enumeration acquisition unit 212, the enumeration classification unit 210 classifies the graph into vertical enumerations and horizontal enumerations.

FIG. 11 is an enumeration list representing the logical relationships of the enumerations of FIG. 5. Row No. 1 of the list of FIG. 11 indicates that the vertical children of item name node d11 are nodes d15, d18, d22, d25, d28, d31, and d34. Row No. 3 of the list of FIG. 11 indicates that the horizontal child of item name node d2 is node d3.

For example, in the case of FIG. 5, the first deletion unit 208 has deleted the edges representing the adjacency relationships between node d1 and nodes d2 and d4. The vertical enumeration acquisition unit 211 and the horizontal enumeration acquisition unit 212 therefore never acquire a logical relationship that makes d3 and d5 the horizontal children of item name node d1.

Likewise, in the case of FIG. 4, the second deletion unit 209 has deleted the edge representing the adjacency relationship between node c2 and node c6. The vertical enumeration acquisition unit 211 and the horizontal enumeration acquisition unit 212 therefore never acquire a logical relationship that makes c10 the vertical child of item name node c2.

The inclusion relationship acquisition unit 214 acquires the inclusion relationships between item name nodes based on the graph from which edges have been deleted by the first deletion unit 208. The inclusion graph generation unit 215 generates a graph such as that shown in FIG. 12 based on the inclusion relationships acquired by the inclusion relationship acquisition unit 214, and stores the generated graph in the storage unit 30 as the inclusion graph 303. FIG. 12 is a diagram for explaining the inclusion graph. The broken lines in FIG. 12 represent vertical inclusion relationships, and the solid lines represent horizontal inclusion relationships. Hereinafter, when a target node includes one or more other nodes in the vertical (or horizontal) direction, the target node is called an inclusion node in the vertical (or horizontal) direction. Among nodes in an inclusion relationship, a node that is not included in any other node is called a main node, and the other nodes, which are included, are called subordinate nodes.

The inclusion graph of FIG. 12 corresponds to the structure of FIG. 5. As shown in FIGS. 5 and 12, node d1 includes nodes d2 and d4 in the horizontal direction. Node d8 includes nodes d9 and d10 in the horizontal direction. Node d9 includes nodes d13, d20, d27, d30, and d33 in the vertical direction. Node d13 includes nodes d14 and d17 in the horizontal direction. Node d20 includes nodes d21 and d24 in the horizontal direction. Node d10 includes nodes d11 and d12 in the vertical direction. Node d40 includes nodes d41, d42, and d47 in the horizontal direction. Node d47 includes nodes d48 and d49 in the vertical direction.

Specifically, the inclusion relationship acquisition unit 214 determines that a first item name node includes a first node group in the horizontal direction when all of the following hold: at least one node of the first node group adjacent to the right side of the first item name node is an item name node; the height of every node in the first node group is less than or equal to the height of the first item name node; and a vertex of the upper-left node of the first node group coincides with a vertex of the first item name node. Likewise, the inclusion relationship acquisition unit 214 determines that a second item name node includes a second node group in the vertical direction when all of the following hold: at least one node of the second node group adjacent to the lower side of the second item name node is an item name node; the width of every node in the second node group is less than or equal to the width of the second item name node; and a vertex of the upper-left node of the second node group coincides with a vertex of the first item name node.

For example, as shown in FIG. 5, among the node group adjacent to the right side of item name node d8, at least node d9 is an item name node. The heights of all nodes in the node group adjacent to the right side of item name node d8 are less than or equal to the height of item name node d8. In addition, a vertex of the upper-left node of that node group, namely node d9, coincides with a vertex of item name node d8, and a vertex of the lower-left node of that node group, namely node d33, coincides with a vertex of item name node d8. From this, the inclusion relationship acquisition unit 214 determines that item name node d8 includes the node group adjacent to its right side.
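The horizontal inclusion test can be sketched with rectangles. This is an illustration under assumptions rather than the claimed procedure: the `Box` type and `contains_horizontally` function are hypothetical names, the upper-left node of the group is chosen as the one with the smallest (top, left) coordinates, and "the vertices coincide" is read as the upper-left corner of that node lying on the upper-right corner of the item name node.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Box:
    """Node rectangle with integer coordinates; y grows downward."""
    left: int
    top: int
    right: int
    bottom: int
    is_item_name: bool

    @property
    def height(self) -> int:
        return self.bottom - self.top


def contains_horizontally(parent: Box, rights: Sequence[Box]) -> bool:
    """Check whether `parent` includes the node group `rights`, i.e. the
    nodes adjacent to its right side, in the horizontal direction."""
    if not parent.is_item_name or not rights:
        return False
    # At least one node of the group must be an item name node.
    if not any(n.is_item_name for n in rights):
        return False
    # No node of the group may be taller than the parent.
    if any(n.height > parent.height for n in rights):
        return False
    # The upper-left node's corner must coincide with the parent's corner.
    upper_left = min(rights, key=lambda n: (n.top, n.left))
    return (upper_left.left, upper_left.top) == (parent.right, parent.top)
```

A vertical counterpart would swap the roles of height and width and compare against the parent's lower-left corner; it is omitted here to keep the sketch short.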

Note that the style graph processed by the inclusion relationship acquisition unit 214 has had edges deleted by the first deletion unit 208 but not by the second deletion unit 209. For this reason, for example, the edge representing the adjacency relationship between node d40 and node d47 has not been deleted. The inclusion relationship acquisition unit 214 therefore determines that item name node d40 includes node d47 in the horizontal direction.

The enumeration matching unit 213 identifies the logical relationships of the enumerations to be synthesized by excluding, from the logical relationships acquired by the vertical enumeration acquisition unit 211 and the horizontal enumeration acquisition unit 212, the logical relationships concerning nodes contained in the table list 301. Specifically, the enumeration matching unit 213 deletes, from the logical relationships in the rows of the enumeration list 302, those that are contained in the table list 301. The enumeration matching unit 213 is an example of a second specifying unit. For example, the logical relationship in row No. 6 of FIG. 11 (parent: node d14; children: nodes d15 and d16; direction: horizontal) matches the logical relationship in row No. 6 of FIG. 9, so the enumeration matching unit 213 deletes the logical relationship in row No. 6 of FIG. 11.
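The deletion performed by the enumeration matching unit 213 amounts to removing enumeration rows that also appear in a table list. A minimal sketch follows, assuming each row is a (parent, children, direction) record and that "matches" means exact equality of all three fields; the `Relation` type and `reconcile` function are hypothetical names.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Relation(NamedTuple):
    """One row of a table list or enumeration list."""
    parent: str
    children: Tuple[str, ...]
    direction: str  # "vertical" or "horizontal"


def reconcile(enumeration_list: Iterable[Relation],
              table_lists: Iterable[Iterable[Relation]]) -> List[Relation]:
    """Return the enumeration rows not already present in any table list."""
    known = {rel for table in table_lists for rel in table}
    return [rel for rel in enumeration_list if rel not in known]


# Rows taken from the description: d14 -> (d15, d16) appears in both lists,
# d2 -> (d3,) is assumed here to appear only in the enumeration list.
table = [Relation("d14", ("d15", "d16"), "horizontal")]
enumeration = [
    Relation("d2", ("d3",), "horizontal"),
    Relation("d14", ("d15", "d16"), "horizontal"),
]
remaining = reconcile(enumeration, [table])
```

After this call `remaining` holds only the d2 row, which corresponds to deleting the duplicated d14 relationship from the enumeration list.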

[Synthesis process]
The item name synthesizing unit 216 represents the inclusion graph as tree-structured data such as that shown in FIG. 13. FIG. 13 is a diagram for explaining the inclusion graph. The tree-structured data of FIG. 13 is based on the inclusion graph of FIG. 12. Because the inclusion graph of FIG. 12 expresses the inclusion relationships of the nodes as sets, it also contains the logical relationships between item names inside tables. Consequently, when a nested structure contains an orthogonal table, the parent node of the orthogonal table is its starting node, and the structure cannot be obtained correctly from the inclusion graph alone.

Therefore, the item name synthesizing unit 216 creates tree-structured data from those inclusion relationships acquired by the inclusion relationship acquisition unit 214 that are contained in neither the table logical relationships to be synthesized nor the enumeration logical relationships to be synthesized. In other words, when forming the inclusion relationships from main nodes to subordinate nodes as tree-structured data, the item name synthesizing unit 216 synthesizes nodes into the tree as tree nodes only as far as the starting nodes of the table list or the item name nodes of the enumeration list, and does not synthesize the nodes beyond them.

For example, among the nodes included in the tables and enumerations to be synthesized, the item name synthesizing unit 216 includes the following in the tree-structured data: the starting node of an orthogonal table; the starting node of a vertical table and the uppermost node group among the nodes connected to the right side of that starting node; the starting node of a horizontal table and the leftmost node group among the nodes connected to the lower side of that starting node; and the item name nodes of enumerations. In other words, the only nodes of those tables and enumerations that are synthesized are the starting node of a vertical table together with the uppermost node group to its right (least_tops), the starting node of a horizontal table together with the leftmost node group below it (least_lefts), the starting node of an orthogonal table, and the item name nodes of enumerations. All other nodes are not synthesized into the tree-structured data.

In the example of FIG. 12, the item name synthesizing unit 216 synthesizes nodes into the tree-structured data up to nodes d1, d9, d36, d38, d40, d41, d42, d47, d48 and d49. For example, the logical relationship with node d9 as parent and node d10 as child is included in the table list of FIG. 9. Likewise, the logical relationships concerning the other nodes of the orthogonal table whose starting node is node d9 are included in the table list of FIG. 9. Therefore, of the nodes included in the table list of FIG. 9, only node d9, the starting node, is synthesized into the tree-structured data by the item name synthesizing unit 216.

In the example of FIG. 10, the item name synthesizing unit 216 synthesizes nodes into the tree-structured data up to nodes d48 and d49. The logical relationship with node d48 as parent and node d50 as child, and the logical relationship with node d49 as parent and node d51 as child, are included in the table list of FIG. 10. Therefore, of the nodes included in the table list of FIG. 10, the only nodes synthesized into the tree-structured data by the item name synthesizing unit 216 are node d48, the starting node, and node d49, which is the leftmost node (group) among the nodes below node d48.

Note that node d8 is the starting node in the table list of FIG. 8, but, as described above, the table list of FIG. 8 has been deleted by the table matching unit 207. The item name synthesizing unit 216 therefore does not treat node d8 as a starting node.
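The stopping rule of the item name synthesizing unit 216 can be sketched as follows. This is only an illustration under assumptions: the function name, the dictionary representation of the inclusion graph, and the `listed` and `keep` sets are hypothetical, and the small inclusion map is a toy example, not the content of FIG. 12.

```python
from typing import Dict, List, Set


def build_item_name_tree(inclusion: Dict[str, List[str]], root: str,
                         listed: Set[str], keep: Set[str]) -> Dict[str, List[str]]:
    """Copy inclusion edges into a tree, stopping at table/enumeration boundaries.

    listed: nodes that appear in a table list or an enumeration list.
    keep:   the listed nodes that may still become tree nodes (starting nodes,
            least_tops, least_lefts, item name nodes of enumerations).
    """
    tree: Dict[str, List[str]] = {}
    stack = [root]
    while stack:
        node = stack.pop()
        if node in tree:  # guard against visiting a node twice
            continue
        # A child is kept if it is outside every list, or explicitly allowed.
        children = [c for c in inclusion.get(node, [])
                    if c not in listed or c in keep]
        tree[node] = children
        stack.extend(children)
    return tree


# Toy data: d8 includes d9; d9 starts an orthogonal table containing d10 and d13.
inclusion = {"d8": ["d9"], "d9": ["d10", "d13"]}
tree = build_item_name_tree(inclusion, "d8",
                            listed={"d9", "d10", "d13"}, keep={"d9"})
```

In this toy run only the starting node d9 survives from the table, which matches the behaviour described for the table list of FIG. 9; the table's inner structure would be attached later by the table synthesis unit 217.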

The table synthesis unit 217 creates tree-structured data based on the table logical relationships to be synthesized, and creates tree-structured data that combines it with the tree-structured data created by the item name synthesizing unit 216. Specifically, for each table, the table synthesis unit 217 creates tree-structured data expressing the logical relationships acquired by the vertical table acquisition unit 204, the horizontal table acquisition unit 205, or the orthogonal table acquisition unit 206, and synthesizes it into the tree-structured data of the item name synthesizing unit 216. For example, based on the list shown in FIG. 9, the table synthesis unit 217 creates the tree-structured data shown in the branches from node d9 onward in FIG. 14. FIG. 14 shows an example of tree-structured data.

The enumeration synthesis unit 218 creates tree-structured data based on the enumeration logical relationships to be synthesized, and creates tree-structured data that combines it with the tree-structured data created by the table synthesis unit 217. For example, the enumeration synthesis unit 218 creates the tree-structured data shown in the branches from nodes d2, d4, d6, d36, d38, d41, d42, d48 and d49 onward in FIG. 14.

For example, based on the enumeration list of FIG. 11, the enumeration synthesis unit 218 combines the tree-structured data created by the item name synthesizing unit 216 with the tree-structured data created by the table synthesis unit 217, and creates the tree-structured data shown in FIG. 14.

The adding unit 219 adds, to the tree-structured data created by the table synthesis unit 217, a root node that defines the tree structure. For example, the adding unit 219 adds the root node "form1" to the tree-structured data. The root node is added to indicate the logical relationships that make up a style, so the adding unit 219 is not essential if information on the composition of the style is not needed.

Each node of the tree-structured data, called a tree node, holds information obtained from the format information of the form, such as the character string of an item name or item value. In addition, it holds information identifying its child and parent nodes, the direction of adjacency to its child and parent nodes, information indicating that the tree node belongs to a table, and the like.
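A tree node carrying the information listed above might be modelled as in the following sketch. The class name, field names and the `add_child` helper are assumptions made for illustration; the text does not prescribe a concrete data layout, and the direction value used in the example is made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class TreeNode:
    node_id: str                      # e.g. "d9", or "form1" for the root
    text: str = ""                    # item name or item value string
    in_table: bool = False            # True if the node belongs to a table
    direction: Optional[str] = None   # adjacency direction relative to the parent
    # The parent link is excluded from repr/eq to avoid infinite recursion.
    parent: Optional["TreeNode"] = field(default=None, repr=False, compare=False)
    children: List["TreeNode"] = field(default_factory=list)

    def add_child(self, child: "TreeNode", direction: str) -> "TreeNode":
        """Attach child under this node and record the adjacency direction."""
        child.parent = self
        child.direction = direction
        self.children.append(child)
        return child


root = TreeNode("form1")  # root node of the kind added by the adding unit 219
d9 = root.add_child(TreeNode("d9", in_table=True), direction="down")
```

Attaching d9 directly under the root is a simplification for the example; in the actual tree other item name nodes may sit between them.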

[Processing of the first embodiment]
First, the process of generating a form graph is described. This process is a prerequisite that is executed before the processing by the logical relationship recognition apparatus 10. Here, the form graph is assumed to be generated by a form graph generation unit (not shown). The form graph generation unit may be provided in the logical relationship recognition apparatus 10 or in another apparatus.

First, the overall flow of the processing of the form graph generation unit is described with reference to FIG. 15. FIG. 15 is a flowchart showing the flow of the process of generating a form graph. As shown in FIG. 15, the form graph generation unit first reads a form (step S11). Next, it acquires nodes from the read form (step S12). It then acquires the adjacency relationships between the nodes (step S13). Finally, it outputs the generated form graph (step S14). A form graph is a graph in which information on the regions representing item names or item values of the form is represented as nodes, and the adjacency relationships between nodes are represented as edges.

Next, the process of acquiring nodes (step S12) is described with reference to FIG. 16. FIG. 16 is a flowchart showing the flow of the process of acquiring nodes. As shown in FIG. 16, the form graph generation unit first reads the form (step S21). Next, it acquires the range of each ruled-line frame (step S22). It then repeats the following processing until every ruled-line frame has been converted into a node (steps S23 and S27).

First, the form graph generation unit generates a node from the range of the ruled-line frame (step S24). Next, it sets the ruled-line frame flag of the generated node to true (step S25). It then adds the generated node to the node set N (step S26).

The form graph generation unit then acquires the ranges of the character string groups that lie outside the ruled-line frames (step S28). It repeats the following processing until the range of every character string group has been converted into a node (steps S29 and S33).

First, the form graph generation unit generates a node from the range of the character string group, treating the range of each character string group as if it were a ruled-line frame (step S30). Next, it sets the ruled-line frame flag of the generated node to false (step S31). It then adds the generated node to the node set N (step S32). Finally, it outputs the node set N (step S34).

Next, the process of generating a node from the range of a ruled-line frame (steps S24 and S30) is described with reference to FIG. 17. FIG. 17 is a flowchart showing the flow of the process of generating a node. As shown in FIG. 17, the form graph generation unit reads the form graph and the range of the ruled-line frame, and sets the range of the ruled-line frame as area (step S41). Next, it generates a new node in the form graph (step S42) and sets the range information and the held information of area in the new node (step S43). It then assigns an index to the new node (step S44) and outputs the new node (step S45).
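The node acquisition steps (S21 to S34) and the per-node creation steps (S41 to S45) can be sketched together as follows. This is a minimal illustration: the `Node` class, the `(x1, y1, x2, y2)` tuple used for an area, the function name and the choice of indices starting at 1 are all assumptions, and reading ranges out of an actual form file is omitted.

```python
from dataclasses import dataclass
from itertools import count
from typing import Iterable, List, Tuple

Area = Tuple[float, float, float, float]  # (x1, y1, x2, y2), hypothetical layout


@dataclass
class Node:
    index: int         # index assigned in step S44
    area: Area         # range information set in step S43
    text: str          # held information set in step S43
    ruled_frame: bool  # ruled-line frame flag (steps S25 and S31)


def build_node_set(frames: Iterable[Tuple[Area, str]],
                   loose_text: Iterable[Tuple[Area, str]]) -> List[Node]:
    """Turn ruled-line frames, then text outside frames, into a node set N."""
    indices = count(1)
    nodes: List[Node] = []
    for area, text in frames:        # loop S23 to S27
        nodes.append(Node(next(indices), area, text, ruled_frame=True))
    for area, text in loose_text:    # loop S29 to S33
        nodes.append(Node(next(indices), area, text, ruled_frame=False))
    return nodes


nodes = build_node_set(
    frames=[((0, 0, 10, 5), "Name"), ((10, 0, 30, 5), "")],
    loose_text=[((0, 8, 30, 10), "Remarks")],
)
```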

Next, the process of acquiring the adjacency relationships (step S13) is described with reference to FIG. 18. FIG. 18 is a flowchart showing the flow of the process of acquiring the adjacency relationships. As shown in FIG. 18, the form graph generation unit first reads the node set (step S51). Next, it repeats the following processing for node 1, which denotes each node of the read node set in turn (steps S52 and S55).

First, the form graph generation unit acquires the node groups adjacent above, below, to the left and to the right (step S53). Next, it stores the adjacency edges of the node in the upward, downward, left and right directions in the adjacency graph (step S54). Finally, it outputs the adjacency graph (step S56).

Next, the process of acquiring the adjacent node groups (step S53) is described with reference to FIG. 19. FIG. 19 is a flowchart showing the flow of the process of acquiring the adjacent node groups. As shown in FIG. 19, the form graph generation unit first reads the node set and node 1, and sets the node set as N (step S61). It then repeats the following processing for node 2, which denotes each node of the node set N in turn (steps S62 and S73).

First, when node 1 and node 2 are the same (step S63, true), the form graph generation unit treats node 2 as a node that is not adjacent to node 1 (step S72). When node 1 and node 2 are not the same (step S63, false), the form graph generation unit makes the following determinations. If the left adjacency condition is satisfied (step S64, true), it determines that node 2 is adjacent to the left of node 1 (step S65). If the right adjacency condition is satisfied (step S66, true), it determines that node 2 is adjacent to the right of node 1 (step S67). If the upper adjacency condition is satisfied (step S68, true), it determines that node 2 is adjacent above node 1 (step S69). If the lower adjacency condition is satisfied (step S70, true), it determines that node 2 is adjacent below node 1 (step S71). Finally, the form graph generation unit outputs the adjacency edges representing the adjacency relationships between node 1 and each of the other nodes (step S74).

An example of the adjacency condition in each direction is now described. Let the coordinates of the upper-left, upper-right, lower-left and lower-right vertices of node 1 be node1(x1, y1), node1(x2, y1), node1(x1, y2) and node1(x2, y2), respectively. Likewise, let the coordinates of the upper-left, upper-right, lower-left and lower-right vertices of node 2 be node2(x1, y1), node2(x2, y1), node2(x1, y2) and node2(x2, y2), respectively. Let margin be a preset constant (for example, 0) representing the margin between ruled-line frames.

Each adjacency condition is then expressed, as an example, as follows.
(Left adjacency condition)
node2.y1 ≥ node1.y1
& node2.y2 ≥ node1.y2
& 0 ≤ node1.x1 - node2.x2 ≤ margin
(Right adjacency condition)
node2.y1 ≥ node1.y1
& node2.y2 ≥ node1.y2
& 0 ≤ node2.x1 - node1.x2 ≤ margin
(Upper adjacency condition)
node2.x1 ≥ node1.x1
& node2.x2 ≥ node1.x2
& 0 ≤ node1.y1 - node2.y2 ≤ margin
(Lower adjacency condition)
node2.x1 ≥ node1.x1
& node2.x2 ≥ node1.x2
& 0 ≤ node2.y1 - node1.y2 ≤ margin
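The four conditions can be transcribed into Python as in the sketch below. The `Box` class and the function names are assumptions made for illustration. The inequalities are copied as listed, without adding any further overlap check, and returning the first direction that matches is one possible reading of the flow of FIG. 19.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Box:
    x1: float  # left
    y1: float  # top
    x2: float  # right
    y2: float  # bottom


def left_of(n1: Box, n2: Box, margin: float = 0) -> bool:
    """Left adjacency condition: node 2 is adjacent to the left of node 1."""
    return n2.y1 >= n1.y1 and n2.y2 >= n1.y2 and 0 <= n1.x1 - n2.x2 <= margin


def right_of(n1: Box, n2: Box, margin: float = 0) -> bool:
    return n2.y1 >= n1.y1 and n2.y2 >= n1.y2 and 0 <= n2.x1 - n1.x2 <= margin


def above(n1: Box, n2: Box, margin: float = 0) -> bool:
    return n2.x1 >= n1.x1 and n2.x2 >= n1.x2 and 0 <= n1.y1 - n2.y2 <= margin


def below(n1: Box, n2: Box, margin: float = 0) -> bool:
    return n2.x1 >= n1.x1 and n2.x2 >= n1.x2 and 0 <= n2.y1 - n1.y2 <= margin


def adjacency(n1: Box, n2: Box, margin: float = 0) -> Optional[str]:
    """Return where node 2 lies relative to node 1, or None if not adjacent."""
    if n1 == n2:  # step S63: a node is never adjacent to itself
        return None
    checks = (("left", left_of), ("right", right_of),
              ("above", above), ("below", below))
    for name, test in checks:
        if test(n1, n2, margin):
            return name
    return None


cell = Box(10, 0, 20, 10)
west = Box(0, 0, 10, 10)     # shares cell's left edge
south = Box(10, 10, 20, 20)  # shares cell's bottom edge
```

With margin set to 0, two frames count as adjacent only when their edges touch exactly; a small positive margin would tolerate gaps between frames.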

Because a form may contain a plurality of styles, the form graph is divided into style graphs. The generation of style graphs is described with reference to FIG. 20. FIG. 20 is a flowchart showing the flow of the process of generating style graphs. As shown in FIG. 20, the form graph generation unit first reads the form graph (step S81). Next, it deletes from the form graph all inclusion relationships whose node is an item value (step S82).

The form graph generation unit then designates an arbitrary node among the unclassified nodes as node X (step S83), and obtains the connected graph starting from node X (step S84). If there is still a node that has not been classified into a connected graph (step S85, true), the form graph generation unit designates another unclassified node as node X (step S83). When no node remains that has not been classified into a connected graph (step S85, false), the form graph generation unit outputs the obtained connected graphs as the style graph group (step S86).
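The loop of steps S83 to S86 amounts to finding connected components. A minimal sketch using breadth-first search follows; the function name and the edge-list representation are assumptions, and the edges are assumed to have already been filtered as in step S82.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple


def split_into_style_graphs(nodes: List[str],
                            edges: Iterable[Tuple[str, str]]) -> List[List[str]]:
    """Group nodes into connected components, one per style graph."""
    neighbours: Dict[str, Set[str]] = {n: set() for n in nodes}
    for a, b in edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    classified: Set[str] = set()
    components: List[List[str]] = []
    for start in nodes:               # step S83: pick an unclassified node X
        if start in classified:
            continue
        component: List[str] = []
        queue = deque([start])
        classified.add(start)
        while queue:                  # step S84: grow the connected graph
            current = queue.popleft()
            component.append(current)
            for nxt in sorted(neighbours[current]):
                if nxt not in classified:
                    classified.add(nxt)
                    queue.append(nxt)
        components.append(component)
    return components                 # step S86: the style graph group


groups = split_into_style_graphs(["a", "b", "c", "d"], [("a", "b"), ("c", "d")])
```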

Next, the flow of processing of the logical relationship recognition apparatus 10 is described. First, the processing of the extraction unit 201 is described with reference to FIG. 21. FIG. 21 is a flowchart showing the flow of processing of the extraction unit. As shown in FIG. 21, the extraction unit 201 first reads the style graph and the post-analysis visual expression rules (step S91). Next, it selects from the style graph the node group that satisfies the conditions of the post-analysis visual expression rules (step S92). It then sets the item name attribute of the selected node group to item name, that is, it sets the value of the flag indicating an item name to true (step S93). Finally, the extraction unit 201 returns the style graph as G (step S94).
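Steps S92 and S93 can be sketched as follows, under the assumption that each post-analysis visual expression rule can be modelled as a predicate over a node. The dictionary node representation, the `is_item_name` key and the sample rule `has_fill` are hypothetical; the actual rule format is not given in this passage.

```python
from typing import Callable, Dict, List

Node = Dict[str, object]
Rule = Callable[[Node], bool]  # assumed form of a parsed rule condition


def mark_item_names(graph_nodes: List[Node], rules: List[Rule]) -> List[Node]:
    """Set the item-name flag on every node that satisfies some rule."""
    for node in graph_nodes:
        if any(rule(node) for rule in rules):
            node["is_item_name"] = True
    return graph_nodes


def has_fill(node: Node) -> bool:
    """Made-up rule: a cell with a background fill is an item name."""
    return node.get("fill") is not None


nodes = [{"id": "a", "fill": "gray"}, {"id": "b", "fill": None}]
marked = mark_item_names(nodes, [has_fill])
```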

Next, the processing of the analysis unit 202 is described with reference to FIG. 22. FIG. 22 is a flowchart showing the flow of processing of the analysis unit. The analysis unit 202 creates post-analysis visual expression rules from the visual expression rules. As shown in FIG. 22, the analysis unit 202 first reads the visual expression rule group (step S101). Next, it creates a variable rule_list of Array or Hash type (step S102). It then processes the read visual expression rules one at a time (steps S103 and S107).

First, the analysis unit 202 analyzes the condition of the visual expression rule (step S104). Next, it analyzes the action of the visual expression rule (step S105). It then stores the analyzed condition and action in rule_list as a post-analysis visual expression rule (step S106). After processing all the visual expression rules, the analysis unit 202 outputs rule_list, which holds the post-analysis visual expression rule group (step S108).

Next, the processing of the table classification unit 203 is described with reference to FIG. 23. FIG. 23 is a flowchart showing the flow of processing of the table classification unit. As shown in FIG. 23, the table classification unit 203 first reads the style graph, the target node, which is the upper-left node of the style graph, and the table list 301 (step S31a). If the target node is not an item name, or if the target node is already listed in the table list 301 as a parent or child (step S32a, true), the table classification unit 203 sets the table flag to "do not process" (step S42a). The table flag is a variable that indicates the classification of the table as one of "vertical table", "horizontal table", "orthogonal table" and "do not process".

If the target node is an item name and is not yet listed in the table list 301 as a parent or child (step S32a, false), the table classification unit 203 stores the node group connected to the lower side of the target node in bottoms, and stores the node group connected to the right side of the target node in rights (step S33a).

In the example of FIG. 5, when the target node is node d9, rights holds nodes d10, d11 and d12, and bottoms holds nodes d13, d14, d17, d20, d21, d24, d27, d30 and d33.

The table classification unit 203 then sets flg1 to true if all the nodes in rights are item names, and to false otherwise. It also sets flg2 to true if the region of rights has the same height as the target node, and to false otherwise (step S34a). It sets flg_r to true if both flg1 and flg2 are true, and to false otherwise.

Similarly, the table classification unit 203 sets flg3 to true if all the nodes in bottoms are item names, and to false otherwise. It also sets flg4 to true if the region of bottoms has the same width as the target node, and to false otherwise (step S35a). It sets flg_b to true if both flg3 and flg4 are true, and to false otherwise.

If both flg_r and flg_b are true (step S36a is true and step S37a is true), the table classification unit 203 sets the table flag to "orthogonal table" (step S38a). If flg_r is true and flg_b is false (step S36a is true and step S37a is false), it sets the table flag to "vertical table" (step S39a). If flg_r is false and flg_b is true (step S36a is false and step S40a is true), it sets the table flag to "horizontal table" (step S41a). If both flg_r and flg_b are false (step S36a is false and step S40a is false), it sets the table flag to "do not process" (step S42a). Finally, the table classification unit 203 outputs the table flag (step S43a).

For example, when node d9 in FIG. 5 is the starting point, all the nodes in rights (nodes d10, d11 and d12) are item names and the region of rights has the same height as node d9, so the table classification unit 203 sets flg1 and flg2 to true. Likewise, all the nodes in bottoms (nodes d13, d14, d17, d20, d21, d24, d27, d30 and d33) are item names and the region of bottoms has the same width as node d9, so the table classification unit 203 sets flg3 and flg4 to true. Since flg_r and flg_b are therefore both true, the table classification unit 203 sets the table flag to "orthogonal table". The table whose starting point is node d9 in FIG. 5 is thus classified as an orthogonal table.
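The decision logic of steps S32a to S42a reduces to a small function over the four flags. The sketch below assumes the flags have already been computed; the function and parameter names are illustrative, and how empty rights or bottoms groups should be handled is not stated in this passage, so it is left to the caller.

```python
def classify_table(is_item_name: bool, already_listed: bool,
                   flg1: bool, flg2: bool, flg3: bool, flg4: bool) -> str:
    """Return the table flag for a target node.

    flg1: all nodes in rights are item names
    flg2: the rights region has the same height as the target node
    flg3: all nodes in bottoms are item names
    flg4: the bottoms region has the same width as the target node
    """
    if not is_item_name or already_listed:  # step S32a
        return "do not process"
    flg_r = flg1 and flg2
    flg_b = flg3 and flg4
    if flg_r and flg_b:
        return "orthogonal table"
    if flg_r:
        return "vertical table"
    if flg_b:
        return "horizontal table"
    return "do not process"


# Node d9 of FIG. 5: all four flags are true.
d9_flag = classify_table(True, False, True, True, True, True)
```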

Next, the processing of the vertical table acquisition unit 204 is described with reference to FIG. 24. FIG. 24 is a flowchart showing the flow of processing of the vertical table acquisition unit. As shown in FIG. 24, the vertical table acquisition unit 204 first reads the starting node and the style graph (step S51a). Next, it creates a new table list (step S52a). It then stores in least_tops the uppermost node group among the starting node and the node group connected to the right side of the starting node (step S53a).

The vertical table acquisition unit 204 then performs the following processing for each node included in least_tops (steps S54a and S57a). First, it acquires the vertical table logical relationship (step S55a). The details of this acquisition process are described later. Next, it adds the logical relationship to the table list (step S56a). After processing all the nodes, the vertical table acquisition unit 204 outputs the table list (step S58a).

Next, the processing of the horizontal table acquisition unit 205 is described with reference to FIG. 25. FIG. 25 is a flowchart showing the flow of processing of the horizontal table acquisition unit. As shown in FIG. 25, the horizontal table acquisition unit 205 first reads the starting node and the style graph (step S61a). Next, it creates a new table list (step S62a). It then stores in least_lefts the leftmost node group among the starting node and the node group connected to the lower side of the starting node (step S63a).

The horizontal table acquisition unit 205 then performs the following processing for each node included in least_lefts (steps S64a and S67a). First, it acquires the horizontal table logical relationship (step S65a). The details of this acquisition process are described later. Next, it adds the logical relationship to the table list (step S66a). After processing all the nodes, the horizontal table acquisition unit 205 outputs the table list (step S68a).

For example, based on the style graph corresponding to the table whose starting point is node d48 in FIG. 5, the horizontal table acquisition unit 205 outputs a table list such as that shown in FIG. 10. In this case, the horizontal table acquisition unit 205 stores node d49 in least_lefts.

Next, the processing of the orthogonal table acquisition unit 206 is described with reference to FIG. 26. FIG. 26 is a flowchart showing the flow of processing of the orthogonal table acquisition unit. As shown in FIG. 26, the orthogonal table acquisition unit 206 first reads the starting node and the style graph (step S71a). Next, it creates a new table list (step S72a).

Next, the orthogonal table acquisition unit 206 stores in least_tops the uppermost nodes among the nodes connected to the right side of the starting node (step S73a). The orthogonal table acquisition unit 206 then performs the following processing for each node included in least_tops (steps S74a and S77a). First, the orthogonal table acquisition unit 206 acquires the vertical table logical relationships (step S75a). Next, the orthogonal table acquisition unit 206 adds the logical relationships to the table list (step S76a).

Next, the orthogonal table acquisition unit 206 stores in least_lefts the leftmost nodes among the nodes connected to the lower side of the starting node (step S78a). The orthogonal table acquisition unit 206 then performs the following processing for each node included in least_lefts (steps S79a and S82a). First, the orthogonal table acquisition unit 206 acquires the horizontal table logical relationships (step S80a). Next, the orthogonal table acquisition unit 206 adds the logical relationships to the table list (step S81a). After processing all the nodes, the orthogonal table acquisition unit 206 outputs the table list (step S83a).

For example, based on the style graph corresponding to the table starting from node d9 in FIG. 5, the orthogonal table acquisition unit 206 outputs a table list such as that shown in FIG. 9. In this case, the orthogonal table acquisition unit 206 stores node d10 in least_tops, and stores nodes d13, d20, d27, d30, and d33 in least_lefts.
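The flow of FIG. 26 can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the patented implementation: the adjacency dictionary `adj`, the coordinate dictionary `pos`, the helper names `connected` and `extreme`, and the two callbacks standing in for the logical relationship acquisition processing are all hypothetical. It also assumes that y grows downward, so the smallest y is the uppermost position.

```python
from collections import deque

def connected(adj, start, direction):
    """Nodes reached by repeatedly following `direction` edges from `start`."""
    found, queue = [], deque(adj.get(start, {}).get(direction, []))
    while queue:
        node = queue.popleft()
        if node not in found:
            found.append(node)
            queue.extend(adj.get(node, {}).get(direction, []))
    return found

def extreme(nodes, pos, axis):
    """Nodes whose coordinate on `axis` (0 = x, 1 = y) is the smallest."""
    if not nodes:
        return []
    lowest = min(pos[n][axis] for n in nodes)
    return [n for n in nodes if pos[n][axis] == lowest]

def acquire_orthogonal_table(start, adj, pos, get_vertical, get_horizontal):
    table_list = []                                               # S72a
    least_tops = extreme(connected(adj, start, "right"), pos, 1)  # S73a
    for node in least_tops:                                       # S74a-S77a
        table_list.append(get_vertical(node))                     # S75a, S76a
    least_lefts = extreme(connected(adj, start, "down"), pos, 0)  # S78a
    for node in least_lefts:                                      # S79a-S82a
        table_list.append(get_horizontal(node))                   # S80a, S81a
    return table_list                                             # S83a

# Hypothetical layout (not FIG. 5): header row c1, c2 and header column r1, r2.
adj = {"s": {"right": ["c1"], "down": ["r1"]},
       "c1": {"right": ["c2"]}, "r1": {"down": ["r2"]}}
pos = {"s": (0, 0), "c1": (1, 0), "c2": (2, 0), "r1": (0, 1), "r2": (0, 2)}
result = acquire_orthogonal_table(
    "s", adj, pos, lambda n: ("vertical", n), lambda n: ("horizontal", n))
```

Passing the two acquisition routines as callbacks keeps the sketch independent of how the vertical and horizontal logical relationships are actually obtained.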

Next, the processing for acquiring logical relationships in the vertical direction, that is, the vertical table logical relationship acquisition processing (step S55a in FIG. 24 and step S75a in FIG. 26), will be described with reference to FIG. 27. FIG. 27 is a flowchart showing the flow of processing for acquiring logical relationships in the vertical direction. The vertical table logical relationship acquisition processing is performed by the vertical table acquisition unit 204 or the orthogonal table acquisition unit 206. Here, an example in which the orthogonal table acquisition unit 206 performs the processing will be described. The processing is the same when it is performed by the vertical table acquisition unit 204.

As shown in FIG. 27, the orthogonal table acquisition unit 206 first reads the target node and the style graph (step S101a). The orthogonal table acquisition unit 206 then generates a variable pair representing a new logical relationship (step S102a). Next, the orthogonal table acquisition unit 206 stores in next_b the set of nodes immediately below the target node (step S103a).

Here, if next_b contains two or more nodes, or if all the nodes in next_b are item names (step S104a, true), the orthogonal table acquisition unit 206 adds to pair a logical relationship whose parent is the target node, whose children are next_b, and whose direction is vertical (step S105a).

Then, the orthogonal table acquisition unit 206 recursively performs the vertical table logical relationship acquisition processing for each node (crt) included in next_b (steps S106a and S108a). The orthogonal table acquisition unit 206 stores the logical relationships between item names and item values in pair, stores the style graph in G, and replaces the target node with crt (step S107a). The orthogonal table acquisition unit 206 then returns to step S103a and stores in next_b the set of nodes immediately below the target node.

If next_b does not contain two or more nodes and at least one node in next_b is not an item name (step S104a, false), the orthogonal table acquisition unit 206 stores in bottoms the set of nodes on the lower side of the target node (step S109a). The orthogonal table acquisition unit 206 then adds to pair a logical relationship whose parent is the target node, whose children are bottoms, and whose direction is vertical (step S110a). After processing all the target nodes, the orthogonal table acquisition unit 206 outputs the set of logical relationships (pair) (step S111a).

Next, the processing for acquiring logical relationships in the horizontal direction, that is, the horizontal table logical relationship acquisition processing (step S65a in FIG. 25 and step S80a in FIG. 26), will be described with reference to FIG. 28. FIG. 28 is a flowchart showing the flow of processing for acquiring logical relationships in the horizontal direction. The horizontal table logical relationship acquisition processing is performed by the horizontal table acquisition unit 205 or the orthogonal table acquisition unit 206. Here, an example in which the orthogonal table acquisition unit 206 performs the processing will be described. The processing is the same when it is performed by the horizontal table acquisition unit 205.

As shown in FIG. 28, the orthogonal table acquisition unit 206 first reads the target node and the style graph (step S201a). The orthogonal table acquisition unit 206 then generates a variable pair representing a new logical relationship (step S202a). Next, the orthogonal table acquisition unit 206 stores in next_r the set of nodes immediately to the right of the target node (step S203a).

Here, if next_r contains two or more nodes, or if all the nodes in next_r are item names (step S204a, true), the orthogonal table acquisition unit 206 adds to pair a logical relationship whose parent is the target node, whose children are next_r, and whose direction is horizontal (step S205a).

Then, the orthogonal table acquisition unit 206 recursively performs the horizontal table logical relationship acquisition processing for each node (crt) included in next_r (steps S206a and S208a). The orthogonal table acquisition unit 206 stores the logical relationships between item names and item values in pair, stores the style graph in G, and replaces the target node with crt (step S207a). The orthogonal table acquisition unit 206 then returns to step S203a and stores in next_r the set of nodes immediately to the right of the target node.

If next_r does not contain two or more nodes and at least one node in next_r is not an item name (step S204a, false), the orthogonal table acquisition unit 206 stores in rights the set of nodes on the right side of the target node (step S209a). The orthogonal table acquisition unit 206 then adds to pair a logical relationship whose parent is the target node, whose children are rights, and whose direction is horizontal (step S210a). After processing all the target nodes, the orthogonal table acquisition unit 206 outputs the set of logical relationships (pair) (step S211a).
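The recursive procedures of FIG. 27 and FIG. 28 differ only in direction, so they can be sketched with one function. The following is a minimal sketch under assumptions: the adjacency dictionary (mapping a node to its immediate neighbours in one direction), the `is_name` dictionary, and the function names are hypothetical, and the early return for a node with no neighbour is an assumption that the text does not state.

```python
def connected(adj, node):
    """All nodes reached by following the adjacency from `node`."""
    found, frontier = [], list(adj.get(node, []))
    while frontier:
        n = frontier.pop(0)
        if n not in found:
            found.append(n)
            frontier.extend(adj.get(n, []))
    return found

def table_relations(target, adj, is_name, direction):
    """Collect (parent, children, direction) relationships starting at `target`."""
    pair = []                                            # S102a / S202a
    nxt = adj.get(target, [])                            # S103a / S203a
    if not nxt:
        return pair          # assumption: nothing to relate at the table edge
    if len(nxt) >= 2 or all(is_name[n] for n in nxt):    # S104a / S204a
        pair.append((target, tuple(nxt), direction))     # S105a / S205a
        for crt in nxt:                                  # S106a-S108a
            pair.extend(table_relations(crt, adj, is_name, direction))
    else:
        far = connected(adj, target)                     # S109a / S209a
        pair.append((target, tuple(far), direction))     # S110a / S210a
    return pair                                          # S111a / S211a

# Hypothetical vertical table: item name A has sub-items B and C with values below.
down = {"A": ["B", "C"], "B": ["v1"], "v1": ["v2"], "C": ["v3"]}
is_name = {"A": True, "B": True, "C": True,
           "v1": False, "v2": False, "v3": False}
relations = table_relations("A", down, is_name, "vertical")
```

For the horizontal case, the same function would be called with a right-adjacency dictionary and the direction "horizontal".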

Here, the processing of the table matching unit 207 will be described with reference to FIG. 29. FIG. 29 is a flowchart showing the flow of processing of the table matching unit. The table matching unit 207 first reads the style graph and the table list (step S111). Next, the table matching unit 207 acquires from the table list the group of nodes that have no children, children (step S112). If children includes an item name (step S113, true), the table matching unit 207 deletes that table from the table list (step S114). If children does not include an item name (step S113, false), the table matching unit 207 leaves that table in the table list.
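The check performed by the table matching unit can be illustrated as follows. This is a sketch under assumptions: each table is represented as a list of hypothetical `(parent, children, direction)` tuples, and a "node with no children" is taken to be any node that never appears as a parent with children.

```python
def childless_nodes(table):
    """Nodes of a table that never act as a parent (step S112)."""
    parents = {parent for parent, kids, _ in table if kids}
    nodes = set()
    for parent, kids, _ in table:
        nodes.add(parent)
        nodes.update(kids)
    return nodes - parents

def match_tables(table_list, is_name):
    """Keep only the tables whose childless nodes contain no item name."""
    return [table for table in table_list
            if not any(is_name[n] for n in childless_nodes(table))]  # S113, S114

is_name = {"A": True, "B": True, "v1": False}
complete = [("A", ("v1",), "vertical")]    # ends in an item value: kept
incomplete = [("A", ("B",), "vertical")]   # ends in an item name: deleted
kept = match_tables([complete, incomplete], is_name)
```

The intuition is that a table whose leaves are item names has no values attached to them, so it is not treated as a valid table.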

Next, the processing of the first deletion unit 208 will be described with reference to FIG. 30. FIG. 30 is a flowchart showing the flow of processing of the first deletion unit. First, the first deletion unit 208 reads the style graph (step S31b).

Next, the first deletion unit 208 performs the following processing for each node of the style graph (steps S32b and S37b). First, if two or more nodes are adjacent to the target node in the left direction (step S33b, true), the first deletion unit 208 deletes the adjacency relationships between the target node and the nodes adjacent to it in the left direction (step S34b). If fewer than two nodes are adjacent in the left direction (step S33b, false), the first deletion unit 208 does not delete the adjacency relationships.

Next, if two or more nodes are adjacent to the target node in the upward direction (step S35b, true), the first deletion unit 208 deletes the adjacency relationships between the target node and the nodes adjacent to it in the upward direction (step S36b). If fewer than two nodes are adjacent in the upward direction (step S35b, false), the first deletion unit 208 does not delete the adjacency relationships. After processing all the target nodes, the first deletion unit 208 outputs the style graph (step S38b).

Next, the processing of the second deletion unit 209 will be described with reference to FIG. 31. FIG. 31 is a flowchart showing the flow of processing of the second deletion unit. First, the second deletion unit 209 reads the style graph and the item name node list (step S41b). The item name node list is the list of item name nodes extracted by the extraction unit 201.

Next, the second deletion unit 209 performs the following processing for each item name node in the item name node list (steps S42b and S47b). First, if an item value node is adjacent to the target node in the left direction (step S43b, true), the second deletion unit 209 deletes the adjacency relationship between the target node and the node adjacent to it in the left direction (step S44b). If no item value node is adjacent in the left direction (step S43b, false), the second deletion unit 209 does not delete the adjacency relationship.

Next, if an item value node is adjacent to the target node in the upward direction (step S45b, true), the second deletion unit 209 deletes the adjacency relationship between the target node and the node adjacent to it in the upward direction (step S46b). If no item value node is adjacent in the upward direction (step S45b, false), the second deletion unit 209 does not delete the adjacency relationship. After processing all the target nodes, the second deletion unit 209 outputs the style graph (step S48b).
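The two deletion steps of FIG. 30 and FIG. 31 might be sketched as below. The graph representation (a dictionary of per-direction neighbour lists stored symmetrically), the helper names, and the decision to cut every left or upper adjacency of an item name once any item value is found there are assumptions made for illustration; the text does not fix these details.

```python
OPPOSITE = {"left": "right", "up": "down"}

def cut(adj, node, direction):
    """Remove every adjacency between `node` and its neighbours in `direction`."""
    for neighbour in adj[node][direction]:
        adj[neighbour][OPPOSITE[direction]].remove(node)
    adj[node][direction] = []

def first_deletion(adj):
    """FIG. 30: cut left/up adjacencies of nodes with two or more such neighbours."""
    for node in adj:                                   # S32b-S37b
        for direction in ("left", "up"):
            if len(adj[node][direction]) >= 2:         # S33b / S35b
                cut(adj, node, direction)              # S34b / S36b
    return adj                                         # S38b

def second_deletion(adj, is_name):
    """FIG. 31: cut left/up adjacencies of item names that touch an item value."""
    for node in adj:                                   # S42b-S47b
        if not is_name[node]:
            continue
        for direction in ("left", "up"):
            if any(not is_name[n] for n in adj[node][direction]):  # S43b / S45b
                cut(adj, node, direction)              # S44b / S46b
    return adj                                         # S48b

def make(left=(), up=(), right=(), down=()):
    return {"left": list(left), "up": list(up),
            "right": list(right), "down": list(down)}

# Hypothetical row: names a and b sit left of value c, and name n sits right of c.
adj = {"a": make(right=["c"]), "b": make(right=["c"]),
       "c": make(left=["a", "b"], right=["n"]), "n": make(left=["c"])}
is_name = {"a": True, "b": True, "c": False, "n": True}
first_deletion(adj)
second_deletion(adj, is_name)
```

After both passes in this example, node c keeps no left neighbours and item name n is no longer linked to the value on its left.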

Next, the processing of the enumeration classification unit 210 will be described with reference to FIG. 32. FIG. 32 is a flowchart showing the flow of processing of the enumeration classification unit. Hereinafter, the relationship between a target node and a node obtained by following right-adjacent nodes from the target node is referred to as "connected to the right side", and the relationship between a target node and a node obtained by following lower-adjacent nodes from the target node is referred to as "connected to the lower side". As shown in FIG. 32, the enumeration classification unit 210 first reads the style graph and the target node (step S51b). If the target node is not an item name (step S52b, true), the enumeration classification unit 210 sets the enumeration flag to "no enumeration" (step S58b). The enumeration flag is a variable that indicates the classification of each target node as one of "vertical enumeration", "horizontal enumeration", and "no enumeration". Enumerations also include composites of vertical and horizontal enumerations and nested enumeration structures, but these can be expressed as combinations of vertical and horizontal enumerations.

If the target node is an item name (step S52b, false), the enumeration classification unit 210 stores in bottoms the group of nodes connected to the lower side of the target node, and stores in rights the group of nodes connected to the right side of the target node (step S53b).

Here, if rights contains no item name and the number of nodes in rights is 1 (step S54b, true), the enumeration classification unit 210 sets the enumeration flag to "horizontal enumeration" (step S55b). If rights contains an item name, or if the number of nodes in rights is not 1 (step S54b, false), the enumeration classification unit 210 performs the following processing.

If bottoms contains no item name and the number of nodes in bottoms is 1 (step S56b, true), the enumeration classification unit 210 sets the enumeration flag to "vertical enumeration" (step S57b). If bottoms contains an item name, or if the number of nodes in bottoms is not 1 (step S56b, false), the enumeration classification unit 210 sets the enumeration flag to "no enumeration" (step S58b). Finally, the enumeration classification unit 210 outputs the enumeration flag (step S59b).
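The decision sequence of FIG. 32 is small enough to write out directly. The sketch below is illustrative only: the adjacency dictionaries `right` and `down`, the `is_name` dictionary, and the function names are hypothetical stand-ins for the style graph.

```python
def connected(adj, node):
    """All nodes reached by following the adjacency from `node`."""
    found, frontier = [], list(adj.get(node, []))
    while frontier:
        n = frontier.pop(0)
        if n not in found:
            found.append(n)
            frontier.extend(adj.get(n, []))
    return found

def classify_enumeration(target, is_name, right, down):
    if not is_name[target]:                              # S52b, true
        return "no enumeration"                          # S58b
    bottoms = connected(down, target)                    # S53b
    rights = connected(right, target)
    if len(rights) == 1 and not is_name[rights[0]]:      # S54b
        return "horizontal enumeration"                  # S55b
    if len(bottoms) == 1 and not is_name[bottoms[0]]:    # S56b
        return "vertical enumeration"                    # S57b
    return "no enumeration"                              # S58b

# Hypothetical form fragments.
is_name = {"Name": True, "Alice": False, "Date": True, "2016": False,
           "Head": True, "Sub": True}
right = {"Name": ["Alice"]}
down = {"Date": ["2016"], "Head": ["Sub"]}
flags = {n: classify_enumeration(n, is_name, right, down)
         for n in ("Name", "Date", "Head", "Alice")}
```

Note that the horizontal test is evaluated first, so a node that satisfies both conditions is classified as a horizontal enumeration, as in the flowchart order described above.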

Next, the processing of the vertical enumeration acquisition unit 211 will be described with reference to FIG. 33. FIG. 33 is a flowchart showing the flow of processing of the vertical enumeration acquisition unit. As shown in FIG. 33, the vertical enumeration acquisition unit 211 first reads the style graph and the target node (step S61b). Next, the vertical enumeration acquisition unit 211 stores in bottoms the group of nodes connected to the lower side of the target node (step S62b). If bottoms contains an item name (step S63b, true), the vertical enumeration acquisition unit 211 ends the processing. If bottoms contains no item name (step S63b, false), the vertical enumeration acquisition unit 211 acquires a logical relationship whose parent is the target node, whose children are bottoms, and whose direction is vertical (step S64b). The vertical enumeration acquisition unit 211 then adds the acquired logical relationship to the enumeration list 302 (step S65b).

Next, the processing of the horizontal enumeration acquisition unit 212 will be described with reference to FIG. 34. FIG. 34 is a flowchart showing the flow of processing of the horizontal enumeration acquisition unit. As shown in FIG. 34, the horizontal enumeration acquisition unit 212 first reads the style graph and the target node (step S71b). Next, the horizontal enumeration acquisition unit 212 stores in rights the group of nodes connected to the right side of the target node (step S72b). If rights contains an item name (step S73b, true), the horizontal enumeration acquisition unit 212 ends the processing. If rights contains no item name (step S73b, false), the horizontal enumeration acquisition unit 212 acquires a logical relationship whose parent is the target node, whose children are rights, and whose direction is horizontal (step S74b). The horizontal enumeration acquisition unit 212 then adds the acquired logical relationship to the enumeration list 302 (step S75b).
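Since FIG. 33 and FIG. 34 mirror each other, both can be illustrated with one small function. This is a sketch with hypothetical names: `members` stands for the already computed bottoms or rights group, and relationships are represented as `(parent, children, direction)` tuples.

```python
def acquire_enumeration(target, members, is_name, direction, enumeration_list):
    """Append one enumeration relationship unless `members` holds an item name."""
    if any(is_name[n] for n in members):       # S63b / S73b, true: stop
        return enumeration_list
    relation = (target, tuple(members), direction)   # S64b / S74b
    enumeration_list.append(relation)                # S65b / S75b
    return enumeration_list

is_name = {"Date": True, "2016": False, "Head": True, "Sub": True}
enumeration_list = []
acquire_enumeration("Date", ["2016"], is_name, "vertical", enumeration_list)
acquire_enumeration("Head", ["Sub"], is_name, "vertical", enumeration_list)
```

In this example only the relationship for "Date" is recorded, because the group below "Head" contains an item name.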

Next, the processing of the inclusion relationship acquisition unit 214 will be described with reference to FIG. 35. FIG. 35 is a flowchart showing the flow of processing of the inclusion relationship acquisition unit. The inclusion relationship acquisition unit 214 first reads the style graph (step S81b).

Here, the inclusion relationship acquisition unit 214 performs the following processing for each node included in the style graph (steps S82b and S87b). First, if the node itself is an item name and the nodes adjacent to it in the right direction include an item name node (step S83b, true), the inclusion relationship acquisition unit 214 acquires the right-side inclusion relationship (step S84b). The details of the processing for acquiring the right-side inclusion relationship will be described later. If the node itself is not an item name, or if the nodes adjacent to it in the right direction do not include an item name node (step S83b, false), the inclusion relationship acquisition unit 214 does not acquire the right-side inclusion relationship.

Next, if the node itself is an item name and the nodes adjacent to it in the downward direction include an item name node (step S85b, true), the inclusion relationship acquisition unit 214 acquires the lower-side inclusion relationship (step S86b). The details of the processing for acquiring the lower-side inclusion relationship will be described later. If the node itself is not an item name, or if the nodes adjacent to it in the downward direction do not include an item name node (step S85b, false), the inclusion relationship acquisition unit 214 does not acquire the lower-side inclusion relationship. After processing all the nodes, the inclusion relationship acquisition unit 214 outputs the acquired inclusion relationships as an inclusion relationship list (step S88b).

Next, the processing for acquiring the right-side inclusion relationship will be described with reference to FIG. 36. FIG. 36 is a flowchart showing the flow of processing for acquiring the right-side inclusion relationship. As shown in FIG. 36, the inclusion relationship acquisition unit 214 first reads the target node and the style graph (step S101b). Next, the inclusion relationship acquisition unit 214 stores in min_y the minimum y coordinate of the node group on the right side of the target node, and stores in max_y the maximum y coordinate of that node group (step S102b).

Here, if the y-coordinate range of the target node matches the y-coordinate range of the node group on the right side of the target node (step S103b, true), the inclusion relationship acquisition unit 214 stores the node group on the right side of the target node in list (step S107b). The inclusion relationship acquisition unit 214 calculates the y-coordinate range of the node group on the right side of the target node using min_y and max_y. The inclusion relationship acquisition unit 214 then performs the following processing for each node included in list (steps S108b and S110b). Each time one of these nodes r is taken as the target node, the inclusion relationship acquisition unit 214 returns to step S102b and executes the processing recursively (step S109b).

If the y-coordinate range of the target node does not match the y-coordinate range of the node group on the right side of the target node (step S103b, false), and the included nodes contain no item value (step S104b, false), the inclusion relationship acquisition unit 214 initializes the set of subordinate nodes included on the right side (list) (step S105b). If the included nodes contain an item value (step S104b, true), the inclusion relationship acquisition unit 214 does not perform the initialization. Finally, the inclusion relationship acquisition unit 214 outputs list (step S106b).

Next, the processing for acquiring the lower-side inclusion relationship will be described with reference to FIG. 37. FIG. 37 is a flowchart showing the flow of processing for acquiring the lower-side inclusion relationship. As shown in FIG. 37, the inclusion relationship acquisition unit 214 first reads the target node and the style graph (step S151b). Next, the inclusion relationship acquisition unit 214 stores in min_x the minimum x coordinate of the node group on the lower side of the target node, and stores in max_x the maximum x coordinate of that node group (step S152b).

Here, if the x-coordinate range of the target node matches the x-coordinate range of the node group on the lower side of the target node (step S153b, true), the inclusion relationship acquisition unit 214 stores the node group on the lower side of the target node in list (step S157b). The inclusion relationship acquisition unit 214 calculates the x-coordinate range of the node group on the lower side of the target node using min_x and max_x. The inclusion relationship acquisition unit 214 then performs the following processing for each node included in list (steps S158b and S160b). Each time one of these nodes b is taken as the target node, the inclusion relationship acquisition unit 214 returns to step S152b and executes the processing recursively (step S159b).

If the x-coordinate range of the target node does not match the x-coordinate range of the node group on the lower side of the target node (step S153b, false), and the included nodes contain no item value (step S154b, false), the inclusion relationship acquisition unit 214 initializes the set of subordinate nodes included on the lower side (list) (step S155b). If the included nodes contain an item value (step S154b, true), the inclusion relationship acquisition unit 214 does not perform the initialization. Finally, the inclusion relationship acquisition unit 214 outputs list (step S156b).
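The range comparison at the heart of steps S102b to S103b and S152b to S153b can be illustrated in isolation. The sketch below covers only that comparison, not the recursion or the item value branch. It assumes, hypothetically, that each node carries a `(start, end)` span along the relevant axis (y for the right-side case, x for the lower-side case); the function and variable names are illustrative.

```python
def span_matches(target, group, span):
    """True if the target's span equals the overall span of the node group."""
    if not group:
        return False
    lowest = min(span[n][0] for n in group)     # min_y or min_x
    highest = max(span[n][1] for n in group)    # max_y or max_x
    return span[target] == (lowest, highest)

# Hypothetical y spans: "Address" covers two rows, matched by "Zip" plus "Street".
y_span = {"Address": (0, 2), "Zip": (0, 1), "Street": (1, 2), "Phone": (0, 1)}
address_includes = span_matches("Address", ["Zip", "Street"], y_span)
phone_includes = span_matches("Phone", ["Zip", "Street"], y_span)
```

A node whose frame spans exactly the rows (or columns) occupied by its neighbours is the kind of node that the procedure treats as including them.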

Next, the processing of the inclusion graph generation unit 215 will be described with reference to FIG. 38. FIG. 38 is a flowchart showing the flow of processing of the inclusion graph generation unit. As shown in FIG. 38, the inclusion graph generation unit 215 first reads the style graph and the inclusion relationship list (step S201b). Next, the inclusion graph generation unit 215 stores in Nv the vertical inclusion nodes acquired from the inclusion relationship list, stores in Nh the horizontal inclusion nodes acquired from the inclusion relationship list, and stores in IG a new set of inclusion graphs (step S202b).

Here, the inclusion graph generation unit 215 performs the following processing for each inclusion node i included in Nv and Nh (steps S203b and S213b). Hereinafter, a target node that has already been assigned to an inclusion graph through the inclusion relationship of another node is referred to as a "divided node". First, if inclusion node i has already been divided (step S204b, true), the inclusion graph generation unit 215 proceeds to the processing of the next inclusion node. If inclusion node i has not been divided (step S204b, false), the inclusion graph generation unit 215 stores inclusion node i and its set of subordinate item name nodes in inc (step S205b).

Here, if inc contains a divided node (step S206b, true), the inclusion graph generation unit 215 searches the already generated inclusion graphs for a graph whose subordinate nodes overlap with inc, and stores the inclusion node of that graph in n (step S207b). Next, the inclusion graph generation unit 215 adds to the inclusion graph starting from inclusion node n the nodes that have not yet been added to it (step S208b). The inclusion graph generation unit 215 then sets the inclusion direction based on the direction in which the inclusion node includes other nodes or on the arrangement of the item names (step S209b).

On the other hand, if inc contains no divided node (step S206b, false), the inclusion graph generation unit 215 generates a new inclusion graph (step S210b). The inclusion graph generation unit 215 then sets the inclusion direction based on the direction in which the main node includes other nodes or on the arrangement of the item names (step S211b), and adds the new inclusion graph to the inclusion graph set IG (step S212b). After processing all the inclusion nodes i, the inclusion graph generation unit 215 outputs the inclusion graph set IG (step S214b).
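The grouping logic of FIG. 38 might be sketched as follows. This is a simplified illustration under several assumptions: inclusion relationships are given as hypothetical `(inclusion_node, subordinates, direction)` tuples, each inclusion graph is a plain dictionary, the direction is simply recorded rather than derived from the arrangement of item names, and inclusion nodes with more subordinate nodes are processed first.

```python
def build_inclusion_graphs(inclusions):
    graphs, assigned = [], {}          # IG, and node -> index of its graph
    ordered = sorted(inclusions, key=lambda item: len(item[1]), reverse=True)
    for i, subordinates, direction in ordered:        # S203b-S213b
        if i in assigned:                             # S204b, true: skip
            continue
        inc = [i, *subordinates]                      # S205b
        hits = [assigned[n] for n in inc if n in assigned]
        if hits:                                      # S206b, true
            index = hits[0]                           # S207b
            graph = graphs[index]
            for n in inc:                             # S208b
                if n not in graph["nodes"]:
                    graph["nodes"].append(n)
            if direction not in graph["directions"]:  # S209b (simplified)
                graph["directions"].append(direction)
        else:                                         # S210b-S212b
            index = len(graphs)
            graphs.append({"root": i, "nodes": list(inc),
                           "directions": [direction]})
        for n in inc:
            assigned.setdefault(n, index)
    return graphs                                     # S214b

# Hypothetical inclusion relationships: G overlaps A's graph through node C.
inclusions = [("E", ["F"], "right"),
              ("G", ["C", "H"], "down"),
              ("A", ["B", "C", "D"], "right")]
graphs = build_inclusion_graphs(inclusions)
```

In this example the relationships of A and G end up in one inclusion graph because they share node C, while E and F form a separate graph.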

In this way, among the nodes that have an inclusion relationship, the inclusion graph generation unit 215 gives priority to those that include a larger number of nodes. The inclusion graph generation unit 215 first acquires all the inclusion relationships in the style graph and then searches for nodes in the inclusion direction.

Here, the processing of the enumeration matching unit 213 is described with reference to FIG. 39. FIG. 39 is a flowchart showing the flow of processing of the enumeration matching unit. As shown in FIG. 39, the enumeration matching unit 213 first reads the style graph and the enumeration list (step S121). Next, if the logical relationship of an enumeration list is contained in the table list (step S122, true), the enumeration matching unit 213 deletes that enumeration list (step S123).
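The filtering in steps S121 to S123 can be illustrated with a short sketch. It assumes, purely for illustration, that each logical relationship is a hashable pair such as (item name, item value) and that "contained in the table list" means the same pair appears there; the function name `match_enumerations` is hypothetical.

```python
def match_enumerations(table_list, enumeration_list):
    """Drop enumeration relationships that the table list already covers.

    Both arguments are lists of (item_name, item_value) pairs (assumed format).
    """
    table_relations = set(table_list)
    # S122 / S123: an enumeration relationship found in the table list is deleted.
    return [rel for rel in enumeration_list if rel not in table_relations]
```

Only the relationships that survive this filter would remain as enumerations to be synthesized later.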

Next, the processing of the inter-item-name synthesis unit 216 is described with reference to FIG. 40. FIG. 40 is a flowchart showing the flow of processing of the inter-item-name synthesis unit. As shown in FIG. 40, the inter-item-name synthesis unit 216 first reads the style graph, the inclusion graph 303, the table list 301, the enumeration list 302, and the start point node group (step S301b). Next, the inter-item-name synthesis unit 216 generates a new tree structure T (step S302b). The start point node group is the set of nodes that are the starting points of the inclusion graphs.

The inter-item-name synthesis unit 216 then performs the following processing for each start point node in the start point node group (steps S303b, S319b). First, it obtains from the inclusion graph the group of nodes that are item names and stores the group in children (step S304b). Next, it generates a new tree node t for the start point node, with children as its children and no parent (step S305b). It then sets the target tree node to t (step S306b) and adds the target tree node to the tree structure without setting its type (step S307b).

Here, if the target tree node is in the table list or the enumeration list (step S308b, false), the inter-item-name synthesis unit 216 moves on to the next start point node. If the target tree node is in neither the table list nor the enumeration list (step S308b, true) and there is a set of nodes that the target tree node includes in the horizontal direction (step S309b, true), the inter-item-name synthesis unit 216 sets the type of the target tree node to inclusion (step S310b). It then takes the next node of the horizontally included node set as the target tree node, returns to step S307b, and processes that node recursively (steps S311b, S312b, S313b).

If the target tree node is in neither the table list nor the enumeration list (step S308b, true), there is no set of nodes that the target tree node includes in the horizontal direction (step S309b, false), and there is a set of nodes that the target tree node includes in the vertical direction (step S314b, true), the inter-item-name synthesis unit 216 sets the type of the target tree node to inclusion (step S315b). It then takes the next node of the vertically included node set as the target tree node, returns to step S307b, and processes that node recursively (steps S316b, S317b, S318b). After processing all start point nodes, the inter-item-name synthesis unit 216 outputs the tree structure data (step S320b).

If the target tree node is in neither the table list nor the enumeration list (step S308b, true), there is no set of nodes that the target tree node includes in the horizontal direction (step S309b, false), and there is no set of nodes that it includes in the vertical direction either (step S314b, false), the inter-item-name synthesis unit 216 moves on to the next start point node.

Next, the processing of the table synthesis unit 217 is described with reference to FIG. 41. FIG. 41 is a flowchart showing the flow of processing of the table synthesis unit. First, the table synthesis unit 217 reads the style graph and the table list 301 (step S301a). Next, the table synthesis unit 217 generates a new tree structure (step S302a). Then, the table synthesis unit 217 acquires the ancestors (nodes without a parent) from the table list (step S303a).

The table synthesis unit 217 then performs the following processing for each node in the table list (steps S304a, S310a). First, it generates a new tree node (step S305a). Next, it searches the table list for the parent of the node and sets it as the parent of the tree node; if there is no parent, the parent of the tree node is left empty (step S306a). It then sets the children of the tree node to the node's children in the table list (step S307a), sets the type of the tree node to table (step S308a), and merges the tree node into the tree structure (step S309a). Finally, when every node in the table list has been processed, the table synthesis unit 217 outputs the tree structure data (step S311a).
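The loop of steps S304a to S310a might look like the following sketch. The `TreeNode` class, the `synthesize_tables` function, and the representation of the table list as a dictionary from each node to its children are assumptions chosen for the example; the embodiment does not specify a data format.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class TreeNode:
    name: str
    kind: Optional[str] = None      # "table", "enumeration", "inclusion", ...
    parent: Optional[str] = None    # None means the node has no parent
    children: list = field(default_factory=list)


def synthesize_tables(table_list):
    """table_list: {node: [child nodes]} (assumed representation of the table list)."""
    tree = {}                                            # S302a: new tree structure
    for node, children in table_list.items():            # S304a .. S310a
        t = TreeNode(node)                               # S305a: new tree node
        # S306a: search the table list for the parent; leave it empty if none exists.
        t.parent = next((p for p, cs in table_list.items() if node in cs), None)
        t.children = list(children)                      # S307a
        t.kind = "table"                                 # S308a
        tree[node] = t                                   # S309a: merge into the tree
    return tree                                          # S311a
```

The enumeration synthesis described next follows the same pattern, with the type set to enumeration instead of table.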

Next, the processing of the enumeration synthesis unit 218 is described with reference to FIG. 42. FIG. 42 is a flowchart showing the flow of processing of the enumeration synthesis unit. As shown in FIG. 42, the enumeration synthesis unit 218 first reads the style graph, the enumeration list 302, and the tree structure data (step S341b).

The enumeration synthesis unit 218 then performs the following processing for each node in the enumeration list 302 (steps S342b, S347b). First, it generates a new tree node (step S343b). Next, it sets the children of the tree node to the node's children in the enumeration list (step S344b). It then sets the type of the tree node to enumeration (step S345b) and merges the tree node into the tree structure data (step S346b). After performing this processing for all nodes, the enumeration synthesis unit 218 outputs the tree structure data (step S348b).

Next, the processing of the adding unit 219 is described with reference to FIG. 43. FIG. 43 is a flowchart showing the flow of processing of the adding unit. As shown in FIG. 43, the adding unit 219 first reads the style graph and the tree structure data, stores the tree structure data in S, and stores the style graph in G (step S131). Next, the adding unit 219 stores the set of parentless nodes in nds and stores an arbitrary character string in str (step S132). If a character string outside the ruled-line frame, to the left of or above nds, can be acquired, the adding unit 219 stores the acquired character string in str (step S133).

If no character string outside the ruled-line frame is found (step S134, false), the adding unit 219 adds to the tree structure data, as the parent of nds, a tree node of type style that holds the arbitrary character string stored in str (step S135). If a character string outside the ruled-line frame is found (step S134, true), the adding unit 219 adds to the tree structure data, as the parent of nds, a tree node of type style that holds the character string stored in str, which at this point is the acquired character string (step S136). Finally, the adding unit 219 outputs the tree structure data (step S137).

The arbitrary character string may vary with a counter, as in form#(i). In that case the (i) part changes with the count, so the root node strings become "form1", "form2", and so on. If a label such as "XX orthogonal table" appears outside the ruled-line frame to the left of or above the table, the root node string may be set to that label, for example "XX orthogonal table".
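The root-node handling of steps S131 to S137 can be sketched as below. The function name `add_root`, the dictionary-based tree representation, and the `count` parameter that supplies the (i) part of form#(i) are illustrative assumptions; how the text outside the ruled-line frame is detected is outside the scope of this sketch and is simply passed in as `outside_text`.

```python
def add_root(tree, outside_text=None, count=1):
    """Attach a root node of type "style" above all parentless nodes.

    tree: {name: {"type": str, "parent": str or None, "children": list}} (assumed format)
    outside_text: string found outside the ruled-line frame, or None if nothing was found
    """
    # S132: collect the parentless nodes before the root itself is added.
    nds = [name for name, node in tree.items() if node["parent"] is None]
    # S133 / S134: prefer the text found outside the frame, else fall back to form<count>.
    label = outside_text if outside_text else f"form{count}"
    # S135 / S136: the new node becomes the parent of every node in nds.
    tree[label] = {"type": "style", "parent": None, "children": nds}
    for name in nds:
        tree[name]["parent"] = label
    return label
```

For example, calling `add_root(tree, count=2)` on a tree with no frame-external label would produce a root named "form2", while passing `outside_text="XX orthogonal table"` would use that label instead.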

[Effects of the first embodiment]
The extraction unit 201 uses a graph in which information on the regions representing item names or item values of a form is represented as nodes and the adjacency relationships between nodes are represented as edges. From the nodes of this graph, it extracts those that satisfy a preset condition as item name nodes, that is, nodes of regions representing item names. Based on the edges, the table classification unit 203 extracts from the item name nodes a starting node, which is the node at which a table starts, and classifies the table starting at that node as one of the following: an orthogonal table, in which item names exist at both the top edge and the left edge; a vertical table, in which item names exist at the top edge; or a horizontal table, in which item names exist at the left edge.

The orthogonal table acquisition unit 206 acquires the vertical logical relationships between item name nodes in an orthogonal table, the horizontal logical relationships between item name nodes in the orthogonal table, the vertical logical relationships between item name nodes and item value nodes (nodes other than item name nodes) in the orthogonal table, and the horizontal logical relationships between item name nodes and item value nodes in the orthogonal table. The vertical table acquisition unit 204 acquires the vertical logical relationships between item name nodes in a vertical table and the vertical logical relationships between item name nodes and item value nodes in the vertical table. The horizontal table acquisition unit 205 acquires the horizontal logical relationships between item name nodes in a horizontal table and the horizontal logical relationships between item name nodes and item value nodes in the horizontal table. In addition, the table matching unit 207 identifies, from among the orthogonal, vertical, and horizontal tables, the tables that remain after excluding those that satisfy a predetermined condition indicating an inconsistent table.

When a plurality of nodes are adjacent to one node in a predetermined direction, the first deletion unit 208 deletes the edges representing the adjacency between that node and the plurality of nodes. For each item name node that has an item value node (a node of a region representing an item value) adjacent to it in a predetermined direction, the second deletion unit 209 deletes the edge representing the adjacency between that item name node and the item value node.

The vertical enumeration acquisition unit 211 and the horizontal enumeration acquisition unit 212 acquire the logical relationships between item name nodes and item value nodes based on the graph from which edges have been deleted by the first deletion unit 208 and the second deletion unit 209. The inclusion relationship acquisition unit 214 acquires the inclusion relationships between item name nodes based on the graph from which edges have been deleted by the first deletion unit 208.

From the logical relationships acquired by the vertical enumeration acquisition unit 211 and the horizontal enumeration acquisition unit 212, the enumeration matching unit 213 excludes the logical relationships concerning nodes contained in the table list 301, and thereby identifies the logical relationships of the enumerations to be synthesized.

The inter-item-name synthesis unit 216 creates tree structure data based on those inclusion relationships acquired by the inclusion relationship acquisition unit 214 that are contained in neither the logical relationships of the tables to be synthesized nor the logical relationships of the enumerations to be synthesized. The table synthesis unit 217 creates tree structure data based on the logical relationships of the tables to be synthesized, and creates tree structure data that merges it with the tree structure data created by the inter-item-name synthesis unit 216. The enumeration synthesis unit 218 creates tree structure data based on the logical relationships of the enumerations to be synthesized, and creates tree structure data that merges it with the tree structure data created by the table synthesis unit 217.

Therefore, according to the present embodiment, even when a form contains a vertical table, a horizontal table, an orthogonal table, or a nested orthogonal table, the logical relationships between the item names of the form and between item names and item values can be recognized accurately. Furthermore, because the logical relationships can be acquired semi-automatically in the present embodiment, the semi-structured data of forms can be acquired and used efficiently.

In addition, according to the present embodiment, even when a form contains a vertical enumeration, a horizontal enumeration, a combination of vertical and horizontal enumerations, or a nested enumeration structure, the logical relationships between the item names of the form and between item names and item values can be recognized accurately. Furthermore, because the logical relationships can be acquired semi-automatically in the present embodiment, the semi-structured data of forms can be acquired and used efficiently.

Furthermore, according to the present embodiment, the table matching unit 207 and the enumeration matching unit 213 identify the logical relationships to be included in the tree structure data. Consequently, even when a vertical enumeration, a horizontal enumeration, a combination of vertical and horizontal enumerations, or a nested structure in a form contains a vertical enumeration, a horizontal enumeration, a combination of vertical and horizontal enumerations, a vertical table, a horizontal table, an orthogonal table, a nested vertical table, a nested horizontal table, or a nested orthogonal table, the logical relationships between the item names of the form and between item names and item values can be recognized accurately.

The table classification unit 203 uses two conditions. The first condition is that every node in the first node group, which is connected to the right side of the starting node, is an item name and that the height of the first node group equals the height of the starting node. The second condition is that every node in the second node group, which is connected to the lower side of the starting node, is an item name and that the width of the second node group equals the width of the starting node. If both conditions are satisfied, the table starting at the starting node is classified as an orthogonal table. If the first condition is satisfied and the second is not, the table is classified as a horizontal table. If the second condition is satisfied and the first is not, the table is classified as a vertical table.
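The decision described in the preceding paragraph can be written compactly as the sketch below, which follows the mapping from conditions to table types exactly as that paragraph states it. The function name `classify_table` and its parameters are hypothetical; in particular, how the height or width of a node group is measured is not specified here, so the caller is assumed to supply those values.

```python
from typing import Optional


def classify_table(origin_width, origin_height,
                   right_all_item_names, right_group_height,
                   below_all_item_names, below_group_width) -> Optional[str]:
    """Classify the table that starts at a starting node.

    right_*: facts about the first node group (connected to the right side).
    below_*: facts about the second node group (connected to the lower side).
    Returns "orthogonal", "horizontal", "vertical", or None if neither condition holds.
    """
    first = right_all_item_names and right_group_height == origin_height
    second = below_all_item_names and below_group_width == origin_width
    if first and second:
        return "orthogonal"
    if first:
        return "horizontal"
    if second:
        return "vertical"
    return None  # the node is not treated as the start of a table
```

Returning `None` when neither condition holds is a choice made for this sketch; the paragraph above only describes the three classified cases.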

Furthermore, the table matching unit 207 identifies, from among the orthogonal, vertical, and horizontal tables, the tables that remain after excluding the following: orthogonal tables that have an item name node with no item value node either below it or to its right, vertical tables that have an item name node with no item value node below it, and horizontal tables that have an item name node with no item value node to its right.

By using the adjacency, height, and width of nodes in this way, it is possible not only to classify vertical, horizontal, and orthogonal tables accurately, but also to exclude inconsistent tables, such as those in which an item name node is mistakenly treated as an item value node.
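The exclusion rule applied by the table matching unit 207 can be illustrated as follows. This is a sketch under assumptions: each table is represented as a dictionary with a `kind` and a list of per-item-name flags (whether an item value node exists below, whether one exists to the right), and the names `is_consistent` and `match_tables` are hypothetical.

```python
def is_consistent(kind, item_names):
    """item_names: list of (has_value_below, has_value_right), one pair per item name node."""
    for has_below, has_right in item_names:
        if kind == "orthogonal" and not has_below and not has_right:
            return False  # item name with no value below and none to the right
        if kind == "vertical" and not has_below:
            return False  # item name with no value below
        if kind == "horizontal" and not has_right:
            return False  # item name with no value to the right
    return True


def match_tables(tables):
    """Keep only the tables that do not meet the inconsistency condition for their kind."""
    return [t for t in tables if is_consistent(t["kind"], t["item_names"])]
```

The tables returned by `match_tables` correspond to the tables that would go on to be synthesized.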

Of the nodes contained in the tables and enumerations to be synthesized, the inter-item-name synthesis unit 216 includes the following in the first tree structure data: the starting node of each orthogonal table; for each vertical table, the starting node and the topmost nodes of the node group connected to the right side of the starting node; for each horizontal table, the starting node and the leftmost nodes of the node group connected to the lower side of the starting node; and the item name nodes of each enumeration. This prevents logical relationships from being included redundantly in the tree structure data that is finally output.

The orthogonal table acquisition unit 206 acquires each combination of an item name and the item name or item value to its right as a horizontal logical relationship, and each combination of an item name and the item name or item value below it as a vertical logical relationship. The horizontal table acquisition unit 205 acquires each combination of an item name and the item name or item value to its right as a horizontal logical relationship. The vertical table acquisition unit 204 acquires each combination of an item name and the item name or item value below it as a vertical logical relationship. By referring to the node below in the vertical case and the node to the right in the horizontal case, the logical relationships can be acquired accurately.

When a plurality of nodes are adjacent to the left side or the upper side of one node, the first deletion unit 208 may delete the edges representing the adjacency between that node and the plurality of nodes. For each item name node that has an item value node (a node of a region representing an item value) adjacent to its left side or upper side, the second deletion unit 209 may delete the edge representing the adjacency between that item name node and the item value node. In forms, the positional relationship between item names, and between an item name and its item value, generally runs from left to right or from top to bottom. Setting the direction of the adjacency relationships to be deleted to the left side and the upper side therefore makes it possible to handle a wide range of forms.

The inclusion relationship acquisition unit 214 determines that a first item name node includes a first node group adjacent to its right side when all of the following hold: at least one node in the first node group is an item name node, the height of every node in the first node group is less than or equal to the height of the first item name node, and a vertex of the upper-left node of the first node group coincides with a vertex of the first item name node. Likewise, the inclusion relationship acquisition unit 214 determines that a second item name node includes a second node group adjacent to its lower side when all of the following hold: at least one node in the second node group is an item name node, the width of every node in the second node group is less than or equal to the width of the second item name node, and a vertex of the upper-left node of the second node group coincides with a vertex of the second item name node. By using the adjacency, height, and width of nodes in this way, inclusion relationships can be recognized accurately.
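The two inclusion tests can be sketched with simple rectangle geometry. The `Cell` class and the functions below are illustrative assumptions, as is the coordinate convention (origin at the top left, y increasing downward). The sketch also assumes one specific reading of the vertex condition: the upper-left corner of the group's upper-left node must coincide with the upper-right corner of the including node (rightward case) or with its lower-left corner (downward case).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    x: float            # left edge
    y: float            # top edge (y grows downward)
    w: float            # width
    h: float            # height
    is_item_name: bool


def upper_left_cell(cells):
    """Return the topmost cell, breaking ties by the leftmost position."""
    return min(cells, key=lambda c: (c.y, c.x))


def includes_right(parent, group):
    """True if parent includes the node group adjacent to its right side."""
    if not group:
        return False
    anchor = upper_left_cell(group)
    return (any(c.is_item_name for c in group)
            and all(c.h <= parent.h for c in group)
            and (anchor.x, anchor.y) == (parent.x + parent.w, parent.y))


def includes_below(parent, group):
    """True if parent includes the node group adjacent to its lower side."""
    if not group:
        return False
    anchor = upper_left_cell(group)
    return (any(c.is_item_name for c in group)
            and all(c.w <= parent.w for c in group)
            and (anchor.x, anchor.y) == (parent.x, parent.y + parent.h))
```

Exact equality on coordinates is adequate for cell positions taken from a spreadsheet-like grid; coordinates measured from scanned images would likely need a tolerance.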

[Other embodiments]
The target of logical relationship recognition may also be a Web screen or a system GUI, provided it can be reshaped into a form format. For example, by acquiring the item names and item values from a Web screen such as the one shown in FIG. 44, on which airline tickets are reserved over the Web, and reshaping them into a form format, that Web screen can be made a target of the logical relationship recognition processing. FIG. 44 is a diagram for explaining another embodiment.

[System configuration, etc.]
The components of each illustrated apparatus are functional and conceptual, and do not necessarily need to be physically configured as illustrated. That is, the specific form in which the apparatuses are distributed or integrated is not limited to the illustrated one; all or part of them can be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like. Furthermore, all or any part of the processing functions performed in each apparatus can be realized by a CPU and a program that is analyzed and executed by the CPU, or realized as hardware based on wired logic.

Of the processes described in the present embodiment, all or part of the processes described as being performed automatically can also be performed manually, and all or part of the processes described as being performed manually can also be performed automatically by known methods. In addition, the processing procedures, control procedures, specific names, and information including various data and parameters shown in the above description and in the drawings can be changed arbitrarily unless otherwise specified.

[Program]
In one embodiment, the logical relationship recognition apparatus 10 can be implemented by installing, on a desired computer, a logical relationship recognition program that performs the above logical relationship recognition as packaged software or online software. For example, by causing an information processing apparatus to execute the logical relationship recognition program, the information processing apparatus can be made to function as the logical relationship recognition apparatus 10. The information processing apparatus referred to here includes desktop and notebook personal computers. It also covers mobile communication terminals such as smartphones, mobile phones, and PHS (Personal Handyphone System) terminals, as well as slate terminals such as PDAs (Personal Digital Assistants).

The logical relationship recognition apparatus 10 can also be implemented as a logical relationship recognition server apparatus that treats a terminal device used by a user as a client and provides the client with services related to the above logical relationship recognition. For example, the logical relationship recognition server apparatus is implemented as a server apparatus that provides a logical relationship recognition service that takes a form as input and outputs tree structure data. In this case, the logical relationship recognition server apparatus may be implemented as a Web server, or as a cloud that provides the services related to the above logical relationship recognition through outsourcing.

FIG. 45 is a diagram illustrating an example of a computer that realizes the logical relationship recognition apparatus by executing a program. The computer 1000 includes, for example, a memory 1010 and a CPU 1020. The computer 1000 also includes a hard disk drive interface 1030, a disk drive interface 1040, a serial port interface 1050, a video adapter 1060, and a network interface 1070. These components are connected by a bus 1080.

The memory 1010 includes a ROM (Read Only Memory) 1011 and a RAM 1012. The ROM 1011 stores, for example, a boot program such as a BIOS (Basic Input Output System). The hard disk drive interface 1030 is connected to a hard disk drive 1090. The disk drive interface 1040 is connected to a disk drive 1100. A removable storage medium such as a magnetic disk or an optical disk is inserted into the disk drive 1100. The serial port interface 1050 is connected to, for example, a mouse 1110 and a keyboard 1120. The video adapter 1060 is connected to, for example, a display 1130.

The hard disk drive 1090 stores, for example, an OS 1091, an application program 1092, a program module 1093, and program data 1094. That is, the program that defines each process of the logical relationship recognition apparatus 10 is implemented as the program module 1093, in which computer-executable code is written. The program module 1093 is stored in, for example, the hard disk drive 1090. For example, the program module 1093 for executing processing equivalent to the functional configuration of the logical relationship recognition apparatus 10 is stored in the hard disk drive 1090. The hard disk drive 1090 may be replaced by an SSD.

The setting data used in the processing of the above-described embodiment is stored as the program data 1094 in, for example, the memory 1010 or the hard disk drive 1090. The CPU 1020 reads the program module 1093 and the program data 1094 stored in the memory 1010 or the hard disk drive 1090 into the RAM 1012 as necessary and executes them.

The program module 1093 and the program data 1094 are not limited to being stored in the hard disk drive 1090; they may be stored in, for example, a removable storage medium and read by the CPU 1020 via the disk drive 1100 or the like. Alternatively, the program module 1093 and the program data 1094 may be stored in another computer connected via a network (a LAN, a WAN (Wide Area Network), or the like). In that case, the program module 1093 and the program data 1094 may be read by the CPU 1020 from the other computer via the network interface 1070.

DESCRIPTION OF SYMBOLS
10 Logical relationship recognition apparatus
20 Control unit
30 Storage unit
201 Extraction unit
202 Analysis unit
203 Table classification unit
204 Vertical table acquisition unit
205 Horizontal table acquisition unit
206 Orthogonal table acquisition unit
207 Table matching unit
208 First deletion unit
209 Second deletion unit
210 Enumeration classification unit
211 Vertical enumeration acquisition unit
212 Horizontal enumeration acquisition unit
213 Enumeration matching unit
214 Inclusion relation acquisition unit
215 Inclusion graph generation unit
216 Inter-item-name synthesis unit
217 Table synthesis unit
218 Enumeration synthesis unit
219 Addition unit
301 Table list
302 Enumeration list
303 Inclusion graph

Claims

A logical relationship recognition apparatus comprising:
an extraction unit that, based on a graph representing information on areas each representing an item name or an item value of a form as nodes and adjacency relationships between the nodes as edges, extracts a node satisfying a preset condition as an item name node, which is a node of an area representing an item name;
a table classification unit that, based on the edges, extracts from the item name nodes a starting node, which is a node serving as the starting point of a table, and classifies the table starting from the starting node as one of an orthogonal table having item names at both the upper end and the left end, a vertical table having item names at the upper end, and a horizontal table having item names at the left end;
an orthogonal table acquisition unit that acquires a vertical logical relationship between the item name nodes in the orthogonal table, a horizontal logical relationship between the item name nodes in the orthogonal table, a vertical logical relationship between the item name node and an item value node, which is a node other than the item name nodes, in the orthogonal table, and a horizontal logical relationship between the item name node and the item value node in the orthogonal table;
a vertical table acquisition unit that acquires a vertical logical relationship between the item name nodes in the vertical table and a vertical logical relationship between the item name node and the item value node in the vertical table;
a horizontal table acquisition unit that acquires a horizontal logical relationship between the item name nodes in the horizontal table and a horizontal logical relationship between the item name node and the item value node in the horizontal table;
a first specifying unit that specifies tables to be synthesized by excluding, from the orthogonal table, the vertical table, and the horizontal table, any table satisfying a predetermined condition indicating that the table is inconsistent;
a first deletion unit that, when a plurality of nodes are adjacent to one node in a predetermined direction, deletes the edges representing the adjacency relationships between the one node and the plurality of nodes;
a second deletion unit that deletes the edge representing the adjacency relationship between an item name node, among the item name nodes, that has an item value node, which is a node of an area representing an item value, adjacent to it in a predetermined direction, and that item value node;
a first acquisition unit that acquires a logical relationship between the item name node and the item value node based on the graph from which edges have been deleted by the first deletion unit and the second deletion unit;
a second acquisition unit that acquires inclusion relationships between the item name nodes based on the graph from which edges have been deleted by the first deletion unit;
a second specifying unit that specifies logical relationships relating to enumerations to be synthesized by excluding, from the logical relationships acquired by the first acquisition unit, the logical relationships relating to nodes included in the tables to be synthesized;
a first synthesis unit that creates first tree-structure data based on those inclusion relationships, among the inclusion relationships acquired by the second acquisition unit, that are included in neither the logical relationships relating to the tables to be synthesized nor the logical relationships relating to the enumerations to be synthesized;
a second synthesis unit that creates tree-structure data based on the logical relationships relating to the tables to be synthesized and creates second tree-structure data by synthesizing that tree-structure data with the first tree-structure data; and
a third synthesis unit that creates tree-structure data based on the logical relationships relating to the enumerations to be synthesized and creates third tree-structure data by synthesizing that tree-structure data with the second tree-structure data.
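
As a purely illustrative sketch, not part of the claimed subject matter, the three synthesis stages recited above can be pictured as successive merges of parent-child pairs into a single tree. The dictionary representation, the helper names `build` and `combine`, and the sample item names and values are all assumptions introduced for this illustration; in particular, using labels as keys assumes that no two nodes share a label.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical representation: parent label -> ordered list of child labels.
Tree = Dict[str, List[str]]


def build(relations: Iterable[Tuple[str, str]]) -> Tree:
    """Create tree-structure data from (parent, child) logical relationships."""
    tree: Tree = {}
    for parent, child in relations:
        children = tree.setdefault(parent, [])
        if child not in children:
            children.append(child)
        tree.setdefault(child, [])  # every node appears as a key
    return tree


def combine(base: Tree, other: Tree) -> Tree:
    """Synthesize two trees by adding the parent-child links of `other` to a copy of `base`."""
    merged: Tree = {parent: list(children) for parent, children in base.items()}
    for parent, children in other.items():
        target = merged.setdefault(parent, [])
        for child in children:
            if child not in target:
                target.append(child)
            merged.setdefault(child, [])
    return merged


# Hypothetical inputs for the three stages.
inclusion_only = [("Applicant", "Contact")]                # inclusion relationships left over
table_relations = [("Contact", "Phone"), ("Phone", "03-0000-0000")]
enumeration_relations = [("Applicant", "Name"), ("Name", "Taro")]

first_tree = build(inclusion_only)
second_tree = combine(first_tree, build(table_relations))
third_tree = combine(second_tree, build(enumeration_relations))
```

In this sketch each stage leaves the earlier tree untouched and returns a new one, so the first, second, and third tree-structure data remain available separately.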

The logical relationship recognition apparatus according to claim 1, wherein
the table classification unit classifies the table starting from the starting node as an orthogonal table when both a first condition and a second condition are satisfied, the first condition being that all nodes of a first node group connected to the right side of the starting node are item names and the height of the first node group is the same as the height of the starting node, and the second condition being that all nodes of a second node group connected to the lower side of the starting node are item names and the width of the second node group is the same as the width of the starting node; classifies the table starting from the starting node as a horizontal table when the first condition is satisfied and the second condition is not satisfied; and classifies the table starting from the starting node as a vertical table when the second condition is satisfied and the first condition is not satisfied, and
the first specifying unit excludes, from the orthogonal table, the vertical table, and the horizontal table, an orthogonal table having an item name node for which no item value node exists on either the lower side or the right side, a vertical table having an item name node for which no item value node exists on the lower side, and a horizontal table having an item name node for which no item value node exists on the right side.

The logical relationship recognition apparatus according to claim 1, wherein the first synthesis unit includes, in the data having the tree structure, the following nodes among the nodes included in the tables to be synthesized and the enumerations to be synthesized: the starting node of the orthogonal table; the starting node of the vertical table and the uppermost node group among the node groups connected to the right side of that starting node; the starting node of the horizontal table and the leftmost node group among the node groups connected to the lower side of that starting node; and the item name nodes of the enumerations.

The logical relationship recognition apparatus according to any one of claims 1 to 3, wherein
the orthogonal table acquisition unit acquires, as the horizontal logical relationship, a combination of an item name and an item name or item value on the right side of that item name, and acquires, as the vertical logical relationship, a combination of an item name and an item name or item value on the lower side of that item name,
the vertical table acquisition unit acquires, as the vertical logical relationship, a combination of an item name and an item name or item value on the lower side of that item name, and
the horizontal table acquisition unit acquires, as the horizontal logical relationship, a combination of an item name and an item name or item value on the right side of that item name.
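
The acquisition of directional logical relationships described in this claim can be sketched as follows. This is only an illustration under stated assumptions: the adjacency maps `right` and `below`, the function name `directional_relations`, and the sample labels are hypothetical, and "on the right side" or "on the lower side" is taken here to mean directly adjacent; covering a whole row or column would need an additional traversal.

```python
from typing import Dict, List, Set, Tuple


def directional_relations(
    adjacency: Dict[str, List[str]], item_names: Set[str]
) -> List[Tuple[str, str]]:
    """Pair each item name with every node directly adjacent to it in one direction.

    `adjacency` maps a node to the nodes adjacent on one fixed side of it
    (all "right" neighbours, or all "lower" neighbours).
    """
    return [
        (node, neighbour)
        for node, neighbours in adjacency.items()
        if node in item_names
        for neighbour in neighbours
    ]


# Hypothetical orthogonal table: "Item" and "Price" on the upper end, "Apple" on the left end.
item_names = {"Item", "Price", "Apple"}
right = {"Item": ["Price"], "Apple": ["100"]}
below = {"Item": ["Apple"], "Price": ["100"]}

horizontal = directional_relations(right, item_names)  # used for orthogonal and horizontal tables
vertical = directional_relations(below, item_names)    # used for orthogonal and vertical tables
```

The same function serves all three acquisition units in this sketch: an orthogonal table uses both maps, a vertical table only `below`, and a horizontal table only `right`.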

The logical relationship recognition apparatus according to claim 1, wherein
the first deletion unit, when a plurality of nodes are adjacent to the left side or the upper side of one node, deletes the edges representing the adjacency relationships between the one node and the plurality of nodes, and
the second deletion unit deletes the edge representing the adjacency relationship between an item name node, among the item name nodes, that has an item value node, which is a node of an area representing an item value, adjacent on its left side or upper side, and that item value node.
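
The two deletion rules of this claim can be illustrated with a minimal sketch. The edge encoding (a triple naming the node, its neighbor, and the side of the node on which the neighbor lies), the function names, and the sample labels are assumptions made for this illustration; any node that is not an item name node is treated as an item value node, following claim 1.

```python
from collections import defaultdict
from typing import Dict, Set, Tuple

# Hypothetical encoding: (node, neighbour, side of `node` on which `neighbour` lies).
Edge = Tuple[str, str, str]
SIDES = ("left", "up")


def first_deletion(edges: Set[Edge]) -> Set[Edge]:
    """Drop every left/up edge of a node that has several neighbours on that side."""
    neighbours: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for node, neighbour, side in edges:
        if side in SIDES:
            neighbours[(node, side)].add(neighbour)
    return {
        (node, neighbour, side)
        for node, neighbour, side in edges
        if not (side in SIDES and len(neighbours[(node, side)]) > 1)
    }


def second_deletion(edges: Set[Edge], item_names: Set[str]) -> Set[Edge]:
    """Drop edges from an item name node to an item value node on its left or upper side."""
    return {
        (node, neighbour, side)
        for node, neighbour, side in edges
        if not (side in SIDES and node in item_names and neighbour not in item_names)
    }


# Hypothetical graph: "C" has two upper neighbours; "Name" has a value on its left.
edges = {
    ("C", "A", "up"),
    ("C", "B", "up"),
    ("D", "C", "left"),
    ("Name", "v1", "left"),
    ("v2", "Name", "left"),
}
after_first = first_deletion(edges)
after_both = second_deletion(after_first, item_names={"Name"})
```

Both functions return new edge sets, so the graph after the first deletion alone (used for inclusion relationships in claim 1) and the graph after both deletions (used for item name to item value relationships) are available side by side.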

The logical relationship recognition apparatus according to claim 5, wherein the second acquisition unit determines that a first item name node includes a first node group when at least one node of the first node group adjacent to the right side of the first item name node is an item name node, the heights of all nodes included in the first node group are equal to or less than the height of the first item name node, and a vertex of the upper-left node of the first node group overlaps a vertex of the first item name node; and determines that a second item name node includes a second node group when at least one node of the second node group adjacent to the lower side of the second item name node is an item name node, the widths of all nodes included in the second node group are equal to or less than the width of the second item name node, and a vertex of the upper-left node of the second node group overlaps a vertex of the first item name node.
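
The right-side inclusion test of this claim can be sketched geometrically. Everything below is an assumption made for illustration: areas are axis-aligned rectangles with integer coordinates and the y axis pointing downward, the `Area` class and the function `includes_right` are hypothetical, and "a vertex overlaps" is read as the top-left corner of the group's upper-left node coinciding with the top-right corner of the item name node. The lower-side test would follow the same pattern with widths in place of heights.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Area:
    name: str
    x: int  # left edge
    y: int  # top edge (y grows downward)
    w: int  # width
    h: int  # height
    is_item_name: bool


def includes_right(parent: Area, nodes: List[Area]) -> bool:
    """Return True if `parent` includes the node group adjacent to its right side."""
    # Nodes whose left edge touches the parent's right edge and that overlap it vertically.
    group = [
        n for n in nodes
        if n.x == parent.x + parent.w
        and n.y < parent.y + parent.h
        and n.y + n.h > parent.y
    ]
    if not group:
        return False
    upper_left = min(group, key=lambda n: (n.y, n.x))
    return (
        any(n.is_item_name for n in group)            # at least one item name node
        and all(n.h <= parent.h for n in group)       # no node taller than the parent
        and (upper_left.x, upper_left.y) == (parent.x + parent.w, parent.y)
    )


# Hypothetical form fragment: "Address" spans two rows, with two item names to its right.
address = Area("Address", 0, 0, 2, 2, True)
postal = Area("Postal code", 2, 0, 2, 1, True)
street = Area("Street", 2, 1, 2, 1, True)
too_tall = Area("Note", 2, 0, 2, 3, True)

nested = includes_right(address, [address, postal, street])
not_nested = includes_right(address, [address, too_tall])
```

Here `nested` holds because both right-hand areas fit within the height of "Address" and the upper one starts at its top-right corner, whereas `not_nested` fails the height condition.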

A logical relationship recognition method executed by a logical relationship recognition apparatus, the method comprising:
an extraction step of extracting, based on a graph representing information on areas each representing an item name or an item value of a form as nodes and adjacency relationships between the nodes as edges, a node satisfying a preset condition as an item name node, which is a node of an area representing an item name;
a table classification step of extracting, based on the edges, from the item name nodes a starting node, which is a node serving as the starting point of a table, and classifying the table starting from the starting node as one of an orthogonal table having item names at both the upper end and the left end, a vertical table having item names at the upper end, and a horizontal table having item names at the left end;
an orthogonal table acquisition step of acquiring a vertical logical relationship between the item name nodes in the orthogonal table, a horizontal logical relationship between the item name nodes in the orthogonal table, a vertical logical relationship between the item name node and an item value node, which is a node other than the item name nodes, in the orthogonal table, and a horizontal logical relationship between the item name node and the item value node in the orthogonal table;
a vertical table acquisition step of acquiring a vertical logical relationship between the item name nodes in the vertical table and a vertical logical relationship between the item name node and the item value node in the vertical table;
a horizontal table acquisition step of acquiring a horizontal logical relationship between the item name nodes in the horizontal table and a horizontal logical relationship between the item name node and the item value node in the horizontal table;
a first specifying step of specifying tables to be synthesized by excluding, from the orthogonal table, the vertical table, and the horizontal table, any table satisfying a predetermined condition indicating that the table is inconsistent;
a first deletion step of deleting, when a plurality of nodes are adjacent to one node in a predetermined direction, the edges representing the adjacency relationships between the one node and the plurality of nodes;
a second deletion step of deleting the edge representing the adjacency relationship between an item name node, among the item name nodes, that has an item value node, which is a node of an area representing an item value, adjacent to it in a predetermined direction, and that item value node;
a first acquisition step of acquiring a logical relationship between the item name node and the item value node based on the graph from which edges have been deleted in the first deletion step and the second deletion step;
a second acquisition step of acquiring inclusion relationships between the item name nodes based on the graph from which edges have been deleted in the first deletion step;
a second specifying step of specifying logical relationships relating to enumerations to be synthesized by excluding, from the logical relationships acquired in the first acquisition step, the logical relationships relating to nodes included in the tables to be synthesized;
a first synthesis step of creating first tree-structure data based on those inclusion relationships, among the inclusion relationships acquired in the second acquisition step, that are included in neither the logical relationships relating to the tables to be synthesized nor the logical relationships relating to the enumerations to be synthesized;
a second synthesis step of creating tree-structure data based on the logical relationships relating to the tables to be synthesized and creating second tree-structure data by synthesizing that tree-structure data with the first tree-structure data; and
a third synthesis step of creating tree-structure data based on the logical relationships relating to the enumerations to be synthesized and creating third tree-structure data by synthesizing that tree-structure data with the second tree-structure data.

A logical relationship recognition program that causes a computer to execute:
an extraction step of extracting, based on a graph representing information on areas each representing an item name or an item value of a form as nodes and adjacency relationships between the nodes as edges, a node satisfying a preset condition as an item name node, which is a node of an area representing an item name;
a table classification step of extracting, based on the edges, from the item name nodes a starting node, which is a node serving as the starting point of a table, and classifying the table starting from the starting node as one of an orthogonal table having item names at both the upper end and the left end, a vertical table having item names at the upper end, and a horizontal table having item names at the left end;
an orthogonal table acquisition step of acquiring a vertical logical relationship between the item name nodes in the orthogonal table, a horizontal logical relationship between the item name nodes in the orthogonal table, a vertical logical relationship between the item name node and an item value node, which is a node other than the item name nodes, in the orthogonal table, and a horizontal logical relationship between the item name node and the item value node in the orthogonal table;
a vertical table acquisition step of acquiring a vertical logical relationship between the item name nodes in the vertical table and a vertical logical relationship between the item name node and the item value node in the vertical table;
a horizontal table acquisition step of acquiring a horizontal logical relationship between the item name nodes in the horizontal table and a horizontal logical relationship between the item name node and the item value node in the horizontal table;
a first specifying step of specifying tables to be synthesized by excluding, from the orthogonal table, the vertical table, and the horizontal table, any table satisfying a predetermined condition indicating that the table is inconsistent;
a first deletion step of deleting, when a plurality of nodes are adjacent to one node in a predetermined direction, the edges representing the adjacency relationships between the one node and the plurality of nodes;
a second deletion step of deleting the edge representing the adjacency relationship between an item name node, among the item name nodes, that has an item value node, which is a node of an area representing an item value, adjacent to it in a predetermined direction, and that item value node;
a first acquisition step of acquiring a logical relationship between the item name node and the item value node based on the graph from which edges have been deleted in the first deletion step and the second deletion step;
a second acquisition step of acquiring inclusion relationships between the item name nodes based on the graph from which edges have been deleted in the first deletion step;
a second specifying step of specifying logical relationships relating to enumerations to be synthesized by excluding, from the logical relationships acquired in the first acquisition step, the logical relationships relating to nodes included in the tables to be synthesized;
a first synthesis step of creating first tree-structure data based on those inclusion relationships, among the inclusion relationships acquired in the second acquisition step, that are included in neither the logical relationships relating to the tables to be synthesized nor the logical relationships relating to the enumerations to be synthesized;
a second synthesis step of creating tree-structure data based on the logical relationships relating to the tables to be synthesized and creating second tree-structure data by synthesizing that tree-structure data with the first tree-structure data; and
a third synthesis step of creating tree-structure data based on the logical relationships relating to the enumerations to be synthesized and creating third tree-structure data by synthesizing that tree-structure data with the second tree-structure data.