JPH09128400A - Structured document preparing device

Info

Publication number: JPH09128400A
Application number: JP7280630A
Authority: JP
Inventors: Makoto Imamura; 誠今村; Osamu Moriguchi; 修森口
Original assignee: Mitsubishi Electric Corp
Current assignee: Mitsubishi Electric Corp
Priority date: 1995-10-27
Filing date: 1995-10-27
Publication date: 1997-05-16

Abstract

PROBLEM TO BE SOLVED: To improve the reusability of document data and the feasibility of automatic database registration by applying the processing attributes associated with document tags on the basis of a designated document type and generating a structured document. SOLUTION: The structured document preparing device is provided with a document image data input means 1, a document image area / logical structure correspondence definition storage part 2, a document image area segmenting means 3, a pattern recognizing means 4, a structured document preparing means 5, a structured document output means 6, a control means 7, and a document type definition storage part 35. The document image area segmenting means 3 cuts out a prescribed part of the pattern-recognized input using the information on document image areas held in the correspondence definition storage part 2. When a prescribed output document is designated, the structured document preparing means 5 refers to the definition name and logical structure corresponding to that designation in the document type definition storage part 35, processes the areas cut out by the document image area segmenting means 3 according to the processing attributes in the correspondence definition storage part 2, and prepares the structured document in accordance with the logical structure.

Description

Detailed Description of the Invention

【０００１】[0001]

[Field of the Invention] The present invention relates to a structured document generation device for document management and processing tasks such as document management databases, document creation/approval, document storage/search/browsing, data aggregation, document conversion, and publication printing. The device extracts the logical structure of a document from document image information and, while referring to document type definition information, converts the document information into a format suitable for computer processing so that it can be used.

【０００２】[0002]

[Description of the Related Art] FIG. 59 shows the configuration of a conventional structured document generation device disclosed in, for example, Japanese Unexamined Patent Publication No. H6-214983. To make comparison easier, some of the device names have been changed. The object of that invention is to convert a plurality of document images represented as black-and-white binary images into a document having a logical structure divided into chapters, sections, paragraphs, figures, and the like. In the figure, 41 is a rectangle / temporary block generation unit, which extracts mutually adjacent regions from the document image and analyzes the rough layout. 42 is a character recognition unit, which recognizes characters. 43 is a header/footer analysis unit, which extracts the header and footer areas of the document image and identifies the page number and common content. 44 is a column determination / block construction unit, which determines the column areas of each page and constructs the blocks within the page. 45 is a section number / display attribute analysis unit, which analyzes the display attributes within each section and divides the document into paragraphs. 46 is a logical structure generation unit, which examines the paragraph at the end of a column or page and the paragraph at the beginning of the following column or page, merges them if their display attributes are nearly equal, and generates the logical structure of the document.

【０００３】[0003]

Next, the operation is described. In the figure, the rectangle / temporary block generation unit 41 extracts basic rectangles, called rectangles, from the document image and sends them to the character recognition unit 42, the header/footer analysis unit 43, and the column determination / block determination unit 44. The character recognition unit 42 performs character recognition on the rectangle data and sends the result back to the rectangle / temporary block generation unit 41. The header/footer analysis unit 43 analyzes the headers and footers in the rectangle data and outputs the result. The column determination / block determination unit 44 constructs the header/footer blocks from the rectangle data and sends the constructed block data to the section number / display attribute analysis unit 45. The section number / display attribute analysis unit 45 checks the rectangle attributes of the block data (position, size, offset value, and so on) and sends the resulting final block data to the logical structure generation unit 46. The logical structure generation unit 46 generates, from the final block data, a logically structured document organized into sections, paragraphs, and the like.

【０００４】[0004]

[Problems to Be Solved by the Invention] The prior art described above has the following problems (1) to (3).

(1) The format of the output structured document cannot be chosen freely. In document management work, structured documents with different structural information must be output for the same input document image data. A structured document is generated by applying rules that state in what order each document area is handled and how it is processed. The prior art, however, uses only information about the display data of the document image when generating the logical structure, so rules cannot be applied to produce output in whatever structured document format is required.

(2) Information about fixed formats cannot be used effectively. Many paper documents with a fixed format determine in advance what is written where. The prior art does not manage any information about the fixed format other than the image itself, so this input information cannot be freely designated, processed, and put to effective use.

(3) The information to be added cannot be designated for each document area. To choose the output format of the structured document freely, there must be a means of designating which document-structure information is to be added to which document area. The prior art provides no means of managing the information to be added, so it is not possible to designate what information should be added to each document area.

【０００５】[0005]

The present invention has been made to solve the above problems. Its object is to obtain a device that has a user interface with which rules associating document-structure information with each partial area of a document image can easily be created and stored, and that, by referring to these association rules, generates and uses, from input document image data, a structured document conforming to a document type definition set according to the application.

【０００６】[0006]

[Means for Solving the Problems] The structured document generation device according to the present invention comprises the following means. Document image area / logical structure correspondence definition storage means (hereinafter, correspondence definition storage means) stores, as a definition, a document image area (a predetermined part of the input document image data) together with a corresponding document tag indicating that the area is a specific logical-structure part of the document and with the processing attributes required for the output document. Document image area cutout means cuts out a predetermined part of the pattern-recognized input using the information on document image areas in the correspondence definition storage means. Document type definition storage means stores, as named definitions, one per document, logical structures describing the set of document image areas required for each kind of output document. Structured document generation means, when a prescribed output document is designated, refers to the definition name and logical structure corresponding to that designation in the document type definition storage means, processes the areas cut out by the document image area cutout means according to the processing attributes in the correspondence definition storage means, and generates the structured document in accordance with that logical structure.

【０００７】[0007]

In addition to the basic configuration, document image area estimation means is added. This means cuts out a predetermined part of the input and compares it with the corresponding document image area in the correspondence definition storage means. When it judges the two to be similar with at least a certain evaluation value, it either replaces the document image area in the correspondence definition storage means or treats the part as equivalent to one of the stored document image areas.
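The text does not say how the evaluation value is computed. As a minimal sketch only, the following assumes axis-aligned rectangular areas and uses intersection-over-union as a stand-in evaluation value. The names `iou` and `estimate_area` and the 0.7 threshold are hypothetical and do not come from the patent.

```python
from typing import Dict, Optional, Tuple

# (x1, y1, x2, y2); assumed to be top-left and bottom-right corners.
Rect = Tuple[int, int, int, int]


def iou(a: Rect, b: Rect) -> float:
    """Intersection-over-union of two rectangles (assumed evaluation value)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0


def estimate_area(found: Rect, stored: Dict[int, Rect],
                  threshold: float = 0.7) -> Optional[int]:
    """Return the ID of the best-matching stored area, or None if no
    stored area reaches the threshold."""
    best_id, best_score = None, 0.0
    for area_id, rect in stored.items():
        score = iou(found, rect)
        if score > best_score:
            best_id, best_score = area_id, score
    return best_id if best_score >= threshold else None
```

A caller could then either overwrite the stored rectangle for the returned ID with the newly found one or simply treat the found area as that stored area, which are the two outcomes the paragraph describes.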

【０００８】[0008]

Furthermore, a separate detailed processing table can be designated as a processing attribute, and the output document is generated by processing a partial area of the input document image data on the basis of that detailed processing table.

【０００９】[0009]

Furthermore, a predetermined image portion of the input is cut out and saved as an image file, which is combined with a document tag and output.

【００１０】[0010]

In addition to the above, a predetermined part of the input is converted into a different character string, either by referring to a table prepared as a separate detailed processing table or by applying an algorithm, and the output document is generated from the result.
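The table-or-algorithm conversion described above could look roughly like the following sketch. The function name `convert_string`, its parameters, and the choice to consult the table before the algorithm are assumptions made for illustration; the patent only states that either a table or an algorithm is used.

```python
from typing import Callable, Dict, Optional


def convert_string(text: str,
                   table: Optional[Dict[str, str]] = None,
                   algorithm: Optional[Callable[[str], str]] = None) -> str:
    """Convert a recognized string via a lookup table, else via an algorithm."""
    if table is not None and text in table:
        return table[text]          # table lookup (assumed to take priority)
    if algorithm is not None:
        return algorithm(text)      # algorithmic conversion
    return text                     # no rule applies: keep the string as is
```

For instance, a table could map a product name printed on a form to an internal code, while an algorithm could normalize the way a date is written.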

【００１１】[0011]

Furthermore, document image area / logical structure association means is added. When a document tag indicating a part of the logical structure that makes up the document is designated for a document image area (a cut-out predetermined part of the input), this means links the document image area to the document tag and registers and stores the pair in the correspondence definition storage means.

【００１２】[0012]

In addition to the above, for each document tag, an attribute that causes predetermined processing to be performed for the output document is also added and registered and stored in the correspondence definition storage means.

【００１３】[0013]

Furthermore, display means for the input document image data is added, and when a document tag is designated, the document image area of the corresponding part of the document image data is displayed so that it can be distinguished.

【００１４】[0014]

Furthermore, document logical structure display/correction means is added. This means reads any document type definition stored in the document type definition storage means and displays it as a tree structure, and, when instructed to store a document type definition whose displayed tree structure has been modified, stores it in the document type definition storage means.

【００１５】[0015]

In addition to the above, the document logical structure display/correction means is added so that an arbitrary (possibly modified) document type definition is displayed as a tree structure, and the document image area / logical structure association means also performs the association, tag by tag, with the document type definition displayed as a tree structure.

【００１６】[0016]

In addition to the above, display means for the input document image data is added, and when a document tag is designated, the document image area of the corresponding part of the document image data is displayed so that it can be distinguished.

【００１７】[0017]

In addition to the above, the document image area / logical structure association means also performs the association, tag by tag, with the document type definition displayed as a tree structure, and if there is an independent tree structure with no correspondence, the document logical structure display/correction means displays that unmatched tree structure so that it can be distinguished.

【００１８】[0018]

BEST MODE FOR CARRYING OUT THE INVENTION

Embodiment 1. Embodiment 1 describes the configuration and operation for generating and outputting a structured document of a designated format using association rules stored in advance. Embodiment 1 of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram of the structured document generation device of Embodiment 1. In the figure, 1 is document image data input means, which consists of an optical image reader with a sheet feeder, a handwriting reader using a display-integrated tablet, or the like, and which converts the image data it reads into image data that a computer can process. For example, the image data shown in FIG. 5 (described later) is input and recognized, and the image is recognized or encoded. Here, a structured document is a document that carries information expressing its logical structure. Document tags expressing the role within the document of components such as the title, author name, preface, and body, which are distinguished by what they describe, are attached, so that relationships within the document are described in units of document tags. A document type definition is a description that specifies the order in which the components identified by document tags (document tag units) appear in a structured document.

【００１９】[0019]

In a structured document, the components of the document can be identified by document tags that express the role of each component within the document, and attributes for subsequent processing are attached. A computer can therefore automatically extract the information needed by applications such as document management databases, document creation/approval, document storage/search/browsing, data aggregation, document conversion, and publication printing. This makes it possible to further automate computer systems that use documents. In addition, setting the document type definition according to the cooperating application gives the system flexibility. Offices, meanwhile, hold many formatted paper documents, such as forms, that express the role of each document component by enclosing it in a rectangular or similar frame. If a structured document conforming to a document type definition set according to the application can be generated automatically from an existing formatted paper document, the data on existing paper documents can be taken into computer applications automatically, and office paper documents can be put to effective use.

【００２０】[0020]

2 is the document image area / logical structure correspondence definition storage unit; an example of its stored contents is shown in FIG. 6, described later. It is a main element of the present invention and stores the "document image area / logical structure correspondence definition", which describes the correspondence between a "partial area of the document image data" and a "markup tag expressing that the area is a specific part of the logical structure of the document", together with related information. 35 is the document type definition storage means; an example of the definitions it stores is shown in FIG. 7, described later. It stores the document type definitions that specify the logical structure of a document. 3 is the document image area cutout means, which is a main element alongside the correspondence definition storage unit 2. It applies the correspondence definition stored in the correspondence definition storage unit to the input document image data sent from the document image data input means 1, cuts out partial areas of the document image, and sends the cut-out document image data to the pattern recognition means 4. 4 is an ordinary pattern recognition means, which generates encoded data by performing pattern recognition on the document image data cut out by the document image area cutout means 3 and sends it to the structured document generation means 5. Pattern recognition technology is well known, so a detailed description of its configuration and operation is omitted.
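The cutting-out step performed by means 3 can be pictured as a crop driven by the stored area coordinates. The sketch below assumes the image is a list of pixel rows and that an area is a rectangle given by a start point (top-left) and an end point (bottom-right, exclusive); these conventions and the name `cut_out` are illustrative assumptions, not details stated in the patent.

```python
from typing import List, Tuple

# Rows of pixel values; 0 = white, 1 = black is an assumed encoding.
Image = List[List[int]]


def cut_out(image: Image, start: Tuple[int, int], end: Tuple[int, int]) -> Image:
    """Return the rectangle between start (x, y) and end (x, y), end exclusive."""
    (x1, y1), (x2, y2) = start, end
    return [row[x1:x2] for row in image[y1:y2]]
```

Each cropped sub-image would then be handed to the pattern recognition step together with the ID of the area it came from.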

【００２１】[0021]

5 is the structured document generation means, which is a main element alongside the correspondence definition storage unit 2 and the document image area cutout means 3. Using the correspondence definition stored in the correspondence definition storage unit 2, and referring to the document type definition stored in the document type definition storage means 35, it generates a structured document by attaching a tag to each piece of partial data in the data encoded by the pattern recognition means 4, and sends the result to the structured document output means 6. 6 is the structured document output means, which outputs the structured document generated by the structured document generation means 5 to a display device such as a monitor or to a storage device such as a disk. 7 is the control means, which consists of a CPU (central processing unit) and controls the operation of the entire system.

【００２２】[0022]

FIG. 5 shows an example of document image data: a document containing character strings, bulleted items, an image, and a table. 51 is the entire document image, 52 to 59 are areas enclosed by closed curves on the document screen data, and 60 is an area in which the date is written.

【００２３】[0023]

FIG. 6 shows an example of the document image area / logical structure correspondence definition. For the input data shown in FIG. 5, it describes the correspondence between a "partial area of the document image data" and a "markup tag expressing that the area is a specific part of the logical structure of the document", together with related information. Here, 61 is the partial area definition. 62 is the area ID; this correspondence definition defines five partial areas. 63 describes the position and shape of each partial area. 64 describes the sequence of document tags associated with each partial area. Another important point of the present invention is that, in the correspondence definition, attributes designating the processing that the structured document generation means 5 performs later are added for each ID. Items 64 to 68 in FIG. 6 are that part. 65 describes the "cutout mode" setting given to each partial area, which is referred to when the branch destination is decided in step 8 of FIG. 3. 66 describes the "character replacement guideline" given to each partial area, which is referred to in step 15 of FIG. 3 when the procedure for replacing character strings is decided. 67 describes the "text analysis guideline" given to each partial area, which is referred to in step 17 of FIG. 3 when the procedure for analyzing text is decided. 68 describes the "recognition attribute" setting given to each partial area, which is referred to in step 13 of FIG. 3 when the pattern recognition procedure matching the recognition attribute is selected.

【００２４】[0024]

69 is information about "repeating structure extraction 1", which is a setting value of 67. It describes a "rule defining a repeating structure that appears in the text" and "information about the document tags to be attached when that repeating structure is converted into a structured document". It is referred to by the text analysis procedure in step 17 of FIG. 3. 70 is information about "repeating structure cutout 1", which is a setting value of 65. It describes the "shape of the repeating structure", the "arrangement rule", and "information about the document tags to be attached when that repeating structure is converted into a structured document". 71 describes, for the repeating area ID introduced in 70, the "shape" and "information about the document tags to be attached on conversion into a structured document". 70 and 71 are referred to in step 10 of FIG. 3 by the procedure that recognizes repeating structures in the document image data. 72 describes the setting of the "document image area estimation mode", which is referred to when the branch destination is decided in step 5 of FIG. 2. 73 describes the document type definition name, which is referred to in step 19 of FIG. 4 when the document type definition to be called is determined. 74 describes the setting of the "external procedure call mode", which is referred to when the branch destination is decided in step 24 of FIG. 4. FIG. 11 shows another example of a correspondence definition.

【００２５】[0025]

FIG. 7 shows an example of a document type definition. It specifies the order of the document tags that appear in a document, and its meaning follows that of a document type definition in SGML (ISO 8879). For example, the first line indicates that, within the area designated by the document tag <catalog>, the tags <name>, <date>, and <body> appear in this order. FIG. 12 shows another example of a document type definition.
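The ordering rule in the <catalog> example can be sketched as a simple sequence check. This is a deliberately reduced model: real SGML content models also allow optional and repeated elements, which are ignored here. The `DOC_TYPE` mapping and the `conforms` function are hypothetical names used only for illustration.

```python
from typing import Dict, List

# Hypothetical, simplified document type definition: each tag maps to the
# child tags that must appear inside it, in exactly this order.
DOC_TYPE: Dict[str, List[str]] = {
    "catalog": ["name", "date", "body"],
}


def conforms(parent: str, children: List[str],
             doc_type: Dict[str, List[str]] = DOC_TYPE) -> bool:
    """True if the children of `parent` appear exactly in the defined order.

    A parent that is not defined at all never conforms.
    """
    return doc_type.get(parent) == children
```

With this model, a <catalog> whose children arrive as name, date, body is accepted, while the same children in any other order are rejected.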

【００２６】[0026]

FIG. 8 shows an example of the document tag / encoded data correspondence table, which describes the structural relationships among the data encoded by the pattern recognition means 4. 91 to 95 are triples of (document tag sequence, encoded data, order). They are created in step 18 of FIG. 3 and used in step 21 of FIG. 4 when the structured document is generated. FIG. 9 shows the logical connections of FIG. 8 as a tree structure so that a person can understand them easily; it expresses the same relationships as FIG. 8.
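The step from the table of triples to the tree view can be illustrated as follows. The sketch assumes a document tag sequence is a path from the root tag downward and that consecutive triples sharing a path prefix belong under the same node; the node layout, the name `build_tree`, and the sample values in the usage note are made up for illustration.

```python
from typing import Dict, List, Tuple

# (tag path, encoded data, order), mirroring the triples described above.
Triple = Tuple[Tuple[str, ...], str, int]


def build_tree(triples: List[Triple]) -> Dict:
    """Nest each triple's data under its tag path, visiting triples by order."""
    root: Dict = {"tag": None, "data": [], "children": []}
    for path, data, _order in sorted(triples, key=lambda t: t[2]):
        node = root
        for tag in path:
            # Reuse the most recent child with this tag, otherwise create it.
            if node["children"] and node["children"][-1]["tag"] == tag:
                node = node["children"][-1]
            else:
                child = {"tag": tag, "data": [], "children": []}
                node["children"].append(child)
                node = child
        node["data"].append(data)
    return root
```

For example, the two triples `(("catalog", "date"), "1995-10-27", 2)` and `(("catalog", "name"), "Widget", 1)` yield a single `catalog` node whose children are `name` followed by `date`.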

【００２７】[0027]

FIG. 10 shows an example of a structured document, which is created in step 23 of FIG. 4 and output in step 25 of FIG. 4. 101 to 109 are partial logical structures subordinate to the structured document as a whole. FIG. 13 shows another example of structured document output.
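As a rough picture of what such an output might look like, the following sketch renders a nested tag structure as SGML-style markup. The exact output syntax of FIG. 10 is not reproduced in this text, so the indentation, the end tags, and the sample values in the usage note are assumptions; `to_markup` is a hypothetical helper.

```python
from typing import List, Tuple, Union

# (tag, content); content is either text or a list of child nodes.
Node = Tuple[str, Union[str, List["Node"]]]


def to_markup(node: Node, indent: int = 0) -> str:
    """Render a node as a start tag, its content, and an end tag."""
    tag, content = node
    pad = "  " * indent
    if isinstance(content, str):
        return f"{pad}<{tag}>{content}</{tag}>"
    inner = "\n".join(to_markup(child, indent + 1) for child in content)
    return f"{pad}<{tag}>\n{inner}\n{pad}</{tag}>"
```

Calling `to_markup(("catalog", [("name", "Widget"), ("date", "1995-10-27")]))` produces a `<catalog>` element containing a `<name>` line and a `<date>` line, each indented by two spaces.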

【００２８】[0028]

The overall operation of the structured document generation device configured as described above is explained with reference to the flowcharts of FIGS. 2 to 4. Embodiment 1 is the case in which, in the flowcharts of FIGS. 2 to 4, the document image area / logical structure correspondence definition estimation mode, the document image area estimation mode, the character replacement mode, the text analysis mode, and the external procedure call mode are all off, and the cutout mode is the normal mode.

【００２９】[0029]

First, the document image data input means 1 reads the document image data (step 1). Next, the control means 7 checks the document image area / logical structure correspondence definition estimation mode and proceeds to step 3 if it is on and to step 4 if it is off (step 2). When the correspondence definition is the example shown in FIG. 4, the mode is off, as shown at 75 in FIG. 6, so the partial area definition of the designated correspondence definition is read (step 4). The case in which the correspondence definition estimation mode is on is described in Embodiment 8. The partial area definition is an important part of this embodiment. It is a part of the document image area / logical structure correspondence definition and, for each partial area of the document image identified by an area ID, designates attribute information such as the "start point", "end point", "document tag sequence", "cutout mode", "character replacement guideline", "text analysis guideline", and "recognition attribute". Because this definition (the processing rules) is stored and can be referred to, the structured document generation means 5 can output the designated structured document by way of an intermediate table, as described in detail below.

【００３０】部分領域定義は、図６に示す文書画像領域
・論理構造対応定義では、６１の部分にあたり、領域Ｉ
Ｄは６２の部分に示されるように１から５まであり、５
つの部分領域についての情報が記述されている。例え
ば、「領域ＩＤ」２によって確定される部分領域は、
「始点」が（３０，５０）、「終点」が（１６０，８
０）、「文書タグ列」が＜カタログ＞＜日時＞、「切り
出しモード」が通常、「文字置換指針」はなし、「テキ
スト解析指針」はなし、「認識属性」は活字／漢字であ
ることを示す。次に、制御手段７は、文書画像推定モー
ドをチェックし、オンの場合にはｓｔｅｐ６に、オフの
場合はｓｔｅｐ７に進む（ｓｔｅｐ５）。文書画像領域
・論理構造対応定義が図６に示す一例である場合には、
７２に示すようにオフであるから、ｓｔｅｐ７へ進む。
文書画像推定モードがオンである場合は、実施の形態２
に記載する。ｓｔｅｐ７では、部分領域定義から、領域
ＩＤによって確定される部分領域を順に取り出し、取り
出された場合はｓｔｅｐ８へ、すべて取り出し終えた場
合はｓｔｅｐ１９に進む。部分領域定義が、図６に示す
文書画像領域・論理構造対応定義の６１の場合では、領
域ＩＤ１から領域ＩＤ５に対応する部分領域を順に取り
出すことになる。The partial area definition corresponds to part 61 of the document image area / logical structure correspondence definition shown in FIG. 6. The area IDs run from 1 to 5, as shown in part 62, so information about five partial areas is described. For example, the partial area identified by "area ID" 2 has a "start point" of (30, 50), an "end point" of (160, 80), a "document tag string" of <catalog><date>, a "cutout mode" of normal, no "character replacement guideline", no "text analysis guideline", and a "recognition attribute" of print / kanji. Next, the control means 7 checks the document image estimation mode, and proceeds to step 6 if it is on and to step 7 if it is off (step 5). When the document image area / logical structure correspondence definition is the example shown in FIG. 6, the mode is off, as indicated by 72, so the process proceeds to step 7. The case where the document image estimation mode is on is described in the second embodiment. In step 7, the partial areas identified by area IDs are taken out of the partial area definition one at a time in order; if a partial area is obtained, the process proceeds to step 8, and if all of them have been taken out, it proceeds to step 19. When the partial area definition is part 61 of the document image area / logical structure correspondence definition shown in FIG. 6, the partial areas corresponding to area ID 1 through area ID 5 are taken out in order.
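As an illustration of the partial area definition described above, one entry could be modeled as a small record. This is only a sketch: the class name, the field names, and the English mode strings are assumptions made for this example, and only the values for "area ID" 2 come from the text.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PartialAreaDefinition:
    """One entry of the partial area definition (field names are hypothetical)."""
    area_id: int
    start: Tuple[int, int]                   # "start point" as (x, y)
    end: Tuple[int, int]                     # "end point" as (x, y)
    tag_path: Tuple[str, ...]                # "document tag string"
    cutout_mode: str = "normal"              # normal / repeated / nested / image
    char_replacement: Optional[str] = None   # "character replacement guideline"
    text_analysis: Optional[str] = None      # "text analysis guideline"
    recognition_attr: Optional[str] = None   # "recognition attribute"


# The entry for "area ID" 2, using the values stated in the text.
AREA_2 = PartialAreaDefinition(
    area_id=2,
    start=(30, 50),
    end=(160, 80),
    tag_path=("catalog", "date"),
    recognition_attr="print/kanji",
)

# Step 7 then amounts to iterating over such entries in area ID order.
for definition in sorted([AREA_2], key=lambda d: d.area_id):
    print(definition.area_id, definition.tag_path, definition.cutout_mode)
```

Keeping the guidelines optional mirrors the "none" values that the example entry has for character replacement and text analysis.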

【００３１】ｓｔｅｐ７において部分領域が取り出せた
場合には、制御手段７は切り出しモードをチェックし、
通常モードである場合はｓｔｅｐ９に、繰り返し構造認
識モードである場合はｓｔｅｐ１０に、入れ子構造認識
モードである場合はｓｔｅｐ１１に、イメージ切り出し
モードである場合はｓｔｅｐ１２に進む（ｓｔｅｐ
８）。例えば、「領域ＩＤ」２によって確定される部分
領域の場合に通常モードであるからｓｔｅｐ９に進む。
切り出しモードが繰り返し構造認識モードである場合は
実施の形態３に、切り出しモードが入れ子構造認識モー
ドである場合は実施の形態４に、切り出しモードがイメ
ージ切り出しモードである場合は実施の形態５に各々記
載する。ｓｔｅｐ９では、ｓｔｅｐ７で取り出された部
分領域定義中の文書タグ列を取り出すと共に、「始点」
と「終点」によって指定される領域を切り出し画像デー
タを得ることによって、（文書タグ列、画像データ）対
を作成し、ｓｔｅｐ１３に進む。例えば、「領域ＩＤ」
２によって確定される部分領域を取り出した場合は、文
書タグ列は＜カタログ＞＜日時＞である。また、画像デ
ータは、（３０，５０）を左上の頂点とし、かつ（１６
０，８０）を右下の頂点とする矩形領域を、ｓｔｅｐ１
で読み込んだ文書画像データから切り出した結果得られ
るデータである。但し、ここで、（３０，５０）はＸ座
標が３０、Ｙ座標が５０である文書画像データ上の点を
指し、図５の文書画像データの例では、名称の領域５２
に対応する。When a partial area is obtained in step 7, the control means 7 checks the cutout mode, and proceeds to step 9 in the normal mode, to step 10 in the repeated structure recognition mode, to step 11 in the nested structure recognition mode, and to step 12 in the image cutout mode (step 8). For example, the partial area identified by "area ID" 2 is in the normal mode, so the process proceeds to step 9. The repeated structure recognition mode is described in the third embodiment, the nested structure recognition mode in the fourth embodiment, and the image cutout mode in the fifth embodiment. In step 9, the document tag string in the partial area definition obtained in step 7 is extracted, and the area designated by the "start point" and "end point" is cut out to obtain image data, thereby creating a (document tag string, image data) pair; the process then proceeds to step 13. For example, when the partial area identified by "area ID" 2 is taken out, the document tag string is <catalog><date>. The image data is the data obtained by cutting out, from the document image data read in step 1, the rectangular area whose upper-left vertex is (30, 50) and whose lower-right vertex is (160, 80). Here, (30, 50) denotes the point on the document image data whose X coordinate is 30 and whose Y coordinate is 50; in the example document image data of FIG. 5, this area corresponds to the name area 52.
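The cutout of step 9 can be sketched as a plain rectangle crop. The sketch below assumes the image is a list of pixel rows indexed as image[y][x] and treats the "end point" as exclusive; the text does not specify either point, and the function name is hypothetical.

```python
from typing import List, Sequence, Tuple

Image = List[List[int]]


def cut_out(image: Sequence[Sequence[int]],
            start: Tuple[int, int],
            end: Tuple[int, int]) -> Image:
    """Return the rectangle with upper-left `start` and lower-right `end`.

    Points are (x, y). The end point is treated as exclusive (an assumption).
    """
    (x0, y0), (x1, y1) = start, end
    if not (0 <= x0 <= x1 and 0 <= y0 <= y1):
        raise ValueError("start must be the upper-left vertex of the rectangle")
    return [list(row[x0:x1]) for row in image[y0:y1]]


# A tiny 5x4 test image whose pixel value encodes its position as 10*y + x.
image = [[10 * y + x for x in range(5)] for y in range(4)]
pair = (("catalog", "date"), cut_out(image, (1, 1), (3, 3)))
print(pair)
```

The resulting tuple plays the role of the (document tag string, image data) pair that is handed to step 13.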

【００３２】ｓｔｅｐ１３では、ｓｔｅｐ９、ｓｔｅｐ
１０、ｓｔｅｐ１１で生成された（文書タグ列、画像デ
ータ）対のデータ中の画像データを、公知のパターン認
識技術を用いて、画像データをコード化データに置換す
ることにより、（文書タグ列、コード化データ）対を生
成する。例えば、「領域ＩＤ」２によって確定される部
分領域の場合は、ｓｔｅｐ１２において得られた画像デ
ータから「炊飯器」というコード化データが得られるの
で、（＜カタログ＞＜日時＞、炊飯器）という（文書タ
グ列、コード化データ）対が生成される。次に、制御手
段７は、文字置換モードをチェックし、オンの場合はｓ
ｔｅｐ１５へ、オフの場合はｓｔｅｐ１６へ進む（ｓｔ
ｅｐ１４）。例えば、「領域ＩＤ」２によって確定され
る部分領域の場合はオフであるから、ｓｔｅｐ１６に進
む。また、文字置換モードがオンの場合は、実施の形態
７に記載する。ｓｔｅｐ１６では、テキスト解析モード
をチェックし、オンの場合はｓｔｅｐ１７へ、オフの場
合はｓｔｅｐ１８へ進む。例えば、「領域ＩＤ」２によ
って確定される部分領域の場合はオフであるから、テキ
スト解析モードはオフであり、ｓｔｅｐ１８に進む。ま
た、テキスト解析モードがオンの場合は、実施の形態６
に記載する。In step 13, the image data in the (document tag string, image data) pair generated in step 9, step 10, or step 11 is replaced with coded data using a known pattern recognition technique, thereby generating a (document tag string, coded data) pair. For example, for the partial area identified by "area ID" 2, the coded data "rice cooker" is obtained from the image data obtained in step 12, so the (document tag string, coded data) pair (<catalog><date>, rice cooker) is generated. Next, the control means 7 checks the character replacement mode, and proceeds to step 15 if it is on and to step 16 if it is off (step 14). For the partial area identified by "area ID" 2, the mode is off, so the process proceeds to step 16. The case where the character replacement mode is on is described in the seventh embodiment. In step 16, the text analysis mode is checked, and the process proceeds to step 17 if it is on and to step 18 if it is off. For the partial area identified by "area ID" 2, the text analysis mode is off, so the process proceeds to step 18. The case where the text analysis mode is on is described in the sixth embodiment.

【００３３】ｓｔｅｐ１８では、ｓｔｅｐ１３、ｓｔｅ
ｐ１５、ｓｔｅｐ１７で得られた（文書タグ列、コード
化データ）対を文書タグ／コード化データ対応表の末尾
に追加し、ｓｔｅｐ７に戻る。文書タグ／コード化デー
タ対応表とは、いわゆる中間表であり、本文中に記載し
ていない主記憶、または制御手段のメモリに一時的に生
成される。これは図８に示されるような「文書タグ列」
と「コード化データ」の対応表である。例えば、ｓｔｅ
ｐ７において「領域ＩＤ」２によって確定される部分領
域が選択された場合は、図８の９２に示されるように、
ｓｔｅｐ３によって得られた（＜カタログ＞＜日時＞、
炊飯器）という（文書タグ列、コード化データ）対が登
録される。また、図５の文書画像に対して、図６の６１
に示す５つの部分領域すべてに対するｓｔｅｐ１８の処
理が終った場合には、図８に示す文書タグ／コード化デ
ータ対応表が得られる。In step 18, the (document tag string, coded data) pair obtained in step 13, step 15, or step 17 is appended to the end of the document tag / coded data correspondence table, and the process returns to step 7. The document tag / coded data correspondence table is a so-called intermediate table, and is temporarily generated in the main memory (not described in the text) or in the memory of the control means. As shown in FIG. 8, it is a table that associates each "document tag string" with its "coded data". For example, when the partial area identified by "area ID" 2 is selected in step 7, the (document tag string, coded data) pair (<catalog><date>, rice cooker) obtained in step 3 is registered, as shown at 92 in FIG. 8. When the processing of step 18 has been completed for all five partial areas shown at 61 in FIG. 6 for the document image of FIG. 5, the document tag / coded data correspondence table shown in FIG. 8 is obtained.
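The intermediate table built by steps 13 and 18 can be sketched as an ordered list of rows. In this sketch the pattern recognizer is passed in as a function, because the text only refers to "a known pattern recognition technique"; `fake_recognize`, the byte-string image placeholders, and the second row are hypothetical stand-ins used only to make the example runnable.

```python
from typing import Callable, Iterable, List, Tuple

TagPath = Tuple[str, ...]
Row = Tuple[TagPath, str]


def build_correspondence_table(
    pairs: Iterable[Tuple[TagPath, bytes]],
    recognize: Callable[[bytes], str],
) -> List[Row]:
    """Turn (tag string, image data) pairs into (tag string, coded data) rows."""
    table: List[Row] = []
    for tag_path, image_data in pairs:
        coded = recognize(image_data)      # step 13: image data -> coded data
        table.append((tag_path, coded))    # step 18: append to the end
    return table


def fake_recognize(image_data: bytes) -> str:
    """Stand-in for a real recognizer (hypothetical lookup table)."""
    return {b"area-2": "rice cooker", b"area-x": "sample text"}[image_data]


table = build_correspondence_table(
    [(("catalog", "date"), b"area-2"),
     (("catalog", "text", "features", "list", "item"), b"area-x")],
    fake_recognize,
)
for row in table:
    print(row)
```

Appending in processing order matters, because the later tagging steps rely on consecutive rows that share a document tag string.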

【００３４】一方、ｓｔｅｐ７において部分領域が取り
出せない場合またはすべての取り出しを終えた場合に
は、構造化文書生成手段５が、文書型定義記憶手段３５
から、文書画像領域・論理構造対応定義の文書型定義名
称が指定する文書型定義を読み込み（ｓｔｅｐ１９）、
文書型定義から文書タグ列を順に取り出し（ｓｔｅｐ２
０）、文書タグが取り出せた場合はｓｔｅｐ２１に、す
べての文書タグ列を取り出し終えた場合はｓｔｅｐ２４
に進む。図６に示す文書画像領域・論理構造対応定義の
例では、７３に示される「/usr/local/dtd/catalog.dt
d」が文書型定義名称である。また、図７は、文書型定
義名称が指定する文書型定義の一例である。文書型定義
は文書中に出現する文書タグの順序を規定するものであ
り、図７の例における１行目は、＜カタログ＞という文
書タグが指定する領域内では、＜名称＞、＜日時＞、＜
本文＞という文書タグがこの順に出現することを示す。
図７の４、５行目も同様に、<!ELEMENT の次に出現する
文書タグが指定する領域内では、（）の中にある文書タ
グが順に出現することを示す。図７の２行目は、＜名称
＞という文書タグの後には文字列が出現することを示
す。図７の３、７、８、１１、１２行目も同様に、各々
の行の<!ELEMENT の次に出現する文書タグの次に文字列
が出現することを示す。図７の６行目は、＜リスト＞と
いう文書タグの後には、＜項目＞という文書タグが０個
以上繰り返し出現することを示す。図７の１０行目は、
＜表＞という文書タグの後には、＜属性＞と＜値＞とい
う文書タグの対が繰り返し出現することを示す。図７の
定義中の文書タグの親子関係を木構造で表現すると図９
のようになる。On the other hand, when no partial area can be obtained in step 7, or when all partial areas have been taken out, the structured document generation means 5 reads, from the document type definition storage means 35, the document type definition designated by the document type definition name in the document image area / logical structure correspondence definition (step 19), and takes out document tag strings in order from the document type definition (step 20). When a document tag string is obtained, the process proceeds to step 21; when all document tag strings have been taken out, it proceeds to step 24. In the example of the document image area / logical structure correspondence definition shown in FIG. 6, "/usr/local/dtd/catalog.dtd" shown at 73 is the document type definition name. FIG. 7 is an example of the document type definition designated by that name. A document type definition prescribes the order of the document tags that appear in a document. The first line of the example in FIG. 7 indicates that, within the area designated by the document tag <catalog>, the document tags <name>, <date>, and <text> appear in this order. Likewise, the 4th and 5th lines of FIG. 7 indicate that, within the area designated by the document tag that follows <!ELEMENT, the document tags inside the parentheses appear in order. The 2nd line of FIG. 7 indicates that a character string appears after the document tag <name>. Likewise, the 3rd, 7th, 8th, 11th, and 12th lines of FIG. 7 indicate that a character string follows the document tag that follows <!ELEMENT on each line. The 6th line of FIG. 7 indicates that, after the document tag <list>, the document tag <item> appears zero or more times. The 10th line of FIG. 7 indicates that, after the document tag <table>, pairs of the document tags <attribute> and <value> appear repeatedly. FIG. 9 shows the parent-child relationships of the document tags in the definition of FIG. 7 as a tree structure.

【００３５】次に、取り出された文書タグ列を含む列
を、文書タグ／コード化データ対応表から取り出す（ｓ
ｔｅｐ２１）。文書タグ／コード化データ対応表におい
て、同じ文書タグ列が連続している場合は、それをすべ
て取り出す。例えば、ｓｔｅｐ２０で＜カタログ＞＜日
時＞を取り出した際には、図８の文書タグ／コード化デ
ータ対応表の９２の部分を取り出す。また、ｓｔｅｐ２
０で＜カタログ＞＜本文＞＜特長＞＜リスト＞＜項目＞
を取り出し際には、図８の文書タグ／コード化データ対
応表の９３の部分を取り出す。次に、取り出された列の
コード化データに文書タグを付与する（ｓｔｅｐ２
２）。付与すべき文書タグは、一つ前で取り出した文書
タグ列と共通部分を取り除いた部分である。例えば、ｓ
ｔｅｐ２０において、＜カタログ＞＜本文＞＜外観＞と
いう文書タグ列を取り出した場合は、一つ前の文書タグ
列＜カタログ＞＜本文＞＜特長＞＜リスト＞＜項目＞と
の共通部分である＜カタログ＞＜本文＞を取り除いた＜
外観＞という文書タグが付与される。結局、ｓｔｅｐ２
０において、＜カタログ＞＜本文＞＜外観＞という文書
タグ列を取り出した場合は、ｓｔｅｐ２１において、図
８の文書タグ／コード化データ対応表９４の部分が取り
出されるので、コード化データ「picture.img 」に＜外
観＞という文書タグを付与し、＜外観＞picture.img ＜
＼外観＞という文字列を得る。＜＼外観＞という文書タ
グは、＜外観＞という文書タグの終りを示す文書タグで
ある。次に、ｓｔｅｐ２２で生成された文書タグ付きコ
ード化データを作成中の構造化文書の末尾に追加する
（ｓｔｅｐ２３）。Next, the entries containing the retrieved document tag string are taken out of the document tag / coded data correspondence table (step 21). When the same document tag string appears in consecutive entries of the table, all of those entries are taken out. For example, when <catalog><date> is taken out in step 20, part 92 of the document tag / coded data correspondence table of FIG. 8 is taken out. When <catalog><text><features><list><item> is taken out in step 20, part 93 of the document tag / coded data correspondence table of FIG. 8 is taken out. Next, document tags are attached to the coded data of the entries taken out (step 22). The document tags to attach are what remains of the document tag string after removing the part it has in common with the document tag string taken out immediately before. For example, when the document tag string <catalog><text><appearance> is taken out in step 20, the common part <catalog><text> that it shares with the preceding document tag string <catalog><text><features><list><item> is removed, and the document tag <appearance> is attached. Thus, when <catalog><text><appearance> is taken out in step 20, part 94 of the document tag / coded data correspondence table of FIG. 8 is taken out in step 21, so the document tag <appearance> is attached to the coded data "picture.img", yielding the character string <appearance>picture.img<\appearance>. The document tag <\appearance> indicates the end of the document tag <appearance>. Next, the tagged coded data generated in step 22 is appended to the end of the structured document being created (step 23).
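The rule of step 22, attaching only the tags that are not shared with the preceding document tag string, can be sketched as a common-prefix computation. The function names are hypothetical, and the sketch wraps the data in a single tag only, as in the example from the text; how ancestor tags are opened and closed is not covered here.

```python
from typing import Tuple

TagPath = Tuple[str, ...]


def tags_to_attach(previous: TagPath, current: TagPath) -> TagPath:
    """Drop the leading tags that `current` shares with `previous`."""
    shared = 0
    for prev_tag, cur_tag in zip(previous, current):
        if prev_tag != cur_tag:
            break
        shared += 1
    return current[shared:]


def wrap(tag: str, coded_data: str) -> str:
    """Wrap coded data in one tag, using the end-tag form shown in the text."""
    return f"<{tag}>{coded_data}<\\{tag}>"


previous = ("catalog", "text", "features", "list", "item")
current = ("catalog", "text", "appearance")
new_tags = tags_to_attach(previous, current)
print(new_tags)                        # the tags left after removing <catalog><text>
print(wrap(new_tags[0], "picture.img"))
```

Comparing against the immediately preceding tag string is what lets the generator avoid reopening <catalog> and <text> for every entry.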

【００３６】例えば、ｓｔｅｐ２０において、＜カタロ
グ＞＜本文＞＜外観＞という文書タグ列を取り出される
前のｓｔｅｐ２３の終了後には、図１４（ａ）に示す構
造化文書が中間的に生成されている。また、ｓｔｅｐ２
０において、＜カタログ＞＜本文＞＜外観＞という文書
タグ列を取り出される後のｓｔｅｐ２３の終了時には、
ｓｔｅｐ２２で得られた＜外観＞picture.img ＜＼外観
＞という文字列を末尾に追加することにより、図１４
（ｂ）に示す構造化文書を中間的に生成する。そして、
図８に示す文書タグ／コード化データ対応表の場合に
は、ｓｔｅｐ２０においてすべてのタグを取り出し終え
た後には、図１０に示す構造化文書が得られる。For example, at the end of step 23 immediately before the document tag string <catalog><text><appearance> is taken out in step 20, the structured document shown in FIG. 14(a) has been generated as an intermediate result. At the end of step 23 after that document tag string has been taken out in step 20, the character string <appearance>picture.img<\appearance> obtained in step 22 has been appended to the end, producing the intermediate structured document shown in FIG. 14(b). With the document tag / coded data correspondence table shown in FIG. 8, the structured document shown in FIG. 10 is obtained once all tags have been taken out in step 20.

【００３７】また、図５に示す文書画像データに対し
て、図１１に示す文書画像領域・論理構造対応定義と図
１２に示す文書型定義を用いた場合には、図２ないし図
４に示すフローチャートに従えば、図１３に示す構造化
文書が得られる。図１３に示す構造化文書は、図１０に
示す構造化文書とは、文書タグ名、文書構造、日時等の
データの表現形式の点で異なっている。例えば、文書タ
グ名においては、図２ないし図４の＜名称＞に対して、
図１３では＜製品名＞となっており、日時等のデータの
表現形式においては、図１０の平成７年９月２５日に対
して、図１３では１９９５／９／２５となっている。こ
の日時の形式の違いは、文書画像領域・論理構造対応定
義において、「領域ＩＤ」１の「文字置換指針」の値
が、図６の場合は西暦元号変換であるが、図１１の場合
はなしであることによっている。したがって、文書画像
領域・論理構造対応定義と文書型定義を取り替えること
によって、出力とする構造化文書の形式、すなわち、文
書タグ名、文書構造、データの表現形式等を自由に選択
することができることが示された。When the document image area / logical structure correspondence definition shown in FIG. 11 and the document type definition shown in FIG. 12 are applied to the document image data shown in FIG. 5, following the flowcharts of FIGS. 2 to 4 yields the structured document shown in FIG. 13. The structured document of FIG. 13 differs from that of FIG. 10 in its document tag names, its document structure, and the representation format of data such as dates. For example, as to document tag names, <name> in FIGS. 2 to 4 becomes <product name> in FIG. 13; as to the representation of data such as dates, 平成７年９月２５日 (September 25, Heisei 7) in FIG. 10 appears as 1995/9/25 in FIG. 13. This difference in date format arises because, in the document image area / logical structure correspondence definition, the value of the "character replacement guideline" of "area ID" 1 is Western calendar / Japanese era conversion in FIG. 6 but is absent in FIG. 11. This shows that, by swapping the document image area / logical structure correspondence definition and the document type definition, the format of the output structured document, that is, the document tag names, the document structure, the data representation format, and so on, can be chosen freely.
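The date difference described above can be illustrated with a small conversion sketch. The function name, the "Y/M/D" input format, and the direction of the conversion (Western calendar to era) are assumptions; the actual character replacement processing belongs to the seventh embodiment and is not reproduced here. Only the Heisei era is handled, and the output uses ASCII digits rather than the full-width digits shown in the text.

```python
import datetime as dt

# The Heisei era ran from 1989-01-08 to 2019-04-30, so Heisei 1 is 1989.
HEISEI_START = dt.date(1989, 1, 8)
HEISEI_END = dt.date(2019, 4, 30)


def western_to_heisei(text: str) -> str:
    """Convert 'YYYY/M/D' to a Heisei-era date string (hypothetical helper)."""
    year, month, day = (int(part) for part in text.split("/"))
    date = dt.date(year, month, day)   # also validates the calendar date
    if not HEISEI_START <= date <= HEISEI_END:
        raise ValueError(f"{text} is outside the Heisei era")
    return f"平成{year - 1988}年{month}月{day}日"


print(western_to_heisei("1995/9/25"))
```

A definition whose guideline is absent would simply pass the recognized string through unchanged, which corresponds to the 1995/9/25 form.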

【００３８】ｓｔｅｐ２４では、制御手段７によって、
外部手続き呼びだしモードをチェックし、オンの場合は
ｓｔｅｐ２５へ、オフの場合はｓｔｅｐ２５に進む。文
書画像領域・論理構造対応定義が図６に示す一例である
場合には、外部手続き呼びだしモードはオフであるか
ら、ｓｔｅｐ２３で生成された構造化文書をモニター等
の表示装置またデスク等の記憶装置に出力する（ｓｔｅ
ｐ２５）。また、外部手続き呼びだしモードがオンであ
る場合は、実施の形態９に記載する。In step 24, the control means 7 checks the external procedure calling mode, and proceeds to step 25 if it is on and to step 25 if it is off. When the document image area / logical structure correspondence definition is the example shown in FIG. 6, the external procedure calling mode is off, so the structured document generated in step 23 is output to a display device such as a monitor or to a storage device such as a disk (step 25). The case where the external procedure calling mode is on is described in the ninth embodiment.

【００３９】したがって、上記のような構成によれば、
ｓｔｅｐ７からｓｔｅｐ１８の処理において、文書画像
領域・論理構造対応定義に記述された部分領域の位置に
関する情報を利用して切り出されれた文書画像からコー
ド化データ対応表を作成するので、文書画像データにお
ける定型的なフォーマットに関する情報を有効に活用で
きる。また、ｓｔｅｐ７からｓｔｅｐ１８によって得ら
れる部分領域に関する情報を記述した文書タグ／コード
化データ対応表を基にして、文書タグ／コード化データ
対応表における文書タグ列をキーとして、文書型定義を
参照しながら構造化文書を構成するので、出力とする構
造化文書の形式を自由に選択することができる。Therefore, with the above configuration, in the processing of steps 7 to 18 the coded data correspondence table is created from document images cut out using the positional information on partial areas described in the document image area / logical structure correspondence definition, so information about the fixed format of the document image data can be used effectively. In addition, the structured document is constructed from the document tag / coded data correspondence table, which describes the information on the partial areas obtained in steps 7 to 18, by using the document tag strings in that table as keys and referring to the document type definition, so the format of the output structured document can be chosen freely.

【００４０】実施の形態２．本実施の形態２では、複数
の文書画像データ間の領域の大きさの誤差を吸収し、対
応定義記憶部に記憶されている、いわば標準の部分領域
と新しい入力の文書画像データの部分領域とのゆらぎ、
またはフォーマットの違いがあっても処理が可能となる
例を説明する。これは読み取り誤差がある場合にも対処
でき、対応定義と文書型定義への登録の数を減らすこと
ができる。図１５は実施の形態２における構造化文書生
成装置の構成図であり、図で実施の形態１と同様又は相
当する部分については同一符合を付しその説明を省略す
る。８は、文書画像データ入力手段１によって入力され
た文書画像データを受けとり、その画像中の閉曲線で囲
まれた領域と文書画像領域・論理構造対応定義記憶部２
で記憶されている「文書画像データの部分領域」とを比
較することによって、文書画像領域を推定し、その推定
された文書画像の部分領域を切り出し、切り出された文
書画像データをパターン認識手段４へと送る文書画像領
域推定手段である。Embodiment 2. The second embodiment describes an example that absorbs errors in area size among multiple pieces of document image data, so that processing remains possible even when the partial areas of newly input document image data fluctuate from, or differ in format from, the standard partial areas stored in the correspondence definition storage unit. This also copes with reading errors and reduces the number of entries that must be registered in the correspondence definition and the document type definition. FIG. 15 is a configuration diagram of the structured document generation device according to the second embodiment; parts that are the same as or correspond to those of the first embodiment are given the same reference numerals, and their description is omitted. Reference numeral 8 denotes a document image area estimation means that receives the document image data input by the document image data input means 1, estimates the document image areas by comparing the areas enclosed by closed curves in the image with the "partial areas of the document image data" stored in the document image area / logical structure correspondence definition storage unit 2, cuts out the estimated partial areas of the document image, and sends the cut-out document image data to the pattern recognition means 4.

【００４１】上記のように構成された構造化文書生成装
置の動作を図２ないし４とそのステップ６の詳細である
図１６のフローチャートに沿って説明する。本実施の形
態２は、図２のフローチャートにおいて、画像領域推定
モードがオンの場合であり、制御手段７は、文書画像領
域推定モードをチェックし（ｓｔｅｐ５）、ｓｔｅｐ６
に進む。ｓｔｅｐ６の動作を図１６のフローチャートに
沿って説明する。文書画像データ入力手段１によって入
力された文書画像データ（以後、Ｘと呼ぶ）において閉
曲線で囲まれた領域を認識する（ｓｔｅｐ３１）。例え
ば、図５の文書画像データの場合には、名称、特徴、外
観図などの矩形で囲まれた領域を認識する。ついで、文
書画像領域・論理構造対応定義における部分領域Ｙ_i を
順にひとつづつ取り出し（ｓｔｅｐ３２）、取り出され
た場合にはｓｔｅｐ３３に、すべて取り出した場合には
図２のｓｔｅｐ７に進む。ただし、ｉは文書画像領域・
論理構造対応定義における部分領域の数ｎ以下の自然数
である。図６の文書画像領域・論理構造対応定義の場合
には、部分領域は２次元平面上の２点で指定される矩形
領域であり、ｎは５である。The operation of the structured document generation apparatus configured as described above will be described with reference to FIGS. 2 to 4 and the flowchart of FIG. 16, which shows step 6 in detail. The second embodiment is the case where the image area estimation mode is on in the flowchart of FIG. 2; the control means 7 checks the document image area estimation mode (step 5) and proceeds to step 6. The operation of step 6 is described along the flowchart of FIG. 16. First, the areas enclosed by closed curves are recognized in the document image data input by the document image data input means 1 (hereinafter called X) (step 31). For example, in the document image data of FIG. 5, the areas enclosed by rectangles, such as the name, the features, and the external view, are recognized. Next, the partial areas Y_i in the document image area / logical structure correspondence definition are taken out one at a time in order (step 32); if one is obtained, the process proceeds to step 33, and if all have been taken out, it proceeds to step 7 of FIG. 2. Here, i is a natural number no greater than n, the number of partial areas in the document image area / logical structure correspondence definition. In the document image area / logical structure correspondence definition of FIG. 6, each partial area is a rectangular area designated by two points on a two-dimensional plane, and n is 5.

【００４２】ついで、ｓｔｅｐ３１で認識されたＸ中の
領域の中で、Ｙ_i と重なり部分をもつ領域を取り出す
（ｓｔｅｐ３３）。ただし、取り出された領域をＺ_ijと
し、Ｚ_ijとＹ_i との重なり部分の領域をＡ_ijと呼ぶ。た
だし、ｊはＹ_i と重なり部分をもつＸの部分領域の数ｍ
以下の自然数である。ついで、ｉを固定しｊを動かした
場合に、（Ａ_ijの面積）／（（Ｙ_i の面積）＋（Ｚ_ijの面積））が最大となるＺ_ijを求める（ｓｔｅｐ３４）、文書画像
領域・論理構造対応定義における部分領域Ｙ_i をＺ_ijで
置換し、ｓｔｅｐ３２に戻る（ｓｔｅｐ３５）。上記の
ような構成によれば、文書画像データ入力手段１によっ
て入力された文書画像データのフォーマット形式が、文
書画像領域・論理構造対応定義記憶部２に記憶される文
書画像領域・論理構造対応定義が規定するフォーマット
形式と少し異っている場合にも、文書画像領域と論理構
造の対応関係を正しく認識することができる。この機能
は、文書画像データ入力手段１で入力されたデータがず
れている場合や少し異ったフォーマットをもつ文書画像
データを扱う場合に有効である。Next, from among the areas in X recognized in step 31, the areas that overlap Y_i are taken out (step 33). Each area taken out is called Z_ij, and the overlap between Z_ij and Y_i is called A_ij, where j is a natural number no greater than m, the number of areas of X that overlap Y_i. Then, with i fixed and j varied, the Z_ij that maximizes (area of A_ij) / ((area of Y_i) + (area of Z_ij)) is found (step 34), the partial area Y_i in the document image area / logical structure correspondence definition is replaced with that Z_ij, and the process returns to step 32 (step 35). With this configuration, the correspondence between document image areas and logical structure can be recognized correctly even when the format of the document image data input by the document image data input means 1 differs slightly from the format prescribed by the document image area / logical structure correspondence definition stored in the document image area / logical structure correspondence definition storage unit 2. This function is effective when the data input by the document image data input means 1 is misaligned, or when document image data with a slightly different format is handled.
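Steps 33 to 35 can be sketched for axis-aligned rectangles as follows. The tuple layout (x0, y0, x1, y1) and the function names are assumptions made for this example; the score is the ratio stated in the text, area(A_ij) / (area(Y_i) + area(Z_ij)), which reaches at most 0.5 when the two rectangles coincide.

```python
from typing import Iterable, Optional, Tuple

Rect = Tuple[int, int, int, int]   # (x0, y0, x1, y1), upper-left and lower-right


def area(r: Rect) -> int:
    return max(0, r[2] - r[0]) * max(0, r[3] - r[1])


def intersection(a: Rect, b: Rect) -> Optional[Rect]:
    """Return the overlapping rectangle A_ij, or None if there is no overlap."""
    x0, y0 = max(a[0], b[0]), max(a[1], b[1])
    x1, y1 = min(a[2], b[2]), min(a[3], b[3])
    return (x0, y0, x1, y1) if x0 < x1 and y0 < y1 else None


def best_match(y: Rect, recognized: Iterable[Rect]) -> Optional[Rect]:
    """Pick the recognized area Z_ij that maximizes the overlap score for Y_i."""
    best, best_score = None, 0.0
    for z in recognized:
        a = intersection(y, z)                       # step 33
        if a is None:
            continue
        score = area(a) / (area(y) + area(z))        # step 34
        if score > best_score:
            best, best_score = z, score
    return best                                      # step 35 replaces Y_i with this


defined = (30, 50, 160, 80)                          # Y_i from the definition
found = [(32, 52, 162, 82), (150, 70, 300, 200), (500, 500, 600, 600)]
print(best_match(defined, found))
```

Dividing by the sum of both areas penalizes a very large recognized area that merely happens to contain the defined one.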

【００４３】なお、上記の例では、文書画像領域推定手
段８において、文書画像データ中の部分領域を確定する
ために、閉曲線を用いる例を示したが、「フォントや大
きさのような文字属性」や「点線などを補間することに
よって得られる閉曲線」を用いることによって、部分領
域を確定しても良い。In the above example, the document image area estimation means 8 uses closed curves to determine the partial areas in the document image data, but the partial areas may instead be determined using "character attributes such as font and size" or "closed curves obtained by interpolating dotted lines and the like".

【００４４】実施の形態３．定義への登録数を減らす他
の例を説明する。図１７は実施の形態３における構造化
文書生成装置の構成図であり、図で実施の形態１と同様
又は相当する部分については同一符合を付しその説明を
省略する。９は、文書画像データ入力手段１によって入
力された文書画像データを受けとり、その画像データに
おいて繰り返し出現する閉曲線で囲まれた領域を認識
し、その領域を切り出し、パターン認識手段４に送ると
共に、繰り返し構造に関する情報を構造化文書生成手段
５に送る繰り返し構造認識手段である。Embodiment 3. Another example of reducing the number of entries registered in the definitions will be described. FIG. 17 is a configuration diagram of the structured document generation device according to the third embodiment; parts that are the same as or correspond to those of the first embodiment are given the same reference numerals, and their description is omitted. Reference numeral 9 denotes a repeated structure recognition means that receives the document image data input by the document image data input means 1, recognizes the areas enclosed by closed curves that appear repeatedly in the image data, cuts out those areas and sends them to the pattern recognition means 4, and sends information about the repeated structure to the structured document generation means 5.

【００４５】上記のように構成された構造化文書生成装
置の動作を図２ないし図４とそのステップ１０の詳細で
ある図１８のフローチャートに沿って説明する。実施の
形態３は、図３のフローチャートにおいて、切り出しモ
ードが繰り返し構造認識モードの場合であり、例えば、
図６の文書画像領域・論理構造対応定義における領域Ｉ
Ｄ５の部分領域を処理する場合にあたる。制御手段７
は、切り出しモードドをチェックし（ｓｔｅｐ８）、ｓ
ｔｅｐ１０に進む。ｓｔｅｐ１０の動作を図１８のフロ
ーチャートに沿って説明する。文書画像データ入力手段
１によって入力された文書画像データ（以後、Ｘと呼
ぶ）において閉曲線で囲まれた領域Ｘ_i を認識する（ｓ
ｔｅｐ４１）、ただし、ｉは１から認識された領域の数
ｎまでの整数である。The operation of the structured document generation apparatus configured as described above will be described with reference to FIGS. 2 to 4 and the flowchart of FIG. 18, which shows step 10 in detail. The third embodiment is the case where the cutout mode is the repeated structure recognition mode in the flowchart of FIG. 3, and corresponds, for example, to processing the partial area with area ID 5 in the document image area / logical structure correspondence definition of FIG. 6. The control means 7 checks the cutout mode (step 8) and proceeds to step 10. The operation of step 10 is described along the flowchart of FIG. 18. The areas X_i enclosed by closed curves are recognized in the document image data (hereinafter called X) input by the document image data input means 1 (step 41), where i is an integer from 1 to n, the number of recognized areas.

【００４６】次に、図６の文書画像領域・論理構造対応
定義の領域ＩＤ５におけるテキスト解析指針６７が「繰
り返し構造切り出し１」であることより、この項に収容
しきれない複雑な属性処理を別に用意された詳細処理表
である。７０に示されるテキスト解析指針を読み込む。
その結果、繰り返し領域ＩＤをキーとして、文書画像領
域・論理構造対応定義から繰り返し出現する領域の個数
ｍ（この場合は２）と形状Ｙ_j を得る（ｓｔｅｐ４
２）。次に、ｓｔｅｐ４２で得られた形状を順に対応付
けることにより、Ｘ_i に対応するＹ_j の候補を得（ｓｔ
ｅｐ４３）。文書画像領域・論理構造対応定義で記述さ
れる繰り返し領域関係を満たしているかを確認する（ｓ
ｔｅｐ４４）。繰り返し構造を満たしている場合はｓｔ
ｅｐ４５に進み、そうでない場合はｓｔｅｐ４３に戻
る。図６の文書画像領域・論理構造対応定義の場合に
は、７０の部分に繰り返し領域関係が記述されており、
この場合は、７１によって規定されるＲ１とＲ２という
矩形がこの順に右横方向に並んでおり、その並びが縦下
方向に繰り返すことを意味する。Ｒ１、Ｒ２は図５の文
書画像イメージの領域５８、５９に対応している。従っ
て、ｓｔｅｐ４４では、「Ｙｊの候補として得られるＲ
１とＲ２という形状が右横に並んでいること」、また
「Ｒ１とＲ２の組が縦下方向に繰り返し並んでいるこ
と」を確認する。Next, because the text analysis guideline 67 for area ID 5 in the document image area / logical structure correspondence definition of FIG. 6 is "repeated structure cutout 1", the text analysis guideline shown at 70 is read; this is a separately prepared detailed processing table for complex attribute processing that does not fit in the field itself. As a result, using the repeated area ID as a key, the number m of repeatedly appearing areas (2 in this case) and their shapes Y_j are obtained from the document image area / logical structure correspondence definition (step 42). Next, by associating the shapes obtained in step 42 in order, a candidate Y_j corresponding to each X_i is obtained (step 43), and it is checked whether the repeated area relation described in the document image area / logical structure correspondence definition is satisfied (step 44). If the repeated structure is satisfied, the process proceeds to step 45; otherwise it returns to step 43. In the document image area / logical structure correspondence definition of FIG. 6, the repeated area relation is described in part 70; in this case it means that the rectangles R1 and R2 defined by 71 are arranged in this order from left to right, and that this arrangement repeats downward. R1 and R2 correspond to areas 58 and 59 of the document image of FIG. 5. Accordingly, step 44 confirms that "the shapes R1 and R2 obtained as candidates for Y_j are arranged side by side to the right" and that "the pairs of R1 and R2 are repeated downward".
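The check of step 44 can be sketched for the two-shape case (m = 2) described above. The rectangle layout (x0, y0, x1, y1), the pixel tolerance `tol`, and the function name are assumptions for illustration; the text does not state how closely the rectangles must align.

```python
from typing import List, Tuple

Rect = Tuple[int, int, int, int]   # (x0, y0, x1, y1)


def is_repeated_pair_layout(rects: List[Rect], tol: int = 2) -> bool:
    """Check that rects form (R1, R2) pairs side by side, repeated downward."""
    if not rects or len(rects) % 2:
        return False
    ordered = sorted(rects, key=lambda r: (r[1], r[0]))   # top to bottom
    prev_bottom = None
    for k in range(0, len(ordered), 2):
        # Within one pair, the rectangle further left plays the role of R1.
        left, right = sorted(ordered[k:k + 2], key=lambda r: r[0])
        same_row = (abs(left[1] - right[1]) <= tol
                    and abs(left[3] - right[3]) <= tol)
        side_by_side = right[0] >= left[2] - tol
        below = prev_bottom is None or left[1] >= prev_bottom - tol
        if not (same_row and side_by_side and below):
            return False
        prev_bottom = max(left[3], right[3])
    return True


table_cells = [(0, 0, 50, 20), (50, 0, 150, 20),
               (0, 20, 50, 40), (50, 20, 150, 40)]
print(is_repeated_pair_layout(table_cells))
```

Returning False corresponds to going back to step 43 to try another candidate assignment.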

【００４７】次に、ｓｔｅｐ４３で得られた領域Ｘ_i と
形状Ｙ_j の対に対して、形状Ｙ_j が出現した回数を順序
Ａ_i とし、次に文書画像データから領域Ｘ_i を切り出
し、切り出された画像データをＺ_i とする（ｓｔｅｐ４
６）、（文書タグ列，Ｚ_i ，Ａ_i ）の三つ組をｓｔｅｐ
１３に送る（ｓｔｅｐ４７）。なお、文書タグ列は、処
理中の部分領域の文書タグ列に、繰り返し領域関係が指
定する文書タグを加えたものを指定する。例えば、文書
画像データの一例である図５の５９の部分を処理したい
場合には、図６の７１のＲ１の部分に対応するので、図
６の６１の「領域ＩＤ」５の文書タグ＜カタログ＞＜本
文＞＜仕様＞に、７０の親タグ＜表＞、７１の文書タグ
＜属性＞を付与することによって＜カタログ＞＜本文＞
＜仕様＞＜表＞＜属性＞が文書タグ列として得られる。
また、図５の５９の部分の画像データをコード化したデ
ータは大きさであり、図５の５９の部分の部分は、図５
の５６の表の２列目にあたるのでＡ_i は２となるので、
（＜カタログ＞＜本文＞＜仕様＞＜表＞＜属性＞，大き
さ，２）という（文書タグ列，Ｚ_i ，Ａ_i ）の三つ組が
生成される。この三つ組は、図８の文書タグ列／コード
化データの対応表の９５中の３行目に当たる。また、図
５の５８の部分は図６の７１のＲ２の部分に対応するの
で、７１の文書タグ＜属性＞のかわりに文書タグ＜値＞
を参照することによって、５９の部分の処理と同様な手
順によって、（＜カタログ＞＜本文＞＜仕様＞＜表＞＜
値＞，５０×３０×３０，２）という（文書タグ列，Ｚ
_i ，Ａ_i ）の三つ組が生成される。この三つ組は、図８
の文書タグ列／コード化データの対応表の９５中の４行
目に当たる。また、ｓｔｅｐ７において図６の６１の
「領域ＩＤ」５の部分領域が取り出された場合は、ｓｔ
ｅｐ１８において、文書タグ列／コード化データの対応
表に図８の９５の部分が追加される。この部分は，構造
化文書図１０の１０５に示される論理構造要素に対応す
る。文書画像データが図５であり、かつ文書画像領域・
論理構造対応定義が図６の場合には、図５の領域５６を
繰り返し構造として認識することにより、図１０の１０
５に示される表に対応する論理構造を生成する（ｓｔｅ
ｐ２３）。Next, for each pair of area X_i and shape Y_j obtained in step 43, the number of times the shape Y_j has appeared is taken as the order A_i, the area X_i is cut out of the document image data, and the cut-out image data is taken as Z_i (step 46); the triple (document tag string, Z_i, A_i) is then sent to step 13 (step 47). The document tag string is the document tag string of the partial area being processed, with the document tags designated by the repeated area relation appended. For example, to process part 59 of FIG. 5, an example of document image data, which corresponds to R1 at 71 in FIG. 6, the parent tag <table> at 70 and the document tag <attribute> at 71 are appended to the document tag string <catalog><text><specification> of "area ID" 5 at 61 in FIG. 6, giving <catalog><text><specification><table><attribute> as the document tag string. The coded data obtained from the image data of part 59 of FIG. 5 is "size", and part 59 of FIG. 5 is in the second column of the table 56 in FIG. 5, so A_i is 2 and the (document tag string, Z_i, A_i) triple (<catalog><text><specification><table><attribute>, size, 2) is generated. This triple corresponds to the third row of part 95 in the document tag string / coded data correspondence table of FIG. 8. Similarly, part 58 of FIG. 5 corresponds to R2 at 71 in FIG. 6, so by referring to the document tag <value> instead of the document tag <attribute> at 71 and following the same procedure as for part 59, the (document tag string, Z_i, A_i) triple (<catalog><text><specification><table><value>, 50×30×30, 2) is generated. This triple corresponds to the fourth row of part 95 in the document tag string / coded data correspondence table of FIG. 8. When the partial area of "area ID" 5 at 61 in FIG. 6 is taken out in step 7, part 95 of FIG. 8 is added to the document tag string / coded data correspondence table in step 18. This part corresponds to the logical structure element shown at 105 in the structured document of FIG. 10. When the document image data is that of FIG. 5 and the document image area / logical structure correspondence definition is that of FIG. 6, recognizing area 56 of FIG. 5 as a repeated structure generates the logical structure corresponding to the table shown at 105 in FIG. 10 (step 23).
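The triples of steps 46 and 47 can be sketched as follows. For brevity this sketch takes cell contents that are already recognized strings, whereas in the text Z_i is still image data when the triple is sent to step 13. The function name is hypothetical, and the first repetition in the example is a placeholder because the text gives only the second one.

```python
from typing import List, Sequence, Tuple

TagPath = Tuple[str, ...]
Triple = Tuple[TagPath, str, int]


def make_triples(base_path: TagPath,
                 parent_tag: str,
                 cell_tags: Sequence[str],
                 repetitions: Sequence[Sequence[str]]) -> List[Triple]:
    """Build (document tag string, data, order) triples for a repeated area.

    `order` counts how many times the shape has appeared so far (A_i).
    """
    triples: List[Triple] = []
    for order, cells in enumerate(repetitions, start=1):
        for tag, data in zip(cell_tags, cells):
            triples.append((base_path + (parent_tag, tag), data, order))
    return triples


triples = make_triples(
    base_path=("catalog", "text", "specification"),
    parent_tag="table",
    cell_tags=("attribute", "value"),
    repetitions=[("(first attribute)", "(first value)"),   # placeholder contents
                 ("size", "50×30×30")],                      # values from the text
)
for triple in triples:
    print(triple)
```

With two repetitions, the third and fourth triples match the rows that the text identifies as the third and fourth rows of part 95.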

【００４８】上記のような構成によれば、文書画像デー
タ入力手段１によって入力された文書画像中に、閉曲線
で囲まれた繰り返し構造を含む場合にも、構造化文書を
生成することができる。図５の文書画像データの場合に
は、領域５６中の表がもつ構造情報を認識することがで
きる。なお、上記の実施の形態では、繰り返し構造認識
手段９において、文書画像データ中の部分領域を確定す
るために、閉曲線を用いる例を示したが、「フォントや
大きさのような文字属性」や「点線などを補間すること
によって得られる閉曲線」を用いることによって、部分
領域を確定しても良い。With the above configuration, a structured document can be generated even when the document image input by the document image data input means 1 contains a repeated structure enclosed by closed curves. In the document image data of FIG. 5, the structural information of the table in area 56 can be recognized. In the above embodiment, the repeated structure recognition means 9 uses closed curves to determine the partial areas in the document image data, but the partial areas may instead be determined using "character attributes such as font and size" or "closed curves obtained by interpolating dotted lines and the like".

【００４９】実施の形態４．入力の文書画像データの部
分領域が包含関係を持つ場合には、即ちデータ間に入れ
子の構造がある場合に、同様に定義への登録数を減らす
ことができる。以下にそれを説明する。図１９は本実施
の形態４における構造化文書生成装置の構成図であり、
図に実施の形態１と同様又は相当する部分については同
一符合を付しその説明を省略する。１０は、文書画像デ
ータ入力手段１によって入力された文書画像データを受
けとり、その文書画像データにおいて、閉曲線で囲まれ
た領域同士の包含関係を認識し、その包含関係に関する
情報を構造化文書生成手段５に送る入れ子構造認識手段
である。Embodiment 4. When the partial areas of the input document image data have inclusion relations, that is, when the data has a nested structure, the number of entries registered in the definition can likewise be reduced. This is described below. FIG. 19 is a configuration diagram of the structured document generation device according to Embodiment 4; parts identical or equivalent to those of Embodiment 1 are given the same reference numerals and their description is omitted. Reference numeral 10 denotes nested structure recognition means, which receives the document image data input by the document image data input means 1, recognizes the inclusion relations between areas enclosed by closed curves in that data, and sends information on those inclusion relations to the structured document generation means 5.

【００５０】上記のように構成された構造化文書生成装
置の動作を図２ないし図４とそのステップ１１の詳細で
ある図２０のフローチャートと、図２１の対応定義に沿
って説明する。図２１は文書画像領域・論理構造対応定
義の一例を示す例図であり、１４１の部分が図６の文書
画像領域・論理構造対応定義の６４と異っている。例え
ば、６４の３行目は＜カタログ＞＜本文＞＜特長＞であ
り、１４１の４行目は＜特長＞である。このように、文
書画像中の包含関係を認識することによって、構造化文
書中の入れ子関係を自動的に抽出できるので、文書画像
領域・論理構造対応定義における定義を簡略化すること
ができる。１４２は、切り出しモードが入れ子構造モー
ドであることを示す。本実施の形態は、図３のフローチ
ャートにおいて、切り出しモードが入れ子構造認識モー
ドの場合であり、例えば、図２１の文書画像領域・論理
構造対応定義における領域ＩＤ３の部分領域を処理する
場合にあたる。図２１の１４２は切り出しモードが入れ
子構造モードであることを示す。制御手段７は、切り出
しモードをチェックし（ｓｔｅｐ８）、ｓｔｅｐ１１に
進む。ｓｔｅｐ１１の動作を図２０のフローチャートに
沿って説明する。The operation of the structured document generation device configured as above is described with reference to FIGS. 2 to 4, the flowchart of FIG. 20, which details step 11, and the correspondence definition of FIG. 21. FIG. 21 shows an example of a document image area / logical structure correspondence definition; part 141 differs from part 64 of the definition in FIG. 6. For example, the third row of 64 is <catalog><text><feature>, whereas the fourth row of 141 is simply <feature>. Because recognizing the inclusion relations in the document image allows the nesting relations in the structured document to be extracted automatically, the entries in the document image area / logical structure correspondence definition can be simplified. Reference numeral 142 indicates that the cutout mode is the nested structure mode. This embodiment corresponds to the case in which the cutout mode in the flowchart of FIG. 3 is the nested structure recognition mode, for example when the partial area with area ID 3 in the definition of FIG. 21 is processed. As noted, 142 in FIG. 21 indicates that the cutout mode is the nested structure mode. The control means 7 checks the cutout mode (step 8) and proceeds to step 11. The operation of step 11 is described with reference to the flowchart of FIG. 20.

【００５１】文書画像データ入力手段１によって入力さ
れた文書画像データ（以後、Ｘと呼ぶ）において閉曲線
で囲まれた領域Ｘ_i を認識する（ｓｔｅｐ５１）、ただ
し、ｉは１から認識された領域の数ｎまでの整数であ
る。次に、各々のｉに対して、取り出された領域Ｘ_i 間
の包含関係を認識し（ｓｔｅｐ５２）、領域Ｘ_i を含む
領域Ｘ_j を取り出し、その集合をＹ_i とする（ｓｔｅｐ
５３）。図５の文書画像データの場合には、領域５４を
含む領域は領域５３であり、同様に領域５５を含む領域
は５３である。次に、文書画像データから領域Ｘ_i を切
り出し、切り出された画像をＺ_i とし（ｓｔｅｐ５
４）、（文書タグ列、Ｚ_i 、Ｙ_i ）の三つ組をｓｔｅｐ
５５で生成して、ｓｔｅｐ１３に移る。この際に生成さ
れる文書タグ列は、文書画像領域／論理構造対応定義中
で切り出された領域がもつ文書タグ列の前に、この領域
を含む領域がもつ文書タグ列を連結することによって得
られる。例えば、図２１の文書画像領域／論理構造対応
定義においては、領域５３に対応する領域ＩＤは３、す
なわち文書タグ列は＜カタログ＞＜本文＞であり、ま
た、領域５４に対応する領域ＩＤは４、すなわち、文書
タグ列は＜特長＞である。従って、領域５４を切り出し
た場合に、ｓｔｅｐ１３に送られる文書タグ列は＜カタ
ログ＞＜本文＞＜特長＞となる。文書タグ列の生成以外
の部分は、実施の形態１と同じである。In the document image data input by the document image data input means 1 (hereinafter X), the areas X_i enclosed by closed curves are recognized (step 51), where i is an integer from 1 to n, the number of recognized areas. Next, for each i, the inclusion relations among the extracted areas are recognized (step 52), the areas X_j that contain X_i are collected, and that set is denoted Y_i (step 53). For the document image data of FIG. 5, the area containing area 54 is area 53, and likewise the area containing area 55 is area 53. Next, area X_i is cut out of the document image data, the cut-out image is denoted Z_i (step 54), a (document tag string, Z_i, Y_i) triplet is generated in step 55, and processing moves to step 13. The document tag string generated here is obtained by prepending the document tag string of each area that contains the cut-out area to the document tag string that the cut-out area has in the document image area / logical structure correspondence definition. For example, in the definition of FIG. 21, area 53 corresponds to area ID 3, whose document tag string is <catalog><text>, and area 54 corresponds to area ID 4, whose document tag string is <feature>. Therefore, when area 54 is cut out, the document tag string sent to step 13 is <catalog><text><feature>. Apart from the generation of the document tag string, the processing is the same as in Embodiment 1.
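
The tag-string concatenation of steps 52 to 55 can be illustrated with a minimal Python sketch. It assumes each area has already been reduced to an axis-aligned bounding box; the `Region` class, the coordinates, and the function names are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Region:
    area_id: int
    x0: int
    y0: int
    x1: int
    y1: int
    tags: tuple  # document tag string registered for this area alone

def contains(outer, inner):
    """True when `outer` encloses `inner` and the two regions differ."""
    return (outer != inner
            and outer.x0 <= inner.x0 and outer.y0 <= inner.y0
            and outer.x1 >= inner.x1 and outer.y1 >= inner.y1)

def full_tag_string(region, regions):
    """Prepend the tags of every enclosing region, outermost first."""
    enclosing = [r for r in regions if contains(r, region)]  # the set Y_i
    # A larger enclosing area lies further out, so sort by descending area.
    enclosing.sort(key=lambda r: (r.x1 - r.x0) * (r.y1 - r.y0), reverse=True)
    tags = []
    for r in enclosing:
        tags.extend(r.tags)
    tags.extend(region.tags)
    return tuple(tags)

# Hypothetical coordinates for areas 53 and 54 of FIG. 5.
area_53 = Region(3, 0, 0, 100, 100, ("catalog", "text"))
area_54 = Region(4, 10, 10, 90, 40, ("feature",))
regions = [area_53, area_54]
print(full_tag_string(area_54, regions))
```

Under these assumptions the definition only needs to register `<feature>` for area 54; the enclosing `<catalog><text>` is recovered from the geometry.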

【００５２】上記のような構成によれば、文書画像デー
タ入力手段１によって入力された文書画像中の領域同士
の包含関係を構造化文書中の入れ子関係として認識する
ことができる。なお、上記の実施の形態では、入れ子構
造認識手段１０において、文書画像データ中の部分領域
を確定するために、閉曲線を用いる例を示したが、「フ
ォントや大きさのような文字属性」や「点線などを補間
することによって得られる閉曲線」を用いることによっ
て、部分領域を確定しても良い。With the above configuration, the inclusion relations between areas in the document image input by the document image data input means 1 can be recognized as nesting relations in the structured document. In the above embodiment, the nested structure recognition means 10 uses closed curves to determine the partial areas in the document image data; however, the partial areas may instead be determined by using character attributes such as font or size, or closed curves obtained by interpolating dotted lines or the like.

【００５３】実施の形態５．入力の文書にイメージとし
ての画像を持ち、これをそのまま利用する場合を説明す
る。即ち、イメージ領域を切り出し、この切り出したイ
メージファイルに名称をつけて文書タグの属性とすれば
よい。図２２は実施の形態５における構造化文書生成装
置の構成図であり、図に実施の形態１と同様又は相当す
る部分については同一符合を付しその説明を省略する。
１１は、文書画像データを受けとり、「文書画像データ
の部分領域」を切りだし外部ファイルとして出力すると
同時に、外部ファイルの名称を構造化文書生成手段５に
送るイメージ切り出し手段である。Embodiment 5. This embodiment covers the case in which the input document contains an image that is to be used as is. In this case it suffices to cut out the image area, give the resulting image file a name, and use that name as an attribute of the document tag. FIG. 22 is a configuration diagram of the structured document generation device according to Embodiment 5; parts identical or equivalent to those of Embodiment 1 are given the same reference numerals and their description is omitted.
Reference numeral 11 denotes image cutout means, which receives the document image data, cuts out a partial area of the document image data and outputs it as an external file, and at the same time sends the name of the external file to the structured document generation means 5.

【００５４】上記のように構成された構造化文書生成装
置の動作を図２とないし図４とそのステップ１２の詳細
である図２３のフローチャートに沿って説明する。本実
施の形態５は、図３のフローチャートにおいて、切り出
しモードがイメージ切り出しモードの場合であり、例え
ば、図６の文書画像領域・論理構造対応定義における領
域ＩＤ４の部分領域を処理する場合にあたる。制御手段
７は、切り出しモードをチェックし（ｓｔｅｐ８）、ｓ
ｔｅｐ１２に進む。ｓｔｅｐ１２の動作を図２３のフロ
ーチャートに沿って説明する。The operation of the structured document generation device configured as above is described with reference to FIGS. 2 to 4 and the flowchart of FIG. 23, which details step 12. Embodiment 5 corresponds to the case in which the cutout mode in the flowchart of FIG. 3 is the image cutout mode, for example when the partial area with area ID 4 in the document image area / logical structure correspondence definition of FIG. 6 is processed. The control means 7 checks the cutout mode (step 8) and proceeds to step 12. The operation of step 12 is described with reference to the flowchart of FIG. 23.

【００５５】図６の文書画像領域・論理構造対応定義の
６３に示される領域に関する情報を参照することによ
り、文書画像データ入力手段１により入力された文書画
像データからイメージ領域を切り出す（ｓｔｅｐ６
１）。次に、切り出された文書画像データをイメージ・
ファイルとして保存し（ｓｔｅｐ６２）、名称をつけ
（ｓｔｅｐ６３）、（領域ＩＤ、イメージ・ファイルの
名称）の対をｓｔｅｐ１３に送る（ｓｔｅｐ６４）。例
えば、ｓｔｅｐ７において図６の６１の「領域ＩＤ」４
の部分領域が取り出された場合は、ｓｔｅｐ１８におい
て、文書タグ列／コード化データの対応表に図８の９４
の部分が追加される。この部分は、構造化文書図１０の
１０４に示される論理構造要素に対応する。By referring to the area information shown at 63 in the document image area / logical structure correspondence definition of FIG. 6, the image area is cut out of the document image data input by the document image data input means 1 (step 61). Next, the cut-out document image data is saved as an image file (step 62), the file is given a name (step 63), and the pair (area ID, image file name) is sent to step 13 (step 64). For example, when the partial area with area ID 4 at 61 in FIG. 6 is extracted in step 7, part 94 of FIG. 8 is added to the document tag string / coded data correspondence table in step 18. This part corresponds to the logical structure element shown at 104 in the structured document of FIG. 10.
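
Steps 61 to 64 can be sketched as follows. This is a minimal illustration and not the patent's implementation: it assumes the page is a 2-D list of grey values, writes the crop as a plain-text PGM file, and uses a hypothetical naming scheme (`area<ID>.pgm`). The sketch also picks the name before writing the file, whereas the text saves first and names second.

```python
import os
import tempfile

def cut_out_image(pixels, box, area_id, out_dir):
    """Crop box = (x0, y0, x1, y1) from a 2-D list of grey values,
    save it as a plain PGM file, and return (area_id, file name)."""
    x0, y0, x1, y1 = box
    cropped = [row[x0:x1] for row in pixels[y0:y1]]       # step 61: cut out
    name = f"area{area_id}.pgm"                           # step 63: hypothetical name
    path = os.path.join(out_dir, name)
    with open(path, "w", encoding="ascii") as f:          # step 62: save as a file
        f.write(f"P2\n{x1 - x0} {y1 - y0}\n255\n")
        for row in cropped:
            f.write(" ".join(str(v) for v in row) + "\n")
    return area_id, name                                  # step 64: pair for step 13

# A small synthetic page stands in for the scanned document image.
page = [[(x + y) % 256 for x in range(8)] for y in range(6)]
out_dir = tempfile.mkdtemp()
pair = cut_out_image(page, (2, 1, 6, 4), 4, out_dir)
print(pair)
```

The returned file name is what the structured document would carry in place of coded text for that area.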

【００５６】文書画像データが図５であり、かつ文書画
像領域・論理構造対応定義が図６の場合には、図５の領
域５５をイメージ領域として認識することにより、外部
イメージ・ファイルとして保存され、構造化文書図１０
中の１０４が示すファイル名picture.img のように参照
する論理構造が生成される（ｓｔｅｐ２３）。上記のよ
うな構成によれば、文書画像データに図や写真などのコ
ード化データへの変換が困難なデータが混在している場
合でも、読みとった文書画像中のある部分を外部ファイ
ルとして参照することができる構造化文書を生成するこ
とができる。When the document image data is that of FIG. 5 and the document image area / logical structure correspondence definition is that of FIG. 6, area 55 of FIG. 5 is recognized as an image area and saved as an external image file, and a logical structure that refers to it by file name, such as picture.img shown at 104 in the structured document of FIG. 10, is generated (step 23). With the above configuration, even when the document image data contains data that is difficult to convert into coded data, such as figures and photographs, a structured document can be generated that refers to a part of the scanned document image as an external file.

【００５７】実施の形態６．入力文書は、一般的には、
章、節、段落等から構成され、それぞれ文書構成上の論
理的な意味を持っている。これらを識別して管理するこ
とでより精密で利用度の高い文書管理ができ、変化に富
んだ出力文書が得られる。以下、章、節、等の区分を解
析する手段を含んだ例を説明する。図２４は実施の形態
６における構成化文書生成装置の構成図であり、図で実
施の形態１と同様又は相当する部分については同一符合
を付しその説明を省略する。１２は、生成されたコード
化データを受けとり、章、節、箇条書き等の文書構造を
認識し、認識した結果を構造化文書生成手段５に送るテ
キスト解析手段である。Embodiment 6. An input document generally consists of chapters, sections, paragraphs, and so on, each of which has a logical meaning within the document's organization. Identifying and managing these units enables more precise and more useful document management and yields a wider variety of output documents. An example that includes means for analyzing divisions such as chapters and sections is described below. FIG. 24 is a configuration diagram of the structured document generation device according to Embodiment 6; parts identical or equivalent to those of Embodiment 1 are given the same reference numerals and their description is omitted. Reference numeral 12 denotes text analysis means, which receives the generated coded data, recognizes document structure such as chapters, sections, and itemized lists, and sends the recognition result to the structured document generation means 5.

【００５８】上記のように構成された構造化文書生成装
置の動作を図２ないし図４とそのステップ１７の詳細で
ある図２５のフローチャートに沿って説明する。本実施
の形態は、図３のフローチャートにおいて、テキスト解
析モードがオンの場合であり、例えば、図６の文書画像
領域・論理構造対応定義における領域ＩＤ３の部分領域
を処理する場合にあたる。制御手段７は、テキスト解析
モードをチェックし（ｓｔｅｐ１６）、ｓｔｅｐ１７に
進む。ｓｔｅｐ１７の動作を図２５のフローチャートに
沿って説明する。図６の文書画像領域・論理構造対応定
義の領域ＩＤ３におけるテキスト解析指針が「繰り返し
構造抽出１」であることより、６９に示されるテキスト
解析指針を読み込む（ｓｔｅｐ７１）。この指針は、繰
り返し構造に関しては、図５の文書画像の５４のよう
に、「数字」と「。」の並びにより繰り返し構造の開始
が指定され、また、その数字が繰り返しの出現順序を指
定していることを表現する。また、この指針は、文書タ
グとの対応に関しては、繰り返し構造全体には＜リスト
＞という文書タグが、また、個々の繰り返し要素には＜
項目＞という文書タグ付与されることを表現する。The operation of the structured document generation device configured as above is described with reference to FIGS. 2 to 4 and the flowchart of FIG. 25, which details step 17. This embodiment corresponds to the case in which the text analysis mode is on in the flowchart of FIG. 3, for example when the partial area with area ID 3 in the document image area / logical structure correspondence definition of FIG. 6 is processed. The control means 7 checks the text analysis mode (step 16) and proceeds to step 17. The operation of step 17 is described with reference to the flowchart of FIG. 25. Because the text analysis guideline for area ID 3 in the definition of FIG. 6 is "repeating structure extraction 1", the text analysis guideline shown at 69 is read (step 71). Regarding the repeating structure, this guideline states that, as in part 54 of the document image of FIG. 5, the start of a repeating element is marked by a number followed by "。", and that the number specifies the element's order of appearance. Regarding the correspondence with document tags, the guideline states that the document tag <list> is assigned to the repeating structure as a whole and the document tag <item> to each repeating element.

【００５９】ｓｔｅｐ７１により得られた解析指針に従
い文書構造を抽出し、文書構造毎に「コード化デー
タ」、「文書タグ列」、「順序」を認識する（ｓｔｅｐ
７２）。例えば、図５の文書画像５４の２行目の場合に
は、コード化データが「早い」、文書タグ列が「＜カタ
ログ＞＜本文＞＜特長＞＜リスト＞＜項目＞」、順序が
「２」である。ただし、ここで文書タグ列は６９で指定
される＜リスト＞＜項目＞が、６４の３行目で指定され
る＜カタログ＞＜本文＞＜特長＞と連結されている。次
に、ｓｔｅｐ７２で得られた（コード化データ、文書タ
グ列、順序）の三つ組をｓｔｅｐ１３に送る（ｓｔｅｐ
７３）。例えば、ｓｔｅｐ７において図６の６１の「領
域ＩＤ」３の部分領域が取り出された場合は、ｓｔｅｐ
１８において、文書タグ列／コード化データの対応表
に、図８の９３の部分が追加される。この部分は、構造
化文書図１０の１０３に示される論理構造要素に対応す
る。文書画像データが図５であり、かつ文書画像領域・
論理構造対応定義が図６の場合には、図５の領域５４を
繰り返し構造として認識することにより、構造化文書図
１０の１０５に示される繰り返し構造をもつ論理構造を
生成する（ｓｔｅｐ２３）。The document structure is extracted according to the analysis guideline obtained in step 71, and the coded data, document tag string, and order are recognized for each structural element (step 72). For example, for the second line of part 54 of the document image in FIG. 5, the coded data is "fast", the document tag string is <catalog><text><feature><list><item>, and the order is 2. Here the document tag string is formed by appending <list><item>, specified at 69, to <catalog><text><feature>, specified in the third row of 64. Next, the (coded data, document tag string, order) triplet obtained in step 72 is sent to step 13 (step 73). For example, when the partial area with area ID 3 at 61 in FIG. 6 is extracted in step 7, part 93 of FIG. 8 is added to the document tag string / coded data correspondence table in step 18. This part corresponds to the logical structure element shown at 103 in the structured document of FIG. 10. When the document image data is that of FIG. 5 and the document image area / logical structure correspondence definition is that of FIG. 6, area 54 of FIG. 5 is recognized as a repeating structure, and a logical structure having the repeating structure shown at 105 in the structured document of FIG. 10 is generated (step 23).
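
The analysis in steps 71 to 73 can be approximated with a short Python sketch. The regular expression, the function name, and the sample lines are assumptions made for illustration; the patent only states that a number followed by "。" opens each repeating element and gives its order.

```python
import re

# A number, then an ASCII or full-width period or "。", then the item text.
ITEM_START = re.compile(r"^\s*(\d+)\s*[.。．]\s*(.*)$")

def analyze_list(lines, parent_tags, list_tag="list", item_tag="item"):
    """Return (coded data, document tag string, order) triplets (step 72)."""
    triplets = []
    for line in lines:
        m = ITEM_START.match(line)
        if m is None:
            continue  # this line does not start a repeating element
        order = int(m.group(1))
        text = m.group(2).strip()
        tags = tuple(parent_tags) + (list_tag, item_tag)
        triplets.append((text, tags, order))
    return triplets

# Hypothetical recognized text for area 54 of FIG. 5.
lines = ["1. delicious", "2. fast", "3. bargain"]
result = analyze_list(lines, ("catalog", "text", "feature"))
print(result[1])
```

Each triplet would then be handed to step 13, as step 73 describes.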

【００６０】上記のような構成によれば、文書画像デー
タ入力手段１によって入力された文書画像からは認識す
ることが困難であるが、コード化された情報からは認識
可能な文書構造に関する情報を利用することにより、よ
り木目の細かい構造化文書を生成することができる。With the above configuration, a more fine-grained structured document can be generated by using information about document structure that is difficult to recognize from the document image input by the document image data input means 1 but can be recognized from the coded information.

【００６１】実施の形態７．実施の形態３、４でもそう
であるが、処理属性の収容量には限りがあるので別に詳
細処理表を用意し、それを属性として指定して複雑な属
性処理を行わせることができる。例えば、特定の領域の
文字列を出力文書では別の文字列に変換して利用したい
という場合も多い。以下の例では最も簡単な元号と西暦
間の変換を説明するが、拡張すると部分領域の入力を他
の情報に置換して出力することができ、これは変換のた
めのフィルタを予め登録しておけば、任意の出力側のソ
フトウェア・インタフェースに合わせて出力ができて文
書データの流用性が高まる。図２６は実施の形態７の構
造化文書生成装置の構成図であり、図で実施の形態１と
同様又は相当する部分については同一符合を付しその説
明を省略する。１３は、生成されたコード化データ中を
受けとり、そのデータ中に出現するある文字列を別の文
字列に変換した結果を構造化文書生成手段５に送る文字
列変換手段である。Embodiment 7. As in Embodiments 3 and 4, the amount of information a processing attribute can hold is limited, so a separate detailed processing table can be prepared and designated as an attribute in order to carry out complex attribute processing. For example, it is often desirable to convert a character string in a particular area into a different character string in the output document. The example below describes the simplest case, conversion between the Western calendar and Japanese era names, but the approach extends to replacing the input of a partial area with other information on output. If conversion filters are registered in advance, output can be matched to the software interface of any output destination, which increases the reusability of the document data. FIG. 26 is a configuration diagram of the structured document generation device according to Embodiment 7; parts identical or equivalent to those of Embodiment 1 are given the same reference numerals and their description is omitted. Reference numeral 13 denotes character string conversion means, which receives the generated coded data, converts a character string appearing in that data into another character string, and sends the result to the structured document generation means 5.

【００６２】上記のように構成された構造化文書生成装
置の動作を図２ないし図４とそのステップ１５の詳細で
ある図２５のフローチャートに沿って説明する。本実施
の形態は、図３のフローチャートにおいて、文字置換モ
ードがオンの場合であり、例えば、図６の文書画像領域
・論理構造対応定義における領域ＩＤ１の部分領域を処
理する場合にあたる。制御手段７は、文字置換モードを
チェックし（ｓｔｅｐ１４）、ｓｔｅｐ１５に進む。ｓ
ｔｅｐ１５の動作を図２７のフローチャートに沿って説
明する。図６の文書画像領域・論理構造対応定義の６７
に示されるテキスト解析指針を読み込み、領域ＩＤ１に
対応する部分が「西暦元号変換」であることにより、変
換時に変換アルゴリズムとして利用するフィルターの名
称「西暦元号変換」を得る（ｓｔｅｐ８１）。The operation of the structured document generation device configured as above is described with reference to FIGS. 2 to 4 and the flowchart of FIG. 25, which details step 15. This embodiment corresponds to the case in which the character replacement mode is on in the flowchart of FIG. 3, for example when the partial area with area ID 1 in the document image area / logical structure correspondence definition of FIG. 6 is processed. The control means 7 checks the character replacement mode (step 14) and proceeds to step 15. The operation of step 15 is described with reference to the flowchart of FIG. 27. The text analysis guideline shown at 67 in the definition of FIG. 6 is read; because the entry for area ID 1 is "西暦元号変換" (Western calendar to era name conversion), this yields the name of the filter to be used as the conversion algorithm (step 81).

【００６３】ついで、ｓｔｅｐ１３により得られた（文
書タグ列、コード化データ）対の中のコード化データ
を、フィルター「西暦元号変換」に通した結果によって
置換することによって、新たな（文書タグ列、コード化
データ）を生成する（ｓｔｅｐ８２）。例えば、ｓｔｅ
ｐ７において図６の６１の「領域ＩＤ」１の部分領域が
取り出された場合には、対応する部分領域の文書画像デ
ータは図５の文書画像６０の部分となるので、文字コー
ド「１９９５／９／２５」が「平成７年９月２５日」に
変換され、（＜カタログ＞＜日時＞、平成７年９月２５
日）が生成される。その結果、ｓｔｅｐ１８において、
文書タグ列／コード化データの対応表に図８の９１とし
て書き込まれる。この部分は、構造化文書図１０の１０
２に示される論理構造要素に対応する。上記のような構
成によれば、文書画像データ中に存在するある文字列
を、出力となる構造化文書が要求する文字列に置換する
ことができる。Next, the coded data in the (document tag string, coded data) pair obtained in step 13 is replaced by the result of passing it through the filter "西暦元号変換", producing a new (document tag string, coded data) pair (step 82). For example, when the partial area with area ID 1 at 61 in FIG. 6 is extracted in step 7, the corresponding document image data is part 60 of the document image in FIG. 5, so the character code "1995/9/25" is converted to "平成７年９月２５日" (Heisei 7, September 25), and the pair (<catalog><date and time>, 平成７年９月２５日) is generated. As a result, it is written as part 91 of FIG. 8 in the document tag string / coded data correspondence table in step 18. This part corresponds to the logical structure element shown at 102 in the structured document of FIG. 10. With the above configuration, a character string present in the document image data can be replaced by the character string required by the output structured document.
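
A filter of the kind named in step 81 might look like the following Python sketch. It is an illustration under stated assumptions: only the Heisei era (8 January 1989 to 30 April 2019) is handled, ASCII digits are emitted instead of the full-width digits in the patent's example, and the `FILTERS` registry and function names are hypothetical.

```python
from datetime import date

HEISEI_START = date(1989, 1, 8)
HEISEI_END = date(2019, 4, 30)

def western_to_era(text):
    """Convert 'YYYY/M/D' to '平成N年M月D日'. Only Heisei is handled, and
    the first year is written as 1 rather than the customary 元."""
    year, month, day = (int(part) for part in text.split("/"))
    d = date(year, month, day)
    if not (HEISEI_START <= d <= HEISEI_END):
        raise ValueError(f"{text} is outside the Heisei era")
    return f"平成{year - 1988}年{month}月{day}日"

# Registry of filters keyed by the name found in the correspondence definition.
FILTERS = {"西暦元号変換": western_to_era}

def apply_filter(pair, filter_name):
    """Step 82: replace the coded data of a (tag string, coded data) pair."""
    tags, coded = pair
    return tags, FILTERS[filter_name](coded)

new_pair = apply_filter((("catalog", "date and time"), "1995/9/25"), "西暦元号変換")
print(new_pair)
```

Registering further entries in `FILTERS` is one way to picture the pre-registered filters mentioned in Embodiment 7.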

【００６４】実施の形態８．本実施の形態８では、実施
の形態２では文書画像データの入力時に対応定義記憶部
に記憶されている標準の部分領域とのゆらぎを吸収する
のに対し、記憶されている複数の標準部分領域中から適
合する部分領域を見つけて対応処理をする例を説明す
る。このようにしても定義への登録の数を減らすことが
できる。図２８は実施の形態８の構造化文書生成装置の
構成図であり、図で実施の形態１と同様又は相当する部
分については同一符合を付しその説明を省略する。１４
は、文書画像データ入力手段１で入力された文書画像デ
ータとパターン認識手段４で生成されたコード化データ
を受けとり、複数の文書画像領域・論理構造対応定義か
ら一つの文書画像領域・論理構造対応定義を選択し、そ
の画像領域／論理構造対応定義を文書画像領域切り出し
手段３に送る領域／論理構造対応定義推定手段である。Embodiment 8. Whereas Embodiment 2 absorbs deviations between the input document image data and the standard partial areas stored in the correspondence definition storage unit, Embodiment 8 describes an example that finds the matching set of partial areas among several stored standard definitions and processes the input accordingly. This also reduces the number of entries registered in the definition. FIG. 28 is a configuration diagram of the structured document generation device according to Embodiment 8; parts identical or equivalent to those of Embodiment 1 are given the same reference numerals and their description is omitted. Reference numeral 14 denotes area / logical structure correspondence definition estimation means, which receives the document image data input by the document image data input means 1 and the coded data generated by the pattern recognition means 4, selects one document image area / logical structure correspondence definition from among several, and sends that definition to the document image area cutout means 3.

【００６５】上記のように構成された構造化文書生成装
置の動作を図２ないし図４とそのステップ３の詳細であ
る図２９のフローチャートに沿って説明する。本実施の
形態８は、図２のフローチャートにおいて、文書画像領
域・論理構造対応定義推定モードがオンの場合であり、
制御手段７は、文書画像領域推定モードをチェックし
（ｓｔｅｐ２）、ｓｔｅｐ３に進む。ｓｔｅｐ３の動作
を図２９のフローチャートに沿って説明する。ｓｔｅｐ
３１を実行した後、文書画像領域・論理構造対応定義を
順時取り出し、取り出された場合はｓｔｅｐ３２に、す
べて取り出した場合にはｓｔｅｐ９３に進む（ｓｔｅｐ
９１）。また、取り出された文書画像領域・論理構造対
応定義を以後、Ｒ_k として参照する。The operation of the structured document generation device configured as above is described with reference to FIGS. 2 to 4 and the flowchart of FIG. 29, which details step 3. Embodiment 8 corresponds to the case in which the document image area / logical structure correspondence definition estimation mode is on in the flowchart of FIG. 2; the control means 7 checks the document image area estimation mode (step 2) and proceeds to step 3. The operation of step 3 is described with reference to the flowchart of FIG. 29. After step 31 is executed, the document image area / logical structure correspondence definitions are fetched one at a time; when a definition is fetched, processing proceeds to step 32, and when all definitions have been fetched, it proceeds to step 93 (step 91). The fetched definition is hereinafter referred to as R_k.

【００６６】ｓｔｅｐ３２、ｓｔｅｐ３３、ｓｔｅｐ３
４を図２９のフローチャートと同様に実行する。ただ
し、ｓｔｅｐ３２において、文書画像領域・論理構造対
応定義における部分領域をすべて取り出し終えた場合に
は、ｓｔｅｐ９２に進む。また、ｓｔｅｐ３４で求めた
値（Ａ_ijの面積）／（（Ｙ_i の面積）＋（Ｚ_ijの面積））を以後、Ｂ_i として参照する。ｓｔｅｐ９２では、Σ{i
=j}{n}Ｂ_i を求め、この値をＣ_k とする。但し、ｎはｓ
ｔｅｐ３２で指定される文書画像領域・論理構造対応定
義における部分領域の数である。Ｃ_k が最大となるＲ_k
が推定された文書画像領域・論理構造対応定義である
（ｓｔｅｐ９３）。即ち、切り出した部分領域が登録記
憶されているいわゆる幾つかの標準の文書画像領域と比
較されて、類似度Ｃ_k が最大となるものが定義であると
推定する。ｓｔｅｐ４ではこの推定された文書画像領域
・論理構造対応定義が読み込まれる。上記のような構成
によれば、文書画像データから構造化文書を変換時に必
要とされる画像領域／論理構造対応定義を指定する手間
を省くことができる。Steps 32, 33, and 34 are executed as in the flowchart of FIG. 29. However, when all partial areas in the document image area / logical structure correspondence definition have been fetched in step 32, processing proceeds to step 92. The value obtained in step 34, (area of A_ij) / ((area of Y_i) + (area of Z_ij)), is hereinafter referred to as B_i. In step 92, the sum $C_k = \sum_{i=1}^{n} B_i$ is computed, where n is the number of partial areas in the document image area / logical structure correspondence definition specified in step 32. The R_k for which C_k is largest is taken as the estimated document image area / logical structure correspondence definition (step 93). In other words, the cut-out partial areas are compared with the several registered standard document image area definitions, and the one with the greatest similarity C_k is estimated to be the applicable definition. In step 4 this estimated definition is read. With the above configuration, the effort of specifying the image area / logical structure correspondence definition needed when converting document image data into a structured document can be saved.
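
The selection in steps 91 to 93 can be sketched in Python as below. Several points are assumptions, because A_ij, Y_i, and Z_ij are defined outside this passage: the sketch treats Y_i as a rectangle from the definition, Z_ij as a rectangle found in the input image, A_ij as their overlap, and takes the best-matching Z_ij for each Y_i. All names and coordinates are hypothetical.

```python
def area(box):
    """Area of an (x0, y0, x1, y1) rectangle; empty rectangles give 0."""
    x0, y0, x1, y1 = box
    return max(0, x1 - x0) * max(0, y1 - y0)

def overlap(a, b):
    """Intersection rectangle of a and b (possibly empty)."""
    return (max(a[0], b[0]), max(a[1], b[1]), min(a[2], b[2]), min(a[3], b[3]))

def similarity(definition, found_boxes):
    """C_k: sum over defined areas Y_i of B_i = |A| / (|Y_i| + |Z|)."""
    total = 0.0
    for y in definition:
        scores = (area(overlap(y, z)) / (area(y) + area(z)) for z in found_boxes)
        total += max(scores, default=0.0)
    return total

def estimate_definition(definitions, found_boxes):
    """Step 93: pick the definition R_k whose C_k is largest."""
    return max(definitions, key=lambda name: similarity(definitions[name], found_boxes))

# Two hypothetical stored definitions and the areas found in an input page.
definitions = {
    "catalog": [(0, 0, 100, 20), (0, 30, 100, 90)],
    "invoice": [(0, 0, 40, 40), (60, 0, 100, 40)],
}
found = [(0, 0, 100, 22), (0, 28, 100, 90)]
print(estimate_definition(definitions, found))
```

With this ratio a perfect match contributes 0.5 per area, so C_k is at most n/2 in the sketch.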

【００６７】実施の形態９．データベースに登録するに
は、そのデータベース毎に指定された項目、フォーマッ
ト、コード等に定めがあり、これらに適合した形でデー
タ群を用意すると、自動登録プログラム等によってデー
タベース登録ができる。本実施の形態９ではこうした目
的のために所望のデータベースに適合した出力文書を得
る例を説明する。図３０は実施の形態９の構造化文書生
成装置の構成図であり、図で実施の形態１と同様又は相
当する部分については同一符合を付しその説明を省略す
る。２１は、構造化文書生成手段５が生成した構造化文
書とスクリプト／論理構造対応定義記憶部２２が記憶す
るスクリプト／論理構造対応定義を受けとり、構造化文
書の部分毎に、スクリプト／論理構造対応定義が指定す
る命令を実行する外部手続き呼び出し手段である。Embodiment 9. To register data in a database, the items, formats, codes, and so on prescribed for that database must be followed; if a data set is prepared in a conforming form, it can be registered by an automatic registration program or the like. Embodiment 9 describes an example of obtaining an output document adapted to a desired database for this purpose. FIG. 30 is a configuration diagram of the structured document generation device according to Embodiment 9; parts identical or equivalent to those of Embodiment 1 are given the same reference numerals and their description is omitted. Reference numeral 21 denotes external procedure calling means, which receives the structured document generated by the structured document generation means 5 and the script / logical structure correspondence definition stored in the script / logical structure correspondence definition storage unit 22, and executes, for each part of the structured document, the instruction specified by the script / logical structure correspondence definition.

【００６８】上記のように構成された構造化文書生成装
置の動作を図２ないし図４とそのステップ２６の詳細で
ある図２９のフローチャートに沿って説明する。本実施
の形態は、図４のフローチャートにおいて、文書画像領
域・論理構造対応定義外部呼びだしモードがオンの場合
であり、制御手段７は、文書画像領域推定モードをチェ
ックし（ｓｔｅｐ２４）、ｓｔｅｐ２６に進む。ｓｔｅ
ｐ２６の動作を図３１のフローチャートに沿って説明す
る。The operation of the structured document generation device configured as above is described with reference to FIGS. 2 to 4 and the flowchart of FIG. 29, which details step 26. This embodiment corresponds to the case in which the document image area / logical structure correspondence definition external call mode is on in the flowchart of FIG. 4; the control means 7 checks the document image area estimation mode (step 24) and proceeds to step 26. The operation of step 26 is described with reference to the flowchart of FIG. 31.

【００６９】図３２は文書画像領域・論理構造対応定義
の一例を示す例図であり、構造化文書をデータベースへ
自動登録することを規定するスクリプトを記述するもの
である。２５１はスクリプトタイプが「ＤＢ登録」であ
ることを規定する。２５２、２５３は、ｓｔｅｐ１０
２、ｓｔｅｐ１０３において、ＤＢ属性の値として、構
造化文書のどの部分を抽出すればよいかを決定する際に
参照される。FIG. 32 shows an example of a document image area / logical structure correspondence definition that describes a script specifying automatic registration of a structured document in a database. Reference numeral 251 specifies that the script type is "DB registration". Reference numerals 252 and 253 are referred to in steps 102 and 103 when determining which part of the structured document should be extracted as the value of each DB attribute.

【００７０】文書画像領域・論理構造対応定義を読み込
み、呼び出すべき外部手続きを確定する（ｓｔｅｐ１０
１）。例えばデータを登録しようとする外部のデータベ
ース側が要求するデータの項目、フォーマットが図３３
の２５２の属性であったとする。図３３の文書画像領域
・論理構造対応定義を読み込んだ場合には、２５１を参
照することにより、外部手続きを「ＤＢ登録（データベ
ースへの登録）」と確定する。次に、外部手続きが必要
とする引数の値を構造化文書より抽出する（ｓｔｅｐ１
０２）。図３３の文書画像領域・論理構造対応定義の場
合には、２５２に指定されるＤＢ属性の値として、同じ
領域ＩＤをもつ文書タグが示す部分を構造化文書から取
り出す。すなわち、図１０の構造化文書の場合には、例
えば、２５３に示すＤＢ属性「figure」の値として、文
書タグ＜カタログ＞＜本文＞＜外観＞が示す部分である
「picture.img 」を取り出す。The document image area / logical structure correspondence definition is read and the external procedure to be called is determined (step 101). Suppose, for example, that the data items and formats required by the external database in which the data is to be registered are the attributes shown at 252 in FIG. 33. When the definition of FIG. 33 is read, the external procedure is determined to be "DB registration" (registration in a database) by referring to 251. Next, the values of the arguments required by the external procedure are extracted from the structured document (step 102). With the definition of FIG. 33, the value of each DB attribute specified at 252 is obtained by extracting from the structured document the part indicated by the document tag that has the same area ID. That is, for the structured document of FIG. 10, the value of the DB attribute "figure" shown at 253, for example, is "picture.img", the part indicated by the document tag string <catalog><text><appearance>.

【００７１】次に、ｓｔｅｐ１０２で得られた値を基に
して、外部手続きを実行する（ｓｔｅｐ１０３）。図３
２の文書画像領域・論理構造対応定義と図１０の構造化
文書の場合には、ＤＢ属性「title」の値が「炊飯
器」、ＤＢ属性「feature」の値が「（おいしい、早
い、お買い得）」、ＤＢ属性「figure」の値が「（おい
しい、早い、お買い得）」、「figure」の値が「（pict
ure.img ）」である文書として、データベースに登録す
る。Next, the external procedure is executed using the values obtained in step 102 (step 103). With the document image area / logical structure correspondence definition of FIG. 32 and the structured document of FIG. 10, the document is registered in the database with the DB attribute "title" set to "rice cooker", the DB attribute "feature" set to "(delicious, fast, bargain)", and the DB attribute "figure" set to "(picture.img)".
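
Steps 101 to 103 can be pictured with the following Python sketch, which uses the standard `sqlite3` module as a stand-in for the external database. The dictionary layout, the `documents` table, and the `<catalog><title>` tag string are hypothetical; the sketch also looks values up by tag string, whereas the patent matches DB attributes to document tags through a shared area ID.

```python
import sqlite3

# Structured document reduced to a mapping from tag string to content.
structured_doc = {
    ("catalog", "title"): "rice cooker",                  # tag string is hypothetical
    ("catalog", "text", "feature"): "(delicious, fast, bargain)",
    ("catalog", "text", "appearance"): "picture.img",
}

# Script / logical structure correspondence definition (illustrative layout).
script = {
    "type": "DB registration",
    "attributes": {
        "title": ("catalog", "title"),
        "feature": ("catalog", "text", "feature"),
        "figure": ("catalog", "text", "appearance"),
    },
}

def run_script(script, doc, conn):
    if script["type"] != "DB registration":               # step 101: pick the procedure
        raise ValueError("unsupported script type")
    names = list(script["attributes"])
    values = [doc[script["attributes"][n]] for n in names]  # step 102: extract arguments
    cols = ", ".join(names)
    marks = ", ".join("?" for _ in names)
    conn.execute(f"CREATE TABLE IF NOT EXISTS documents ({cols})")
    conn.execute(f"INSERT INTO documents ({cols}) VALUES ({marks})", values)  # step 103
    conn.commit()

conn = sqlite3.connect(":memory:")
run_script(script, structured_doc, conn)
row = conn.execute("SELECT title, feature, figure FROM documents").fetchone()
print(row)
```

Column names come from the fixed script definition in this sketch; values are passed as bound parameters rather than interpolated into the SQL text.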

【００７２】上記のような構成によれば、構造化文書の
構造情報を用いて、データベースへの文書の自動登録が
可能になる。なお、上記実施の形態は、データベースへ
の文書の自動登録に関するものであるが、本発明はこれ
に限定されず、発注伝票の自動発送等ビジネス文書の自
動生成／自動発送、表計算データの自動生成等の表計算
ソフトウェアとの連携等の各種アプリケーションソフト
ウェアとの連携に適用できる。With the above configuration, documents can be registered in a database automatically by using the structure information of the structured document. Although the above embodiment concerns automatic registration of documents in a database, the invention is not limited to this; it can also be applied to cooperation with various application software, such as automatic generation and dispatch of business documents (for example, automatic dispatch of order slips) and cooperation with spreadsheet software (for example, automatic generation of spreadsheet data).

【００７３】実施の形態１０．実施の形態１では、入力
の文書画像データにたいして既に所定の部分としての文
書画像領域毎に論理構造としての文書タグが付加されて
いるとして動作説明をした。本実施の形態では、この文
書画像領域と論理構造との対応の登録について、つまり
最初の文書タグの領域の範囲指定動作について説明す
る。図３３は実施の形態１０の構造化文書生成装置の構
成図であり、図で実施の形態１と同様又は相当する部分
については同一符合を付しその説明を省略する。３１
は、ユーザが指定する「文書画像データの部分領域」と
「文書の論理構造のある特定部分であることを表現する
マークアップ用タグ（文書タグ）」との対応関係を受け
付け、文書画像領域・論理構造対応定義を生成し、文書
画像領域・論理構造対応定義記憶部に格納する文書画像
領域／論理構造対応付け手段である。Embodiment 10. Embodiment 1 was described on the assumption that a document tag, as logical structure, had already been assigned to each document image area, as a predetermined part, of the input document image data. This embodiment describes how the correspondence between document image areas and logical structure is registered, that is, how the range of the area for each document tag is first specified. FIG. 33 is a configuration diagram of the structured document generation device according to Embodiment 10; parts identical or equivalent to those of Embodiment 1 are given the same reference numerals and their description is omitted.
Reference numeral 31 denotes document image area / logical structure association means, which accepts the user-specified correspondence between a partial area of the document image data and a markup tag (document tag) indicating a particular part of the document's logical structure, generates a document image area / logical structure correspondence definition, and stores it in the document image area / logical structure correspondence definition storage unit.

【００７４】FIG. 35 shows an example of an interface for controlling the association between a "partial area of the document image data" and a "markup tag (document tag) expressing that the area is a specific part of the document's logical structure". The partial area is identified by an automatically generated "area ID" 271. For each ID, which is the unit of registration, the markup tag is specified by a "document tag sequence" 273, that is, the sequence of document tags from the first to the last. The position and size of the partial area within the document image data are specified by "partial area coordinates" 272, which give the X and Y coordinates of the start point and the end point. When the association is complete, the user confirms it by pressing button 274. If the association was made in error, the user cancels it with button 275.

【００７５】The operation of the structured document generation device configured as described above is explained with reference to the flowchart of FIG. 34. First, the partial area definition menu is displayed (step 261). The partial area definition menu is an interface through which the user specifies the portion of the document image area / logical structure correspondence definition that concerns partial area definitions. FIG. 35 is an example of this menu. It has the items "area ID", "start point and end point of the partial area", and "document tag" for each partial area, and the user completes a partial area definition by filling in these items. Next, if the user continues specifying partial areas, processing proceeds to step 263; otherwise the partial area definition is regarded as complete, the menu screen is closed, and processing ends (step 262). In step 263, an area ID is assigned to the document area about to be specified. The area ID may be, for example, a number indicating the order in which the user specified the partial area. The area ID is used to identify the partial areas that the user specifies.

【００７６】Next, the user specifies the position information of the partial area by filling in the menu items (step 264). In the partial area definition menu of FIG. 35, for example, a rectangular area is specified whose upper-left and lower-right vertices are the start point and end point coordinates shown at 272. Next, the user specifies the document tag sequence to be attached to the partial area specified in step 264 (step 265). In the menu of FIG. 35, for example, the document tag sequence <catalog><name> shown at 273 is specified. Finally, the "area ID assigned in step 263", the "position information of the partial area specified in step 264", and the "document tag sequence specified in step 265" are added to the document image area / logical structure correspondence definition (step 266). For example, when the values are specified as in the menu of FIG. 35, the portion with area ID 2 is added to the document image area / logical structure correspondence definition of FIG. 6. With the above configuration, the user can specify the document image area / logical structure correspondence definition.
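
The registration flow of steps 263 to 266 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the device's implementation: the names `RegionDefinition`, `CorrespondenceDefinition`, `next_region_id`, and `add` are hypothetical, and the in-memory list merely stands in for the correspondence definition storage unit.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class RegionDefinition:
    """One entry of the correspondence definition (hypothetical layout)."""
    region_id: int
    start: tuple[int, int]    # upper-left vertex (X, Y)
    end: tuple[int, int]      # lower-right vertex (X, Y)
    tag_sequence: list[str]   # document tags from first to last


@dataclass
class CorrespondenceDefinition:
    """Stand-in for the correspondence definition storage unit."""
    entries: list[RegionDefinition] = field(default_factory=list)

    def next_region_id(self) -> int:
        # Step 263: number areas in the order the user specifies them.
        return len(self.entries) + 1

    def add(self, start, end, tag_sequence) -> RegionDefinition:
        # Step 264: position; step 265: tag sequence; step 266: store.
        if start[0] > end[0] or start[1] > end[1]:
            raise ValueError("start must be the upper-left vertex")
        entry = RegionDefinition(
            self.next_region_id(), tuple(start), tuple(end), list(tag_sequence)
        )
        self.entries.append(entry)
        return entry
```

Calling `add` twice on a fresh `CorrespondenceDefinition` yields area IDs 1 and 2 in order, which mirrors the "order of specification" numbering suggested in the text; any coordinates passed in are the caller's own values.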

【００７７】Embodiment 11. Following the range specification of partial areas described in the preceding embodiment, this embodiment describes the registration of attributes that define, among other things, the processing applied when the document tag is output. FIG. 36 is a configuration diagram of the structured document generation device of Embodiment 11; parts that are the same as or correspond to those of Embodiment 1 are given the same reference numerals, and their description is omitted. Reference numeral 32 denotes a recognition attribute designating means. It accepts attribute information, specified by the user, about the pattern to be recognized, adds that information to the document image area / logical structure correspondence definition, and stores the resulting definition in the document image area / logical structure correspondence definition storage unit.

【００７８】FIG. 38 shows an example in which setting items used when performing character recognition on a "partial area of the document image data" are added to the interface of FIG. 35. "Cut-out mode" 297 indicates whether the partial area is to be cut out simply as an image area, without applying pattern recognition. "Text analysis guideline" 296 indicates what kind of text analysis is to be applied to the result of pattern recognition on the partial area. "Character replacement guideline" 295 indicates what kind of character replacement is to be applied to the result of pattern recognition on the partial area. Replacement includes, for example, conversion to a normalized format for items such as Western calendar year / Japanese era name, quantity units, addresses, and personal names. "Recognition attribute" 294 indicates which category the pattern in the partial area belongs to, such as "character / image", "handwritten / printed", or "alphabetic / numeric / symbol / katakana / hiragana / kanji / a mixture of these". Specifying and storing the type of processing in this way makes it possible to accommodate changes in the output document.

【００７９】The operation of the structured document generation device configured as described above is explained with reference to the flowchart of FIG. 37. First, the partial area definition menu is displayed (step 281). The partial area definition menu of FIG. 38 has the items "area ID", "start point and end point of the partial area", "document tag", "cut-out mode", "character replacement guideline", "text analysis guideline", and "recognition attribute" for each partial area, and the user completes a partial area definition by filling in these items. Next, if the user continues specifying partial areas, processing proceeds to step 263; otherwise the partial area definition is regarded as complete, the menu screen is closed, and processing ends (step 262). Steps 263 and 264 perform the same processing as in Embodiment 10.

【００８０】Next, the user specifies the document tag, character replacement guideline, text analysis guideline, and recognition attribute for the partial area currently being specified (step 282). In the partial area definition menu of FIG. 38, <catalog><name> shown at 273 is specified as the document tag sequence, "none" shown at 297 as the cut-out mode, "none" shown at 295 as the character replacement guideline, "none" shown at 296 as the text analysis guideline, and "printed / kanji" shown at 294 as the recognition attribute. The "area ID", "position information of the partial area", "document tag sequence", "cut-out mode", "character replacement guideline", "text analysis guideline", and "recognition attribute" specified in steps 263, 264, and 282 are then added (step 283). For example, when the values are specified as in the menu of FIG. 38, the portion with area ID 2 is added to the document image area / logical structure correspondence definition of FIG. 6. With the above configuration, attribute information specifying the processing to be performed when the document image data is pattern-recognized and each recognized partial area is output is passed to the structured document generation means, so that the output can be converted into a form suited to what each downstream software requires.
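
One way to picture how the per-area settings of FIG. 38 could steer processing is the sketch below. It is an assumption-laden illustration rather than the patented implementation: `RecognitionSettings`, `process_area`, and the `recognize`, `replacers`, and `analyzers` parameters are all hypothetical names, and the actual recognizer is passed in by the caller because the source does not specify one.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RecognitionSettings:
    """Per-area settings modeled on the FIG. 38 menu (hypothetical layout)."""
    cut_out_as_image: bool = False                 # "cut-out mode" 297
    char_replacement: Optional[str] = None         # "character replacement guideline" 295
    text_analysis: Optional[str] = None            # "text analysis guideline" 296
    recognition_attribute: str = "printed/kanji"   # "recognition attribute" 294


def process_area(image_part, settings, recognize, replacers, analyzers):
    """Apply the settings to one partial area and return a result dict.

    recognize(image_part, attribute) -> str is supplied by the caller;
    replacers and analyzers map guideline names to callables.
    """
    if settings.cut_out_as_image:
        # Skip pattern recognition and keep the area as a plain image.
        return {"kind": "image", "image": image_part}

    text = recognize(image_part, settings.recognition_attribute)
    if settings.char_replacement is not None:
        # For example, normalizing dates or units to one notation.
        text = replacers[settings.char_replacement](text)

    result = {"kind": "text", "text": text}
    if settings.text_analysis is not None:
        result["analysis"] = analyzers[settings.text_analysis](text)
    return result
```

With the default settings (everything "none", attribute "printed/kanji"), the function simply returns the recognized text, which corresponds to the <catalog><name> example in the paragraph above.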

【００８１】Embodiment 12. Embodiment 10 showed an example in which a partial area of the document image data is specified by XY coordinate values. To further improve operability, this embodiment describes an example in which the document image data itself is displayed on the screen and the range of an area is specified directly on that screen with a cursor, mouse, or the like. FIG. 39 is a configuration diagram of the structured document generation device of Embodiment 12; parts that are the same as or correspond to those of Embodiment 1 are given the same reference numerals, and their description is omitted. Reference numeral 33 denotes a document image data display means that displays image data read from an image reading device as document image data. Reference numeral 31 denotes a document image area / logical structure association means. When the user specifies a partial area in the data displayed by the document image data display means 33, it recognizes the specified area, adds information about the area to the document image area / logical structure correspondence definition, and stores the resulting definition in the document image area / logical structure correspondence definition storage unit 2.

【００８２】FIG. 41 shows an example of an interface in which "document image data" 311 is displayed as a background and a "partial area of the document image data" 312 is created or modified on top of it with a "pointing device" 313. When the document image data 311 is displayed on the screen, the document tag portion for the corresponding ID is retrieved from the correspondence definition storage unit 2 and displayed at the same time. The value of the "partial area coordinates" 272 in the correspondence definition storage unit 2 is updated in step with the position and size of the partial area 312 created or modified with the pointing device 313. Conversely, when the value of the "partial area coordinates" 272 is changed, the position and size of the partial area 312 change accordingly.

【００８３】The operation of the structured document generation device configured as described above is explained with reference to the flowchart of FIG. 40. First, the document image data read by the document image data input means 1 is displayed by the document image data display means (step 341). Next, the same processing as steps 261, 262, and 263 (attaching a document tag) of Embodiment 10 is performed. Next, the user specifies, with the pointing device, a partial area in the document image displayed in step 301 (step 302). For example, to specify the partial area containing the text "rice cooker" in the document image data 311 of FIG. 41, the user specifies the upper-left and lower-right vertices of the rectangular area shown at 312 in FIG. 41 with the pointing device 313.

【００８４】Next, the coordinates of the partial area specified in step 302 are determined and written into the partial area definition menu (step 346). For example, when the partial area 312 of FIG. 41 is specified, the upper-left (start point) coordinates (30, 50) and the lower-right (end point) coordinates (160, 80) of the area are determined, as shown at 272 in FIG. 41, and then written as at 272 in the partial area definition menu of FIG. 35. Finally, the same processing as step 266 of Embodiment 10 is performed, adding the information specified in steps 263, 302, 303, and 265 to the document image area / logical structure correspondence definition. With the above configuration, the user can specify areas while looking at the document image data, which reduces the effort the user needs to specify areas.
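
The two-way link between the drawn rectangle and the numeric coordinate fields can be sketched as a single shared state that both views read and write. The names `normalize_rectangle` and `RegionSelection` are hypothetical, and accepting the two vertices in either order is an added convenience that the source does not describe.

```python
def normalize_rectangle(p1, p2):
    """Return (start, end) so that start is upper-left and end is lower-right."""
    (x1, y1), (x2, y2) = p1, p2
    start = (min(x1, x2), min(y1, y2))
    end = (max(x1, x2), max(y1, y2))
    return start, end


class RegionSelection:
    """Shared state behind the drawn rectangle and the coordinate fields."""

    def __init__(self):
        self.start = (0, 0)
        self.end = (0, 0)

    def set_from_pointer(self, p1, p2):
        # The user picked two opposite vertices on the displayed image.
        self.start, self.end = normalize_rectangle(p1, p2)

    def set_from_fields(self, start, end):
        # The user edited the numeric "partial area coordinates" fields.
        self.start, self.end = normalize_rectangle(start, end)

    def rectangle(self):
        # Position and size for redrawing: (x, y, width, height).
        (x1, y1), (x2, y2) = self.start, self.end
        return (x1, y1, x2 - x1, y2 - y1)
```

Because both setters update the same `start` and `end`, a change made through either route is visible to the other, which is the behavior described for 312 and 272. Using the coordinates from the text, vertices (30, 50) and (160, 80) give a rectangle at (30, 50) that is 130 wide and 30 high.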

【００８５】Embodiment 13. The preceding embodiment showed an example in which the range of a partial area of the document image data is specified with the actual input data displayed on the screen. This embodiment describes an example that provides the same capability for specifying the repeating areas and nested-structure ranges of Embodiments 3 and 4. FIG. 42 is a configuration diagram of the structured document generation device of Embodiment 13; parts that are the same as or correspond to those of Embodiment 1 are given the same reference numerals, and their description is omitted. Reference numeral 34 denotes a repeating structure designating means. When the user specifies, on the image data displayed by the document image data display means 33, an example of what constitutes a repeating structure, it recognizes the specified repeating structure, adds information about it to the document image area / logical structure correspondence definition, and stores the resulting definition in the document image area / logical structure correspondence definition storage unit 2.

【００８６】FIG. 44 shows an example of an interface for setting the repetition pattern of a "partial area of the document image data" for which repetition has been specified in "repetition present / absent" 296. "Repeating structure extraction" 333 specifies the feature that marks the start of the repetition, the parent tag associated with the partial area that encloses all of the repeated partial areas, and the repeat tag associated with each repeated partial area. This part specifies the information referred to by the text analysis means.

【００８７】"Area ID 1" 334 and "area ID 2" 335 specify the document tags associated with each of the two partial areas that form one set. "Inter-area relation" 337 specifies the positional relationship between the partial areas of "area ID 1" 334 and "area ID 2" 335. "Repetition direction" 338 specifies the direction in which "area ID 1" 334 and "area ID 2" 335 repeat. "Parent tag" 336 specifies which parent tag the repeated partial areas belong to. This part specifies the information referred to by the repeating structure recognition means.

【００８８】The operation of the structured document generation device configured as described above is explained with reference to the flowchart of FIG. 43. First, as in step 301 of Embodiment 12, the image data is displayed by the image data display means. Next, a repeating structure definition menu for specifying the conditions on the repeating structure within a partial area of the document image data is displayed (step 321). The repeating structure definition menu of FIG. 44 is used to specify, for the components that appear repeatedly, their "shape", the "document tags to attach", the "positional relationship between components", and "how the components repeat". Next, the user specifies the shape of a repeatedly appearing component by pointing to a partial area of the image data displayed in step 301 (step 322). For example, when the partial area 331 of the image data in the upper part of FIG. 44 is specified, it is recognized as the partial area of area ID 1, and its shape, a width of 20 and a height of 10, is entered into the repeating structure definition menu. Likewise, the partial area 332 is recognized as the partial area of area ID 2.

【００８９】Next, the document tags to be attached to the components indicated in step 322 are specified (step 323). For example, to attach the document tag <attribute> to area 331 of the image data, it is specified as shown at 334. Likewise, to attach the document tag <value> to area 332 of the image data, it is specified as shown at 335. To attach the document tag <table> to the block in which the rectangular area formed by placing 331 and 332 side by side appears repeatedly, it is specified as shown at 336. Next, the user specifies the positional relationship between the components indicated in step 322 (step 324). For example, to specify that 331 and 332 are arranged side by side in the image data, the inter-area relation 337 is set to horizontal. Next, the user indicates how the group of areas satisfying the positional relationship indicated in step 324 repeats (step 325). For example, to indicate that the rectangular area formed by placing 331 and 332 of FIG. 44 side by side repeats in the vertical direction, the repetition direction 338 is set to vertical.

【００９０】Finally, the component shapes set in step 322, the document tags to be attached to the components set in step 323, the positional relationship between components set in step 324, and the repetition direction set in step 325 are written into the document image area / logical structure correspondence definition (step 326). For example, when the settings are as shown in FIG. 44, the portions 70 and 71 of the document image area / logical structure correspondence definition of FIG. 6 are written. In this way, complex processing that does not fit within the ordinary attribute items can be set up freely and separately and attached to a document tag (area ID). With the above configuration, the user can specify that an area contains a repeating structure merely by indicating part of that structure, which reduces the effort the user needs to specify areas.
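
To make the repeating structure settings concrete, the sketch below expands one example pair of components into the rectangles of each repetition. It is only an illustration: `Component` and `expand_repetition` are hypothetical names, and the `origin` and `count` arguments are assumptions added for the example, since in the device the extent of the repetition would be found by the repeating structure recognition means rather than given up front.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    """Shape and tag of one repeated component (hypothetical layout)."""
    tag: str
    width: int
    height: int


def expand_repetition(origin, first, second, relation, direction, count):
    """List the (tag, start, end) rectangles for each repetition.

    relation:  how `second` sits relative to `first`
               ("horizontal" = to the right, "vertical" = below).
    direction: how the pair as a whole repeats ("horizontal" or "vertical").
    """
    if relation == "horizontal":
        offset = (first.width, 0)
        block = (first.width + second.width, max(first.height, second.height))
    elif relation == "vertical":
        offset = (0, first.height)
        block = (max(first.width, second.width), first.height + second.height)
    else:
        raise ValueError("relation must be 'horizontal' or 'vertical'")

    if direction == "vertical":
        step = (0, block[1])
    elif direction == "horizontal":
        step = (block[0], 0)
    else:
        raise ValueError("direction must be 'horizontal' or 'vertical'")

    x0, y0 = origin
    pairs = []
    for i in range(count):
        bx, by = x0 + i * step[0], y0 + i * step[1]
        sx, sy = bx + offset[0], by + offset[1]
        pairs.append((
            (first.tag, (bx, by), (bx + first.width, by + first.height)),
            (second.tag, (sx, sy), (sx + second.width, sy + second.height)),
        ))
    return pairs
```

With an <attribute> component 20 wide and 10 high, as in the text, a <value> component of an assumed size placed to its right, and a vertical repetition direction, each successive pair is shifted down by the height of one row.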

【００９１】Embodiment 14. The preceding embodiments were described on the premise that the output format of the output document had already been specified, that is, that the document type had been defined and registered in the document type definition unit. This embodiment describes how the output format of the output document is specified. For this purpose, the logical structure of the document must be displayed in an easily understood form, for example as a tree. This is described below. FIG. 45 is a configuration diagram of the structured document generation device of Embodiment 14; parts that are the same as or correspond to those of Embodiment 1 are given the same reference numerals, and their description is omitted. Reference numeral 36 denotes a document logical structure display / correction means that displays the logical structure of the document as a tree by referring to the document type definition in the document type definition storage means 35. Reference numeral 31 denotes a document image area / logical structure association means. When the user specifies a node in the tree, it determines the document tag corresponding to that node, adds information about the tag to the document image area / logical structure correspondence definition, and stores the resulting definition in the document image area / logical structure correspondence definition storage unit.

【００９２】FIG. 47 shows an example of an interface that displays the "logical structure of the document" as a "tree" and lets the user specify a "specific part of the document's logical structure" 351 with the "pointing device" 313. The "document tag sequence" 273 changes according to the specific part specified with the pointing device 313, and as a result the "markup document tag expressing that the area is a specific part of the document's logical structure" is obtained.

【００９３】The operation of the structured document generation device configured as described above is explained with reference to the flowchart of FIG. 46. First, the document image data read by the document image data input means 1 is displayed by the document image data display means (step 341). For example, the image data of FIG. 5 is displayed as the document image data. Next, a tree representing the parent-child relationships of the document tags defined by the document type definition that the user specifies is displayed (step 342). For example, when the user specifies the document type definition shown in FIG. 7, the tree in the right part of FIG. 47 is displayed. The parent-child relationships of the document tags are as explained in the description of FIG. 7 in the embodiments. Next, if the user continues specifying partial areas in the document image, processing proceeds to step 344; otherwise it ends there.

【００９４】In step 344, the user specifies, with the pointing device, a partial area in the document image displayed in step 341. For example, to specify the partial area containing the text "rice cooker" in the document image data of FIG. 5, the user specifies the upper-left and lower-right vertices of the rectangular area shown at 312 in the document image data on the left of FIG. 47 with the pointing device 313. Next, an area ID is assigned to the specified document area (step 345). The area ID may be, for example, a number indicating the order in which the user specified the partial area. The area ID is used to identify the partial areas that the user specifies. Next, the coordinates of the partial area specified in step 344 are determined (step 346). For example, when the partial area 312 of FIG. 47 is specified, the upper-left (start point) coordinates (30, 50) and the lower-right (end point) coordinates (160, 80) of the area are obtained. Next, the user specifies a node in the tree displayed in step 342 with the pointing device (step 347). For example, to specify the document tag "name" of the document type definition shown in FIG. 7, the user specifies node 351 in the tree of FIG. 47 with the pointing device 353.

【００９５】Next, the document tag sequence is determined from the node specified in step 347 (step 348). The document tag sequence is the list, in order, of the document tags corresponding to the nodes on the path from the topmost node of the displayed tree to the specified node. For example, when node 351 "name" of FIG. 47 is specified, the document tag sequence is <catalog><name>, as shown at 273. When node 352 "attribute" of FIG. 47 is specified, the document tag sequence is <catalog><body><specification><table><attribute>. Next, the "area ID" obtained in step 345, the "coordinates of the partial area" obtained in step 346, and the "document tag sequence" obtained in step 348 are added to the document image area / logical structure correspondence definition, and processing returns to step 343 (step 349). For example, when the rectangular area shown as partial area 312 of FIG. 47 is the second area specified in the document image and node 351 in the tree of FIG. 47 is specified, the start point, end point, and document tag for "area ID" 2, shown in the third row of 61 in the document image area / logical structure correspondence definition of FIG. 6, are added. With the above configuration, the user can specify tags while looking at the logical structure of the document, which reduces the effort the user needs to specify tags.
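
Deriving the document tag sequence in step 348 amounts to collecting the tags on the path from the root of the tree to the chosen node. The sketch below shows one way to do that; the `(tag, children)` tuple encoding, the `TREE` constant, and the `tag_sequence` function are hypothetical, and `TREE` contains only the nodes named in the text, not the full tree of FIG. 47. Looking the node up by tag name assumes tag names are unique, whereas in the device the clicked node itself would be known.

```python
# Each node is (tag, [child nodes]). Only tags mentioned in the
# description are included; the real tree has further nodes.
TREE = (
    "catalog", [
        ("name", []),
        ("body", [
            ("specification", [
                ("table", [
                    ("attribute", []),
                ]),
            ]),
        ]),
    ],
)


def tag_sequence(tree, target):
    """Return the tags on the path from the root down to `target`."""

    def walk(node, path):
        tag, children = node
        path = path + [tag]
        if tag == target:
            return path
        for child in children:
            found = walk(child, path)
            if found is not None:
                return found
        return None

    result = walk(tree, [])
    if result is None:
        raise KeyError(target)
    return result


def format_sequence(tags):
    """Render a tag list in the <a><b> notation used in the description."""
    return "".join(f"<{tag}>" for tag in tags)
```

Under these assumptions, `format_sequence(tag_sequence(TREE, "name"))` gives `<catalog><name>`, and the same call for `"attribute"` gives `<catalog><body><specification><table><attribute>`, matching the two examples in the paragraph above.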

【００９６】 Fifteenth Embodiment. When designating the output format of the output document, it is important to be able to grasp intuitively the correspondence between the areas of the input document image data and their IDs or document tags, that is, the unit of decomposition. For this purpose, it is important to display the correspondence between the extent of each input partial area and the document tags, in particular their tree-structure display. This embodiment describes such a display. FIG. 48 is a configuration diagram of the structured document generation device according to the fifteenth embodiment; parts that are the same as or correspond to those in the first embodiment bear the same reference numerals, and their description is omitted. Reference numeral 37 denotes document image area / logical structure correspondence display means, which refers to the document image area / logical structure correspondence definition to determine the nodes corresponding to the tags referenced in that definition and highlights those nodes in the displayed tree structure.

【００９７】 FIG. 49 shows an example in which, for each "specific part of the logical structure of the document" in the tree display of the "logical structure of the document" of FIG. 47, associated parts 371 and unassociated parts 372 are displayed so that they can be distinguished visually. Here, an associated "specific part of the logical structure of the document" is either a terminal part that is associated with a "partial area of the document image data", or a part whose required subordinate parts are all associated and for which one of its optional subordinate parts is associated. An unassociated "specific part of the logical structure of the document" is any part other than an associated one.
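The rule above for deciding whether a part is associated can be sketched as a recursive check. This is a sketch under stated assumptions and not the device's implementation: the names `Part` and `is_associated` are hypothetical, the split of subordinate parts into required and optional ones is assumed for illustration, and a part with no optional subordinate parts is assumed to need only its required ones.

```python
class Part:
    """A specific part of the logical structure (hypothetical class)."""

    def __init__(self, tag, required=(), optional=()):
        self.tag = tag
        self.required = list(required)  # required subordinate parts
        self.optional = list(optional)  # optional (alternative) subordinate parts


def is_associated(part, mapped, path=()):
    """Return True if `part` counts as associated.

    `mapped` is the set of tag paths that have a partial area assigned.
    """
    path = path + (part.tag,)
    if not part.required and not part.optional:
        # Terminal part: associated only if a partial area is assigned to it.
        return path in mapped
    if not all(is_associated(c, mapped, path) for c in part.required):
        return False
    if part.optional:
        # Assumption: at least one optional subordinate part must be associated.
        return any(is_associated(c, mapped, path) for c in part.optional)
    return True


# Tree shape assumed from the tag strings quoted in the text.
date_time = Part("date/time")
feature = Part("feature", required=[Part("list", required=[Part("item")])])
appearance = Part("appearance")
text = Part("text", required=[feature, appearance, Part("specification")])
catalog = Part("catalog", required=[date_time, Part("name"), text])

# Tag paths of area IDs 1 and 3 in the FIG. 51 example.
mapped = {("catalog", "date/time"), ("catalog", "text", "appearance")}

print(is_associated(date_time, mapped, ("catalog",)))  # True
print(is_associated(text, mapped, ("catalog",)))       # False
```

With only these two tag strings assigned, the date/time part is associated while the text part is not, because not all of its required subordinate parts are associated.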

【００９８】 The operation of the structured document generation device configured as described above will be described with reference to the flowchart of FIG. 50. First, the document image area / logical structure correspondence definition is read (step 371). Next, the same processing as step 342 of the fourteenth embodiment is performed. In step 372, the partial areas are taken in order from the document image area / logical structure correspondence definition read in step 371; if a partial area is obtained, the process proceeds to step 373, and if all partial areas have been taken, the process ends. In step 373, the document tags are extracted for each partial area taken. For example, for the document image area / logical structure correspondence definition shown in FIG. 51, the document tag string <catalog><date/time> is extracted for the partial area with area ID 1, and the document tag string <catalog><text><appearance> is extracted for the partial area with area ID 3.

【００９９】 Next, the path in the tree structure corresponding to the document tag string obtained in step 372 is highlighted, and the process returns to step 372 (step 374). For example, when the document tag string <catalog><date/time> is obtained in step 373, the part 371 in FIG. 49 is highlighted. After step 374 has been performed for all the partial areas of the document image area / logical structure correspondence definition of FIG. 51, the tree structure shown in FIG. 49 is displayed. The path starting from <feature> in FIG. 49 is shown unfilled because the document tag string <catalog><text><feature><list><item> does not appear in the document image area / logical structure correspondence definition of FIG. 51. The node "text" in FIG. 49 is shown unfilled because not all of the document tags subordinate to <text>, namely <feature>, <appearance>, and <specification>, appear. With this configuration, the user can easily see whether a tag in the document structure is already referenced from the area definition file, which reduces the effort the user needs to create or correct the document image area / logical structure correspondence definition.

【０１００】 Sixteenth Embodiment. The preceding embodiment shows only the correspondence between the document tags and the tree display of the logical structure; it does not yet show an intuitively understandable correspondence between the extent of a partial area and the tree-structure display. This embodiment describes such a correspondence display. The device configuration of this embodiment is shown in FIG. 48, the same as in the preceding embodiment.

【０１０１】 FIG. 52 shows an example in which "document image data" 311 is displayed as a background with a plurality of "partial areas of the document image data" displayed over it. When one of these partial areas is selected with the "pointing device" 313 as the "partial area of the document image data of interest" 381, the "specific part of the logical structure of the document" associated with it in the tree display of the "logical structure of the document" is displayed as the "specific part of the logical structure of the document of interest" 382, visually distinguished from the other specific parts.

【０１０２】 The operation of the structured document generation device configured as described above will be described with reference to the flowchart of FIG. 53. First, the document image area / logical structure correspondence definition is read (step 381). Next, the same processing as steps 341 and 342 of the fourteenth embodiment is performed. The user then designates a point in the document image data displayed in step 341 with the pointing device (step 382). Next, the partial area containing the position designated in step 382 is found from the document image area / logical structure correspondence definition (step 383). For example, when the pointing device 313 of FIG. 52 points to the position shown in the figure within the document image data 311 displayed in step 341, the partial area with area ID 2 in the document image area / logical structure correspondence definition of FIG. 6 is found as the area containing that point.

【０１０３】 Next, the document tag string assigned to the partial area found in step 383 is obtained (step 384). For example, when the partial area with area ID 2 is found in step 343, the document tag string <catalog><name> assigned to the partial area with area ID 2 in the document image area / logical structure correspondence definition of FIG. 6 is obtained. Finally, the terminal node of the path in the tree structure corresponding to the document tag string obtained in step 384 is highlighted (step 385). For example, when the document tag string <catalog><name> is obtained in step 384, node 382 in the tree structure on the right of FIG. 52, displayed in step 342, is highlighted. With this configuration, the user can easily see which tag in the document structure an area in the document image data is associated with, which reduces the effort the user needs to create or correct the document image area / logical structure correspondence definition.
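Steps 383 and 384 amount to a point-in-rectangle lookup followed by reading the tag string of the matching entry. The following Python sketch illustrates this under stated assumptions: `Region` and `region_at` are hypothetical names, the coordinates of area ID 1 are those quoted for FIG. 6 in the seventeenth embodiment, and the coordinates of area ID 2 are invented purely for illustration.

```python
from typing import NamedTuple, Optional, Tuple


class Region(NamedTuple):
    """One entry of the correspondence definition (hypothetical layout)."""
    area_id: int
    start: Tuple[int, int]  # upper-left corner (x, y)
    end: Tuple[int, int]    # lower-right corner (x, y)
    tag_string: str


REGIONS = [
    Region(1, (20, 10), (160, 40), "<catalog><date/time>"),
    # Hypothetical coordinates, chosen only so the example can run.
    Region(2, (30, 60), (150, 90), "<catalog><name>"),
]


def region_at(regions, x, y) -> Optional[Region]:
    """Step 383: return the first region whose rectangle contains (x, y)."""
    for region in regions:
        (x0, y0), (x1, y1) = region.start, region.end
        if x0 <= x <= x1 and y0 <= y <= y1:
            return region
    return None


hit = region_at(REGIONS, 100, 70)
if hit is not None:
    # Step 384: the tag string names the node to highlight in the tree.
    print(hit.area_id, hit.tag_string)  # 2 <catalog><name>
```

Returning `None` for a point outside every rectangle leaves it to the caller to decide whether anything is highlighted; the text does not say how the device treats that case.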

【０１０４】 Seventeenth Embodiment. The preceding embodiment described an example in which designating a partial area causes the corresponding part of the tree-displayed logical structure to be identified on the display. This embodiment describes the reverse: designating a specific part of the tree-displayed logical structure causes the corresponding partial area of the document image data to be identified on the display. The device configuration of this embodiment is also shown in FIG. 48, the same as in the preceding embodiment.

【０１０５】 FIG. 54 shows the reverse of FIG. 52. In this example, one of the "specific parts of the logical structure of the document" in the tree display of the "logical structure of the document" is selected with the "pointing device" 313 as the "specific part of the logical structure of the document of interest" 391. Among the plurality of "partial areas of the document image data" displayed over the "document image data" 311 as background, the one associated with the specific part of interest 391 is then displayed as the "partial area of the document image data of interest" 392, visually distinguished from the other partial areas.

【０１０６】 The operation of the structured document generation device configured as described above will be described with reference to the flowchart of FIG. 55. First, the document image area / logical structure correspondence definition is read (step 381). Next, the same processing as steps 341, 342, and 343 of the fourteenth embodiment is performed. Next, the partial area whose document tag string corresponds to the path leading to the node designated in step 343 is found from the document image area / logical structure correspondence definition (step 391). For example, when node 391 in the tree structure on the right of FIG. 54, displayed in step 342, is designated, the partial area with area ID 1 in the document image area / logical structure correspondence definition of FIG. 6 is obtained as the document area having the document tag string <catalog><date/time>, which corresponds to the path leading to that node. Finally, the partial area obtained in step 391 is highlighted in the document image data (step 392). For example, when the partial area with area ID 1 in the document image area / logical structure correspondence definition of FIG. 6 is obtained in step 391, the rectangular area 392 with start point (20, 10) and end point (160, 40) is highlighted. With this configuration, the user can easily see which tag in the document structure is associated with which area in the document image data, which reduces the effort the user needs to create or correct the document image area / logical structure correspondence definition.
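The reverse lookup of step 391 can be sketched as a filter over the correspondence definition. This is an illustrative Python sketch only: the dictionary layout and the function name `regions_for` are assumptions, and the single entry shown uses the values quoted above for area ID 1.

```python
# Correspondence definition as a list of entries (hypothetical layout).
DEFINITION = [
    {"area_id": 1, "start": (20, 10), "end": (160, 40),
     "tags": "<catalog><date/time>"},
]


def regions_for(definition, tag_string):
    """Step 391: return every entry whose tag string equals the path to the node."""
    return [entry for entry in definition if entry["tags"] == tag_string]


for entry in regions_for(DEFINITION, "<catalog><date/time>"):
    # Step 392 would highlight this rectangle in the document image.
    print(entry["area_id"], entry["start"], entry["end"])  # 1 (20, 10) (160, 40)
```

The function returns a list rather than a single entry because the text does not state whether one tag string may be assigned to more than one partial area.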

【０１０７】 Eighteenth Embodiment. This embodiment describes an example in which it can easily be checked whether the output document is a structured document that contains the necessary and sufficient information. FIG. 56 is a configuration diagram of the structured document generation device according to the eighteenth embodiment; parts that are the same as or correspond to those in the first embodiment bear the same reference numerals, and their description is omitted. Reference numeral 38 denotes document type definition validity checking means, which determines whether the designated document area / logical structure correspondence definition satisfies the document type definition and, if it does not, identifies the tag responsible and highlights the corresponding node in the tree structure.

【０１０８】 FIG. 58 shows an example of a user interface that analyzes whether the generated structured document is valid and, when an invalid part is found, clearly indicates the document tag of the logical structure corresponding to the invalid part and displays the reason it is invalid.

【０１０９】 The operation of the structured document generation device configured as described above will be described with reference to the flowchart of FIG. 57. First, the document image area / logical structure correspondence definition is read (step 371). Next, the same processing as step 342 of the fourteenth embodiment is performed. Next, document tags that are specified in the document type definition but not in the document image area / logical structure correspondence definition are found (step 401). For example, when the document type definition of FIG. 7 is designated in step 342 and the document image area / logical structure correspondence definition of FIG. 51 is read in step 371, <catalog><text><feature> is found as a document tag string that exists in the document type definition of FIG. 7 but is not defined in the correspondence definition of FIG. 51. Next, the terminal node of the path in the tree structure corresponding to the document tag string found in step 401 is highlighted (step 402), and then the reason for the inconsistency between the document type definition and the document image area / logical structure correspondence definition is displayed (step 403). For example, when the document tag string <catalog><text><feature> is found in step 401, the node indicated by 411 in FIG. 58 is highlighted and the reason for the inconsistency indicated by 412 is displayed. With this configuration, the user can easily see whether the document area / logical structure correspondence definition is appropriate and, if it is not, which part is at fault, which reduces the effort the user needs to create or correct the document image area / logical structure correspondence definition.
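The check of step 401 can be sketched as a comparison between the tag paths of the document type definition and those of the correspondence definition. The Python sketch below rests on several assumptions and is not the device's implementation: the contents of FIG. 7 and FIG. 51 are not reproduced in this section, so the path lists are reconstructed from the tag strings quoted in the text and are partly hypothetical, and the function names and the wording of the reason message are invented for illustration.

```python
def fmt(path):
    """Render a tag path such as ("catalog", "name") as <catalog><name>."""
    return "".join(f"<{t}>" for t in path)


def missing_tag_paths(dtd_paths, defined_paths):
    """Step 401: return the shallowest paths of the document type definition
    that no tag string of the correspondence definition passes through."""
    def covered(path):
        return any(d[:len(path)] == path for d in defined_paths)

    missing = {p for p in dtd_paths if not covered(p)}
    # Drop a path when one of its ancestors is already reported as missing.
    return [p for p in dtd_paths
            if p in missing
            and not any(p[:i] in missing for i in range(1, len(p)))]


# Paths assumed for the document type definition (illustrative only).
DTD_PATHS = [
    ("catalog",),
    ("catalog", "date/time"),
    ("catalog", "name"),
    ("catalog", "text"),
    ("catalog", "text", "feature"),
    ("catalog", "text", "feature", "list"),
    ("catalog", "text", "feature", "list", "item"),
    ("catalog", "text", "appearance"),
    ("catalog", "text", "specification"),
    ("catalog", "text", "specification", "table"),
    ("catalog", "text", "specification", "table", "attribute"),
]

# Area IDs 1 and 3 follow the FIG. 51 example; the other two are assumed.
DEFINED_PATHS = [
    ("catalog", "date/time"),
    ("catalog", "name"),
    ("catalog", "text", "appearance"),
    ("catalog", "text", "specification", "table", "attribute"),
]

for path in missing_tag_paths(DTD_PATHS, DEFINED_PATHS):
    # Steps 402 and 403: highlight the node and show a reason (wording assumed).
    print(f"{fmt(path)}: no partial area is assigned to this tag")
```

Reporting only the shallowest missing path matches the example in the text, where <catalog><text><feature> is reported rather than the longer paths beneath it.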

【０１１０】[0110]

[Effects of the Invention] As described above, the present invention comprises correspondence definition storage means for storing document image areas as definitions within the logical structure of a document, document image area cut-out means for cutting out partial areas of a document image, document type definition storage means for storing a document as a logical structure that is a collection of required areas, and structured document generation means for generating a structured document by processing the attributes in the document tags according to the designated document type. As a result, the format of the structured document can be freely set and selected to produce output suited to the interface required by downstream programs, the fixed format information in the image document data can be used effectively, and the reusability of document data and the feasibility of automatic processing such as database registration are improved.

【０１１１】 Furthermore, because document image area estimating means is added, the correspondence between document image areas and the logical structure can be recognized correctly even when the format of the read document image data differs slightly from the prescribed format in the correspondence definition storage means. This widens the recognition tolerance, improves the stability of the device, and reduces the effort required to register document formats.

【０１１２】 Furthermore, because a separate detailed processing table for processing attributes is provided, complex processing can be applied to suit the required output document, and the processing content can be changed freely. This enables fine-grained output document management and simplifies that management.

【０１１３】 Furthermore, because image cut-out means is added that cuts out a partial area of the document image data and outputs it as an image file, structured documents can be generated flexibly even when the document contains data that is difficult to convert into coded data, such as figures and photographs.

【０１１４】 Furthermore, because an input character string is converted into another character string, a structured document is obtained in a format that the various application software on the output side can process easily, which improves the reusability of document data and the feasibility of automatic processing.

【０１１５】 Furthermore, because document image area / logical structure association means is provided that divides partial areas of the document image data into units suited to the output-side application software, attaches document tags, and registers and stores them, the user can freely define the correspondence between document image areas and the logical structure. Registering document image data formats that meet user requirements increases the flexibility of structured document management.

【０１１６】 Furthermore, even when complex processing is to be applied to a partial area of the document image data, processing that meets the user's requirements can be specified freely and independently, which improves the usability of the document data.

【０１１７】 Furthermore, because partial areas of the document image data and their document tags can be displayed in correspondence, the user can designate areas while referring to the document image data. This reduces the effort of area designation and hence of structured document management.

【０１１８】 Furthermore, because document logical structure display / correction means is provided that displays the logical structure of the document defined by the document type definition as a tree structure, the user can designate document tags while referring to the logical structure of the document. This reduces the effort of tag designation and hence of structured document management.

【０１１９】 Furthermore, because the document type definition is shown as a tree structure and the correspondence between partial areas of the document image data and document tags is stored, the user can easily see whether a tag in the document structure is already referenced from the area definition file. This reduces the effort the user needs to create or correct the document image area / logical structure correspondence definition, and hence the effort of structured document management.

【０１２０】 Furthermore, because the document type definition shown as a tree structure, the partial areas of the document image data, and the document tags can be displayed in correspondence, the user can grasp these relationships intuitively, which reduces the effort of structured document management.

【０１２１】 Furthermore, because the absence of a correspondence among the tree-structure display, a partial area, and a document tag is indicated on the display, the user can tell whether the document area / logical structure correspondence definition is appropriate, which reduces the effort of structured document management.

[Brief description of the drawings]

【図１】 FIG. 1 is a diagram showing the configuration of the structured document generation device according to the first embodiment of the present invention.

【図２】 FIG. 2 is a flowchart showing an example of the processing operation of the structured document generation device of the present invention.

【図３】 FIG. 3 is a flowchart showing an example of the processing operation of the structured document generation device of the present invention.

【図４】 FIG. 4 is a flowchart showing an example of the processing operation of the structured document generation device of the present invention.

【図５】 FIG. 5 is a diagram showing an example of document image data in the structured document generation device of the present invention.

【図６】 FIG. 6 is a diagram showing an example of the document image area / logical structure correspondence definition of the structured document generation device of the present invention.

【図７】 FIG. 7 is a diagram showing an example of a document type definition in the structured document generation device of the present invention.

【図８】 FIG. 8 is a diagram showing an example of a document tag / coded data correspondence table in the structured document generation device of the present invention.

【図９】 FIG. 9 is a diagram showing the parent-child relationships of the document tags of FIG. 7 as a tree structure.

【図１０】 FIG. 10 is a diagram showing an example of a structured document output by the structured document generation device of the present invention.

【図１１】 FIG. 11 is a diagram showing another example of a document image area / logical structure correspondence definition.

【図１２】 FIG. 12 is a diagram showing another example of a document type definition.

【図１３】 FIG. 13 is a diagram showing another example of a structured document.

【図１４】 FIG. 14 is a diagram showing an example of an intermediate structured document generated as an intermediate result.

【図１５】 FIG. 15 is a diagram showing the configuration of the structured document generation device according to the second embodiment of the present invention.

【図１６】 FIG. 16 is a flowchart showing the processing operation of the structured document generation device according to the second embodiment of the present invention.

【図１７】 FIG. 17 is a diagram showing the configuration of the structured document generation device according to the third embodiment of the present invention.

【図１８】 FIG. 18 is a flowchart showing the processing operation of the structured document generation device according to the third embodiment of the present invention.

【図１９】 FIG. 19 is a diagram showing the configuration of the structured document generation device according to the fourth embodiment of the present invention.

【図２０】 FIG. 20 is a flowchart showing the processing operation of the structured document generation device according to the fourth embodiment of the present invention.

【図２１】 FIG. 21 is a diagram showing an example of the document image area / logical structure correspondence definition of the structured document generation device according to the fourth embodiment of the present invention.

【図２２】 FIG. 22 is a diagram showing the configuration of the structured document generation device according to the fifth embodiment of the present invention.

【図２３】 FIG. 23 is a flowchart showing the processing operation of the structured document generation device according to the fifth embodiment of the present invention.

【図２４】 FIG. 24 is a diagram showing the configuration of the structured document generation device according to the sixth embodiment of the present invention.

【図２５】 FIG. 25 is a flowchart showing the processing operation of the structured document generation device according to the sixth embodiment of the present invention.

【図２６】 FIG. 26 is a diagram showing the configuration of the structured document generation device according to the seventh embodiment of the present invention.

【図２７】 FIG. 27 is a flowchart showing the processing operation of the structured document generation device according to the seventh embodiment of the present invention.

【図２８】 FIG. 28 is a diagram showing the configuration of the structured document generation device according to the eighth embodiment of the present invention.

【図２９】 FIG. 29 is a flowchart showing the processing operation of the structured document generation device according to the eighth embodiment of the present invention.

【図３０】 FIG. 30 is a diagram showing the configuration of the structured document generation device according to the ninth embodiment of the present invention.

【図３１】 FIG. 31 is a flowchart showing the processing operation of the structured document generation device according to the ninth embodiment of the present invention.

【図３２】 FIG. 32 is a diagram showing an example of the document image area / logical structure correspondence definition of the structured document generation device according to the ninth embodiment of the present invention.

FIG. 33 is a diagram showing the configuration of a structured document generation device according to a tenth embodiment of the present invention.

FIG. 34 is a flowchart showing the processing operation of the structured document generation device according to the tenth embodiment of the present invention.

FIG. 35 is a diagram showing an example of a document image area / logical structure association interface of the structured document generation device according to the tenth embodiment of the present invention.

FIG. 36 is a diagram showing the configuration of a structured document generation device according to an eleventh embodiment of the present invention.

FIG. 37 is a flowchart showing the processing operation of the structured document generation device according to the eleventh embodiment of the present invention.

FIG. 38 is a diagram showing an example of a recognition attribute designation interface of the structured document generation device according to the eleventh embodiment of the present invention.

FIG. 39 is a diagram showing the configuration of a structured document generation device according to a twelfth embodiment of the present invention.

FIG. 40 is a flowchart showing the processing operation of the structured document generation device according to the twelfth embodiment of the present invention.

FIG. 41 is a diagram showing an example of a document image data display interface of the structured document generation device according to the twelfth embodiment of the present invention.

FIG. 42 is a diagram showing the configuration of a structured document generation device according to a thirteenth embodiment of the present invention.

FIG. 43 is a flowchart showing the processing operation of the structured document generation device according to the thirteenth embodiment of the present invention.

FIG. 44 is a diagram showing an example of a repeated structure designation interface of the structured document generation device according to the thirteenth embodiment of the present invention.

FIG. 45 is a diagram showing the configuration of a structured document generation device according to a fourteenth embodiment of the present invention.

FIG. 46 is a flowchart showing the processing operation of the structured document generation device according to the fourteenth embodiment of the present invention.

FIG. 47 is a diagram showing an example of a document logical structure display interface of the structured document generation device according to the fourteenth embodiment of the present invention.

FIG. 48 is a diagram showing the configuration of the structured document generation device according to the fifteenth, sixteenth, and seventeenth embodiments of the present invention.

FIG. 49 is a diagram showing an example of a document area / document logical structure correspondence display interface of the structured document generation device according to the fifteenth embodiment of the present invention.

FIG. 50 is a flowchart showing the processing operation of the structured document generation device according to the fifteenth embodiment of the present invention.

FIG. 51 is a diagram showing an example of a document image area / logical structure correspondence definition of the structured document generation device according to the fifteenth embodiment of the present invention.

FIG. 52 is a diagram showing an example of a document area / document logical structure correspondence display interface of the structured document generation device according to the sixteenth embodiment of the present invention.

FIG. 53 is a flowchart showing the processing operation of the structured document generation device according to the seventeenth embodiment of the present invention.

FIG. 54 is a diagram showing an example of a document area / document logical structure correspondence display interface of the structured document generation device according to the seventeenth embodiment of the present invention.

FIG. 55 is a flowchart showing the processing operation of the structured document generation device according to the seventeenth embodiment of the present invention.

FIG. 56 is a diagram showing the configuration of a structured document generation device according to an eighteenth embodiment of the present invention.

FIG. 57 is a flowchart showing the processing operation of the structured document generation device according to the eighteenth embodiment of the present invention.

FIG. 58 is a diagram showing an example of a document type definition validity confirmation interface of the structured document generation device according to the eighteenth embodiment of the present invention.

FIG. 59 is a configuration diagram showing a conventional structured document generation device.

[Explanation of symbols]

1 document image data input means, 2 document image area / logical structure correspondence definition storage section, 3 document image area cutout means, 4 pattern recognition means, 5 structured document generation means, 6 structured document output means, 7 control means, 8 document image area estimation means, 9 repeated structure recognition means, 10 nested structure recognition means, 11 image cutout means, 12 text analysis means, 13 character string conversion means, 14 area / logical structure correspondence definition estimation means, 21 external procedure calling means, 22 script / logical structure correspondence definition storage section, 31 document image area / logical structure associating means, 32 recognition attribute designation means, 33 document image data display means, 34 repeated structure designation means, 35 document type definition storage means, 36 document logical structure display means, 37 document image area / logical structure correspondence display means, 38 document type definition validity confirmation means.

Claims

1. A structured document generation device comprising: document image area / logical structure correspondence definition storage means (hereinafter referred to as correspondence definition storage means) that stores, as a definition, a document image area, which is a predetermined portion of input document image data, together with a document tag corresponding to a specific logical structure portion of a document and a processing attribute necessary for an output document; document image area cutout means that cuts out a predetermined portion of the pattern-recognized input using the information on the document image area in the correspondence definition storage means; document type definition storage means that stores, as a named definition for each document, a logical structure describing the collection of document image areas necessary for each of various output documents; and structured document generation means that, when a prescribed output document is designated, refers to the definition name and logical structure corresponding to the designation in the document type definition storage means, processes the area cut out by the document image area cutout means according to the processing attribute in the correspondence definition storage means, and generates a structured document in accordance with the logical structure.
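
As a reading aid for claim 1, the following minimal Python sketch shows one way the claimed means could interact. It is not taken from the patent: the names `Correspondence`, `cut_out`, and `generate`, the rectangle representation of a document image area, the flat tag list used as the logical structure, and the sample data are all assumptions made for illustration, and pattern recognition is assumed to have already produced positioned text fragments.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical types; none of these names come from the patent text.
Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in page coordinates


@dataclass
class Correspondence:
    """One correspondence definition: document image area -> tag + processing attribute."""
    tag: str
    box: Box
    process: Callable[[str], str]  # processing attribute applied to the recognized text


def cut_out(recognized: List[Tuple[Box, str]], box: Box) -> str:
    """Join the recognized text fragments whose boxes lie entirely inside the area."""
    left, top, right, bottom = box
    inside = [text for (x0, y0, x1, y1), text in recognized
              if x0 >= left and y0 >= top and x1 <= right and y1 <= bottom]
    return " ".join(inside)


def generate(doc_type: str,
             doc_types: Dict[str, List[str]],
             correspondences: Dict[str, Correspondence],
             recognized: List[Tuple[Box, str]]) -> str:
    """Emit a tagged document that follows the logical structure named by doc_type."""
    lines = [f"<{doc_type}>"]
    for tag in doc_types[doc_type]:  # assumed logical structure: an ordered list of tags
        c = correspondences[tag]
        value = c.process(cut_out(recognized, c.box))
        lines.append(f"  <{tag}>{value}</{tag}>")
    lines.append(f"</{doc_type}>")
    return "\n".join(lines)


# Assumed sample definitions and recognition results.
correspondences = {
    "title": Correspondence("title", (0, 0, 200, 40), str.strip),
    "date": Correspondence("date", (0, 50, 200, 80), lambda s: s.replace("/", "-")),
    "body": Correspondence("body", (0, 90, 200, 300), str.strip),
}
doc_types = {"memo": ["title", "date", "body"], "index": ["title", "date"]}
recognized = [((10, 5, 150, 30), "Quarterly report"),
              ((10, 55, 90, 75), "1995/10/27"),
              ((10, 100, 180, 120), "Sales rose.")]

print(generate("index", doc_types, correspondences, recognized))
```

In this sketch, designating "index" instead of "memo" reuses the same correspondence definitions but emits only the title and date, which illustrates how one set of area definitions might serve several output document types.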

2. The structured document generation device according to claim 1, further comprising document image area estimation means that, when a predetermined portion cut out of the input is compared with the corresponding document image area in the correspondence definition storage means and the two are judged to be similar with at least a certain evaluation value, replaces the document image area in the correspondence definition storage means or regards the cut-out portion as equivalent to that document image area.
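
The claim does not say how the evaluation value is computed. The Python sketch below assumes, purely for illustration, that areas are rectangles and that similarity is measured by intersection over union (IoU); the function names and the 0.5 threshold are likewise hypothetical.

```python
from typing import Dict, Optional, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); an assumed representation


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two rectangles, used here as the evaluation value."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0  # no overlap
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)


def estimate_area(candidate: Box,
                  defined: Dict[str, Box],
                  threshold: float = 0.5) -> Optional[str]:
    """Return the tag of the most similar defined area, or None if below the threshold."""
    best_tag, best_score = None, 0.0
    for tag, box in defined.items():
        score = iou(candidate, box)
        if score > best_score:
            best_tag, best_score = tag, score
    return best_tag if best_score >= threshold else None


defined = {"title": (0, 0, 200, 40), "date": (0, 50, 200, 80)}
print(estimate_area((5, 2, 195, 42), defined))       # slightly shifted title area
print(estimate_area((300, 300, 400, 400), defined))  # no similar area
```

A caller could then either overwrite the stored area with the candidate or simply treat the candidate as that area, matching the two alternatives named in the claim.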

3. The structured document generation device according to claim 1, wherein a detailed processing table is specified as a processing attribute, and a partial area of the input document image data is processed based on the detailed processing table to generate an output document.

4. The structured document generation device according to claim 1, wherein a predetermined image portion of the input is cut out to form an image file, which is combined with the document tag and output.

5. The structured document generation device according to claim 3, wherein an output document is generated by converting a predetermined portion of the input into another character string, either by referring to a table prepared as a further detailed processing table or by applying an algorithm.
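
The two conversion routes named in this claim (table lookup and algorithm) can be sketched in Python as follows. The function `convert`, the department-code table, and the choice of Unicode NFKC normalization as the example algorithm are assumptions for illustration and do not come from the patent.

```python
import unicodedata
from typing import Callable, Dict, Optional


def to_ascii_width(text: str) -> str:
    """Example algorithm: fold full-width characters to their ASCII forms (NFKC)."""
    return unicodedata.normalize("NFKC", text)


def convert(text: str,
            table: Optional[Dict[str, str]] = None,
            algorithm: Optional[Callable[[str], str]] = None) -> str:
    """Convert a recognized string via a lookup table first, else via an algorithm."""
    if table is not None and text in table:
        return table[text]
    if algorithm is not None:
        return algorithm(text)
    return text  # no conversion specified: pass the string through unchanged


# Hypothetical conversion table mapping a recognized name to a code.
department_codes = {"Sales Dept.": "D01", "Accounting Dept.": "D02"}

print(convert("Sales Dept.", table=department_codes))
print(convert("１９９５", algorithm=to_ascii_width))
```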

6. The structured document generation device according to claim 1, further comprising document image area / logical structure associating means that, when a document tag indicating a portion of the logical structure constituting a document is designated for a document image area that is a predetermined portion of the cut-out input, links the two and registers and stores them in the correspondence definition storage means.

7. The structured document generation device according to claim 6, wherein an attribute for performing predetermined processing for the output document is added in correspondence with the document tag and is registered and stored in the correspondence definition storage means.

8. The structured document generation device according to claim 1, further comprising display means for the input document image data, wherein, when a document tag is designated, the document image area of the corresponding portion of the document image data is identified and displayed.

9. The structured document generation device according to claim 1, further comprising document logical structure display / correction means that reads out an arbitrary document type definition stored in the document type definition storage means and displays it as a tree structure, and that, when instructed to store a document type definition corrected in the displayed tree structure, stores it in the document type definition storage means.

10. The structured document generation device according to claim 6, further comprising document logical structure display / correction means that displays an arbitrary document type definition, including a corrected one, as a tree structure, wherein the document image area / logical structure associating means makes the association for each document tag of the document type definition displayed as the tree structure.

11. A display means for input document image data is added, and when a document tag is designated, the document image area of the corresponding portion of the document image data is identified and displayed. The structured document generation device described.

12. The structured document generation device according to claim 10, wherein the document image area / logical structure associating means makes the association for each document tag of the document type definition displayed as a tree structure, and, if there is an independent tree structure having no association, the document logical structure display / correction means identifies and displays that tree structure.
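
One possible reading of "an independent tree structure having no association" is a subtree in which no tag has been linked to a document image area. The Python sketch below finds the topmost such subtrees; the nested-dictionary tree, the function names, and the sample document type are hypothetical and are not drawn from the patent.

```python
from typing import Dict, List, Set

# Assumed representation: each tag maps to a dict of its child tags.
Tree = Dict[str, dict]


def has_association(name: str, children: Tree, associated: Set[str]) -> bool:
    """True if this tag or any tag below it is linked to a document image area."""
    return name in associated or any(
        has_association(child, sub, associated) for child, sub in children.items())


def unassociated_subtrees(tree: Tree, associated: Set[str], prefix: str = "") -> List[str]:
    """Return paths of the topmost subtrees that contain no associated tag."""
    result: List[str] = []
    for name, children in tree.items():
        path = f"{prefix}/{name}"
        if not has_association(name, children, associated):
            result.append(path)  # the whole subtree lacks a correspondence
        else:
            result.extend(unassociated_subtrees(children, associated, path))
    return result


doc_type_tree = {"memo": {"header": {"title": {}, "date": {}},
                          "body": {},
                          "footer": {"signature": {}, "stamp": {}}}}
associated_tags = {"title", "body"}

print(unassociated_subtrees(doc_type_tree, associated_tags))
```

A display interface could highlight the returned paths so that the operator sees which parts of the document type definition still lack a corresponding area.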