JPH01130293A

JPH01130293A - Document image analyzing system

Info

Publication number: JPH01130293A
Application number: JP62290207A
Authority: JP
Inventors: Yoshitake Tsuji; 辻　善丈
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 1987-11-16
Filing date: 1987-11-16
Publication date: 1989-05-23

Abstract

PURPOSE: To extract a prescribed area stably and easily by automatically and hierarchically generating the elements that constitute various document images and the arrangement structure between those elements, according to the inclusion relationships of blocks and their top-bottom or left-right relative positions.

CONSTITUTION: A document image stored in a document image memory 1 is divided into element areas such as character lines and characters by an area dividing part 2. When one or more element areas are structured as a block, a document structure generating part 20 hierarchically determines the attributes of each block and the arrangement structure between blocks according to the inclusion relationship of each block and its top-bottom or left-right arrangement, and stores them in a structured data storage part 4. An area searching part 13 uses the block attributes and the hierarchical arrangement structure between blocks to search for the area to be extracted from the document image, or for the one or more blocks that constitute it, and stores the result in an extraction result storage part 14.

Description

Detailed Description of the Invention

(Industrial Field of Application) The present invention relates to a document image analysis system that divides a document image into its constituent elements, such as character lines and characters, and automatically extracts a desired area. In particular, it relates to a document image analysis system suited to decomposing a document image into elements such as character lines and characters and automatically extracting a desired area, including the flow of the text.

(Prior Art and Its Problems) To store, retrieve, and transmit large volumes of existing document images efficiently, and to read general books automatically, it is not enough to read only predetermined character image strings on forms with a fixed format. A wide variety of document images must be analyzed, character areas must be separated from figure and table areas, and desired areas must be extracted automatically.

Conventionally, as a structural analysis method for such document images, there is a method, described in the technical paper "Structural Analysis of Page Images Based on the Split Detection Method" by the present inventor (IECE Technical Report, Pattern Recognition and Learning, PRL85-17, 1985-6, pp. 63-70), that alternately extracts vertical and horizontal projection information, decomposes the image from global areas into local areas, and then determines the constituent elements of the document. That method, however, was designed on the premise of reading the characters of body text in books and the like. It does not yield the arrangement relationships that make up a document image, so when the multi-column layouts found in general documents must be handled, or when the characters of a specific document area are to be read, the elements of the document image and the arrangement relationships between them must be obtained.

Meanwhile, as a method for extracting a prescribed area of a document image, a method is known, as shown for example in Japanese Patent Application No. Sho 41-281377, "Image Understanding Method", in which the area to be extracted is determined according to a grammar that defines the document image as a set of rectangular areas. However, defining the position and size of every rectangular area in absolute or relative coordinates is laborious. Moreover, when the text of a document image such as a book is to be read, defining rectangular areas by coordinates becomes impractical because the number of lines varies and figures and the like are mixed in.

Accordingly, an object of the present invention is to solve the above problems of the prior art by providing a document image analysis system that automatically generates the structure of a document image as a hierarchical structure, according to the inclusion relationships and the top-bottom or left-right arrangement relationships between groups of one or more element areas (hereinafter called blocks), and that extracts a prescribed area by means of relative arrangement relationships.

Another object of the present invention is to provide a document image analysis system that, by automatically generating the relationships between the blocks of various document images, can easily detect the order in which text areas should be read and the structure of the document.

Another object of the present invention is to provide a document image analysis system that can easily extract a specific area determined according to the purpose.

Another object of the present invention is to provide a document image analysis system that extracts the shape of each character line, including blank blocks located at prescribed positions within the line, and generates the logical arrangement structure of the text areas.

(Means for Solving the Problems) To solve the problems described above, the document image analysis system provided by the first invention of the present application comprises: means for decomposing a document image into element areas such as character lines and characters; document structure generation means that, when structuring one or more element areas as a block, hierarchically determines and stores the attributes of each block and the arrangement structure between blocks according to the inclusion relationship of each block and its top-bottom or left-right arrangement relationship; and area search means that searches, based on the block attributes and the hierarchical arrangement structure between blocks, for the area to be extracted from the document image or for the one or more blocks that constitute it.

Further, to solve the problems described above, the document image analysis system provided by the second invention of the present application comprises: means for decomposing a document image into element areas such as character lines and characters; document structure generation means that, when structuring one or more element areas as a block, hierarchically determines and stores the attributes of each block and the arrangement structure between blocks according to the inclusion relationship of each block and its top-bottom or left-right arrangement relationship; means for extracting, as a blank block, a blank of prescribed position and size within a character line block; document structure update means that examines the shape of each character line block, based on the blank blocks, in the blocks containing several character lines generated by the document structure generation means, and updates the arrangement structure of the document image; and area search means that searches the hierarchical arrangement structure for the area to be extracted from the document image or for the one or more blocks that constitute it.

(Embodiments) Embodiments of the present invention are described below with reference to the drawings.

FIG. 1 and FIG. 2 are examples used to explain the structure of document images written vertically and horizontally, respectively.

In the figures, black dots represent characters, and the hatched rectangular area (F1 in the figure) represents an element such as a figure, table, or photograph.

With a conventional area division method or line extraction method for document images, the character line areas labeled Si in FIGS. 1 and 2 (i = 1 ... 7 in FIG. 1; i = 1 ... 15 in FIG. 2) and areas such as the figure, table, or photograph labeled F1 in FIG. 1 (hereinafter called pixel description areas) can be extracted.

Next, consider the flow of text information. In vertical writing as shown in FIG. 1, text information normally flows from right to left across the vertically written character lines, and from top to bottom through the characters within each line. That is, the character lines in FIG. 1 stand in a left-right relationship to one another, and the characters within each line stand in a top-bottom relationship. There are also arrangement relationships such as the pixel description area F1 lying above character lines S5, S6, and S7. When the blocks labeled Ti (i = 1, 2) in FIG. 1, that is, the text areas, are detected, it is easy to see that the two text areas T1 and T2, each made up of character lines in a left-right relationship, are themselves in a left-right relationship, and that the pixel description area F1 and text area T2 are in a top-bottom relationship. By extracting the inclusion relationships and the top-bottom or left-right arrangement relationships described above, it becomes easy, for example, to extract the characters in each character line in order from text area T1 to T2 and convert them to character codes, or to extract only the text area below pixel description area F1.
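The extraction of "the text area below F1" can be sketched as a search over a tree of blocks. The following Python sketch is illustrative only: the dictionary layout, the key names (`name`, `class`, `arrangement`, `children`), and the function `text_below` are assumptions made for this example and do not appear in the source.

```python
def text_below(block, figure_name):
    """Collect names of text blocks that directly follow the named figure
    inside any block whose children are arranged top to bottom."""
    found = []
    children = block.get("children", [])
    if block.get("arrangement") == "top-bottom":
        # Children are assumed to be stored in top-to-bottom order.
        for above, below in zip(children, children[1:]):
            if above["name"] == figure_name and below["class"] == "text":
                found.append(below["name"])
    for child in children:
        found.extend(text_below(child, figure_name))
    return found


# Hypothetical encoding of the FIG. 1 layout: T1 beside a block holding F1 over T2.
page = {
    "name": "page", "class": "virtual", "arrangement": "left-right",
    "children": [
        {"name": "T1", "class": "text", "children": []},
        {"name": "M1", "class": "virtual", "arrangement": "top-bottom",
         "children": [
             {"name": "F1", "class": "pixel", "children": []},
             {"name": "T2", "class": "text", "children": []},
         ]},
    ],
}
print(text_below(page, "F1"))  # ['T2']
```

The search relies only on relative arrangement, so it needs no coordinates for F1 or T2.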

Similarly, in horizontal writing as shown in FIG. 2, information normally flows from left to right through the characters in each horizontally written character line, and from top to bottom through a text area made up of such lines. For example, among the character lines Si (i = 1 ... 15) in the figure, lines S1 to S5, S6 to S10, and S11 to S15 are each character lines in a top-bottom relationship, and they form text areas T1, T5, and T6, respectively. In addition, text areas T5 and T6 form a structure resembling a two-column layout: text area T4 can be regarded as formed by text areas T5 and T6, which stand in a left-right relationship in terms of information flow.

Further, as shown in FIG. 1, when text area T1 contains character lines with different character pitches, text area T1, which is made up of character lines in a top-bottom relationship, can be decomposed further into two text areas (one of them labeled T3), and the top-bottom relationship still holds within each.

As explained above, the arrangement of the constituent elements of a document image and the flow of text information can be expressed by hierarchically detecting and generating the top-bottom and left-right relationships between elements (a right-to-left relationship for vertical writing and a left-to-right relationship for horizontal writing).
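To illustrate how a hierarchy of top-bottom and left-right relationships encodes the flow of text, here is a minimal Python sketch. It assumes that each block already stores its children in flow order; the dictionary keys and the function name `reading_order` are hypothetical and chosen only for this example.

```python
def reading_order(block):
    """Return character line names in flow order by a depth-first walk.
    Children are assumed to be stored in the order the text flows."""
    if block["class"] == "character line":
        return [block["name"]]
    order = []
    for child in block.get("children", []):
        order.extend(reading_order(child))
    return order


def line(name):
    return {"name": name, "class": "character line"}


# Hypothetical encoding of FIG. 1: T1 (S1..S4), then F1 above T2 (S5..S7).
fig1 = {
    "name": "page", "class": "virtual",
    "children": [
        {"name": "T1", "class": "text",
         "children": [line("S1"), line("S2"), line("S3"), line("S4")]},
        {"name": "M1", "class": "virtual",
         "children": [
             {"name": "F1", "class": "pixel"},
             {"name": "T2", "class": "text",
              "children": [line("S5"), line("S6"), line("S7")]},
         ]},
    ],
}
print(reading_order(fig1))  # ['S1', 'S2', 'S3', 'S4', 'S5', 'S6', 'S7']
```

Non-text blocks such as F1 contribute nothing to the order, which matches the idea of reading T1 and then T2 as continuous text.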

FIGS. 3(a), 3(b), and 3(c) show an example of a method that divides areas hierarchically while alternately establishing top-bottom and left-right relationships between blocks.

A method for realizing this area division is described in "Structural Analysis of Page Images Based on the Split Detection Method" by the same applicant. The details are therefore omitted here; FIGS. 3(a), 3(b), and 3(c) are used to show that this area division is one way of hierarchically decomposing the page into its constituent elements while each block retains its top-bottom or left-right arrangement relationship.

The above method projects the pixel values (binary black or white) horizontally and vertically and uses the distribution of the resulting pixel counts (the projection distribution). The present invention is not limited to this, however: counts of black-white transition points may be used, or rectangle information obtained in advance by contour tracing may be projected horizontally and vertically to obtain information on overlapping areas.
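As a rough illustration of a projection distribution, the sketch below counts black pixels per row or per column of a binary image and finds the non-empty runs separated by blank gaps. It is an assumption-laden sketch: the image is a plain list of lists with 1 for black, and the names `projection_profile`, `split_runs`, and the `min_gap` parameter are invented for this example.

```python
from typing import List, Tuple


def projection_profile(image: List[List[int]], axis: str) -> List[int]:
    """Count black pixels (value 1) per row ("horizontal") or per column ("vertical")."""
    if axis == "horizontal":
        return [sum(row) for row in image]
    if axis == "vertical":
        return [sum(col) for col in zip(*image)]
    raise ValueError("axis must be 'horizontal' or 'vertical'")


def split_runs(profile: List[int], min_gap: int = 1) -> List[Tuple[int, int]]:
    """Return (start, end) spans of non-empty positions, end exclusive.
    Spans are separated by at least `min_gap` consecutive empty positions."""
    runs, start, gap = [], None, 0
    for i, count in enumerate(profile):
        if count > 0:
            if start is None:
                start = i
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                runs.append((start, i - gap + 1))
                start, gap = None, 0
    if start is not None:
        runs.append((start, len(profile) - gap))
    return runs


image = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
]
print(projection_profile(image, "horizontal"))  # [0, 3, 0, 2]
print(split_runs(projection_profile(image, "vertical")))  # [(0, 2), (3, 4)]
```

A larger `min_gap` would merge spans separated by narrow blanks, which is one simple way to keep characters of the same line together.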

In FIG. 3(a), area P of the document image to be divided contains characters, shown as black dots, and a pixel description area, shown as a hatched rectangle, and has a structure similar to the vertically written document image shown in FIG. 1.

The symbols Ri(Ln) (Ln = 1, 2, 3, ...; i = 1, 2, 3, ...) used in FIG. 3 denote the blocks obtained during hierarchical area division using projection distributions (the hatched shapes in the figure), where Ln denotes the division level. The division level Ln expresses both the depth in the hierarchy and the direction in which the projection information was taken. That is, areas produced by dividing with horizontal projection information have an odd division level, and areas produced by dividing with vertical projection information have an even division level. It is also clear that the areas produced by dividing with a horizontal (vertical) projection distribution preserve a top-bottom (left-right) relationship among themselves.
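The parity rule for division levels can be written as a small lookup. This is only a sketch of the rule as stated; the function name `split_direction` and the string labels are assumptions.

```python
def split_direction(level: int) -> tuple:
    """Return (projection used to create the level, relationship among its blocks)."""
    if level < 1:
        raise ValueError("division levels start at 1")
    if level % 2 == 1:
        # Odd level: horizontal projection, blocks stacked top to bottom.
        return ("horizontal", "top-bottom")
    # Even level: vertical projection, blocks placed side by side.
    return ("vertical", "left-right")


print(split_direction(2))  # ('vertical', 'left-right')
```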

First, horizontal projection distribution H1 is applied to the area P to be analyzed, yielding area R1(1).

Next, a vertical projection distribution is applied to block R1(1), yielding blocks R1(2), ..., R5(2). The five areas at division level 2 clearly satisfy a left-right relationship in sequence. It is also clear that block R1(1) and the blocks R1(2) ... R5(2) satisfy an inclusion relationship. How the area is divided into the five blocks is decided by examining the features of each block and the feature quantities between blocks (such as blank values and correlation ratios), case by case, according to the features of the parent block R1(1) (for example, its identifier) and the features of the five blocks R1(2), ..., R5(2). In the case of FIG. 3(a), the area is divided into five blocks; block R1(2) is given the identifier "undetermined", blocks R2(2) through R5(2) are given the identifier "character line candidate", and these are stored.

Because the area division uses a depth-first search technique, block R5(2) is taken out as the next block to divide, and the same processing is repeated. Applying a horizontal projection distribution to block R5(2) yields several character candidate areas, so division of block R5(2) stops at this point and the block is stored together with its character candidate areas. Blocks R4(2), R3(2), and R2(2) are likewise subjected to a horizontal projection distribution in turn and processed in the same way as block R5(2). Next, as shown in FIG. 3(b), horizontal projection distribution H2 is applied to block R1(2), yielding two blocks R1(3) and R2(3), which are processed in the same way as described above.

In the case of FIG. 3(b), blocks R1(3) and R2(3) are given the identifier "undetermined" and stored. Blocks R1(3) and R2(3) are in a top-bottom relationship, and their parent block is R1(2). Next, as shown in FIG. 3(c), vertical projection distribution V4 is applied to block R2(3), yielding three character line candidate blocks R1(4), R2(4), and R3(4).

When a vertical projection distribution is applied to block R1(3) in the same way as to block R2(3), only one area results, so the block cannot be divided further. Based on its block size and other features, it is given the identifier of a pixel representation block, such as a figure, table, or photograph.

When these operations have been repeated and the depth-first search is complete, the tree structure of area information shown in FIG. 4 has been generated.
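The alternating, depth-first division that produces such a tree can be sketched as a recursive cut on projection profiles. This sketch is not the split detection method of the cited paper: it splits at every blank row or column, uses no identifiers or feature tests, and stops only when neither direction divides a block. The names `Block`, `runs`, and `divide` and the `stalled` flag are assumptions for this example.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Block:
    top: int
    left: int
    bottom: int  # exclusive
    right: int   # exclusive
    level: int   # division level; 0 is used here for the whole page P
    children: List["Block"] = field(default_factory=list)


def runs(profile: List[int]) -> List[Tuple[int, int]]:
    """Return (start, end) spans of consecutive non-zero entries, end exclusive."""
    spans, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def divide(image: List[List[int]], block: Block, stalled: bool = False) -> Block:
    """Split `block` depth-first, alternating projection direction by level."""
    window = [row[block.left:block.right] for row in image[block.top:block.bottom]]
    level = block.level + 1
    if level % 2 == 1:  # odd level: horizontal projection, top-bottom parts
        parts = [Block(block.top + a, block.left, block.top + b, block.right, level)
                 for a, b in runs([sum(row) for row in window])]
    else:               # even level: vertical projection, left-right parts
        parts = [Block(block.top, block.left + a, block.bottom, block.left + b, level)
                 for a, b in runs([sum(col) for col in zip(*window)])]
    if len(parts) <= 1 and stalled:
        return block    # neither direction divides this block: leaf
    block.children = parts
    for child in parts:
        divide(image, child, stalled=(len(parts) == 1))
    return block


image = [
    [1, 1, 0, 1],
    [0, 0, 0, 1],
    [1, 1, 0, 1],
]
page = divide(image, Block(0, 0, 3, 4, 0))
print(len(page.children), len(page.children[0].children))  # 1 2
```

Recursion visits children first to last, whereas the text describes taking R5(2) first; the resulting tree is the same either way.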

The area division result shown in FIG. 4 was generated for the document image shown in FIG. 3(a). Because FIG. 3(a) has a structure similar to that of FIG. 1, the following explanation of FIG. 4 is given with reference to FIG. 1.

In FIG. 4, circles represent area information. The symbol Si denotes a character line and the symbol Fi a pixel representation area (the subscript i is added to show the correspondence with FIG. 1), and black dots denote character area information. Circles with no symbol represent virtual blocks.

Ln in the figure denotes the division level. When the division level is odd, a top-bottom relationship holds between the areas; when it is even, a left-right relationship holds between the blocks. In the hierarchical representation by the tree structure in FIG. 4, an inclusion relationship holds between division level i and division level i+1 (i = 1, 2, ...), and the areas within the same division level are in either a left-right relationship or a top-bottom relationship, each marked with its own symbol in the figure.

Next, an example of the attribute values of the block information described for FIG. 4 is explained with reference to FIG. 5. The classification names shown in FIG. 5 are: analysis target, character, line segment, character line, and pixel representation blocks, and virtual block.

Some virtual blocks are given classification names such as section or text during the structuring process described later.

The position and size attribute gives the position and size of the block. The inter-block distance gives the size of the blank space between adjacent blocks at the same division level. The number of child blocks is the number of the block's own children. Character blocks are assumed to be obtained by a method such as that described in "Adaptive Character Segmentation Method Based on a Minimum Variance Criterion" by the present inventor (Transactions of the Institute of Electronics and Communication Engineers of Japan D, '85/8, Vol. J68-D, No. 8, pp. 1497-1504). When the characters in a character line block are separated, any blank character present is also handled as a character whose number of child blocks is 0.

The inter-child-block arrangement attribute records how the child blocks under the block are arranged: a top-bottom relationship or a left-right relationship (right to left for vertical writing, left to right for horizontal writing). The child block classification name group stores the one or more classification names (line, character line, character, text) contained in the block, together with the count of each. The classification name, the inter-child-block arrangement attribute, the child block classification name group, and similar attributes are updated and set during the structuring described later. In the child block classification name group, the classification names for character lines and text are assumed to be stored separately for vertical and horizontal writing.
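The block attributes listed above could be held in a record such as the following Python sketch. FIG. 5 is not reproduced in this text, so the field names, types, and defaults here are assumptions chosen to mirror the attributes named in the description, not the actual layout of FIG. 5.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class BlockInfo:
    classification: str = "undetermined"   # e.g. "character line", "text", "virtual"
    level: int = 0                          # division level
    x: int = 0                              # position
    y: int = 0
    width: int = 0                          # size
    height: int = 0
    gap_to_next: int = 0                    # blank size to the adjacent block on the same level
    child_count: int = 0                    # number of direct child blocks
    child_arrangement: Optional[str] = None  # "top-bottom" or "left-right"
    child_classes: Counter = field(default_factory=Counter)  # classification name -> count
    parent: Optional["BlockInfo"] = field(default=None, repr=False)
    first_child: Optional["BlockInfo"] = field(default=None, repr=False)
    next_below: Optional["BlockInfo"] = field(default=None, repr=False)   # top-bottom pointer
    next_beside: Optional["BlockInfo"] = field(default=None, repr=False)  # left-right pointer


# Hypothetical record for a virtual block holding a figure above a text block.
m1 = BlockInfo(classification="virtual", level=2, child_count=2,
               child_arrangement="top-bottom")
m1.child_classes.update(["pixel representation", "text"])
print(m1.child_classes["text"])  # 1
```

`eq=False` keeps comparisons identity-based, which avoids recursing through parent and child pointers.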

FIG. 6 shows an example of a document structure generated automatically from the area division result shown in FIG. 4. The description of the document structure in FIG. 6 is generated using the block classification names shown in FIG. 5 as structuring conditions; as a result, blocks with the new classification name "text" (labeled Ti in the figure) are generated as collections of character lines.

In the case of FIG. 4, the structuring condition is vertical writing.

The vertical or horizontal writing information may be given in advance or determined automatically using conventional techniques. When several blocks are structured into what becomes a single parent block, no new block needs to be generated.

First, in searching the area information, the blocks with the highest division level (those classified as character line, line, character not forming a character line, or pixel representation; for example, blocks R1(4), R2(4), and R3(4) in FIG. 4) are taken out together with their parent block (only R2(3) in FIG. 4). Next, among the structuring conditions, the vertical or horizontal writing information and the division level Ln are checked. When the division level Ln is odd, the blocks sharing the same parent are put in order by linking them, with the top-bottom relationship pointers shown in FIG. 5, in the order in which the top-bottom relationship holds. When the division level Ln is even, then in terms of text flow a right-to-left relationship (marked with its own symbol in FIG. 6) holds for vertical writing and a left-to-right relationship holds for horizontal writing. The blocks sharing parent block R2(3) (R1(4), R2(4), and R3(4) in FIG. 4) are therefore put in order by linking them, with the left-right relationship pointers shown in FIG. 5, in the order in which that relationship holds (R3(4) → R2(4) → R1(4) in FIG. 6).
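The ordering step can be sketched as a sort followed by pointer linking. The sketch assumes, in line with the R3(4), R2(4), R1(4) example, that vertical writing flows right to left and horizontal writing flows left to right on even levels. The dictionary keys and the function name `link_siblings` are hypothetical.

```python
def link_siblings(blocks, level, writing):
    """Order sibling blocks along the text flow and chain them with a 'next' entry.
    `writing` is "vertical" or "horizontal"; returns the block names in order."""
    if level % 2 == 1:
        ordered = sorted(blocks, key=lambda b: b["top"])                 # top to bottom
    elif writing == "vertical":
        ordered = sorted(blocks, key=lambda b: b["left"], reverse=True)  # right to left
    else:
        ordered = sorted(blocks, key=lambda b: b["left"])                # left to right
    for current, following in zip(ordered, ordered[1:] + [None]):
        current["next"] = following["name"] if following else None
    return [b["name"] for b in ordered]


lines = [
    {"name": "R1(4)", "left": 0, "top": 5},
    {"name": "R2(4)", "left": 10, "top": 5},
    {"name": "R3(4)", "left": 20, "top": 5},
]
print(link_siblings(lines, 4, "vertical"))  # ['R3(4)', 'R2(4)', 'R1(4)']
```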

Next, a new block is generated or an attribute is added to the parent block. In the example of FIG. 4, the three blocks R1(4), R2(4), and R3(4) all have the classification name "character line" (Si, i = 5, 6, 7 in the figure), so their parent block R2(3) is itself given the classification name "text" (T2 in the figure). At this point, block information such as the division level Ln, the number of child blocks, and the inter-child-block arrangement attribute is set and updated. In the case of FIG. 4, this completes the structuring of division level 4.
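The rule applied to the parent block can be sketched as follows: if every child is a character line, the parent becomes a text block; otherwise it is treated here as a virtual block. The function name `summarize_children` and the string labels are assumptions, and treating every other mixture as "virtual" is a simplification made for this example.

```python
from collections import Counter


def summarize_children(child_classes):
    """Return (classification for the parent, counts of child classification names)."""
    counts = Counter(child_classes)
    if counts and set(counts) == {"character line"}:
        return "text", counts
    return "virtual", counts


print(summarize_children(["character line"] * 3)[0])               # text
print(summarize_children(["pixel representation", "text"])[0])    # virtual
```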

Next, the blocks at division level 3 (blocks R1(3) and R2(3) in FIG. 6) and their parent block R1(2) are taken out. Division level 3 also contains the character areas shown as black dots in FIG. 6, but because the classification name of their parent blocks is "character line", they are not taken out.

The structuring conditions described above are checked in the same way for the two blocks R1(3) and R2(3) at division level 3.

In this case, blocks R1(3) (already given the classification name "pixel representation") and R2(3) (classification name "text") are linked by a top-bottom relationship pointer, and their parent is given the classification name "virtual block" (symbol M1 in FIG. 6). Next, block information is updated and set: text (symbol T) and pixel representation (symbol F) are recorded in the child block classification name group, and the top-bottom relationship is recorded as the inter-child-block arrangement attribute.

This completes the structuring of division level 3.

Next, the blocks at division level 2 (R1(2), R2(2), R3(2), R4(2), and R5(2) in FIG. 6) and their parent block R1(1) are taken out. As before, the structuring conditions described above are checked, and left-right relationship pointers are attached in the order R5(2) through R1(2). Next, their classification names are examined in turn (in this case, the four character lines S1, S2, S3, and S4, and virtual block M1, which contains a text block and a pixel representation block in a top-bottom relationship).

In this case, the four character lines can be structured as text. Moreover, because their parent block R1(1) contains virtual block M1 in addition to the four character lines, a new block with the classification name "text" (T1), shown as a rectangle in FIG. 6, is generated as a new area. Next, the left-right relationship pointer between character line block S4 and virtual block M1 is cut (that is, the left-right relationship pointer of character line block S4 is set to NULL), and the address of virtual block M1 is placed in the left-right relationship pointer of text block T1.

Further, the address of the newly generated text block T1 is stored in the parent area pointer shown in FIG. 5 of each of the four character line blocks S1, S2, S3, and S4, and the address of block S1, as the first child area, is stored in the child area pointer shown in FIG. 5 of text block T1. Next, the division level attribute of text block T1 is set to division level 2, the same as that of its child blocks, the four character lines S1, S2, ..., S4, and the other attribute values described above are also set. Then the blocks in a left-right relationship are taken out in order starting from the newly generated text block T1 (in this case, text block T1 and virtual block M1), and the structuring conditions described above are examined in the same way. In this case, block R1(1) is given the classification name representing a virtual block (M2 in the figure). The child block classification name group of virtual block M2 shown in FIG. 5 stores the union of the contents of virtual block M1 (for a virtual block, the classification names it contains; in FIG. 6, pixel representation F1 and text T2) and text block T1, that is, two text entries T and one pixel representation entry F.

By repeating the same operations, the document structure shown in FIG. 6 is generated and the attribute values shown in FIG. 5 are determined for each block.
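As an illustration only, the pointer updates described above for text block T1 can be sketched in Python. The `Block` class, its field names and the helpers `link` and `group_as_text` are assumptions made for this sketch and do not appear in the specification; object references stand in for the addresses held in the FIG. 5 pointers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(eq=False)
class Block:
    name: str
    kind: str                                # "S" line, "T" text, "F" pixel representation, "M" virtual
    level: int                               # division level
    parent: Optional["Block"] = None         # parent area pointer
    first_child: Optional["Block"] = None    # child area pointer
    next_sibling: Optional["Block"] = None   # left-right (or vertical) relation pointer


def link(parent: Block, children: List[Block]) -> None:
    """Chain the children as siblings under parent, in the given order."""
    parent.first_child = children[0]
    for child, nxt in zip(children, children[1:] + [None]):
        child.parent, child.next_sibling = parent, nxt


def group_as_text(lines: List[Block], name: str) -> Block:
    """Wrap a run of consecutive sibling character lines in a new text block."""
    parent, first, last = lines[0].parent, lines[0], lines[-1]
    text = Block(name, "T", level=first.level, parent=parent, first_child=first)
    text.next_sibling = last.next_sibling    # the text block takes over the outgoing link
    last.next_sibling = None                 # cut the link (NULL in the description)
    for line in lines:
        line.parent = text                   # each line now points at the new block
    if parent.first_child is first:
        parent.first_child = text
    else:                                    # re-point the predecessor at the text block
        prev = parent.first_child
        while prev.next_sibling is not first:
            prev = prev.next_sibling
        prev.next_sibling = text
    return text
```

Linking S1 to S4 and M1 under R1(1) and then calling `group_as_text` on the four lines leaves T1 linked to M1 and the last line's sibling pointer empty, which mirrors the state described for FIG. 6.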

Next, the area extraction method set out in claim 1 is explained using the automatically generated document structure of FIG. 6 (which corresponds to the document image of FIG. 1). The area extraction means of claim 1 searches a tree structure such as that shown in FIG. 6 and extracts one or more desired blocks in a predetermined order. Two cases are described below as concrete examples. First, consider the case, common in books and the like, in which the text areas indicated by text blocks T1 and T2 in FIG. 1 are read in sequence as running text.

In this case, the area extraction means performs a depth-first search, following the pointers in turn and checking the classification name and the intra-block classification name group held as attributes of each block shown in FIG. 6. In this way the character line blocks S1, S2, S3, S4, S5, S6, S7 of FIG. 6 can easily be extracted in the order in which the text should be read.

Because each block records its intra-block classification name group, the depth-first search of a block can be abandoned as soon as it is clear that the block does not contain the desired character line blocks, so the result is obtained efficiently. Needless to say, the character images of the desired character line blocks (S1, S2, ..., S7), or their recognition by conventional character recognition, are readily obtained from the result produced by the area extraction means.
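The pruned depth-first search described above can be sketched as follows. This is a minimal sketch, not the patented implementation: the `Node` class, the `inside` field (standing in for the intra-block classification name group) and the tree built by `figure6` are assumptions, the latter being a reconstruction of FIG. 6 from the text.

```python
from dataclasses import dataclass, field
from typing import FrozenSet, Iterator, List


@dataclass
class Node:
    name: str
    kind: str                                     # "S", "T", "F" or "M"
    children: List["Node"] = field(default_factory=list)
    inside: FrozenSet[str] = frozenset()          # kinds found anywhere below this node


def annotate(node: Node) -> FrozenSet[str]:
    """Fill `inside` bottom-up; return the kinds at or below this node."""
    kinds = set()
    for child in node.children:
        kinds |= annotate(child)
    node.inside = frozenset(kinds)
    return node.inside | {node.kind}


def find(node: Node, wanted: str, visited: List[str]) -> Iterator[Node]:
    """Depth-first search that skips subtrees which cannot contain `wanted`."""
    visited.append(node.name)                     # record which blocks were examined
    if node.kind == wanted:
        yield node
    elif wanted in node.inside:
        for child in node.children:
            yield from find(child, wanted, visited)


def figure6() -> Node:
    """Assumed reconstruction of the FIG. 6 tree."""
    line = lambda i: Node(f"S{i}", "S")
    t1 = Node("T1", "T", [line(i) for i in range(1, 5)])
    t2 = Node("T2", "T", [line(i) for i in range(5, 8)])
    m1 = Node("M1", "M", [Node("F1", "F"), t2])
    root = Node("M2", "M", [t1, m1])
    annotate(root)
    return root
```

Searching this tree for kind `"S"` yields the lines in reading order, while searching for `"F"` examines T1 and T2 only at their own level and never descends into their character lines.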

Next, consider the case in which only the area indicated by text block T2 (three character lines), located below the pixel representation block F1 in FIG. 1, is to be read.

In this case, the area extraction means examines the attributes of each block while performing a depth-first search over the blocks of the tree structure shown in FIG. 6, exactly as when reading the text areas of the document image described above. The only difference is that it first searches for the key pixel representation block F1.

That is, when a depth-first search is performed on the tree structure shown in FIG. 6, text block T1 is found first. Since text block T1 contains no pixel representation block, the search below T1 is abandoned and virtual block M1 is examined next. The intra-block classification name group of virtual block M1 contains a pixel representation block, so the search continues and pixel representation block F1 is detected. In this way the character line blocks S5, S6, S7 located below pixel representation block F1 can easily be extracted and their characters read. It is equally easy, for example, to extract only the adjacent character line block S4 to the right of pixel representation block F1.
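A hedged sketch of this key-block search follows. It assumes that each block's children are listed in their arrangement order (top to bottom inside M1), uses a hypothetical `Node` class, and omits the pruning by classification name group for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    kind: str                                     # "S", "T", "F" or "M"
    children: List["Node"] = field(default_factory=list)


def lines_of(node: Node) -> List[str]:
    """Names of all character lines at or below node, in reading order."""
    if node.kind == "S":
        return [node.name]
    return [name for child in node.children for name in lines_of(child)]


def lines_below_key(node: Node, key_kind: str = "F") -> Optional[List[str]]:
    """Find the first block holding a key child; return the lines of the siblings after it."""
    for i, child in enumerate(node.children):
        if child.kind == key_kind:
            return [n for sib in node.children[i + 1:] for n in lines_of(sib)]
    for child in node.children:                   # no key block here: search deeper
        found = lines_below_key(child, key_kind)
        if found is not None:
            return found
    return None
```

On a tree shaped like FIG. 6, the function locates F1 inside M1 and returns the lines of the block that follows it, which corresponds to reading only S5, S6 and S7.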

In FIG. 1 there is only one pixel representation block F1. Needless to say, when there are several, the key blocks can be found by supplying the approximate position and size of the area when searching and by also checking the position and size attribute values of each block.

FIG. 7 shows an example of the result of area division applied to the horizontally written document image of FIG. 2.

Like the result in FIG. 4, the area division result shown in FIG. 7 can be obtained with a conventional technique such as that described in the aforementioned "Structure Analysis of Page Images Based on Split Detection Method" by the same applicant. Character areas are omitted from the figure.

FIG. 8 shows an example of a document structure description generated from the area division result of FIG. 7. The automatic generation of the document structure in FIG. 8 is an example that uses, as structuring conditions, the classification names between blocks, the inter-block distance and the estimated character pitch shown in FIG. 5. The case of FIG. 8 can be handled by the same processing as that of FIG. 6; the difference is that in FIG. 8 the text block T1 shown in FIG. 2 can be decomposed by the structuring conditions into two text blocks T2 and T3. The virtual block M1 shown in FIG. 8 stores, as its intra-block classification name group, classification names indicating two horizontally written texts, and, as its inter-block arrangement attribute, information indicating a left-right relationship. In FIG. 8, because the document is written horizontally, the left-right arrangement of character line blocks, text blocks and so on is taken in left-to-right order.

It is now shown that a desired area can be extracted by searching the tree structure shown in FIG. 8. As an example, consider extracting the characters in text blocks T5 and T6 of FIG. 2 in a predetermined order and converting them to a character code string by conventional character recognition. First, the two-column block consisting of text areas T5 and T6 (virtual block M1 in the figure) is searched for. For example, searching for a block (x φ y) that contains blocks x and y in a left-right relationship x φ y, where x and y are each required to have a child block arrangement attribute indicating a vertical relationship, detects virtual block M1 of FIG. 8 as the block (x φ y).
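The two-column query can be illustrated with a small sketch. The `Node` class and its `arrangement` field (standing in for the child block arrangement attribute) are assumptions for illustration; the condition checked is the one stated above, namely two adjacent children in a left-right relationship whose own children are each arranged vertically.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    name: str
    arrangement: str = ""          # arrangement of the children: "vertical", "horizontal" or ""
    children: List["Node"] = field(default_factory=list)


def find_two_column(node: Node) -> Optional[Node]:
    """Return the first block holding two side-by-side children that are each stacked vertically."""
    side_by_side = node.arrangement == "horizontal" and any(
        a.arrangement == "vertical" and b.arrangement == "vertical"
        for a, b in zip(node.children, node.children[1:])
    )
    if side_by_side:
        return node
    for child in node.children:    # depth-first descent
        hit = find_two_column(child)
        if hit is not None:
            return hit
    return None
```

Position and size attributes, where available, could be added to the condition in the same way as the arrangement check.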

Next, a depth-first search starting from virtual block M1 extracts the character line blocks S6 ... S15 in turn, and for each of these character line blocks the character blocks are extracted in turn and character recognition is performed.

When searching for the block representing the two-column layout above, only the child block arrangement attribute was used as the condition on blocks x and y. Where attributes such as the position and size shown in FIG. 5 are also available, they may be used as well.

Similarly, to find the two character lines St, Ss in text block T1 of FIG. 8, one can, for example, search for a block z satisfying z & (x φ y) (where x and y satisfy the conditions given above and z is a text block); this detects the text block z of FIG. 8.

Since the aim here is to find the character line blocks S-, Ss, that is, the two character line blocks located above the two columns shown in FIG. 2, it then suffices to find blocks z1, z2 satisfying z18z2 (where z1 and z2 are character lines contained in block z, and z2 is the lowest block).

As the above explanation shows, when a tree structure such as that of FIG. 8 is searched to extract desired blocks, the areas to be extracted may be defined in advance in a description language in terms of block attributes and relative arrangement, and the tree structure searched according to that definition. Alternatively, the processing procedure may be programmed as described above and the tree structure searched accordingly.

As described above, the area to be extracted by the document image analysis method of the first invention of the present application can be obtained by searching the blocks of the input document image expressed in the tree structure shown in FIG. 8. Although the search has been described as a depth-first search, a breadth-first search may be used instead.

FIGS. 9, 10 and 11 are diagrams for explaining the document image analysis method of the second invention of the present application. FIG. 9 is an example of a horizontally written document image. In FIG. 9, the hatched rectangle denotes a pixel representation area such as a figure or table, and the circles denote character areas. Applying the area division method described with reference to FIG. 3 to the analysis target area P yields the character line blocks Si (i = 1, ..., 6) and the pixel representation block F shown in the figure. Here, S1 is a character line containing characters such as a figure annotation or caption, and S2 ... S6 are ordinary text. Each character in a character line block is cut out one at a time, for example by the character separation method described above, and is delimited as shown by the dotted lines in the figure.

The blank areas indicated by C11, C16, C24, C31, C53 and C61 in the figure are obtained by comparing the positions of the first and last character blocks with the start and end positions of the character line block, and are treated as blank blocks. Next, when the document structure of FIG. 9 is generated automatically in the manner shown for FIG. 8, the result is as shown in FIG. 10. FIG. 10 is an example that uses, as structuring conditions, the classification names between blocks, the inter-block distance and the estimated character pitch shown in FIG. 5.

Text is normally given a logical structure, such as paragraphs, by means of sections, indentation and the like. Dividing text areas into these units is useful both when converting a document image into a character code string and when extracting a desired document area.

One characteristic of headings, captions, chapter titles and the like is that a blank block exists at the start of the character line, at its end, or at both.

Therefore, when text block T is decomposed according to the blank blocks C24, C31, C53 and C61 in its character lines S2, S3, ..., S6, it is found, as shown in FIG. 11, to consist of three blocks: character line S2, a paragraph block U made up of character lines S3, S4 and S5, and character line S6.
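The decomposition of text block T can be sketched as below. This is an assumption-laden illustration: the specification does not state the exact grouping rule, so the sketch assumes that a blank at the start of a line opens a new group and a blank at the end of a line closes the current one, and it assumes that C24 and C53 are end blanks while C31 and C61 are start blanks.

```python
from typing import List, NamedTuple


class Line(NamedTuple):
    name: str
    blank_start: bool = False   # blank block at the start of the line (e.g. indentation)
    blank_end: bool = False     # blank block at the end of the line (short last line)


def split_paragraphs(lines: List[Line]) -> List[List[str]]:
    """Group consecutive character lines using their start and end blank blocks."""
    groups: List[List[str]] = []
    current: List[str] = []
    for line in lines:
        if line.blank_start and current:      # an indented line opens a new group
            groups.append(current)
            current = []
        current.append(line.name)
        if line.blank_end:                    # a short line closes the group
            groups.append(current)
            current = []
    if current:
        groups.append(current)
    return groups
```

Under these assumptions the five lines S2 to S6 fall into the three groups described for FIG. 11.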

In addition, character line S1 lies above text block T and below the pixel representation block F, and has the two blank blocks C11 and C16 at its two ends, so it can be identified as a character line block representing the caption of pixel representation block F.

By extracting in advance, in this way, the properties of the character lines within a text block, the grouping of several character line blocks, and the logical properties of the character line blocks themselves, it becomes easy to search for the area to be extracted, as described above, using a tree structure such as that shown in FIG. 11.

FIG. 12 is a logical block diagram showing an embodiment of the first invention of the present application.

In the figure, 1 is a document image memory that stores a document image as quantized image information, and 2 is an area dividing unit. As explained with reference to FIG. 3, the area dividing unit 2 divides the document image in the document image memory 1 from global areas into local areas while preserving the vertical and left-right arrangement relationships, and sequentially stores the block information obtained in the course of area division, as shown in FIG. 4 or FIG. 5, in the structured data storage unit 4. Of the block information output from the area dividing unit 2, the character blocks that are child areas of a character line block are converted by the character separating unit 15 into area information for individual characters and stored in the structured data storage unit 4.

The area dividing unit 2 also determines from the area division result whether the document image is written vertically or horizontally, and stores the result in the vertical/horizontal information storage unit 3.

The pointer values in each piece of block information stored in the structured data storage unit 4 (parent area pointer, child area pointer, vertical relation pointer, left-right relation pointer and so on) are expressed as the relative positions of the blocks within the structured data storage unit 4.

In addition, each block stores its own relative position within the structured data storage unit 4 in a relative position pointer, as one of its attribute values.

The relative position counter 11 holds, as its initial value, the relative position that follows the last block stored in the structured data storage unit 4 by the area dividing unit 2, and is incremented by one block-information unit each time its value is read.
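The relative-position scheme can be sketched as a store in which pointers are integer positions rather than addresses. The class name `StructuredStore`, the dictionary layout and the `self_pos` key are hypothetical; only the behaviour of the counter (reading it advances it by one block) is taken from the description.

```python
from typing import Dict, List


class StructuredStore:
    """Blocks are kept by relative position; pointers are positions in this store."""

    def __init__(self, initial_blocks: List[dict]) -> None:
        self.blocks: Dict[int, dict] = {}
        self.counter = 0                      # relative position counter
        for block in initial_blocks:          # blocks produced by area division
            self.add(block)

    def read_counter(self) -> int:
        """Return the next free position; reading advances the counter by one block."""
        pos = self.counter
        self.counter += 1
        return pos

    def add(self, block: dict) -> int:
        """Store a block and record its own relative position as an attribute."""
        pos = self.read_counter()
        self.blocks[pos] = {**block, "self_pos": pos}
        return pos
```

After the initial blocks are loaded, the counter points just past the last of them, so a newly generated block receives the next position.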

Next, as shown in FIG. 6, the block information control unit 5 extracts from the structured data storage unit 4, as a pair, the blocks to be structured that have the largest division level and their parent block (the block whose division level is one less). At the same time, using the division level and the vertical/horizontal writing information in the vertical/horizontal information storage unit 3, it stores in the vertical relation and left-right relation pointers of each block the relative position of the block linked to it. It also stores in the child area pointer of the parent block the relative position of the block that is searched first from the parent block, and stores the parent block and the blocks to be structured in the target data storage unit 6.

Next, the block information control unit 5 reads the parent block and the blocks to be structured from the target data storage unit 6 and transfers them to the structuring inspection unit 8.

As explained with reference to FIGS. 6 and 8, the structuring inspection unit 8 inspects the attribute values of the blocks to be structured in turn, in accordance with the contents of the condition storage unit 7, which stores the structuring conditions on those attribute values. If the inspection shows that a new block must be generated, the parent block and the blocks to be structured are transferred to the block generation unit 9. The block generation unit 9 generates a new block according to the attributes of those blocks, among the blocks to be structured, that are to be grouped together. As explained with reference to FIGS. 6 and 8, it stores the relative position of the first block to be grouped in the child area pointer of the new block, and stores the relative position of the newly generated block in the parent area pointer of each block to be grouped. It then cuts the vertical or left-right relation pointers between the blocks to be grouped and the blocks that have not yet been inspected.

The value of the relative position counter 11 is read out and set in the relative position pointer of the newly generated block.

The parent area pointer of the newly generated block stores the relative position of the parent block of the blocks that are grouped together.

Next, the block generation unit 9 writes the blocks that have been grouped as described above into their predetermined relative positions in the structured data storage unit 4, and sends the newly generated block, the remaining uninspected blocks to be structured, and their parent block to the structuring inspection unit 8 again, where the processing described above is repeated. When the structuring inspection unit 8, inspecting the blocks to be structured in turn, finds that no new block needs to be generated, it transfers the blocks to be structured and their parent block to the attribute determination unit 10.

The attribute determination unit 10 determines the attribute values of the parent block as explained with reference to FIG. 6, and writes the parent block and the structured blocks into their predetermined relative positions in the structured data storage unit 4.

Next, if pairs of a parent block and blocks to be structured at the division level just processed remain in the target data storage unit 6, the block information control unit 5 transfers those pairs in turn to the structuring inspection unit 8, as described above.

If, on the other hand, the target data storage unit 6 is empty, the division level is reduced by one, the pairs of blocks to be structured and their parent block are extracted from the structured data storage unit 4, and the same operations are repeated until the blocks at division level 1 have been extracted as structuring targets. As a result, the structure of the document stored in the document image memory 1 is automatically generated in the structured data storage unit 4 as a tree structure.
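The level-by-level order in which parent and child pairs are taken up can be sketched as follows. The flat `(name, division level, parent name)` row layout is a hypothetical stand-in for the structured data storage unit, and the sketch only enumerates the pairs; it does not model the new blocks that structuring itself generates.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Optional, Tuple

# Hypothetical flat layout: (block name, division level, parent name or None).
BlockRow = Tuple[str, int, Optional[str]]


def structuring_order(blocks: List[BlockRow]) -> Iterator[Tuple[str, List[str]]]:
    """Yield (parent, children) pairs from the deepest division level upwards."""
    by_level: Dict[int, Dict[str, List[str]]] = defaultdict(lambda: defaultdict(list))
    for name, level, parent in blocks:
        if parent is not None:                 # the root has no parent to pair with
            by_level[level][parent].append(name)
    for level in sorted(by_level, reverse=True):   # largest division level first
        for parent, children in by_level[level].items():
            yield parent, children
```

Each yielded pair corresponds to one parent block together with the blocks to be structured beneath it.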

With the document structure generation unit 20 described above (shown by a dotted line in the figure), the division results obtained by the area dividing unit 2 and the character separating unit 15 as a tree structure of the document image are re-examined bottom-up, and the structure of the document is stored in the structured data storage unit 4.

Next, as shown in FIGS. 6 and 8, the area search unit 13 reads from the area definition storage unit 12 the previously stored conditions on the attributes and arrangement of the one or more blocks to be extracted. In accordance with those conditions, it searches the tree structure stored in the structured data storage unit 4, which represents the arrangement structure between the blocks in the document image, and stores the one or more blocks to be extracted in the extraction result storage unit 14 in a predetermined order.

FIG. 13 is a logical block diagram showing an embodiment of the second invention of the present application. In the figure, the document image memory 1, the area dividing unit 2 and the character separating unit 15 have the functions explained with reference to FIG. 12. For each piece of block information obtained through the area dividing unit 2 and the character separating unit 15, the blank block inspection unit 21 selects, from among its child blocks (for example, the character blocks of a character line block), the position of the first child block and the position of the last child block, calculates the differences between those positions and the start and end positions of the parent block, and thereby obtains the sizes of the blanks at the start and end of the block. A blank block is detected by comparing the blank size with a predetermined threshold or with a character-size measure such as the character pitch.
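A minimal sketch of this blank detection follows. The one-dimensional `Span` representation and the rule that a gap of at least one character pitch counts as a blank block are assumptions; the specification only says that the gap is compared with a threshold or a character-size measure.

```python
from typing import List, Tuple

Span = Tuple[int, int]   # (start, end) along the writing direction, in pixels


def edge_blanks(line: Span, chars: List[Span], char_pitch: int) -> Tuple[bool, bool]:
    """Return (blank at start, blank at end) for one character line."""
    first_start = min(start for start, _ in chars)   # start of the first character block
    last_end = max(end for _, end in chars)          # end of the last character block
    start_gap = first_start - line[0]                # blank size at the start of the line
    end_gap = line[1] - last_end                     # blank size at the end of the line
    return start_gap >= char_pitch, end_gap >= char_pitch
```

For a line whose characters begin one pitch in and stop well short of the line end, both flags are set, which is the pattern described for a caption line such as S1.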

When a blank block is detected, it is added as a piece of block information in its own right. In addition, the location of the blank block (start or end) and its position and size are stored as attribute values of the parent block in which the blank block was detected.

The structured data storage unit 4 therefore has the same function as in FIG. 12, except that it also stores the blank block information.

The function by which the vertical/horizontal writing information stored in the vertical/horizontal information storage unit 3 is read out and the document structure generation unit 20 generates a document arrangement structure such as those shown in FIGS. 6 and 8 to 10 is the same as in the case of FIG. 12.

As explained with reference to FIG. 11, the text block inspection unit 22 sequentially extracts from the structured data storage unit 4 the information on a text block and on the character line blocks that are its children. For every character line block making up the text block, it inspects in turn the location and size of the blank blocks stored as attributes of the character line block, and detects the character line blocks that make up a paragraph block.

When the text block inspection unit 22 detects the character line blocks making up a paragraph block, the block update unit 23 generates the paragraph block and derives its attribute values from the information on those character line blocks. The relative position of the text block is stored in the parent area pointer of the paragraph block. The contents of the relative position counter in the document structure generation unit 20 shown in FIG. 12, which give the relative position of the newly generated paragraph block, are read out and set in the parent area pointers of the character line blocks of the paragraph block.

The child area pointer of the paragraph block stores the relative position of the first character line block making it up. In addition, the vertical or left-right relation pointers of the first and last of the character line blocks making up the paragraph are cut, and the vertical or left-right relation pointer of the paragraph block is connected to that of the adjacent character line block or paragraph block.

Next, once the text block inspection unit 22 and the block update unit 23 have detected and generated the paragraph blocks for one text block and the character line blocks making it up, the block update unit 23 writes the resulting data into the predetermined relative positions in the structured data storage unit 4.

By carrying out the above operations for every text block stored in the structured data storage unit 4 and the character line blocks making it up, each text block is structured into the paragraph blocks that make it up.

Needless to say, blank block information is also included among the attribute values of character line blocks that did not form a paragraph, and can be used together with the character pitch, which represents the nature of the character line.

The area definition storage unit 12, the area search unit 13 and the extraction result storage unit 14 are the same as in the invention of claim 1 explained with reference to FIG. 12.

(Effects of the Invention) As explained above, according to the document image analysis method of the present invention, the elements making up a wide variety of document images and the arrangement structure between those elements are generated automatically and hierarchically in accordance with the inclusion relationships of blocks and their relative vertical or left-right positions. At the same time, the flow of the text and logical structures such as paragraphs are obtained. By searching this arrangement structure for areas defined according to various purposes, one or more areas can be extracted stably and easily.

[Brief Description of the Drawings]

FIG. 1 and FIG. 2 are diagrams showing the structure of document images written vertically and horizontally, respectively.
FIG. 3 is a diagram showing an example of a document area division method in which area division is performed hierarchically while the division direction alternates between vertical and horizontal.
FIGS. 4 and 7 are diagrams showing examples of the area division results for the document images of FIGS. 1 and 2, respectively.
FIG. 5 is a diagram showing an example of block information.
FIGS. 6 and 8 are diagrams showing examples in which a document structure is automatically generated by the first invention of the present application from the area division results shown in FIGS. 4 and 7, respectively.
FIG. 9 is a diagram showing an example of a document image.
FIGS. 10 and 11 are diagrams explaining the document image analysis method of the second invention of the present application as applied to the document image of FIG. 9.
FIG. 12 is a logical block diagram showing an embodiment of the first invention of the present application.
FIG. 13 is a logical block diagram showing an embodiment of the second invention of the present application.

In the figures, 1 is a document image memory, 2 is an area dividing unit, 3 is a vertical/horizontal information storage unit, 4 is a structured data storage unit, 5 is a block information control unit, 6 is a target data storage unit, 7 is a condition storage unit, 8 is a structuring inspection unit, 9 is a block generation unit, 10 is an attribute determination unit, 11 is a relative position counter, 12 is an area definition storage unit, 13 is an area search unit, 14 is an extraction result storage unit, 15 is a character separation unit, 20 is a document structure generation unit, 21 is a blank block inspection unit, 22 is a text block inspection unit, and 23 is a block update unit.

Claims

[Claims]

(1) A document image analysis method comprising: means for decomposing a document image into element areas such as character lines and characters; document structure generating means for, when structuring one or more of the element areas as blocks, hierarchically determining and storing the attributes of the blocks and the arrangement structure between the blocks according to the inclusion relationship and the vertical or horizontal arrangement relationship of each block; and area search means for searching, based on the attributes of the blocks and the hierarchical arrangement structure between the blocks, for an area to be extracted in the document image or for one or more blocks constituting the area to be extracted.

(2) A document image analysis method comprising: means for decomposing a document image into element areas such as character lines and characters; document structure generating means for, when structuring one or more of the element areas as blocks, hierarchically determining and storing the attributes of the blocks and the arrangement structure between the blocks according to the inclusion relationship and the vertical or horizontal arrangement relationship of each block; means for extracting, as a blank block, a blank having a predetermined position and size within a character line block; document structure updating means for checking, based on the blank blocks, the shape of each character line block in the blocks containing a plurality of character lines generated by the document structure generating means, and updating the arrangement structure in the document image; and area search means for searching, from the hierarchical arrangement structure, for an area to be extracted in the document image or for one or more blocks constituting the area to be extracted.