JP3380024B2

JP3380024B2 - Document image processing method and document image processing apparatus

Info

Publication number: JP3380024B2
Application number: JP35204493A
Authority: JP
Inventors: ケイ．チルトンジェフ; エフ．カレンジョン; 公一江尻
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1993-03-09
Filing date: 1993-12-30
Publication date: 2003-02-24
Anticipated expiration: 2018-02-24
Also published as: JPH06259524A

Description

Detailed Description of the Invention

【０００１】[0001]

[Field of Industrial Application] The present invention relates to a document image processing method and a document image processing apparatus.

【０００２】[0002]

[Prior Art] Conventionally, when a document is reproduced, the document image is reproduced directly and no transformation is applied to it. Consequently, annotations or comments added to the printed (reproduced) text can only be written in the limited space between the lines of text, which makes them difficult both to write and to read.

【０００３】[0003]

[Problems to Be Solved by the Invention] One way to avoid this problem is to enlarge the document, but the copy must then be made on paper larger than the original document, which causes inconveniences such as bulkier paperwork. Moreover, although enlarging the document does expand the white space, it has the drawback that the image data itself is enlarged as well.

[0004] An object of the present invention is to provide a document image processing method and apparatus capable of physically increasing the white space between text objects in a document image without changing the size of the text in the document image or the relative spatial arrangement of the text within a text region.

【０００５】[0005]

[Means for Solving the Problems and Operation] According to one aspect of the present invention, rectangles bounding the extents of the document image data are first determined, and these rectangles are mapped onto a virtual sheet of dimensions C₁X × C₂Y (C₁, C₂: constants). The newly mapped rectangles have the same dimensions as before but are placed farther apart from one another.

[0006] According to another aspect of the present invention, sequence numbers are assigned to the groups of rectangles organized into text blocks, and the newly mapped rectangles are placed on physical document image pages so that the logical reading order is preserved. If a newly mapped block overlaps a physical page boundary, the block is split in two at that page boundary and all subsequent blocks are renumbered.

[0007] According to still another aspect of the present invention, blocks are mapped to physical pages by first mapping the picture image regions. Text blocks are then mapped in order, starting from the top left of the page. Each subsequent block is mapped immediately to the right of the last mapped block.

[0008] In this way, the present invention can expand the white space while preserving the size of the image data, the paper size, and the logical reading order. Other features and advantages of the invention are described below.

【０００９】[0009]

[Embodiments] FIG. 1 is a diagram outlining the white space expansion process according to an embodiment of the present invention. In this process, a digital representation of the physical document is first created as a bit string (step 2). A document scanner can be used for this purpose. Each bit corresponds to one pixel of the scanned image and indicates whether or not that position is white space. To minimize the total amount of system memory required, the image pixel data obtained in step 2 is compressed in step 4.

[0010] In most documents, words and images are separated from one another, both vertically and horizontally, by regions of white space. An image of a word, text, or picture can therefore be viewed as runs of predominantly black pixels separated by expanses of white pixels. The boundaries between these two kinds of region define the edges of the rectangular region containing the word or image. In step 6, the operations needed to reclassify the document image pixel data using these relationships are performed.

[0011] The rectangular regions of text, noise, and pictures also each have distinctive characteristics. For example, the rectangular regions of words tend to be smaller, and to have smaller aspect ratios, than those of picture images. In step 8, these relationships are used to identify text blocks from the full set of rectangular regions in the document.

[0012] At the end of step 8, coordinates are determined for each rectangular region of the document image. Given a document image of X pixels horizontally and Y pixels vertically, step 10 maps the X × Y image to a C₁X × C₂Y image by multiplying the coordinate values of each rectangular region. The resulting image is C₁C₂ times larger, but the sizes of the original rectangular regions are unchanged.
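The mapping of step 10 can be illustrated with a minimal Python sketch. It assumes that the "coordinate values" being multiplied are those of each rectangle's top-left corner, and that width and height are carried over unchanged; the names `Rect` and `expand_white_space` are hypothetical and do not appear in the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x: int       # left edge, in pixels
    y: int       # top edge, in pixels
    width: int
    height: int


def expand_white_space(rects, c1, c2):
    """Map rectangles from an X x Y image onto a C1*X x C2*Y virtual sheet.

    Assumption: only the origin of each rectangle is scaled, so the
    rectangle keeps its size while the gaps between rectangles grow.
    """
    return [Rect(round(r.x * c1), round(r.y * c2), r.width, r.height)
            for r in rects]


words = [Rect(10, 20, 30, 8), Rect(50, 20, 30, 8)]
mapped = expand_white_space(words, c1=2, c2=1.5)
# Horizontal gap between the two words before and after the mapping.
gap_before = words[1].x - (words[0].x + words[0].width)
gap_after = mapped[1].x - (mapped[0].x + mapped[0].width)
```

In this example the two word rectangles keep their 30 × 8 size, while the horizontal gap between them grows from 10 to 50 pixels.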

[0013] At this point the white space has been expanded but, as noted above, the text may no longer fit on a single page. The following steps 12 to 14 are optional steps that reorder or preserve the logical reading order of the document. In step 12, the rectangular regions are grouped in logical order into image blocks and text blocks. In step 14, the ordered rectangular regions are mapped to one or more suitable pages so as to guarantee the logical reading order.

[0014] FIG. 2 is a diagram showing a hardware configuration suitable for reproducing images and expanding white space according to a preferred embodiment of the present invention. This hardware configuration may be provided in a stand-alone device such as a copier, or may be part of a local area network or a wide area network. In the document image processing system, a scanning device 20 is connected to a CPU 22. Other scanning devices known to those skilled in the art may be used as the scanning device 20. The CPU 22 executes commands that process the document image data in accordance with the present invention.

[0015] A memory 24 is also connected to the CPU 22. Any form of random access memory can be used as the memory 24. Document image data is stored in the memory 24. The memory 24 may comprise separate memories 24a and 24b that function as a source memory and a target memory, respectively, as described later. A ROM (not shown) may also be connected to the CPU 22 to store the data and the commands executed by the CPU 22.

[0016] A user interface 26 is also connected to the CPU 22. Through this user interface 26, the operator of the document image processing apparatus can specify desired characteristics of the finished document, and the CPU 22 can then execute the subset of commands needed to carry out processing based on those characteristics. For example, the operator can specify that a number of collated copies be made, or that the finished document be sent to a particular destination.

[0017] FIGS. 3 to 5 are flowcharts of the white space expansion method according to the present invention. In the flowchart of FIG. 3, the document is first scanned using a CCD scanner or other scanning device 1 to create a digital representation of the document image (step 201). As a result, each scan line is represented digitally as a bit string whose bits correspond to the pixels of the image. In the preferred embodiment, scan lines run from left to right. However, the scan direction may be reconfigured to match the direction in which the scanned document is normally read. For example, Arabic text may be scanned from right to left.

[0018] The bitmap representation is then compressed (step 202). This data compression is used to extract rectangular regions, as described later. The compression technique of this embodiment reduces the actual amount of data used to represent the document to 1/4, and the amount of data to be processed to 1/32. The technique uses a logical OR operation to combine four horizontal scan lines into one compressed scan line. The number of scan lines chosen, four, is based on experience: combining four scan lines makes it possible to process documents with type as small as about 6 point. Other numbers of scan lines may also be chosen.

[0019] The compression technique of step 202 comprises two processes: vertical compression and horizontal compression. If one or more black pixels exist at the same position in four vertically adjacent scan lines, the corresponding pixel of the single compressed scan line is represented as a black pixel. If no black pixel exists at that position in the group of four scan lines, the corresponding pixel of the compressed scan line is represented as a white pixel.

[0020] FIG. 6 is a diagram showing scan line compression in this embodiment. FIG. 6 shows four scan lines 300 to 303 from the original uncompressed bitmap representation. Two bytes are given for each of the scan lines 300 to 303 (304, 305; 306, 307; 308, 309; 310, 311). The bytes resulting from vertical compression (312, 313) are also shown. Each of the bytes 304 to 311, and each of the resulting bytes 312 and 313, consists of 8 bits.

[0021] After the image data has been compressed vertically, it is compressed horizontally. As shown on line 314 of FIG. 6, if a segment, that is, one byte 312 or 313, contains black pixel data, the segment is represented as one byte of all-black pixel data; if a segment contains no black pixel data, it is represented as one byte of all-white pixel data. This compression technique reduces the amount of system memory required. Compression techniques other than the one described above may be used in the present invention, or the data compression may be omitted altogether.
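The two compression passes can be sketched in Python as follows. This is an illustration under stated assumptions rather than the embodiment's actual code: each scan line is assumed to be a `bytes` object of equal length in which a set bit is a black pixel, and the function names are hypothetical. On one plausible reading, OR-ing four lines into one accounts for the 1/4 reduction in data, and treating each byte as a single black-or-white unit accounts for the further factor of 8 that yields 1/32.

```python
from functools import reduce


def compress_vertical(scanlines, group=4):
    """OR each group of `group` scan lines into one compressed scan line.

    Assumption: every scan line is a bytes object of the same length,
    and a 1 bit means a black pixel.
    """
    compressed = []
    for i in range(0, len(scanlines), group):
        chunk = scanlines[i:i + group]
        # zip(*chunk) yields the bytes at the same position in each line.
        compressed.append(bytes(reduce(lambda a, b: a | b, column)
                                for column in zip(*chunk)))
    return compressed


def compress_horizontal(scanline):
    """Make each byte all black (0xFF) if it holds any black pixel, else all white."""
    return bytes(0xFF if value else 0x00 for value in scanline)


lines = [
    bytes([0b00010000, 0x00]),
    bytes([0x00, 0x00]),
    bytes([0b00000001, 0x00]),
    bytes([0x00, 0x00]),
]
vertical = compress_vertical(lines)
horizontal = [compress_horizontal(line) for line in vertical]
```

Here the first byte position contains black pixels in two of the four lines, so it survives both passes as black, while the second position stays white.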

[0022] After the image has been compressed in step 202 of FIG. 3, steps 203 and 204 analyze the document image to organize the data into rectangular regions. These rectangular regions delimit text, picture images, and noise. To obtain the rectangular region data, step 203 first extracts the run length of each block of consecutive black pixels by a run length extraction process.

[0023] In the definition of a run length, the first element identifies the position of the black pixel at which a white-to-black transition occurs, and the second element identifies the position at which a black-to-white transition occurs. Each compressed scan line can have one or more run lengths. Searching along a compressed scan line for series of consecutive black pixels yields, for each compressed scan line, a set of "run lengths" made up of run length records. With a logical value of "0" representing a white pixel, the process first identifies a series of consecutive black pixels by examining the byte values of the scan line for a non-zero logical value. The position of the first black pixel found in this way is set as the start value of the run length. The next white pixel is then found by examining the subsequent byte values of the scan line for a logical value of "0". This pixel is set as the end value of the run length. All the run lengths of a scan line are extracted in this manner. Once all run lengths for a given scan line have been extracted, the set is labeled as the set of run lengths present in the n-th scan line from the top of the compressed bitmap representation.
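The search described above can be sketched in Python. The sketch assumes a compressed scan line given as a `bytes` object with one byte per segment (zero for white, non-zero for black) and records each run as a `(start, end)` pair in which `end` is the position of the first white byte after the run; a run that reaches the end of the line is closed at the line length, a case the text does not address. The function name is hypothetical.

```python
def extract_runs(scanline):
    """Return (start, end) pairs for each run of consecutive non-zero bytes.

    `start` is the first black position; `end` is the position of the
    next white byte (assumption: or the line length if the run reaches
    the end of the scan line).
    """
    runs = []
    start = None
    for pos, value in enumerate(scanline):
        if value and start is None:
            start = pos                 # white-to-black transition
        elif not value and start is not None:
            runs.append((start, pos))   # black-to-white transition
            start = None
    if start is not None:
        runs.append((start, len(scanline)))
    return runs


runs = extract_runs(bytes([0x00, 0xFF, 0xFF, 0x00, 0x00, 0xFF]))
```

For this six-byte line the sketch finds two runs, one covering positions 1 and 2 and one covering position 5.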

[0024] The classification of run lengths is used for the initial classification of the rectangular regions extracted from them. The classification rules for a document scanned at a resolution of 300 dpi are based on heuristic data and are as follows.
1. If run length ≤ 2 pixels, assign the run length type SHORT.
2. If run length > 60 pixels, assign the run length type LONG.
3. If 60 pixels ≥ run length > 2 pixels, assign the run length type MEDIUM.
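The three rules translate directly into a small Python function. The thresholds 2 and 60 are the 300 dpi values given in the text; the function and constant names are hypothetical, and the length is assumed to be passed in as a pixel count.

```python
SHORT_MAX = 2   # a run of at most 2 pixels is SHORT (300 dpi value from the text)
LONG_MIN = 60   # a run of more than 60 pixels is LONG (300 dpi value from the text)


def classify_run_length(length):
    """Return the run length type for a run of `length` pixels."""
    if length <= SHORT_MAX:
        return "SHORT"
    if length > LONG_MIN:
        return "LONG"
    return "MEDIUM"


types = [classify_run_length(n) for n in (1, 2, 3, 60, 61)]
```

The boundary values behave as the rules state: 2 is still SHORT, 60 is still MEDIUM, and 61 is the first LONG length.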

[0025] FIG. 7 is a diagram showing a scan line of pixels and a run length. Note that, for simplicity, FIG. 7 represents each byte by its corresponding pixel value. For example, pixel 405 represents a non-zero byte value (that is, a black pixel), while pixel 406 represents a byte value of "0" (that is, a white pixel). Scan line 401 has a portion 402 made up of a series of consecutive black pixels. The address of pixel 403 marks the beginning of the run length, and the address of pixel 404 marks its end. Supposing that pixel 403 is at address "312", that pixel 404 is at address "440", and that the threshold for a long run length is "100", the resulting run length record has a start value of "312", an end value of "440", and a run length flag value indicating a long run length.

[0026] As run lengths are extracted from the compressed scan lines, rectangular regions representing the features of the document are constructed. These rectangular regions represent the boundaries of contiguous black pixels in both the horizontal and vertical directions of the document image. Run lengths are one-dimensional, whereas rectangular regions are two-dimensional.

[0027] At any point in the processing, only two sets of records, describing the run lengths of two compressed scan lines, are in use and held in memory. The first set of records describes the run lengths of the current scan line, and the second set describes the run lengths of the previous scan line. The previous scan line information is used for extracting rectangular regions. Before a new set of compressed scan line records is read, the current set of records is copied to the memory location that holds the records of the previous scan line. The records describing the new scan line are then read into the memory location that holds the records of the current scan line and are processed accordingly.

[0028] The relationship between the current compressed scan line and the previous compressed scan line determines whether a run length of the current compressed scan line is assigned to an existing rectangular region or starts a new one. When the first compressed scan line is processed, each run length defines one rectangular region. As successive compressed scan lines are processed, each run length is either associated with a rectangular region from the preceding scan line or used to define the boundary of a new rectangular region. A run length is associated with an existing rectangular region if some portion of the run length lies within the boundary of that region. A rectangular region is complete, and is not extended further, when all the pixels of the current compressed scan line adjacent to the region are white. In other words, a rectangular region is complete when no run length of the current compressed scan line lies within its boundary. A new rectangular region is created when a run length does not lie entirely within the boundary of a rectangular region. This scheme can produce overlapping rectangular regions. Such overlapping rectangular regions are processed further in a subsequent step.
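A simplified Python sketch of this scan-line-by-scan-line construction follows. It is deliberately narrower than the scheme in the text: a run is attached to the first open rectangle it overlaps horizontally, and a rectangle is widened to cover every run attached to it, so the sketch never produces the overlapping rectangles mentioned above. The dictionary layout and the function name are hypothetical.

```python
def build_rectangles(run_sets):
    """Build bounding rectangles from per-scan-line run lengths.

    run_sets[n] is the list of (start, end) runs of compressed scan
    line n, with `end` exclusive. Simplifying assumption: a run joins
    the first open rectangle whose horizontal extent it overlaps.
    """
    open_rects, finished = [], []
    for y, runs in enumerate(run_sets):
        touched = set()
        for start, end in runs:
            for i, rect in enumerate(open_rects):
                if start < rect["x1"] and end > rect["x0"]:
                    # The run continues an existing rectangle: grow it.
                    rect["x0"] = min(rect["x0"], start)
                    rect["x1"] = max(rect["x1"], end)
                    rect["y1"] = y + 1
                    touched.add(i)
                    break
            else:
                # No open rectangle overlaps this run: start a new one.
                open_rects.append({"x0": start, "x1": end, "y0": y, "y1": y + 1})
                touched.add(len(open_rects) - 1)
        # Rectangles with no run on this scan line are complete.
        finished += [r for i, r in enumerate(open_rects) if i not in touched]
        open_rects = [r for i, r in enumerate(open_rects) if i in touched]
    return finished + open_rects


rects = sorted(build_rectangles([[(0, 3), (10, 12)], [(1, 4)], [(20, 22)]]),
               key=lambda r: r["x0"])
```

Only the rectangles still open and the runs of the current line are consulted at each step, which mirrors the two-scan-line memory footprint described in the preceding paragraphs.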

[0029] FIG. 8 shows how rectangular regions are constructed from the run lengths of the current compressed scan line and those of the previous compressed scan line. In FIG. 8, the previous compressed scan line 501 and the current compressed scan line 502 each contain a plurality of run lengths. The previous compressed scan line 501 has run lengths 503 to 509, and the current compressed scan line 502 has run lengths 510 to 517. As shown in FIG. 8, scan lines 501 and 502 are bit-aligned, so the leftmost bit of scan line 501 corresponds to the leftmost bit of scan line 502.

[0030] FIG. 8 also shows previously defined rectangular regions 520 to 525. For run lengths 510 to 517 to be added to an existing rectangular region, they must stand in the following relationship to run lengths 503 to 509: the start point of a run length in the current scan line must be continuous with a run length in the previous scan line. For example, in FIG. 8 the start point of run length 510 is continuous with run lengths 503 and 504 of compressed scan line 501, so run length 510 is added to the existing rectangular region 520. Run length 515, by contrast, is not continuous with any run length in the previous scan line 501, so a new rectangular region 522 is created. Likewise, run length 508 in the previous scan line 501 is not continuous with any run length in the current scan line, so rectangular region 524 is complete.

[0031] In FIG. 8, the run lengths of scan line 501 are added to existing rectangular regions as follows: run lengths 503 and 504 are added to rectangular region 520, run length 505 to rectangular region 521, run lengths 506 and 507 to rectangular region 523, and run length 509 to rectangular region 525. Run length 508 creates rectangular region 524. For scan line 502, run lengths 510 and 511 are added to rectangular region 520, run lengths 512 to 514 to rectangular region 521, run length 516 to rectangular region 523, and run length 517 to rectangular region 525. As noted above, run length 515 creates the new rectangular region 522.

[0032] While the rectangular regions are being constructed, a count is kept of the different types of run length contained in each rectangular region. Once a rectangular region has been defined, it is given an initial classification as one of four types: vertical line "VL", horizontal line "HL", picture image "IMG", or unknown "UNKNOWN". The following general rules are used to classify rectangular regions.
Rule 1. If "the run lengths are all of type LONG" and "the height of the rectangular region is less than or equal to the run length type SHORT threshold", classify the rectangular region as type HL.
Rule 2. If "the run lengths are all of type SHORT" and "the height of the rectangular region is greater than the run length type SHORT threshold", classify the rectangular region as type VL.
Rule 3. If "the run lengths are of type LONG", or "the width of the rectangular region is less than or equal to the run length type LONG threshold", and "the height of the rectangular region is greater than the image height rectangular region threshold", classify the rectangular region as type IMG.
Rule 4. Classify all remaining rectangular regions as UNKNOWN.

[0033] Rule 1 identifies horizontal lines, rule 2 identifies vertical lines, rule 3 identifies picture image rectangular regions, and rule 4 provides the default "unknown" classification. For a 300 dpi document image, the run length type SHORT threshold was set to 2 pixels and the image height rectangular region threshold to 82 pixels. These classification rules are derived from known parameters of typical documents. The parameters may be changed according to the resolution of the bitmap representation of the document, and/or tuned to a particular document by analyzing the distribution of rectangular region sizes.
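The four initial classification rules can be sketched in Python using the 300 dpi thresholds quoted in the text (2 and 82 pixels, plus the 60-pixel LONG threshold from the run length rules). Two points are assumptions, not statements of the specification: rule 3 is read as "(all runs LONG, or width at most the LONG threshold) and height above the image height threshold", and the rules are applied in order with the first match winning. All names are hypothetical.

```python
SHORT_THRESHOLD = 2     # run length type SHORT threshold at 300 dpi
LONG_THRESHOLD = 60     # run length type LONG threshold at 300 dpi
IMAGE_HEIGHT = 82       # image height rectangular region threshold at 300 dpi


def classify_rectangle(run_types, width, height):
    """Initial classification of a rectangle as HL, VL, IMG, or UNKNOWN.

    run_types is the non-empty list of run length types ("SHORT",
    "MEDIUM", "LONG") collected while the rectangle was built.
    """
    all_long = all(t == "LONG" for t in run_types)
    all_short = all(t == "SHORT" for t in run_types)
    if all_long and height <= SHORT_THRESHOLD:          # rule 1
        return "HL"
    if all_short and height > SHORT_THRESHOLD:          # rule 2
        return "VL"
    # Rule 3, with the grouping of "or"/"and" assumed as described above.
    if (all_long or width <= LONG_THRESHOLD) and height > IMAGE_HEIGHT:
        return "IMG"
    return "UNKNOWN"                                    # rule 4


labels = [
    classify_rectangle(["LONG", "LONG"], width=200, height=2),
    classify_rectangle(["SHORT"] * 10, width=2, height=10),
    classify_rectangle(["MEDIUM", "LONG"], width=50, height=100),
    classify_rectangle(["MEDIUM"], width=20, height=8),
]
```

A different grouping of the conditions in rule 3 would change only the third branch, so the sketch is easy to adapt if another reading is intended.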

[0034] At the end of step 204 of FIG. 3, a list of rectangular regions describing all the basic objects of the document image has been created and given an initial classification. At this stage, some text is misclassified as vertical line or UNKNOWN segments. For example, the characters "I", "l", and "1" are often misclassified.

[0035] Accordingly, in step 205 the initial classification results obtained as described above are tested and refined using the following rules.
Rule 1': Corrects misclassified text, namely a 1 (one), l (el), or I (eye) classified as a vertical line. If "the type of the rectangular region is VL" and "the height of the rectangular region is less than the height threshold for 'unknown' rectangular regions", classify the rectangular region as type UNKNOWN.
Rule 2': Reassigns rectangular regions on the basis of font size. Rectangular regions larger than the maximum font size are treated as pictures. If (twice the height of the rectangular region) is greater than (the image height threshold), classify the rectangular region as type IMG.
Rule 3': Assigns image regions on the premise that "words" tend to be long rather than tall. If ((four times the height of the rectangular region) + (the width of the rectangular region)) is greater than (four times the image height threshold), classify the rectangular region as type IMG.
Rule 4': Provides a criterion for defining horizontal lines, on the premise that long horizontal lines tend to be thicker than the short horizontal lines that divide text blocks or columns. If the ratio of (the width of the rectangular region) to (four times the height of the rectangular region) is greater than (the horizontal line width threshold), classify the rectangular region as type HL.
Rule 5': Provides a criterion for distinguishing horizontal lines from long lines of small-font (for example, 6-point) text. If the ratio of (the width of the rectangular region) to (the height of the rectangular region) is greater than (the threshold for the width-to-height ratio of a horizontal line), classify the rectangular region as type HL.

[0036] For a 300 dpi image, the thresholds above are as follows: the height threshold for "unknown" rectangular regions is 5, the image height threshold is 82, the horizontal line width threshold is 77, and the horizontal line width-to-height ratio threshold is 15.

[0037] At the end of step 205, picture image areas, vertical lines, and horizontal lines have been accurately classified. Everything else in the document is classified as UNKNOWN. These UNKNOWN rectangular regions represent either text in the document or text-like noise. The processing of the present invention may optionally also include skew detection and skew correction, merging of rectangular regions, and/or block ordering, as described in U.S. Patent Application No. 07/864,423, entitled "Segmentation of Text Picture and Lines of a Document Image" and filed on April 6, 1992. Skew detection and skew correction are useful for defining column edges. In one embodiment of the present invention, however, the processing of step 207 of FIG. 3 detects rectangular regions that can be merged into blocks classifiable as lines of text. Such merged blocks are classified as type "CHAR". A group of merged UNKNOWN rectangular regions is therefore a text block.

[0038] In this embodiment, two rectangular areas are merged when they are of the same type and lie within predetermined horizontal and vertical merge thresholds. Rectangular areas of type "IMG" are not merged. The merge thresholds are determined by the image resolution and the average height of the rectangular areas. The following table shows the merge thresholds for a document with a resolution of 300 dpi.

【００３９】[0039]

【表１】 [Table 1]
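A minimal sketch of the merge test follows. It assumes that "within the merge thresholds" means the horizontal and vertical gaps between the two rectangles do not exceed the respective thresholds. The names `Rect`, `gap`, `can_merge`, and `merge` are hypothetical, and the threshold values passed in the usage note are placeholders, not the values of Table 1.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    kind: str   # "UNKNOWN", "CHAR", "HL", "VL" or "IMG"
    x0: int
    y0: int
    x1: int
    y1: int


def gap(lo_a: int, hi_a: int, lo_b: int, hi_b: int) -> int:
    """Distance between two intervals, 0 when they touch or overlap."""
    return max(0, max(lo_a, lo_b) - min(hi_a, hi_b))


def can_merge(a: Rect, b: Rect, h_thresh: int, v_thresh: int) -> bool:
    # Only rectangles of the same type merge, and IMG never merges.
    if a.kind != b.kind or a.kind == "IMG":
        return False
    return (gap(a.x0, a.x1, b.x0, b.x1) <= h_thresh
            and gap(a.y0, a.y1, b.y0, b.y1) <= v_thresh)


def merge(a: Rect, b: Rect) -> Rect:
    """Bounding box of two mergeable rectangles."""
    return Rect(a.kind, min(a.x0, b.x0), min(a.y0, b.y0),
                max(a.x1, b.x1), max(a.y1, b.y1))
```

For example, with a placeholder horizontal threshold of 3 and vertical threshold of 0, two UNKNOWN rectangles on the same line that are 2 units apart would merge into one bounding box, while the same pair 20 units apart would not.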

[0040] The merge process consists of a horizontal merge and a vertical merge. The horizontal merge must not merge text rectangular areas that are adjacent but belong to different columns. When skew angle detection is included as part of step 206, it can be used to define column edges. During the horizontal merge, rectangular areas that are classified as "UNKNOWN" and whose boundary length is smaller than the noise length threshold are removed as noise. The remaining horizontally merged rectangular areas are classified as text (that is, type "CHAR").

[0041] The vertical merge takes the horizontal lines of text and merges them in the vertical direction. If a text rectangular area overlaps a rectangular area of type "IMG", "HL", or "VL", processing of those rectangular areas is deferred to a later stage.

[0042] When a rectangular area lies inside a picture rectangular area, the merge is performed with the same merge parameters as described above. Rectangular areas that overlap an "IMG" rectangular area are not merged. The next step, step 209, logically orders the text blocks. Step 209 is optional, but performing it makes it easier to map the text blocks onto the page in logical reading order after the white space has been expanded. Step 209 may be performed either before or after the white space is expanded. Like steps 201 to 210, step 209 is described in U.S. patent application Ser. No. 07/864,423, filed April 6, 1992 under the title "Segmentation of Text Picture and Lines of a Document Image".

[0043] The ordering method of step 209 counts the number of text blocks "above" and "to the left of" the block under consideration. Here, "above" and "to the left" refer to the geometric layout of the document.

[0044] The block ordering method of this embodiment is described with reference to FIGS. 9 to 11. FIG. 9 is a flowchart of the process for determining the logical order of the text blocks of a document image. FIG. 10 shows a document image, and FIG. 11 shows the result value table used to calculate the logical order of the text blocks. Referring to FIG. 9, a "TAU" value is first assigned to each text block of the document (step 1101). "TAU" values are assigned as consecutive numbers starting from "1", from the top of the document image to the bottom; that is, they are assigned in the same order in which the text blocks were constructed. FIG. 10 shows a document image 1200 with text blocks 1201 to 1207, together with a picture rectangular area 1208. Note that no "TAU" value is assigned to the picture rectangular area 1208. When the document is examined from left to right and from top to bottom, the upper-leftmost text block is block 1201, so it is assigned a "TAU" value of "1". The next text block is block 1202, so it is assigned a "TAU" value of "2". This continues until text block 1207 is assigned a "TAU" value of "7". The result value table 1220 of FIG. 11 lists the "TAU" value of each of the text blocks 1201 to 1207 in row 1210. The "TAU" values, which order the blocks, are called the geometric order of the text blocks.

[0045] Once the "TAU" values are set, the next step generates a "MU" value for each text block (step 1102). The "MU" value is the first value used in determining the logical order of the text blocks. The "MU" value of a given block is the total number of blocks above or to the left of it, counting the given block itself. Row 1211 of the result value table 1220 in FIG. 11 shows the "MU" values obtained for the text blocks of document image 1200. For example, the "MU" value of text block 1204 is "4", because text blocks 1203, 1201, and 1202 are above or to the left of text block 1204. The "MU" values give a logical order for left-to-right, top-to-bottom ordering.

[0046] In general, the "MU" value weights the top-to-bottom geometric order by the position of the block from left to right on the page, which yields values giving an order from top/left to bottom/right. However, when text blocks are stacked from top to bottom in the document, the bottom of a text column should be reached before moving on to the next text block to the right. This priority is achieved by calculating a "PSI" value for each text block.

[0047] Referring to FIG. 9, a "PSI" value is calculated for each text block of the document image by counting the text blocks to the left of that block (step 1103). As noted above, the "PSI" value provides a means of ordering text when the text blocks are laid out in columns. Row 1212 of the result value table 1220 in FIG. 11 shows the resulting "PSI" values. For example, text block 1205 has a "PSI" value of "5", because blocks 1201, 1203, 1204, 1206, and 1207 are to its left.

[0048] Referring again to FIG. 9, the next step, step 1104, weights the "PSI" value by multiplying the original "PSI" value by the number of text blocks. This weighting makes the logical ordering of the text blocks more accurate. Row 1213 of the result value table 1220 shows the weighted "PSI" values.

[0049] To determine the final logical order, the weighted "PSI" value of each text block is added to its "MU" value (step 1105). The resulting value is a very good approximation of the logical order of the text blocks in the document. In FIG. 11, the result is shown in row 1214 of table 1220. Referring to FIG. 9, it is then determined whether any of the weighted "PSI" and "MU" values are equal (step 1106). If any values are equal, several text blocks have the same logical order value, and the block order does not provide useful information. If no "MU" values are equal, the text block ordering process ends. If some "MU" values are equal, the geometric order of the text blocks is taken into account (step 1107). As described above, the geometric order is the "TAU" value calculated first.

[0050] Referring again to FIG. 11, it is clear that no two text blocks have the same "MU" value. The resulting order of the text blocks of document image 1200 is therefore 1203, 1201, 1204, 1206, 1207, 1202, 1205. This document has a column-type format, as found in newspapers and magazines. Once the blocks are ordered, the text blocks can be passed to a character recognition program so that the characters on the document page are logically ordered.

[0051] Finally, for the above criteria to identify a text block as "above" or "to the left", the text block must lie clearly above or to the left of every position of the text block in question. The "to the left" criterion, however, is satisfied if at least half of a text block lies, in the horizontal direction, to the left of the text block in question.
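The ordering procedure of steps 1101 to 1107 can be sketched as follows. This is an illustration under stated assumptions, not the method of FIG. 9 itself: `Block`, `is_above`, `is_left`, and `logical_order` are hypothetical names, "above" is taken to mean that the other block ends before the current one starts, "to the left" is taken to mean that the horizontal midpoint of the other block is at or left of the current block's left edge, and ties in the final value are broken by the "TAU" value. The layout used in the usage note is invented and is not the layout of FIG. 10.

```python
from typing import NamedTuple


class Block(NamedTuple):
    name: str
    x0: int
    y0: int
    x1: int
    y1: int


def is_above(other: Block, block: Block) -> bool:
    # Entirely above: other ends before block starts.
    return other.y1 <= block.y0


def is_left(other: Block, block: Block) -> bool:
    # At least half of other's width lies left of block's left edge.
    return (other.x0 + other.x1) / 2 <= block.x0


def logical_order(blocks: list[Block]) -> list[str]:
    n = len(blocks)
    # TAU: geometric order, top to bottom then left to right.
    geometric = sorted(blocks, key=lambda b: (b.y0, b.x0))
    tau = {b.name: i + 1 for i, b in enumerate(geometric)}

    score = {}
    for b in blocks:
        others = [o for o in blocks if o is not b]
        # MU counts blocks above or left, plus the block itself.
        mu = 1 + sum(is_above(o, b) or is_left(o, b) for o in others)
        # PSI counts only the blocks to the left.
        psi = sum(is_left(o, b) for o in others)
        score[b.name] = mu + psi * n  # weighted PSI added to MU

    # Equal scores fall back to the geometric order (TAU).
    return sorted(score, key=lambda name: (score[name], tau[name]))
```

For a two-column page with blocks A and B stacked in the left column and C and D stacked in the right column, the "MU" values alone are 1, 3, 3, 4, so B and C tie. Adding the weighted "PSI" values (0, 0, 8, 8) gives 1, 3, 11, 12, which reads the left column to the bottom before moving right.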

[0052] Once block ordering is complete, the segmented text block information must be put into a usable form, for example for character recognition. If the image representation was compressed, the actual coordinate addresses corresponding to the blocks must be produced. This is done by rescaling the image representation back to the dimensions of the original, uncompressed document image.

[0053] Once the coordinates of the text blocks of the document image have been determined, at least by steps 201 to 204 and optionally by steps 205 to 211, steps 212 to 1290 of FIGS. 4 and 5 expand the white space.

[0054] Because the horizontal dimension X and the vertical dimension Y of the document image are known from the preceding steps, step 212 maps the document image onto a larger virtual sheet of dimensions C₁X × C₂Y. The memory 24 stores the coordinate values (x₀, y₀); (xₙ, yₘ) of each rectangular area defined by steps 201 to 204 and, optionally, steps 205 to 211. More specifically, step 212 maps each rectangular area from the X × Y document space into the C₁X × C₂Y space by the following transformation.

【００５５】[0055]

【数１】 [Equation 1]
$$x_{02} \to c_1 x_0, \qquad x_{n2} \to c_1 x_0 + (x_n - x_0), \qquad y_{02} \to c_2 y_0, \qquad y_{m2} \to c_2 y_0 + (y_m - y_0)$$

[0056] In general form, the above transformation is expressed as follows.

【００５７】[0057]

【数２】 [Equation 2]
$$x_i \to c_1 x_0 + (x_i - x_0), \qquad y_j \to c_2 y_0 + (y_j - y_0)$$

[0058] In other words, rectangular areas bounding the image data in the document image are identified; each rectangular area is assigned a first set of coordinate values (x₀, y₀), (xₙ, yₘ); and each rectangular area is then assigned a second set of coordinate values (x₀₂, y₀₂), (xₙ₂, yₘ₂) given by x₀₂ → c₁x₀, xₙ₂ → c₁x₀ + (xₙ − x₀), y₀₂ → c₂y₀, yₘ₂ → c₂y₀ + (yₘ − y₀), thereby changing the white space in the document image.
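The coordinate transformation can be sketched in a few lines. The function name `expand` and the tuple layout `(x0, y0, xn, ym)` are assumptions made for illustration, and the rectangles in the usage note are invented values.

```python
def expand(rect, c1, c2):
    """Map a rectangle (x0, y0, xn, ym) into the c1*X by c2*Y space.

    Only the origin is scaled; the width and height are carried over
    unchanged, so the rectangle keeps its size.
    """
    x0, y0, xn, ym = rect
    new_x0 = c1 * x0
    new_y0 = c2 * y0
    return (new_x0, new_y0, new_x0 + (xn - x0), new_y0 + (ym - y0))
```

With c1 = 1 and c2 = 2, two stacked rectangles (10, 20, 50, 30) and (10, 40, 50, 50) map to (10, 40, 50, 50) and (10, 80, 50, 90). Each rectangle is still 40 wide and 10 high, while the vertical gap between them grows from 10 to 30.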

[0059] FIGS. 12 and 13 illustrate the mapping process of step 212. Referring to FIG. 12, document image 1220 contains text rectangular areas 1222 to 1230 and a picture rectangular area 1232, and has dimensions X and Y. Document image 1240 of FIG. 13 is the document image produced by the coordinate mapping. Note that the dimensions of the rectangular areas themselves do not change in this transformation; the transformation from document image 1220 to document image 1240 expands only the white space between the text areas. In the example of FIG. 13, the constant C₁ is set to "1" and the constant C₂ is set to a value greater than "1", so the white space is expanded only in the vertical direction.

[0060] The processing of step 212 in FIG. 4 also produces sets of document image records. The first set of document records corresponds to the original document rectangular area data and is stored in an object memory, for example memory 24a of FIG. 2. The second set of document records corresponds to the document rectangular areas after the transformation of step 212 and their associated coordinate data, and is stored in a target memory, for example memory 24b of FIG. 2.

[0061] At the end of step 212, the data is stored in the target memory, and image 1240 is converted into an actual physical form with dimensions of at least C₁X × C₂Y. The physical representation of the expanded document image may, however, exceed the size of the available physical medium. For example, if the operator wants to print the document with expanded white space on paper of the same size as the original document, the newly mapped document no longer fits. In addition, the expanded document may have columns that are out of order.

[0062] FIG. 14 shows a document image 1250 with two text columns 1252 and 1254. FIG. 15 shows the result of applying white space expansion to document image 1250. In FIG. 15, column 1252 is placed on separate pages as columns 1252a and 1252b. Column 1254 is likewise placed on separate pages as columns 1254a and 1254b, while its positional (order) relationship to the adjacent columns 1252a and 1252b is maintained. Because the column information is ordered in this way, the reader can read column 1252a on page 1256, then column 1252b on page 1258, and then return to page 1256 to read column 1254a. Thus, in the preferred embodiment of the present invention, the logical reading order and the continuity of the text columns are preserved even when the white space is expanded. This processing is shown in steps 1280 to 1290 of FIG. 5.

[0063] In step 209 of FIG. 3, the target areas are ordered sequentially. A target area defines the coordinates of the set of rectangular areas contained within its boundary. In step 212 of FIG. 4, the coordinate values of the text rectangular areas and the target areas are mapped to new coordinates, and in step 1270 of FIG. 4 the end coordinate of each target area is compared with the coordinate of the physical page boundary. The physical page boundary need not coincide with the actual page edge. For example, if a 1 cm margin is wanted on a sheet of A4 paper, the physical boundary can be set at the coordinate 1 cm from the bottom edge of the page.

[0064] If the coordinate value of the physical area exceeds the coordinate value of the physical page boundary, step 1272 splits the target area into two target areas at the boundary of the physical area. The end coordinate of the resulting first part of the target area then coincides with the boundary of the physical area, and the start coordinate of the second part of the target area coincides with the boundary of the physical area.

[0065] In this way, step 1272 produces a new list of target areas and their associated coordinate values. In step 1274, the newly created second target area is inserted into the list of target areas immediately after the first target area. The sequence numbers of all target areas that follow this second target area in the list are then incremented.
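Steps 1272 and 1274 can be sketched as a single pass over the ordered list. The function name `split_at_boundary` is hypothetical, each target area is reduced to its vertical extent `(top, bottom)` for brevity, and a single page boundary is assumed. The position in the returned list plays the role of the sequence number, so inserting the second part automatically shifts every later area by one.

```python
def split_at_boundary(regions, boundary):
    """Split every region that straddles the boundary.

    regions: list of (top, bottom) tuples in reading order.
    Returns a new list in which each straddling region is replaced by
    its two parts, the second placed directly after the first.
    """
    result = []
    for top, bottom in regions:
        if top < boundary < bottom:
            result.append((top, boundary))      # ends at the boundary
            result.append((boundary, bottom))   # starts at the boundary
        else:
            result.append((top, bottom))
    return result
```

For two columns that each run from 0 to 80 with a boundary at 50, the result is four areas: (0, 50), (50, 80), (0, 50), (50, 80). Numbering them with `enumerate(result, start=1)` gives the renumbered sequence.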

[0066] Because the target areas include picture image areas as well as text target areas, step 1276 maps the picture image areas to the new, expanded coordinates determined in step 212.

[0067] Initially, no text area has been mapped. To map a text area that has not yet been mapped, step 1278 determines the height and width of the area. Steps 1280 to 1282 then map the text areas in order, starting from the upper left of the page area. That is, if no previously mapped picture image area or text area occupies the position where the text area is to be mapped, the text area is mapped to that position and its position is fixed.

[0068] If, on the other hand, a picture image area or text area has already been mapped to the upper left of the document page, the current text area is mapped below or to the right of that area. Step 1284 attempts to map the current text area below the previously mapped area by checking that the end coordinate of the current text area does not exceed the page boundary. If it does not, the current text area is mapped below the previously mapped area, at the same X coordinate as the previously mapped area.

[0069] If the end coordinate of the current text area does exceed the page boundary, steps 1286 to 1288 map the current text area to the right of the previously mapped area. The X and Y coordinate values that the current area would have at this position are checked against the page boundary, and if the page boundary is not exceeded, step 1288 maps the current area to this position. When the current area is mapped to the right of an existing area, it is given a Y coordinate.

[0070] If the current area can be mapped neither below nor to the right of the previously mapped area, step 1290 determines that the document image page is full and moves on to a new page. Image areas are assigned to the new page in the same manner as described above.

[0071] Mapping the areas by the processing of steps 1270 to 1290 ensures that the logical reading order is maintained. An area is always mapped below or to the right of a previously mapped area, never to its left or above it, and this preserves the logical reading order.
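The placement logic of steps 1278 to 1290 can be sketched as a greedy loop. This is a simplified illustration: the function name `place_blocks` is hypothetical, the origin is assumed to be the upper left of the page, only the most recently placed block is consulted, a block placed to the right is assumed to take the Y coordinate of that previous block, and previously mapped picture areas are ignored.

```python
def place_blocks(sizes, page_w, page_h):
    """Place blocks in reading order: below, else right, else new page.

    sizes: (width, height) of each block in reading order.
    Returns one (page, x, y) tuple per block.
    """
    placed = []
    page = 0
    prev = None  # (x, y, w, h) of the last block placed on this page
    for w, h in sizes:
        if prev is None:
            x, y = 0, 0                   # first block: upper left
        else:
            px, py, pw, ph = prev
            if py + ph + h <= page_h and px + w <= page_w:
                x, y = px, py + ph        # below, same X
            elif px + pw + w <= page_w and py + h <= page_h:
                x, y = px + pw, py        # right, same Y
            else:
                page += 1                 # page is full
                x, y = 0, 0
        placed.append((page, x, y))
        prev = (x, y, w, h)
    return placed
```

Four blocks of size 10 by 80 on a page 20 wide and 100 high end up two per page: the second block does not fit below the first and goes to its right, the third fits neither below nor to the right and starts a new page, and the fourth goes to the right of the third.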

[0072] As a concrete example, a copying machine is described below.

[0073] FIG. 16 shows the first page 1299 of a multi-page document to be copied by a copying machine using the present invention. The document image of FIG. 16 contains two text columns 1301 and 1302. On the copying machine, the operator uses an input device to select the desired document features. In addition to the white space expansion feature, these include, for example, the number of copies and collation.

[0074] Once the desired reproduction features are selected, the CCD scanner scans page 1299 and captures it as a pixel image. FIG. 17 shows the image represented as a set of extracted rectangular areas.

[0075] Each rectangular area in FIG. 17 corresponds to a word, a series of words, or a picture. That is, each of the areas 1501 to 1503 corresponds directly to an area of FIG. 16. For example, area 1501 corresponds to the word "buildings" at the end of a column in FIG. 16.

[0076] FIG. 18 shows an example in which "UNKNOWN" type rectangular areas are merged to form text blocks and the resulting text blocks are ordered. The order is indicated by the integer in the upper left of each text block. In this example, the text blocks correspond to the two columns of the document image. Consequently, when the document image is reduced to a text file, for example by character recognition, the text appears in the file in the order shown.

[0077] FIG. 19 shows the mapping of the text blocks of FIG. 18 after the white space has been expanded. In this example, C₁ is "1" and C₂ is "2". Other values of C₁ and C₂ are possible, but the values "1" and "2" were found to be suitable for 8.5 × 11 inch (21.6 × 27.9 cm) paper. In this example, text block 1, originally at coordinates (0, 1), (10, 40), is mapped to coordinates (0, 0), (10, 80) and becomes block 2000.

[0078] FIG. 19 also shows the coordinates of the physical page boundary 2001 of the document image. As FIG. 19 shows, both text blocks straddle this boundary. Text blocks 2000 and 2002 are therefore split where they meet the page boundary, giving four text blocks 2004 to 2007. Because new blocks 2005 and 2007 are created, all blocks are renumbered as shown.

[0079] FIG. 20 shows the mapping of the text blocks onto physical pages. The first-numbered text block 2004 is mapped to the upper left of the first page, because there it does not interfere with any previously mapped picture image area.

[0080] Next, an attempt is made to map text block 2005 below the first text block. This second text block, however, would cross the physical page boundary and does not fit there. Block 2005 is therefore mapped to the right of block 2004, with the same Y coordinate as the previously mapped block 2004.

[0081] An attempt is then made to map block 2006 directly below block 2005. Here too, block 2006 would cross the physical page boundary and cannot be placed at that position. An attempt is therefore made to map block 2006 to the right of the previously mapped block 2005, starting at the same Y coordinate as block 2005. This mapping, however, would cross the horizontal physical page boundary, so block 2006 must be mapped onto a new page.

[0082] Page 2010 is the next physical page. Block 2006 is mapped to the upper left of page 2010. The same processing continues for block 2007. FIG. 20 shows the completed document.

[0083] A preferred embodiment of the present invention has been described above, but various modifications will be apparent to those skilled in the art. For example, the processing described above focuses on and operates on the black pixels of the image, but the same processing may instead be performed on the white pixels. Furthermore, the physical image may be a sheet of paper, a display monitor, or any other visible representation of the document image, and the term should be interpreted broadly.

【００８４】The specific examples above describe the present invention as applied to a copying machine, but the present invention is not limited to copying machines and can be applied to other devices and other uses.

【００８５】[0085]

[Effects of the Invention] As described above, according to the present invention, rectangular areas that define the boundaries of the image data in a document image are identified, and the rectangular areas are mapped into an area of C₁X × C₂Y so as to expand the white space in a document image having dimensions X and Y. The white space between the text objects of the document image can therefore be physically increased without changing the size of the text or the relative spatial arrangement of the text within a text area.
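
The coordinate relations recited in the claims (x02 = c1·x0, xn2 = c1·x0 + (xn − x0), y02 = c2·y0, ym2 = c2·y0 + (ym − y0)) can be illustrated with a short sketch. The function names `expand_rect` and `fits_page` are hypothetical and the sample numbers are made up for illustration; the point is only that the origin of each rectangle is scaled while its width and height are preserved.

```python
def expand_rect(x0, y0, xn, ym, c1, c2):
    """Return the second coordinate set (x02, y02, xn2, ym2) of a rectangle.

    Only the upper-left corner is scaled by (c1, c2); the width (xn - x0)
    and height (ym - y0) are carried over unchanged.
    """
    x02 = c1 * x0
    y02 = c2 * y0
    xn2 = x02 + (xn - x0)
    ym2 = y02 + (ym - y0)
    return (x02, y02, xn2, ym2)


def fits_page(rect, page_x, page_y):
    """Check whether a mapped rectangle lies within a page of size X by Y."""
    x02, y02, xn2, ym2 = rect
    return 0 <= x02 and 0 <= y02 and xn2 <= page_x and ym2 <= page_y


# Two text lines, each 10 units tall, separated by a gap of 10 units.
line_a = expand_rect(0, 10, 100, 20, 1, 2)
line_b = expand_rect(0, 30, 100, 40, 1, 2)
gap_after = line_b[1] - line_a[3]  # white space between the mapped lines
```

In this example both lines keep their height of 10 units, while the gap between them grows from 10 to 30 units, so only the white space changes.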

[Brief description of drawings]

FIG. 1 is a flowchart showing an embodiment of document image processing according to the present invention.

FIG. 2 is a diagram showing a hardware configuration suitable for reproducing an image and expanding white space according to the present invention.

FIG. 3 is a flowchart showing an example of white space expansion processing according to the present invention.

FIG. 4 is a flowchart showing an example of white space expansion processing according to the present invention.

FIG. 5 is a flowchart showing an example of white space expansion processing according to the present invention.

FIG. 6 is a diagram showing an example of scan line compression.

FIG. 7 is a diagram showing scan lines and run lengths.

FIG. 8 is a diagram for explaining how a rectangular area is formed from a previous compressed scan line and the current compressed scan line.

FIG. 9 is a flowchart for explaining text ordering.

FIG. 10 is a diagram showing an example of a document image.

FIG. 11 is a diagram showing a result value table used to calculate the logical order of blocks.

FIG. 12 is a diagram for explaining mapping processing according to the present invention.

FIG. 13 is a diagram for explaining mapping processing according to the present invention.

FIG. 14 is a diagram showing an example of a document image.

FIG. 15 is a diagram showing the result of performing white space expansion processing on the document image of FIG. 14.

FIG. 16 is a diagram showing an example of a document image.

FIG. 17 is a diagram showing the rectangular areas extracted from the document image of FIG. 16.

FIG. 18 is a diagram showing an example of ordered text blocks.

FIG. 19 is a diagram showing the mapping of the text blocks of FIG. 18 after the white space has been expanded.

FIG. 20 is a diagram showing the mapping of text blocks to physical pages.

[Explanation of symbols]

20 Scanning device
22 CPU
24 Memory
25 User interface

Continuation of the front page
(72) Inventor: Koichi Ejiri, c/o Ricoh Co., Ltd., 1-3-6 Nakamagome, Ota-ku, Tokyo
(56) References: JP-A-2-58158 (JP, A); JP-A-4-364584 (JP, A); JP-A-1-131966 (JP, A); JP-A-5-48872 (JP, A); JP-A-4-177976 (JP, A); JP-A-2-143379 (JP, A); JP-A-61-60066 (JP, A); JP-A-1-183784 (JP, A)
(58) Fields searched (Int. Cl.⁷, DB name): G06T 11/60; H04N 1/387; G06F 17/21

Claims

(57) [Claims]

1. A document image processing method comprising: identifying rectangular areas that define the boundaries of image data in a document image; assigning a first set of coordinate values $(x_0, y_0)$, $(x_n, y_m)$ to each of the rectangular areas; assigning to each of the rectangular areas a second set of coordinate values $(x_{02}, y_{02})$, $(x_{n2}, y_{m2})$ given by
$x_{02} = c_1 x_0$
$x_{n2} = c_1 x_0 + (x_n - x_0)$
$y_{02} = c_2 y_0$
$y_{m2} = c_2 y_0 + (y_m - y_0)$
so as to change the white space in the document image; checking whether each second set of coordinate values fits within a physical page having dimensions X, Y; dividing any image area whose second set of coordinate values exceeds the physical page into image areas that fit within the physical page; re-ordering the image areas, including the new image areas obtained by the division, to obtain second sequence numbers; and mapping the image areas to the physical page according to the second sequence numbers.

2. The document image processing method according to claim 1, wherein the document image has n image areas, the n image areas including pictorial image areas and text image areas, and wherein the mapping of the image areas comprises:
(a) mapping the pictorial image areas to the physical page;
(b) mapping the first text image area to the upper left part of the physical page;
(c) when a text image area $A_i$ fits within the physical page, mapping the text image area $A_i$ below the text image area $A_{i-1}$;
(d) when the text image area $A_i$ itself fits within the physical page but exceeds the boundary of the physical page in step (c), mapping the text image area $A_i$ to the portion of the physical page adjacent to the right edge of the text image area $A_{i-1}$; and
(e) when the text image area $A_i$ exceeds the boundary of the physical page in step (d), mapping the text image area to a new physical page.

3. A document image processing method comprising: supplying a medium representation consisting of a plurality of scan lines to run length extracting/classifying means; extracting run lengths from each scan line of the medium representation; creating run length records by classifying each run length, based on its length, as one of "short", "middle", and "long"; forming, from the run length information, rectangular areas each representing a part of the medium representation and having a set of coordinate values $(x_0, y_0, x_n, y_n)$; and assigning to each of the rectangular areas a second set of coordinate values $(x_{02}, y_{02})$, $(x_{n2}, y_{m2})$ given by
$x_{02} = c_1 x_0$
$x_{n2} = c_1 x_0 + (x_n - x_0)$
$y_{02} = c_2 y_0$
$y_{m2} = c_2 y_0 + (y_m - y_0)$.

4. The document image processing method according to claim 3, further comprising: classifying each of the rectangular areas as an "image", "vertical line", "horizontal line", or "unknown" type; merging the "unknown" type rectangular areas into at least one text block; assigning a sequence number to each of the text blocks; checking whether the second set of coordinate values of each image area fits within a physical page having dimensions X, Y; and dividing any image area whose second set of coordinate values exceeds the physical page into image areas that fit within the physical page.

5. The document image processing method according to claim 4, wherein the image areas, including the new image areas obtained by the division, are re-ordered to obtain second sequence numbers, and the image areas are mapped to the physical page according to the second sequence numbers.

6. The document image processing method according to claim 5, wherein the document image has n image areas, the n image areas including pictorial image areas and text image areas, and wherein the mapping of the image areas comprises:
(a) mapping the pictorial image areas to the physical page;
(b) mapping the first text image area to the upper left part of the physical page;
(c) when a text image area $A_i$ fits within the physical page, mapping the text image area $A_i$ below the text image area $A_{i-1}$;
(d) when the text image area $A_i$ itself fits within the physical page but exceeds the boundary of the physical page in step (c), mapping the text image area $A_i$ to the portion of the physical page adjacent to the right edge of the text image area $A_{i-1}$; and
(e) when the text image area $A_i$ exceeds the boundary of the physical page in step (d), mapping the text image area to a new physical page.

7. A document image processing apparatus comprising: means for identifying rectangular areas that define the boundaries of image data in a document image; means for assigning a first set of coordinate values $(x_0, y_0)$, $(x_n, y_m)$ to each of the rectangular areas; and means for assigning to each of the rectangular areas a second set of coordinate values $(x_{02}, y_{02})$, $(x_{n2}, y_{m2})$ given by
$x_{02} = c_1 x_0$
$x_{n2} = c_1 x_0 + (x_n - x_0)$
$y_{02} = c_2 y_0$
$y_{m2} = c_2 y_0 + (y_m - y_0)$,
wherein the apparatus is adapted to change the white space in the document image, check whether each second set of coordinate values fits within a physical page having dimensions X, Y, divide any image area whose second set of coordinate values exceeds the physical page into image areas that fit within the physical page, re-order the image areas, including the new image areas obtained by the division, to obtain second sequence numbers, and map the image areas to the physical page according to the second sequence numbers.

8. The document image processing apparatus according to claim 7, further comprising means for scanning the document image.

9. The document image processing apparatus according to claim 7, further comprising means for outputting a document image.