JP2000020726A

JP2000020726A - Image processor and its method

Info

Publication number: JP2000020726A
Application number: JP10191286A
Authority: JP
Inventors: Naoaki Kodaira; 直朗小平; Hiroaki Kubota; 浩明久保田
Original assignee: Toshiba Corp
Current assignee: Toshiba Corp
Priority date: 1998-07-07
Filing date: 1998-07-07
Publication date: 2000-01-21

Abstract

PROBLEM TO BE SOLVED: To make it possible to extract a character string area from within a photograph, chart, or similar area in a document image input from an image input device such as a scanner. SOLUTION: The image processor is provided with a character area extraction part 102 for extracting a character area from a document image input from an image input part 101 such as a scanner, a specific area extraction part 103 for extracting a specific area such as a photograph or a chart, a character string area extraction part 104 for extracting a character string area from the character areas within the specific area, and an image discrimination part 105 for discriminating the image in the specific area by using the extracted character string information.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

TECHNICAL FIELD OF THE INVENTION

The present invention relates to an image processing apparatus and method for conversion processing, such as conversion of density values, performed when filing or copying an image such as a document.

[0002]

DESCRIPTION OF THE RELATED ART

Techniques for capturing a document image as image data and then outputting it as a hard copy or saving it as an image file include those used for image capture in copying machines and personal computers, and those used in filing systems and databases.

[0003] When outputting a hard copy, image processing is sometimes applied to make the document image easier for the user to view. For example, when the document image contains characters, applying a high-pass filter emphasizes the character edges, yielding a crisp output with easily readable characters.

[0004] In the case of a photograph, applying a low-pass filter to reproduce smooth gradation eliminates graininess and yields a clean output. Even among photographs, the processing can be switched depending on whether the photograph is a halftone-dot photograph composed of halftone dots or a silver halide photograph. As described above, changing the processing method according to the type of document image is very effective both when outputting a hard copy and when saving an image file.

[0005] A document image is usually composed of regions such as characters, charts, and photographs, each standing alone or overlapping one another. Therefore, to perform image processing, it is necessary to detect what kinds of regions the document image contains and how they are arranged.

[0006] Conventionally, to identify the elements constituting a document image, such as characters, halftone dots, and photographs, there is a method of extracting the structure of the document image using layout analysis, as described in Japanese Patent Application Laid-Open No. 9-92450. In this method, pixel connectivity is examined after binarization, connected pixels are extracted as regions, and the regions are identified using feature quantities such as their position and size. Because this method binarizes the image as preprocessing, it has been difficult to accurately extract regions containing intermediate density values, such as photographs.

[0007] Further, as an extension of this method, Japanese Patent Application No. 10-053317 describes a method that converts a multi-valued image into a plurality of binary images and performs layout analysis on each binary image, thereby accurately extracting regions containing intermediate density values, such as photographs. This method has the advantage that regions can be extracted correctly even when regions overlap, such as characters on halftone dots.

[0008] However, actual document originals often have complicated layout structures. For example, characters may be written over a region containing a halftone-dot photograph. In this case, because the method described in Japanese Patent Application Laid-Open No. 9-92450 binarizes the image, it is difficult to separate the halftone-dot photograph from the characters by density value, and the regions cannot be extracted correctly.

[0009] With the method described in Japanese Patent Application No. 10-053317, it is possible to identify that a photograph region exists on a halftone-dot region. As for extracting character regions, however, feature quantities such as the size of connected components in the photograph region cannot be distinguished from the feature quantities used for character region extraction, so it has been difficult to separate characters from a halftone-dot photograph.

[0010]

PROBLEMS TO BE SOLVED BY THE INVENTION

As described above, in conventional image processing apparatuses, the method of converting a multi-valued image into a plurality of binary images and extracting the structure of the document image by layout analysis of each binary image has difficulty separating characters correctly when attempting to extract a character region lying on a halftone-dot photograph. Accordingly, an object of the present invention is to provide an image processing apparatus that accurately extracts characters within a region such as a photograph from document image data.

[0011]

MEANS FOR SOLVING THE PROBLEMS

To achieve the above object, the present invention is characterized by comprising: character area extracting means for extracting a character area having character features based on the density values of each pixel and its neighboring pixels in an image read as image data; specific area extracting means for extracting a specific area having features other than those of characters; character string area extracting means for extracting a character string area from within the specific area extracted by the specific area extracting means, using information on the character areas extracted by the character area extracting means; and image discriminating means for discriminating the image of the specific area using information on the character string area extracted by the character string area extracting means.

[0012] The invention is also characterized by comprising: character area extracting means for extracting a character area having character features based on the density values of each pixel and its neighboring pixels in an image read as image data; specific area extracting means for extracting a specific area having features other than those of characters; character string area extracting means for extracting a character string area from within the specific area extracted by the specific area extracting means, using information on the character areas extracted by the character area extracting means; document structure storing means for storing, as a document structure, the positions of character string areas to be extracted; character string collating means for collating the character string area extracted by the character string area extracting means against the document structure stored in the document structure storing means and thereby finalizing the character string area; and image presenting means for presenting the read image using information on the character string area finalized by the character string collating means.

[0013] That is, when a document is copied or converted into an image and saved, the document elements contained in the input document image are extracted as regions by measuring changes in pixel values and feature quantities, and the document element of each region is identified. By extracting character strings from information such as the position and size of the character regions present within a specific region such as a photograph, the character strings can be separated from other regions that resemble character regions, and the character string information can then be used to correct the pixel attributes of the specific region. Further, comparing a document structure pattern with the character string information makes image retrieval possible.

[0014]

EMBODIMENTS OF THE INVENTION

In the present invention, a document image is captured by an image input unit such as a scanner. The character area extracting means and the area extracting means then extract, as one region, whatever is physically or logically connected, judging from the data of each pixel and its neighboring pixels. Feature quantities of each region, such as its position on the image, size, shape, structure, and density distribution, are measured, and based on predetermined rules the measurement results are used to identify each region as a document element having a specific attribute, such as character or photograph.

[0015] Then, for a region having the character attribute that lies within a region having a specific attribute such as photograph, feature quantities such as the attribute, size, and shape of the other regions around it are measured. If, as a result, the region having the character attribute is judged to be an element constituting a character string, it can be given a new pixel attribute as a character string region, which makes it possible to correct the pixel attributes of the region having the specific attribute.

[0016] A more specific embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a block diagram showing an example embodiment of the present invention.

[0017] The image input unit 101 is a device that inputs image data: it captures what is drawn on a document or other paper by means of an image input device such as an image scanner, which reads the paper and converts it into image data. The image input unit in the present invention may be a reading device such as an image scanner, or may be a device that loads images already stored as image data in a filing device or the like.

[0018] The character area extraction unit 102 extracts, as regions, the pixels that make up characters from the image data input by the image input unit 101. First, the input image data is separated into a plurality of binary image data according to conditions such as density differences among surrounding pixels and saturation. Each of those images is then divided into regions in which characters, figures, and the like are physically or logically connected, and the regions are extracted. Feature quantities of each region, such as position, size, shape, structure, and density distribution, are measured, and the regions that are characters as document elements are extracted.

[0019] As a specific technique for separating the image into a plurality of binary image data, the technique disclosed in Japanese Patent Application No. 10-053317 may be used.

[0020] This technique converts a multi-valued image into a plurality of binary images and performs layout analysis on each, thereby accurately extracting regions containing intermediate density values, such as photographs. In this case, seven binary separated image data are generated: a character image, an intermediate-tone image, a background image, a halftone-dot image, a color image, a gray image, and a black image.

[0021] As a specific technique for extracting and identifying document elements, the technique disclosed in Japanese Patent Application Laid-Open No. 9-92450 may be used.

[0022] In this technique, pixel connectivity is examined after binarization, connected pixels are extracted as regions, and the regions are identified using feature quantities such as their position and size.
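The connectivity step described in [0022] can be illustrated with a minimal Python sketch. It is not the procedure of the cited publication: the function name `connected_regions`, the choice of 8-connectivity, and the returned feature set (bounding box and pixel count) are assumptions made for illustration only.

```python
from collections import deque

def connected_regions(binary, connectivity=8):
    """Label connected foreground pixels and return simple features per region.

    binary is a list of rows of 0/1 values. Each result holds an inclusive
    bounding box (x0, y0, x1, y1) and the number of pixels in the region.
    """
    h = len(binary)
    w = len(binary[0]) if h else 0
    if connectivity == 8:
        steps = [(dx, dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
                 if (dx, dy) != (0, 0)]
    else:
        steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    seen = [[False] * w for _ in range(h)]
    regions = []
    for y in range(h):
        for x in range(w):
            if not binary[y][x] or seen[y][x]:
                continue
            # Breadth-first flood fill from an unvisited foreground pixel.
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            count = 0
            while queue:
                cx, cy = queue.popleft()
                count += 1
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dx, dy in steps:
                    nx, ny = cx + dx, cy + dy
                    if (0 <= nx < w and 0 <= ny < h
                            and binary[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            regions.append({"bbox": (x0, y0, x1, y1), "pixels": count})
    return regions
```

The bounding box and pixel count stand in for the position and size feature quantities; a real classifier would apply rules to these values to decide whether a region is a character.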

[0023] The specific area extraction unit 103 extracts, as regions, pixels having specific attributes from the image data input by the image input unit 101. Examples of such attributes are photograph, chart, and halftone-dot regions. As in the character area extraction unit 102, the input image data is separated into a plurality of binary image data according to conditions such as each pixel's density difference from its surrounding pixels and its saturation; regions in which characters, figures, and the like are physically or logically connected are extracted from each image; and feature quantities of each region, such as position, size, shape, structure, and density distribution, are measured to identify its type and importance as a document element.

[0024] Types of document elements include photographs, figures, tables, and halftone dots. The specific area extraction unit 103 determines the region attributes of the input image data not from a single image data alone but by integrating, according to rules, the feature quantities from the plurality of image data. For example, when regions are extracted at the same position from both the character image and the intermediate-tone image, it determines what type the region is and how large it is. As a concrete example, when a photograph region exists on the character image and intermediate-tone pixels exist at the same position in the intermediate-tone image, the region is determined to be a silver halide photograph region.
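The concrete rule in [0024] can be sketched in Python as follows. This is only an illustration: the actual rule set is not given in the text, "same position" is interpreted here as bounding-box overlap, and the names `rects_overlap`, `integrate_attributes`, and the attribute labels are hypothetical.

```python
def rects_overlap(a, b):
    """True if inclusive rectangles (x0, y0, x1, y1) share at least one pixel."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def integrate_attributes(char_image_regions, midtone_image_regions):
    """Combine regions found in two binary separated images.

    char_image_regions: list of (kind, bbox) found on the character image.
    midtone_image_regions: list of bboxes found on the intermediate-tone image.
    Assumed rule: a photograph region that coincides with intermediate-tone
    pixels is relabelled as a silver halide photograph.
    """
    results = []
    for kind, bbox in char_image_regions:
        attribute = kind
        if kind == "photo" and any(rects_overlap(bbox, m)
                                   for m in midtone_image_regions):
            attribute = "silver_halide_photo"
        results.append((attribute, bbox))
    return results
```

A fuller rule table would cover the other separated images (halftone-dot, color, gray, and so on) in the same style, one condition per combination.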

[0025] As a specific technique for determining region attributes from a plurality of image data, the technique disclosed in the above-mentioned Japanese Patent Application No. 10-053317 may be used. The character string region extraction unit 104, described in detail later, extracts character strings present within the regions, such as photographs, figures, tables, and halftone dots, extracted by the specific area extraction unit 103. For each character attribute region extracted by the character area extraction unit 102 that lies within such a region, feature quantities such as the attribute, size, and shape of the other regions around it are measured; if, as a result, the character attribute region is judged to be an element constituting a character string, it is extracted as a character string region.

[0026] The image discrimination unit 105, described in detail later, uses the information on the character string regions extracted by the character string region extraction unit 104 to correct the region information for attributes such as photograph, figure, table, and halftone dot that is output by the specific area extraction unit 103.

[0027] The image output unit 106 produces output according to the region information determined by the image discrimination unit 105. The output may take the form of the attribute region information converted into an image. As a specific technique for converting the attribute region information into an image, the technique disclosed in the above-mentioned Japanese Patent Application No. 10-053317 may be used, for example.

[0028] FIG. 2 is a flowchart showing an example of the document processing procedure in the configuration of the apparatus of the present invention shown in FIG. 1. The processing flow of the image processing apparatus according to the present invention is described next with reference to this flowchart together with the block diagram of FIG. 1.

[0029] First, an image of a document is captured by the image input unit 101 (step ST201). That is, the image input unit 101 converts into image data either an image read from the document with an image input device such as a scanner or image file data supplied from a filing system or the like.

[0030] The character area extraction unit 102 reads the image from the image input device one line or several lines at a time and separates it into a plurality of binary image data according to per-pixel conditions such as density differences among surrounding pixels and saturation. It divides the data into regions in which characters, figures, and the like are physically or logically connected, extracts them, measures feature quantities of each region such as position, size, shape, structure, and density distribution, identifies the type, importance, and so on of each region, and as a result extracts only the character regions (step ST202). This is repeated until all pixels of the input image have been processed (step ST203).

[0031] Similarly, the specific area extraction unit 103 reads the image from the image input device one line or several lines at a time and separates it into a plurality of binary image data according to per-pixel conditions such as density differences among surrounding pixels and saturation. It divides the data into regions in which characters, figures, and the like are physically or logically connected, extracts them, measures feature quantities of each region such as position, size, shape, structure, and density distribution, identifies the type, importance, and so on of each region, and as a result extracts only the photograph, figure, table, and halftone-dot regions (step ST204). This is repeated until all pixels of the input image have been processed (step ST205).

[0032] After all pixel attributes have been determined, the character string region extraction unit 104 takes each region having the character attribute that lies within a region extracted by the specific area extraction unit 103 and measures feature quantities such as the attribute, size, and shape of the other regions around it. If, as a result, the region having the character attribute is judged to be an element constituting a character string, it is given a new pixel attribute as a character string region (step ST206). This is repeated until all character attributes in all the specific regions have been processed (step ST207). The specific processing in the character string region extraction unit 104 is described in detail later with reference to FIG. 3.

[0033] After the character string region extraction unit 104 has extracted character string regions for all specific regions, image discrimination is performed for each specific region using the character string information (step ST208). Examples of discrimination results are a photograph-only region, a photograph-and-character region, and a figure-and-character region. The pixel attribute information is changed based on the discrimination result. This is repeated until all specific regions have been processed (step ST209).

[0034] Finally, the image output unit 106 outputs an output image obtained by converting each pixel attribute into image data (step ST210). The above is the general processing operation of the apparatus of the present invention. The processing of the individual elements is described in detail next.

[0035] FIG. 3 is a flowchart showing details of an example character string extraction process in the character string region extraction unit 104 of the image processing apparatus of the present invention; it is a flowchart of the processing performed in step ST208 shown in FIG. 2.

[0036] The character string extraction process uses a region having the character attribute as a clue and examines its relationship with other pixels having the character attribute around it, thereby checking whether the region constitutes part of a character string, and finally extracts the character string region. The process is described below following the flowchart.

[0037] The pixel attribute data 151, which is the output of the character area extraction unit 102 and the specific area extraction unit 103, is data in which the attribute of each pixel of the input image data has been determined using the types and importance of the regions identified in the plurality of image data. The attributes of specific regions targeted for character string extraction are the photograph, figure, and table attributes and the like extracted from the character image and the intermediate-tone image, and the halftone-dot attribute extracted from the halftone-dot image. The character string extraction process extracts character string regions using the character regions present within regions having these attributes as clues.

[0038] First, data is read from the pixel attribute data 151, and it is checked whether the data has a specific-region attribute, namely the photograph, figure, table, or halftone-dot attribute (step ST301). If the specific region has the photograph, figure, table, or halftone-dot attribute, it is checked whether a region having the character attribute exists within it (step ST302). If such a region exists, the area around it is searched to examine the state of the surrounding pixel attributes. For this purpose, a search area is set (step ST303).

[0039] FIG. 4 shows an example of a method for setting the search area from a character attribute region. Let the coordinate values of the hatched region 501, determined in step ST302 to have the character attribute, be (x0, y0) and (x1, y1). The coordinate values (X0, Y0) and (X1, Y1) of the dotted-line region 502 surrounding the character attribute region 501 are obtained from the following equations.

X0 = x0 - (x1 - x0) × m
Y0 = y0 - (y1 - y0) × m
X1 = x1 + (x1 - x0) × m
Y1 = y1 + (y1 - y0) × m

[0040] Here, m is a coefficient for setting the search area; it may be a constant or may vary with the size of the character attribute region 501. From this region 502 and the character attribute region 501, a cross-shaped search area 503, outlined by the thick line and centered on the character attribute region 501, is set. The search area is made cross-shaped in consideration of the characteristics of character strings.

[0041] That is, if the character attribute region 501 is a character forming part of a character string, regions with a similar character attribute can be expected to its left and right or above and below it. Setting the search area in a cross shape also makes the search possible whether the character string runs vertically or horizontally.
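The search-area construction of step ST303 can be sketched in Python. The expansion follows the four equations for (X0, Y0) and (X1, Y1); representing the cross-shaped area 503 as two rectangles (a horizontal arm and a vertical arm) is an assumption made here, since FIG. 4 is not reproduced in the text, and the function names are hypothetical.

```python
def expand_rect(rect, m):
    """Region 502: grow (x0, y0, x1, y1) by m times its width and height."""
    x0, y0, x1, y1 = rect
    dx = (x1 - x0) * m
    dy = (y1 - y0) * m
    return (x0 - dx, y0 - dy, x1 + dx, y1 + dy)

def cross_search_area(rect, m):
    """Region 503 as two arms centred on the character attribute region.

    Assumed geometry: the horizontal arm keeps the region's own y-range and
    spans X0..X1; the vertical arm keeps its x-range and spans Y0..Y1.
    """
    x0, y0, x1, y1 = rect
    X0, Y0, X1, Y1 = expand_rect(rect, m)
    horizontal_arm = (X0, y0, X1, y1)
    vertical_arm = (x0, Y0, x1, Y1)
    return horizontal_arm, vertical_arm

# Example with hypothetical coordinates for region 501 and m = 2.
region_501 = (10, 20, 30, 30)
region_502 = expand_rect(region_501, 2)        # (-30, 0, 70, 50)
arms_503 = cross_search_area(region_501, 2)    # ((-30, 20, 70, 30), (10, 0, 30, 50))
```

In practice the arms would be clipped to the image bounds; that step is omitted here to keep the correspondence with the equations visible.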

[0042] Next, a search is made for other regions having the character attribute that lie within the set search area 503 or overlap it (step ST304). In FIG. 4, regions 504, 505, 506, and 507 are regions having the character attribute. The regions that satisfy the condition of lying within or overlapping the search area 503, namely regions 504, 505, and 506, are the matching regions. Overlapping regions are also accepted because, owing to document skew and the like, regions do not necessarily fit entirely within the cross-shaped search area 503. Consequently, regions can be detected without difficulty even if the input document image is somewhat skewed.

[0043] If a region having the character attribute is found in step ST304, the rectangle size of the region having the character attribute is voted into a voting space prepared in advance (step ST305). This voting space is used to extract character string regions. A method that extracts character strings directly by examining the positional relationships between the clue character attribute region and the character attribute regions detected within the search range incurs a processing cost for evaluating positional relationships that grows with the number of character attribute regions, and is therefore unsuited to high-speed processing.

[0044] Therefore, the character attribute regions presumed to be components of character strings are voted into the voting space, and character string regions are later extracted all at once for each specific region. The voting space prepared here has a lower resolution than the plurality of binary images.

[0045] Further, two voting spaces are prepared according to character string orientation, one for the vertical direction and one for the horizontal direction. The vertical voting space has a lower resolution in the vertical direction than in the horizontal direction, and the horizontal voting space has a lower resolution in the horizontal direction than in the vertical direction. Lowering the resolution in this way makes it easier for separated character attribute regions, which are the components of a character string, to become connected.

【００４６】また、縦方向と横方向に分離することで、
文字列方向が未知の場合や、混在する場合においても対
応することが可能となる。これらの理由により後述する
投票空間からの文字列領域抽出（ステップＳＴ３０６）
において、容易な文字列抽出が可能となる。Separating the vertical and horizontal directions also makes it possible to handle cases where the character string direction is unknown or where both directions are mixed. For these reasons, character strings can be extracted easily in the extraction of character string areas from the voting space (step ST306) described later.

【００４７】投票は、文字属性を持った領域５０１の座
標位置を投票空間の座標位置に変換した個所にその領域
の矩形サイズなどの特徴を加算することによりなされ
る。矩形サイズの例としては、矩形の幅や高さや面積値
などのいずれかを用いることが可能である。その際、ス
テップＳＴ３０４において検出された文字属性を持った
領域の分布状態によって、縦方向の投票空間に投票する
か、横方向の投票空間に投票するか、さらには両方の投
票空間に投票するかを決定する。A vote is cast by converting the coordinate position of the region 501 having a character attribute into a coordinate position in the voting space and adding a feature of the region, such as its rectangle size, at that location. The rectangle size may be, for example, the width, height, or area of the rectangle. The distribution of the character attribute regions detected in step ST304 determines whether the vote is cast in the vertical voting space, in the horizontal voting space, or in both.

【００４８】つまり、縦方向に検出された文字属性を持
った領域が存在する場合、縦方向の投票空間に投票し、
横方向に検出された文字属性を持った領域が存在する場
合、横方向の投票空間に投票する。図４を例にすると、
文字属性を持った領域５０１に対して検出された文字属
性領域は領域５０４、領域５０５、領域５０６であり、
縦方向、横方向の双方に存在する。したがって、この場
合両方の投票空間に投票する。That is, if a detected character attribute region exists in the vertical direction, a vote is cast in the vertical voting space, and if one exists in the horizontal direction, a vote is cast in the horizontal voting space. In the example of FIG. 4, the character attribute regions detected for the region 501 are the regions 504, 505, and 506, which lie in both the vertical and horizontal directions. In this case, therefore, votes are cast in both voting spaces.

【００４９】更に、十字型の探索領域５０３において、
検出された文字属性を持った領域をも投票の対象とす
る。横方向において検出されたものは横方向の投票空間
へ、縦方向において検出されたものは縦方向の投票空間
へそれぞれ投票される。Furthermore, the character attribute regions detected in the cross-shaped search area 503 are themselves also subject to voting. Those detected in the horizontal direction are voted into the horizontal voting space, and those detected in the vertical direction into the vertical voting space.

【００５０】図４を例にすると、領域５０４と領域５０
６は横方向の投票空間へ、領域５０５は縦方向の投票空
間へ投票される。これらの投票は、領域５０１における
投票と同様に座標位置を投票空間の座標位置に変換した
個所にその領域の矩形サイズなどの特徴を加算すること
によりなされる。In the example of FIG. 4, the regions 504 and 506 are voted into the horizontal voting space, and the region 505 into the vertical voting space. As with the region 501, each vote is cast by converting the region's coordinate position into a coordinate position in the voting space and adding a feature such as the region's rectangle size at that location.
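The voting of paragraphs [0045] to [0050] can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of the apparatus: the names `Rect`, `make_space`, and `vote`, the cell sizes `H_STEP` and `V_STEP`, the use of the rectangle centre as the voted position, and the use of the area as the voted feature are all hypothetical choices (the text allows width, height, or area).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rect:
    x: int  # left edge in image pixels
    y: int  # top edge in image pixels
    w: int  # width in pixels
    h: int  # height in pixels

def make_space(img_w, img_h, step_x, step_y):
    """Create a zero-filled voting space; a larger step means lower resolution."""
    cols = (img_w + step_x - 1) // step_x
    rows = (img_h + step_y - 1) // step_y
    return [[0] * cols for _ in range(rows)]

def vote(space, rect, step_x, step_y):
    """Add the rectangle's area at the cell that contains its centre (assumed rule)."""
    cx = (rect.x + rect.w // 2) // step_x
    cy = (rect.y + rect.h // 2) // step_y
    space[cy][cx] += rect.w * rect.h

# Horizontal space: coarse in x, finer in y. Vertical space: the reverse.
H_STEP = (16, 4)   # assumed cell size (x, y) for the horizontal voting space
V_STEP = (4, 16)   # assumed cell size (x, y) for the vertical voting space
horizontal = make_space(64, 64, *H_STEP)
vertical = make_space(64, 64, *V_STEP)

seed = Rect(18, 16, 6, 6)    # plays the role of region 501
right = Rect(26, 16, 6, 6)   # found in the horizontal arm of the search area
below = Rect(18, 24, 6, 6)   # found in the vertical arm of the search area

# Neighbours were found in both directions, so the seed votes in both spaces;
# each neighbour votes only in the space for the direction it was found in.
vote(horizontal, seed, *H_STEP)
vote(vertical, seed, *V_STEP)
vote(horizontal, right, *H_STEP)
vote(vertical, below, *V_STEP)
```

Because each space is coarse along its own text direction, the seed and its neighbour fall into the same cell and their votes accumulate there, which is the connecting effect the lowered resolution is meant to produce.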

【００５１】投票空間への文字属性を持った領域の矩形
サイズの投票がすべて終了した後、またはステップＳＴ
３０４において文字属性を持った領域が存在しなかった
時、ステップＳＴ３０２に戻り、他の文字属性を持った
領域に対してステップＳＴ３０３からステップＳＴ３０
５の処理を行う。After all rectangle sizes of the character attribute regions have been voted into the voting space, or if no character attribute region was found in step ST304, the process returns to step ST302, and steps ST303 to ST305 are performed for another region having a character attribute.

【００５２】特定領域内のすべての文字属性を持った領
域に対して終了した場合、投票空間より文字列領域の抽
出を行う（ステップＳＴ３０６）。ステップＳＴ３０４
によって出力される投票空間は、多値画像とみなすこと
ができる。When all regions having a character attribute in the specific area have been processed, character string areas are extracted from the voting space (step ST306). The voting space output by step ST304 can be regarded as a multi-valued image.

【００５３】そこで、この投票空間を閾値処理すること
によって２値画像に変換する。この閾値処理には、次の
ような意味が含まれている。ステップＳＴ３０４では、
ノイズ成分なのにもかかわらず文字属性を持った領域を
も投票される。しかし、文字列の存在確率が高い個所で
は、即ち、文字属性を持った領域が多く存在する個所で
は、投票された値は高くなり、逆にノイズ成分によって
投票された個所は、他の領域による投票がなされないた
め値は低くなる。The voting space is therefore converted into a binary image by thresholding. This thresholding has the following significance. In step ST304, regions that have a character attribute but are actually noise components are also voted. However, where a character string is likely to exist, that is, where many character attribute regions are present, the voted value becomes high; conversely, a location voted only by a noise component receives no votes from other regions, so its value remains low.

【００５４】したがって、閾値処理によって２値画像へ
変換すると同時に、文字列領域として不要なノイズ成分
は消去される。閾値の設定は、定数であっても良いし、
投票空間における投票値から算出しても良い。Consequently, thresholding converts the voting space into a binary image and at the same time removes noise components that are not needed as character string areas. The threshold may be a constant or may be calculated from the vote values in the voting space.
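The thresholding described in paragraphs [0053] and [0054] can be illustrated with a short sketch. The rule used here to derive the threshold from the vote values (a fixed fraction of the largest vote) is an assumption made for illustration; the text only states that the threshold may be a constant or may be calculated from the vote values. The function names and the `ratio` parameter are hypothetical.

```python
def threshold_from_votes(space, ratio=0.5):
    """Assumed rule: a fixed fraction of the largest vote value in the space."""
    peak = max(max(row) for row in space)
    return peak * ratio

def binarize(space, threshold):
    """Return a 0/1 image: 1 where a non-zero vote reaches the threshold."""
    return [[1 if v > 0 and v >= threshold else 0 for v in row] for row in space]

# A small multi-valued voting space: one dense run of votes (a likely
# character string) and one isolated low vote (a likely noise component).
space = [
    [0, 0, 0, 0, 0],
    [0, 90, 120, 80, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 30],
]
binary = binarize(space, threshold_from_votes(space))
```

With these values the threshold is 60, so the dense run survives and the isolated vote of 30 is removed, which corresponds to the noise elimination the paragraph describes.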

【００５５】閾値処理を行った２値化された投票空間か
ら、文字列領域を抽出する方法は、文字列領域抽出部１
０４における連結した成分の抽出と同じ手法を用いても
良い。To extract character string areas from the thresholded, binarized voting space, the same method as the connected-component extraction in the character string area extraction unit 104 may be used.
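One common way to extract connected components from a binary image is a breadth-first flood fill that records the bounding box of each component. The sketch below is offered as an illustration only; the text does not specify which connected-component method is used, and the 4-connectivity and the function name `connected_boxes` are assumptions.

```python
from collections import deque

def connected_boxes(binary):
    """Return (x0, y0, x1, y1) cell boxes of 4-connected groups of 1s."""
    rows, cols = len(binary), len(binary[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for y in range(rows):
        for x in range(cols):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Start a new component and grow it with a breadth-first search.
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for nx, ny in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                    inside = 0 <= nx < cols and 0 <= ny < rows
                    if inside and binary[ny][nx] == 1 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes

binary = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
    [1, 0, 0, 0, 0],
]
boxes = connected_boxes(binary)
```

The boxes are in voting-space cells; multiplying them by the cell size of the voting space would map them back to image coordinates.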

【００５６】文字列領域抽出の終了後、他の特定領域に
対してもステップＳＴ３０２からステップＳＴ３０６ま
での処理を繰り返し行なう。全ての特定領域に対して処
理を行なった結果、抽出された文字列領域の情報は、画
素属性データ１５１に文字列領域の属性を加えたものと
して、修正画素属性データ１５２となる。After the character string area extraction is finished, steps ST302 to ST306 are repeated for the other specific areas. Once all specific areas have been processed, the information on the extracted character string areas is added, as a character string area attribute, to the pixel attribute data 151, yielding the corrected pixel attribute data 152.

【００５７】図５は本発明の画像処理装置に入力される
一例としての画像データである。ここでは画像データ６
０１として、文字列を含んだカメラの写真である網点写
真６０２と銀塩写真６０３と、下地上の文字列６０４が
描かれているものを示している。この例を用いて文字列
領域抽出部１０４における実際の処理過程を説明する。FIG. 5 shows an example of image data input to the image processing apparatus of the present invention. The image data 601 contains a halftone dot photograph 602 and a silver halide photograph 603, each a photograph of a camera that includes a character string, together with a character string 604 on the background. The actual processing in the character string area extraction unit 104 is described using this example.

【００５８】画像入力部１０１より入力した画像データ
６０１から、文字領域抽出部１０２によって図６におけ
る画像データ６０１ａでの点線で示した文字領域６０２
ａ１、６０２ａ２、６０３ａ１、６０３ａ２、６０４ａ
が矩形領域として抽出される。From the image data 601 input through the image input unit 101, the character area extraction unit 102 extracts, as rectangular areas, the character areas 602a1, 602a2, 603a1, 603a2, and 604a indicated by dotted lines in the image data 601a of FIG. 6.

【００５９】便宜上、点線で複数の文字領域を囲ってい
るが、実際の文字領域は小さな矩形として個別に抽出さ
れている。ここで文字領域６０２ａ１および６０３ａ１
は、図５における網点写真６０２および銀塩写真６０３
のカメラ部分から分離したものであり、そのサイズや形
状は文字属性と類似した領域を示していて、実際には
正しい文字領域ではない。For convenience, several character areas are enclosed together by a dotted line, but the actual character areas are extracted individually as small rectangles. The character areas 602a1 and 603a1 were separated from the camera portions of the halftone dot photograph 602 and the silver halide photograph 603 in FIG. 5; their size and shape resemble those of regions with a character attribute, but they are not in fact correct character areas.

【００６０】同様にして、画像入力部１０１より入力し
た画像データ６０１から、特定領域抽出部１０３によっ
て図７における画像データ６０１ｂでの点線で示した網
点領域６０２ｂと銀塩写真部６０３ｂが矩形領域として
抽出される。Similarly, from the image data 601 input through the image input unit 101, the specific area extraction unit 103 extracts, as rectangular areas, the halftone dot area 602b and the silver halide photograph area 603b indicated by dotted lines in the image data 601b of FIG. 7.

【００６１】文字列領域抽出部１０４では、図７で抽出
された網点領域６０２ｂと銀塩写真部６０３ｂ内に存在
する文字領域６０２ａ１、６０２ａ２、６０３ａ１、６
０３ａ２を対象に文字列抽出を行なう。文字領域６０２
ａ１および６０３ａ１は、文字属性の探索において近隣
に文字属性がない、あるいは矩形サイズの投票が行なわ
れても投票空間における値が小さいという理由から文字
列領域としては抽出されず、最終的に図８における画像
データ６０１ｃで示されている文字列領域６０２ｃ２が
縦方向の投票空間から、文字列領域６０３ｃ２が横方向
の投票空間から文字列として抽出されることになる。以
上が、文字列領域抽出部１０４における実際の処理過程
の説明である。The character string area extraction unit 104 performs character string extraction on the character areas 602a1, 602a2, 603a1, and 603a2 that lie within the halftone dot area 602b and the silver halide photograph area 603b extracted in FIG. 7. The character areas 602a1 and 603a1 are not extracted as character string areas, either because no character attribute is found nearby during the character attribute search or because the resulting value in the voting space is small even when their rectangle sizes are voted. In the end, the character string area 602c2 shown in the image data 601c of FIG. 8 is extracted from the vertical voting space, and the character string area 603c2 from the horizontal voting space. This concludes the description of the actual processing in the character string area extraction unit 104.

【００６２】次に本発明の請求項４に記載されている文
字列領域抽出手段について説明する。本発明における一
例としての文字列抽出処理の詳細を示すフローチャート
は図３と同様である。Next, the character string area extraction means according to claim 4 of the present invention is described. The flowchart showing the details of an example of the character string extraction process is the same as FIG. 3.

【００６３】画素属性データ１５１から特定属性を持っ
た領域が存在するか確認し、その特定領域内に文字列属
性を持った領域が存在するか調べたのち、探索領域を設
定して，探索領域内の文字を探索し、その結果検出され
た文字属性領域の矩形サイズを縦方向と横方向のそれぞ
れの投票空間に投票を行い、投票空間を閾値処理するこ
とによって２値画像に変換し、文字列領域を抽出する。The pixel attribute data 151 is checked for an area having a specific attribute, and that specific area is checked for regions having a character string attribute. A search area is then set and searched for characters, the rectangle sizes of the character attribute regions detected as a result are voted into the vertical and horizontal voting spaces, the voting spaces are converted into binary images by thresholding, and character string areas are extracted.

【００６４】さらに、抽出された文字列領域の位置情報
から、他の文字列領域との位置関係を計測することによ
り、パラグラフを抽出する。このパラグラフ情報より抽
出された文字列領域の位置情報から、文字列であるか判
別することにより、文字部分をより正確に抽出すること
が可能となる。Furthermore, paragraphs are extracted by measuring, from the position information of each extracted character string area, its positional relationship to the other character string areas. Judging from this paragraph information whether each extracted character string area really is a character string makes it possible to extract character portions more accurately.

【００６５】図９は本発明の画像処理に入力される一例
としての画像データである。ここでは画像データ７０１
として、複数行の文字列を含んだカメラの写真である網
点写真７０２が描かれているものを示している。この例
を用いて、抽出された文字列領域の情報からパラグラフ
情報を抽出する方法と、その情報から文字列領域を限定
する方法について説明する。FIG. 9 shows an example of image data input to the image processing of the present invention. The image data 701 contains a halftone dot photograph 702, a photograph of a camera that includes a character string of several lines. Using this example, the following describes how paragraph information is extracted from the extracted character string area information and how the character string areas are narrowed down using that information.

【００６６】画像入力部１０１より入力した画像データ
７０１から、文字領域抽出部１０２によって図１０にお
ける画像データ７０１ａでの点線で示した文字領域７０
２ａ１、７０２ａ２が矩形領域として抽出される。ここ
で文字領域７０２ａ１は、図９における網点写真７０２
のカメラ部分から分離したものであり、そのサイズや形
状は文字属性と類似した領域を示していて、実際には正
しい文字領域ではない。From the image data 701 input through the image input unit 101, the character area extraction unit 102 extracts, as rectangular areas, the character areas 702a1 and 702a2 indicated by dotted lines in the image data 701a of FIG. 10. The character area 702a1 was separated from the camera portion of the halftone dot photograph 702 in FIG. 9; its size and shape resemble those of a region with a character attribute, but it is not in fact a correct character area.

【００６７】同様にして、画像入力部１０１より入力し
た画像データ７０１から、特定領域抽出部１０３によっ
て図１１における画像データ７０１ｂでの点線で示した
網点領域７０２ｂが矩形領域として抽出され、文字列領
域抽出部１０４により、網点領域７０２ｂ内に存在する
文字領域７０２ａ１と７０２ａ２を対象に文字列抽出を
行なわれる。その結果、図１２に示す文字列領域７０２
ｃ１と７０２ｃ２が横方向の投票空間から文字列として
抽出されることになる。便宜上、７０２ｃ２は点線によ
って複数の文字列領域を囲んでいるが、実際の文字列領
域は個別の領域として抽出されている。Similarly, from the image data 701 input through the image input unit 101, the specific area extraction unit 103 extracts, as a rectangular area, the halftone dot area 702b indicated by a dotted line in the image data 701b of FIG. 11, and the character string area extraction unit 104 performs character string extraction on the character areas 702a1 and 702a2 that lie within the halftone dot area 702b. As a result, the character string areas 702c1 and 702c2 shown in FIG. 12 are extracted as character strings from the horizontal voting space. For convenience, 702c2 is drawn as a dotted line enclosing several character string areas, but the actual character string areas are extracted as individual areas.

【００６８】ここで、抽出された文字列領域の情報か
ら、他の文字列領域との位置関係を計測することで、図
１２における画像データ７０１ｃで示されている文字列
領域７０２ｃ２は複数行をもったパラグラフとして抽出
される。システムの設定として、パラグラフの形状を持
った領域のみを文字列領域として抽出するようにしてあ
るならば、網点写真７０２のカメラ部分から分離した偽
りの文字列領域である７０２ｃ１は、文字列としては抽
出されず、最終的に図１３に示す文字列領域７０２ｄ２
が文字列として残ることになる。Here, by measuring from the extracted character string area information the positional relationship to the other character string areas, the character string area 702c2 shown in the image data 701c of FIG. 12 is extracted as a paragraph having several lines. If the system is configured to extract only regions having the shape of a paragraph as character string areas, the false character string area 702c1, which was separated from the camera portion of the halftone dot photograph 702, is not extracted as a character string, and in the end the character string area 702d2 shown in FIG. 13 remains as the character string.
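The paragraph-based filtering of paragraphs [0064] to [0068] can be sketched as below. The text does not define how the positional relationship is measured, so the grouping rule here is purely an assumption for illustration: horizontal character string boxes are chained into a paragraph when they overlap horizontally and the vertical gap between them is at most `max_gap` pixels, and only groups with at least `min_lines` lines are kept. All names, parameters, and coordinates are hypothetical.

```python
def h_overlap(a, b):
    """True if boxes (x0, y0, x1, y1) share some horizontal extent."""
    return min(a[2], b[2]) - max(a[0], b[0]) > 0

def group_paragraphs(lines, max_gap=10):
    """Chain text-line boxes from top to bottom into paragraph groups."""
    groups = []
    for box in sorted(lines, key=lambda b: b[1]):
        for group in groups:
            last = group[-1]
            # Join the group if the box sits just below its last line.
            if h_overlap(last, box) and 0 <= box[1] - last[3] <= max_gap:
                group.append(box)
                break
        else:
            groups.append([box])
    return groups

def keep_paragraph_lines(lines, min_lines=2, max_gap=10):
    """Keep only boxes that belong to a group with enough lines."""
    kept = []
    for group in group_paragraphs(lines, max_gap):
        if len(group) >= min_lines:
            kept.extend(group)
    return kept

false_string = (40, 30, 90, 42)  # isolated box, in the role of 702c1
text_lines = [(20, 120, 180, 132), (20, 138, 180, 150), (20, 156, 150, 168)]
kept = keep_paragraph_lines([false_string] + text_lines)
```

In this example the three stacked lines form one paragraph and are kept, while the isolated box forms a group of one and is dropped, mirroring how 702c1 is rejected and the multi-line area remains.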

【００６９】このように、抽出された文字列領域の情報
から、入力文書画像の文書構造を抽出し、文書構造を利
用して抽出された文字列領域を利用して抽出された文字
列領域を再度調べることにより、より精度の高い文字列
抽出を行うことが可能となる。以上が、図１に示す本発
明の実施の一例についての説明である。In this way, the document structure of the input document image is extracted from the extracted character string area information, and re-examining the extracted character string areas using that document structure makes more accurate character string extraction possible. This concludes the description of the example embodiment of the present invention shown in FIG. 1.

【００７０】次に、抽出した文字列領域の情報と文書構
造情報を照合することにより、さらに精度の高い文字列
領域抽出を行う装置およびその応用例である画像検索装
置について説明する。Next, an apparatus that extracts character string areas with still higher accuracy by collating the extracted character string area information against document structure information, and an image retrieval apparatus as an application of it, are described.

【００７１】図１４は本発明の第２の実施の一例を示す
ブロック図である。図１４は、画像入力部１０１、文字
領域抽出部１０２、特定領域抽出部１０３、文字列領域
抽出部１０４、文書構造格納部１０７、文字列照合部１
０８、画像判別部１０５、画像出力部１０６からなり、
図１のブロック図に文書構造格納部１０７と文字列照合
部１０８を追加した形態をなしている。FIG. 14 is a block diagram showing an example of the second embodiment of the present invention. The apparatus of FIG. 14 comprises an image input unit 101, a character area extraction unit 102, a specific area extraction unit 103, a character string area extraction unit 104, a document structure storage unit 107, a character string collation unit 108, an image discrimination unit 105, and an image output unit 106; that is, the document structure storage unit 107 and the character string collation unit 108 are added to the block diagram of FIG. 1.

【００７２】画像入力部１０１、文字領域抽出部１０
２、特定領域抽出部１０３、文字列領域抽出部１０４、
画像判別部１０５、画像出力部１０６は既に説明した通
りであるので、ここでは文書構造格納部１０７と文字列
照合部１０８について説明する。The image input unit 101, character area extraction unit 102, specific area extraction unit 103, character string area extraction unit 104, image discrimination unit 105, and image output unit 106 have already been described, so only the document structure storage unit 107 and the character string collation unit 108 are described here.

【００７３】図１５は図１４に示した本発明装置の構成
における文書処理手順の例を示すフローチャートであ
る。図２に示した図１における構成のフローチャート
に、文字列照合処理（ステップＳＴ２１１とステップＳ
Ｔ２１２）を追加した形態をなす。FIG. 15 is a flowchart showing an example of the document processing procedure in the apparatus configuration shown in FIG. 14. It adds character string collation processing (steps ST211 and ST212) to the flowchart of FIG. 2 for the configuration of FIG. 1.

【００７４】文字列抽出が終了したのち（ステップＳＴ
２０７）、文字列照合処理（ステップＳＴ２１１）が行
われる。ここでは、抽出された文字列領域情報と文書構
造格納部１０７に予め知識データとして与えた文書構造
パターンを用いて、文字列領域の照合を行ない、その結
果、照合した場合、その領域を文字列領域として確定す
る。知識データとして与える文書構造パターンとして
は、例えば矩形領域の座標位置を示したデータであって
も良いし、「写真領域の下の文字列」というような文書
構造を示すスクリプト情報をパターン情報に変換したデ
ータであっても良いし、抽出したい領域を示した簡単な
図をスキャナで入力したものであっても良い。スキャナ
で入力した場合は、その図から文書構造の抽出処理を行
い、データ化する。After character string extraction is finished (step ST207), character string collation processing (step ST211) is performed. Here, the character string areas are collated using the extracted character string area information and a document structure pattern given in advance to the document structure storage unit 107 as knowledge data; if an area matches, it is confirmed as a character string area. The document structure pattern given as knowledge data may be, for example, data indicating the coordinate positions of rectangular areas, data obtained by converting script information describing a document structure, such as "a character string below a photograph area", into pattern information, or a simple drawing of the areas to be extracted that is input with a scanner. When input with a scanner, the document structure is extracted from the drawing and converted into data.

【００７５】まず、抽出された文字列領域情報から位置
やサイズ等を取得する。次に、与えられた知識データの
文書構造パターンから抽出したい文字列の位置やサイズ
等を取得する。次に、各々の文字列情報の照合を行う。
その際、写真などの特定領域の大きさが異なる場合、正
規化することにより、大きさを合わせる。First, the position, size, and so on are obtained from the extracted character string area information. Next, the position, size, and so on of the character string to be extracted are obtained from the document structure pattern of the given knowledge data. The two sets of character string information are then collated. If the sizes of the specific areas, such as photographs, differ, they are normalized so that the sizes match.

【００７６】そして、重なりの度合いを類似度によって
表現し、類似度が閾値以上である場合、照合されたとみ
なす。照合の結果、知識データに示された文字列領域が
存在する場合、その領域を文字列として確定し、もしそ
れ以外の文字列領域が存在するならば、区別するための
情報を文字列領域情報に付加する。知識データの文書構
造情報の文字列データに対して、照合処理がすべて終了
するまで繰り返す（ステップＳＴ２１２）。終了後は、
画像判別部１０５において判別処理が行なわれるが、そ
の際、文字列照合部１０８の結果を文字列領域抽出部１
０４の出力とみなすことで、図１に示した例と同じ処理
を行なうことが可能となる。The degree of overlap is expressed as a similarity, and if the similarity is equal to or greater than a threshold, the areas are regarded as matched. If, as a result of the collation, a character string area indicated in the knowledge data exists, that area is confirmed as a character string; if other character string areas exist, information to distinguish them is added to the character string area information. This is repeated until collation has been completed for all character string data in the document structure information of the knowledge data (step ST212). Afterwards, the image discrimination unit 105 performs discrimination processing; by treating the result of the character string collation unit 108 as the output of the character string area extraction unit 104, the same processing as in the example of FIG. 1 can be performed.
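The collation of paragraphs [0075] and [0076] can be sketched as follows. The text says only that the degree of overlap is expressed as a similarity and compared with a threshold; using intersection over union as that similarity, normalizing each box relative to its enclosing photograph area, and the threshold value 0.5 are all assumptions made for this illustration, as are the function names and coordinates. The sketch also assumes the enclosing areas have non-zero width and height.

```python
def normalize(box, frame):
    """Express box (x0, y0, x1, y1) as fractions of its enclosing frame."""
    fx0, fy0, fx1, fy1 = frame
    fw, fh = fx1 - fx0, fy1 - fy0
    x0, y0, x1, y1 = box
    return ((x0 - fx0) / fw, (y0 - fy0) / fh, (x1 - fx0) / fw, (y1 - fy0) / fh)

def overlap_similarity(a, b):
    """Intersection over union of two boxes; 0.0 when they do not overlap."""
    iw = min(a[2], b[2]) - max(a[0], b[0])
    ih = min(a[3], b[3]) - max(a[1], b[1])
    if iw <= 0 or ih <= 0:
        return 0.0
    inter = iw * ih
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def collate(candidates, photo, pattern_string, pattern_photo, threshold=0.5):
    """Return the candidates that match the pattern's character string area."""
    target = normalize(pattern_string, pattern_photo)
    return [c for c in candidates
            if overlap_similarity(normalize(c, photo), target) >= threshold]

# Knowledge data: a photograph area and the expected character string area.
pattern_photo = (0, 0, 100, 100)
pattern_string = (10, 80, 90, 95)

# Extracted data: the photograph is twice as large and offset on the page.
photo = (50, 50, 250, 250)
false_string = (90, 90, 150, 120)
true_string = (70, 212, 228, 238)
matched = collate([false_string, true_string], photo, pattern_string, pattern_photo)
```

Normalizing both sides to the photograph area is what lets a pattern drawn at one size match a photograph of a different size, as the size-matching step in the text requires.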

【００７７】図１６は本発明の画像処理に入力される一
例としての画像データである。ここでは画像データ８０
１として、一行だけの文字列を含んだカメラの写真であ
る網点写真８０２が描かれているものを示している。こ
の例を用いて、文字列照合部１０８における動作を説明
する。FIG. 16 shows an example of image data input to the image processing of the present invention. The image data 801 contains a halftone dot photograph 802, a photograph of a camera that includes a character string of only one line. The operation of the character string collation unit 108 is described using this example.

【００７８】画像入力部１０１より入力した画像データ
８０１から、文字領域抽出部１０２によって図１７にお
ける画像データ８０１ａでの点線で示した文字領域８０
２ａ１、８０２ａ２が矩形領域として抽出される。ここ
で文字領域８０２ａ１は、図１６における網点写真８０
２のカメラ部分から分離したものであり、そのサイズや
形状は文字属性と類似した領域を示していて、実際には
正しい文字領域ではない。From the image data 801 input through the image input unit 101, the character area extraction unit 102 extracts, as rectangular areas, the character areas 802a1 and 802a2 indicated by dotted lines in the image data 801a of FIG. 17. The character area 802a1 was separated from the camera portion of the halftone dot photograph 802 in FIG. 16; its size and shape resemble those of a region with a character attribute, but it is not in fact a correct character area.

【００７９】同様にして、画像入力部１０１より入力し
た画像データ８０１から、特定領域抽出部１０３によっ
て図１８における画像データ８０１ｂでの点線で示した
網点領域８０２ｂが矩形領域として抽出され、文字列領
域抽出部１０４により、網点領域８０２ｂ内に存在する
文字領域８０２ａ１と８０２ａ２を対象に文字列抽出を
行なわれる。その結果、図１９に示す文字列領域８０２
ｃ１と８０２ｃ２が横方向の投票空間から文字列として
抽出されることになる。Similarly, from the image data 801 input through the image input unit 101, the specific area extraction unit 103 extracts, as a rectangular area, the halftone dot area 802b indicated by a dotted line in the image data 801b of FIG. 18, and the character string area extraction unit 104 performs character string extraction on the character areas 802a1 and 802a2 that lie within the halftone dot area 802b. As a result, the character string areas 802c1 and 802c2 shown in FIG. 19 are extracted as character strings from the horizontal voting space.

【００８０】ここで、抽出された文字列領域の情報か
ら、他の文字列領域との位置関係を計測することで、文
書構造の抽出を行う。図１に示した本発明の構成例で
は、パラグラフを抽出し、その結果をもとに文字列領域
の抽出を行なうのに対し、画像データ８０１の文字列は
一行だけで構成されているため、パラグラフは抽出でき
ず、文字列領域８０２ｃ２のみを抽出することができな
い。Here, the document structure is extracted by measuring, from the extracted character string area information, the positional relationship to the other character string areas. The configuration example of FIG. 1 extracts paragraphs and extracts character string areas based on the result, but because the character string in the image data 801 consists of only one line, no paragraph can be extracted, and the character string area 802c2 alone cannot be singled out.

【００８１】図２０は本発明にかかわる文書構造格納
部１０７に与える文書構造パターンの一例である。ここ
では画像データ９０１により、領域９０２を写真領域
と、領域９０３を文字列領域とそれぞれ位置情報を指定
している。このデータの文書構造パターンと図１９の結
果を照合することによって、文字列領域を確定する。そ
の結果、図１９における文字列領域８０２ｃ２と図２０
における文字列領域９０３が対応していることから、図
２１に示す文字列８０２ｄ２を文字列領域として残すこ
とができる。FIG. 20 shows an example of a document structure pattern given to the document structure storage unit 107 according to the present invention. Here, the image data 901 specifies the position information of the area 902 as a photograph area and of the area 903 as a character string area. The character string areas are confirmed by collating the document structure pattern of this data against the result of FIG. 19. Because the character string area 802c2 in FIG. 19 corresponds to the character string area 903 in FIG. 20, the character string 802d2 shown in FIG. 21 can be retained as a character string area.

【００８２】このように、抽出された文字列領域の情報
と、予め与えられた文書構造パターンでの文字列領域を
照合することにより、高精度で目的に応じた文字列抽出
を行うことが可能となる。以上が、図１４に示す本発明
の実施の一例についての説明である。As described above, by collating the extracted character string area information with the character string area in the document structure pattern given in advance, it is possible to extract a character string according to the purpose with high accuracy. Becomes The preceding is an explanation of the embodiment of the present invention shown in FIG.

【００８３】さらに、別の形態においては、本発明は画
像検索を行なう装置として用いることも可能である。Further, in another form, the present invention can be used as an apparatus for performing image retrieval.

【００８４】図２２は本発明の第３の実施の一例を示す
ブロック図である。図２２は、画像入力部１０１、文字
領域抽出部１０２、特定領域抽出部１０３、文字列領域
抽出部１０４、文書構造格納部１０７、文字列照合部１
０８、画像呈示部１０９で構成される。図１４のブロ
ック図から画像判別部１０５と画像出力部１０６を取り
除き、画像呈示部１０９を追加した形態をなしている。
画像入力部１０１、文字領域抽出部１０２、特定領域抽
出部１０３、文字列領域抽出部１０４、文書構造格納部
１０７は既に説明した通りであるので、ここでは文字列
照合部１０８と画像呈示部１０９について説明する。FIG. 22 is a block diagram showing an example of the third embodiment of the present invention. The apparatus of FIG. 22 comprises an image input unit 101, a character area extraction unit 102, a specific area extraction unit 103, a character string area extraction unit 104, a document structure storage unit 107, a character string collation unit 108, and an image presentation unit 109; that is, the image discrimination unit 105 and the image output unit 106 are removed from the block diagram of FIG. 14 and the image presentation unit 109 is added. The image input unit 101, character area extraction unit 102, specific area extraction unit 103, character string area extraction unit 104, and document structure storage unit 107 have already been described, so the character string collation unit 108 and the image presentation unit 109 are described here.

【００８５】画像データがファイリング装置等に保存さ
れた文書画像イメージであるとする。既に説明したよう
に、文字列照合部１０８では、抽出された文字列領域情
報と文書構造格納部１０７に予め与えた文書構造パター
ンより、文字列領域の照合を行ない、その領域を文字列
領域として確定するものである。Assume that the image data is a document image stored in a filing apparatus or the like. As already described, the character string collation unit 108 collates the character string areas using the extracted character string area information and the document structure pattern given in advance to the document structure storage unit 107, and confirms matching areas as character string areas.

【００８６】したがって、文書構造格納部１０７にユー
ザが検索したい画像イメージの文書構造パターンを指定
することによって、文字列照合部１０８における照合結
果は、即ち検索結果となる。文字列照合部１０８の照合
の結果、一致しているという情報が得られた場合、ユー
ザが所望している画像データとみなし、画像呈示部１０
９にその画像を提示する。Therefore, if the user specifies in the document structure storage unit 107 the document structure pattern of the image to be retrieved, the collation result of the character string collation unit 108 is itself the retrieval result. If the collation by the character string collation unit 108 indicates a match, the image data is regarded as the image data the user wants, and the image is presented on the image presentation unit 109.

【００８７】画像呈示部１０９は、ディスプレイのよう
に画像を表示するものでも良いし、プリンタのように紙
に出力するもので良い。以上の処理を複数の画像データ
に対して行なうことにより、ファイリング装置における
文書画像イメージの検索操作となる。このように、抽出
された文字列領域の情報と、ユーザが検索したい画像イ
メージの文書構造パターンから抽出された文字列領域と
を照合することにより、画像データの検索を行なうこと
が可能となる。以上が、図２２に示す本発明の実施の一
例についての説明である。The image presenting unit 109 may display an image like a display, or output it to paper like a printer. By performing the above processing on a plurality of image data, a retrieval operation of a document image in the filing apparatus is performed. As described above, it is possible to search for image data by comparing the information on the extracted character string area with the character string area extracted from the document structure pattern of the image image that the user wants to search. The preceding is an explanation of the embodiment of the present invention shown in FIG.

【００８８】このように本装置は、画像データを画像入
力部１０１により取り込み、これを文字領域抽出部１０
２および特定領域抽出部１０３において各画素および近
傍画素の特徴から物理的あるいは論理的に連結しているも
のを一つの領域として抽出したのち、個々の領域の画像
上の位置、大きさ、形状、構造、濃度分布等の特徴量を
計測し、その計測結果を予め定められたルールに基づい
て文書構成要素として文字や写真などの特定属性を持っ
た領域として識別する。As described above, this apparatus takes in image data through the image input unit 101; the character area extraction unit 102 and the specific area extraction unit 103 extract as one region those pixels that are physically or logically connected, judging from the features of each pixel and its neighboring pixels; feature quantities such as the position on the image, size, shape, structure, and density distribution of each region are then measured; and, based on predetermined rules applied to the measurement results, each region is identified as a document component having a specific attribute such as character or photograph.

【００８９】そして、文字列領域抽出部１０４において
写真などの特定属性を持った領域内に存在する文字属性
を持つ領域に対して、その周囲に存在する他の領域の属
性、大きさ、形状等の特徴量を計測し、その結果該当す
る文字属性を持つ領域が文字列を構成する要素と判断さ
れた場合、文字列領域として抽出を行ない、画像判別部
１０５において特定属性領域の判別を行ない、領域の画
素属性の修正を可能としたものである。Then, for each region having a character attribute that lies within an area having a specific attribute such as a photograph, the character string area extraction unit 104 measures feature quantities such as the attribute, size, and shape of the other regions around it. If the region is judged to be an element of a character string, it is extracted as a character string area; the image discrimination unit 105 then discriminates the specific attribute area, which makes it possible to correct the pixel attributes of the area.

【００９０】また、文書構造格納部１０７に予め知識デ
ータとして与えた文書構造パターンと抽出された文字列
領域情報とを文字列照合部１０８において照合し、文字
列領域を確定することにより、さらに正確な文字列抽出
を可能にしたものである。さらに、文字列照合部１０８
において照合した結果を利用することで、画像検索を可
能にしたものである。In addition, the character string collation unit 108 collates the document structure pattern given in advance to the document structure storage unit 107 as knowledge data against the extracted character string area information and confirms the character string areas, which makes still more accurate character string extraction possible. Furthermore, using the result of the collation in the character string collation unit 108 makes image retrieval possible.

【００９１】したがって、本装置により入力された文書
画像データに対し、文字領域と写真、図、表、網点等の
異なる文書要素を区別して領域抽出し、その領域に存在
する文字領域の位置情報から文字列を抽出することによ
って、文字に類似した部分と文字とを正確に分離するこ
とが可能となり、この文字列の情報を利用して、写真な
どの領域の画素属性を修正することによって、複雑な領
域の判別に対応できる。Therefore, for document image data input to this apparatus, character areas and other document elements such as photographs, figures, tables, and halftone dots are distinguished and extracted as areas, and character strings are extracted from the position information of the character areas that lie within those areas. This makes it possible to separate characters accurately from portions that merely resemble characters, and correcting the pixel attributes of areas such as photographs using this character string information allows complex areas to be discriminated.

【００９２】この出力結果をフィルタリングや圧縮等の
画像処理に利用することで、ハードコピーとして出力す
る際にも奇麗な出力を得ることができ、イメージファイ
ルとして保存する際にも文字と写真を分離することも可
能となる。By using this output for image processing such as filtering and compression, clean output can be obtained when producing a hard copy, and characters and photographs can be separated when saving as an image file.

【００９３】また、予め与えられた文書構造パターンと
文字列情報を照合することで、ファイリングされた画像
データからユーザが所望するデータを検索することが可
能となる。In addition, collating a document structure pattern given in advance against the character string information makes it possible to retrieve the data the user wants from filed image data.

【００９４】[0094]

【発明の効果】以上説明したように本発明によれば、文
書をハードコピーしたり、イメージデータに変換して保
存しようとする場合、入力された文書画像データに対
し、文字領域と写真、図、表、網点等の異なる文書要素
を区別して領域として抽出・識別し、写真、図、表、網
点等の領域に存在する文字領域の存在位置から文字列を
抽出することによって、文字に類似した部分と文字とを
正確に分離することが可能となる。この文字列の情報を
利用して、写真などに文字を含んでいるような複雑な領
域に対して、正確な判別が行なえるようになる。[Effects of the Invention] As described above, according to the present invention, when a document is to be hard-copied or converted into image data and saved, character areas and other document elements such as photographs, figures, tables, and halftone dots in the input document image data are distinguished, extracted, and identified as areas, and character strings are extracted from the positions of the character areas that lie within areas such as photographs, figures, tables, and halftone dots. This makes it possible to separate characters accurately from portions that merely resemble characters. Using this character string information, complex areas, such as photographs that contain characters, can be discriminated accurately.

【００９５】その結果、例えばハードコピーをとる場
合、写真中に存在する文字を見やすいようにフィルタリ
ング処理を施したり、イメージファイルとして保存する
際にも文字と写真を分離することができ、再利用しやす
い画像データが得られる。As a result, for example, when a hard copy is made, filtering can be applied so that characters within a photograph are easy to read, and when saving as an image file, characters and photographs can be separated, yielding image data that is easy to reuse.

【００９６】また、予め与えられた文書構造パターンと
文字列情報を比較することで、ファイリングされた画像
データからユーザが所望するデータを検索することがで
きる。In addition, comparing a document structure pattern given in advance with the character string information makes it possible to retrieve the data the user wants from filed image data.

[Brief description of the drawings]

【図１】本発明に係わる画像処理装置の構成図であり、
実施の一例を示すブロック図。FIG. 1 is a configuration diagram of the image processing apparatus according to the present invention, shown as a block diagram of an example embodiment.

【図２】図１の構成の本発明装置の画像処理の例を示す
フローチャート。FIG. 2 is a flowchart showing an example of image processing of the apparatus of the present invention having the configuration of FIG. 1;

【図３】本発明に係わる画像処理装置の文字列抽出処理
を示すフローチャート。FIG. 3 is a flowchart showing a character string extraction process of the image processing apparatus according to the present invention.

【図４】本発明に係わる画像処理装置の文字列抽出処理
における文字属性の探索領域設定および探索方法の説明
図。FIG. 4 is an explanatory diagram of a search area setting and search method of a character attribute in a character string extraction process of the image processing apparatus according to the present invention.

【図５】本発明に係わる画像処理装置において扱われる
文書画像の例を示す図。FIG. 5 is a view showing an example of a document image handled in the image processing apparatus according to the present invention.

【図６】本発明に係わる画像処理装置の文字領域抽出処
理の文字領域の出力例。FIG. 6 is an output example of a character area in a character area extraction process of the image processing apparatus according to the present invention.

【図７】本発明に係わる画像処理装置の特定領域抽出処
理の特定領域の出力例。FIG. 7 is an output example of a specific area in a specific area extraction process of the image processing apparatus according to the present invention.

【図８】本発明に係わる画像処理装置の文字列領域抽出
処理の文字列領域の出力例。FIG. 8 is an output example of a character string area in the character string area extraction processing of the image processing apparatus according to the present invention.

【図９】本発明に係わる画像処理装置において扱われる
文書画像の例を示す図。FIG. 9 is a view showing an example of a document image handled by the image processing apparatus according to the present invention.

【図１０】本発明に係わる画像処理装置の文字領域抽出
処理の文字領域の出力例。FIG. 10 is an output example of a character area in a character area extraction process of the image processing apparatus according to the present invention.

【図１１】本発明に係わる画像処理装置の特定領域抽出
処理の特定領域の出力例。FIG. 11 is an output example of a specific area in a specific area extraction process of the image processing apparatus according to the present invention.

【図１２】本発明に係わる画像処理装置の文字列領域抽
出処理の文字列領域の出力例。FIG. 12 is an output example of a character string area in the character string area extraction processing of the image processing apparatus according to the present invention.

【図１３】本発明に係わる画像処理装置の文字列領域抽
出処理の文字列領域の出力例。FIG. 13 is an output example of a character string area in the character string area extraction processing of the image processing apparatus according to the present invention.

【図１４】本発明に係わる画像処理装置の構成図であ
り、実施の一例を示すブロック図。FIG. 14 is a configuration diagram of an image processing apparatus according to the present invention, and is a block diagram showing an example of an embodiment.

【図１５】図１４の構成の本発明装置の画像処理の例を
示すフローチャート。FIG. 15 is a flowchart showing an example of image processing of the apparatus of the present invention having the configuration of FIG. 14.

【図１６】本発明に係わる画像処理装置において扱われ
る文書画像の例を示す図。FIG. 16 is a view showing an example of a document image handled in the image processing apparatus according to the present invention.

【図１７】本発明に係わる画像処理装置の文字領域抽出
処理の文字領域の出力例。FIG. 17 is an output example of a character area in a character area extraction process of the image processing apparatus according to the present invention.

【図１８】本発明に係わる画像処理装置の特定領域抽出
処理の特定領域の出力例。FIG. 18 is an output example of a specific area in a specific area extraction process of the image processing apparatus according to the present invention.

【図１９】本発明に係わる画像処理装置の文字列領域抽
出処理の文字列領域の出力例。FIG. 19 is an output example of a character string area in the character string area extraction processing of the image processing apparatus according to the present invention.

【図２０】本発明に係わる画像処理装置に与える文書構
造の例を示す図 FIG. 20 is a diagram showing an example of a document structure given to the image processing apparatus according to the present invention.

【図２１】本発明に係わる画像処理装置の文字列領域抽
出処理の文字列領域の出力例。FIG. 21 is an output example of a character string area in the character string area extraction processing of the image processing apparatus according to the present invention.

【図２２】本発明に係わる画像処理装置の構成図であ
り、実施の一例を示すブロック図。FIG. 22 is a configuration diagram of an image processing apparatus according to the present invention, and is a block diagram illustrating an example of an embodiment.

[Explanation of symbols]

１０１…画像入力部　１０２…文字領域抽出部　１０３…特定領域抽出部　１０４…文字列領域抽出部　１０５…画像判別部　１０６…画像出力部　１０７…文書構造格納部　１０８…文字列照合部　１０９…画像呈示部
101: image input unit
102: character region extraction unit
103: specific region extraction unit
104: character string region extraction unit
105: image discrimination unit
106: image output unit
107: document structure storage unit
108: character string collation unit
109: image presentation unit

フロントページの続き　Ｆターム(参考）　5B057 CA16 CC03 CH11 DA08 DB02 DC30 DC36　5L096 AA06 BA07 BA08 BA20 CA18 DA01 DA05 EA35 EA43 FA02 GA07
Continued from the front page. F-terms (reference): 5B057 CA16 CC03 CH11 DA08 DB02 DC30 DC36; 5L096 AA06 BA07 BA08 BA20 CA18 DA01 DA05 EA35 EA43 FA02 GA07

Claims

[Claims]

1. An image processing apparatus comprising: character area extracting means for extracting a character area having character characteristics based on the density values of each pixel and its neighboring pixels in an image read as image data; specific area extracting means for extracting a specific area having characteristics other than those of characters; character string area extracting means for extracting a character string area from within the specific area extracted by the specific area extracting means, using information on the character area extracted by the character area extracting means; and image determining means for determining the image of the specific area using information on the character string area extracted by the character string area extracting means.

2. The image processing apparatus according to claim 1, wherein the specific area extracted by the specific area extracting means is at least one of a photograph, a figure, a table, and a halftone dot.

3. The image processing apparatus according to claim 1, wherein the character string area extracting means extracts a vertical character string area and a horizontal character string area from the image data individually.

4. The image processing apparatus according to claim 1, wherein said character string area extracting means extracts a paragraph area by measuring a positional relationship between a plurality of character string areas.

5. An image processing apparatus comprising: character area extracting means for extracting a character area having character characteristics based on the density values of each pixel and its neighboring pixels in an image read as image data; specific area extracting means for extracting a specific area having characteristics other than those of characters; character string area extracting means for extracting a character string area from within the specific area extracted by the specific area extracting means, using information on the character area extracted by the character area extracting means; document structure storing means for storing the position of a character string area as a document structure; character string matching means for determining a character string area by comparing the character string area extracted by the character string area extracting means with the document structure stored in the document structure storing means; and image determining means for determining the image of the specific area using information on the character string area determined by the character string matching means.

6. An image processing apparatus comprising: character area extracting means for extracting a character area having character characteristics based on the density values of each pixel and its neighboring pixels in an image read as image data; specific area extracting means for extracting a specific area having characteristics other than those of characters; character string area extracting means for extracting a character string area from within the specific area extracted by the specific area extracting means, using information on the character area extracted by the character area extracting means; document structure storing means for storing the position of a character string area as a document structure; character string matching means for determining a character string area by comparing the character string area extracted by the character string area extracting means with the document structure stored in the document structure storing means; and image presenting means for presenting the read image using information on the character string area determined by the character string matching means.

7. An image processing method comprising: extracting a character area having character characteristics and a specific area having characteristics other than those of characters, based on the density values of each pixel and its neighboring pixels in an image read as image data; extracting a character string area from within the extracted specific area using information on the extracted character area; and determining the image of the specific area using information on the extracted character string area.

8. An image processing method comprising: extracting a character area having character characteristics and a specific area having characteristics other than those of characters, based on the density values of each pixel and its neighboring pixels in an image read as image data; extracting a character string area from within the extracted specific area using information on the extracted character area; determining a character string area by collating the extracted character string area with a document structure indicating the position of the character string area to be extracted; and determining the image of the specific area using information on the determined character string area.

9. An image processing method comprising: extracting a character area having character characteristics and a specific area having characteristics other than those of characters, based on the density values of each pixel and its neighboring pixels in an image read as image data; extracting a character string area from within the extracted specific area using information on the extracted character area; determining a character string area by collating the extracted character string area with a document structure indicating the position of the character string area to be extracted; and presenting the read image using information on the determined character string area.

10. A machine-readable recording medium on which a program for performing image processing is recorded, the program comprising: extracting a character area having character characteristics and a specific area having characteristics other than those of characters, based on the density values of each pixel and its neighboring pixels in an image read as image data; extracting a character string area from within the extracted specific area using information on the extracted character area; and determining the image of the specific area using information on the extracted character string area.

11. A machine-readable recording medium on which a program for performing image processing is recorded, the program comprising: extracting a character area having character characteristics and a specific area having characteristics other than those of characters, based on the density values of each pixel and its neighboring pixels in an image read as image data; extracting a character string area from within the extracted specific area using information on the extracted character area; determining a character string area by collating the extracted character string area with a document structure indicating the position of the character string area to be extracted; and determining the image of the specific area using information on the determined character string area.

12. A machine-readable recording medium on which a program for performing image processing is recorded, the program comprising: extracting a character area having character characteristics and a specific area having characteristics other than those of characters, based on the density values of each pixel and its neighboring pixels in an image read as image data; extracting a character string area from within the extracted specific area using information on the extracted character area; determining a character string area by collating the extracted character string area with a document structure indicating the position of the character string area to be extracted; and presenting the read image using information on the determined character string area.