JP4631371B2 - Image processing device

Info

Publication number: JP4631371B2
Application number: JP2004274300A
Authority: JP
Inventors: 真之久武
Original assignee: Fuji Xerox Co Ltd; Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2004-09-22
Filing date: 2004-09-22
Publication date: 2011-02-16
Anticipated expiration: 2024-09-22
Also published as: JP2006092050A

Description

The present invention relates to an image processing apparatus that separates character portions and picture portions from image data and performs predetermined processing on them.

Raster image data (hereinafter simply "image data" except where a distinction is needed) can contain many image elements with differing properties, such as character (text) portions and natural-picture (picture) portions. Because of these differences, it is often preferable to process each kind of element differently; in compression, for example, different compression schemes suit different elements.

For this reason, image processing known as T/I separation has long been researched and developed. One conventional T/I separation method binarizes the image to be processed, delimits regions of contiguous black pixels, and, when the size of a delimited region falls below a predetermined threshold, judges that the black pixels in that region represent a character (Patent Document 1).

In some techniques, for a region judged in this way to be a character portion, a representative color is determined from the pixel values contained in the region, and the pixels making up each character are set to that representative color, which further raises the compression ratio (Patent Document 2).

Cited patent documents: JP 2003-8909 A (see paragraph 0026); JP 2002-165105 A; JP 2002-175532 A

However, the above method determines a single color for the whole region judged to be a character portion. When, for example, a gradation has been applied to a character image so that the characters are rendered in multiple colors, that rendering is lost.

On the other hand, if the character portion is not separated when characters are rendered in multiple colors, the benefit of T/I separation is not fully obtained; the compression ratio drops, for example.

The present invention has been made in view of the above circumstances, and one of its objects is to provide an image processing apparatus capable of processing that accommodates character images rendered in multiple colors.

To solve the problems of the conventional examples above, the present invention provides an image processing apparatus comprising: means for delimiting, from at least part of the image data to be processed, regions each containing at least one pixel cluster judged to be a character; character color count determination means for determining, as a character color count, the number of colors in the pixel clusters judged to be characters in each region; and background color count determination means for determining, as a background color count, the number of colors of the pixels forming the background portion of each region, wherein the character color count and the background color count obtained by these determinations are supplied to predetermined image processing.

As the predetermined processing, when the determined character color count is at most a predetermined N colors, the apparatus may output a mask image representing the pixel clusters together with representative color information determined from the colors contained in those pixel clusters. When the determined character color count exceeds N colors and the background color count is at most a predetermined M colors, the apparatus may invert the mask image representing the pixel clusters to generate a mask image representing the background portion, and output that background mask image together with representative color information determined from the colors contained in the background.

Furthermore, when the determined character color count exceeds N colors and the background color count is at most M colors, the apparatus may dilate the significant pixels of the mask image representing the pixel clusters, invert the dilated mask image to generate a mask image representing the background portion, and output that background mask image together with representative color information determined from the colors contained in the background.
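The choice between the character mask and the inverted (optionally dilated) background mask can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the 0/1 list-of-lists mask representation, the 3x3 dilation neighbourhood, and the `None` result when both counts exceed their limits are all hypothetical and not taken from the source.

```python
from typing import List, Optional, Tuple

Mask = List[List[int]]  # assumed representation: 1 = significant pixel, 0 = other


def invert(mask: Mask) -> Mask:
    """Swap significant and non-significant pixels."""
    return [[1 - v for v in row] for row in mask]


def dilate(mask: Mask) -> Mask:
    """3x3 dilation (assumed neighbourhood): a pixel becomes 1 if it or
    any of its 8 neighbours is 1."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < h and 0 <= xx < w and mask[yy][xx]:
                        out[y][x] = 1
    return out


def choose_mask(char_mask: Mask, char_colors: int, bg_colors: int,
                n_max: int, m_max: int,
                expand: bool = False) -> Optional[Tuple[str, Mask]]:
    """Return which layer gets a mask plus representative colors.

    - character color count <= N: the character mask itself
    - otherwise, background color count <= M: the inverted mask,
      dilated first when `expand` is set
    - otherwise None (assumption: the source does not describe this case)
    """
    if char_colors <= n_max:
        return ("character", char_mask)
    if bg_colors <= m_max:
        base = dilate(char_mask) if expand else char_mask
        return ("background", invert(base))
    return None
```

Dilating before inverting shrinks the background mask away from the character edges, so pixels near a character boundary are not assigned the background's representative color.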

An image processing method according to one aspect of the present invention comprises the steps of: delimiting, from at least part of the image data to be processed, regions each containing at least one pixel cluster judged to be a character; determining, as a character color count, the number of colors in the pixel clusters judged to be characters in each region; and determining, as a background color count, the number of colors of the pixels forming the background portion of each region, wherein the character color count and the background color count obtained by these determinations are supplied to predetermined image processing.

A program according to another aspect of the present invention causes a computer to execute the procedures of: delimiting, from at least part of the image data to be processed, regions each containing at least one pixel cluster judged to be a character; determining, as a character color count, the number of colors in the pixel clusters judged to be characters in each region; and determining, as a background color count, the number of colors of the pixels forming the background portion of each region, wherein the character color count and the background color count obtained by these determinations are supplied to predetermined image processing.

As shown in FIG. 1, the image processing apparatus according to an embodiment of the present invention comprises a control unit 11, a storage unit 12, an image input unit 13, and an image output unit 14. The control unit 11 operates according to a program stored in the storage unit 12 and carries out the image processing operations described in detail below.

The storage unit 12 holds the program executed by the control unit 11. It also serves as a work memory that stores the various data generated in the course of the control unit 11's processing. Specifically, the storage unit 12 can be implemented as a computer-readable recording medium together with a device that writes data to and reads data from that medium (for example, a hard disk device or a memory device).

The image input unit 13 is, for example, a scanner, and outputs to the control unit 11 image data obtained by optically reading a document. Here, each pixel value in the image data output by the image input unit 13 is assumed to be expressed in the RGB (red, green, blue) color space. The image output unit 14 outputs image data according to instructions from the control unit 11, for example by sending it to an image forming unit (such as a printer) or by transmitting it to an external device over a network.

Next, the processing performed by the control unit 11 is described. As shown functionally in FIG. 2, the control unit 11 of this embodiment takes the image data input from the image input unit 13 as its processing target and comprises a preprocessing unit 21 that applies predetermined preprocessing to that image data, a character extraction unit 22 that extracts character image portions, a color count determination unit 23, a representative color determination unit 24, a mask image generation unit 25, a post-processing unit 26, and a formatting unit 27.

Each of these units is described in detail below. The preprocessing unit 21 converts each pixel value of the image data input from the image input unit 13 (the image data to be processed) from RGB to YCbCr (values consisting of luminance and color differences). Specifically, the conversion can be performed using equation (1). Here, each RGB component is assumed to take a value from 0x00 to 0xFF ("0x" denotes hexadecimal). The preprocessing unit 21 may also apply tone correction to each pixel value based on the luminance and saturation of the background (paper) region, although this tone correction is not essential.
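Equation (1) is not reproduced in this text, so the coefficients the patent uses are not known here. As an illustration only, the sketch below uses the full-range ITU-R BT.601 coefficients commonly used in JFIF; this choice, the function name `rgb_to_ycbcr`, and the rounding and clamping to 0x00 to 0xFF are assumptions.

```python
def rgb_to_ycbcr(r: int, g: int, b: int) -> tuple:
    """Convert one 8-bit RGB pixel to 8-bit YCbCr.

    Assumption: full-range BT.601 (JFIF) coefficients; the patent's
    equation (1) may differ.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = -0.168736 * r - 0.331264 * g + 0.5 * b + 128
    cr = 0.5 * r - 0.418688 * g - 0.081312 * b + 128

    def clamp(v: float) -> int:
        # Round to the nearest integer and keep the result in 0x00..0xFF.
        return max(0, min(255, int(round(v))))

    return clamp(y), clamp(cb), clamp(cr)
```

With these coefficients, neutral pixels map to Cb = Cr = 128, so a later stage can estimate saturation from the distance of (Cb, Cr) from (128, 128).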

The character extraction unit 22 identifies, in the image data output by the preprocessing unit 21, regions containing pixel clusters judged to be characters. In this embodiment, the character extraction unit 22 identifies character portions by layout analysis.

The layout analysis processing is as follows. As shown functionally in FIG. 3, the character extraction unit 22 comprises a binarization unit 41, a connected-pixel extraction unit 42, a basic rectangle defining unit 43, a first separator detection unit 44, a row rectangle defining unit 45, a second separator detection unit 46, a character area defining unit 47, a noise determination unit 48, and a character portion specifying unit 49.

The binarization unit 41 takes as its processing target the partial image data (picture-candidate partial data) lying within the region delimited by the picture-candidate region definition information, out of the image data expressed in the YCbCr color space output by the preprocessing unit 21 (the original image data). It binarizes this picture-candidate partial data to generate binarized picture-candidate partial data.

The connected-pixel extraction unit 42 applies labeling to the binarized picture-candidate partial data and identifies a plurality of pixel groups (connected pixel groups), each consisting of contiguous pixels whose values satisfy a predetermined condition (for example, being black pixels).

The basic rectangle defining unit 43 defines, as a basic rectangle, a rectangle associated with each connected pixel group identified by the connected-pixel extraction unit 42 (for example, the rectangle circumscribing the group), and generates coordinate information for the basic rectangle of each connected pixel group (the coordinates that define the rectangle). It then issues a unique identifier for each basic rectangle and stores the identifier in association with the rectangle's coordinate information in the storage unit 12 as a basic rectangle database.
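The labeling and basic-rectangle steps can be sketched together as below. This is an illustrative sketch only: 8-connectivity, the breadth-first traversal, the value 1 for pixels that satisfy the condition, and a plain dictionary standing in for the basic rectangle database are assumptions, since the source does not specify them.

```python
from collections import deque
from typing import Dict, List, Tuple

Rect = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max), inclusive


def basic_rectangles(binary: List[List[int]]) -> Dict[int, Rect]:
    """Label connected groups of 1-pixels (8-connectivity assumed) and
    return {identifier: circumscribing rectangle}."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    rects: Dict[int, Rect] = {}
    next_id = 1
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first traversal of one connected pixel group.
            queue = deque([(x, y)])
            seen[y][x] = True
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            rects[next_id] = (x0, y0, x1, y1)  # unique identifier per rectangle
            next_id += 1
    return rects
```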

The first separator detection unit 44 scans the pixels of the picture-candidate partial data being processed in raster-scan order: starting from the top-left pixel, it scans one line from left to right, then scans the next line down in the same way. When pixels whose values do not satisfy the predetermined condition of the labeling (for example, white pixels) continue for more than a predetermined horizontal threshold, it detects that run of pixels as a (horizontal) first separator. The first separator detection unit 44 generates information identifying each detected first separator (such as the coordinates of the leftmost and rightmost pixels of the run; the coordinates may be those on the original image data or local coordinates on the picture-candidate partial data) and stores it in the storage unit 12.

The first separator detection unit 44 likewise scans the pixels in raster-scan order, starting from the top-left pixel of the picture-candidate partial data being processed. When pixels whose values do not satisfy the predetermined condition of the labeling (for example, white pixels) continue for more than a predetermined vertical threshold, it detects that run of pixels as a (vertical) first separator. It generates information identifying that first separator (such as the coordinates of the topmost and bottommost pixels of the run; the coordinates may be those on the original image data or local coordinates on the picture-candidate partial data) and stores it in the storage unit 12.

In this processing, the user may set the horizontal and vertical thresholds freely. The horizontal threshold serves to separate the columns in a multi-column layout, and the vertical threshold serves to separate the lines in a document containing two or more lines of text. Instead of being set by the user, the horizontal threshold may be determined as the value of a predetermined function of a statistic (such as the mean) of the basic rectangle widths, and the vertical threshold as the value of a predetermined function of a statistic (such as the mean) of the basic rectangle heights.
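Run detection of this kind can be sketched as follows. The sketch is illustrative: the function names, the tuple layout of the results, and the use of a transposed image for the vertical pass are assumptions. A run qualifies only when its length is strictly greater than the threshold, matching "more than" in the text. The `target` parameter is a hypothetical convenience: 0 stands for pixels that fail the labeling condition (white), and passing 1 would find runs of pixels that satisfy it.

```python
from typing import List, Tuple


def horizontal_runs(binary: List[List[int]], threshold: int,
                    target: int = 0) -> List[Tuple[int, int, int]]:
    """Return (y, x_left, x_right) for each row-wise run of `target`
    pixels whose length exceeds `threshold`."""
    runs = []
    for y, row in enumerate(binary):
        start = None
        # A trailing None sentinel closes a run that reaches the right edge.
        for x, v in enumerate(list(row) + [None]):
            if v == target:
                if start is None:
                    start = x
            elif start is not None:
                if x - start > threshold:
                    runs.append((y, start, x - 1))
                start = None
    return runs


def vertical_runs(binary: List[List[int]], threshold: int,
                  target: int = 0) -> List[Tuple[int, int, int]]:
    """Return (x, y_top, y_bottom) for each column-wise run, found by
    applying the row-wise scan to the transposed image."""
    transposed = [list(col) for col in zip(*binary)]
    return horizontal_runs(transposed, threshold, target)
```

A threshold derived from the data, for example a multiple of the mean basic rectangle width, could be passed in place of a user-set constant.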

Specifically, the first separators are detected as shown in FIG. 4(a). In FIG. 4(a), the first separators have been detected adjacent to one another and are therefore drawn as a single first separator region (the hatched area).

The row rectangle defining unit 45 selects one of the basic rectangles stored in the storage unit 12 as the basic rectangle of interest. It then performs the following processing while selecting, one after another as the basic rectangle to be processed, each stored basic rectangle that has not yet been selected as a basic rectangle of interest.

First, the row rectangle defining unit 45 computes the vector from the center coordinates of the basic rectangle of interest (the midpoint, when the coordinate information gives the coordinates of diagonally opposite vertices) to the center coordinates of the basic rectangle being processed. It then computes the distance between the two rectangles from the magnitude of the vector (the square root of the sum of the squares of its components). When this distance is at most a predetermined distance threshold, the unit checks whether the vector intersects any of the detected first separators. Any widely known method for testing whether two line segments intersect can be used here. When the vector intersects none of the detected first separators, the identifier of the basic rectangle being processed is associated with the identifier of the basic rectangle of interest and stored in the storage unit 12 as a basic rectangle relation database.
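The relation test between two basic rectangles can be sketched as below. The text leaves the intersection test open ("any widely known method"), so the orientation-based segment test used here is one common choice and an assumption, as are the function names and the representation of each separator as a pair of end points.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Rect = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def center(rect: Rect) -> Point:
    """Midpoint of the two diagonally opposite vertices."""
    x0, y0, x1, y1 = rect
    return ((x0 + x1) / 2, (y0 + y1) / 2)


def _orient(p: Point, q: Point, r: Point) -> int:
    """Sign of the cross product (q - p) x (r - p)."""
    v = (q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0])
    return (v > 0) - (v < 0)


def _within(p: Point, q: Point, r: Point) -> bool:
    """For r collinear with pq: does r lie inside the bounding box of pq?"""
    return (min(p[0], q[0]) <= r[0] <= max(p[0], q[0])
            and min(p[1], q[1]) <= r[1] <= max(p[1], q[1]))


def segments_intersect(a: Point, b: Point, c: Point, d: Point) -> bool:
    """True if segment ab and segment cd share at least one point."""
    o1, o2 = _orient(a, b, c), _orient(a, b, d)
    o3, o4 = _orient(c, d, a), _orient(c, d, b)
    if o1 != o2 and o3 != o4:
        return True
    # Collinear special cases.
    return ((o1 == 0 and _within(a, b, c)) or (o2 == 0 and _within(a, b, d))
            or (o3 == 0 and _within(c, d, a)) or (o4 == 0 and _within(c, d, b)))


def are_related(rect_a: Rect, rect_b: Rect,
                separators: List[Tuple[Point, Point]],
                distance_threshold: float) -> bool:
    """Related when the centers are within the distance threshold and the
    center-to-center vector crosses no separator."""
    ca, cb = center(rect_a), center(rect_b)
    if math.hypot(cb[0] - ca[0], cb[1] - ca[1]) > distance_threshold:
        return False
    return not any(segments_intersect(ca, cb, s, e) for s, e in separators)
```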

The row rectangle defining unit 45 performs the above processing while selecting each basic rectangle stored in the storage unit 12 in turn as the basic rectangle of interest. Referring to the resulting basic rectangle relation database, it identifies each group of basic rectangles (there may be several groups) that are related to one another through a chain of associations, and defines the rectangle circumscribing the basic rectangles in each group as a row rectangle (see, for example, FIG. 4(b)).

For example, when the basic rectangle relation database associates the basic rectangle with identifier "1" with the basic rectangle with identifier "2", and the basic rectangle with identifier "2" with the basic rectangle with identifier "3", the row rectangle defining unit 45 merges these results and identifies the basic rectangles "1", "2", and "3" as one group. From the coordinate information of the basic rectangles in the group, it extracts the maximum and minimum x (horizontal) values and, likewise, the maximum and minimum y (vertical) values. It then defines a row rectangle whose top-left corner is the first coordinate, formed by the minimum x and the minimum y, and whose bottom-right corner is the second coordinate, formed by the maximum x and the maximum y. The row rectangle is thus defined by coordinate information containing these two coordinates.
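The chaining of pairwise relations into groups, and the circumscribing rectangle of each group, can be sketched with a union-find structure. The use of union-find, the function name, and the dictionary and list inputs standing in for the two databases are assumptions; the source only states that chained relations are merged.

```python
from typing import Dict, List, Tuple

Rect = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def group_rectangles(rects: Dict[int, Rect],
                     relations: List[Tuple[int, int]]
                     ) -> List[Tuple[List[int], Rect]]:
    """Merge identifiers linked by a chain of relations and return, for
    each group, (sorted member identifiers, circumscribing rectangle)."""
    parent = {i: i for i in rects}

    def find(i: int) -> int:
        # Follow parent links to the group root, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for a, b in relations:
        parent[find(a)] = find(b)

    groups: Dict[int, List[int]] = {}
    for i in rects:
        groups.setdefault(find(i), []).append(i)

    result = []
    for members in groups.values():
        bounds = (
            min(rects[i][0] for i in members),  # minimum x
            min(rects[i][1] for i in members),  # minimum y
            max(rects[i][2] for i in members),  # maximum x
            max(rects[i][3] for i in members),  # maximum y
        )
        result.append((sorted(members), bounds))
    return sorted(result)
```

With relations (1, 2) and (2, 3), identifiers 1, 2, and 3 end up in a single group, as in the example in the text.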

The row rectangle defining unit 45 issues a unique identifier for each row rectangle defined in this way, and stores in the storage unit 12, as a row rectangle database, each identifier in association with the coordinate information of the row rectangle and information identifying the group of basic rectangles it contains (such as a list of the basic rectangle identifiers).

The second separator detection unit 46 scans the pixels in raster-scan order, starting from the top-left pixel of the picture-candidate partial data being processed. When pixels whose values satisfy the predetermined condition of the labeling in the connected-pixel extraction unit 42 (for example, black pixels) continue for more than a predetermined horizontal threshold, it detects that run of pixels as a (horizontal) second separator. It generates information identifying that second separator (such as the coordinates of the leftmost and rightmost pixels of the run; the coordinates may be those on the original image data or local coordinates on the picture-candidate partial data) and stores it in the storage unit 12.

The second separator detection unit 46 likewise scans the pixels in raster-scan order, starting from the top-left pixel of the picture-candidate partial data being processed. When pixels whose values satisfy the predetermined condition of the labeling in the connected-pixel extraction unit 42 (for example, black pixels) continue for more than a predetermined vertical threshold, it detects that run of pixels as a (vertical) second separator. It generates information identifying that second separator (such as the coordinates of the topmost and bottommost pixels of the run; the coordinates may be those on the original image data or local coordinates on the picture-candidate partial data) and stores it in the storage unit 12.

In this processing too, the user may set the horizontal and vertical thresholds freely. The horizontal threshold serves to separate the columns in a multi-column layout, and the vertical threshold serves to separate the lines in a document containing two or more lines of text. Instead of being set by the user, the horizontal threshold may be determined as the value of a predetermined function of a statistic (such as the mean) of the basic rectangle widths, and the vertical threshold as the value of a predetermined function of a statistic (such as the mean) of the basic rectangle heights.

The character area defining unit 47 selects one of the row rectangles defined by the row rectangle defining unit 45 as the row rectangle of interest. It then performs the following processing while selecting, one after another as the row rectangle to be processed, each row rectangle stored in the storage unit 12 that has not yet been selected as a row rectangle of interest.

That is, it generates the polygonal region delimited by the line segments joining each vertex of the row rectangle of interest to the corresponding vertex of the row rectangle being processed, together with the sides of the two row rectangles, and checks whether this polygonal region intersects a second separator (that is, whether the two regions overlap at least in part). Any widely known method for testing whether two regions intersect can be used here.

When the polygonal region does not intersect a second separator, the identifier of the row rectangle of interest and the identifier of the row rectangle being processed are associated with each other and stored in the storage unit 12 as a row rectangle relation database.

The character area defining unit 47 performs the above processing while selecting each row rectangle stored in the storage unit 12 in turn as the row rectangle of interest. Referring to the resulting row rectangle relation database, it identifies each group of row rectangles (there may be several groups) that are related to one another through a chain of associations, and defines the rectangle circumscribing the row rectangles in each group as a character area.

For example, when the row rectangle relation database associates the row rectangle with identifier "1" with the row rectangle with identifier "2", and the row rectangle with identifier "2" with the row rectangle with identifier "3", the row rectangle defining unit 45 merges these results and identifies the row rectangles "1", "2", and "3" as one group. From the coordinate information of the row rectangles in the group, it extracts the maximum and minimum x (horizontal) values and, likewise, the maximum and minimum y (vertical) values. It then defines the rectangle of a character area whose top-left corner is the first coordinate, formed by the minimum x and the minimum y, and whose bottom-right corner is the second coordinate, formed by the maximum x and the maximum y. The character area is thus defined by coordinate information containing these two coordinates.

The character area defining unit 47 issues a unique identifier for each character area defined in this way, and stores in the storage unit 12, as a character area database, each identifier in association with the coordinate information of the character area and information identifying the group of row rectangles it contains (such as a list of the row rectangle identifiers).

The noise determination unit 48 performs noise determination processing, which checks whether each character area defined by the character area defining unit 47 actually contains characters. The noise determination processing comprises first noise determination processing, which judges whether the row rectangles contain characters based on the number of row rectangles or on information representing the properties of each row rectangle, and second noise determination processing, which judges whether a row rectangle contains characters based on information about the basic rectangles associated with that row rectangle.

Specifically, the noise determination unit 48 operates as shown in FIG. 5. First, the noise determination unit 48 selects, from the character regions stored in the storage unit 12, one that has not yet been selected, and takes it as the target character region (S11). It then counts the line rectangles included in the target character region and judges whether there are two or more, that is, whether the character region consists of multiple lines (S12). If there are two or more line rectangles (Yes), it computes the width and height of each line rectangle in the target character region, together with statistics for testing their variation, such as the mean and standard deviation (S13). Based on these statistics, it judges whether the variation in the width or height of the line rectangles is larger than a predetermined threshold (S14). This comparison can be, for example, a test of whether the standard deviation exceeds a predetermined threshold. If process S14 finds the variation to be large (Yes), the unit judges that the target character region contains no characters, deletes the target character region from the character region database in the storage unit 12 (S15), and proceeds to process S18. Processes S12 to S15 correspond to the first noise determination process. That is, here the width and height of each line rectangle and their statistics are used as the information representing the properties of the line rectangles.
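The variation test of processes S12 to S14 can be sketched as follows. This is only an illustration under stated assumptions: the function name, the `(x, y, width, height)` tuple layout, the use of the population standard deviation, and the rule that either the width or the height exceeding its threshold counts as "large variation" are all hypothetical choices, not details given in the description.

```python
from statistics import pstdev

def is_noise_by_line_variation(line_rects, width_std_max, height_std_max):
    """Sketch of S12-S14. line_rects is a list of (x, y, width, height).

    Returns None when the region has fewer than two lines (S12 "No"),
    True when the variation is judged large (region is noise),
    and False otherwise.
    """
    if len(line_rects) < 2:
        return None  # single line: handled by the in-line check instead
    widths = [w for _, _, w, _ in line_rects]
    heights = [h for _, _, _, h in line_rects]
    # Assumption: population standard deviation, width OR height test.
    return pstdev(widths) > width_std_max or pstdev(heights) > height_std_max
```

Three lines of similar size pass this test, while a region that mixes a wide, short rectangle with a narrow, tall one is rejected as noise.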

On the other hand, if process S14 finds the variation to be small (No), the in-line determination process (the second noise determination process) is applied to each line rectangle in the target character region (S16). The details of process S16 are described later. The unit then takes the ratio of the number of lines judged in process S16 to be noise (to contain no characters) to the number of line rectangles in the target character region, and judges whether this ratio is equal to or greater than a predetermined ratio (S17). If it is, the unit judges that the target character region contains no characters and proceeds to process S15.

If the ratio in process S17 is less than the predetermined ratio, the unit checks whether the character region database in the storage unit 12 still holds a character region that has not yet been selected as the target character region (S18). If an unselected character region remains, the unit returns to process S11 and continues. If no unselected character region remains in process S18 (that is, all character regions have been processed), the noise determination processing ends.

If process S12 finds only one line rectangle (No), the unit proceeds to process S16 and continues. In this case it judges whether that single line rectangle contains characters. If it does (the ratio in process S17 is then "0"), the target character region is judged to contain characters; if it does not (the ratio in process S17 is then "1"), the target character region is judged to contain no characters.
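The ratio test of process S17, including the single-line case, can be illustrated with a short sketch. The function name and the boolean-list input are assumptions made for the example; the description only states that the share of lines judged to be noise is compared with a predetermined ratio.

```python
def region_is_noise(line_is_noise_flags, noise_ratio_min):
    """Sketch of S17: line_is_noise_flags holds one boolean per line
    rectangle (True = judged to be noise in S16). The list must not be empty.
    """
    noise_lines = sum(1 for flag in line_is_noise_flags if flag)
    ratio = noise_lines / len(line_is_noise_flags)
    return ratio >= noise_ratio_min
```

For a single-line region the ratio is either 0 or 1, so any threshold in the range above 0 and up to 1 reproduces the behavior described for the one-line case.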

The specific content of process S16 (the second noise determination process) is as follows. As shown in FIG. 6, the noise determination unit 48 selects one of the line rectangles to be processed as the target line rectangle (S21), refers to the line rectangle database stored in the storage unit 12, and counts the basic rectangles included in the target line rectangle (S22). Processing then branches according to whether the count is "1", "2", or "3" or more (S23). If the count is "1", the unit obtains the list of identifiers of the basic rectangles included in the target line rectangle, reads the coordinate information of the listed basic rectangle from the basic rectangle database in the storage unit 12, and computes its width, its height, and their product (that is, the area). It then judges whether this area is equal to or less than a predetermined area threshold (S25). If so, it judges that the target line rectangle contains no characters and stores that result in the storage unit 12 (S26). Next it checks whether any line rectangle has not yet been selected as the target line rectangle (S27). If an unselected line rectangle remains, it returns to process S21 to select one of them as the target line rectangle and continues. If no unselected line rectangle remains in process S27, the process ends and control returns to the processing of FIG. 5.

If the count in process S22 is "2", the unit computes the area of each basic rectangle and the distance between the two basic rectangles. The distance can be computed, for example, as the distance between the centers of the basic rectangles. The unit then judges whether the distance is larger than a predetermined distance threshold or the ratio of the areas of the two basic rectangles is larger than a predetermined area ratio threshold (S31). If either condition holds, it proceeds to process S26 (X) and continues.

If the count in process S22 is "3" or more, the unit judges whether the count (the number of basic rectangles) exceeds a predetermined maximum number (S32). If it does, the unit judges that the target line rectangle contains no characters and proceeds to process S26 (X). This reflects the fact that a single line does not normally contain more than, for example, 100 characters. The maximum number may be a fixed value or may be adjusted based on the width of the target line rectangle. If the number of basic rectangles does not exceed the maximum number in process S32, the unit computes the area of each basic rectangle and judges whether the largest computed area exceeds a predetermined maximum area value (S34). If it does, the unit judges that the target line rectangle contains no characters and proceeds to process S26 (X). This maximum area value may likewise be a fixed value, or may be adjusted based on at least one of the width and height of the target line rectangle (for example, the smaller of the two).

If process S34 finds that the maximum area value is not exceeded, the unit further takes combinations of two basic rectangles (at least one arbitrarily chosen combination) and judges, for each combination, whether the area ratio of its two basic rectangles is larger than the predetermined area ratio threshold (S35). If the area ratio is larger than the threshold, it proceeds to process S26 and continues (X).

If in process S35 the area ratio of the two basic rectangles is not larger than the predetermined area ratio threshold, the unit judges that the target line rectangle contains characters, stores that result in the storage unit 12, and proceeds to process S27.

Note that when the area exceeds the area threshold in process S25, and when in process S31 the distance is equal to or less than the distance threshold and the area ratio of the two basic rectangles is equal to or less than the area ratio threshold, processing continues from process S32 (or process S34) onward.

In process S35 each combination is examined, but to reduce the processing load the unit may instead compute, for example, the mean (average area), minimum (minimum area), and maximum (maximum area) of the areas of the basic rectangles, and compare the ratio of the average area to the minimum area, the ratio of the average area to the maximum area, or the ratio of the minimum area to the maximum area against the area ratio threshold.
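The load reduction mentioned for process S35 can be illustrated as follows. The sketch assumes that the area ratio of a pair is defined as the larger area divided by the smaller one, which the description does not state. Under that assumption the ratio of the maximum area to the minimum area is the largest ratio among all pairs, so a single comparison gives the same answer as checking every pair. Both function names are hypothetical.

```python
from itertools import combinations

def any_pair_exceeds(areas, ratio_max):
    """Pairwise check: O(n^2) comparisons over all pairs of areas."""
    return any(max(a, b) / min(a, b) > ratio_max
               for a, b in combinations(areas, 2))

def max_min_exceeds(areas, ratio_max):
    """Shortcut: one comparison using only the extreme areas."""
    return max(areas) / min(areas) > ratio_max
```

Both functions expect at least two positive areas. The shortcut needs only one pass to find the minimum and maximum, rather than a comparison for every pair.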

In this way, the noise determination unit 48 re-examines, for each line rectangle, whether it truly contains characters, based on the properties (area, area ratio, distance, and so on) of the basic rectangles it contains.
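The in-line determination for one line rectangle (processes S22 to S35) can be summarized in a short sketch. Everything named here is an assumption made for illustration: the `(x, y, width, height)` tuple layout, the parameter dictionary and its keys, the definition of the area ratio as larger over smaller, and the choice to test every pair in S35 rather than a subset.

```python
from itertools import combinations
from math import hypot

def area(rect):
    _, _, w, h = rect
    return w * h

def center(rect):
    x, y, w, h = rect
    return (x + w / 2, y + h / 2)

def area_ratio(a, b):
    # Assumption: ratio of the larger area to the smaller one.
    return max(area(a), area(b)) / min(area(a), area(b))

def line_is_noise(rects, p):
    """rects: basic rectangles of one line (at least one, positive size).
    p: thresholds with hypothetical keys area_min, dist_max, ratio_max,
    count_max, area_max. Returns True if the line is judged to be noise.
    """
    n = len(rects)
    if n == 1 and area(rects[0]) <= p["area_min"]:          # S25
        return True
    if n == 2:                                              # S31
        (x0, y0), (x1, y1) = center(rects[0]), center(rects[1])
        if hypot(x1 - x0, y1 - y0) > p["dist_max"]:
            return True
        if area_ratio(rects[0], rects[1]) > p["ratio_max"]:
            return True
    if n > p["count_max"]:                                  # S32
        return True
    if max(area(r) for r in rects) > p["area_max"]:         # S34
        return True
    return any(area_ratio(a, b) > p["ratio_max"]            # S35
               for a, b in combinations(rects, 2))
```

Lines with one or two rectangles that pass their first test fall through to the S32 and S34 checks, matching the note that processing continues from S32 onward in those cases.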

The first noise determination process of the noise determination unit 48 is not limited to the example described here. For example, although the width and height of each line rectangle are used here as its properties, the coordinate information of the line rectangles (or statistics of it, such as the mean and standard deviation) may be used together with or instead of them. In that case, when for example the positions of the line rectangles within a character region are scattered, the character region is judged to contain no characters (to be noise), and the target character region is deleted from the character region database in the storage unit 12.

The character part specifying unit 49 reads from the storage unit 12 the character region database that has passed through the processing of the noise determination unit 48, and specifies as the character part either the character regions in that database (the coordinate information of the character regions themselves) or the black-pixel portions within those character regions (the coordinate information of each character region together with bitmap information consisting of its black pixels). It stores the information specifying the character part (character part specifying information) in the storage unit 12. At this point the control unit 11 may delete the databases relating to the basic rectangles and to the line rectangles that are stored in the storage unit 12.

In this way, the character extraction unit 22 of this embodiment defines character regions in stages, from characters to lines and from lines to regions, and judges whether a character string is included based on the state of the lines within each defined character region. When the region is not rejected on that basis, it further judges whether characters are included based on the state within each line (at the character level). The layout processing in this embodiment is not limited to this, however, and other widely known layout processing may be used.

One characteristic of this embodiment is that the character part is extracted using layout analysis in the so-called T/I separation process. This improves the accuracy of character part extraction.

The character extraction unit 22 generates a unique region identifier (hereinafter called label data) for each character part specified by the character part specifying unit 49, associates this label data with the coordinate information (vertex coordinates and the like) that defines the corresponding character region, and stores them in the storage unit 12 as the character region database.

The color number determination unit 23 generates, for each character region stored in the character region database of the storage unit 12, mask image data that specifies portions of the same color.

As shown in FIG. 7, the color number determination unit 23 of this embodiment includes a character color number determination unit 51 and a background color number determination unit 52.

The character color number determination unit 51 refers to the coordinate information of the character regions stored in the storage unit 12 and, selecting them one by one as the target character region, counts the number of character colors based on the pixel values of the original image data that correspond to the significant pixels (the groups of pixel blocks representing the character image) contained in the basic rectangles within the target character region.

Specifically, the character color number determination unit 51 generates a histogram (frequency of occurrence) of the values of the original image data that correspond to the significant pixels in the target character region. It then identifies the pixel values that occur in this histogram with a frequency exceeding a predetermined threshold, and counts the number of identified pixel values (the number of pixel values in the character part).

Similarly, the background color number determination unit 52 counts the number of background colors based on the pixel values of the original image data that correspond to the pixels in the target character region other than the significant pixels. Specifically, the background color number determination unit 52 generates a histogram (frequency of occurrence) of the values of the pixels of the original image data that correspond to the pixels other than the significant pixels in the target character region. It then identifies the pixel values that occur in this histogram with a frequency exceeding a predetermined threshold, and counts the number of identified pixel values (the number of pixel values in the background part).
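The histogram-based color counting performed by the character and background color number determination units can be sketched as below. The function name, the nested-list image representation, and the `significant` flag that switches between character pixels and background pixels are assumptions for illustration only.

```python
from collections import Counter

def count_colors(pixels, mask, min_count, significant=True):
    """pixels: 2D list of hashable color values from the original image.
    mask: 2D list of 0/1 of the same size, 1 marking significant pixels.
    Counts the distinct colors whose frequency exceeds min_count among
    the significant pixels (or among the others if significant=False).
    """
    histogram = Counter(
        pixels[y][x]
        for y in range(len(pixels))
        for x in range(len(pixels[0]))
        if bool(mask[y][x]) == significant
    )
    return sum(1 for freq in histogram.values() if freq > min_count)
```

Calling the function with `significant=True` gives the number of character colors, and with `significant=False` the number of background colors. The frequency threshold discards rare values such as those produced by scanner noise or anti-aliasing.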

As shown in FIG. 8, the representative color determination unit 24 includes a character representative color determination unit 53 and a background representative color determination unit 54. The character representative color determination unit 53 checks whether the number of character colors determined by the character color number determination unit 51 is equal to or less than a predetermined integer N (for example, N = 1). If it is, the unit determines the character representative color based on the pixel values of the original image data that correspond to the significant pixels in the target character region.

The background representative color determination unit 54 checks whether the number of background colors determined by the background color number determination unit 52 is equal to or less than a predetermined integer M (for example, M = 1). If it is, the unit determines the background representative color based on the pixel values of the original image data that correspond to the pixels in the target character region other than the significant pixels.

The representative pixel value in each of these cases can be determined in the same way as in the color number determination: a histogram (frequency of occurrence) of the relevant pixel values of the original image data is generated, and the mode (the most frequent value) of this histogram is taken as the representative pixel value.
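Taking the mode of the histogram as the representative color can be sketched in a few lines. The function name and the flat list input are hypothetical; how ties between equally frequent values are resolved is not specified in the description, and the sketch simply returns whichever value `Counter.most_common` lists first.

```python
from collections import Counter

def representative_color(values):
    """values: non-empty list of hashable pixel values.
    Returns the most frequent value (the mode of the histogram).
    """
    return Counter(values).most_common(1)[0][0]
```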

The mask image generation unit 25 checks whether the number of character colors determined by the character color number determination unit 51 is equal to or less than the predetermined integer N (for example, N = 1). If it is, the unit extracts the significant pixel portion in the target character region as a mask image and outputs it. In this case it selectively outputs, as the representative color corresponding to that mask image, the character representative color information output by the character representative color determination unit 53.

If the number of character colors determined by the character color number determination unit 51 exceeds the predetermined integer N, the unit checks whether the number of background colors determined by the background color number determination unit 52 is equal to or less than the predetermined integer M (for example, M = 1). If it is, the unit inverts the mask image to generate a mask image representing the portion of the target character region other than the character image (that is, the background portion), and outputs this mask image of the background portion. In this case it selectively outputs, as the representative color corresponding to that mask image, the background representative color information output by the background representative color determination unit 54.
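The choice between a character mask and a background mask reduces to two comparisons, sketched below. The function name and the string return values are illustrative assumptions; the defaults N = 1 and M = 1 follow the examples given in the text.

```python
def choose_mask(text_colors, background_colors, n_max=1, m_max=1):
    """Returns "text" when the character part can be masked, "background"
    when only the background can be masked, and None when neither applies
    (the region then stays entirely in the background data).
    """
    if text_colors <= n_max:
        return "text"
    if background_colors <= m_max:
        return "background"
    return None
```

The character test comes first, so a region with single-colored text on a single-colored background yields a character mask.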

Here, when generating the mask image, the mask image generation unit 25 performs a process of dilating each pixel block (significant pixels) included in the target character region. That is, pixels adjacent to a significant pixel are also treated as significant pixels. The mask image after this dilation process may then be inverted to generate the mask image representing the background portion.
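Dilation followed by inversion can be sketched on a binary mask held as nested lists. The description says only that pixels adjacent to a significant pixel become significant; treating "adjacent" as the 8-neighborhood and dilating by a single pixel are assumptions of this sketch, as are the function names.

```python
def dilate(mask):
    """Marks every pixel in the 8-neighborhood of a significant pixel."""
    height, width = len(mask), len(mask[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x]:
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < height and 0 <= nx < width:
                            out[ny][nx] = 1
    return out

def invert(mask):
    """Swaps significant and non-significant pixels."""
    return [[1 - v for v in row] for row in mask]
```

Dilating before inverting keeps the blurred fringe around each character out of the background mask, so those edge pixels are not painted with the background representative color.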

Each generated mask image is output in association with the coordinate information that defines the corresponding character region (the region containing the pixels that the mask image covers).

The post-processing unit 26 accepts the mask images generated by the mask image generation unit 25. It then removes from the original image data the pixels specified by the mask images, that is, sets them to a predetermined value. The post-processing unit 26 scans the pixels of the resulting image data in raster scan order, selecting each in turn as the target pixel. If the target pixel is not a removed pixel (a pixel set to the predetermined value), its pixel value is left unchanged and is also stored in the work memory of the storage unit 12 as the previous pixel value, overwriting any pixel value already stored there.

If the target pixel is a removed pixel, its pixel value is set to the stored previous pixel value. As a result, the pixel values in the removed portions are identical to the immediately preceding pixel value in raster scan order, which improves compression efficiency in many compression schemes. The image data after this processing is stored in the storage unit 12 as background data.
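The hole filling performed by the post-processing unit can be sketched as a single raster-order pass. The function name and the nested-list representation are assumptions. The description does not say what happens when the very first pixel is a removed one, so the `initial` parameter used for that case is also an assumption.

```python
def fill_holes(pixels, removed, initial=None):
    """pixels: 2D list of pixel values; removed: 2D list of 0/1 flags
    marking the pixels removed by the mask images.
    Each removed pixel takes the last non-removed value seen so far in
    raster scan order (carried across row boundaries).
    """
    previous = initial
    out = []
    for row, flags in zip(pixels, removed):
        new_row = []
        for value, is_removed in zip(row, flags):
            if is_removed:
                new_row.append(previous)
            else:
                previous = value  # remember as the "previous pixel value"
                new_row.append(value)
        out.append(new_row)
    return out
```

The filled runs are constant along each scan line, which is what makes the result friendlier to run-based or block-based compressors than leaving the original character pixels in place.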

The format unit 27 generates data for reproducing the original image data, based on the background data, the mask images, and the representative color information output in association with them. This data can be described, for example, in PDF (Portable Document Format).

Specifically, the format unit 27 first writes an instruction to draw the background data. Then, for each generated mask image, it refers to the representative color information and the coordinate information defining the target character region that were output with it, and writes an instruction to draw an image in which the interior of the rectangle of that character region is filled with the representative color. It then writes an instruction to extract, from the rectangle drawn by that instruction, the portion corresponding to the mask image (the pixels corresponding to the significant pixels in the mask image), and an instruction to composite the extracted portion transparently onto the background data at the position given by the coordinate information. In this transparent compositing, pixels outside the significant pixels of the mask image take the pixel values of the background data, and pixels corresponding to significant pixels of the mask image take the pixel value of the extracted representative color.
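The pixel-level effect of the drawing instructions described above can be sketched as follows. This is not PDF output; it only shows what a renderer would produce when the instructions are executed. The function name, the nested-list images, and the `(x, y)` origin argument are assumptions made for the example.

```python
def composite(background, mask, color, origin):
    """background: 2D list of pixel values (the background data).
    mask: 2D list of 0/1 for one character region.
    color: representative color for that mask.
    origin: (x, y) of the region's top-left corner in the background.
    Returns a new image with the mask's significant pixels painted in color.
    """
    ox, oy = origin
    out = [row[:] for row in background]  # leave the input untouched
    for y, mask_row in enumerate(mask):
        for x, significant in enumerate(mask_row):
            if significant:
                out[oy + y][ox + x] = color
    return out
```

Applying this once per mask image, in any order for non-overlapping regions, rebuilds an approximation of the original page from the background data.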

With this configuration, when the image to be processed contains characters rendered in a single color, the image processing apparatus of this embodiment extracts the character part as a mask image (FIG. 9(a)). It then generates data containing the extracted mask image, information representing the color of the character part, and the background data obtained by removing the mask image portion and filling the resulting holes.

On the other hand, when the character part in the image to be processed is rendered in multiple colors, for example because a gradation has been applied, the apparatus checks the number of colors in the background portion of the region containing the character part (the character region), and if the background has one color, generates a mask image representing the background portion (FIG. 9(b)). It then generates data containing the extracted background mask image, information representing the color of the background portion, and the background data obtained by removing the mask image portion and filling the resulting holes. In this case, therefore, the multicolored character part remains in the background data.

In the description of this embodiment, each mask image is generated for a character region formed by further joining line rectangles that each contain one line of text, but the processing may instead be performed for each line rectangle.

When both the character part and its background are rendered in multiple colors, that portion may be treated as background data without applying this processing. Compression efficiency can then be improved by, for example, run-length compressing each mask image and compressing the background data with JPEG or a similar method.

Thus, according to this embodiment, character images rendered in multiple colors can be handled appropriately.

FIG. 1 is a configuration block diagram showing an example of an image processing apparatus according to an embodiment of the present invention.
FIG. 2 is a functional block diagram showing the processing executed by the control unit of the image processing apparatus according to the embodiment of the present invention.
FIG. 3 is a functional block diagram showing an example of the processing of the character extraction unit 22.
FIG. 4 is an explanatory diagram showing a processing example of the character extraction unit 22.
FIG. 5 is a flowchart showing a processing example of the character extraction unit 22.
FIG. 6 is a flowchart showing a processing example of the character extraction unit 22.
FIG. 7 is a functional block diagram showing an example of the processing of the color number determination unit 23.
FIG. 8 is an explanatory diagram showing a processing example of the representative color determination unit 24.
FIG. 9 is an explanatory diagram showing examples of generated mask images.

Explanation of symbols

11 control unit, 12 storage unit, 13 image input unit, 14 image output unit, 21 preprocessing unit, 22 character extraction unit, 23 color number determination unit, 24 representative color determination unit, 25 mask image generation unit, 26 post-processing unit, 27 format unit, 41 binarization processing unit, 42 connected pixel extraction unit, 43 basic rectangle definition unit, 44 first separator detection unit, 45 line rectangle definition unit, 46 second separator detection unit, 47 character region definition unit, 48 noise determination unit, 49 character part specifying unit, 51 character color number determination unit, 52 background color number determination unit, 53 character representative color determination unit, 54 background representative color determination unit.

Claims

An image processing apparatus comprising:
means for defining, from at least part of the image data to be processed, regions each including at least one pixel block judged to be a character;
character color number determination means for determining, as the number of character colors, the number of colors of the pixel blocks judged to be characters in each region; and
background color number determination means for determining, as the number of background colors, the number of colors of the pixels forming the background portion of each region,
wherein the information on the number of character colors and the number of background colors obtained by the determination is supplied to predetermined image processing,
wherein, as the predetermined image processing, when the determined number of character colors is equal to or less than a predetermined N colors, the apparatus outputs a mask image representing the pixel blocks and representative color information determined based on the colors included in the pixel blocks, and
when the determined number of character colors exceeds the predetermined N colors and the number of background colors is equal to or less than a predetermined M colors, the apparatus inverts the mask image representing the pixel blocks to generate a mask image representing the background portion, and outputs the mask image representing the background portion and representative color information determined based on the colors included in the background.

The image processing apparatus according to claim 1,
wherein, when the determined number of character colors exceeds the predetermined N colors and the number of background colors is equal to or less than the predetermined M colors, the apparatus dilates the significant pixels of the mask image representing the pixel blocks, inverts the mask image after the dilation to generate the mask image representing the background portion, and outputs the mask image representing the background portion and representative color information determined based on the colors included in the background.

A step of defining, from at least a part of the image data to be processed, a region including at least one pixel block determined to be a character;
A step of determining, as the number of character colors, the number of colors of the pixel block that is included in each region and was determined to be the character;
A step of determining, as the number of background colors, the number of colors of the pixels forming the background portion of each region;
An image processing method including the foregoing steps, in which information on the number of character colors and the number of background colors obtained by the determination is supplied to predetermined image processing,
wherein, as the predetermined processing, when the determined number of character colors is equal to or less than a predetermined N colors, a mask image representing the pixel block and representative color information determined based on a color included in the pixel block are output, and
when the determined number of character colors exceeds the predetermined N colors and the number of background colors is equal to or less than a predetermined M colors, the mask image representing the pixel block is inverted to generate a mask image representing the background portion, and the mask image representing the background portion and representative color information determined based on a color included in the background are output.
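The two color-counting steps of the method can be sketched in Python as below. The claim does not specify how the number of colors is determined, so this is only one possible approach under stated assumptions: each channel is coarsely quantized so that nearly identical colors count as one, and the function names and the step parameter are hypothetical.

```python
def count_colors(colors, step=32):
    """Count distinct colors after coarse per-channel quantization.

    Assumption: colors are tuples of integers (e.g. 8-bit RGB); `step` is the
    quantization bin width. With step=1 this is an exact distinct-color count.
    """
    return len({tuple(c // step for c in color) for color in colors})

def region_color_counts(pixels, mask, step=32):
    """Return (number of character colors, number of background colors) for a region.

    pixels: 2D list of color tuples; mask: 2D list of 0/1 (1 = character pixel).
    """
    char_colors = [p for prow, mrow in zip(pixels, mask)
                   for p, m in zip(prow, mrow) if m]
    bg_colors = [p for prow, mrow in zip(pixels, mask)
                 for p, m in zip(prow, mrow) if not m]
    return count_colors(char_colors, step), count_colors(bg_colors, step)
```

Fixed-width bins are a simplification: two colors that straddle a bin boundary are counted separately even if they are visually close, so a clustering-based count may be preferable in practice.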

A program that causes a computer to function as:
Means for demarcating, from at least a part of image data to be processed, a region including at least one pixel block determined to be a character;
Character color number determination means for determining, as the number of character colors, the number of colors of the pixel block that is included in each region and was determined to be the character; and
Background color number determination means for determining, as the number of background colors, the number of colors of the pixels forming the background portion of each region,
the program supplying information on the number of character colors and the number of background colors obtained by the determination to predetermined image processing,
wherein, as the predetermined processing, when the determined number of character colors is equal to or less than a predetermined N colors, a mask image representing the pixel block and representative color information determined based on a color included in the pixel block are output, and
when the determined number of character colors exceeds the predetermined N colors and the number of background colors is equal to or less than a predetermined M colors, the mask image representing the pixel block is inverted to generate a mask image representing the background portion, and the mask image representing the background portion and representative color information determined based on a color included in the background are output.