JP2010186246A

JP2010186246A - Image processing apparatus, method, and program

Info

Publication number: JP2010186246A
Application number: JP2009028750A
Authority: JP
Inventors: Tetsuo Ishita; 哲夫井下
Original assignee: NEC Corp
Current assignee: NEC Corp
Priority date: 2009-02-10
Filing date: 2009-02-10
Publication date: 2010-08-26

Abstract

PROBLEM TO BE SOLVED: To provide an image processing apparatus capable of accurately extracting the character region of inverted characters included in an input image without using line information.

SOLUTION: An edge detection unit 11 detects edges from an input image based on changes in a feature value between adjacent pixels. A region extraction/separation unit 12 classifies the pixels near the input-image pixels corresponding to the detected edges into pixels with a high feature value and pixels with a low feature value. The region extraction/separation unit 12 then extracts, as character regions, low-feature-value pixels surrounded by high-feature-value pixels and high-feature-value pixels surrounded by low-feature-value pixels.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to an image processing apparatus, method, and program, and more particularly to an image processing apparatus, method, and program for extracting character regions from an image.

An optical character reader (OCR) optically reads characters contained in an image, for example with a scanner, and outputs character data. To obtain character data by OCR, the image must be classified into background regions and character regions, and the regions that make up the characters must be extracted. For example, in a black-and-white image in which black characters are written on a white background, the character regions are extracted by treating the white regions as background and cutting out the black regions as characters.

Images such as advertisements and posters often contain text with varied decoration. In advertisements, for example, particular characters are frequently rendered as inverted characters so that they stand out from the surrounding text. An inverted character is, for example, a character written in white on a black background when the other characters are written in black on a white background. In other words, it is a character whose colors are reversed relative to its surroundings: where the surrounding characters have a white background and black strokes, an inverted character has a black background and white strokes.

When an image contains inverted characters, the background region will be mistakenly extracted as the character region unless the inverted regions are detected accurately. Patent Documents 1 to 4 describe techniques for detecting inverted characters. In Patent Document 1, run lengths are compared with a preset length, and an image is judged to be inverted according to whether the change in run length satisfies a predetermined condition. A run length is the number of consecutive pixels of the same density, measured by scanning the image in one direction. Patent Document 1 generates histograms of the number of character edges and the number of black runs for each extracted line region, and then decides whether the region is inverted by checking whether the histogram values exceed a predetermined threshold.
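
The run-length definition above can be illustrated with a minimal sketch. The function name `run_lengths` and the encoding (1 for black, 0 for white) are assumptions made for this example and are not taken from Patent Document 1.

```python
def run_lengths(scan_line):
    """Return (pixel_value, length) pairs for consecutive equal pixels
    along one scan direction."""
    runs = []
    for pixel in scan_line:
        if runs and runs[-1][0] == pixel:
            # Same density as the previous pixel: extend the current run.
            runs[-1] = (pixel, runs[-1][1] + 1)
        else:
            # Density changed: start a new run.
            runs.append((pixel, 1))
    return runs


# Assumed encoding: 1 = black pixel, 0 = white pixel.
scan_line = [0, 0, 1, 1, 1, 0, 1]
black_runs = [length for value, length in run_lengths(scan_line) if value == 1]
```

Here `black_runs` holds the lengths of the black runs on the line, which is the kind of quantity a run-based inversion test would compare against a preset length.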

Patent Document 2 determines image inversion from the ratio of black pixels to white pixels in an input binary image. The black and white pixels of the binary image are counted, and the image is judged to be inverted when the proportion of black pixels exceeds, for example, 70%. Patent Document 3 likewise counts the black and white pixels and treats the color with the larger pixel count as the background.
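
As a rough sketch of the ratio-based test described for Patent Document 2, the following assumes a binary image stored as nested lists with 1 for black and 0 for white. The function name and data layout are hypothetical; only the 70% example threshold comes from the text.

```python
def is_inverted_by_ratio(binary_image, black_ratio_threshold=0.7):
    """Judge a binary image as inverted when the share of black pixels
    exceeds the threshold (assumed encoding: 1 = black, 0 = white)."""
    total = sum(len(row) for row in binary_image)
    if total == 0:
        return False
    black = sum(1 for row in binary_image for pixel in row if pixel == 1)
    return black / total > black_ratio_threshold


# White strokes on a mostly black patch: 10 of 12 pixels are black.
patch = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
inverted = is_inverted_by_ratio(patch)
```

Because the decision depends only on the pixel counts inside the chosen area, enlarging that area with unrelated background changes the ratio, which is the weakness discussed later for rectangle-based methods.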

Patent Document 4 binarizes the input image to obtain a binary image and also detects the edges of the input image to generate an edge image. Region identification is performed on both the binary image and the edge image, and the two results are merged. The input image is then partially binarized according to each character region in the merged result. In this partial binarization, as in Patent Documents 2 and 3, whether a character is inverted or non-inverted is decided from the ratio of white pixels to black pixels.

Patent Document 1: JP 2004-326568 A
Patent Document 2: JP 2004-64664 A
Patent Document 3: Japanese Patent No. 2743378
Patent Document 4: JP 2007-183742 A

Non-Patent Document 1: Otsu, "Automatic Threshold Selection Method Based on Discrimination and Least Squares Criterion," IEICE Transactions, vol. J63-D, no. 4, pp. 349-356, 1980

Patent Document 1 identifies inverted regions from the number of white-pixel and black-pixel runs in each line, on the premise that line information such as the number of lines and the position of each line is known. That is, the positions of the lines in the image must be known in advance, so Patent Document 1 can be applied only to fixed-format documents in which the line positions are fixed. Moreover, the characters in an input image are not necessarily aligned in a vertical or horizontal row. They may run diagonally or follow a curve, and this tendency is especially strong in heavily decorated images such as advertisements. Because Patent Document 1 processes each line at a predetermined position, it cannot extract character regions from such images.

Patent Document 2 decides whether a region is inverted from the ratio of white to black pixels inside the rectangle circumscribing the character region. If the rectangle is not chosen appropriately, background pixels unrelated to the characters can strongly affect the criterion and cause inverted characters to be misjudged, so inverted characters cannot be extracted accurately. Patent Documents 3 and 4 likewise extract inverted regions from the ratio of black to white pixels inside a circumscribed rectangle, so extraction of inverted characters may fail depending on the number of characters in the rectangle and their typeface.

An object of the present invention is to provide an image processing apparatus, method, and program capable of accurately extracting the character region of inverted characters included in an input image without using line information.

To achieve the above object, an image processing apparatus of the present invention comprises: an edge detection unit that detects edges from an input image based on changes in a feature value between adjacent pixels; and a region extraction/separation unit that classifies pixels near the input-image pixels corresponding to the detected edges, based on the feature value, into pixels with a high feature value and pixels with a low feature value, and extracts, as character regions, low-feature-value pixels surrounded by high-feature-value pixels and high-feature-value pixels surrounded by low-feature-value pixels.

An image processing method of the present invention comprises: a step in which a computer detects edges from an input image based on changes in a feature value between adjacent pixels; a step in which the computer classifies pixels near the input-image pixels corresponding to the detected edges, based on the feature value, into pixels with a high feature value and pixels with a low feature value; and a step in which the computer extracts, as character regions, low-feature-value pixels surrounded by high-feature-value pixels and high-feature-value pixels surrounded by low-feature-value pixels.

A program of the present invention causes a computer to execute: a process of detecting edges from an input image based on changes in a feature value between adjacent pixels; a process of classifying pixels near the input-image pixels corresponding to the detected edges, based on the feature value, into pixels with a high feature value and pixels with a low feature value; and a process of extracting, as character regions, low-feature-value pixels surrounded by high-feature-value pixels and high-feature-value pixels surrounded by low-feature-value pixels.

The image processing apparatus, method, and program of the present invention can accurately extract the character region of inverted characters included in an input image without using line information.

FIG. 1: Block diagram showing the image processing apparatus of the present invention.
FIG. 2: Block diagram showing the image processing apparatus of the first embodiment of the present invention.
FIG. 3: Flowchart showing the operation procedure.
FIG. 4: Diagram showing an input image.
FIG. 5: (a) and (b) show edge images A and B, composed of the first edges and the second edges, respectively.
FIG. 6: (a) shows an enlarged part of the input image; (b) and (c) show enlarged parts of edge image A and edge image B.
FIG. 7: (a) and (b) show the first image and the second image, respectively.
FIG. 8: (a) and (b) show enlarged parts of the first image and the second image, respectively.
FIG. 9: (a) and (b) show the character regions extracted from the first image and from the second image, respectively.
FIG. 10: (a) and (b) show the character regions extracted from FIGS. 8(a) and 8(b), respectively.
FIG. 11: (a) and (b) show the label regions in the first image and the second image, respectively; (c) shows the character regions after integration.
FIG. 12: Diagram showing the image processing apparatus of the second embodiment of the present invention.
FIG. 13: Flowchart showing the operation procedure in the second embodiment.
FIG. 14: (a) shows part of a reduced image; (b) shows the corresponding region of the original image.

First, an outline of the present invention is described. FIG. 1 shows the image processing apparatus of the present invention. The image processing apparatus 10 includes an edge detection unit 11 and a region extraction/separation unit 12. The edge detection unit 11 detects edges from an input image based on changes in a feature value between adjacent pixels. The region extraction/separation unit 12 classifies the pixels near the input-image pixels corresponding to the edges detected by the edge detection unit 11 into pixels with a high feature value and pixels with a low feature value. Near the edges, the region extraction/separation unit 12 extracts, as character regions, low-feature-value pixels surrounded by high-feature-value pixels and high-feature-value pixels surrounded by low-feature-value pixels.

The edges detected by the edge detection unit 11 indicate the boundary between character regions and background regions. Extracting, near an edge, low-feature-value pixels surrounded by high-feature-value pixels yields the character regions of characters whose background consists of high-feature-value pixels and whose strokes consist of low-feature-value pixels. Likewise, extracting, near an edge, high-feature-value pixels surrounded by low-feature-value pixels yields the character regions of characters whose background consists of low-feature-value pixels and whose strokes consist of high-feature-value pixels. In other words, the character regions of both inverted and non-inverted characters can be extracted from the input image.

In the present invention, edges are detected from the input image and character regions are extracted from the vicinity of the edges. Line information therefore does not need to be known in advance in order to extract the character regions of inverted characters, and character regions can be extracted from a wide variety of images, not only fixed-format document images. In addition, because character regions are extracted according to the feature values of pixels near the edges, there is no need to identify character regions beforehand and compute, in a binarized version of the input image, the ratio of white to black pixels inside the rectangle circumscribing each character region. When inversion is judged from the white-to-black ratio inside a circumscribed rectangle, background pixels can affect the judgment, and the inverted region may not be judged correctly depending on the number of characters in the rectangle and their typeface. The present invention avoids these problems and can extract the character regions of inverted characters accurately.

Embodiments of the present invention are described in detail below with reference to the drawings. FIG. 2 shows the image processing apparatus of the first embodiment of the present invention. The image processing apparatus 100 includes an image input device 110, a data processing unit 120, a data storage unit 130, and an image output device 140. The image input device 110 is a device for inputting images, typically an imaging system such as a still camera, a video camera, or a scanner. It supplies image data to the data processing unit 120. Each pixel of the input image has a pixel value. For a monochrome grayscale image, the pixel value is a luminance value ranging from black (lowest luminance) to white (highest luminance). For a color image, the pixel value is a vector value that depends on the color space.

The data processing unit 120 is implemented by running a predetermined program on a computer. The data processing unit 120 includes an edge detection unit 121, a region extraction/separation unit 122, and a character region integration unit 125. The edge detection unit 121 detects edges from the input image based on changes in a feature value between adjacent pixels. Specifically, it computes the change in the feature value between adjacent pixels and detects, as edges, the pixels at which this change (the edge strength) is equal to or greater than a threshold.

The edge detection unit 121 detects first edges and second edges according to the direction in which the feature value changes. It detects an edge in the direction of increasing feature value as a first edge and an edge in the direction of decreasing feature value as a second edge. In other words, of two adjacent pixels between which the feature value changes sharply, the unit detects the pixel on the low-feature-value side as a first edge and the pixel on the high-feature-value side as a second edge.
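
The distinction between first and second edges can be sketched along a single scan line. This is a simplified illustration, not the apparatus itself: it assumes the feature value is luminance, uses a plain difference between horizontal neighbours as the edge strength, and the function name `detect_edges_1d` is hypothetical.

```python
def detect_edges_1d(values, threshold):
    """Return (first_edges, second_edges) as sets of pixel indices.

    For every adjacent pair whose difference is at least `threshold`,
    the lower-valued pixel is a first edge and the higher-valued pixel
    is a second edge.
    """
    first_edges, second_edges = set(), set()
    for i in range(len(values) - 1):
        left, right = values[i], values[i + 1]
        if abs(left - right) >= threshold:
            low, high = (i, i + 1) if left < right else (i + 1, i)
            first_edges.add(low)
            second_edges.add(high)
    return first_edges, second_edges


# A dark stroke (30) on a bright background (about 200).
line = [200, 200, 30, 30, 30, 210, 210]
first_edges, second_edges = detect_edges_1d(line, threshold=100)
```

For this scan line the first edges fall on the outermost stroke pixels (indices 2 and 4) and the second edges on the adjacent background pixels (indices 1 and 5), so the two edge sets lie on opposite sides of each boundary.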

The edge detection unit 121 performs edge detection on, for example, the luminance component of the input image. From the input image, it detects as a first edge each pixel at which the change in luminance between adjacent pixels is equal to or greater than a threshold and the luminance changes from bright to dark. It detects as a second edge each pixel at which the change in luminance between adjacent pixels is equal to or greater than the threshold and the luminance changes from dark to bright. When the input image is a color image whose pixel values are expressed as RGB luminance values, the edge detection unit 121 may apply edge detection to each of the R, G, and B planes and detect edges by thresholding the largest of the three edge strengths at each pixel position.
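
The RGB variant, in which the largest edge strength across the three planes is thresholded, might look like the sketch below. It again works on one scan line and uses the difference to the right-hand neighbour as the edge strength. Both simplifications, and the function names, are assumptions for illustration.

```python
def edge_strength(plane):
    """Absolute difference between each pixel and its right-hand neighbour."""
    return [abs(plane[i + 1] - plane[i]) for i in range(len(plane) - 1)]


def color_edges(red, green, blue, threshold):
    """Mark an edge wherever the maximum per-plane strength reaches the threshold."""
    per_position = zip(edge_strength(red), edge_strength(green), edge_strength(blue))
    return [max(strengths) >= threshold for strengths in per_position]


# Only the red plane changes sharply, between the second and third pixels.
red = [10, 10, 200]
green = [10, 10, 10]
blue = [50, 60, 60]
edges = color_edges(red, green, blue, threshold=100)
```

Taking the maximum means a boundary that is visible in only one color plane is still detected, which a luminance-only test could miss when two colors have similar brightness.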

When the input image is a color image, the edge detection unit 121 may use hue or saturation as the feature value and detect edges based on changes in these quantities. For example, the edge detection unit 121 detects, as a first edge, a pixel at which the change in hue between adjacent pixels is equal to or greater than a threshold and the saturation changes from a large value to a small value. The edge detection unit 121 also detects, as a second edge, a pixel at which the change in hue between adjacent pixels is equal to or greater than a threshold and the saturation changes from a large value to a small value. For simplicity, the following description mainly covers the case in which the input image is a monochrome grayscale image.

The region extraction/separation unit 122 receives edge position information from the edge detection unit 121. Based on the feature value, it classifies the input-image pixels near the first edges into pixels whose feature value is higher than their surroundings and pixels whose feature value is lower than their surroundings. For example, the region extraction/separation unit 122 thresholds the feature value, classifying pixels above the threshold as high-feature-value pixels and pixels below the threshold as low-feature-value pixels. Among the pixels near the input-image pixels corresponding to the first edge positions, the unit extracts as character regions those pixels that have a lower feature value than their surroundings and are surrounded by pixels with a higher feature value.

The region extraction/separation unit 122 also classifies the pixels near the input-image pixels corresponding to the second edge positions detected by the edge detection unit 121 into pixels whose feature value is lower than their surroundings and pixels whose feature value is higher than their surroundings. As above, this classification can be done by thresholding the feature value. Among the pixels near the input-image pixels corresponding to the second edge positions, the unit extracts as character regions those pixels that have a higher feature value than their surroundings and are surrounded by pixels with a lower feature value. Note that "surrounded" here does not require complete enclosure; it is sufficient that the surroundings are covered.

In this embodiment, a character in the input image whose background consists of low-feature-value pixels and whose strokes consist of high-feature-value pixels is called an inverted character. Conversely, a character whose background consists of high-feature-value pixels and whose strokes consist of low-feature-value pixels is called a non-inverted character. Among the pixels near the pixels corresponding to the first edge positions, a character region made up of pixels that have a lower feature value than their surroundings and are surrounded by pixels with a higher feature value corresponds to the character region of a non-inverted character. Among the pixels near the pixels corresponding to the second edge positions, a character region made up of pixels that have a higher feature value than their surroundings and are surrounded by pixels with a lower feature value corresponds to the character region of an inverted character.

The region extraction/separation unit 122 includes a local binarization unit 123 and a character region separation unit 124. The local binarization unit 123 takes as its processing target the pixels near the input-image pixels corresponding to the first edges detected by the edge detection unit 121, and binarizes the input image in the vicinity of the first edges. It likewise takes as its processing target the pixels near the input-image pixels corresponding to the second edges detected by the edge detection unit 121, and binarizes the input image in the vicinity of the second edges.

For the pixels near the first edges, the local binarization unit 123 sets low-feature-value pixels to black and high-feature-value pixels to white. The resulting image is called the first image. For the pixels near the second edges, the local binarization unit 123 sets pixels whose feature value is higher than their surroundings to black and pixels with a lower feature value to white. The resulting image is called the second image. The roles of white and black pixels may be swapped.
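
A minimal sketch of how the first and second images could be produced is given below. Several details are assumptions made for this example rather than statements from the description: the vicinity of an edge is taken to be a square window of a given radius, a single fixed threshold is used, pixels outside every window are left white, and the function name `local_binarize` is hypothetical.

```python
def local_binarize(image, edge_pixels, threshold, radius=1, low_is_black=True):
    """Binarize only the pixels within `radius` of each edge pixel.

    Output encoding: 1 = black, 0 = white. With low_is_black=True,
    low-feature-value pixels become black (first image); with
    low_is_black=False, high-feature-value pixels become black (second image).
    """
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for edge_y, edge_x in edge_pixels:
        for y in range(max(0, edge_y - radius), min(height, edge_y + radius + 1)):
            for x in range(max(0, edge_x - radius), min(width, edge_x + radius + 1)):
                is_low = image[y][x] < threshold
                out[y][x] = 1 if is_low == low_is_black else 0
    return out


# One dark pixel on a bright background.
image = [
    [200, 200, 200],
    [200, 30, 200],
    [200, 200, 200],
]
first_image = local_binarize(image, [(1, 1)], threshold=128, low_is_black=True)
second_image = local_binarize(image, [(0, 1)], threshold=128, low_is_black=False)
```

In `first_image` the dark pixel appears as a black pixel surrounded by white, whereas in `second_image` the bright background near the edge is black and the dark pixel is white.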

The character region separation unit 124 extracts black pixels surrounded by white pixels from the first image and outputs them as character regions. The character regions extracted from the first image consist of low-feature-value pixels surrounded by high-feature-value pixels, that is, they correspond to the character regions of non-inverted characters. The character region separation unit 124 also extracts black pixels surrounded by white pixels from the second image and outputs them as character regions. The character regions extracted from the second image consist of high-feature-value pixels surrounded by low-feature-value pixels, that is, they correspond to the character regions of inverted characters.

When, in a non-inverted character, background enters the gaps between the parts of the character, that portion appears in the second image as black pixels surrounded by white pixels. Similarly, when background enters the gaps between the parts of an inverted character, that portion appears in the first image as black pixels surrounded by white pixels. These portions are noise in the extracted character regions. The noise is therefore removed on the assumption that all parts of a single character are either inverted or non-inverted.

The character region integration unit 125 integrates the character regions extracted from the first image with the character regions extracted from the second image. During integration, when a character region extracted from the first image and a character region extracted from the second image are at overlapping positions, the character region integration unit 125 selects one of the two as the character region. More specifically, the character region integration unit 125 performs labeling on the character regions extracted from the first image and on those extracted from the second image. For labels at overlapping positions, it selects one of them based on label area.
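
The integration step can be sketched as follows, under assumptions that go beyond what is stated above: each labeled region is represented as a set of (row, column) pixels, two regions count as overlapping when their bounding boxes intersect, and the region with the larger pixel count is the one kept. The text says only that the choice is based on label area, so the "larger wins" rule and all function names here are hypothetical.

```python
def bounding_box(region):
    """Return (min_row, min_col, max_row, max_col) of a set of pixels."""
    rows = [row for row, _ in region]
    cols = [col for _, col in region]
    return min(rows), min(cols), max(rows), max(cols)


def boxes_overlap(region_a, region_b):
    a_top, a_left, a_bottom, a_right = bounding_box(region_a)
    b_top, b_left, b_bottom, b_right = bounding_box(region_b)
    return (a_top <= b_bottom and b_top <= a_bottom
            and a_left <= b_right and b_left <= a_right)


def integrate(first_regions, second_regions):
    """Merge two lists of labeled regions, dropping the smaller region
    of every overlapping pair (assumed selection rule)."""
    dropped_first, dropped_second = set(), set()
    for i, region_a in enumerate(first_regions):
        for j, region_b in enumerate(second_regions):
            if boxes_overlap(region_a, region_b):
                if len(region_a) >= len(region_b):
                    dropped_second.add(j)
                else:
                    dropped_first.add(i)
    kept = [r for i, r in enumerate(first_regions) if i not in dropped_first]
    kept += [r for j, r in enumerate(second_regions) if j not in dropped_second]
    return kept


# A ring-shaped stroke from the first image, the background trapped inside it
# (noise in the second image), and an unrelated inverted-character region.
stroke = {(0, 0), (0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1), (2, 2)}
trapped_background = {(1, 1)}
inverted_region = {(10, 10), (10, 11)}
merged = integrate([stroke], [trapped_background, inverted_region])
```

In this example the trapped background is discarded because it overlaps the larger stroke region, while the non-overlapping inverted region is kept.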

The image output device 140 outputs the character regions integrated by the character region integration unit 125 to a downstream device such as a character recognition device. Alternatively, the image output device 140 may show the character regions on a display screen. The image output device 140 may also output the character regions extracted from the first image and those extracted from the second image separately. In that case, among the character regions at overlapping positions, it may delete from the first image or the second image the character regions that the character region integration unit 125 did not select. The data storage unit 130 includes a parameter storage unit 131. The parameter storage unit 131 stores the various thresholds, the parameters that determine the binarization method used by the local binarization unit 123, and similar settings.

図３に、動作手順を示す。画像入力装置１１０は、撮像デバイスで撮影した画像を、データ処理部１２０に入力する（ステップＳ１）。入力画像は、撮影した画像には限られず、何らかの方法で生成した画像で構わない。エッジ検出手段１２１は、入力画像から、第１のエッジと第２のエッジとを検出する（ステップＳ２Ａ、Ｓ２Ｂ）。エッジ検出手段１２１は、エッジの検出では、例えばラプラシアンフィルタのようなエッジ検出フィルタを適用し、特徴量が変化する位置（符号が反転する位置）を検出する。エッジ検出手段１２１は、特徴量が変化する位置、つまりは、特徴量が高い画素と特徴量が低い画素との境界で、特徴量が低い側の画素を第１のエッジとして検出し、特徴量が高い側の画素を第２のエッジとして検出する。 FIG. 3 shows an operation procedure. The image input device 110 inputs an image captured by the imaging device to the data processing unit 120 (step S1). The input image is not limited to a captured image, and may be an image generated by some other method. The edge detection unit 121 detects the first edge and the second edge from the input image (steps S2A and S2B). For edge detection, the edge detection unit 121 applies an edge detection filter such as a Laplacian filter and detects positions where the feature amount changes (positions where the sign of the filter output is inverted). At a position where the feature amount changes, that is, at the boundary between a pixel having a high feature amount and a pixel having a low feature amount, the edge detection unit 121 detects the pixel on the low feature amount side as the first edge and the pixel on the high feature amount side as the second edge.
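The first-edge / second-edge split in steps S2A and S2B can be illustrated with a minimal sketch. It assumes that the feature amount is luminance, that the image is a list of rows of integers, and that a 4-neighbor Laplacian with replicated borders is used; the function name `detect_edges` and the `threshold` parameter are illustrative and do not appear in the specification. A positive Laplacian response marks the darker side of a boundary (first edge) and a negative response marks the brighter side (second edge).

```python
def detect_edges(img, threshold):
    """Return (first_edge, second_edge) boolean masks for a luminance image.

    Assumption: 4-neighbor Laplacian; the sign of the response tells which
    side of a luminance boundary the pixel lies on.
    """
    h, w = len(img), len(img[0])

    def px(y, x):
        # Replicate the border so that every pixel has four neighbors.
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    first = [[False] * w for _ in range(h)]
    second = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            lap = (px(y - 1, x) + px(y + 1, x) + px(y, x - 1) + px(y, x + 1)
                   - 4 * img[y][x])
            if lap > threshold:
                first[y][x] = True    # darker side of the boundary
            elif lap < -threshold:
                second[y][x] = True   # brighter side of the boundary
    return first, second
```

For a single bright pixel on a uniform darker background, this sketch marks the bright pixel as a second edge and its four direct neighbors as first edges, which matches the adjacency of the two edges described for FIG. 6.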

エッジ検出手段１２１は、各画素のエッジの強度を計算し、エッジ強度をしきい値処理することで、２値のエッジ画像を生成する。エッジ強度のしきい値は、パラメータ記憶部１３１に記憶されている固定値を用いることができる。或いは、判別分析法（非特許文献１）を用いて、小領域ごとにしきい値を変えてもよい。なお、ラプラシアンフィルタは、ノイズの影響を受けやすい。このため、エッジ検出手段１２１がエッジ検出を行う前の段階で、あらかじめ、メディアンフィルタやガウシアンフィルタなどを適用し、入力画像に対してノイズ除去処理や平滑化処理を行っておいてもよい。そのような処理を行っておく場合、ノイズに起因するエッジの誤検出を抑えることができる。 The edge detection unit 121 calculates the edge strength of each pixel and generates a binary edge image by thresholding the edge strength. A fixed value stored in the parameter storage unit 131 can be used as the edge strength threshold. Alternatively, the threshold may be varied for each small region using a discriminant analysis method (Non-Patent Document 1). Note that the Laplacian filter is susceptible to noise. Therefore, before the edge detection unit 121 performs edge detection, a median filter, a Gaussian filter, or the like may be applied in advance to perform noise removal or smoothing on the input image. Such preprocessing suppresses erroneous detection of edges caused by noise.
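As an illustration of per-region threshold selection, the following sketch assumes that the discriminant analysis method refers to the approach commonly known as Otsu's method, which picks the threshold that maximizes the between-class variance of the histogram. That identification is an assumption, as are the function name `otsu_threshold` and the restriction to integer values in the range 0 to `levels - 1`.

```python
def otsu_threshold(values, levels=256):
    """Return t such that the classes {v <= t} and {v > t} have maximal
    between-class variance. `values` holds integers in [0, levels)."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * n for i, n in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0      # number of values in the low class
    sum0 = 0    # sum of values in the low class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, no valid split here
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        score = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if score > best_score:
            best_score, best_t = score, t
    return best_t
```

Applied to the edge strengths (or luminances) of one small region, the returned value can stand in for the fixed threshold read from the parameter storage unit 131.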

局所２値化手段１２３は、エッジ検出手段１２１から第１のエッジ及び第２のエッジの位置（座標）を受け取り、入力画像のエッジの近傍の画素に対し、局所的な２値化処理を行う（ステップＳ３Ａ、Ｓ３Ｂ）。局所２値化手段１２３は、ステップＳ３Ａでは、例えば、第１のエッジを構成する画素から所定範囲内の入力画像の画素をエッジ近傍の画素とし、そのエッジ近傍の画素に対して２値化処理を行う。局所２値化手段１２３は、処理対象の領域内の各画素の特徴量と、その周辺画素の特徴量とを比較し、周辺画素よりも特徴量が高い画素を白に、特徴量が低い画素を黒にした第１の画像を生成する。 The local binarization unit 123 receives the positions (coordinates) of the first edge and the second edge from the edge detection unit 121, and performs local binarization processing on pixels of the input image in the vicinity of the edges (steps S3A and S3B). In step S3A, for example, the local binarization unit 123 treats the pixels of the input image within a predetermined range of the pixels constituting the first edge as pixels near the edge, and performs binarization processing on those pixels. The local binarization unit 123 compares the feature amount of each pixel in the processing target region with the feature amounts of its surrounding pixels, and generates a first image in which pixels having a higher feature amount than the surrounding pixels are white and pixels having a lower feature amount are black.

局所２値化手段１２３は、ステップＳ３Ｂでは、上記と同様に、第２のエッジを構成する画素から所定範囲内の画素をエッジ近傍の画素とし、そのエッジ近傍の画素に対して２値化処理を行う。局所２値化手段１２３は、処理対象の領域内の各画素について、注目画素の特徴量と周辺画素の特徴量とを比較し、周辺画素に対して特徴量が低い画素を白に、特徴量が高い画素を黒にした第２の画像を生成する。局所２値化手段１２３は、第１の画像及び第２の画像の生成に際して、処理対象の画素の特徴量を、パラメータ記憶部１３１に格納されている固定値を用いてしきい値処理し、２値化してもよい。また、局所２値化手段１２３は、小領域ごとに、判別分析法（非特許文献１）を用いてしきい値を動的に決定してもよい。 In step S3B, in the same manner as described above, the local binarization unit 123 treats the pixels within a predetermined range of the pixels constituting the second edge as pixels near the edge, and performs binarization processing on those pixels. For each pixel in the processing target region, the local binarization unit 123 compares the feature amount of the pixel of interest with the feature amounts of the surrounding pixels, and generates a second image in which pixels having a lower feature amount than the surrounding pixels are white and pixels having a higher feature amount are black. When generating the first image and the second image, the local binarization unit 123 may binarize the feature amounts of the pixels to be processed by thresholding them with a fixed value stored in the parameter storage unit 131. Alternatively, the local binarization unit 123 may determine the threshold dynamically for each small region using a discriminant analysis method (Non-Patent Document 1).
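Steps S3A and S3B can be sketched as a single routine that is run once per edge mask. The sketch assumes the fixed-threshold variant mentioned above, a square neighborhood of `radius` pixels around each edge pixel, and a three-valued output in which pixels outside the neighborhood are left unprocessed. The names `local_binarize`, `WHITE`, `BLACK`, `SKIP`, and the `invert` flag are illustrative and are not taken from the specification.

```python
WHITE, BLACK, SKIP = 1, 0, None  # SKIP marks pixels outside the edge vicinity

def local_binarize(img, edge, threshold, invert=False, radius=1):
    """Binarize only the pixels within `radius` of an edge pixel.

    invert=False: high luminance -> WHITE, low -> BLACK (first image).
    invert=True:  high luminance -> BLACK, low -> WHITE (second image).
    """
    h, w = len(img), len(img[0])
    out = [[SKIP] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Is any edge pixel inside the (2*radius+1)^2 window around (y, x)?
            near_edge = any(
                edge[yy][xx]
                for yy in range(max(0, y - radius), min(h, y + radius + 1))
                for xx in range(max(0, x - radius), min(w, x + radius + 1))
            )
            if not near_edge:
                continue
            high = img[y][x] > threshold
            out[y][x] = WHITE if high != invert else BLACK
    return out
```

Calling the routine with the first-edge mask and `invert=False` yields a first image, and calling it with the second-edge mask and `invert=True` yields a second image, so that in both outputs a character candidate appears as black pixels enclosed by white pixels.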

文字領域分離手段１２４は、第１及び第２の画像にて、白画素で囲まれた黒画素を抽出し、これを文字領域とする（ステップＳ４Ａ、Ｓ４Ｂ）。白画素で囲まれた黒画素の抽出は、以下の手順で行うことができる。文字領域分離手段１２４は、第１の画像にて、連結された白画素を検索する。文字領域分離手段１２４は、第１の画像にて、連結された白画素の中に、黒画素が存在するか否かを判断する。文字領域分離手段１２４は、白画素領域内に、黒画素領域が包含されているときは、その黒画素領域を、文字領域として抽出する。文字領域分離手段１２４は、上記と同様な手順で、第２の画像にて、連結された白画素領域内に包含されている黒画素領域を、文字領域として抽出する。 The character area separating unit 124 extracts black pixels surrounded by white pixels from the first and second images, and uses them as character areas (steps S4A and S4B). Extraction of black pixels surrounded by white pixels can be performed by the following procedure. The character area separation unit 124 searches for connected white pixels in the first image. The character area separation unit 124 determines whether or not there is a black pixel among the connected white pixels in the first image. When the black pixel area is included in the white pixel area, the character area separating unit 124 extracts the black pixel area as the character area. The character region separation unit 124 extracts, as a character region, a black pixel region included in a connected white pixel region in the second image by the same procedure as described above.
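The extraction of black pixels surrounded by white pixels can be approximated as follows. The specification searches for connected white pixels and then checks for black pixels inside them; this sketch takes the equivalent-in-spirit but different route of flood-filling each 8-connected black component and keeping it only if every pixel bordering it is white. It assumes a three-valued image (1 for white, 0 for black, `None` for unprocessed pixels) and treats a component that touches the image border or an unprocessed pixel as not enclosed. The name `enclosed_black_regions` is illustrative.

```python
WHITE, BLACK = 1, 0  # any other value (e.g. None) means "not binarized"

def enclosed_black_regions(img):
    """Return a list of black components (sorted lists of (y, x)) whose
    entire 8-neighborhood boundary consists of WHITE pixels."""
    h, w = len(img), len(img[0])
    seen = [[False] * w for _ in range(h)]
    regions = []
    for sy in range(h):
        for sx in range(w):
            if img[sy][sx] != BLACK or seen[sy][sx]:
                continue
            stack, comp, enclosed = [(sy, sx)], [], True
            seen[sy][sx] = True
            while stack:
                y, x = stack.pop()
                comp.append((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        if dy == 0 and dx == 0:
                            continue
                        ny, nx = y + dy, x + dx
                        if not (0 <= ny < h and 0 <= nx < w):
                            enclosed = False  # component touches the border
                            continue
                        v = img[ny][nx]
                        if v == BLACK:
                            if not seen[ny][nx]:
                                seen[ny][nx] = True
                                stack.append((ny, nx))
                        elif v != WHITE:
                            enclosed = False  # borders an unprocessed pixel
            if enclosed:
                regions.append(sorted(comp))
    return regions
```

The same routine applies unchanged to the first image and the second image, because both are constructed so that a character candidate is black on white.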

文字領域統合手段１２５は、第１の画像から抽出された文字領域、及び、第２の画像から抽出された文字領域に対して、ラベリング処理を行う。ラベリング際しては、分離されている一文字を構成する複数のパーツ、又は、一連の文字列が１つのラベルとなるように、ラベリング前に画素膨張処理を行ってもよい。文字領域統合手段１２５は、両画像から抽出された文字領域のうち、ラベル位置が重なり合う領域のラベル面積を比較する（ステップＳ５）。文字領域統合手段１２５は、ラベル面積が大きい方の文字領域を残し、ラベル面積が小さい方の文字領域を消去して、両画像から抽出された文字領域を統合する（ステップＳ６）。この統合処理を行うことで、あるラベル画像では、第１の画像又は第２の画像の何れか一方から抽出された文字領域が、統合後の文字領域として残ることになる。 The character region integration unit 125 performs a labeling process on the character region extracted from the first image and the character region extracted from the second image. At the time of labeling, pixel expansion processing may be performed before labeling so that a plurality of parts constituting a separated character or a series of character strings become one label. The character region integration unit 125 compares the label areas of the regions where the label positions overlap among the character regions extracted from both images (step S5). The character region integration means 125 leaves the character region with the larger label area, erases the character region with the smaller label area, and integrates the character regions extracted from both images (step S6). By performing this integration process, in a certain label image, the character area extracted from either the first image or the second image remains as a character area after integration.

以下、具体例を用いて説明する。図４に、入力画像例を示す。入力画像は、背景よりも輝度が高い画素で構成される白抜きの文字（反転文字）と、背景よりも輝度が低い画素で構成される黒色の文字（非反転文字）とを含む。エッジ検出手段１２１は、図４に示す入力画像から、第１のエッジと第２のエッジとを検出する。図５（ａ）に、第１のエッジで構成されるエッジ画像（エッジ画像Ａ）示し、図５（ｂ）に、第２のエッジで構成されるエッジ画像（エッジ画像Ｂ）を示す。図４に示す入力画像からエッジを検出し、エッジの位置の画素を黒、それ以外の画素を白とすると、エッジ画像は、図５（ａ）及び（ｂ）に示すようになる。 Hereinafter, a specific example will be described. FIG. 4 shows an input image example. The input image includes white characters (inverted characters) composed of pixels with higher brightness than the background and black characters (non-inverted characters) composed of pixels with lower brightness than the background. The edge detection means 121 detects the first edge and the second edge from the input image shown in FIG. FIG. 5A shows an edge image (edge image A) composed of the first edge, and FIG. 5B shows an edge image (edge image B) composed of the second edge. If an edge is detected from the input image shown in FIG. 4 and the pixel at the edge position is black and the other pixels are white, the edge image is as shown in FIGS. 5 (a) and 5 (b).

図６（ａ）に、入力画像の一部を拡大して示し、（ｂ）及び（ｃ）に、エッジ画像を拡大して示す。図６（ａ）に示す画像は、“運”の文字のしんにゅう（しんにょう）の点の部分に相当する。点の部分は、高輝度、すなわち、白に近い色であり、背景部分は、グレー、つまり、中間階調の色である。エッジ検出手段１２１は、輝度が増加方向に変化する画素を第１のエッジとして検出する。つまり、エッジ検出手段１２１は、図６（ａ）に示す画像の暗い画素と明るい画素の境界で、暗い画素の位置を第１のエッジとして検出する（図６（ｂ））。また、エッジ検出手段１２１は、輝度が減少方向に変化する画素を第２のエッジとして検出する。つまり、エッジ検出手段１２１は、図６（ａ）に示す画像の暗い画素と明るい画素との境界で、明るい画素の位置を第２のエッジとして検出する（図６（ｃ））。第１のエッジと第２のエッジとは、隣接する関係にある。 FIG. 6A shows an enlarged part of the input image, and FIGS. 6B and 6C show enlarged edge images. The image shown in FIG. 6A corresponds to the dot portion of the shinnyu (shinnyo) radical of the character “運”. The dot portion has high luminance, that is, a color close to white, and the background portion is gray, that is, an intermediate gradation. The edge detection unit 121 detects a pixel at which the luminance changes in the increasing direction as the first edge. That is, at the boundary between a dark pixel and a bright pixel in the image shown in FIG. 6A, the edge detection unit 121 detects the position of the dark pixel as the first edge (FIG. 6B). The edge detection unit 121 also detects a pixel at which the luminance changes in the decreasing direction as the second edge. That is, at the boundary between a dark pixel and a bright pixel in the image shown in FIG. 6A, the edge detection unit 121 detects the position of the bright pixel as the second edge (FIG. 6C). The first edge and the second edge are adjacent to each other.

局所２値化手段１２３は、第１のエッジの位置を用いて、第１のエッジ近傍の入力画像の画素のうち、輝度が低い画素を黒とし、輝度が高い画素を白とする白黒の２値画像（第１の画像）を生成する。また、局所２値化手段１２３は、第２のエッジの位置を用いて、第２のエッジ近傍の入力画像の画素のうち、輝度が高い画素を黒とし、輝度が低い画素を白とする白黒の２値画像（第２の画像）を生成する。図７（ａ）及び（ｂ）に、第１の画像と第２の画像とを示す。図７（ａ）及び（ｂ）において、グレーで示す領域は、２値化処理対象外の領域である。この領域は、背景領域に相当する。 Using the position of the first edge, the local binarization unit 123 generates a black-and-white binary image (first image) in which, among the pixels of the input image near the first edge, pixels with low luminance are black and pixels with high luminance are white. Likewise, using the position of the second edge, the local binarization unit 123 generates a black-and-white binary image (second image) in which, among the pixels of the input image near the second edge, pixels with high luminance are black and pixels with low luminance are white. FIGS. 7A and 7B show the first image and the second image. In FIGS. 7A and 7B, the regions shown in gray are not subject to binarization processing. These regions correspond to the background region.

図８（ａ）及び（ｂ）に、それぞれ、第１の画像及び第２の画像の一部を拡大して示す。図８に示す部分は、図６に示すしんにゅうの点の部分に相当する。局所２値化手段１２３は、図６（ｂ）及び（ｃ）に示すエッジ位置（黒画素の位置）の近傍で２値化処理を行う。エッジ近傍の画素の範囲は、エッジ位置を中心として、上下、左右、及び、斜めに隣接する３×３の領域と定義する。エッジ近傍の画素の範囲は、文字を構成する線の細さや、入力画像中での文字の大きさ、入力画像の解像度などを考慮して、適宜決めておけばよい。 FIGS. 8A and 8B are enlarged views of parts of the first image and the second image, respectively. The portion shown in FIG. 8 corresponds to the dot portion of the shinnyu radical shown in FIG. 6. The local binarization unit 123 performs binarization processing in the vicinity of the edge positions (black pixel positions) shown in FIGS. 6B and 6C. Here, the range of pixels near an edge is defined as the 3 × 3 region centered on the edge position, that is, the edge pixel together with the pixels adjacent to it vertically, horizontally, and diagonally. The range of pixels near an edge may be determined as appropriate in consideration of the thinness of the lines constituting the characters, the size of the characters in the input image, the resolution of the input image, and the like.

入力画像にて（図６（ａ））、第１のエッジよりも内側の部分（白に見える部分）は周囲よりも輝度が高く、第１のエッジとその外側の部分（グレーに見える部分）とは周囲よりも輝度が低い。局所２値化手段１２３は、輝度が高い部分、すなわち、第１のエッジよりも内側の部分を白とし、輝度が低い部分、すなわち、第１のエッジとその外側の部分とを黒とした第１の画像を生成する（図８（ａ））。また、入力画像にて、第２のエッジとその内側の部分とは周囲よりも輝度が高く、第２のエッジよりも外側の部分は周囲よりも輝度が低い。局所２値化手段１２３は、輝度が高い部分、すなわち、第２のエッジとその内側の部分とを黒とし、輝度が低い部分、すなわち、第２のエッジよりも外側の部分を白とした第２の画像を生成する（図８（ｂ））。 In the input image (FIG. 6A), the portion inside the first edge (the portion that appears white) has higher luminance than its surroundings, and the first edge and the portion outside it (the portion that appears gray) have lower luminance than their surroundings. The local binarization unit 123 generates a first image in which the high-luminance portion, that is, the portion inside the first edge, is white and the low-luminance portion, that is, the first edge and the portion outside it, is black (FIG. 8A). Also in the input image, the second edge and the portion inside it have higher luminance than their surroundings, and the portion outside the second edge has lower luminance than its surroundings. The local binarization unit 123 generates a second image in which the high-luminance portion, that is, the second edge and the portion inside it, is black and the low-luminance portion, that is, the portion outside the second edge, is white (FIG. 8B).

文字領域分離手段１２４は、第１の画像及び第２の画像から、白画素に囲まれた黒画素を抽出し、双方の画像にて、文字領域を抽出する。図９（ａ）及び（ｂ）に、文字領域分離手段１２４が抽出した文字領域を示す。図７（ａ）に示す第１の画像（第１のエッジ近傍の２値画像）にて、白画素に囲まれた黒画素を残し、白画素に囲まれていない黒画素を削除すると、図９（ａ）に示す画像が得られる。図９（ａ）に示す画像中の黒画素が、第１の画像から抽出された文字領域に相当する。また、図７（ｂ）に示す第２の画像（第２のエッジ近傍の２値画像）にて、白画素に囲まれた黒画素を残し、白画素に囲まれていない黒画素を削除すると、図９（ｂ）に示す画像が得られる。図９（ｂ）に示す画像中の黒画素が、第２の画像から抽出された文字領域に相当する。 The character area separating unit 124 extracts black pixels surrounded by white pixels from the first image and the second image, thereby extracting a character area from each image. FIGS. 9A and 9B show the character areas extracted by the character area separating unit 124. In the first image (the binary image in the vicinity of the first edge) shown in FIG. 7A, leaving the black pixels that are surrounded by white pixels and deleting the black pixels that are not surrounded by white pixels yields the image shown in FIG. 9A. The black pixels in the image shown in FIG. 9A correspond to the character area extracted from the first image. Similarly, in the second image (the binary image in the vicinity of the second edge) shown in FIG. 7B, leaving the black pixels that are surrounded by white pixels and deleting the black pixels that are not surrounded by white pixels yields the image shown in FIG. 9B. The black pixels in the image shown in FIG. 9B correspond to the character area extracted from the second image.

図１０（ａ）及び（ｂ）に、それぞれ図８（ａ）及び（ｂ）に示す第１の画像及び第２の画像から抽出される文字領域を示す。文字領域分離手段１２４は、図８（ａ）に示す第１の画像中で、縦、横、斜めに連続する白画素（白画素の塊）を探索する。文字領域分離手段１２４は、白画素の塊が見つかると、白画素の塊の中に黒画素が存在するか否かを調べる。入力画像（図４）にて、“運”の文字は反転文字であり、「しんにゅう」の点の部分は輝度が低い領域に囲まれた輝度が高い領域なので、図８（ａ）では、白画素の塊の中に黒画素は存在していない。従って、図８（ａ）に示す第１の画像から抽出される文字領域はない（図１０（ａ））。 FIGS. 10A and 10B show the character areas extracted from the first image and the second image shown in FIGS. 8A and 8B, respectively. The character region separation unit 124 searches the first image shown in FIG. 8A for white pixels that are contiguous vertically, horizontally, or diagonally (a cluster of white pixels). When a cluster of white pixels is found, the character region separation unit 124 checks whether any black pixels exist inside the cluster. In the input image (FIG. 4), the character “運” is an inverted character, and the dot portion of the shinnyu radical is a high-luminance region surrounded by a low-luminance region, so in FIG. 8A no black pixels exist inside the cluster of white pixels. Therefore, no character region is extracted from the first image shown in FIG. 8A (FIG. 10A).

文字領域分離手段１２４は、図８（ｂ）に示す第２画像中で、縦、横、斜めに連続する白画素の塊を探索する。文字領域分離手段１２４は、白画素の塊が見つかると、白画素の塊の中に黒画素が存在するか否かを調べる。図８（ｂ）では、黒画素の周りを白画素が囲んでいるので、白画素の塊の中に存在する黒画素が見つかる。文字領域分離手段１２４は、図８（ｂ）で白画素に囲まれた黒画素を、文字領域として抽出する（図１０（ｂ）)。このように、「しんにゅう」の点の部分に対応する文字領域は、第１の画像からは抽出されず、第２の画像から抽出されることになる。 The character region separation unit 124 searches the second image shown in FIG. 8B for a cluster of white pixels that are contiguous vertically, horizontally, or diagonally. When a cluster of white pixels is found, the character region separation unit 124 checks whether any black pixels exist inside the cluster. In FIG. 8B, white pixels surround the black pixels, so black pixels are found inside the cluster of white pixels. The character region separation unit 124 extracts the black pixels surrounded by white pixels in FIG. 8B as a character region (FIG. 10B). Thus, the character region corresponding to the dot portion of the shinnyu radical is not extracted from the first image but is extracted from the second image.

非反転文字は、背景を構成する画素の輝度が高く、文字を構成する画素の輝度が低い。非反転文字のエッジ近傍を考えると、第１の画像では、第１のエッジ近傍で非反転文字を構成する画素は黒画素になり、非反転文字の背景の画素は白画素になる。また、第２の画像では、第２のエッジ近傍で非反転文字を構成する画素は白画素になり、非反転文字の背景の画素は黒画素になる。非反転文字は、輝度が高い画素に囲まれた輝度が低い画素で構成されるので、白画素に囲まれた黒画素を抽出することで、第１の画像から、非反転文字の文字領域を抽出できる。 In a non-inverted character, the pixels constituting the background have high luminance and the pixels constituting the character have low luminance. Considering the vicinity of the edges of a non-inverted character, in the first image the pixels constituting the non-inverted character near the first edge become black pixels, and the background pixels of the non-inverted character become white pixels. In the second image, the pixels constituting the non-inverted character near the second edge become white pixels, and the background pixels of the non-inverted character become black pixels. Since a non-inverted character is composed of low-luminance pixels surrounded by high-luminance pixels, the character region of the non-inverted character can be extracted from the first image by extracting the black pixels surrounded by white pixels.

また、反転文字は、背景を構成する画素の輝度が低く、文字を構成する画素の輝度が高い。反転文字のエッジ近傍を考えると、第１の画像では、第１のエッジ近傍で反転文字を構成する画素は白画素になり、反転文字の背景の画素は黒画素になる。第２の画像では、第２のエッジ近傍で反転文字を構成する画素は黒画素になり、反転文字の背景の画素は白画素になる。反転文字は、輝度が低い画素に囲まれた輝度が高い画素で構成されるので、白画素に囲まれた黒画素を抽出することで、第２の画像から、反転文字の文字領域を抽出できる。つまり、第１の画像から、非反転文字の文字領域が抽出でき、第２の画像から、反転文字の文字領域が抽出できる。 In an inverted character, by contrast, the pixels constituting the background have low luminance and the pixels constituting the character have high luminance. Considering the vicinity of the edges of an inverted character, in the first image the pixels constituting the inverted character near the first edge become white pixels, and the background pixels of the inverted character become black pixels. In the second image, the pixels constituting the inverted character near the second edge become black pixels, and the background pixels of the inverted character become white pixels. Since an inverted character is composed of high-luminance pixels surrounded by low-luminance pixels, the character region of the inverted character can be extracted from the second image by extracting the black pixels surrounded by white pixels. In short, the character regions of non-inverted characters can be extracted from the first image, and the character regions of inverted characters can be extracted from the second image.

第１の画像から抽出された文字領域には、非反転文字を構成する画素に加えて、反転文字で文字を構成する画素に囲まれた背景部分の画素が含まれる。例えば、反転文字である“用”の文字では、文字の内側の背景画素が、文字を構成する画素に囲まれている（図４）。反転文字は、文字を構成する画素の輝度が高く、背景の画素の輝度が低いため、“用”の文字の内側の背景画素は、第１の画像にて白画素に囲まれた黒画素として現れ、文字領域として抽出される（図９（ａ））。第２の画像でも、同様に、非反転文字の内側の背景画素が、文字領域として抽出される（図９（ｂ））。これら、文字領域として抽出された背景画素は、ノイズ成分となる。 In addition to the pixels constituting non-inverted characters, the character region extracted from the first image includes background pixels that are enclosed by the pixels constituting an inverted character. For example, in the inverted character “用”, the background pixels inside the character are surrounded by the pixels constituting the character (FIG. 4). Because the pixels constituting an inverted character have high luminance and its background pixels have low luminance, the background pixels inside the character “用” appear in the first image as black pixels surrounded by white pixels and are extracted as a character region (FIG. 9A). In the second image, the background pixels inside non-inverted characters are likewise extracted as a character region (FIG. 9B). These background pixels extracted as character regions are noise components.

文字領域統合手段１２５は、同じ位置では、文字は、反転文字又は非反転文字の何れか一方で構成されると仮定し、第１の画像から抽出された文字領域に含まれる反転文字の内側の背景画素、及び、第２の画像から抽出された文字領域に含まれる非反転文字の内側の背景画素を取り除く。文字領域統合手段１２５は、背景画素を取り除いた双方の文字領域を統合する。言い換えれば、文字領域統合手段１２５は、双方の画像から抽出された文字領域が重複した位置にあるとき、第１の画像から抽出された文字領域と第２の画像から抽出された文字領域との何れか一方を、統合後の文字領域として選択する。 The character region integration unit 125 assumes that, at any given position, a character is either an inverted character or a non-inverted character, and removes the background pixels inside inverted characters that are included in the character region extracted from the first image, as well as the background pixels inside non-inverted characters that are included in the character region extracted from the second image. The character region integration unit 125 then integrates the two character regions from which the background pixels have been removed. In other words, when the character regions extracted from the two images are at overlapping positions, the character region integration unit 125 selects either the character region extracted from the first image or the character region extracted from the second image as the character region after integration.

図１１（ａ）〜（ｃ）に、文字領域統合処理の様子を示す。文字領域統合手段１２５は、第１の画像から抽出された文字領域に対して画素膨張処理を行い、黒画素が連結する部分にラベル付けを行う（図１１（ａ））。また、文字領域統合手段１２５は、第２の画像から抽出された文字領域についても、同様に、画素膨張処理とラベル付けとを行う（図１１（ｂ））。ラベル付けの前に、画素膨張処理を行っておくことで、第２の画像から抽出された文字領域で（図９（ｂ））、１文字ずつ分離していた“先進の機能”の文字領域が隣接する文字と連結し、文字列に対して、同じラベルが付与されることになる。 FIGS. 11A to 11C show the character region integration process. The character region integration unit 125 performs pixel expansion processing on the character region extracted from the first image, and labels the portion where the black pixels are connected (FIG. 11A). Similarly, the character region integration unit 125 performs pixel expansion processing and labeling on the character region extracted from the second image (FIG. 11B). By performing pixel dilation processing before labeling, the character region extracted from the second image (FIG. 9B) is a character region of “advanced function” that has been separated character by character. Are connected to adjacent characters, and the same label is assigned to the character string.
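The effect of dilating before labeling can be sketched as follows. The sketch assumes a boolean mask (True for extracted character pixels), a 3 × 3 structuring element, and 8-connected labeling; the function names `dilate` and `label` are illustrative, and the size of the structuring element is an assumption, since the specification does not state how far the pixels are expanded.

```python
def dilate(mask):
    """Binary dilation with a 3x3 structuring element."""
    h, w = len(mask), len(mask[0])
    out = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        out[ny][nx] = True
    return out

def label(mask):
    """Return the 8-connected components of `mask` as a list of sets of (y, x)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    labels = []
    for sy in range(h):
        for sx in range(w):
            if not mask[sy][sx] or seen[sy][sx]:
                continue
            stack, comp = [(sy, sx)], set()
            seen[sy][sx] = True
            while stack:
                y, x = stack.pop()
                comp.add((y, x))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            stack.append((ny, nx))
            labels.append(comp)
    return labels
```

Two character pixels separated by a two-pixel gap receive separate labels without dilation, but a single label after one dilation pass, which is the mechanism by which neighboring characters of a string end up sharing one label.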

文字領域統合手段１２５は、第１の画像から抽出された文字領域のラベルと、第２の画像から抽出された文字領域のラベルとで、重複する位置にあるラベルが存在するか否かを調べる。文字領域統合手段１２５は、双方の文字領域で重複した位置にラベルが存在しないとき、つまり、何れか一方にのみラベルが存在する位置では、ラベルが存在する方の文字領域（第１の画像又は第２の画像からを抽出された黒画素）を、統合後の文字領域とする。文字領域統合手段１２５は、重複する位置にラベルが存在する場合は、ラベル領域の大きさ（面積）を比較し、ラベル面積が大きい方の文字領域を、統合後の文字領域とする（図１１（ｃ））。ラベル面積に代えて、ラベル内の画素数を比較してもよい。 The character region integration unit 125 checks whether any label of the character region extracted from the first image and any label of the character region extracted from the second image are at overlapping positions. Where no labels overlap between the two character regions, that is, at positions where a label exists in only one of them, the character region integration unit 125 takes the character region in which the label exists (the black pixels extracted from the first image or from the second image) as the character region after integration. Where labels exist at overlapping positions, the character region integration unit 125 compares the sizes (areas) of the label regions and takes the character region with the larger label area as the character region after integration (FIG. 11C). Instead of the label area, the number of pixels in each label may be compared.
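The selection rule can be sketched under the assumption that each label is represented as a set of (y, x) pixel coordinates, that two labels overlap when they share at least one pixel, and that the pixel count is used in place of the label area, as permitted above. The function name `integrate` and the tie-breaking rule (the first image wins when the counts are equal) are assumptions, not part of the specification.

```python
def integrate(labels_a, labels_b):
    """Merge two label lists; for every overlapping pair, drop the smaller label.

    labels_a: labels from the first image, labels_b: labels from the second
    image. Each label is a set of (y, x) coordinates.
    """
    keep_a = [True] * len(labels_a)
    keep_b = [True] * len(labels_b)
    for i, a in enumerate(labels_a):
        for j, b in enumerate(labels_b):
            if a & b:                      # the labels share at least one pixel
                if len(a) >= len(b):       # assumption: ties favor the first image
                    keep_b[j] = False
                else:
                    keep_a[i] = False
    kept = [a for a, k in zip(labels_a, keep_a) if k]
    kept += [b for b, k in zip(labels_b, keep_b) if k]
    return kept
```

Labels that overlap nothing in the other image pass through unchanged, which corresponds to the case in which a label exists in only one of the two character regions.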

例えば、入力画像（図４）に含まれる“先進の機能”の部分を考える。この部分は反転文字なので、文字領域は、第２の画像から抽出される（図９（ｂ））。しかし、第１の画像からも、輝度が低い画素に囲まれた輝度が高い画素が、文字領域として抽出される（図９（ａ））。文字領域統合手段１２５がラベル付けを行うと、第１の画像から抽出された文字領域では６つのラベルができ（図１１（ａ））、第２の画像から抽出された文字領域では１つのラベルができる（図１１（ｂ））。文字領域統合手段１２５は、図１１（ａ）に存在する６つのラベルのそれぞれと、図１１（ｂ）に存在するラベルとを比較する。図１１（ａ）に存在する６つのラベルのラベル面積は、何れも、図１１（ｂ）に存在するラベルのラベル面積よりも小さい。従って、文字領域統合手段１２５は、“先進の機能”の部分については、第２の画像から抽出された文字列を選択する。最終的に、統合後の文字領域では、第２の画像から抽出された文字領域が残る（図１１（ｃ））。 For example, consider the “advanced function” part included in the input image (FIG. 4). Since this part is an inverted character, the character region is extracted from the second image (FIG. 9B). However, from the first image, pixels with high luminance surrounded by pixels with low luminance are extracted as character regions (FIG. 9A). When the character area integration unit 125 performs labeling, six labels are generated in the character area extracted from the first image (FIG. 11A), and one label is generated in the character area extracted from the second image. (FIG. 11B). The character area integrating unit 125 compares each of the six labels existing in FIG. 11A with the labels existing in FIG. The label areas of the six labels existing in FIG. 11A are all smaller than the label areas of the labels existing in FIG. Therefore, the character area integration unit 125 selects the character string extracted from the second image for the “advanced function” portion. Finally, the character region extracted from the second image remains in the character region after integration (FIG. 11C).

本実施形態では、エッジ検出手段１２１は、入力画像から、隣接する画素間での特徴量の変化に基づいてエッジを検出する。領域抽出分離手段１２２は、検出されたエッジに対応する入力画像の画素の近傍の画素を、特徴量が高い画素と特徴量が低い画素とに区分する。領域抽出分離手段１２２は、特徴量が高い画素に囲まれた特徴量が低い画素、及び、特徴量が低い画素に囲まれた特徴量が高い画素を文字領域として抽出する。文字と背景との境界は、エッジとして検出され、エッジ近傍の画素は、特徴量が高い画素と特徴量が低い画素とを含んでいる。エッジ近傍の画素にて、特徴量が高い画素に囲まれた特徴量が低い画素を抽出することで、非反転文字の文字領域を抽出できる。また、エッジ近傍の画素にて、特徴量が低い画素に囲まれた特徴量が高い画素を抽出することで、反転文字の文字領域を抽出できる。 In the present embodiment, the edge detection unit 121 detects an edge from an input image based on a change in feature amount between adjacent pixels. The region extraction / separation unit 122 classifies the pixels in the vicinity of the pixel of the input image corresponding to the detected edge into a pixel having a high feature amount and a pixel having a low feature amount. The region extraction / separation unit 122 extracts, as a character region, pixels having a low feature amount surrounded by pixels having a high feature amount and pixels having a high feature amount surrounded by pixels having a low feature amount. The boundary between the character and the background is detected as an edge, and pixels near the edge include a pixel with a high feature value and a pixel with a low feature value. By extracting a pixel having a low feature amount surrounded by pixels having a high feature amount from pixels in the vicinity of the edge, a character region of a non-inverted character can be extracted. In addition, by extracting a pixel having a high feature amount surrounded by pixels having a low feature amount from pixels in the vicinity of the edge, a character region of an inverted character can be extracted.

本実施形態では、エッジ近傍から、反転文字の文字領域と非反転文字の文字領域との双方を抽出しているので、文字領域の抽出に際して、あらかじめ行がどのように構成されているかという情報を用いる必要はない。従って、入力画像は、定型の文書画像に限定されず、種々の入力画像から、反転文字と非反転文字の文字領域を抽出することができる。また、文字が斜めに並ぶ場合や、湾曲した曲線上に配置される場合も、反転文字と非反転文字の文字領域を抽出できる。 In this embodiment, since both the character area of the inverted character and the character area of the non-inverted character are extracted from the vicinity of the edge, information on how the line is configured in advance when extracting the character area is obtained. There is no need to use it. Therefore, the input image is not limited to a standard document image, and character regions of inverted characters and non-inverted characters can be extracted from various input images. In addition, even when characters are arranged obliquely or arranged on a curved curve, the character regions of inverted characters and non-inverted characters can be extracted.

また、本実施形態では、エッジ近傍にて、特徴量が低い画素に囲まれた特徴量が高い画素を反転文字の文字領域としているので、反転文字の判定に際して、反転文字の領域に外接矩形を設定し、その外接矩形内の黒画素と白画素との比を計算する必要がない。黒画素と白画素との比に基づいて反転領域を判定する方式では、外接矩形の取り方や、文字のフォント、文字を構成する線の太さなどに応じて、反転文字を正しく判定できないことがある。これに対し、本実施形態では、外接矩形は必要なく、フォントや線の太さに依存せずに、反転文字の文字領域を抽出できる。従って、黒画素と白画素との比に基づいて反転領域を判定する方式に比して、反転文字の文字領域を精度よく抽出できる。 Further, in the present embodiment, pixels having a high feature amount that are surrounded by pixels having a low feature amount in the vicinity of an edge are taken as the character region of an inverted character. Therefore, when determining inverted characters, there is no need to set a circumscribed rectangle around the inverted character region and calculate the ratio of black pixels to white pixels within that rectangle. In a method that determines inverted regions based on the ratio of black pixels to white pixels, inverted characters may not be determined correctly, depending on how the circumscribed rectangle is set, the font of the characters, the thickness of the lines constituting the characters, and so on. In contrast, the present embodiment requires no circumscribed rectangle and can extract the character region of an inverted character independently of the font and the line thickness. Therefore, the character region of an inverted character can be extracted with higher accuracy than with a method that determines inverted regions based on the ratio of black pixels to white pixels.

続いて、本発明の第２実施形態について説明する。図１２は、本発明の第２実施形態の画像処理装置を示している。画像処理装置１００Ａは、画像入力装置１１０と、データ処理部１２０Ａと、データ記憶部１３０と、画像出力装置１４０とを有する。データ処理部１２０Ａは、エッジ検出手段１２１、領域抽出分離手段１２２、文字領域統合手段１２５、画像縮小手段１２６、及び、領域マッチング手段１２７を有する。第２実施形態におけるデータ処理部１２０Ａの構成は、図２に示す第１実施形態のデータ処理部１２０の構成に、画像縮小手段１２６と領域マッチング手段１２７とが追加された構成である。 Subsequently, a second embodiment of the present invention will be described. FIG. 12 shows an image processing apparatus according to the second embodiment of the present invention. The image processing apparatus 100A includes an image input device 110, a data processing unit 120A, a data storage unit 130, and an image output device 140. The data processing unit 120A includes an edge detection unit 121, a region extraction / separation unit 122, a character region integration unit 125, an image reduction unit 126, and a region matching unit 127. The configuration of the data processing unit 120A in the second embodiment is a configuration in which an image reduction unit 126 and a region matching unit 127 are added to the configuration of the data processing unit 120 of the first embodiment shown in FIG.

画像縮小手段１２６は、画像入力装置１１０が入力した画像を縮小して、エッジ検出手段１２１に渡す。画像縮小手段１２６は、入力画像を、所望の解像度の画像に縮小する。画像縮小手段１２６は、例えば、バイリニア方式やバイキュービック方式のような公知の縮小方式を用いて、入力画像を所望の解像度に縮小する。 The image reduction unit 126 reduces the image input by the image input device 110 and passes it to the edge detection unit 121. The image reducing unit 126 reduces the input image to an image with a desired resolution. The image reduction unit 126 reduces the input image to a desired resolution using a known reduction method such as a bilinear method or a bicubic method.

画像の縮小率は、例えば、以下のように決定する。事前に、パラメータ記憶部１３１にテーブルを記憶しておく。そのテーブルには、画像から抽出したい文字の文字サイズと、その文字サイズの文字領域の抽出が可能となる画像の解像度との対応を記憶しておく。ユーザは、処理対象の文字サイズを指定する。画像縮小手段１２６は、テーブルを参照して、ユーザが指定した文字サイズに対応する解像度を得る。画像縮小手段１２６は、入力画像の解像度と、テーブルに記憶された解像度とから、画像の縮小率を決定する。或いは、テーブルに、処理対象の文字サイズと、縮小率との対応を記憶しておいてもよい。この場合、画像縮小手段１２６は、テーブルから、ユーザが指定した文字サイズに対応する縮小率を取得する。 The image reduction rate is determined as follows, for example. A table is stored in the parameter storage unit 131 in advance. The table stores the correspondence between the character size of the character desired to be extracted from the image and the image resolution that enables extraction of the character area of the character size. The user specifies the character size to be processed. The image reduction unit 126 refers to the table to obtain a resolution corresponding to the character size designated by the user. The image reduction means 126 determines the image reduction ratio from the resolution of the input image and the resolution stored in the table. Alternatively, the correspondence between the character size to be processed and the reduction rate may be stored in the table. In this case, the image reduction unit 126 acquires a reduction ratio corresponding to the character size designated by the user from the table.
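The table lookup can be sketched as follows. The table contents, the use of dpi as the unit of resolution, and the names `RESOLUTION_FOR_CHAR_SIZE` and `reduction_ratio` are all hypothetical; the specification states only that a correspondence between character size and resolution (or reduction ratio) is stored in advance. The sketch also assumes that an image is never enlarged, so the ratio is capped at 1.0.

```python
# Hypothetical table: minimum character size to extract (points) -> the image
# resolution (dpi) at which characters of that size can still be extracted.
RESOLUTION_FOR_CHAR_SIZE = {10: 300, 16: 200, 20: 150}

def reduction_ratio(input_dpi, char_size_pt, table=RESOLUTION_FOR_CHAR_SIZE):
    """Return the scale factor to apply to the input image (<= 1.0)."""
    target_dpi = table[char_size_pt]
    # Assumption: the image is only ever reduced, never enlarged.
    return min(1.0, target_dpi / input_dpi)
```

With these hypothetical values, a 300 dpi input and a user-specified size of 20 points give a ratio of 0.5, so the image would be reduced to half its width and height before edge detection.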

例えば、画像から、２０ポイント以上の文字を抽出したい場合を考える。画像中には、２０ポイント以上の文字加えて、１０ポイントや１６ポイントなどの２０ポイントよりも小さいサイズの文字も含まれているとする。入力画像が、１０ポイントや１６ポイントの文字の抽出に対して十分な解像度を持っている場合、入力画像をそのまま用いて文字領域の抽出を行うと、入力画像から、必要がない２０ポイントよりも小さい文字まで抽出されることになる。そこで、画像縮小手段１２６を用いて処理対象の画像の解像度を落とし、必要以上に小さいサイズの文字が抽出されないようにする。 For example, consider a case where it is desired to extract characters of 20 points or more from an image. In the image, it is assumed that characters having a size smaller than 20 points such as 10 points and 16 points are included in addition to characters of 20 points or more. If the input image has sufficient resolution for extracting 10-point or 16-point characters, extracting the character area using the input image as it is is more than the unnecessary 20 points from the input image. Even small characters will be extracted. Therefore, the image reduction means 126 is used to reduce the resolution of the image to be processed so that characters with a size smaller than necessary are not extracted.

The edge detection performed by the edge detection means 121, the character region extraction performed by the region extraction and separation means 122, and the character region integration (integration of inverted and non-inverted characters) performed by the character region integration means 125 are the same as in the first embodiment. In the second embodiment, however, edge detection, character region extraction, and character region integration are performed not on the image input by the image input device 110 (the original image) but on the image reduced by the image reduction means 126. The character region integration means 125 outputs the character regions extracted from the reduced image.

The region matching means 127 matches the character regions extracted from the reduced image against the original image and extracts detailed character regions from the original image. Depending on whether a character region extracted from the reduced image was extracted from the first image (non-inverted character) or from the second image (inverted character), the region matching means 127 extracts, as the character region, either the pixels having a low feature amount or the pixels having a high feature amount from the area of the original image that corresponds to the pixels extracted as the character region in the reduced image. In the following, the luminance component of the pixel value is used as the feature amount. The feature amount may be another component, such as hue.

FIG. 13 shows the operation procedure. The image input device 110 inputs an image (step S1). The image reduction means 126 reduces the input image (step S7). The image reduction means 126 refers to the parameter storage unit 131 and determines the image reduction ratio so that the reduced image has a resolution suited to the character size that the user has specified for processing. Alternatively, the user may set the reduction ratio arbitrarily.

The operations from steps S2A and S2B through step S6 are the same as in the first embodiment. That is, the edge detection means 121 detects the first edge and the second edge from the reduced image (steps S2A and S2B). The local binarization means 123 generates, in the vicinity of the first edge, a first image in which pixels with high luminance in the reduced image are white and pixels with low luminance are black (step S3A). The local binarization means 123 also generates, in the vicinity of the second edge, a second image in which pixels with low luminance in the reduced image are white and pixels with high luminance are black (step S3B). The character region separation means 124 extracts black pixels surrounded by white pixels as character regions in each of the first image and the second image (steps S4A and S4B). The character region integration means 125 compares the areas of character regions located at overlapping positions (step S5) and integrates the character regions (step S6).
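Steps S5 and S6 can be illustrated with a small sketch. It assumes that a character region is represented as a set of (row, column) pixels, that "overlapping positions" means overlapping bounding boxes, and that the region with the larger area is kept (as in the claim on label areas). The function names and the tie-breaking rule are hypothetical.

```python
def bounding_box(region):
    """Return (top, left, bottom, right) of a set of (row, col) pixels."""
    rows = [r for r, _ in region]
    cols = [c for _, c in region]
    return min(rows), min(cols), max(rows), max(cols)


def boxes_overlap(a, b):
    """True if two inclusive bounding boxes share at least one position."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def integrate(first_regions, second_regions):
    """Merge regions from the first image (non-inverted candidates) and the
    second image (inverted candidates), keeping the larger region wherever
    two candidates overlap. Ties keep the non-inverted region (assumption)."""
    kept = [("non-inverted", region) for region in first_regions]
    for candidate in second_regions:
        box = bounding_box(candidate)
        rivals = [i for i, (kind, region) in enumerate(kept)
                  if kind == "non-inverted"
                  and boxes_overlap(box, bounding_box(region))]
        # The inverted candidate wins only if it is larger than every rival.
        if all(len(candidate) > len(kept[i][1]) for i in rivals):
            for i in reversed(rivals):
                del kept[i]
            kept.append(("inverted", candidate))
    return kept


if __name__ == "__main__":
    # A dark ring (like the letter "O") and the bright hole inside it.
    ring = {(r, c) for r in range(3) for c in range(3)} - {(1, 1)}
    hole = {(1, 1)}
    print(integrate([ring], [hole]))
```

In this example the hole of the letter is discarded because the ring has the larger area, so only the non-inverted region remains.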

The character regions integrated by the character region integration means 125 are character regions extracted from the reduced image. Because the resolution of the reduced image is lower than that of the original image, the extracted character regions are only coarse. The region matching means 127 matches the character regions integrated by the character region integration means 125 against the image input in step S1 (the original image) and extracts detailed character regions at the resolution of the original image (step S8).

The region matching means 127 receives from the character region integration means 125 the positions of the pixels that constitute a character region in the reduced image, together with information indicating whether each pixel belongs to a character region extracted from the first image or from the second image. For a character region extracted from the first image, the region matching means 127 extracts pixels with low luminance as the character region from the corresponding area of the original image. For a character region extracted from the second image, the region matching means 127 extracts pixels with high luminance as the character region from the corresponding area of the original image.

FIG. 14(a) shows a character region extracted from the reduced image, and FIG. 14(b) shows the corresponding area of the original image. In FIG. 14(a), the pixels drawn in a dark color are the pixels extracted as the character region. The number written in each pixel extracted as the character region is the feature amount (luminance value) of that pixel in the reduced image. The image reduction ratio is 1/3, so one pixel of the reduced image corresponds to a 3 × 3 area of the original image. The 2 × 2 area of the reduced image shown in FIG. 14(a) corresponds to the 6 × 6 area of the original image shown in FIG. 14(b). In the original image, the pixels marked A have a luminance value of 14, and the pixels marked B have a luminance value of 100.

For example, consider the upper-right pixel (luminance value 57), as seen facing the page, of the four pixels shown in FIG. 14(a). The area of the original image corresponding to this pixel (a 3 × 3 area) contains four pixels with a luminance value of 14 (the pixels marked A in FIG. 14(b)) and five pixels with a luminance value of 100 (the pixels marked B in FIG. 14(b)). Assume that the extracted character region was extracted from the first image, that is, it is a character region of non-inverted characters in which high-luminance pixels form the background and low-luminance pixels form the characters.

The region matching means 127 compares the luminance value of the pixel extracted as the character region from the reduced image (luminance value 57) with the luminance value of each pixel in the corresponding area of the original image. Within that area, the region matching means 127 judges pixels whose luminance value is lower than 57 to be low-luminance pixels and extracts them as the character region. In FIG. 14(b), the region matching means 127 extracts the pixels A as the character region. The region matching means 127 performs the same processing for the other areas. In the end, the region matching means 127 extracts from the original image the pixels drawn in a dark color in FIG. 14(b) as the character region.

If the extracted character region was extracted from the second image, that is, if it is a character region of inverted characters in which low-luminance pixels form the background and high-luminance pixels form the characters, the region matching means 127 extracts the character region by the reverse of the operation described above. That is, in the corresponding area of the original image, the region matching means 127 extracts as the character region the pixels whose luminance is higher than the luminance value of the pixel extracted as the character region from the reduced image.
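The matching of step S8 for both non-inverted and inverted characters can be sketched as below. This is a minimal illustration under stated assumptions: the original image is a 2D list of luminance values, each reduced pixel maps to a `factor` × `factor` block, and the function name `refine_region` and the tuple layout of `reduced_pixels` are hypothetical. The 3 × 3 block in the example reuses the values of FIG. 14 (four pixels of 14, five pixels of 100, reduced luminance 57), but the arrangement of the pixels inside the block is made up.

```python
def refine_region(original, reduced_pixels, factor=3):
    """Extract detailed character pixels from the original image.

    original:       2D list of luminance values.
    reduced_pixels: iterable of (row, col, luminance, inverted) for each
                    pixel extracted as a character region in the reduced
                    image; `inverted` is True for regions that came from
                    the second image.
    Returns a set of (row, col) positions in the original image.
    """
    height, width = len(original), len(original[0])
    detailed = set()
    for r, c, lum, inverted in reduced_pixels:
        # Block of the original image covered by this reduced pixel,
        # clamped to the image size.
        for y in range(r * factor, min((r + 1) * factor, height)):
            for x in range(c * factor, min((c + 1) * factor, width)):
                value = original[y][x]
                # Non-inverted: keep darker pixels. Inverted: keep brighter.
                if (value > lum) if inverted else (value < lum):
                    detailed.add((y, x))
    return detailed


if __name__ == "__main__":
    block = [
        [100, 14, 14],
        [100, 14, 14],
        [100, 100, 100],
    ]
    print(sorted(refine_region(block, [(0, 0, 57, False)])))
    print(sorted(refine_region(block, [(0, 0, 57, True)])))
```

Using the luminance of the reduced pixel as the per-block threshold follows the description; how the reduced luminance itself is computed during reduction is not specified there and is not modeled here.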

In the present embodiment, the image reduction means 126 reduces the input image. The region matching means 127 matches the character regions extracted from the reduced image against the original image before reduction and extracts character regions from the original image. When the original image has an excessive resolution for the character size to be processed, character regions of small characters that need not be extracted are also extracted. In addition, the higher the resolution, the more pixels must be processed, so the amount of memory required for processing increases and the time needed to extract character regions becomes longer. In the present embodiment, edge detection and character region extraction are performed on the reduced image, so the amount of memory used can be kept lower and the processing speed can be improved compared with using the high-resolution original image. The other effects are the same as in the first embodiment.

In each of the above embodiments, the edge detection means 121 detects the first edge and the second edge according to the direction of change of the feature amount, but only one of the two edges may be detected. This is because the first edge and the second edge are adjacent to each other. The character region of an inverted character can therefore also be extracted by extracting, in the vicinity of the first edge, pixels having a high feature amount surrounded by pixels having a low feature amount, and the character region of a non-inverted character can also be extracted by extracting, in the vicinity of the second edge, pixels having a low feature amount surrounded by pixels having a high feature amount. However, detecting the first edge and the second edge according to the direction of change of the feature amount is considered to allow more accurate extraction of character regions. The reason is that, at a boundary between pixels having a high feature amount and pixels having a low feature amount, the pixel having the low feature amount becomes the first edge and the pixel having the high feature amount becomes the second edge, so the first edge coincides with the positions of pixels that form non-inverted characters and the second edge coincides with the positions of pixels that form inverted characters.
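The relationship between the two edges can be illustrated with a short sketch. It assumes luminance as the feature amount, compares only horizontally adjacent pixels for brevity, and marks the darker pixel of each strong transition as a first-edge pixel and the brighter pixel as a second-edge pixel, following the explanation above. The function name `detect_edges` and the threshold value are hypothetical.

```python
def detect_edges(image, threshold):
    """Return (first_edge, second_edge) as sets of (row, col) positions.

    A pair of horizontally adjacent pixels whose luminance differs by at
    least `threshold` is treated as a boundary. The darker pixel of the
    pair is added to the first edge, the brighter one to the second edge.
    """
    first_edge, second_edge = set(), set()
    for y, row in enumerate(image):
        for x in range(len(row) - 1):
            left, right = row[x], row[x + 1]
            if abs(left - right) >= threshold:
                if left < right:
                    dark, bright = (y, x), (y, x + 1)
                else:
                    dark, bright = (y, x + 1), (y, x)
                first_edge.add(dark)
                second_edge.add(bright)
    return first_edge, second_edge


if __name__ == "__main__":
    # One scan line: bright background, a dark stroke, bright background.
    line = [[100, 100, 14, 14, 100]]
    first, second = detect_edges(line, threshold=50)
    print(sorted(first), sorted(second))
```

In the example the first-edge pixels fall on the dark stroke and the second-edge pixels on the bright background next to it, which shows why the two edges are always adjacent.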

In each of the above embodiments, both the first image and the second image are generated, but only one of the two binary images may be generated. The character regions of non-inverted characters and of inverted characters can be extracted by extracting pixels having a low feature amount surrounded by pixels having a high feature amount and pixels having a high feature amount surrounded by pixels having a low feature amount. Accordingly, without generating the second image, the first image may be used to extract black pixels surrounded by white pixels as the character regions of non-inverted characters and white pixels surrounded by black pixels as the character regions of inverted characters. Conversely, without generating the first image, the second image may be used to extract white pixels surrounded by black pixels as the character regions of non-inverted characters and black pixels surrounded by white pixels as the character regions of inverted characters.
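The single-image variant can be sketched as follows, using the first-image convention (0 = black for low feature amount, 1 = white for high feature amount). The sketch makes two simplifying assumptions that are not in the description: the whole image is already binarized rather than only the neighborhoods of edges, and a region counts as "surrounded" when its 4-connected component does not touch the image border. All names are hypothetical.

```python
from collections import deque


def components(binary):
    """Yield (value, pixels, touches_border) per 4-connected component."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    for sy in range(h):
        for sx in range(w):
            if seen[sy][sx]:
                continue
            value = binary[sy][sx]
            seen[sy][sx] = True
            queue, pixels, border = deque([(sy, sx)]), set(), False
            while queue:
                y, x = queue.popleft()
                pixels.add((y, x))
                if y in (0, h - 1) or x in (0, w - 1):
                    border = True
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < h and 0 <= nx < w and not seen[ny][nx]
                            and binary[ny][nx] == value):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            yield value, pixels, border


def extract_both(binary):
    """Return (non_inverted, inverted) region lists from one binary image.

    Black components enclosed by white are non-inverted candidates;
    white components enclosed by black are inverted candidates.
    """
    non_inverted, inverted = [], []
    for value, pixels, border in components(binary):
        if border:
            continue  # touches the image border, so it is not enclosed
        (non_inverted if value == 0 else inverted).append(pixels)
    return non_inverted, inverted


if __name__ == "__main__":
    image = [
        [1, 1, 1, 1, 1],
        [1, 0, 0, 0, 1],
        [1, 0, 1, 0, 1],
        [1, 0, 0, 0, 1],
        [1, 1, 1, 1, 1],
    ]
    print(extract_both(image))
```

For the second-image convention the two output lists would simply swap roles, matching the converse case described in the text.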

The image read by the image input device 110 may be subjected to some image processing before being input to the data processing unit 120. For example, before the edge detection means 121 performs edge detection, region discrimination processing may be performed to remove from the input image areas that clearly contain no characters, such as pictures and photographs. In the second embodiment, a configuration that does not perform matching between the character regions extracted from the reduced image and the original image is also possible. That is, if the reduced image has sufficient resolution and character recognition or the like is possible using the character regions extracted from the reduced image, the coarsely extracted character regions may be output from the image output device 140 without performing matching.

Although the present invention has been described above based on its preferred embodiments, the image processing apparatus, method, and program of the present invention are not limited to the above embodiments, and configurations obtained by applying various modifications and changes to the above embodiments are also included in the scope of the present invention.

The present invention is applicable to uses such as a character region extraction apparatus that extracts character regions from document images containing various character decorations, and a program for implementing such a character region extraction apparatus on a computer. The present invention is also applicable to character region extraction in an optical character reader and in a file generation apparatus that outputs character recognition results to a file.

10: Image processing apparatus
11: Edge detection means
12: Region extraction and separation means
100: Image processing apparatus
110: Image input device
120: Data processing unit
121: Edge detection means
122: Region extraction and separation means
123: Local binarization means
124: Character region separation means
125: Character region integration means
126: Image reduction means
127: Region matching means
130: Data storage unit
131: Parameter storage unit
140: Image output device

Claims

An edge detection means for detecting an edge based on a change in a feature amount between adjacent pixels from an input image;
A pixel in the vicinity of the pixel of the input image corresponding to the detected edge is divided into a pixel having a high feature value and a pixel having a low feature value based on the feature value, and is surrounded by pixels having a high feature value. An image processing apparatus comprising: a pixel having a low feature amount; and a region extraction / separation unit that extracts a pixel having a high feature amount surrounded by pixels having a low feature amount as a character region.

The image processing apparatus according to claim 1, wherein the region extraction and separation means includes: local binarization means that generates, for the pixels in the vicinity of the pixel of the input image corresponding to the edge, a first image in which pixels having a low feature amount are black and pixels having a high feature amount are white, and a second image in which pixels having a high feature amount are black and pixels having a low feature amount are white; and character region separation means that extracts, as character regions, black pixels surrounded by white pixels from the first image and the second image.

The image processing apparatus according to claim 2, further comprising character region integration means that, when the character region extracted from the first image and the character region extracted from the second image are at overlapping positions, selects one of the overlapping character regions as the character region.

The image processing apparatus according to claim 3, wherein the character region integration means labels the character region extracted from the first image and the character region extracted from the second image and, for labels at overlapping positions, selects the character region having the larger label area as the character region.

The image processing apparatus according to claim 1, further comprising an image reduction unit that reduces an image size of the original image and inputs the reduced image to the edge detection unit.

The image processing apparatus according to claim 5, further comprising a region matching unit that performs matching between the character region extracted from the reduced image and the original image, and extracts a detailed character region from the original image.

The area matching unit is configured to extract a pixel extracted as a character area in the reduced image when a pixel extracted as a character area in the reduced image corresponds to a character area surrounded by pixels having a high feature quantity and having a low feature quantity. Among the pixels of the original image region corresponding to the above, the pixel having a feature amount lower than the feature amount of the pixel of the reduced image is extracted as the detailed character region, and the pixel extracted as the character region in the reduced image is, When corresponding to a character region having a high feature amount surrounded by pixels having a low feature amount, out of the pixels in the original image region corresponding to the pixels extracted as the character region in the reduced image, The image processing apparatus according to claim 6, wherein a pixel having a feature amount higher than a feature amount is extracted as the detailed character region.

The edge detection means detects an edge in the direction in which the feature amount increases as a first edge, detects an edge in a direction in which the feature amount decreases as a second edge, and the region extraction separation means Among the pixels of the input image near the first edge, pixels having a low feature amount surrounded by pixels having a high feature amount are extracted as a character region, and pixels of the input image near the second edge The image processing apparatus according to claim 1, wherein a pixel having a high feature amount surrounded by pixels having a low feature amount is extracted as a character region.

The edge detection means detects, as the first edge, a pixel in which a change in luminance value between adjacent pixels is greater than or equal to a threshold value and changes from a bright luminance value to a dark luminance value from the input image, The image processing apparatus according to claim 8, wherein a pixel in which the change in the luminance value is equal to or greater than a threshold and the luminance value changes from a dark luminance value to a bright luminance value is detected as the second edge.

The edge detection unit detects, as the first edge, a pixel in which a change in hue of an adjacent pixel is greater than or equal to a threshold value and a saturation value is changed from a large value to a small value from the input image. The image processing apparatus according to claim 8, wherein a pixel whose change is greater than or equal to a threshold and whose saturation changes from a small value to a large value is detected as the second edge.

An image processing method comprising:
a step in which a computer detects an edge from an input image based on a change in a feature amount between adjacent pixels;
a step in which the computer classifies pixels in the vicinity of the pixel of the input image corresponding to the detected edge into pixels having a high feature amount and pixels having a low feature amount based on the feature amount; and
a step of extracting, as character regions, pixels having a low feature amount surrounded by pixels having a high feature amount and pixels having a high feature amount surrounded by pixels having a low feature amount.

In the step of dividing the pixel into a pixel having a high feature amount and a pixel having a low feature amount, the computer sets the pixel having the low feature amount as black among the pixels in the vicinity of the pixel of the input image corresponding to the edge, and the feature amount is In the step of generating a first image in which a high pixel is white and a second image in which a pixel having a high feature amount is black and a pixel having a low feature amount is white, and extracting the character region, The image processing method according to claim 11, wherein black pixels surrounded by white pixels are extracted as a character region in the first image and the second image.

The image processing method according to claim 12, further comprising a step in which the computer, when the character region extracted from the first image and the character region extracted from the second image are at overlapping positions, selects one of the character regions at the overlapping positions as the character region.

In the step of selecting the character area, the computer labels the character area extracted from the first image and the character area extracted from the second image, and labels at overlapping positions. The image processing method according to claim 13, wherein a character region having a larger label area is selected as a character region.

The computer according to claim 11, further comprising a step of reducing the image size of the original image and generating a reduced image to be subjected to edge detection in the edge detection step prior to the step of detecting the edge. The image processing method according to any one of the above.

The image processing method according to claim 15, further comprising: a step of matching the character area extracted from the reduced image with the original image and extracting a detailed character area from the original image.

In the step of detecting the edge, the computer detects an edge in a direction in which the feature amount increases as a first edge, and detects an edge in a direction in which the feature amount decreases as a second edge, In the step of classifying the pixel having a high feature amount and the pixel having a low feature amount, the pixel in the vicinity of the pixel of the input image corresponding to the first edge is divided into a pixel having a high feature amount and a pixel having a low feature amount, In the step of dividing the pixels in the vicinity of the pixels of the input image corresponding to the second edge into pixels having high feature values and pixels having low feature amounts, and extracting a character region, among the pixels in the vicinity of the first edge A pixel surrounded by pixels having a high feature value and having a low feature value is extracted as a character area, and a pixel having a high feature value surrounded by pixels having a low feature value among the pixels near the second edge. Extracting as a character area, an image processing method according to any one of claims 11 to 16.

A program causing a computer to execute:
processing of detecting an edge from an input image based on a change in a feature amount between adjacent pixels;
processing of classifying pixels in the vicinity of the pixel of the input image corresponding to the detected edge into pixels having a high feature amount and pixels having a low feature amount based on the feature amount; and
processing of extracting, as character regions, pixels having a low feature amount surrounded by pixels having a high feature amount and pixels having a high feature amount surrounded by pixels having a low feature amount.

The program according to claim 18, wherein, in the processing of classifying pixels into pixels having a high feature amount and pixels having a low feature amount, a first image in which, among the pixels in the vicinity of the pixel of the input image corresponding to the edge, pixels having a low feature amount are black and pixels having a high feature amount are white, and a second image in which pixels having a high feature amount are black and pixels having a low feature amount are white, are generated, and in the processing of extracting the character region, black pixels surrounded by white pixels are extracted as character regions in the first image and the second image.

The program according to claim 19, further causing the computer to execute processing of selecting, when the character region extracted from the first image and the character region extracted from the second image are at overlapping positions, one of the character regions at the overlapping positions as the character region.

In the process of selecting the character region, labeling is performed on the character region extracted from the first image and the character region extracted from the second image. The program according to claim 20, wherein a character area having a larger label area is selected as a character area.

19. The computer further includes a step of reducing the image size of the original image and generating a reduced image to be subjected to edge detection in the processing of detecting an edge prior to the step of detecting the edge. The program as described in any one of thru | or 21.

The program according to claim 22, further causing the computer to perform a process of matching a character area extracted from the reduced image with the original image and extracting a detailed character area from the original image.

In the processing for detecting the edge, the edge in the direction in which the feature amount increases is detected as the first edge, and the edge in the direction in which the feature amount decreases is detected as the second edge, and the feature amount is high. In the process of dividing the pixel into the low pixel, the pixel in the vicinity of the pixel of the input image corresponding to the first edge is divided into a pixel having a high feature amount and a pixel having a low feature amount, and the second In the process of dividing the pixels in the vicinity of the pixel of the input image corresponding to the edge into pixels having a high feature amount and pixels having a low feature amount, and extracting the character region, the feature amount among the pixels in the vicinity of the first edge A pixel with a low feature amount surrounded by pixels with a high feature amount is extracted as a character region, and a pixel with a high feature amount surrounded by pixels with a low feature amount among pixels near the second edge is extracted as a character region. Do Program according to any one of claim 18 to 23.