JP6743092B2

JP6743092B2 - Image processing apparatus, image processing control method, and program

Info

Publication number: JP6743092B2
Application number: JP2018116076A
Authority: JP
Inventors: 三沢　玲司; 玲司三沢; 航也島村
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2013-12-19
Filing date: 2018-06-19
Publication date: 2020-08-19
Anticipated expiration: 2034-07-07
Also published as: JP2016054564A; JP2018139457A; JP6362632B2

Description

The present invention relates to an image processing apparatus, a control method for an image processing apparatus, and a program.

In recent years, the spread of color printers and color scanners has increased the number of color documents, and with it the opportunities to scan such documents, save them as electronic files, and send them to third parties over the Internet or the like. Full-color data, however, places a heavy load on storage devices and communication lines, so the data must be compressed to reduce its size.

Conventional methods for compressing a color image include converting it to a binary image with pseudo-gradation by error diffusion or the like and compressing that image, compressing it in JPEG format, and converting it to 8-bit palette color and applying ZIP or LZW compression.

In Patent Document 1, character areas contained in an input image are detected, the detected character portions are converted to a binary image and MMR-compressed (binary lossless compression), and the result is stored in a file together with the character color information of each character. In addition, the input image with its character portions filled in with the surrounding color is used as a background image, which is reduced in resolution, JPEG-compressed (lossy compression), and stored in the same file. A file compressed in this way achieves high quality in the character areas and a high compression ratio.

Patent Document 1: Japanese Patent Laid-Open No. 2002-077633

In Patent Document 1, in the binary image obtained by binarizing the input image, whether each cluster of black pixels is likely to be a character is determined from the size (width and height) of the cluster, whether clusters of similar size exist nearby, and so on, and character areas are detected accordingly.

On the other hand, when a method that determines areas from a binary image, as in Patent Document 1, is applied to an input image in which characters and background are hard to separate by simple binarization, it becomes difficult to identify the pixels that make up the characters. For example, when simple binarization is applied to black characters on a white background (a character image with a large density difference between characters and background), the background pixels and character pixels are easily separated. When binarization is applied to black characters on a dark background (a character image with a small density difference between characters and background), separation is difficult. In particular, if characters on a dark background are binarized with a threshold lower than the background density, the binary character image is crushed to solid black. If the dark background area is about the same size as a character, the binary image in which background and characters have been crushed together may be erroneously determined to be character pixels.
For example, when a document in which part of a character string has been marked with a dark marker pen is scanned and the scanned image is binarized, the entire marked portion may turn black. If the size of the marked portion is close to the character size, all pixels of the marked portion, crushed to black by binarization, are treated as a single character. In other words, all the black pixels in an area that was crushed to black during binarization may be treated as character pixels.

To solve the above problem, an image processing apparatus of the present invention comprises: a first determination unit that determines whether the attribute of an area in an image is at least a character attribute or an image attribute; a second determination unit that determines whether the image of an area determined by the first determination unit to have the character attribute is a character image that is easy to separate from the background or a character image that is difficult to separate from the background; a processing unit that executes binary compression processing on the image of the area when the second determination unit determines that the image of the area determined to have the character attribute is a character image that is easy to separate from the background, and executes multi-value compression processing on the image of the area when the second determination unit determines that it is a character image that is difficult to separate from the background; and a generation unit that generates a file of the image based at least on the image of the area that was determined by the first determination unit to have the character attribute and on which either the binary compression processing or the multi-value compression processing has been executed.

According to an embodiment of the present invention, whether binary compression processing or multi-value compression processing is executed can be switched according to the type of the area determined to have the character attribute.

Block diagram of the image processing system
Hardware configuration of the MFP in the first embodiment
Block diagram of area determination unit 2 in the first embodiment
Diagram for explaining area determination in the first embodiment
Block diagram of the image compression processing unit
Block diagram of the image decompression processing unit
Samples of an input image and output images
Flowchart of area determination in the first embodiment
Example of an input image in the first embodiment
Example of an input image in the second embodiment
Block diagram of the edge detection unit in the fourth embodiment
Sample diagram of edge extraction
Flowchart of edge extraction in the fourth embodiment
Block diagram 1 of the edge detection unit in the fifth embodiment
Block diagram 2 of the edge detection unit in the fifth embodiment
Sample diagram in the first embodiment

(Embodiment 1)
FIG. 1 is a schematic diagram showing the system configuration in the first embodiment. In FIG. 1, a multifunction peripheral (MFP) 101 and a computer (hereinafter, PC) 102 are connected via a network 103.

Dotted lines 104 and 105 indicate the flow of processing. Line 104 indicates processing in which the user has the scanner of the MFP 101 read a paper document. At that time, the user can use the user interface of the MFP 101 (203 in FIG. 2), described later, to set the destination to which the scanned image is sent (for example, the PC 102) and various settings related to scanning and transmission. As these settings, the user can specify the resolution, compression ratio, data format (for example, JPEG, TIFF, PDF, high-compression PDF, or high-compression PDF with OCR results), and so on. This embodiment describes the case in which high-compression PDF (with OCR results) is specified as the data format. The technical details of high-compression PDF are described later. Line 105 indicates processing in which data is generated using the software or hardware functions of the MFP 101 according to the specified settings and is sent to the specified destination. Because the image sent to the PC 102 is transmitted in a file format such as PDF, it can be viewed with a general-purpose viewer on the PC 102.

FIG. 2 shows the detailed configuration of the MFP 101. The MFP 101 includes a scanner unit 201, which is an image input device; a printer unit 202, which is an image output device; a control unit 204, which controls the entire MFP; and an operation unit 203, which is a user interface. The control unit 204 is a controller that is connected to the scanner unit 201, the printer unit 202, and the operation unit 203 on one side and to a LAN 209 on the other, and that inputs and outputs image information and device information. The CPU 205 is a processor that controls the entire system. The RAM 206 is a system work memory for the operation of the CPU 205 and also serves as an image memory for temporarily storing image data. The ROM 210 is a boot ROM that stores programs such as the system boot program. The storage unit 211 is a non-volatile storage medium such as a hard disk drive and stores system control software and image data. The operation unit I/F 207 is the interface to the operation unit (UI) 203 and outputs to the operation unit 203 the image data to be displayed on it. The operation unit I/F 207 also conveys to the CPU 205 the information that the user of the image processing apparatus has entered via the operation unit 203. The network I/F 208 connects the image processing apparatus to the LAN 209 and inputs and outputs data (for example, it sends compressed data in PDF format to another apparatus or receives compressed data in PDF format from another apparatus). These devices are arranged on the system bus 216.
The image bus interface 212 is a bus bridge that connects the system bus 216 to the image bus 217, which transfers image data at high speed, and that converts data structures. The image bus 217 is, for example, a PCI bus or IEEE 1394. The following devices are arranged on the image bus 217. The raster image processor (RIP) 213 performs so-called rendering: it analyzes PDL (page description language) code and expands it into a bitmap image of the specified resolution. The device I/F unit 214 connects to the scanner unit 201, the image input device, via a signal line 218 and to the printer unit 202, the image output device, via a signal line 219, and performs synchronous/asynchronous conversion of image data. The data processing unit 215 generates compressed data (515) in PDF format by performing processing such as high-compression PDF generation and OCR. The generated compressed data (515) is sent to the specified destination (for example, the client PC 102) via the network I/F 208 and the LAN 209. The data processing unit 215 can also decompress compressed data received via the network I/F 208 and the LAN 209. The decompressed image is sent to the printer unit 202 via the device I/F 214 and printed.

<Description of the data processing unit 215>
Next, the configurations of the image compression processing unit and the image decompression processing unit implemented by the data processing unit 215 of FIG. 2 are described with reference to the block diagrams of FIGS. 5 and 6. The data processing unit 215 may be configured so that a processor executes a computer program and thereby functions as each processing unit in FIG. 5 or FIG. 6, or part or all of it may be implemented in hardware such as an ASIC or electronic circuits.

As described in Patent Document 1, high-compression PDF processing determines areas by attribute and compresses each area adaptively according to its attribute, switching between binary lossless compression by MMR and multi-value lossy compression by JPEG. That is, character areas are MMR-compressed, and the image in which the character areas have been filled in with the surrounding color is JPEG-compressed, which raises the compression ratio while retaining high quality in the character areas. High-compression PDF processing is an effective compression technique for color or monochrome multi-value images. As detailed later, this embodiment can determine whether an area that would be crushed by binarization is a character area. This makes it possible to determine that only genuine character areas are to be MMR-compressed.

FIG. 5 is a block diagram showing the configuration of the image compression processing unit implemented by the data processing unit 215, and shows the processing units used to compress an input image and generate a high-compression PDF (with OCR results).

The binarization unit 502 generates a binary image from the input image 501, which is a multi-value image. In the binary image, pixels of the input image that are darker than a threshold become, for example, black pixels, and pixels at or below the threshold become, for example, white pixels. (The binarization result need not be expressed as black and white; it may be expressed in other colors, or without color as 1 and 0 or as 0 and 1.) The purpose of the binarization unit 502 is to distinguish pixels darker than the threshold from pixels at or below it, so a method other than binarization (for example, ternarization or quaternarization) may be used as long as the same purpose is achieved. The following description, however, assumes that the binarization unit 502 has performed binarization. When the input image is an image such as 701, the binary image becomes an image such as 702. When the input image is a color multi-value image, binarization is performed only on the luminance of that image (for example, Y of YUV).
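The following Python fragment is a minimal sketch of luminance-only binarization as described above, not the implementation of the binarization unit 502. The function names, the BT.601 luma weights, the threshold value of 128, and the convention that 1 stands for a black pixel are all assumptions made for illustration. The threshold here is expressed on luminance, so "darker than the threshold" means a luminance below it.

```python
def luminance(rgb):
    """Approximate luma (Y of YUV) of an 8-bit (R, G, B) tuple using BT.601 weights."""
    r, g, b = rgb
    return 0.299 * r + 0.587 * g + 0.114 * b


def binarize(image, threshold=128):
    """Return a 2-D list in which 1 marks a black pixel and 0 a white pixel.

    image is a 2-D list of (R, G, B) tuples. A pixel counts as darker than
    the threshold when its luminance is below it (hypothetical convention).
    """
    return [[1 if luminance(px) < threshold else 0 for px in row] for row in image]


# A dark character (20) on a dark background (60) is crushed to all black,
# while a black character on a white background stays separable.
sample = [
    [(255, 255, 255), (0, 0, 0)],
    [(60, 60, 60), (20, 20, 20)],
]
print(binarize(sample))
```

The second row of the sample illustrates the case this embodiment is concerned with: both the background and the character fall below the threshold, so the binary image alone can no longer tell them apart.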

The area determination unit 503 detects character areas and photograph areas from the binary image generated by the binarization unit 502. As a result, for example, 704 and 706 are detected as character areas and 705 as a photograph area. This processing uses a known area identification method (for example, Japanese Patent Laid-Open No. 06-068301). In outline, it proceeds as follows.

(1) Black pixel blocks, that is, clusters of black pixels that are contiguous in any of eight directions, are extracted by tracing the contours of 8-connected black pixels in the binary image 702. Here, 8-connected means that pixels of the same color (black in this case) are contiguous in any of the eight directions: upper left, left, lower left, down, lower right, right, upper right, and up. In contrast, 4-connected means that pixels of the same color are contiguous in any of the four directions: left, down, right, and up.

(2) If the extracted black pixel blocks include one that exceeds a certain size (for example, one for which the area of the region it encloses exceeds a certain area), whether that region contains white pixel blocks is determined. That is, white pixel blocks are extracted by tracing the contours of 4-connected white pixels within the region. If an extracted white pixel block exceeds a certain size, black pixel blocks are extracted again from it by tracing black pixel contours in the same way. These steps are repeated until the pixel blocks are no larger than the certain size.

(3) Each black pixel block obtained is classified as character or photograph using at least one of its size, shape, and black pixel density. For example, a black pixel block whose aspect ratio is close to 1 (that is, within 1 ± α, where α is a fixed threshold such as 0.1) and whose size is within a predetermined range (for example, the number of pixels enclosed by the block is 100 or fewer) is determined to be a black pixel block constituting a character. The remaining black pixel blocks are determined to be pixel blocks constituting a photograph.

(4) When the distance between black pixel blocks constituting characters is within a predetermined distance (for example, 3 pixels), those blocks are classified into the same group. The circumscribed rectangle that contains all black pixel blocks in the same group is then determined to be a character area (704, 706). A black pixel block constituting a character that has no other such block within the predetermined distance forms a group by itself, so the circumscribed rectangle of that single block is determined to be a character area. Black pixel blocks constituting a photograph are processed in the same way as described in (4).

(5) The position of each area and its attribute determination information (character or photograph) are output as the determination result.

Through steps (1) to (5) above, the determination result that 704 and 706 are character areas and 705 is a photograph area is output. This concludes the description of the area determination unit 503.
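Steps (1), (3), (4), and (5) can be sketched roughly as follows. This is an illustrative approximation under several assumptions, not the method of the cited publication: it uses a breadth-first flood fill in place of contour tracing, omits the recursive white pixel analysis of step (2), measures block size by bounding-box area, measures distance as the gap between bounding boxes, and merges groups by their circumscribed rectangles. All function names and default parameter values are hypothetical, with α = 0.1, 100 pixels, and 3 pixels taken from the examples in the text.

```python
from collections import deque


def black_blocks(binary):
    """Step (1): bounding boxes (x0, y0, x1, y1) of 8-connected black (1) pixel blocks."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0, y0, x1, y1 = x, y, x, y
            while queue:
                cx, cy = queue.popleft()
                x0, y0 = min(x0, cx), min(y0, cy)
                x1, y1 = max(x1, cx), max(y1, cy)
                for dy in (-1, 0, 1):          # visit all eight neighbours
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < w and 0 <= ny < h
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def is_character(box, alpha=0.1, max_area=100):
    """Step (3): aspect ratio within 1 +/- alpha and bounding-box area <= max_area."""
    w, h = box[2] - box[0] + 1, box[3] - box[1] + 1
    return abs(w / h - 1.0) <= alpha and w * h <= max_area


def gap(a, b):
    """Number of pixels separating two boxes along the worse axis (0 if touching)."""
    dx = max(a[0] - b[2] - 1, b[0] - a[2] - 1, 0)
    dy = max(a[1] - b[3] - 1, b[1] - a[3] - 1, 0)
    return max(dx, dy)


def group_regions(boxes, max_gap=3):
    """Step (4): repeatedly merge boxes within max_gap into circumscribed rectangles."""
    regions = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(regions)):
            for j in range(i + 1, len(regions)):
                a, b = regions[i], regions[j]
                if gap(a, b) <= max_gap:
                    regions[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                  max(a[2], b[2]), max(a[3], b[3]))
                    del regions[j]
                    changed = True
                    break
            if changed:
                break
    return regions


def determine_regions(binary):
    """Step (5): list of (rectangle, attribute) pairs."""
    boxes = black_blocks(binary)
    chars = [b for b in boxes if is_character(b)]
    photos = [b for b in boxes if not is_character(b)]
    return ([(r, "character") for r in group_regions(chars)]
            + [(r, "photo") for r in group_regions(photos)])
```

Under these assumptions, a solid dark patch whose bounding box happens to be roughly square and small passes `is_character` just as a real glyph would, which is the kind of misjudgment that area determination unit 2 is introduced to catch.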

The character cutout unit 504 cuts out character cutout rectangles from each of the character areas generated by the area determination unit 503. The results are as shown in 710, 711, 712, and 713. The cutout processing consists of the following steps.

(1) One of the character areas is selected (for example, 708).

(2) A horizontal projection is taken of the binary image identified by the character area. Specifically, the number of black pixels on each horizontally extending line is counted, and the counts form the projection. The resulting projection is shown in 715. In the projection 715, vertically consecutive lines that contain more black pixels than a threshold are put into one group. This yields three groups: the group of lines containing ABCD, the group of lines containing EFG, and the group of lines containing H.

(3) A vertical projection is taken for each group. 716 shows the projection taken for the group of lines containing ABCD.

(4) In the projection of each group, horizontally consecutive lines that contain more black pixels than a threshold are put into one group. For example, the projection 716 yields four groups: the group of lines containing A, the group of lines containing B, the group of lines containing C, and the group of lines containing D.

(5) The circumscribed rectangle of each group of lines obtained in (4) is cut out as a character cutout rectangle. As a result, for example, the circumscribed rectangle of each character is cut out as a character cutout rectangle. The results are as shown in 711, 712, 713, and 710.

(6) Steps (1) to (5) are repeated until no unselected character area remains.
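The projection-based cutout of steps (2) to (5) can be sketched as below for a single character area. The names `runs` and `cut_characters` and the default threshold of 0 are assumptions for illustration, and each rectangle spans the full height of its text line rather than being fitted tightly to the glyph.

```python
def runs(profile, threshold=0):
    """Return (start, end) index pairs of consecutive entries above threshold."""
    out, start = [], None
    for i, value in enumerate(profile):
        if value > threshold and start is None:
            start = i
        elif value <= threshold and start is not None:
            out.append((start, i - 1))
            start = None
    if start is not None:
        out.append((start, len(profile) - 1))
    return out


def cut_characters(region, threshold=0):
    """Cut character rectangles (x0, y0, x1, y1) out of one character area.

    region is a 2-D list of 0/1 values covering the character area only.
    """
    boxes = []
    row_profile = [sum(row) for row in region]            # step (2): horizontal projection
    for y0, y1 in runs(row_profile, threshold):           # one run per text line
        band = region[y0:y1 + 1]
        col_profile = [sum(col) for col in zip(*band)]    # step (3): vertical projection
        for x0, x1 in runs(col_profile, threshold):       # step (4): one run per character
            boxes.append((x0, y0, x1, y1))                # step (5): cutout rectangle
    return boxes


# Two text lines: two characters on the first, one on the second.
area = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [1, 1, 1, 0, 0],
]
print(cut_characters(area))
```

Step (6) then amounts to calling `cut_characters` once for every character area returned by the area determination.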

FIG. 7 shows examples of an image to be processed and of the images resulting from binarization, area determination, and character cutout. Image 701 is an example of the input image 501: 7011 is a character image on a white background, 7012 is a character image on a light background, and 7013 is a character image on a dark background. That is, character images 7011 and 7012 have a large density difference between characters and background, and character image 7013 has a small density difference between characters and background.

702 is an example of the binary image obtained when the binarization unit 502 binarizes the image 701. It shows the character image 7013 crushed to solid black because binarization was performed with a threshold lower than the background density.

In this embodiment, a character image that is crushed by binarization (for example, an image in which the density difference between background and characters is small, such as even darker characters on a background darker than the threshold, so that background and characters are hard to separate even after binarization) is called a "character image difficult to separate from the background". A character image that is not crushed by binarization (for example, an image in which the density difference between background and characters is large, such as black characters on a white background or on a background lighter than the threshold, so that background and characters are easily separated by binarization) is called a "character image easy to separate from the background". That is, a "character image easy to separate from the background" is an image of a character area in which, after binarization, the character portions become black pixels and the background portions become white pixels.

703 shows the result of area determination performed on the binary image 702 by the area determination unit 503. As a result of the area determination, 704 and 706 are determined to be character areas and 705 a photograph area. Character areas 707 and 708 are the partial images determined to be character areas by the area determination unit 503, extracted from the binary image 703. 709 is a schematic view of the character cutout rectangles cut out by the character cutout unit 504. 710 is a character cutout rectangle cut out from the character area 704, and 711, 712, and 713 are character cutout rectangles cut out from the character area 706.

Area determination unit 2 (505) determines whether the character image in each character cutout rectangle cut out by the character cutout unit 504 is a character that is crushed by binarization (a character image difficult to separate from the background). The determination method is described in detail later. Based on the information on the character areas that area determination unit 2 (505) has determined to be "character images difficult to separate from the background", the character area information generated by the area determination unit 503 and the character cutout rectangle information generated by the character cutout unit 504 are corrected. That is, the information on those character areas is removed from the character area information and from the character cutout rectangle information. A character area determined to be a "character image difficult to separate from the background" is thereby treated as not being characters, so the MMR compression described later is not applied to it, which solves the problem of the character image becoming illegible.
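The correction step alone can be illustrated with the short sketch below. It does not implement the determination itself, which is described later. The set of rectangles judged difficult to separate is simply passed in, and the dictionary layout, the function name, and the rule of dropping a character area once all of its rectangles have been removed are assumptions made for this example.

```python
def correct_character_info(char_areas, hard_to_separate):
    """Remove rectangles judged difficult to separate from the background.

    char_areas maps a character-area rectangle to its list of character
    cutout rectangles. hard_to_separate is a set of cutout rectangles
    flagged by a separate (hypothetical) determination step.
    """
    corrected = {}
    for area, rects in char_areas.items():
        kept = [r for r in rects if r not in hard_to_separate]
        if kept:  # assumption: an area with no remaining rectangle is dropped
            corrected[area] = kept
    return corrected


areas = {
    (0, 0, 9, 9): [(0, 0, 3, 3), (5, 0, 8, 3)],
    (0, 20, 9, 29): [(0, 20, 9, 29)],
}
flagged = {(5, 0, 8, 3), (0, 20, 9, 29)}
print(correct_character_info(areas, flagged))
```

Only the rectangles that survive this filtering are handed to the MMR compression, representative color extraction, and filling steps that follow.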

The MMR compression unit 506 extracts the binary images of the character areas from the binary image generated by the binarization unit 502, based on the character area information as corrected by area determination unit 2 (505) (that is, it extracts only the binary images contained in character cutout rectangles determined to be "character images easy to separate from the background"). It then MMR-compresses the extracted binary images of the character areas to generate compressed code 1 (511).

縮小部５０７は、入力画像５０１を縮小処理（低解像度化処理）し、縮小多値画像（不図示）を生成する。 The reduction unit 507 performs a reduction process (resolution reduction process) on the input image 501 to generate a reduced multivalued image (not shown).

The representative color extraction unit 508 identifies the positions of the pixels (black pixels) that form each character in the binary image, based on the character area information and the character cutting rectangle information corrected by the area determination unit 2 (505). Then, based on the identified character pixel positions, it refers to the colors at the corresponding positions in the reduced multi-valued image, calculates a representative color of the character for each character cutting rectangular area, and obtains the character color information 513 of each character. For example, the representative color is the average or weighted average of the colors, in the multi-valued image, of the pixel group that is black in the binary image within the character cutting rectangular area. Alternatively, it is the most frequent color in that pixel group. Various ways of obtaining the representative color are thus conceivable, but in every case the color in the multi-valued image of at least one pixel of the pixel group that is black in the binary image within the character cutting rectangular area is used to calculate the representative color.
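The averaging variant of the representative color calculation can be sketched as follows. This is only an illustration under stated assumptions: the function name `representative_color`, the list-of-lists image layout, and the `(x0, y0, x1, y1)` rectangle convention are hypothetical, and the binary and multi-valued images are assumed to have already been brought to the same resolution.

```python
from typing import List, Optional, Tuple

Color = Tuple[int, int, int]
Rect = Tuple[int, int, int, int]  # (x0, y0, x1, y1), x1 and y1 exclusive (assumed convention)


def representative_color(binary: List[List[int]],
                         multi: List[List[Color]],
                         rect: Rect) -> Optional[Color]:
    """Average the multi-valued colors at positions that are black (1) in the binary image."""
    x0, y0, x1, y1 = rect
    # Collect the colors of the character pixels inside the character cutting rectangle.
    pixels = [multi[y][x]
              for y in range(y0, y1)
              for x in range(x0, x1)
              if binary[y][x] == 1]
    if not pixels:
        return None  # no character pixel in this rectangle
    n = len(pixels)
    return tuple(round(sum(p[c] for p in pixels) / n) for c in range(3))
```

A weighted average or the most frequent color, both mentioned above, could replace the plain mean without changing the way the character pixels are selected.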

The character area filling unit 509 identifies the positions of the pixels (black pixels) that form each character in the binary image, based on the character area information and the character cutting rectangle information corrected by the area determination unit 2 (505). Then, based on the identified pixel positions, it fills the pixels at the corresponding positions in the reduced multi-valued image with their surrounding color. The surrounding color may be the average of the pixel values of the pixels around the character, and the pixel values of the character pixels are replaced with the surrounding color thus obtained. The details of the filling processing by the character area filling unit are described in Patent Document 1.
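A minimal sketch of the filling step is given below. The actual procedure is the one described in Patent Document 1; here it is simply assumed that the "surrounding color" is the mean of the non-character pixels inside the same character cutting rectangle. The name `fill_character_pixels` and the data layout are hypothetical.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]
Rect = Tuple[int, int, int, int]  # (x0, y0, x1, y1), x1 and y1 exclusive (assumed convention)


def fill_character_pixels(binary: List[List[int]],
                          multi: List[List[Color]],
                          rect: Rect) -> List[List[Color]]:
    """Return a copy of `multi` in which character pixels are replaced by the surrounding color."""
    x0, y0, x1, y1 = rect
    coords = [(x, y) for y in range(y0, y1) for x in range(x0, x1)]
    # Assumption: "surrounding" pixels are the white (0) pixels of the same rectangle.
    surrounding = [multi[y][x] for x, y in coords if binary[y][x] == 0]
    out = [row[:] for row in multi]
    if not surrounding:
        return out  # nothing to average; leave the rectangle untouched
    n = len(surrounding)
    mean = tuple(round(sum(p[c] for p in surrounding) / n) for c in range(3))
    for x, y in coords:
        if binary[y][x] == 1:
            out[y][x] = mean
    return out
```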

ＪＰＥＧ圧縮部５１０は、文字領域穴埋め部５０９で穴埋め処理した後の画像をＪＰＥＧ圧縮して、圧縮コード２（５１４）を生成する。 The JPEG compression unit 510 JPEG-compresses the image that has undergone the padding processing by the character area padding unit 509 to generate a compression code 2 (514).

The OCR unit (516) performs known character recognition processing on the areas determined to be character areas by the area determination unit (503), with reference to the character cutting rectangle information generated in step 904. The character code 517 is the character code obtained by this character recognition processing.

Here, the MMR compression unit (506) applies MMR compression to the areas determined to be characters by the area determination unit 2 (505), that is, the areas determined to be "character images that are easily separated from the background". In contrast, the OCR unit (516) applies OCR to the areas determined to be character areas by the area determination unit (503).

The "character images that are easily separated from the background" are a subset of the areas determined to be character areas by the area determination unit (503), so they cover the narrower area. In other words, the OCR target area is wide and the MMR compression area is narrow.

Why is the OCR target area the wider one? Even if the OCR target area contains something that is not really a character, the only consequence is that superfluous character codes are obtained, which is not a serious problem (a user who finds such character codes superfluous can simply delete them). In contrast, if an area that is not really a character is MMR-compressed, the image quality of that area deteriorates. For this reason, OCR is applied to the wider area and MMR compression is applied to the narrower area.

このようにして、各構成要素から得られた圧縮コード１（５１１）と、修正後の文字領域情報（５１２）と、文字色情報（５１３）と、圧縮コード２（５１４）と、文字コード（５１７）を含む圧縮データ（５１５）のファイルがＰＤＦ形式で生成される。生成されたＰＤＦ形式のファイルは、上述の通り、ユーザにより指定された宛先へと送信されることになる。 In this way, the compressed code 1 (511) obtained from each component, the corrected character area information (512), the character color information (513), the compressed code 2 (514), and the character code ( A file of compressed data (515) including 517) is generated in PDF format. The generated PDF format file will be transmitted to the destination designated by the user as described above.

図６は、別の装置から送られてきたＰＤＦ形式の圧縮データを伸長する画像伸長処理部の構成を示すブロック図である。図６の処理は、圧縮データを伸長して印刷する場合などに実行される。ここでは、別の装置から送られてきた圧縮データが５１５と同じファイルであった場合を例に説明する。 FIG. 6 is a block diagram showing the configuration of an image expansion processing unit that expands PDF format compressed data sent from another device. The process of FIG. 6 is executed when decompressing compressed data and printing. Here, a case where the compressed data sent from another device is the same file as 515 will be described as an example.

ＭＭＲ伸長部６０１は、圧縮データ（５１５）のファイルに含まれている圧縮コード１（５１１）に対してＭＭＲ伸長処理を行い、２値画像を再現する。ＪＰＥＧ伸長部６０３は圧縮コード２（５１４）に対してＪＰＥＧ伸長処理を行い、縮小多値画像を再現する。拡大部６０４は、ＪＰＥＧ伸長部（６０３）で伸長された縮小多値画像に対して、拡大処理を行うことで、圧縮前の入力画像５０１のサイズと同じサイズの多値画像を生成する。 The MMR expansion unit 601 performs MMR expansion processing on the compression code 1 (511) included in the file of the compressed data (515) to reproduce a binary image. The JPEG decompression unit 603 performs JPEG decompression processing on the compressed code 2 (514) to reproduce a reduced multi-valued image. The enlarging unit 604 performs an enlarging process on the reduced multi-valued image decompressed by the JPEG decompressing unit (603) to generate a multi-valued image having the same size as the uncompressed input image 501.

The synthesizing unit 602 refers to the character area information (512) and assigns the color given by the character color information (hereinafter referred to as the character color, 513) to the black pixels of the binary image expanded by the MMR expansion unit. It then composites the binary image to which the character color has been assigned on top of the multi-valued image generated by the enlargement unit 604, thereby generating the expanded image 605. During compositing, a transparent color is assigned to the white pixels of the binary image, so the background multi-valued image shows through them. In this way, the image expansion processing unit expands the compressed data generated by the image compression processing unit and generates the expanded image 605. The expanded image 605 is sent to the printer unit 202 via the device I/F 214 and printed. The image expansion processing unit ignores the character code 517, because character codes are not needed to print the expanded image. Character codes are needed by a device such as the client PC 102, which shows the expanded image 605 on a display, and not by the MFP 101. The MFP 101 therefore ignores the character code 517. Strictly speaking, it is the user of the PC 102, rather than the PC 102 itself, who needs the character codes: they are used when the user wants to cut, paste, or edit character strings.
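The compositing performed by the synthesizing unit can be illustrated with the following sketch. The name `compose` and the representation of the character color information as a list of `(rectangle, color)` pairs are assumptions made for the example, and the binary image and the enlarged background are assumed to have the same size.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]
Rect = Tuple[int, int, int, int]  # (x0, y0, x1, y1), x1 and y1 exclusive (assumed convention)


def compose(binary: List[List[int]],
            background: List[List[Color]],
            char_colors: List[Tuple[Rect, Color]]) -> List[List[Color]]:
    """Paint black (1) pixels with the character color of their rectangle over the background."""
    out = [row[:] for row in background]
    for (x0, y0, x1, y1), color in char_colors:
        for y in range(y0, y1):
            for x in range(x0, x1):
                if binary[y][x] == 1:
                    out[y][x] = color  # character pixel takes the character color
                # white pixels are transparent: the background value is kept
    return out
```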

次に、上述した領域判定部２（５０５）が実行する処理の詳細について説明する。領域判定部２（５０５）は、２値化部５０２で生成された２値画像と、縮小部５０７で生成された縮小多値画像と、文字切出し部５０４で生成された文字切り矩形情報とに基づいて、文字切り矩形内の文字画像は２値化によって潰れるかどうかの判定を行う。なお、記憶部２１１において、入力画像５０１が保持されている場合は、縮小多値画像の代わりに、入力画像５０１を用いてもよい。 Next, details of the processing executed by the above-described area determination unit 2 (505) will be described. The area determination unit 2 (505) uses the binary image generated by the binarization unit 502, the reduced multi-valued image generated by the reduction unit 507, and the character cutting rectangle information generated by the character cutting unit 504. Based on this, it is determined whether the character image in the character cutting rectangle is destroyed by binarization. When the input image 501 is held in the storage unit 211, the input image 501 may be used instead of the reduced multi-valued image.

領域判定部２（５０５）の詳細構成について図３を用いて説明する。説明を行う上で、図４の文字画像の例を適宜参照する。ここで、図４の４０１は、白の背景上に記載された文字（背景と文字の濃度差が大きい画像）の例を示しており、２値化部５０２で画像４０１を２値化すると、４０２に示すような２値画像となる。また、４０６は、濃い濃度の背景上に記載された文字（背景と文字の濃度差が小さい画像）の例を示しており、背景４０７の濃度よりも小さい値の閾値で画像４０６を２値化すると、４０８に示すような黒く潰れた画像となる。尚、薄い濃度の背景上に記載された文字は、背景の濃度と文字の濃度の間の閾値で２値化が行われることによって、画像４０２と同様になるため説明を省略する。４０３〜４０５、及び４０９〜４１１については、後述する。 The detailed configuration of the area determination unit 2 (505) will be described with reference to FIG. In describing, the example of the character image of FIG. 4 will be referred to as appropriate. Here, 401 in FIG. 4 shows an example of a character (an image in which the density difference between the background and the character is large) written on a white background. When the binarizing unit 502 binarizes the image 401, A binary image as shown by 402 is obtained. Reference numeral 406 denotes an example of a character (an image in which the density difference between the background and the character is small) written on a background having a high density, and the image 406 is binarized with a threshold value smaller than the density of the background 407. Then, an image is crushed in black as shown by 408. It should be noted that the characters written on the background of light density become the same as the image 402 by being binarized with a threshold value between the background density and the character density, and therefore description thereof is omitted. 403-405 and 409-411 are mentioned later.

領域判定部２（５０５）は、細線化部３０１、エッジ検出部３０２、論理演算部３０３、エッジカウント部３０４、エッジ数比較部３０５から構成される。 The area determination unit 2 (505) includes a thinning unit 301, an edge detection unit 302, a logical operation unit 303, an edge count unit 304, and an edge number comparison unit 305.

領域判定部２（５０５）は、閾値よりも濃い領域（即ち、４０２、４０８で黒くなっている領域）の内部のエッジ画素を抽出する（１）。そして、抽出されたエッジ画素の数が閾値より少ない場合に「背景から分離容易な文字画像」であると判定する（２）。また、閾値以上である場合に、「背景から分離困難な文字画像」であると判定する（２）。 The area determination unit 2 (505) extracts the edge pixel inside the area darker than the threshold value (that is, the area darkened at 402 and 408) (1). Then, when the number of extracted edge pixels is less than the threshold value, it is determined to be a “character image that is easily separated from the background” (2). Further, when it is equal to or more than the threshold value, it is determined that the character image is a character image that is difficult to separate from the background (2).

例えば、４０２の黒くなっている領域の内部には、エッジ画素が無い。一方、４０８で黒くなっている領域の内部には、エッジ画素（４１０で表されるＨのエッジ画素）がある。ここでいうエッジ画素とは、もちろん二値画像から抽出されたエッジ画素ではなく、多値画像（入力画像）から抽出されたエッジ画素という意味である。 For example, there are no edge pixels inside the blackened area 402. On the other hand, inside the blackened area 408, there are edge pixels (H edge pixels represented by 410). The edge pixel referred to here does not mean the edge pixel extracted from the binary image, but the edge pixel extracted from the multi-valued image (input image).

下記構成は、以上の処理（１）（２）を実現するための一構成であり、この構成に限られるわけではない。他に考えられる構成については後述する。 The following configuration is one configuration for realizing the above processes (1) and (2), and is not limited to this configuration. Other possible configurations will be described later.

The thinning unit 301 performs thinning processing on the binary image for each character cutting rectangle, with reference to the character cutting rectangle information. The thinning processing thins a black pixel block in the binary image by removing the outer two pixels of the block (that is, by replacing the black pixels on the outline of the black pixel block with white pixels). For example, each pixel of the binary image contained in one target character cutting rectangle is taken in turn as the pixel of interest and scanned with a 5×5 window. If the 5×5 window contains even one white pixel, the pixel of interest (the center of the 5×5 window) is replaced with a white pixel. Performing this thinning processing on the binary image 402 yields a thinned image such as 403, and performing it on the binary image 408 yields a thinned image such as 409.
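The window-based thinning described above can be sketched as follows. The function name `thin` is hypothetical, and the treatment of positions outside the rectangle as white is an assumption, since the text does not specify the border behavior.

```python
from typing import List


def thin(binary: List[List[int]], window: int = 5) -> List[List[int]]:
    """Keep a pixel black (1) only if every pixel in the window around it is black."""
    h = len(binary)
    w = len(binary[0]) if h else 0
    r = window // 2

    def is_black(y: int, x: int) -> bool:
        # Assumption: positions outside the rectangle count as white.
        return 0 <= y < h and 0 <= x < w and binary[y][x] == 1

    return [[1 if all(is_black(y + dy, x + dx)
                      for dy in range(-r, r + 1)
                      for dx in range(-r, r + 1)) else 0
             for x in range(w)]
            for y in range(h)]
```

With a 5×5 window a solid 7×7 block shrinks to its central 3×3 pixels, while a stroke only three pixels wide disappears completely.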

The edge detection unit 302 performs edge detection on the input reduced multi-valued image for each character cutting rectangle, with reference to the character cutting rectangle information. The image in which pixels determined to be edges are represented as black pixels and pixels determined not to be edges are represented as white pixels is called the edge detection image. Any known method may be used for edge detection, so the details are omitted, but the following processing is one possibility. For example, differential filter processing is applied to the luminance component of the reduced multi-valued image to obtain the edge strength of each pixel, and the edge detection image is generated by setting pixels whose edge strength is equal to or greater than a predetermined threshold to black and pixels whose edge strength is less than the threshold to white. More accurate edge detection can be achieved with the edge detection method described in the fourth embodiment. Performing edge detection on the reduced multi-valued image (not shown) obtained by reducing the input image 401 yields an edge detection image such as 404. Likewise, performing edge detection on the reduced multi-valued image (not shown) obtained by reducing the input image 406 yields an edge detection image such as 410. If the reduced multi-valued images obtained by reducing the input images 401 and 406 have half the resolution of the input images, 404 and 410 also have half the resolution of the input images, but they are drawn at the same size to simplify the description. If the input images 401 and 406 are held in the storage unit 211, edge detection may be performed on the input images 401 and 406 instead of the reduced multi-valued images.
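One possible form of the differential-filter edge detection mentioned above is sketched below. The text leaves the filter open, so the central-difference gradient, the default threshold of 64, the edge-replicating border handling, and the name `detect_edges` are all assumptions chosen for illustration.

```python
from typing import List


def detect_edges(luma: List[List[int]], threshold: int = 64) -> List[List[int]]:
    """Mark a pixel black (1) when its gradient strength reaches the threshold."""
    h = len(luma)
    w = len(luma[0]) if h else 0

    def at(y: int, x: int) -> int:
        # Replicate the border pixel for positions outside the image (assumed behavior).
        return luma[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = at(y, x + 1) - at(y, x - 1)  # horizontal central difference
            gy = at(y + 1, x) - at(y - 1, x)  # vertical central difference
            if abs(gx) + abs(gy) >= threshold:
                out[y][x] = 1
    return out
```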

The logical operation unit 303 takes the logical product (AND) of the thinned image generated by the thinning unit 301 and the edge detection image generated by the edge detection unit 302, and generates a logical product (AND) image. Specifically, a pixel of the logical product image is black only when the thinned image generated by the thinning unit 301 has a black pixel and the edge detection image generated by the edge detection unit 302 has a black pixel at the same position. If the edge detection image generated by the edge detection unit 302 has half the resolution of the thinned image, the edge detection image is first matched to the resolution of the thinned image by zero-order interpolation, and the logical product is then taken. Alternatively, the thinned image is decimated to match the resolution of the edge detection image before the logical product is taken. When the logical product of the thinned image 403 and the edge detection image 404 is taken, the black pixels of the thinned image 403 and the black pixels of the edge detection image 404 are not at the same positions, so the logical product image 405 contains essentially no black pixels (although a few may remain because of noise or the like). On the other hand, when the logical product of the thinned image 409 and the edge detection image 410 is taken, black pixels remain along the outline of the character, as in the logical product image 411. Thus, the logical product image of a "character image that is easily separated from the background" contains few black pixels, whereas the logical product image of a "character image that is difficult to separate from the background" contains many.
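The resolution matching and the logical product can be sketched as follows. The helper names `upscale_nearest` and `logical_and` are hypothetical; zero-order interpolation is implemented here as plain pixel replication, and both images are lists of rows holding 0 (white) or 1 (black).

```python
from typing import List


def upscale_nearest(img: List[List[int]], factor: int = 2) -> List[List[int]]:
    """Zero-order interpolation: replicate every pixel `factor` times in both directions."""
    return [[v for v in row for _ in range(factor)]
            for row in img
            for _ in range(factor)]


def logical_and(thinned: List[List[int]], edges: List[List[int]]) -> List[List[int]]:
    """A pixel is black (1) only if it is black in both images at the same position."""
    return [[1 if t == 1 and e == 1 else 0 for t, e in zip(t_row, e_row)]
            for t_row, e_row in zip(thinned, edges)]
```

For example, a half-resolution edge detection image would be passed through `upscale_nearest` before being combined with the full-resolution thinned image.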

Note that 412 shows the thinned image 403 and the edge detection image 404 superimposed on each other. 413 corresponds to the black pixels of the thinned image 403 and 414 corresponds to the black pixels of the edge detection image 404. Since the black pixels 413 of the thinned image and the black pixels 414 of the edge detection image are not at the same positions, taking the logical product produces no black pixels.

エッジカウント部３０４は、論理演算部３０３によって論理積（ＡＮＤ）をとった結果（論理積画像）における黒画素の数を、エッジ数としてカウントする処理を行う。 The edge counting unit 304 performs a process of counting, as the number of edges, the number of black pixels in the result (logical product image) obtained by taking the logical product (AND) by the logical operation unit 303.

エッジ数比較部３０５は、エッジカウント部３０４によってカウントされたエッジ数と所定の閾値とを比較し、「背景から分離容易な文字画像」であるか「背景から分離困難な文字画像」であるかを判定する。すなわち、エッジ数が所定閾値より少なければ、「背景から分離容易な文字画像（２値化したときに潰れない文字画像）」であると判定し、エッジ数が所定閾値以上であれば、「背景から分離困難な文字画像（２値化したときに潰れる文字画像）」であると判定する。 The edge number comparing unit 305 compares the number of edges counted by the edge counting unit 304 with a predetermined threshold value to determine whether the image is a “character image that is easily separated from the background” or a “character image that is difficult to separate from the background”. To judge. That is, if the number of edges is less than a predetermined threshold value, it is determined that the character image is a character image that is easily separated from the background (a character image that is not crushed when binarized), and if the number of edges is equal to or greater than the predetermined threshold value, the “background Is a character image that is difficult to separate (a character image that is crushed when binarized)”.

If the width of a black pixel block is smaller than the thinning width of the thinning processing, the thinning processing may remove all the black pixel blocks in the binary image. For example, if the black pixel block of the binary image is a thin-line character three pixels wide and the thinning width is four pixels, thinning the binary image makes the black pixel block disappear. When the black pixel blocks disappear in this way, it is preferable, from the viewpoint of processing speed, to omit the processing of the edge detection unit 302, the logical operation unit 303, the edge count unit 304, and the edge number comparison unit 305. This is because, even if the edge detection unit 302 detected edge pixels, taking the logical product with the thinned image and counting the resulting edges would clearly give a count of 0. A count of 0 means that the number of edges is less than the predetermined threshold, so the image can be determined to be a "character image that is easily separated from the background (a character image that is not crushed when binarized)". Therefore, when the thinning processing removes all the black pixels in the target character cutting rectangle, the processing from the edge detection unit 302 to the edge number comparison unit 305 is not performed, and the character cutting rectangle is determined to be a "character image that is easily separated from the background (a character image that is not crushed when binarized)". When the processing of 302 to 305 is omitted in this way, the next character cutting rectangular area becomes the target, and the processing from the thinning unit 301 to the edge number comparison unit 305 is applied to that area. The reason for omitting the above processing can also be explained as follows. If thinning alone is enough to make the black pixels disappear, the black pixel group of the original binary image must be quite thin, and a thin black pixel group is generally a character or a line. It is therefore preferable, in terms of processing speed, to omit the above processing and determine that the target character cutting rectangular area is a "character image that is easily separated from the background (a character image that is not crushed when binarized)".
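The shortcut described above can be sketched as follows. The function name `classify_rectangle`, the string labels, and the idea of passing edge detection as a callable (so that it runs only when needed) are assumptions made for the example. The comparison follows the rule stated for the edge number comparison unit 305: fewer edges than the threshold means "easy", otherwise "difficult".

```python
from typing import Callable, List

Image = List[List[int]]  # rows of 0 (white) / 1 (black)


def classify_rectangle(thinned: Image,
                       detect_edges: Callable[[], Image],
                       threshold: int) -> str:
    """Return "easy" or "difficult" for one character cutting rectangle."""
    # Shortcut: thinning removed every black pixel, so the AND image would be
    # empty and the edge count 0. Edge detection is skipped entirely.
    if not any(1 in row for row in thinned):
        return "easy"
    edges = detect_edges()
    # Count positions that are black in both the thinned image and the edge image.
    count = sum(1
                for t_row, e_row in zip(thinned, edges)
                for t, e in zip(t_row, e_row)
                if t == 1 and e == 1)
    return "easy" if count < threshold else "difficult"
```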

Alternatively, when all the black pixels in the binary image would disappear, the number of pixels removed can be reduced. For example, if replacing the pixel of interest (the center of the 5×5 window) with a white pixel whenever the 5×5 window contains even one white pixel would turn every black pixel block white, the window size can be reduced and the processing performed with a 3×3 window. The terms "line thinning" and "thinning processing" are used synonymously here.

なお、上述の説明では、エッジカウント部３０４によってカウントされたエッジ数と所定の閾値とを比較すると記載したが、エッジ数を、細線化画像の黒画素数で割った値を、所定の閾値と比べるのも好ましい。そうすることにより、文字切り矩形領域のサイズによらず適切な判断ができることになる。また、文字切り矩形領域を構成する全ての画素数や、その矩形領域を二値化した後の黒画素の数でエッジ数を割ることも考えられる。ただし、一番精度が高いのは、上述の通り、細線化画像の黒画素数でエッジ数を割ることである。そのようにすると、二値画像の内側（濃い領域の内側）にどれだけの割合でエッジが存在するかがわかるからである。この割合が高ければ高いほど、二値画像の内側にエッジが高い割合で存在すると言え、よって、この二値画像が文字でない可能性が高いと言えることになる。 In the above description, it is described that the number of edges counted by the edge counting unit 304 is compared with a predetermined threshold value. However, a value obtained by dividing the number of edges by the number of black pixels of the thinned image is a predetermined threshold value. It is also preferable to compare. By doing so, an appropriate judgment can be made regardless of the size of the character cutting rectangular area. It is also possible to divide the number of edges by the total number of pixels forming the character-cut rectangular area or the number of black pixels after binarizing the rectangular area. However, the highest accuracy is obtained by dividing the number of edges by the number of black pixels of the thinned image, as described above. This is because it is possible to know at what rate the edges exist inside the binary image (inside the dark area). It can be said that the higher this ratio is, the higher the ratio of the edges existing inside the binary image is, and thus the higher the possibility that the binary image is not a character.
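The preferred normalization, dividing the number of edges by the number of black pixels of the thinned image, can be sketched as follows. The name `edge_ratio` and the choice to return 0.0 when the thinned image has no black pixels are assumptions for illustration.

```python
from typing import List


def edge_ratio(thinned: List[List[int]], and_image: List[List[int]]) -> float:
    """Fraction of thinned black pixels that are also edge pixels."""
    black = sum(row.count(1) for row in thinned)
    if black == 0:
        return 0.0  # assumed convention: nothing left after thinning means no interior edges
    edges = sum(row.count(1) for row in and_image)
    return edges / black
```

The resulting ratio would then be compared with a threshold expressed as a fraction rather than as an absolute pixel count, which keeps the decision independent of the rectangle size.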

Next, each process executed by the data processing unit 215 is described with reference to the flowchart of FIG. 8. FIGS. 2, 3, and 5 are referred to as appropriate in the description. The area determination unit 2 (505) executes the processing of steps 905 to 911 in FIG. 8.

ステップ９０１にて、２値化部５０２は、入力画像５０１に対して２値化処理を実行する。 In step 901, the binarization unit 502 executes binarization processing on the input image 501.

ステップ９０２にて、領域判定部５０３は、２値画像に対して領域判定処理を実行し、２値画像内に含まれる各領域を識別し、当該識別された領域が文字領域か非文字領域かの判定を行う。 In step 902, the area determination unit 503 performs area determination processing on the binary image, identifies each area included in the binary image, and determines whether the identified area is a character area or a non-character area. Is determined.

ステップ９０３にて、領域判定部で判定された領域の１つを順に注目領域とし、その注目領域が領域判定部で文字領域と判定された領域である場合は、ステップ９０４へ進み、非文字領域と判定された領域である場合は、ステップ９１３へ進む。 In step 903, one of the areas determined by the area determination unit is sequentially set as the attention area, and when the attention area is the area determined by the area determination unit as the character area, the process proceeds to step 904 and the non-character area is selected. If the area is determined to be, the process proceeds to step 913.

ステップ９０４にて、文字切出し部５０４は、当該注目領域内の画像に対して文字切り出しを行うことによって、文字切り矩形情報を生成する。 In step 904, the character cutout unit 504 generates character cut rectangle information by performing character cutout on the image in the attention area.

ステップ９１６にて、ＯＣＲ部５１６は、領域判定部（５０３）で文字領域と判定された領域に対して、ステップ９０４において生成された文字切り矩形情報を参照しながら、公知の文字認識処理を行う。 In step 916, the OCR unit 516 performs known character recognition processing on the area determined to be the character area by the area determination unit (503) with reference to the character cutting rectangle information generated in step 904. ..

In step 905, the thinning unit 301 performs thinning processing on the binary image binarized in step 902, for the binary image in each character cutting rectangle, with reference to the character cutting rectangle information generated in step 904.

ステップ９０６にて、エッジ検出部３０２は、入力画像を縮小した縮小多値画像（または入力画像５０１）と、ステップ９０４において生成された文字切り矩形情報とを用いて、文字切り矩形内の縮小多値画像（または文字切り矩形内の入力画像。）ごとにエッジ検出処理を実行する。 In step 906, the edge detection unit 302 uses the reduced multivalued image (or the input image 501) obtained by reducing the input image and the character-cutting rectangle information generated in step 904 to reduce the size of the character-cutting rectangle. The edge detection processing is executed for each value image (or the input image within the character cutting rectangle).

ステップ９０７にて、論理演算部３０３は、ステップ９０５において細線化部３０１によって生成された細線化画像と、ステップ９０６において生成されたエッジ画像の論理積（ＡＮＤ）をとる。 In step 907, the logical operation unit 303 takes the logical product (AND) of the thinned image generated by the thinning unit 301 in step 905 and the edge image generated in step 906.

In step 908, the edge count unit 304 counts the black pixels of the logical product image obtained by the logical operation unit 303 in step 907 and thereby obtains the number of edges. The number of edges obtained here may be further normalized by dividing it by the area of the character cutting rectangular area (the total number of pixels in the character cutting rectangular area) to obtain the number of edges per unit area. This has the advantage that the comparison with the threshold in step 909 does not depend on the size of the character cutting rectangular area.

次に、ステップ９０９にて、エッジ数比較部３０５は、ステップ９０８でカウントされたエッジ数と閾値ｔｈとの比較を行う。ここで、エッジ数が閾値ｔｈよりも大きい場合は、ステップ９１０にて、対象とする文字切り矩形領域を「背景から分離困難な文字画像」であると判断する。また、エッジ数が閾値ｔｈ以下の場合は、ステップ９１１にて対象とする文字切り矩形領域を「背景から分離容易な文字画像」と判断する。 Next, in step 909, the edge number comparison unit 305 compares the edge number counted in step 908 with the threshold th. Here, when the number of edges is larger than the threshold value th, it is determined in step 910 that the target character cutting rectangular area is a “character image that is difficult to separate from the background”. If the number of edges is less than or equal to the threshold value th, in step 911 the target character-cutting rectangular area is determined to be a “character image that can be easily separated from the background”.

ステップ９１２にて、文字切出し部５０４は、当該着目している文字領域内の全ての文字切り矩形について処理が終了しているかどうか判断し、終了していると判断するとステップ９１３へ進む。一方、未処理の文字切り矩形があると判断した場合は、ステップ９１４にて次の文字切り矩形を処理対象として設定して、ステップ９０５に戻る。 In step 912, the character cutout unit 504 determines whether or not the processing has been completed for all the character cut rectangles in the focused character area, and if it is determined that the processing has been completed, the process proceeds to step 913. On the other hand, if it is determined that there is an unprocessed character cutting rectangle, the next character cutting rectangle is set as the processing target in step 914, and the process returns to step 905.

ステップ９１３にて、全ての領域についての判定が終了したと判断すると本処理を終了し、未処理の領域があると判断した場合は、ステップ９１５にて未処理の次の領域を注目領域として設定して、ステップ９０３に戻る。 If it is determined in step 913 that the determination has been completed for all areas, this processing ends, and if it is determined that there is an unprocessed area, the next unprocessed area is set as the attention area in step 915. Then, the process returns to step 903.

以上のように、領域判定部２（５０５）において、文字切り矩形領域ごとに、細線化画像とエッジ検出画像との論理積を取った結果の黒画素の数（残ったエッジ数）にもとづいて、各文字切り矩形領域が「背景から分離容易な文字画像。」であるか「背景から分離困難な文字画像」であるかを、高精度に判定できるようになる。 As described above, in the area determination unit 2 (505), based on the number of black pixels (the number of remaining edges) obtained as the logical product of the thinned image and the edge detection image for each character-cut rectangular area. It becomes possible to determine with high accuracy whether each character-cutting rectangular area is a “character image that is easily separated from the background” or a “character image that is difficult to separate from the background”.

「背景から分離困難な文字画像」（例えば図７の７１３）に対しては、文字領域情報から除去するので、ＭＭＲ圧縮部５０６の処理対象にならない。すなわち、「背景から分離困難な文字画像」は、２値化されずに、背景画像とともにＪＰＥＧ圧縮部５１０で圧縮処理されることになる。 The “character image that is difficult to separate from the background” (for example, 713 in FIG. 7) is removed from the character area information, and thus is not a processing target of the MMR compression unit 506. That is, the “character image that is difficult to separate from the background” is not binarized but is compressed by the JPEG compression unit 510 together with the background image.
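As a rough sketch of this routing, the character area information can be filtered so that only the easily separated rectangles remain for binary compression. The dictionary layout, the labels, and the identifier "other" are assumptions made for illustration; only 713 comes from the description.

```python
def split_character_regions(regions):
    """Separate rectangles kept for MMR compression from those left in the background image."""
    keep_for_mmr = [r for r in regions if r["label"] == "easy"]
    left_in_background = [r for r in regions if r["label"] == "difficult"]
    return keep_for_mmr, left_in_background


# Hypothetical character area information after the determination.
regions = [
    {"id": "other", "label": "easy"},
    {"id": 713, "label": "difficult"},
]

mmr_targets, background_only = split_character_regions(regions)
print([r["id"] for r in mmr_targets])
print([r["id"] for r in background_only])
```

The rectangles in the second list are never binarized, so they are compressed only as part of the background image.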

以上のように、２値化すると潰れる文字画像であるか否かを判定することができるので、ＰＤＦ高圧縮に適用した場合は、文字画像が潰れるのを防ぐことができる。 As described above, it is possible to determine whether or not the character image is crushed when binarized. Therefore, when the PDF high compression is applied, the character image can be prevented from being crushed.

尚、本実施例では、「背景から分離困難な文字画像（２値化すると潰れてしまう文字画像）」として、図４の４０６と４０８で示したように、１文字の「Ｈ」の場合を例にあげたが、これに限るものではなく、例えば、図９の入力画像１００１に示すように２文字以上であっても構わない。なお、この入力画像１００１を２値化すると２値画像１００２となる。また、２値化で潰れた文字画像は矩形である必要はなく、例えば、図９の１００３に示すように文字画像の一部が潰れる画像であっても構わない。なお、この入力画像１００３を２値化すると、２値画像１００４となる。 In this embodiment, a single character “H”, as shown by 406 and 408 in FIG. 4, was used as the example of a “character image that is difficult to separate from the background (a character image that is crushed when binarized)”. However, the present invention is not limited to this; for example, the image may contain two or more characters, as in the input image 1001 of FIG. 9. Binarizing this input image 1001 yields the binary image 1002. Furthermore, the character image crushed by binarization need not be rectangular; for example, only part of the character image may be crushed, as shown by 1003 in FIG. 9. Binarizing this input image 1003 yields the binary image 1004.

続いて、領域判定部２（５０５）の別の構成について説明する。 Next, another configuration of the area determination unit 2 (505) will be described.

別の構成では、（Ａ）まず領域判定部２に入力された画像を閾値より濃い領域と、閾値以下の領域に切り分ける（二値化でも三値化でも他の方法でもよい）。その結果、４０２、４０８のような領域が手に入る。 In another configuration, (A) first, the image input to the area determination unit 2 is divided into an area darker than a threshold and an area equal to or less than the threshold (binarization, ternarization, or another method may be used). As a result, areas such as 402 and 408 are available.

そして、（Ｂ）その入力された画像における、閾値より濃いと判定された領域（４０１のＨ領域や、４０６の全体領域）からエッジ画素を抽出する（抽出方法は上述の通りである）。このエッジを抽出する際には、閾値より濃いと判定された領域の端の部分（例えば、端から一画素内に入った画素や二画素内に入った画素）は非対象とする。即ち、このＢの構成では、閾値より濃いと判定された領域の端の部分から一定距離以上離れた（中に入った）エッジ画素のみを抽出するのである。または、そうした端の部分（一定距離以上離れていない画素）も対象としてエッジ画素を抽出し、そうした端の部分を除去する構成としても良い。そうすることにより、４０５や４１１の結果と同じ結果が得られる。なお、この例では、一定距離は、３画素となっているが、他の値であっても良い。 Then, (B) in the input image, edge pixels are extracted from a region determined to be darker than a threshold value (H region of 401 or entire region of 406) (the extraction method is as described above). When this edge is extracted, the end portion of the area determined to be darker than the threshold value (for example, a pixel within one pixel or a pixel within two pixels from the edge) is not a target. That is, in the configuration of B, only the edge pixels that are separated by a certain distance or more from the end portion of the area determined to be darker than the threshold value (entered inside) are extracted. Alternatively, a configuration may be adopted in which edge pixels are extracted for such edge portions (pixels that are not separated by a certain distance or more) and the edge portions are removed. By doing so, the same result as that of 405 or 411 is obtained. In this example, the fixed distance is 3 pixels, but it may be another value.

（Ｃ）後は、得られた結果である所のエッジ画素の数をカウントし、そのエッジ画素の数が閾値ｔｈより大きいか、閾値ｔｈ以下であるか判定する。 (C) After that, the number of edge pixels in the obtained result is counted, and it is determined whether that number is larger than the threshold value th or is less than or equal to the threshold value th.

そうすることにより、上述の方法と同様の結果（「背景から分離困難な文字画像」であるか、「背景から分離容易な文字画像」であるかの判断結果）が得られることになる。なお、（Ｂ）の処理の代わりに、領域判定部２に入力された画像全体からエッジ画素を抽出しても良い。その場合、入力された画像全体から抽出されたエッジ画素のうち、閾値より濃いと判定された領域の端の部分、及び、閾値以下の領域を除く。そうすることにより、上述の（Ｂ）の構成と同じ結果が得られることになる。 By doing so, the same result as the above-described method (a determination result as to whether it is a “character image that is difficult to separate from the background” or a “character image that is easy to separate from the background”) is obtained. Instead of the process (B), edge pixels may be extracted from the entire image input to the area determination unit 2. In that case, among the edge pixels extracted from the entire input image, the edge part of the area determined to be darker than the threshold and the area equal to or less than the threshold are excluded. By doing so, the same result as the above-mentioned configuration of (B) can be obtained.
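Steps (A) to (C) of this alternative configuration can be sketched as below. The sketch assumes a single-channel density image (larger value = darker), a simple neighbor-difference edge detector in place of the detector described earlier, and treats pixels outside the image as not dark. All names, thresholds, and sample images are hypothetical.

```python
def dark_mask(image, bin_threshold):
    """(A) 1 where the pixel is darker than the threshold, else 0."""
    return [[1 if v > bin_threshold else 0 for v in row] for row in image]


def interior_mask(mask, margin):
    """1 where every pixel within `margin` is inside the image and dark."""
    h, w = len(mask), len(mask[0])

    def inside(y, x):
        for yy in range(y - margin, y + margin + 1):
            for xx in range(x - margin, x + margin + 1):
                if not (0 <= yy < h and 0 <= xx < w) or mask[yy][xx] == 0:
                    return 0
        return 1

    return [[inside(y, x) for x in range(w)] for y in range(h)]


def edge_mask(image, edge_threshold):
    """1 where a pixel differs from its right or lower neighbor by more than the threshold."""
    h, w = len(image), len(image[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            right = abs(image[y][x] - image[y][x + 1]) if x + 1 < w else 0
            down = abs(image[y][x] - image[y + 1][x]) if y + 1 < h else 0
            if max(right, down) > edge_threshold:
                out[y][x] = 1
    return out


def count_interior_edges(image, bin_threshold, edge_threshold, margin):
    """(B) keep edge pixels away from the end of the dark area, (C) count them."""
    inner = interior_mask(dark_mask(image, bin_threshold), margin)
    edges = edge_mask(image, edge_threshold)
    return sum(i & e for i_row, e_row in zip(inner, edges) for i, e in zip(i_row, e_row))


# Dark block on a light background: edges lie only at the end of the dark area.
image_easy = [[10] * 6] + [[10, 200, 200, 200, 200, 10] for _ in range(4)] + [[10] * 6]
# Dark stroke on a dark background: the whole rectangle is dark, edges lie inside it.
image_hard = [[160, 160, 220, 220, 160, 160] for _ in range(6)]

print(count_interior_edges(image_easy, bin_threshold=128, edge_threshold=30, margin=1))
print(count_interior_edges(image_hard, bin_threshold=128, edge_threshold=30, margin=1))
```

The resulting counts would then be compared with the threshold th exactly as in step (C).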

なお、本実施例では文字切出し部５０４で切り出された文字切り出し９０４結果の文字単位での処理を説明した。この処理は、文字単位ではなく、その文字単位を更に分割して行う事も可能である。例えば、文字切出し部５０４に対し、領域を４等分に区切り、それぞれの領域での処理を行う事も可能である。例えば、図１６の１３００〜１３０４は、文字切りされた４０６を均等に４分割行った例である。１３００〜１３０４それぞれにおいて処理を行う。更に、均等に区切るのではなく、文字切り出しされた領域の中心部のみ（例えば、文字切り領域の中心部６０％のみを使用）で判定を行う事もできる。例えば、図１６の１３０５は、文字切りされた４０６の中心部６０％を抜き出したものであり、この１３０５に対して処理を行う。また、文字切り出しされた領域での判定と、この領域分割および・又は中心部での判定を合わせて、「背景から分離容易な文字画像」であるか「背景から分離困難な文字画像」であるかの判定を行う事も可能である。 This embodiment has described processing, character by character, of the result of the character cutout (904) produced by the character cutout unit 504. The processing can also be performed on subdivisions of each character unit rather than on the whole character unit. For example, the area cut out by the character cutout unit 504 can be divided into four equal parts and each part processed separately. 1300 to 1304 in FIG. 16 show an example in which the character-cut area 406 is divided equally into four; processing is performed on each of 1300 to 1304. Alternatively, instead of dividing the area equally, the determination can be made on only the central portion of the character-cut area (for example, only the central 60% of the area). 1305 in FIG. 16 is the central 60% extracted from the character-cut area 406, and processing is performed on this 1305. The determination on the whole character-cut area may also be combined with the determination on the divided areas and/or the central portion to decide whether the image is a “character image that is easily separated from the background” or a “character image that is difficult to separate from the background”.
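The two ways of subdividing a character-cutting rectangle might look like this. Rectangles are assumed to be (x, y, width, height) tuples, and "central 60%" is assumed to mean 60% of each side length; the text does not say whether the ratio applies to side length or to area.

```python
def split_quadrants(x, y, w, h):
    """Split a rectangle into four sub-rectangles of (nearly) equal size."""
    hw, hh = w // 2, h // 2
    return [
        (x, y, hw, hh),
        (x + hw, y, w - hw, hh),
        (x, y + hh, hw, h - hh),
        (x + hw, y + hh, w - hw, h - hh),
    ]


def center_crop(x, y, w, h, ratio=0.6):
    """Return the centered sub-rectangle whose sides are `ratio` times the original sides."""
    cw, ch = round(w * ratio), round(h * ratio)
    return (x + (w - cw) // 2, y + (h - ch) // 2, cw, ch)


print(split_quadrants(0, 0, 10, 10))
print(center_crop(0, 0, 10, 10))
```

Each returned rectangle can then be passed through the same edge-count determination as a whole character rectangle.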

（実施例２） (Example 2)
実施例１では、領域判定部２（５０５）において「背景から分離困難な文字画像」と判定された領域はＭＭＲ圧縮処理を行わないようにした。実施例２では、領域判定部２（５０５）で「背景から分離困難な文字画像」と判定された領域に対して、２値化部５０２と異なるアルゴリズムの高精度な２値化処理を再度実行して、背景から文字画像部分の画素を分離するようにしてもよい。この場合、高精度な再２値化処理を行った結果の文字領域を用いてＭＭＲ圧縮処理を行えば、文字領域の画質の向上を図れる。例えば、図７の７１３の領域は「背景から分離困難な文字画像」と判定されるので、その領域７１３に対応する入力画像７０１における領域７０１３のみを、他の領域と異なる閾値で２値化を行う。その結果、図７の７１４に示すような２値画像を生成することができ、この文字領域をＭＭＲ圧縮することができる。なお、高精度の再２値化処理の一例は、固定の閾値で２値化処理を行うのではなく、対象となる領域の濃度あるいは輝度の平均値を閾値として２値化処理する方法である。 In the first embodiment, MMR compression processing is not performed on an area determined by the area determination unit 2 (505) to be a “character image that is difficult to separate from the background”. In the second embodiment, for an area determined by the area determination unit 2 (505) to be a “character image that is difficult to separate from the background”, a high-precision binarization process using an algorithm different from that of the binarization unit 502 may be executed again to separate the pixels of the character image portion from the background. In this case, performing the MMR compression processing on the character area obtained by the high-precision re-binarization improves the image quality of the character area. For example, since the area 713 in FIG. 7 is determined to be a “character image that is difficult to separate from the background”, only the area 7013 in the input image 701 corresponding to the area 713 is binarized with a threshold different from that used for the other areas. As a result, a binary image as shown by 714 in FIG. 7 can be generated, and this character area can be MMR-compressed. One example of high-precision re-binarization is to binarize using the average density or luminance of the target area as the threshold, instead of a fixed threshold.
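The mean-based re-binarization mentioned as an example can be sketched as follows, assuming a luminance image in which smaller values are darker and 1 denotes a black pixel in the output. The fixed threshold of 128 and the sample values are hypothetical.

```python
def binarize_fixed(region, threshold):
    """Binarize with a fixed luminance threshold (1 = black)."""
    return [[1 if v < threshold else 0 for v in row] for row in region]


def rebinarize_with_mean(region):
    """Binarize using the mean luminance of the region itself as the threshold."""
    pixels = [v for row in region for v in row]
    mean = sum(pixels) / len(pixels)
    return [[1 if v < mean else 0 for v in row] for row in region]


# Hypothetical dark character (30) on a fairly dark background (90).
region = [
    [90, 90, 90, 90],
    [90, 30, 30, 90],
    [90, 30, 30, 90],
    [90, 90, 90, 90],
]

crushed = binarize_fixed(region, threshold=128)   # every pixel becomes black
separated = rebinarize_with_mean(region)          # only the character stays black
print(crushed)
print(separated)
```

With the fixed threshold the character and background merge into one black block, while the region-local mean (75 for this sample) separates them.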

（実施例３） (Example 3)
実施例１では、図４の４０１に示すように比較的文字品位の良い入力画像を例として挙げた。しかしながら、図１０の１１０１に示すように文字品位が悪くノイズ等が多い画像（例えば、スキャン原稿や圧縮画像）に対してエッジ検出処理を行うと、図１１の１１０２に示すように文字の内部で多くのエッジが出現する場合がある。文字内部におけるエッジの出現は、特に大きな文字になるほど、顕著になりやすい。 In the first embodiment, an input image with relatively good character quality, as shown by 401 in FIG. 4, was used as an example. However, when edge detection is performed on an image with poor character quality and a lot of noise (for example, a scanned document or a compressed image), as shown by 1101 in FIG. 10, many edges may appear inside the character, as shown by 1102 in FIG. 11. Edges inside a character tend to become more prominent as the character becomes larger.

ここで、エッジ検出画像１１０２と細線化画像１１０３から得られる論理積（ＡＮＤ）画像１１０４では、文字内部のエッジが残りやすい。文字内部のエッジが多く残ると、本来、「背景から分離容易な文字画像」であるにも関わらず、「背景から分離困難な文字画像」と判定されてしまう。 Here, in the logical product (AND) image 1104 obtained from the edge detection image 1102 and the thinned image 1103, the edge inside the character tends to remain. If many edges inside the character remain, the character image is determined to be a “character image that is difficult to separate from the background” although it is originally a “character image that is easy to separate from the background”.

実施例３では、文字切り領域のサイズが大きい場合、細線化部３０１での細らせ処理の削減量を大きくすることにより、文字内部に残ってしまうエッジを低減することができる。この処理を、図１０の１１０５〜１１１２を用いて説明する。 In the third embodiment, when the size of the character cutting area is large, the amount of reduction of the thinning processing in the thinning unit 301 is increased, so that the edge left inside the character can be reduced. This process will be described with reference to 1105-1112 in FIG.

１１０５は、文字品位が悪くノイズ等が多い入力画像の小文字を示す。１１０６は、小文字の画像１１０５に対してエッジ検出処理を実行した結果のエッジ検出画像を示す。１１０７は、小文字の画像１１０５に対して細らせ処理を実行した結果の細線化画像を示している。細らせ処理では、５×５のウインドウを利用して、５×５の中で１画素でも白画素が存在すれば、注目画素（５×５の中心）を白画素に置き換える処理を行っている。 Reference numeral 1105 denotes a small character in an input image with poor character quality and a lot of noise. Reference numeral 1106 denotes the edge detection image obtained by performing edge detection on the small-character image 1105. Reference numeral 1107 denotes the thinned image obtained by performing the thinning process on the small-character image 1105. The thinning process uses a 5×5 window: if even one white pixel exists within the 5×5 window, the pixel of interest (the center of the window) is replaced with a white pixel.

１１０８は、エッジ検出画像１１０６と細線化画像１１０７との論理積をとった結果の論理積（ＡＮＤ）画像を示している。ここで、文字品位が悪くノイズ等が多い入力画像の小文字であったとしても、大文字の論理積（ＡＮＤ）画像１１０４と比較すると、エッジ数が少ない。 Reference numeral 1108 denotes the logical product (AND) image obtained from the edge detection image 1106 and the thinned image 1107. Even though this is a small character from an input image with poor character quality and a lot of noise, it has fewer edges than the logical product (AND) image 1104 of the large character.

また、１１０９は、文字品位が悪くノイズ等が多い入力画像の大文字（１１０１と同様の文字画像）を示す。１１１０は、大文字の画像１１０９に対してエッジ検出処理を実行した結果のエッジ検出画像を示す。１１１１は、大文字の画像１１０９に対して細らせ処理を実行した結果の細線化画像を示している。大文字の画像に対する細らせ処理では、９×９のウインドウを利用して、９×９の中で１画素でも白画素が存在すれば、注目画素（９×９の中心）を白画素に置き換える処理を行う。すなわち、文字画像の大きさ（文字切り領域の大きさ）にもとづいて、ウインドウの大きさを変更することで細らせ処理の削減量を大きくしている。なお、上述したウインドウの大きさは一例であり、５×５や９×９に限るものではない。 Reference numeral 1109 denotes a large character (the same kind of character image as 1101) in an input image with poor character quality and a lot of noise. Reference numeral 1110 denotes the edge detection image obtained by performing edge detection on the large-character image 1109. Reference numeral 1111 denotes the thinned image obtained by performing the thinning process on the large-character image 1109. The thinning process for a large-character image uses a 9×9 window: if even one white pixel exists within the 9×9 window, the pixel of interest (the center of the window) is replaced with a white pixel. In other words, the amount removed by the thinning process is increased by changing the window size according to the size of the character image (the size of the character-cutting area). The window sizes given above are examples and are not limited to 5×5 and 9×9.

１１１２は、エッジ検出画像１１１０と細線化画像１１１１との論理積をとった結果の論理積（ＡＮＤ）画像を示している。論理積（ＡＮＤ）画像１１１２は、前述の論理積（ＡＮＤ）画像１１０４と比較して、エッジ数が少なくなる。したがって、ノイズが多い大文字であっても、文字画像のサイズが大きければ細らせ処理の削減量を大きくすることで、「背景から分離容易な文字画像」と判定することができる。 Reference numeral 1112 denotes the logical product (AND) image obtained from the edge detection image 1110 and the thinned image 1111. The logical product (AND) image 1112 has fewer edges than the logical product (AND) image 1104 described above. Therefore, even for a noisy large character, increasing the amount removed by the thinning process when the character image is large allows the image to be determined as a “character image that is easily separated from the background”.
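The size-dependent thinning described above can be sketched as a windowed erosion. The binary image is assumed to use 1 for black and 0 for white, pixels outside the image are assumed to count as white, and the size limit of 64 pixels used to choose between the 5×5 and 9×9 windows is hypothetical.

```python
def thin(binary, window):
    """Keep a pixel black only if every pixel in its window is black."""
    r = window // 2
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            all_black = True
            for yy in range(y - r, y + r + 1):
                for xx in range(x - r, x + r + 1):
                    # Assumption: pixels outside the image count as white.
                    if not (0 <= yy < h and 0 <= xx < w) or binary[yy][xx] == 0:
                        all_black = False
            out[y][x] = 1 if all_black else 0
    return out


def window_for(rect_w, rect_h, large_size=64):
    """Choose a larger window for a larger character-cutting rectangle."""
    return 9 if max(rect_w, rect_h) >= large_size else 5


solid = [[1] * 11 for _ in range(11)]
black_after_5 = sum(map(sum, thin(solid, 5)))
black_after_9 = sum(map(sum, thin(solid, 9)))
print(black_after_5, black_after_9)
print(window_for(20, 20), window_for(100, 80))
```

The larger window removes more of the stroke, so fewer interior noise edges survive the subsequent AND with the edge detection image.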

以上のように、実施例３によれば、文字切り領域の大きさにもとづいて細線化部による細らせ処理の削減量を制御することで、入力画像がスキャン原稿のような場合であっても、ノイズ等の影響を低減することができ、高精度な判定を行うことができる。 As described above, according to the third embodiment, by controlling the reduction amount of the thinning processing by the thinning unit based on the size of the character cutting region, the case where the input image is a scanned document can be realized. Also, the influence of noise or the like can be reduced, and highly accurate determination can be performed.

（実施例４） (Example 4)
次に、図１１を用いて図５の領域判定部２（５０５）内のエッジ検出部（３０２）が行う処理の詳細について説明を行う。エッジ検出部（３０２）は、分散値検出部１００１、エッジ判定閾値算出部１００２、エッジ抽出部１００３から構成される。エッジ検出部（３０２）の処理をより詳細に説明するため、図１２も合わせて説明を行う。図１２の１１０１、１１０２、１１０３はそれぞれ図４内に示した４０１及び、４０６と同じく入力画像に対し文字切り矩形情報を参照しながら、文字切り矩形単位で切り出された入力画像を示している。１１０１、１１０２、１１０３はそれぞれ、スキャナ部２０１で取得された際の信号値が異なっている画像例である。より具体化するために、Ｌ＊ａ＊ｂ＊表色系での信号値をしめしており、Ｌ＊が明度、ａ＊およびｂ＊で色度を示している。なお、本例ではＬ＊ａ＊ｂ＊表色系で示しているが、限定するものでなく例えば、ＲＧＢ表色系など別の色空間の信号値でも同様の処理が可能である。１１０１の１１０４で示す領域の信号値は｛Ｌ＊，ａ＊，ｂ＊｝＝｛１１２８，−５０，＋３０｝である。１１０５で示す領域の信号値は｛Ｌ＊，ａ＊，ｂ＊｝＝｛１２８，＋５０，−６０｝である。１１０４と１１０５の領域間で大きな信号値差がある例を示している。一方、１１０２の１１０６で示す領域の信号値は｛Ｌ＊，ａ＊，ｂ＊｝＝｛１２８，−５０，＋３０｝である。１１０７で示す領域の信号値は｛Ｌ＊，ａ＊，ｂ＊｝＝｛１２８，−６０，＋３０｝である。１１０６と１１０７の領域間で小さな信号値差しかない例を示している。更に、１１０３の１１０８で示す領域の信号値は｛Ｌ＊，ａ＊，ｂ＊｝＝｛１２８，−５０，＋３０｝である。１１０９で示す領域の信号値は｛Ｌ＊，ａ＊，ｂ＊｝＝｛１２８，−５２，＋３０｝である。１１０８と１１０９の領域間ではほぼ信号値差がない例を示している。例えば、エッジ検出部（３０２）を本構成ではなく、単純に隣り合う画素との信号値比較を元に行うエッジ検出や、フィルタ処理によって行うエッジ検出を行った場合には以下の問題がある。即ち、閾値によっては１１０１では１１０４と１１０５との境界で輪郭エッジが取得できるが、１１０２の１１０６と１１０７との境界で輪郭エッジが取得できない。また、１１０２の１１０６と１１０７との境界で輪郭エッジを取得できる閾値にした場合には１１０３の１１０８と１１０９との境界の輪郭エッジが取得されてしまう。その結果、スキャナの読み取りバラつきやＪｐｅｇノイズなどの小さなノイズもエッジとして検出されてしまう。 Next, the details of the processing performed by the edge detection unit (302) in the area determination unit 2 (505) of FIG. 5 will be described using FIG. 11. The edge detection unit (302) includes a variance value detection unit 1001, an edge determination threshold value calculation unit 1002, and an edge extraction unit 1003. To describe the processing of the edge detection unit (302) in more detail, FIG. 12 is also referred to. Reference numerals 1101, 1102, and 1103 in FIG. 12 denote input images cut out in units of character-cutting rectangles with reference to the character-cutting rectangle information for the input image, in the same way as 401 and 406 shown in FIG. 4. 1101, 1102, and 1103 are example images whose signal values, as acquired by the scanner unit 201, differ from one another. To be more specific, the signal values are given in the L*a*b* color system, where L* represents lightness and a* and b* represent chromaticity. Although the L*a*b* color system is used in this example, the invention is not limited to it; the same processing is possible with signal values in another color space such as the RGB color system. The signal value of the area indicated by 1104 in 1101 is {L*, a*, b*} = {1128, −50, +30}. The signal value of the area indicated by 1105 is {L*, a*, b*} = {128, +50, −60}. This is an example in which there is a large signal value difference between the areas 1104 and 1105. On the other hand, the signal value of the area indicated by 1106 in 1102 is {L*, a*, b*} = {128, −50, +30}. The signal value of the area indicated by 1107 is {L*, a*, b*} = {128, −60, +30}. This is an example in which there is only a small signal value difference between the areas 1106 and 1107. Further, the signal value of the area indicated by 1108 in 1103 is {L*, a*, b*} = {128, −50, +30}. The signal value of the area indicated by 1109 is {L*, a*, b*} = {128, −52, +30}. This is an example in which there is almost no signal value difference between the areas 1108 and 1109. If the edge detection unit (302) did not have this configuration and instead performed edge detection simply by comparing the signal values of adjacent pixels, or by filter processing, the following problem would arise. Depending on the threshold value, the contour edge at the boundary between 1104 and 1105 in 1101 can be acquired, but the contour edge at the boundary between 1106 and 1107 in 1102 cannot. Conversely, if the threshold is set so that the contour edge at the boundary between 1106 and 1107 in 1102 can be acquired, the contour edge at the boundary between 1108 and 1109 in 1103 is also acquired. As a result, small noise such as scanner reading variation and JPEG noise is also detected as edges.

以上の課題を解決する構成が図１１であり、分散値検出部１００１は、文字切り矩形単位で切り出された入力画像の信号値での分散値を演算する演算部である。算出方法は、例えば以下の式で算出する。 A configuration for solving the above problem is shown in FIG. 11, and the variance value detection unit 1001 is a computation unit that computes a variance value in the signal value of the input image cut out in character cutting rectangle units. The calculation method is, for example, the following formula.

ここで、切り出された入力画像の画素数をｎ、各画素の信号値（本実施例では、Ｌ＊、ａ＊、ｂ＊のそれぞれの値）をＸｉ（ｉ＝１，２，… ，ｎ）、領域内の画素数の信号値の平均をＸａｖｅで示す。尚、本実施例ではＬ＊、ａ＊、ｂ＊のそれぞれの値での分散値を示すが、限定するものではなく、例えば、ａ＊、ｂ＊信号値での共分散値であってもよい。図１２に示した１１０１、１１０２、１１０３の例では、１１０１は信号値差が大きいことから分散値も大きくなり、１１０２と１１０３は信号値差が小さいことから分散値も比較的小さくなる。 Here, n is the number of pixels in the cut-out input image, Xi (i = 1, 2, …, n) is the signal value of each pixel (in this embodiment, each of the L*, a*, and b* values), and Xave is the average of the signal values of the pixels in the area. Although this embodiment uses the variance of each of the L*, a*, and b* values, the invention is not limited to this; for example, the covariance of the a* and b* signal values may be used. In the examples 1101, 1102, and 1103 shown in FIG. 12, 1101 has a large signal value difference and therefore a large variance, whereas 1102 and 1103 have small signal value differences and therefore relatively small variances.
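The formula itself does not appear in this text. Assuming it is the ordinary (population) variance of the n signal values, which is consistent with the symbols n, Xi, and Xave defined here, it would read as follows; the symbol σ² and the 1/n normalization are assumptions.

```latex
\sigma^{2} = \frac{1}{n} \sum_{i=1}^{n} \left( X_{i} - X_{ave} \right)^{2},
\qquad
X_{ave} = \frac{1}{n} \sum_{i=1}^{n} X_{i}
```

Under this assumption the variance is computed separately for each of the L*, a*, and b* channels.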

これ以降の説明で用いる用語の定義として、「エッジが取得されやすい閾値」とは隣り合う画素間の信号値差を比較し、差がある場合にエッジと判定する処理では信号値差が小さくてもエッジと判定するものである。逆に、「エッジが取得されにくい閾値」は、信号値差が大きくなければエッジと判定されず、信号値差が少ない場合にはエッジと判定されないものをいう。 As definitions of terms used in the following description, a “threshold value at which an edge is easily acquired” is a threshold with which, in a process that compares the signal values of adjacent pixels and determines an edge when they differ, an edge is determined even if the signal value difference is small. Conversely, a “threshold value at which an edge is hard to be acquired” is a threshold with which an edge is determined only when the signal value difference is large, and not when it is small.

エッジ判定閾値算出部１００２は、分散値検出部１００１によって算出された分散値を元にエッジ抽出を行うための閾値の算出を行う。例えば、１１０１に示すように分散値が大きい画像に対してはエッジが取得されにくい閾値を割り当てる。一方で、１１０２と１１０３に対しては、エッジが取得されやすい閾値を割り当てる。 The edge determination threshold value calculation unit 1002 calculates a threshold value for edge extraction based on the variance value calculated by the variance value detection unit 1001. For example, as shown by 1101, a threshold value with which an edge is hard to be acquired is assigned to an image having a large variance value. On the other hand, to 1102 and 1103, a threshold value with which an edge is easily acquired is assigned.

エッジ抽出部１００３は、エッジ判定閾値算出部１００２により決定した閾値を元に、エッジ抽出処理を行う処理部である。処理の方法は、汎用的な処理でよく、たとえば近接する画素の信号値差の比較を行い、その差が特定の閾値を越えるか否かで判定するものや、一次微分を算出するフィルタによりエッジ量を求め、特定の閾値を越えるか否かで判定する方法などが挙げられる。 The edge extraction unit 1003 is a processing unit that performs edge extraction processing based on the threshold determined by the edge determination threshold calculation unit 1002. The processing method may be general-purpose processing, for example, by comparing the signal value differences between adjacent pixels and determining whether or not the difference exceeds a specific threshold, or by using a filter that calculates the first derivative There is a method of obtaining the amount and making a determination based on whether or not the amount exceeds a specific threshold value.
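The first of the general-purpose methods mentioned here, comparing the signal values of adjacent pixels against a threshold, can be sketched as below. The image is assumed to be a list of rows of (L*, a*, b*) tuples, the difference between two pixels is assumed to be the largest absolute per-channel difference, and the threshold of 5 is used only for illustration.

```python
def extract_edges(image, threshold):
    """Mark a pixel as an edge when it differs from its right or lower neighbor by more than threshold."""

    def diff(p, q):
        # Assumption: pixel difference = largest absolute channel difference.
        return max(abs(pc - qc) for pc, qc in zip(p, q))

    h, w = len(image), len(image[0])
    edges = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if x + 1 < w and diff(image[y][x], image[y][x + 1]) > threshold:
                edges[y][x] = 1
            if y + 1 < h and diff(image[y][x], image[y + 1][x]) > threshold:
                edges[y][x] = 1
    return edges


# One-row patches built from the signal values given for 1106/1107 and 1108/1109.
patch_1102 = [[(128, -50, 30), (128, -60, 30)]]
patch_1103 = [[(128, -50, 30), (128, -52, 30)]]

print(extract_edges(patch_1102, threshold=5))
print(extract_edges(patch_1103, threshold=5))
```

With this illustrative threshold, the difference of 10 between 1106 and 1107 is detected as an edge, while the difference of 2 between 1108 and 1109 is not.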

エッジ判定閾値算出部１００２によって算出した条件で切り分ける場合、１１０１はエッジが取得されにくい閾値を割り当ててエッジ抽出を行う。ここでは例えば、分散値を元に決定した閾値が５となった場合の例を示す。その閾値で判定した場合、１１０４と１１０５の領域間の信号値差は大きいため、正確に１１０４と１１０５の領域間にあるエッジを抽出できる。この結果を１１１０に示す。一方、１１０２の場合には、１１０６と１１０７の信号値差は小さいものの、エッジが取得されやすい閾値を割り当てる事で、１１０６と１１０７の領域間にあるエッジを抽出できる。この結果を１１１１に示す。１１０３の場合にはエッジが取得されやすい閾値を割り当てているが、１１０８と１１０９の間の信号値差が、１１０６と１１０７の信号値差に比べ非常に小さい。そのため、エッジが取得されやすい閾値であったとしても、１１０８と１１０９の領域間にあるエッジを抽出する事はない。この結果を１１１２に示す。 In the case of cutting according to the condition calculated by the edge determination threshold value calculation unit 1002, 1101 assigns a threshold value at which an edge is hard to be acquired and performs edge extraction. Here, for example, an example in which the threshold determined based on the variance value is 5 is shown. When the determination is made with the threshold value, the signal value difference between the regions 1104 and 1105 is large, and therefore the edge between the regions 1104 and 1105 can be accurately extracted. The result is shown in 1110. On the other hand, in the case of 1102, although the signal value difference between 1106 and 1107 is small, an edge between the regions of 1106 and 1107 can be extracted by assigning a threshold value with which an edge is easily acquired. The result is shown in 1111. In the case of 1103, a threshold value with which an edge is easily acquired is assigned, but the signal value difference between 1108 and 1109 is much smaller than the signal value difference between 1106 and 1107. Therefore, even if the edge is a threshold value that is easily acquired, the edge between the areas 1108 and 1109 is not extracted. The result is shown in 1112.
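How a threshold might be selected from the variance can be sketched as follows, following the assignment described for 1101 to 1103 (a large variance receives the hard-to-acquire threshold). Only the a* values from the description are used for brevity; the variance limit of 100 and the threshold values 3 and 5 are hypothetical.

```python
def variance(values):
    """Population variance of a list of signal values."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def select_edge_threshold(channels, variance_limit=100.0, easy=3, hard=5):
    """Return the hard-to-acquire threshold when any channel variance reaches the limit."""
    largest = max(variance(values) for values in channels)
    return hard if largest >= variance_limit else easy


# a* values of two pixels from each of the two areas in 1101, 1102 and 1103.
a_1101 = [-50, -50, 50, 50]
a_1102 = [-50, -50, -60, -60]
a_1103 = [-50, -50, -52, -52]

for a_values in (a_1101, a_1102, a_1103):
    print(variance(a_values), select_edge_threshold([a_values]))
```

A lower numeric threshold corresponds to a “threshold value at which an edge is easily acquired”, since smaller signal value differences then count as edges.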

次に、図１３のフローチャートを用いて、図１１のエッジ検出部（３０２）の説明を行う。説明を行う上で、図１１を適宜参照する。 Next, the edge detection unit (302) in FIG. 11 will be described using the flowchart in FIG. 11 will be appropriately referred to in the description.

まず、ステップ１２０１にて、分散値算出部（１００１）は、入力画像（５０１）に対して信号の分散値を算出する。この際、その画像が持つチャンネル数が３の場合には３つとも求めてもよいし、１チャンネル化して１つでも良い。 First, in step 1201, the variance value calculation unit (1001) calculates the variance value of a signal for the input image (501). At this time, if the number of channels of the image is three, all three may be obtained, or one channel may be obtained.

次に、ステップ１２０２にて、エッジ閾値算出部（１００２）は、ステップ１２０１で算出した画像の信号の分散値が所定の値を越えているか否かを判定する。もし、所定の閾値以上の場合には、１２０３において「エッジが取得されやすい閾値」を取得する。逆に、所定の閾値未満の場合には、１２０４において「エッジが取得されにくい値」を取得する。 Next, in step 1202, the edge threshold calculation unit (1002) determines whether or not the variance value of the image signal calculated in step 1201 exceeds a predetermined value. If it is equal to or larger than the predetermined threshold value, “a threshold value at which an edge is easily acquired” is acquired at 1203. On the other hand, if it is less than the predetermined threshold value, “a value at which an edge is hard to be acquired” is acquired at 1204.

最後に、ステップ１２０５にて、エッジ抽出部（１００３）は、１２０３又は１２０４で決定した閾値を元にエッジ抽出処理を行う。 Finally, in step 1205, the edge extraction unit (1003) performs edge extraction processing based on the threshold determined in 1203 or 1204.

以上のように、本実施例では、エッジ抽出を行う場合に、文字切り矩形単位で切り出された入力画像毎に、画像の分散値を元に閾値を適応的に切り替える構成としている。そうすることにより、より高精度に「背景から分離困難な文字画像」と「背景から分離容易な文字画像」を精度よく切り分ける事ができるようになる。 As described above, in this embodiment, when edge extraction is performed, the threshold is switched adaptively, based on the variance value of the image, for each input image cut out in units of character-cutting rectangles. By doing so, a “character image that is difficult to separate from the background” and a “character image that is easy to separate from the background” can be distinguished with higher accuracy.

（実施例５）
実施例４では、エッジの抽出を行う際の閾値算出において、信号値の分散値を元に閾値を切り替える手法を説明した。入力画像が３チャンネル等を持つカラー画像の場合には、チャンネル数に応じた数だけの分散値を算出ができ、精度よく閾値の決定に用いる事ができる。しかしながら、入力画像がグレースケールの場合には、チャンネル数が１つのため、閾値算出に用いる事ができる分散値が１つになってしまい、高精度に閾値を算出する事が難しい。 (Example 5)
The fourth embodiment has described the method of switching the threshold value based on the variance value of the signal values in the threshold value calculation when the edge is extracted. When the input image is a color image having three channels or the like, it is possible to calculate as many variance values as the number of channels, which can be used for determining the threshold value with high accuracy. However, when the input image is grayscale, since the number of channels is one, the variance value that can be used for threshold calculation is one, and it is difficult to calculate the threshold with high accuracy.

そこで、本実施例では図１４に示すようにエッジ検出部（３０２）の構成を、分散値検出部１００１、エッジ判定閾値算出部１００２、エッジ抽出部１００３に加え、黒画素密度算出部１００４から構成される。また、入力画像に加え、２値化画像として使用する。 Therefore, in this embodiment, as shown in FIG. 14, the edge detection unit (302) includes a black pixel density calculation unit 1004 in addition to the variance value detection unit 1001, the edge determination threshold value calculation unit 1002, and the edge extraction unit 1003. In addition to the input image, a binarized image is also used as input.

黒画素密度算出部１００４は、入力される２値化画像を元に、文字切り矩形の面積に対する黒画素数の比率を算出する演算部である。入力されてくる２値化画像内で、黒画素数をカウントし、そのカウント数を文字切り矩形の面積で除算を行う。 The black pixel density calculation unit 1004 is a calculation unit that calculates the ratio of the number of black pixels to the area of the character cutting rectangle based on the input binarized image. In the input binary image, the number of black pixels is counted, and the counted number is divided by the area of the character-cutting rectangle.

次に、エッジ閾値算出部１００２において、黒画素密度算出部１００４で算出した黒画素密度を元に、最適な閾値を算出する。ここでも実施例１の分散値に応じてエッジの閾値を切り替えたのと同様に黒画素密度に応じてエッジの閾値を算出する。具体的には、黒画素密度が高い場合には「エッジが取得されやすい閾値」とし、黒画素密度が低い場合には「エッジが取得されにくい閾値」に設定する。このように設定する事で、「濃い濃度の背景を有する文字」の場合には黒画素密度が高く、「エッジが取得されやすい閾値」によりエッジ抽出が行え、正確にエッジの算出を行う事が可能となる。 Next, the edge threshold calculation unit 1002 calculates an optimum threshold based on the black pixel density calculated by the black pixel density calculation unit 1004. Here too, just as the edge threshold was switched according to the variance value in the first embodiment, the edge threshold is calculated according to the black pixel density. Specifically, when the black pixel density is high, a “threshold value at which an edge is easily acquired” is set, and when the black pixel density is low, a “threshold value at which an edge is hard to be acquired” is set. With this setting, a “character with a dark background” has a high black pixel density, so edge extraction is performed with a “threshold value at which an edge is easily acquired” and the edges can be calculated accurately.
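The black pixel density and the threshold derived from it can be sketched as below. The binarized rectangle is assumed to use 1 for black and 0 for white; the density limit of 0.5 and the threshold values 3 and 5 are hypothetical.

```python
def black_pixel_density(binary):
    """Number of black pixels divided by the area of the character-cutting rectangle."""
    area = sum(len(row) for row in binary)
    black = sum(sum(row) for row in binary)
    return black / area


def threshold_from_density(density, density_limit=0.5, easy=3, hard=5):
    """High density selects the easily-acquired threshold, low density the hard one."""
    return easy if density >= density_limit else hard


# Character on a dark background: most of the rectangle binarizes to black.
dark_background = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
]
# Character on a light background: only the stroke is black.
light_background = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]

for binary in (dark_background, light_background):
    d = black_pixel_density(binary)
    print(d, threshold_from_density(d))
```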

尚、分散値を元に算出した閾値と、黒画素密度を元に算出した閾値のいずれか一つを使う事も可能であるが、両方とも使用して閾値算出に用いることも可能である。その際には、エッジをより多く取得する観点で「エッジが取得されやすい閾値」の方を使用する事が望ましいが、「エッジが取得されにくい閾値」を選ぶことも可能である。また、それぞれの閾値の重みを切り替える事で、例えば分散値を元に算出した閾値を優先させることなども可能である。 Note that it is possible to use either one of the threshold value calculated based on the variance value and the threshold value calculated based on the black pixel density, but it is also possible to use both of them for the threshold value calculation. In that case, it is preferable to use the “threshold value at which edges are easily acquired” from the viewpoint of acquiring more edges, but it is also possible to select the “threshold value at which edges are hard to be acquired”. Further, by switching the weights of the respective threshold values, it is possible to give priority to the threshold value calculated based on the variance value, for example.
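One possible way to combine the two thresholds is sketched below. It assumes that a numerically lower threshold is the one at which edges are more easily acquired; the function name and the optional weight parameter are hypothetical.

```python
def combine_thresholds(t_variance, t_density, weight_variance=None):
    """Combine the variance-based and density-based thresholds.

    Without a weight, the lower (more easily acquired) threshold is used.
    With a weight in [0, 1], a weighted average favors the variance-based value.
    """
    if weight_variance is None:
        return min(t_variance, t_density)
    return weight_variance * t_variance + (1 - weight_variance) * t_density


print(combine_thresholds(5, 3))
print(combine_thresholds(5, 3, weight_variance=0.75))
```

Replacing `min` with `max` would select the threshold at which edges are harder to acquire, which the text also permits.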

また、図１５に示す通りエッジ検出部（３０２）の構成を、分散値検出部１００１、エッジ判定閾値算出部１００２、エッジ抽出部１００３、黒画素密度算出部１００４に加え、閉ループ数算出部１００５から構成しても良い。 Further, as shown in FIG. 15, the configuration of the edge detection unit (302) includes a variance value detection unit 1001, an edge determination threshold value calculation unit 1002, an edge extraction unit 1003, a black pixel density calculation unit 1004, and a closed loop number calculation unit 1005. It may be configured.

閉ループ数算出部１００５は、入力される２値化画像に対し白の部分の連続した画素により閉ループができている数を算出するラベリングの処理を行う演算部である。 The closed loop number calculation unit 1005 is a calculation unit that performs a labeling process for calculating the number of closed loops formed by continuous pixels in the white part of the input binarized image.

次に、エッジ閾値算出部１００２において、閉ループ数算出部１００５で算出した閉ループ数を元に、最適な閾値を算出する。ここでも実施例１同様に、閉ループ数の多少によりエッジ抽出に用いる閾値を算出する。具体的には、閉ループ数が多い場合には「エッジが取得されにくい閾値」を使用し、逆に閉ループ数が少ない場合には「エッジが取得されやすい閾値」を使用する。 Next, the edge threshold value calculation unit 1002 calculates an optimum threshold value based on the closed loop number calculated by the closed loop number calculation unit 1005. Here, as in the first embodiment, the threshold used for edge extraction is calculated depending on the number of closed loops. Specifically, when the number of closed loops is large, the “threshold value at which an edge is hard to be acquired” is used, and when the number of closed loops is small, a “threshold value at which an edge is easily acquired” is used.
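The closed-loop count can be sketched with a simple labeling pass. This sketch interprets a closed loop as a 4-connected white region that is completely enclosed by black pixels (it does not touch the border of the rectangle), which is an assumption; the loop limit of 2 and the threshold values 3 and 5 are also hypothetical.

```python
from collections import deque


def count_closed_loops(binary):
    """Count white (0) regions fully enclosed by black (1) pixels using 4-connected labeling."""
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    loops = 0
    for sy in range(h):
        for sx in range(w):
            if binary[sy][sx] != 0 or seen[sy][sx]:
                continue
            touches_border = False
            queue = deque([(sy, sx)])
            seen[sy][sx] = True
            while queue:
                y, x = queue.popleft()
                if y in (0, h - 1) or x in (0, w - 1):
                    touches_border = True
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] == 0 and not seen[ny][nx]:
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if not touches_border:
                loops += 1
    return loops


def threshold_from_loops(loops, loop_limit=2, easy=3, hard=5):
    """Many closed loops select the hard-to-acquire threshold, few select the easy one."""
    return hard if loops >= loop_limit else easy


two_holes = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
open_shape = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]

for binary in (two_holes, open_shape):
    n = count_closed_loops(binary)
    print(n, threshold_from_loops(n))
```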

以上の処理により、グレースケールのようなチャンネル数が少なく信号値の分散を元にエッジの閾値を算出できない画像に対しても、最適なエッジ閾値の算出を行う事が可能となる。 With the above processing, it is possible to calculate the optimum edge threshold even for an image such as gray scale in which the number of channels is small and the edge threshold cannot be calculated based on the variance of signal values.

（その他の実施例） (Other embodiments)
本発明は、上述の実施形態の１以上の機能を実現するプログラムを、ネットワーク又は記憶媒体を介してシステム又は装置に供給し、そのシステム又は装置のコンピュータにおける１つ以上のプロセッサーがプログラムを読出し実行する処理でも実現可能である。また、１以上の機能を実現する回路（例えば、ＡＳＩＣ）によっても実現可能である。 The present invention can also be realized by a process in which a program that implements one or more functions of the above-described embodiments is supplied to a system or apparatus via a network or a storage medium, and one or more processors in a computer of the system or apparatus read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

Claims

An image processing apparatus comprising:
first determining means for determining, based on a binarization result of an image, whether the attribute of an area in the image is at least a character attribute or an image attribute;
second determining means for determining whether the image of the area determined by the first determining means to have the character attribute is a character image that is easily separated from the background or a character image that is difficult to separate from the background;
processing means for executing binary compression processing on the image of the area when the second determining means determines that the image of the area determined by the first determining means to have the character attribute is a character image that is easily separated from the background, and executing multi-value compression processing on the image of the area when the second determining means determines that it is a character image that is difficult to separate from the background; and
generation means for generating a file of the image based on at least the image of the area that has been determined by the first determining means to have the character attribute and on which one of the binary compression processing and the multi-value compression processing has been executed.

2. The image processing apparatus according to claim 1, wherein, when the second determining means determines that the image of the area determined by the first determining means to have the character attribute is a character image that is difficult to separate from the background, the processing means does not execute the binary compression processing on the image of that area.

3. The image processing apparatus according to claim 1, wherein the second determining means determines, based on the number of edges in the image of the area, whether the image of the area determined by the first determining means to have the character attribute is a character image that is easy to separate from the background or a character image that is difficult to separate from the background.

4. The image processing apparatus according to claim 1, further comprising character recognition means for performing character recognition processing on the image of the area determined by the first determining means to have the character attribute,
wherein the generating means generates the file of the image including a character code obtained by the character recognition processing.

5. The image processing apparatus according to claim 1, wherein the processing means executes the multi-value compression processing on the image of the area determined by the first determining means to have the image attribute, and
the generating means generates the file of the image based on the image of the area that has been determined by the first determining means to have the character attribute and on which one of the binary compression processing and the multi-value compression processing has been executed, and on the image of the area that has been determined by the first determining means to have the image attribute and on which the multi-value compression processing has been executed.

6. The image processing apparatus according to claim 1, further comprising reading means for reading a document to generate the image.

7. A method of controlling an image processing apparatus, the method comprising:
a first determining step of determining, based on a binarization result of an image, whether an attribute of an area in the image is at least a character attribute or an image attribute;
a second determining step of determining whether the image of the area determined in the first determining step to have the character attribute is a character image that is easy to separate from the background or a character image that is difficult to separate from the background;
a processing step of executing binary compression processing on the image of the area when it is determined in the second determining step that the image of the area determined in the first determining step to have the character attribute is a character image that is easy to separate from the background, and of executing multi-value compression processing on the image of the area when it is determined in the second determining step that the image of the area determined in the first determining step to have the character attribute is a character image that is difficult to separate from the background; and
a generating step of generating a file of the image based on at least the image of the area that has been determined in the first determining step to have the character attribute and on which one of the binary compression processing and the multi-value compression processing has been executed.

8. A program for causing a computer of an image processing apparatus to execute the control method according to claim 7.
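
For orientation only, the selection between binary and multi-value compression recited in the apparatus claims can be sketched as below. This sketch is not part of the claims and is not the patented implementation: the names `select_compression`, `BINARY`, and `MULTI_VALUE` and the attribute strings are hypothetical, and the test for whether a character image is easy to separate from the background is left to a caller-supplied function because its exact criterion is not restated here.

```python
from typing import Callable

# Hypothetical labels for the two kinds of compression processing.
BINARY = "binary"
MULTI_VALUE = "multi-value"


def select_compression(attribute: str,
                       is_easy_to_separate: Callable[[], bool]) -> str:
    """Return which compression processing to apply to one area.

    attribute: result of the first determination, assumed here to be either
        "character" or "image" (hypothetical string values).
    is_easy_to_separate: result of the second determination, supplied by the
        caller; it is consulted only for character-attribute areas.
    """
    if attribute == "image":
        # Image-attribute areas receive multi-value compression.
        return MULTI_VALUE
    if attribute == "character":
        # Character areas are binary-compressed only when the characters are
        # easy to separate from the background; otherwise multi-value.
        return BINARY if is_easy_to_separate() else MULTI_VALUE
    raise ValueError(f"unknown attribute: {attribute!r}")


print(select_compression("character", lambda: True))   # binary
print(select_compression("character", lambda: False))  # multi-value
print(select_compression("image", lambda: True))       # multi-value
```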