JP3106080B2 - Image processing apparatus and method

Info

Publication number: JP3106080B2
Application number: JP07022895A
Authority: JP
Inventors: 剛蒔田; 修山田; 浩森
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1995-02-10
Filing date: 1995-02-10
Publication date: 2000-11-06
Anticipated expiration: 2015-11-06
Also published as: JPH08221565A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to an image processing apparatus and method, for example an image processing apparatus and method that determine a binarization threshold for a multi-valued image and binarize the image.

[0002]

[Prior Art] Image processing technology has advanced remarkably in recent years, and image processing apparatuses capable of processing multi-valued images such as full-color images, and of performing character recognition within multi-valued images, have become widespread.

[0003] In such image processing technology, binarization of multi-valued images is indispensable. Conventional binarization methods include simple binarization with a preset fixed threshold; Otsu's method (Otsu, "An automatic threshold selection method based on discriminant and least squares criteria," Transactions of the IECE of Japan, vol. J63-D, no. 4, pp. 349-356, 1980), which takes as the binarization threshold the value that maximizes the between-class variance when the histogram is divided into two classes at that value; and methods that, for images with gradation, set the threshold according to local density.

[0004]

[Problems to Be Solved by the Invention] However, the binarization methods used in conventional image processing apparatuses as described above have the following problems.

[0005] With simple binarization using a fixed threshold, it is difficult to set an appropriate threshold between the object density and the background density in the image. As a result, the whole image may be crushed to black or, conversely, washed out to white. With Otsu's method, when the distributions of the two classes differ greatly, the threshold tends to be drawn toward the larger class, so a noisy binary image is produced. With binarization methods that set the threshold according to local density, the image is divided into local blocks, so block distortion tends to occur.

[0006] The present invention has been made to solve the above problems, and its object is to provide an image processing apparatus and method capable of automatically setting an appropriate binarization threshold between the object density and the background density in an image.

[0007]

[Means for Solving the Problems] To achieve the above object, the present invention has the following configuration.

[0008] That is, the invention is an image processing apparatus comprising calculating means for calculating a luminance frequency distribution of a multi-valued image, and threshold determining means for determining, based on the luminance frequency distribution, a binarization threshold for separating a background and an object in the multi-valued image. The threshold determining means sets first and second luminance values as the start point and end point, respectively, of a specific luminance region in the luminance frequency distribution, and obtains an average luminance value and a distribution bias in the specific luminance region. When the distribution bias does not satisfy a predetermined condition, the threshold determining means changes either the first luminance value or the second luminance value to set a new specific luminance region and obtains anew the average luminance value and the distribution bias in that region, repeating this process until the distribution bias satisfies the predetermined condition. When the distribution bias satisfies the predetermined condition, the threshold determining means sets the average luminance value in the specific luminance region as the binarization threshold.

[0009] The apparatus further comprises input means for inputting the multi-valued image, and binarizing means for binarizing the multi-valued image using the binarization threshold.

[0010] For example, the threshold determining means calculates the distribution bias based on the difference between the luminance value of each pixel in the specific luminance region of the luminance frequency distribution and the average luminance value in that region.

[0011] For example, the threshold determining means calculates the distribution bias based on an odd power of the difference between the luminance value of each pixel in the specific luminance region of the luminance frequency distribution and the average luminance value in that region.

[0012] For example, the threshold determining means judges that the predetermined condition is satisfied if the distribution bias is within a predetermined range. If the distribution bias is outside the predetermined range and positive, it changes the first luminance value to the average luminance value in the specific luminance region. If the distribution bias is outside the predetermined range and negative, it changes the second luminance value to the average luminance value in the specific luminance region.

[0013]

[Operation] With the above configuration, the region of the input multi-valued image in which the threshold best suited to separating the background from the object lies is identified from the luminance frequencies and their bias, and the binarization threshold can be determined from the average luminance value of that region.

[0014]

[Embodiments] An embodiment of the present invention is described in detail below with reference to the drawings.

[0015] <First Embodiment> FIG. 1 is a block diagram showing the configuration of a system that executes the binarization processing of this embodiment. In FIG. 1, reference numeral 1 denotes an image processing apparatus that performs character recognition processing, 2 denotes an image input device such as a scanner that inputs images, and 3 denotes an image display device that displays processed images.

[0016] In the image processing apparatus 1, reference numeral 4 denotes an input unit that interfaces with the image input device 2, 5 denotes a storage unit such as a memory that stores data being processed, and 6 denotes a luminance frequency accumulation unit that accumulates the luminance frequencies (histogram) of the input image. Reference numeral 7 denotes a binarization threshold calculation unit that calculates the binarization threshold of the input image, and 8 denotes a binarization unit that creates a binary image using the threshold calculated by the binarization threshold calculation unit 7. Reference numeral 9 denotes a region separation unit that separates the image into regions by attribute, 10 denotes a character recognition unit that performs character recognition on regions extracted as character regions by the region separation, 11 denotes an image processing unit that performs various kinds of image processing on regions separated as other than character regions, and 12 denotes an output unit that interfaces with the image display device 3. These components are controlled overall by a CPU (not shown).

[0017] The OCR processing executed in the image processing apparatus 1 of this embodiment, configured as described above, is explained below.

[0018] FIG. 2 is a flowchart showing image-area-separation OCR processing that uses the binarization threshold determination method characteristic of this embodiment.

[0019] First, in step S501, image data is input through the image input device 2 such as a scanner. The input is 8-bit multi-valued image data. Next, in step S502, a binarization threshold optimal for image area separation, which is the characteristic feature of this embodiment, is determined for the multi-valued image input in step S501, and a binary image is generated with that threshold. In step S503, image area separation is performed on the binary image generated in step S502, and region data with attributes attached is output. In step S504, erroneous determination results contained in the region data separated in step S503 are removed. The removal of erroneous determination results in step S504 is hereinafter called layout noise reduction (LNR). The process then proceeds to step S505, where the regions designated "text" in the region data separated in step S503 are cut out of the binary image, OCR processing is performed on the cut-out binary image, and the recognized character codes are output.

[0020] <<Binarization Processing>> Next, the binarization processing of this embodiment is described in detail with reference to the flowchart of FIG. 3.

[0021] In FIG. 3, first, in step S1, an 8-bit multi-valued image is read from the storage unit 5 of the image processing apparatus 1 into a memory or the like (not shown). The multi-valued image is assumed to have been stored in the storage unit 5 beforehand by the image input device 2 such as a scanner. In step S2, the histogram of the entire input image is calculated. All pixels of the image are used, and the frequency of each 8-bit digital value, that is, each value from 0 to 255, is counted. This yields, for example, the histogram shown in FIG. 4.
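As a rough illustration of step S2, the histogram could be accumulated as follows. This is a minimal Python sketch, not the patent's implementation; the function name `luminance_histogram` and the representation of the image as a flat list of 8-bit values are assumptions made for the example.

```python
def luminance_histogram(pixels):
    """Count how many pixels take each 8-bit luminance value (0..255)."""
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1
    return hist


# Hypothetical tiny "image" given as a flat list of luminance values.
sample_pixels = [0, 0, 12, 255, 255, 255]
sample_hist = luminance_histogram(sample_pixels)
```

The later sketches in this section work directly on such a 256-bin list, so the image itself only needs to be scanned once.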

[0022] Next, in step S3, the parameters START and END are set to 0 and 255, respectively. START and END correspond to the start point and end point of the luminance range over which the statistics are computed in the subsequent steps S4 and S5.

[0023] In step S4, the average value AV of the pixels whose digital values lie between START and END is calculated. For example, if START = 0 and END = 255, the average AV of the pixels with values from 0 to 255 (in this case, all pixels) is calculated. If START = 0 and END = 177, the average AV of the pixels with values from 0 to 177 is calculated.
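Step S4 can be computed from the histogram alone, without revisiting the pixels. The sketch below is one possible reading; the name `range_mean` is hypothetical, and treating both START and END as inclusive bounds is an assumption, since the text does not state how the endpoints are handled.

```python
def range_mean(hist, start, end):
    """Mean luminance of the pixels whose value lies in [start, end] (inclusive).

    Returns None when no pixel falls inside the range.
    """
    count = sum(hist[v] for v in range(start, end + 1))
    if count == 0:
        return None
    total = sum(v * hist[v] for v in range(start, end + 1))
    return total / count


# Hypothetical histogram: two pixels at value 10 and two at value 20.
demo_hist = [0] * 256
demo_hist[10] = 2
demo_hist[20] = 2
mean_all = range_mean(demo_hist, 0, 255)   # mean over every pixel
mean_low = range_mean(demo_hist, 0, 12)    # mean over the darker pixels only
```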

[0024] In step S5, the skew value SK of the pixels whose luminance values lie between START and END is calculated. The skew value is a statistic that indicates the bias of the histogram distribution. It is calculated using equation (1) below.

[0025] SK = (Σ (Xi − AV)^3) / D ... (1)
(The notation R^3 denotes R cubed.) Here, Xi is the luminance value of a pixel, and D is the variance of the entire image, calculated by equation (2).

[0026] D = Σ (Xi − AV)^2 ... (2)
(The notation R^2 denotes R squared.) In equation (1) above, the skew value is calculated by cubing the difference between the luminance value of each pixel and the average value, but the exponent is not limited to three; any odd power may be used.
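Equations (1) and (2) can be evaluated on the histogram by weighting each luminance value with its frequency. The following is a hedged sketch: the name `range_skew` and the `power` parameter are hypothetical, D is summed over the same START..END range as the numerator (the text calls D the variance of the entire image, so this is an interpretation), and returning 0.0 when D is zero is a choice made for the sketch.

```python
def range_skew(hist, start, end, power=3):
    """Skew value SK of equation (1) over luminance values in [start, end].

    `power` must be odd so that the sign of the bias is preserved.
    """
    values = range(start, end + 1)
    n = sum(hist[v] for v in values)
    if n == 0:
        return 0.0
    av = sum(v * hist[v] for v in values) / n
    # Equation (2): sum of squared deviations, weighted by frequency.
    d = sum(hist[v] * (v - av) ** 2 for v in values)
    if d == 0:
        return 0.0  # every pixel has the same value, so there is no bias
    # Equation (1): sum of odd-power deviations divided by D.
    return sum(hist[v] * (v - av) ** power for v in values) / d


# Hypothetical histogram with most pixels dark and one brighter outlier.
skew_hist = [0] * 256
skew_hist[10] = 3
skew_hist[20] = 1
sk = range_skew(skew_hist, 0, 255)  # positive: the tail lies above the mean
```

A positive result means the pixels far from the mean lie on the bright side, a negative result that they lie on the dark side.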

[0027] Next, in steps S6 and S7, the direction of the bias of the histogram is determined. First, in step S6, the direction of the bias is judged by expression (3) below. This determines whether the histogram is biased toward values smaller than the average value AV.

[0028] SK < −1.0 ... (3)
If expression (3) is true in step S6, the process proceeds to step S10; if false, it proceeds to step S7. In step S10, START is left unchanged and END is set to the average value AV. The process then returns to step S4, and the average value AV from START to END is calculated again.

[0029] In step S7, on the other hand, the direction of the bias of the histogram is judged by expression (4) below. This determines whether the histogram is biased toward values larger than the average value AV.

[0030] SK > 1.0 ... (4)
If expression (4) is true in step S7, the process proceeds to step S11; if false, it proceeds to step S8. In step S11, START is set to the average value AV and END is left unchanged. The process then returns to step S4, and the average value AV from START to END is calculated again.

[0031] In step S8, the average value AV obtained when the conditions of steps S6 and S7 are both false is set as the binarization threshold TH. Then, in step S9, simple binarization using the binarization threshold TH is performed.

[0032] The binarization processing of this embodiment is performed as described above. The ranges given in expressions (3) and (4) are not limiting.
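Steps S3 through S9 can be put together in a short loop. The sketch below is an illustration under stated assumptions rather than the patented implementation: the names `find_threshold` and `binarize` are hypothetical, the bounds `low` and `high` are exposed as parameters because the text says the ranges of (3) and (4) are not limiting, AV is truncated to an integer when it becomes START or END, `max_passes` is a safety guard the text does not mention, and pixels at or below the threshold are treated as black.

```python
def find_threshold(hist, low=-1.0, high=1.0, max_passes=64):
    """Return (threshold, trace) for a 256-bin luminance histogram.

    trace holds one (START, END, AV, SK) tuple per pass through S4 and S5.
    """
    start, end = 0, 255                                    # S3
    trace = []
    av = None
    for _ in range(max_passes):                            # guard added for this sketch
        values = range(start, end + 1)
        n = sum(hist[v] for v in values)
        if n == 0:
            break
        av = sum(v * hist[v] for v in values) / n          # S4
        d = sum(hist[v] * (v - av) ** 2 for v in values)   # equation (2)
        cubes = sum(hist[v] * (v - av) ** 3 for v in values)
        sk = cubes / d if d else 0.0                       # S5, equation (1)
        trace.append((start, end, av, sk))
        if sk < low:                                       # S6 true: go to S10
            end = int(av)
        elif sk > high:                                    # S7 true: go to S11
            start = int(av)
        else:                                              # both false: S8
            break
    threshold = None if av is None else int(av)
    return threshold, trace


def binarize(pixels, threshold):
    """S9: 1 (black) for pixels at or below the threshold, 0 (white) otherwise."""
    return [1 if p <= threshold else 0 for p in pixels]


# Hypothetical page: a little dark text (value 20) on a light background (value 200).
page_hist = [0] * 256
page_hist[20] = 10
page_hist[200] = 90
th, passes = find_threshold(page_hist)
```

On this toy histogram the first pass sees a strongly negative skew, so END is pulled down to the overall mean and the second pass settles on the dark cluster.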

[0033] The binarization processing of this embodiment is now described in more detail with reference to a concrete image example.

[0034] The process of determining the binarization threshold TH in this embodiment is explained using the example histogram shown in FIG. 4.

[0035] FIG. 4 shows the histogram of an image (8-bit input). In FIG. 4, the horizontal axis is the digital luminance value, with 0 (black) at the left end and 255 (white) at the right end, and the vertical axis is the frequency of each digital value. FIG. 5 shows how each parameter value changes at steps S4 and S5 of the binarization processing of FIG. 3 for an image with a histogram like that of FIG. 4. The parameter values in FIG. 5 are listed by the number of passes through steps S4 and S5.

[0036] In the first pass through steps S4 and S5, the average value AV and the statistic SK are calculated with START = 0 and END = 255, giving 177 and −78.9, respectively. Since SK is less than −1.0, START = 0 and END = 177 are set in step S10.

[0037] In the second pass, AV and SK are calculated with START = 0 and END = 177, giving 91 and −8.6, respectively. Since SK is again less than −1.0, START = 0 and END = 91 are set in step S10.

[0038] In the third pass, AV and SK are calculated with START = 0 and END = 91, giving 43 and 9.6, respectively. Since SK exceeds 1.0 in this case, START = 43 and END = 91 are set in step S11.

[0039] In the fourth pass, AV and SK are calculated with START = 43 and END = 91, giving 72 and −7.0, respectively. Since SK is less than −1.0, START = 43 and END = 72 are set in step S10.

[0040] In the fifth pass, AV and SK are calculated with START = 43 and END = 72, giving 58 and −2.2, respectively. Since SK is less than −1.0, START = 43 and END = 58 are set in step S10.

[0041] In the sixth pass, AV and SK are calculated with START = 43 and END = 58, giving 50 and −0.4, respectively. Since SK is now at least −1.0 and at most 1.0, neither the condition of step S6 nor that of step S7 is satisfied, so the process proceeds to step S8 and 50 is set as the binarization threshold TH. Then, in step S9, simple binarization using the binarization threshold TH is performed, and the binarized image is stored in the storage unit 5.
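The six passes can be replayed mechanically from the AV and SK values reported in the text, which checks that the START and END updates follow steps S6, S7, S10 and S11. The sketch below only uses those reported numbers; it does not recompute them, because the underlying histogram of FIG. 4 is not available here. The variable names are hypothetical.

```python
# (AV, SK) pairs reported for passes 1 to 6 of the worked example.
reported = [(177, -78.9), (91, -8.6), (43, 9.6), (72, -7.0), (58, -2.2), (50, -0.4)]

start, end = 0, 255          # S3
history = []                 # (START, END, AV, SK) as seen at each pass
threshold = None
for av, sk in reported:
    history.append((start, end, av, sk))
    if sk < -1.0:            # expression (3): S10 moves END down to AV
        end = av
    elif sk > 1.0:           # expression (4): S11 moves START up to AV
        start = av
    else:                    # S8: AV becomes the binarization threshold TH
        threshold = av
        break

for row in history:
    print(row)
print("TH =", threshold)
```

The START and END values recorded at each pass match those given in paragraphs [0036] to [0041], and the loop stops at the sixth pass with TH = 50.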

[0042] As described above, in this embodiment the binarization threshold is determined by letting the skew value converge into a predetermined range, and binarization is then performed. That is, in the input multi-valued image, the region in which the threshold best suited to separating the background from the object lies is identified from the luminance frequencies and their bias, and the average luminance value of that region is then taken as the binarization threshold. This makes it possible to obtain automatically the optimal threshold for classifying the luminance value of each pixel in a region of the multi-valued input image into two classes, background and object.

[0043] <<Image Area Separation Processing>> Next, the image area separation processing of step S503 in FIG. 2 is described in detail with reference to the flowchart of FIG. 6.

[0044] First, in step S601, a binary image is input and stored in the storage unit 5. In step S602, the input image is thinned so that each block of m × n pixels becomes one pixel, producing an image for image area separation. If even one black pixel exists in an m × n block, that block becomes one black pixel.
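The thinning of step S602 amounts to a logical OR over each block. A minimal sketch follows; the name `reduce_blocks` is hypothetical, the image is assumed to be a list of rows with 1 for black and 0 for white, m is taken as the block width and n as the block height, and partial blocks at the right and bottom edges are reduced the same way (the text does not say how edges are handled).

```python
def reduce_blocks(binary, m, n):
    """Shrink a binary image so each m-wide, n-high block becomes one pixel.

    A block maps to black (1) if it contains at least one black pixel.
    """
    height = len(binary)
    width = len(binary[0]) if binary else 0
    reduced = []
    for top in range(0, height, n):
        row = []
        for left in range(0, width, m):
            has_black = any(
                binary[y][x]
                for y in range(top, min(top + n, height))
                for x in range(left, min(left + m, width))
            )
            row.append(1 if has_black else 0)
        reduced.append(row)
    return reduced


# Hypothetical 4 x 4 image with a single black pixel in the top-right corner.
tiny = [
    [0, 0, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
small = reduce_blocks(tiny, 2, 2)
```

Using OR rather than averaging keeps thin strokes from disappearing in the reduced image, which matters because region separation works on connected black pixels.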

[0045] In step S603, region separation is performed over all pixels of the image for image area separation, treating each area in which a predetermined number of black pixels are contiguous in the vertical, horizontal, or diagonal direction as one region. Each region is labeled by numbering the regions in order of detection. Next, in step S604, the regions are classified by the width, height, and area of each region and the black pixel density within it, and attribute labels are attached. The region attributes, described in detail later, include "table," "outer frame," and "text."
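Grouping black pixels that touch vertically, horizontally, or diagonally is 8-connected component labeling. The sketch below shows one common way to do it with a breadth-first flood fill; it is not taken from the patent. The name `label_regions` is hypothetical, and the "predetermined number" of contiguous pixels mentioned in the text is simplified to plain adjacency.

```python
from collections import deque


def label_regions(img):
    """Label 8-connected groups of black (1) pixels in raster-scan detection order.

    Returns (labels, count), where labels has the same shape as img and
    holds 0 for white pixels and 1..count for region members.
    """
    height = len(img)
    width = len(img[0]) if img else 0
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if not img[y][x] or labels[y][x]:
                continue
            count += 1                      # new region, numbered in detection order
            labels[y][x] = count
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        inside = 0 <= ny < height and 0 <= nx < width
                        if inside and img[ny][nx] and not labels[ny][nx]:
                            labels[ny][nx] = count
                            queue.append((ny, nx))
    return labels, count


# Hypothetical image: two diagonal neighbours form region 1, a separate pixel is region 2.
dots = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
    [1, 0, 0],
]
region_labels, region_count = label_regions(dots)
```

Width, height, area and black pixel density for step S604 can then be gathered per label in a single further scan.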

[0046] In step S605, the average width and average height of all regions labeled "text" are calculated. If the average width is greater than the average height, the image being processed is regarded as horizontally written; otherwise it is regarded as vertically written. The text direction is determined in this way. At the same time, the average height (for horizontal writing) or the average width (for vertical writing) is taken as the size of one character.
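The decision of step S605 is a comparison of two averages. A small sketch, with hypothetical names and with text regions given as (width, height) pairs; returning "unknown" for an empty list is an assumption made here, and equal averages fall into the vertical case because the text only singles out "width greater than height" as horizontal.

```python
def text_layout(text_regions):
    """Return (direction, character_size) from (width, height) pairs of text regions."""
    if not text_regions:
        return "unknown", None
    avg_width = sum(w for w, _ in text_regions) / len(text_regions)
    avg_height = sum(h for _, h in text_regions) / len(text_regions)
    if avg_width > avg_height:
        # Lines run left to right, so a line's height approximates one character.
        return "horizontal", avg_height
    # Lines run top to bottom, so a line's width approximates one character.
    return "vertical", avg_width


# Hypothetical text regions from a horizontally written page.
direction, char_size = text_layout([(100, 10), (80, 12)])
```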

[0047] In addition, the columns and line spacing of the text are detected from histograms of all "text" regions taken in the vertical direction (for horizontal writing) or the horizontal direction (for vertical writing) of the image for image area separation. In step S606, "text" regions with a large character size are relabeled "title." In step S607, "title" and "text" regions that remain scattered without any association are merged according to their spacing from surrounding regions to form unified regions. Next, in step S608, region data such as the attribute and the coordinates and size in the original image is output for each region.

[0048] Through the above processing, this embodiment performs image area separation on the binary image and obtains the region data.

[0049] FIG. 7 shows an example of the region data described above. The items of region data shown in FIG. 7 are as follows.
・"Number": the order in which the region was detected.
・"Attribute": the attribute information of the region; the following eight kinds are provided.

[0050] "Root": the input image itself.

[0051] "Text": a character region.

[0052] "Title": a heading region.

[0053] "Table": a table region.

[0054] "Noise": a region that could be judged neither a character region nor an image region.

[0055] "Outer frame": a region such as ruled lines.

[0056] "Photo image": a photograph region.

[0057] "Line image": a line-drawing region.
・"Start coordinates": the X and Y coordinates at which the region starts in the original image.
・"End coordinates": the X and Y coordinates at which the region ends in the original image.
・"Pixel count": the total number of pixels in the region.
・"Text layout information": one of three values: vertical writing, horizontal writing, or unknown.

[0058] In the region data shown in FIG. 7, only regions whose "attribute" is "text" hierarchically hold the region data for their lines (line region data) as it was before the merging in step S607 of FIG. 6.

[0059] Image area separation is performed in this embodiment as described above. The region data shown in FIG. 7 is only one example of applying this embodiment; depending on the image processing apparatus, other information may be added or items may be removed as appropriate.
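The region data items listed for FIG. 7 map naturally onto a small record type. The sketch below is one possible layout, not the format used in the patent; every class and field name is hypothetical, and only the set of items (number, attribute, start and end coordinates, pixel count, text layout, and line data for text regions) comes from the description.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Tuple


class Attribute(Enum):
    """The eight region attributes named in the description."""
    ROOT = "root"
    TEXT = "text"
    TITLE = "title"
    TABLE = "table"
    NOISE = "noise"
    OUTER_FRAME = "outer frame"
    PHOTO_IMAGE = "photo image"
    LINE_IMAGE = "line image"


class Layout(Enum):
    """The three text layout values."""
    VERTICAL = "vertical"
    HORIZONTAL = "horizontal"
    UNKNOWN = "unknown"


@dataclass
class Region:
    number: int                      # detection order
    attribute: Attribute
    start: Tuple[int, int]           # (X, Y) of the region start in the original image
    end: Tuple[int, int]             # (X, Y) of the region end in the original image
    pixel_count: int                 # total number of pixels in the region
    layout: Layout = Layout.UNKNOWN
    # Only text regions keep the per-line regions from before the merge in S607.
    lines: List["Region"] = field(default_factory=list)


# Hypothetical text region holding one line region.
line = Region(2, Attribute.TEXT, (10, 10), (200, 30), 4000, Layout.HORIZONTAL)
block = Region(1, Attribute.TEXT, (10, 10), (200, 90), 16000, Layout.HORIZONTAL, [line])
```

Because the description allows items to be added or removed, a record with defaults for the optional fields keeps such changes local.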

[0060] <<LNR Processing>> Next, the LNR processing of step S504 in FIG. 2 is described in detail with reference to the flowchart of FIG. 8. LNR processing removes, from among the regions obtained by image area separation, those that result from image area separation errors.

[0061] First, in step S701 of FIG. 8, it is determined whether the region data obtained by image area separation is the root region. The root region is the region that encloses the entire image, that is, the whole-image region. If the region is the root region, the process proceeds to step S706 and no LNR processing is applied. If it is not the root region, the process proceeds to step S702, where it is determined whether the region is a text region (character region) or a noise region. If it is a text region or a noise region, the process proceeds to step S703; if it is neither, the process proceeds to step S705.

[0062] In step S703, LNR process 1 is performed, which removes region data as a region separation error according to the size of the region. Then, in step S704, LNR process 3 is performed, which removes region data as a region separation error according to the black ratio within the region. In step S705, on the other hand, LNR process 2 is performed, which removes region data that is neither a text region nor a noise region as a region separation error according to the size of the region. LNR processes 1, 3, and 2 of steps S703, S704, and S705 are each described in detail below.

[0063] In step S706, it is determined whether all regions have been processed. If not, the process returns to step S701; if so, the LNR processing ends.
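The control flow of FIG. 8 (steps S701 to S706) is a per-region dispatch on the attribute. The sketch below models only that dispatch; the three checks are passed in as callables that return True when a region should be removed, which is an assumption of the sketch, as are the function name, the dictionary representation of a region, and skipping LNR process 3 once LNR process 1 has already removed the region.

```python
def layout_noise_reduction(regions, process1, process2, process3):
    """Return the regions that survive LNR; each process returns True to remove."""
    kept = []
    for region in regions:                       # S706 loops over all regions
        attribute = region["attribute"]
        if attribute == "root":                  # S701: root is never filtered
            kept.append(region)
            continue
        if attribute in ("text", "noise"):       # S702
            remove = process1(region) or process3(region)   # S703, then S704
        else:
            remove = process2(region)            # S705
        if not remove:
            kept.append(region)
    return kept


# Hypothetical regions and stand-in checks based on a made-up "height" field.
regions = [
    {"attribute": "root", "height": 1000},
    {"attribute": "text", "height": 5},
    {"attribute": "text", "height": 40},
    {"attribute": "table", "height": 3},
]
survivors = layout_noise_reduction(
    regions,
    process1=lambda r: r["height"] < 10,   # stand-in for LNR process 1
    process2=lambda r: r["height"] < 10,   # stand-in for LNR process 2
    process3=lambda r: False,              # stand-in for LNR process 3
)
```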

[0064] LNR process 1 of step S703 is described in detail first.

[0065] FIG. 9 is a flowchart showing LNR process 1. First, in step S731, the height H1 and the width W1 are read from the region data of the region being processed. Then, in step S732, in order to calculate the height threshold HT1 and the width threshold WT1 used to judge the size of the region, the reading resolution SR of the image input device 2 such as a scanner and the point size MP1 of the smallest character in the image that is not to be removed are set, the latter as MP1h for height and MP1w for width.

[0066] In this embodiment, the thresholds HT1 and WT1 are calculated by equations (5) and (6) below.

[0067]
HT1 = (SR / 72.0) × MP1h ... (5)
WT1 = (SR / 72.0) × MP1w ... (6)

In step S733, the height threshold HT1 is calculated by equation (5). For example, when the resolution SR of the image input device 2 is 400 dpi and the height MP1h of the smallest character in the image is 4 points, the height threshold HT1 is calculated as "22". Then, in step S734, the height H1 of the area data is compared with the height threshold HT1 calculated in step S733. If the height H1 is larger than the threshold HT1, the process proceeds to step S735; if it is smaller than the threshold HT1, the process proceeds to step S738.
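As a worked illustration of equations (5) and (6), the following minimal sketch converts a point size into a pixel threshold. The function name `size_threshold` is hypothetical, and truncating the result to an integer is an assumption chosen because it reproduces the value "22" given in the example.

```python
def size_threshold(sr_dpi: float, min_points: float) -> int:
    """Equations (5)/(6): pixel threshold for a character of min_points points.

    One point is 1/72 inch, so SR / 72.0 is the number of pixels per point.
    Truncation to an integer is an assumption, not stated in the text.
    """
    return int((sr_dpi / 72.0) * min_points)


# Example from the text: SR = 400 dpi, MP1h = 4 points.
print(size_threshold(400, 4))  # 22  (400 / 72.0 * 4 = 22.2...)
```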

[0068] In step S735, the width threshold WT1 is calculated by equation (6). Subsequently, in step S736, the width W1 of the area data is compared with the width threshold WT1 calculated in step S735. If the width W1 is larger than the threshold WT1, LNR processing 1 ends. If the width W1 is smaller than the threshold WT1, the process proceeds to step S737, where the ratio H1/W1 of the height H1 to the width W1 of the area data is evaluated. If this ratio is "2" or less, LNR processing 1 ends. If the ratio exceeds "2", the area being processed is judged to be an area separation error, and the process proceeds to step S738, where the area is removed.
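The decision flow of steps S733 to S738 can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: the function name `lnr1_should_remove` is hypothetical, thresholds are truncated to integers, and the cases H1 = HT1 and W1 = WT1, which the text does not specify, are treated here as "not larger".

```python
def lnr1_should_remove(h1: int, w1: int, sr: float,
                       mp1h: float, mp1w: float,
                       max_ratio: float = 2.0) -> bool:
    """Return True if LNR processing 1 would remove the area."""
    if w1 <= 0:
        raise ValueError("area width must be positive")
    ht1 = int(sr / 72.0 * mp1h)      # S733, equation (5)
    if not h1 > ht1:                 # S734 (equality handling is an assumption)
        return True                  # S738: area is too short, remove it
    wt1 = int(sr / 72.0 * mp1w)      # S735, equation (6)
    if w1 > wt1:                     # S736: tall enough and wide enough
        return False
    return h1 / w1 > max_ratio       # S737: remove only tall, narrow areas


# With SR = 400 dpi and 4-point minimum characters, both thresholds are 22.
print(lnr1_should_remove(10, 100, 400, 4, 4))  # True  (height below threshold)
print(lnr1_should_remove(60, 20, 400, 4, 4))   # True  (narrow, ratio 3 > 2)
print(lnr1_should_remove(40, 20, 400, 4, 4))   # False (narrow, ratio 2)
```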

[0069] Next, LNR processing 3, shown in step S704 of FIG. 8, will be described in detail with reference to the flowchart of FIG. 10. First, in step S741, the number of black pixels BC in the area is counted. Then, in step S742, the black ratio BR of the area is calculated by equation (7) below.

[0070]
BR1 = BC / (W1 × H1) × 100 ... (7)

Next, in step S743, the minimum black ratio BRT1 and the maximum black ratio BRT2 are set. BRT1 and BRT2 are preset according to the black ratio characteristics of characters; for example, BRT1 = 5 and BRT2 = 52.

[0071] In step S744, the black ratio BR of the area is compared with the minimum black ratio BRT1 and the maximum black ratio BRT2. If the black ratio BR is smaller than the minimum black ratio BRT1 or larger than the maximum black ratio BRT2, the area being processed is judged to be an area separation error, and the process proceeds to step S745, where the area is removed. Otherwise, LNR processing 3 ends.
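The black ratio test of steps S741 to S745 can be sketched as below. The function names are hypothetical; the defaults BRT1 = 5 and BRT2 = 52 are the example values given in the text. The percentage is computed as 100 × BC / (W1 × H1), which is algebraically the same as equation (7).

```python
def black_ratio(bc: int, w1: int, h1: int) -> float:
    """Equation (7): percentage of black pixels in a W1 x H1 area."""
    return 100.0 * bc / (w1 * h1)


def lnr3_should_remove(bc: int, w1: int, h1: int,
                       brt1: float = 5.0, brt2: float = 52.0) -> bool:
    """Return True if the black ratio lies outside [BRT1, BRT2] (S744)."""
    br = black_ratio(bc, w1, h1)
    return br < brt1 or br > brt2


# A 10 x 10 area with 2, 30, and 60 black pixels respectively.
print(lnr3_should_remove(2, 10, 10))   # True  (2% is below BRT1)
print(lnr3_should_remove(30, 10, 10))  # False (30% is within range)
print(lnr3_should_remove(60, 10, 10))  # True  (60% is above BRT2)
```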

[0072] Next, LNR processing 2, shown in step S705 of FIG. 8, will be described in detail with reference to the flowchart of FIG. 11. First, in step S751, the height H2 and the width W2 are read from the area data of the area being processed. Then, in step S752, the resolution SR of the image input device 2 is set in order to calculate the height threshold HT2 and the width threshold WT2 used to judge the size of the area. In step S753, it is determined whether or not the attribute of the area being processed is an outer frame area. If it is an outer frame area, the process proceeds to step S754; if not, the process proceeds to step S757.

[0073] In step S754, the minimum point size MP21 is set as MP21h for the height and MP21w for the width. Similarly, in step S757, the minimum point size MP22 is set as MP22h for the height and MP22w for the width. Here, the minimum point sizes MP21 and MP22 express, as a character point size, the minimum size of an area that is not removed in LNR processing 2, depending on whether or not the area is an outer frame area.

[0074] In steps S755 and S758, the height thresholds HT21 and HT22 and the width thresholds WT21 and WT22 are calculated by equations (5) and (6) described above. For example, when the resolution SR of the image input device 2 is 400 dpi and the minimum point size MP22 is 4 points for both height and width, the thresholds HT22 and WT22 are each calculated as "22". Then, in steps S756 and S759, the height threshold HT2 and the width threshold WT2 are set.

[0075] Subsequently, in step S760, the height H2 of the area data is compared with the height threshold HT2 set in steps S756 and S759, and the width W2 of the area data is likewise compared with the width threshold WT2. If the height H2 is smaller than the threshold HT2, or the width W2 is smaller than the threshold WT2, the area being processed is judged to be an area separation error, and the area is removed in step S761. Otherwise, LNR processing 2 ends.
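Steps S753 to S761 can be sketched as follows. The function name `lnr2_should_remove` and the tuple parameters are hypothetical, and thresholds are truncated to integers as an assumption. The text gives an example value only for MP22 (4 points); the MP21 value of 8 points used in the usage lines is purely illustrative.

```python
def lnr2_should_remove(h2: int, w2: int, sr: float, is_outer_frame: bool,
                       mp21: tuple, mp22: tuple) -> bool:
    """Return True if LNR processing 2 would remove the area.

    mp21 and mp22 are (height, width) minimum point sizes for outer frame
    areas and for all other areas, respectively.
    """
    mp_h, mp_w = mp21 if is_outer_frame else mp22  # S753, S754 / S757
    ht2 = int(sr / 72.0 * mp_h)                    # S755 / S758, equation (5)
    wt2 = int(sr / 72.0 * mp_w)                    # S755 / S758, equation (6)
    return h2 < ht2 or w2 < wt2                    # S760, S761


# SR = 400 dpi; MP22 = 4 points (from the text), MP21 = 8 points (illustrative).
print(lnr2_should_remove(30, 30, 400, False, (8, 8), (4, 4)))  # False
print(lnr2_should_remove(20, 30, 400, False, (8, 8), (4, 4)))  # True
print(lnr2_should_remove(30, 30, 400, True, (8, 8), (4, 4)))   # True
```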

[0076] As described above, the LNR processing of this embodiment removes areas judged to be area separation errors by means of three types of processing.

[0077] As described above, according to this embodiment, the optimum threshold for classifying the luminance value of each pixel in an area of a multi-valued input image into two classes, background and object, can be determined automatically. The background and the object in the multi-valued image can therefore be separated appropriately, which makes highly detailed OCR processing possible.

[0078] <Other Embodiments> Although the image input in the first embodiment described above was explained as 8-bit multi-valued image data, the present invention need not be limited to this. The input may be, for example, a color image; that is, it suffices that the image information to be binarized consists of a plurality of bits.

[0079] The sampling of the image when calculating the histogram is not limited; every pixel may be sampled, or one pixel in every several. Furthermore, the calculation of the average AV, the statistic SK, and the like need not be performed with 8 bits, and may be performed with fewer bits in order to increase speed, reduce memory, and so on.

[0080] Although the convergence condition of the skew value SK, which is a statistic, was set to ±1.0, the present invention is not limited to this. It suffices that the binarization threshold is determined using the skew value SK.
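The iteration summarized here, and recited in the claims, can be sketched as follows. This is a minimal sketch under several assumptions rather than the embodiment's exact procedure: the skew SK is taken to be the third central moment divided by the cube of the standard deviation, the average is truncated to an integer when it becomes a new range bound, and an iteration cap is added as a safeguard. The function name `skew_threshold` and its parameters are hypothetical.

```python
def skew_threshold(hist, limit=1.0, max_iter=64):
    """Narrow [start, end] until |SK| <= limit, then return the average AV."""
    start, end = 0, len(hist) - 1
    av = (start + end) / 2.0
    for _ in range(max_iter):
        levels = range(start, end + 1)
        n = sum(hist[i] for i in levels)
        if n == 0:
            break
        av = sum(i * hist[i] for i in levels) / n          # average AV
        var = sum(hist[i] * (i - av) ** 2 for i in levels) / n
        if var == 0:
            break
        # Skew from the third (odd) power of the difference; the
        # normalization by var ** 1.5 is an assumption.
        sk = sum(hist[i] * (i - av) ** 3 for i in levels) / (n * var ** 1.5)
        if abs(sk) <= limit:                               # converged
            break
        # Positive skew moves the start point to AV, negative the end point.
        new_range = (int(av), end) if sk > 0 else (start, int(av))
        if new_range == (start, end):                      # no further progress
            break
        start, end = new_range
    return int(av)


# Two equal peaks give zero skew, so the threshold is the midpoint.
hist = [0] * 256
hist[50] = 100
hist[200] = 100
print(skew_threshold(hist))  # 125
```

Sampling the image every few pixels, as permitted above, would change only how `hist` is built, not this loop.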

[0081] The present invention may be applied to a system composed of a plurality of devices, such as an image scanner, a printer controller, and a printer, or to an apparatus consisting of a single device, such as a color copier. Furthermore, the present invention is not limited to providing hardware in the image processing apparatus as described above; needless to say, it is also applicable where it is achieved by supplying a system or an apparatus with a program stored on a medium such as a magnetic disk.

[0082]

[Effects of the Invention] As described above, according to the present invention, a luminance region containing a threshold suitable for separating the background and the object in an image is identified based on the luminance frequencies and their bias, and the average luminance value in the identified luminance region is used as the binarization threshold. The optimum threshold for separating the background and the object can therefore be set automatically, regardless of the type of input image.

[0083]

[Brief description of the drawings]

FIG. 1 is a block diagram showing the system configuration of an image processing apparatus according to an embodiment of the present invention.

FIG. 2 is a flowchart showing the image area separation OCR processing in the embodiment.

FIG. 3 is a flowchart showing the binarization processing in the embodiment.

FIG. 4 is a diagram showing an example of an image histogram in the embodiment.

FIG. 5 is a diagram showing an example of the transition of each variable value in the binarization processing of the embodiment.

FIG. 6 is a flowchart showing the image area separation processing in the embodiment.

FIG. 7 is a diagram showing an example of area data in the embodiment.

FIG. 8 is a flowchart showing the area removal (LNR) processing in the embodiment.

FIG. 9 is a flowchart showing area removal processing 1, based on the size of the area, in the embodiment.

FIG. 10 is a flowchart showing area removal processing 2, based on the black ratio, in the embodiment.

FIG. 11 is a flowchart showing area removal processing 3, based on the size of the area, in the embodiment.

[Explanation of symbols]

1 image processing apparatus
2 image input device
3 image display device
4 input unit
5 storage unit
6 luminance frequency accumulating unit
7 binarization threshold calculation unit
8 binarization unit
9 image area separation unit
10 character recognition unit
11 image processing unit
12 output unit

Continuation of the front page

(56) References: JP-A-61-140276 (JP, A); JP-A-5-191649 (JP, A); JP-A-8-223409 (JP, A)

(58) Field searched (Int. Cl.⁷, DB name): G06T 5/00 200

Claims

(57) [Claims]

1. An image processing method comprising: calculating a luminance frequency distribution of a multi-valued image; setting first and second luminance values as the start point and the end point, respectively, of a specific luminance region in the luminance frequency distribution; calculating an average luminance value and a distribution bias in the specific luminance region; when the distribution bias does not satisfy a predetermined condition, repeating, until the distribution bias satisfies the predetermined condition, a process of changing either the first luminance value or the second luminance value to newly set the specific luminance region and newly calculating the average luminance value and the distribution bias in the specific luminance region; and when the distribution bias satisfies the predetermined condition, setting the average luminance value in the specific luminance region as a binarization threshold for separating a background and an object in the multi-valued image.

2. The image processing method according to claim 1, further comprising binarizing the multi-valued image using the binarization threshold.

3. The image processing method according to claim 1, wherein the distribution bias is calculated based on a difference between a luminance value of each pixel in the specific luminance region and the average luminance value in the specific luminance region in the luminance frequency distribution.

4. The image processing method according to claim 3, wherein the distribution bias is calculated based on an odd power of a difference between a luminance value of each pixel in the specific luminance region and the average luminance value in the specific luminance region in the luminance frequency distribution.

5. The image processing method according to claim 1, wherein it is determined that the predetermined condition is satisfied if the distribution bias is within a predetermined range; if the distribution bias is outside the predetermined range and is positive, the first luminance value is changed to the average luminance value in the specific luminance region; and if the distribution bias is outside the predetermined range and is negative, the second luminance value is changed to the average luminance value in the specific luminance region.

6. An image processing apparatus comprising: calculating means for calculating a luminance frequency distribution of a multi-valued image; and threshold determining means for determining, based on the luminance frequency distribution, a binarization threshold for separating a background and an object in the multi-valued image, wherein the threshold determining means sets first and second luminance values as the start point and the end point, respectively, of a specific luminance region in the luminance frequency distribution; calculates an average luminance value and a distribution bias in the specific luminance region; when the distribution bias does not satisfy a predetermined condition, repeats, until the distribution bias satisfies the predetermined condition, a process of changing either the first luminance value or the second luminance value to newly set the specific luminance region and newly calculating the average luminance value and the distribution bias in the specific luminance region; and when the distribution bias satisfies the predetermined condition, sets the average luminance value in the specific luminance region as the binarization threshold.

7. The image processing apparatus according to claim 6, further comprising: input means for inputting the multi-valued image; and binarizing means for binarizing the multi-valued image using the binarization threshold.

8. The image processing apparatus according to claim 6, wherein the threshold determining means calculates the distribution bias based on a difference between a luminance value of each pixel in the specific luminance region and the average luminance value in the specific luminance region in the luminance frequency distribution.

9. The image processing apparatus according to claim 8, wherein the threshold determining means calculates the distribution bias based on an odd power of a difference between a luminance value of each pixel in the specific luminance region and the average luminance value in the specific luminance region in the luminance frequency distribution.

10. The image processing apparatus according to claim 6, wherein the threshold determining means determines that the predetermined condition is satisfied if the distribution bias is within a predetermined range; changes the first luminance value to the average luminance value in the specific luminance region if the distribution bias is outside the predetermined range and is positive; and changes the second luminance value to the average luminance value in the specific luminance region if the distribution bias is outside the predetermined range and is negative.