JP2001143076A

JP2001143076A - Image processor

Info

Publication number: JP2001143076A
Application number: JP32057399A
Authority: JP
Inventors: Kenji Ebiya; 賢治蛯谷
Original assignee: Fuji Xerox Co Ltd
Current assignee: Fujifilm Business Innovation Corp
Priority date: 1999-11-11
Filing date: 1999-11-11
Publication date: 2001-05-25

Abstract

PROBLEM TO BE SOLVED: To accurately distinguish character areas and photograph areas in an input image in a short time. SOLUTION: This image processor is provided with a binarization part 102 that divides the input image into foreground and background; an integration threshold setting part 103 that creates a histogram of background runs for every line (or column) from the result of the binarization part 102 and sets a first integration threshold for each line (or column); an integration processing part 104 that integrates the foreground for each line (or column) based on the first integration threshold; a labeling part 105 that attaches the same label to connected pixels of the foreground integrated by the integration processing part 104 while simultaneously constructing rectangles; and an attribute detecting part 106 that distinguishes characters (strings), graphics or photographs, tables, etc., using at least one feature value of the rectangles labeled by the labeling part 105.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical Field of the Invention] The present invention relates to an image processing apparatus for electronic devices such as OCR (optical character recognition) devices, copiers, and facsimile machines, and in particular to one that divides an input image into character regions and regions of figures, photographs, tables, and the like.

[0002]

[Prior Art] Conventional image processing apparatuses have used the following region division means: a run-length analysis method, which examines the run-length distribution over the entire image area and divides character regions, graphic regions, and the like according to the lengths of white runs and black runs; a spectrum analysis method, which analyzes the Fourier spectrum of the input image and divides it into various regions; and a projection analysis method, described in Japanese Unexamined Patent Publication No. S64-15889, which alternately and repeatedly takes projections (histograms) in the vertical and horizontal directions and divides regions using information from the peripheral portions.

[0003]

[Problems to Be Solved by the Invention] However, the conventional techniques described above have low division accuracy for images with complex region layouts. In addition, the spectrum analysis method spends a great deal of time on arithmetic processing and, because it operates on the pixels of the image, requires a large storage area.

[0004]

[Means for Solving the Problems] The present invention is an image processing apparatus made to solve these problems. That is, the present invention comprises: binarization means for separating an input image into foreground and background; integration threshold setting means for creating a histogram of background runs for each row (or column) from the result of the binarization means and setting a first integration threshold for each row (or column); integration processing means for integrating the foreground for each row (or column) based on the first integration threshold; labeling means for assigning the same label to connected pixels of the foreground integrated by the integration processing means while simultaneously constructing rectangles; and attribute detection means for distinguishing characters (strings), figures or photographs, tables, and the like using at least one feature amount of the rectangles labeled by the labeling means. The input image is divided into regions based on the detection result of the attribute detection means.

[0005] In the present invention, the input image is separated into foreground and background by binarization in the binarization means, and the foreground is integrated using the first integration threshold that the integration threshold setting means sets from this result. The labeling means then assigns the same label to connected pixels of the integrated foreground and constructs rectangles, and the attribute detection means detects an attribute from the feature amounts of each labeled rectangle. Based on this attribute, the input image can be divided into regions such as characters, figures, and photographs.

[0006]

[Embodiments of the Invention] An embodiment of the image processing apparatus of the present invention is described below with reference to the drawings. FIG. 1 is a schematic configuration diagram of the image processing apparatus of the present embodiment. The image processing apparatus of the present embodiment mainly comprises an image input unit 101, a binarization unit 102, an integration threshold setting unit 103, an integration processing unit 104, a labeling unit 105, and an attribute detection unit 106.

[0007] The image input unit 101 consists of, for example, a scanner, and is the part that inputs image data. The binarization unit 102 binarizes the image data input by the image input unit 101 and separates the foreground from the background. The integration threshold setting unit 103 creates a histogram of background runs for each row (or column) of the output of the binarization unit 102 and sets an integration threshold for each row (or column).

[0008] The integration processing unit 104 integrates the foreground for each row (or column) based on the output of the integration threshold setting unit 103. The labeling unit 105 labels connected black pixels in the output of the integration processing unit 104 and simultaneously encloses them in rectangles. For each circumscribed rectangle of the labeled foreground, the attribute detection unit 106 detects the attribute used to divide the image into regions such as characters (strings), figures or photographs, and tables, based on at least one feature amount (for example, width, height, aspect ratio, image density) and the corresponding information of the input image (signal level distribution, etc.).

[0009] Next, the processing procedure of the image processing apparatus of the present embodiment is described following the flowchart of FIG. 2. In the following description, reference numerals not shown in FIG. 2 refer to FIG. 1 unless otherwise noted.

[0010] First, an original image is input from the image input unit 101 (step S201). Next, the binarization unit 102 creates a density histogram of the image, sets a threshold from it using discriminant analysis or a similar method, and binarizes the image (step S202). In this binary image, ON (black) pixels are the foreground and OFF (white) pixels are the background.
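The binarization step can be sketched as follows. The discriminant analysis method is commonly known as Otsu's method; the sketch below assumes an 8-bit grayscale image stored as a list of rows in which darker pixels have lower values. The function names `otsu_threshold` and `binarize` are illustrative and do not come from the patent.

```python
def otsu_threshold(hist):
    """Return the gray level that maximizes the between-class variance."""
    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels at or below the candidate threshold
    sum0 = 0  # sum of gray levels at or below the candidate threshold
    for t, count in enumerate(hist):
        w0 += count
        sum0 += t * count
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        m0 = sum0 / w0
        m1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (m0 - m1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t


def binarize(gray, levels=256):
    """Return a binary image: 1 = ON (black, foreground), 0 = OFF (white)."""
    hist = [0] * levels
    for row in gray:
        for value in row:
            hist[value] += 1
    t = otsu_threshold(hist)
    # Assumption: dark pixels (value <= t) are the foreground.
    return [[1 if value <= t else 0 for value in row] for row in gray]
```

For example, `binarize([[250, 10, 250], [240, 20, 245]])` marks only the two dark pixels in the middle column as foreground.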

[0011] Next, the integration threshold setting unit 103 creates a histogram of white runs for each row (or column) of the binary image and sets an integration threshold from its distribution (step S203).
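A minimal sketch of the per-row white-run histogram, assuming each row is a list in which 0 is a white (background) pixel and 1 is a black (foreground) pixel. The helper names are illustrative.

```python
from collections import Counter


def white_runs(row):
    """Return the lengths of consecutive white (0) pixel runs, left to right."""
    runs, length = [], 0
    for pixel in row:
        if pixel == 0:
            length += 1
        elif length:
            runs.append(length)
            length = 0
    if length:
        runs.append(length)  # run that reaches the right edge
    return runs


def white_run_histogram(row):
    """Map each white run length to the number of times it occurs in the row."""
    return Counter(white_runs(row))
```

A row with no foreground at all yields a histogram whose total count is 1, which is the condition the description uses to recognize blank lines.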

[0012] The setting of this integration threshold is explained using FIG. 3 as an example. Five representative situations in the image shown in FIG. 3(a) are indicated by dotted lines. One dotted line has no foreground at all and contains a single white run spanning the width in the main scanning direction, giving the histogram distribution shown in FIG. 3(b). In this case the integration threshold is set to infinity (this merely indicates that no integration is performed; in practice, a value larger than the main scanning width is enough to ensure that no integration takes place), and the foreground on this line is not integrated. This case can be identified by checking whether the total count in the histogram is 1.

[0013] Another dotted line passes through a part where figures/photographs and characters are mixed. The histogram distribution on this line is as shown in FIG. 3(c). The histogram has a peak at the run length of the white runs between characters, and the white runs between paragraphs (between objects) are distributed at longer run lengths. To quantify this information, a peak detection threshold th1 and a distribution range detection threshold th2 are used. Let run length (1) be the run length detected with the peak detection threshold th1, which is expected to be the white run between characters. Let run length (2) be the shortest run length that is longer than run length (1) among the run lengths detected with the distribution range detection threshold th2, taken from the distribution that does not contain run length (1). The integration threshold is set to the midpoint between run length (1) and run length (2).

[0014] Another dotted line passes through a part where the foreground consists only of a figure/photograph. The histogram distribution on this line is as shown in FIG. 3(d). The histogram has a count of 1 only at each of the run lengths corresponding to the white runs that extend from the ends of the main scanning direction to the figure/photograph. In this case, as for the dotted line with no foreground, the integration threshold is set to infinity and the foreground on this line is not integrated.

[0015] This case is identified by checking whether the total count in the histogram is 2. In this example there is a single figure/photograph, but when there are several, integration is likewise skipped through the following steps. With multiple figures/photographs the total count in the histogram is not 2, but thresholding with the peak detection threshold th1 detects nothing, so it can be determined that no characters are present. Based on that result, the decision not to perform integration is made.

[0016] The remaining dotted lines are cases where the foreground consists only of characters. The histogram distributions in these cases are as shown in FIGS. 3(e) and 3(f). The distribution shape is the same as in FIG. 3(c), so the integration threshold can be set by following the same flow.
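The threshold-setting cases described above can be sketched as one function. This is an interpretation under stated assumptions, not the patented implementation: the histogram is a mapping from white run length to count; run length (1) is taken to be the detected run length with the highest count; "the distribution that does not contain run length (1)" is read as excluding the block of adjacent run lengths around run length (1) whose counts reach th2; and the fallback when no longer run is found is hypothetical, since the description does not cover that case.

```python
import math


def integration_threshold(hist, th1, th2):
    """Pick the integration threshold for one row from its white-run histogram."""
    total = sum(hist.values())
    # Total count 1: blank line. Total count 2: a single figure/photograph.
    # (A total of 0, a fully black line, is also treated as "no integration".)
    if total <= 2:
        return math.inf
    peaks = [r for r, c in hist.items() if c >= th1]
    if not peaks:
        return math.inf  # no inter-character peak: no characters on this line
    # Run length (1): assumed to be the most frequent detected run length.
    run1 = max(peaks, key=lambda r: (hist[r], -r))
    # Skip the adjacent run lengths that belong to the same distribution.
    end = run1
    while end + 1 in hist and hist[end + 1] >= th2:
        end += 1
    longer = sorted(r for r, c in hist.items() if c >= th2 and r > end)
    if not longer:
        return run1 + 1  # hypothetical fallback, not from the description
    run2 = longer[0]  # run length (2)
    return (run1 + run2) / 2
```

With `hist = {2: 6, 3: 1, 20: 1, 40: 1}`, `th1 = 3`, and `th2 = 1`, run length (1) is 2, run length (2) is 20, and the threshold is their midpoint.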

[0017] For a document image such as FIG. 3(a), results similar to those of the above embodiment can also be obtained with conventional methods: the projection analysis method; the method of assigning the same label to connected black pixels, obtaining feature amounts for each labeled circumscribed rectangle, and separating characters from figures/pictures (photographs) by those feature amounts; and the run-length analysis method, which examines the run-length distribution over the entire image area and divides character regions, graphic regions, and the like according to the lengths of white runs and black runs.

[0018] However, for a document image such as FIG. 4(a), the projection analysis method cannot separate the character region from the figure/picture (photograph) region, as shown in FIG. 5(b). This is because, in the main-scanning-direction histogram (black pixel count) and the sub-scanning-direction histogram (black pixel count) of the area indicated by the dotted line in FIG. 5(a), there is no boundary between the character portion and the figure/picture (photograph) portion (the regions overlap).

[0019] If they are separated by force, the figure/picture portion is broken into pieces, as in FIG. 5(c). The method of separating by the feature amounts of each labeled circumscribed rectangle gives the same result as the above embodiment even for a document image such as FIG. 4(a) (see FIG. 5(d)). However, labeling is then performed on every single character (for kanji and the like, on every individual component of a character), so the processing time becomes enormous (FIG. 4 shows fewer characters for the sake of illustration).

[0020] In addition, when a document image contains characters of a large font size and figures/pictures of the same size, separation is difficult with this method. The run-length analysis method, which examines the run-length distribution over the entire image area and divides character regions, graphic regions, and the like according to the lengths of white runs and black runs, gives the same results as the above embodiment for both FIG. 3 and FIG. 4. However, when characters of several font sizes are mixed in the same document image, it may split a character string of a large font size into individual characters, or merge two graphic regions that are actually distinct, because the run-length distribution is taken over the entire image area.

[0021] The present embodiment handles such situations by examining the histogram distribution locally. The histograms of the dotted lines in FIG. 4(a) (see FIGS. 4(b) to 4(f)) show almost the same tendency as the histograms of the dotted lines in FIG. 3(a) (see FIGS. 3(b) to 3(f)), so setting the threshold in the same way as for the image of FIG. 3 gives the same results.

[0022] Next, in the integration processing unit 104, when the run length of white pixels is shorter than the threshold set by the integration threshold setting unit, the black pixels on either side of that white run are integrated as the same object (step S204). Through this processing, characters are integrated into a single object as a character string.
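The integration step can be sketched for a single row as below, assuming 0 is white and 1 is black. Two interpretive choices are assumptions: only white runs with black pixels on both sides are filled, and a threshold of `math.inf` is treated as a sentinel for "do not integrate", following the description's remark that infinity merely signals that no integration is performed.

```python
import math


def integrate_row(row, threshold):
    """Fill white runs shorter than threshold that lie between black pixels."""
    if threshold == math.inf:
        return list(row)  # sentinel: no integration on this line
    out = list(row)
    n = len(row)
    i = 0
    while i < n:
        if row[i] != 0:
            i += 1
            continue
        start = i
        while i < n and row[i] == 0:
            i += 1
        bounded = start > 0 and i < n  # black pixel on both sides
        if bounded and (i - start) < threshold:
            for k in range(start, i):
                out[k] = 1
    return out
```

For example, with a threshold of 2.5 the row `[0, 1, 0, 1, 0, 0, 0, 0, 1, 0]` becomes `[0, 1, 1, 1, 0, 0, 0, 0, 1, 0]`: the one-pixel gap is filled and the four-pixel gap is kept.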

[0023] Next, the labeling unit 105 judges black pixels to be connected if there is a black pixel in the 8-neighborhood, and assigns the same label to connected black pixels (step S205). This labeling is explained using FIG. 6 as an example. First, each line is checked from left to right in the main scanning direction for an unlabeled black pixel. If none is found by the right end of the main scanning direction, the same processing is performed from the left end of the next line.

[0024] Label 1 is assigned to black pixel a, the first black pixel detected in FIG. 6(a). Next, starting from black pixel a, the 8-neighborhood of pixel a is checked for unlabeled black pixels. Black pixel b qualifies, so it is judged to be a connected black pixel and is given the same label 1 as pixel a.

[0025] Next, the 8-neighborhoods of pixels a and b are checked for unlabeled black pixels, and any that are found are given the same label as before. This operation is repeated until no unlabeled black pixel remains in the 8-neighborhood of any pixel with label 1.

[0026] Next, the check for unlabeled black pixels resumes from the pixel to the right of black pixel a. In FIG. 6(a), black pixel g is detected and label 2 is assigned to it. Then, as with black pixel a, the process of assigning label 2 is repeated until no unlabeled black pixel remains in the 8-neighborhood of any pixel with label 2. In the case of FIG. 6(a), label 1 is assigned to black pixels a through f and label 2 to black pixels g through l, as shown in FIG. 6(b).
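The labeling procedure above can be sketched as a raster scan that grows each label through 8-neighborhoods while tracking the circumscribed rectangle. The explicit stack and the `(x0, y0, x1, y1)` rectangle format are implementation choices for this sketch, not details from the description.

```python
def label_components(img):
    """Label 8-connected black (1) pixels and return (labels, boxes).

    boxes maps each label to its circumscribed rectangle (x0, y0, x1, y1).
    """
    h = len(img)
    w = len(img[0]) if h else 0
    labels = [[0] * w for _ in range(h)]
    boxes = {}
    next_label = 1
    for y in range(h):          # top to bottom
        for x in range(w):      # left to right in the main scanning direction
            if img[y][x] != 1 or labels[y][x]:
                continue
            labels[y][x] = next_label
            x0 = x1 = x
            y0 = y1 = y
            stack = [(y, x)]
            while stack:
                cy, cx = stack.pop()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and img[ny][nx] == 1 and not labels[ny][nx]):
                            labels[ny][nx] = next_label
                            stack.append((ny, nx))
            boxes[next_label] = (x0, y0, x1, y1)
            next_label += 1
    return labels, boxes
```

Diagonally touching pixels end up in the same component because all eight neighbors are examined.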

[0027] Finally, the attribute determination unit 106 obtains feature amounts for each circumscribed rectangle labeled by the labeling unit 105 and determines its attribute (step S206). The feature amounts used are the width, height, aspect ratio, and pixel density of the rectangle, and the information of the input image corresponding to the rectangular area.

[0028] Character strings (headings and body text can be roughly distinguished by the circumscribed rectangle width Wmin) can be distinguished from figures/photographs/tables by the aspect ratio of the circumscribed rectangle, as shown in FIG. 7. The condition differs between vertical and horizontal writing, but a rectangle can generally be judged to be a character string when its aspect ratio WHR satisfies WHR < 1/N or N < WHR, where N is a real number satisfying N > 1. However, this alone cannot distinguish them completely, so accuracy is improved by also using the area S and image density ID of the circumscribed rectangle, as shown in FIG. 8.

[0029] Regions are distinguished using the area S and image density ID of the circumscribed rectangle as follows: if the area is smaller than th2, the region is judged to be a character string (body text); if the area is larger than th3 and the image density is smaller than th1, a table; and if the area is larger than th3 and the image density is larger than th1, a figure/photograph.
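The aspect-ratio and area/density rules can be sketched as below. Several points are assumptions: WHR is computed as width divided by height, the thresholds are passed in under descriptive names (`density_th1`, `area_th2`, `area_th3`) because the description reuses th1 to th3 for different quantities, and rectangles whose area falls between th2 and th3 are reported as undetermined because the description does not say how they are handled.

```python
def rect_features(box, black_pixel_count):
    """Compute basic feature amounts of a circumscribed rectangle."""
    x0, y0, x1, y1 = box
    width, height = x1 - x0 + 1, y1 - y0 + 1
    area = width * height
    return {
        "width": width,
        "height": height,
        "whr": width / height,            # assumed definition of WHR
        "area": area,                     # S
        "density": black_pixel_count / area,  # ID
    }


def looks_like_string(whr, n):
    """Aspect-ratio test: WHR < 1/N or N < WHR, with N > 1."""
    return whr < 1 / n or whr > n


def classify_by_area_density(area, density, density_th1, area_th2, area_th3):
    """Area/density rule; the equality cases are not specified in the text."""
    if area < area_th2:
        return "string"
    if area > area_th3:
        return "table" if density < density_th1 else "figure/photo"
    return "undetermined"
```

A 100 by 10 rectangle with 300 black pixels has WHR 10 and density 0.3, so it passes the string test for N = 3.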

[0030] As described above, character strings, tables, and figures/photographs can be distinguished using the width, height, aspect ratio, and pixel density of the rectangle, but figures and photographs cannot be distinguished from each other. In the present embodiment, therefore, figures and photographs are distinguished using the information of the input image corresponding to the rectangular area (the number of colors in a three-dimensional space and the variability).

[0031] This exploits the fact that photographs generally have more colors than figures, and that the variability (degree of change) is also greater for photographs. The variability adopted in the present embodiment is the ratio, relative to the circumscribed rectangle area, of pixels for which the maximum difference between the pixel of interest and its 8 neighboring pixels exceeds a fixed value.
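The variability measure can be sketched for a single-channel region as below. The region is assumed to be the non-empty rectangular crop of the input image given as a list of rows, the difference is taken as an absolute difference, and neighbors outside the crop are ignored; none of these details are specified in the description.

```python
def variability(region, diff_limit):
    """Fraction of pixels whose largest 8-neighbor difference exceeds diff_limit."""
    h = len(region)
    w = len(region[0])
    hits = 0
    for y in range(h):
        for x in range(w):
            max_diff = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    ny, nx = y + dy, x + dx
                    if (dy or dx) and 0 <= ny < h and 0 <= nx < w:
                        diff = abs(region[y][x] - region[ny][nx])
                        max_diff = max(max_diff, diff)
            if max_diff > diff_limit:
                hits += 1
    return hits / (h * w)  # relative to the circumscribed rectangle area
```

A flat region scores 0, while a region with a single vertical edge scores only for the pixels adjacent to that edge, which matches the observation that figures have low variability away from their edges.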

[0032] For a figure, edge portions exceed the fixed value, but the remaining uniform areas rarely do. As a result, the variability is low. This is explained with FIG. 9 and FIG. 10. In FIG. 9, a region whose image density is lower than th1 is a table; a region whose image density is higher than th1 and whose number of colors is smaller than th2 is a figure; and a region whose image density is higher than th1 and whose number of colors is larger than th3 is a photograph.

[0033] In FIG. 10, a region whose number of colors is smaller than th2 is a figure, and a region whose number of colors is larger than th3 is a photograph. When the number of colors is larger than th2 and smaller than th3, the region is a figure if its variability is smaller than th1 and a photograph if its variability is larger than th1. The operations of step S206 are summarized in FIG. 11.
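The figure/photograph decision described for FIG. 10 can be sketched as below. The color counter quantizes each RGB channel before counting distinct cells, which is an assumption; the description only says that the number of colors in a three-dimensional space is used. The threshold parameter names are illustrative.

```python
def count_colors(pixels, step=32):
    """Count distinct RGB cells after quantizing each channel by step."""
    return len({(r // step, g // step, b // step) for r, g, b in pixels})


def figure_or_photo(num_colors, var, color_th2, color_th3, var_th1):
    """Few colors: figure. Many colors: photo. Otherwise decide by variability."""
    if num_colors < color_th2:
        return "figure"
    if num_colors > color_th3:
        return "photo"
    # Equality cases are not specified in the description.
    return "figure" if var < var_th1 else "photo"
```

For instance, with `color_th2 = 8`, `color_th3 = 64`, and `var_th1 = 0.3`, a region with 20 colors is a figure at variability 0.1 and a photograph at variability 0.5.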

[0034] The rectangle data (coordinate data, attribute codes, etc.) of the various regions obtained as described above are output externally together with the image data. This makes it possible to accurately divide the character regions and figure/photograph regions of the input image without taking much time.

[0035] Next, another example of the image processing apparatus of the present embodiment is described. In the integration processing of step S204 shown in FIG. 2, the example described above presupposes that the width between characters is smaller than the width between objects (the spacing between lines of text, the spacing between characters and figures/photographs, and so on). When this assumption does not hold, character strings and figures/photographs may be integrated with each other.

[0036] To deal with this, in the integration processing of step S204 shown in FIG. 2, when there is a white run shorter than the threshold set by the integration threshold setting unit, the run lengths of the black pixel runs on either side of that white run are obtained, and whether to integrate those black pixels as the same object is decided from their lengths and their ratio. This makes it possible to avoid integrating character strings with figures/photographs.

[0037] This is explained based on the schematic diagram of FIG. 13. When L2 and L4 in FIG. 13 are both shorter than the threshold set by the integration threshold setting unit, the earlier example integrates the character "あ" with the figure/photograph, and the subsequent labeling and attribute determination give an incorrect result. In this example, let M and N be the run lengths of the black pixel runs on either side of a white run shorter than the threshold set by the integration threshold setting unit. The black pixels on either side of the white run are integrated as the same object when M < th3, N < th3, and min(N, M)/max(N, M) > th4.

[0038] In FIG. 12, L1 < th3, L3 < th3, L5 > th3, min(L1, L3)/max(L1, L3) > th4, and min(L3, L5)/max(L3, L5) < th4, so L1, L2, and L3 are integrated, but no integration is performed for L3, L4, and L5.

[0039] Here, th3 is the third integration threshold and th4 is the fourth integration threshold. They can be fixed in advance, or set according to the white run in question (for example, th3 = target white run length × 0.9).
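The guarded integration of this alternative example can be sketched as below, assuming a row of 0 (white) and 1 (black) pixels and fixed values for th3 and th4. The black run lengths M and N are measured on the original row before any filling, which is an assumption, and the function names are illustrative.

```python
def should_merge(m, n, th3, th4):
    """Condition M < th3, N < th3 and min(N, M) / max(N, M) > th4."""
    return m < th3 and n < th3 and min(m, n) / max(m, n) > th4


def integrate_row_guarded(row, threshold, th3, th4):
    """Fill short white runs only when both neighboring black runs qualify."""
    runs = []  # each entry is [pixel value, start index, length]
    for i, pixel in enumerate(row):
        if runs and runs[-1][0] == pixel:
            runs[-1][2] += 1
        else:
            runs.append([pixel, i, 1])
    out = list(row)
    for k in range(1, len(runs) - 1):
        value, start, length = runs[k]
        if value != 0 or length >= threshold:
            continue
        m, n = runs[k - 1][2], runs[k + 1][2]  # black runs on either side
        if should_merge(m, n, th3, th4):
            for j in range(start, start + length):
                out[j] = 1
    return out
```

In the row used in the test below, two short black runs separated by a one-pixel gap are merged, while the gap next to a long black run (standing in for a figure/photograph) is left alone.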

[0040] In step S203 of the earlier example, a histogram is created and an integration threshold is set for each row (or column). If a histogram is also created in the direction orthogonal to the one used for that histogram, an integration threshold is set for that direction as well (the histogram changes with direction, so this threshold is called the second integration threshold), and integration is performed in both directions in step S204, the region division performance improves even for document images that mix vertical and horizontal writing.

[0041] In step S206, the number of colors in a three-dimensional space is used as a feature amount. If instead a histogram distribution of lightness (luminance) and a histogram of saturation (color difference) are obtained separately and the respective results are used, figures and photographs can each be further subdivided into gray and color regions.

[0042] In step S206, threshold processing is applied to each feature amount and the regions are divided according to the result. Character strings, tables, and figures/photographs each have distinct characteristics, so objective thresholds can be set for the feature amounts that separate them. The distinction between figures and photographs, however, does not rest on such distinct characteristics (there are many cases where whether something counts as a figure or a photograph depends on subjective judgment), so the threshold setting is somewhat less objective than for the thresholds that separate them from other objects (character strings/tables).

[0043] Therefore, instead of thresholding the number of colors and the variability directly, a multi-level score is produced for each value as shown in FIG. 13, the scores are combined by calculation, and threshold processing is applied to the resulting multi-level result (figure-likeness, photograph-likeness). This compensates for the lower objectivity of the threshold setting.
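One way to picture the multi-level scoring is the sketch below. Everything specific in it is hypothetical: the description does not give the curves of FIG. 13 or the combining calculation, so a linear ramp is used as the score curve and a plain average as the combination.

```python
def ramp(value, low, high):
    """Piecewise-linear score in [0, 1]: 0 at or below low, 1 at or above high."""
    if value <= low:
        return 0.0
    if value >= high:
        return 1.0
    return (value - low) / (high - low)


def photo_likeness(num_colors, var, color_low, color_high, var_low, var_high):
    """Average the two scores (assumed combination rule)."""
    color_score = ramp(num_colors, color_low, color_high)
    var_score = ramp(var, var_low, var_high)
    return 0.5 * (color_score + var_score)


def decide(likeness, cut=0.5):
    """Final threshold on the multi-level result; cut is a hypothetical value."""
    return "photo" if likeness > cut else "figure"
```

The intermediate `photo_likeness` value could also be passed on unthresholded when downstream processing can make use of a graded region signal.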

[0044] For example, depending on how the curves in FIG. 13 are chosen, FIG. 10 can end up as shown in FIG. 14. In the present embodiment, the multi-level result is finally thresholded to decide between figure and photograph, but depending on the processing that uses the region division result, the multi-level result itself can be used as the region signal.

[0045] The method of producing multi-level scores has been described here for the feature amounts used to distinguish figures from photographs. The same processing can be applied to the feature amounts used to distinguish character strings and tables, so that FIGS. 7, 8, and 9 can be transformed in the same way that FIG. 10 is transformed into FIG. 14.

[0046] In step S206, the feature amount variability is the ratio, relative to the circumscribed rectangle area, of pixels for which the maximum difference between the pixel of interest and its 8 neighboring pixels exceeds a fixed value, but a method using the number of extrema (peaks) in the histogram is also possible. Histograms of photographs often have a gently spreading distribution, whereas histograms of figures have many extrema. Setting a threshold on this basis gives the same region division as in the earlier example.

【００４７】[0047]

[Effects of the Invention] As described above, according to the present invention, black pixels are integrated using a local histogram, and labeling is then used to divide the image into regions such as character strings, tables, figures, and photographs. This improves the division accuracy for complicated region layouts without increasing the processing time.
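The labeling step, which assigns one label to each group of connected foreground pixels and forms the circumscribed rectangle at the same time, might look like the following sketch. It assumes 8-connectivity and a breadth-first flood fill, and represents the integrated image as a list of rows of 0 (background) and 1 (foreground). These choices and the function name are assumptions made for illustration, not the labeling method of the embodiment.

```python
from collections import deque


def label_with_rectangles(binary):
    """Label 8-connected foreground pixels and collect bounding rectangles.

    binary: list of rows, 1 = foreground, 0 = background.
    Returns (labels, rects) where rects[label] = (top, left, bottom, right).
    """
    height = len(binary)
    width = len(binary[0]) if height else 0
    labels = [[0] * width for _ in range(height)]
    rects = {}
    next_label = 1
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or labels[y][x] != 0:
                continue
            # Start a new component and grow it with a breadth-first search.
            labels[y][x] = next_label
            top, left, bottom, right = y, x, y, x
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                # Update the rectangle while the component is being labeled.
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] == 1
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
            rects[next_label] = (top, left, bottom, right)
            next_label += 1
    return labels, rects
```

Each rectangle returned in `rects` is the unit on which feature amounts such as size, number of colors, or variability would then be measured to decide its attribute.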

[Brief description of the drawings]

FIG. 1 is a schematic configuration diagram of the image processing apparatus according to the present embodiment.

FIG. 2 is a flowchart illustrating the processing procedure of the present embodiment.

FIG. 3 is a diagram illustrating the integration threshold.

FIG. 4 is a diagram illustrating an example of a document image.

FIG. 5 is a diagram illustrating an example of region division.

FIG. 6 is a diagram illustrating labeling.

FIG. 7 is a diagram (part 1) illustrating the distinction between a character string and a figure.

FIG. 8 is a diagram (part 2) illustrating the distinction between a character string and a figure.

FIG. 9 is a diagram (part 3) illustrating the distinction between a character string and a figure.

FIG. 10 is a diagram (part 4) illustrating the distinction between a character string and a figure.

FIG. 11 is a flowchart illustrating the flow of attribute detection.

FIG. 12 is a diagram illustrating another integration process.

FIG. 13 is a diagram showing the relationship between the multi-level result and the score.

FIG. 14 is a diagram showing the distinction between a photograph and a figure.

[Explanation of symbols]

101: image input unit, 102: binarization unit, 103: integration threshold setting unit, 104: integration processing unit, 105: labeling unit, 106: attribute detection unit

Claims

[Claims]

1. An image processing apparatus comprising: binarizing means for separating an input image into a foreground and a background; integration threshold setting means for creating, from the result of the binarizing means, a histogram of background runs at least for each row (or column), and for setting a first integration threshold for each row (or column); integration processing means for integrating the foreground for each row (or column) based on the first integration threshold; labeling means for assigning the same label to contiguous pixels of the foreground integrated by the integration processing means and simultaneously forming rectangles; and attribute detecting means for distinguishing attributes using at least one feature amount of the rectangles labeled by the labeling means, wherein the input image is divided into regions based on the attributes distinguished by the attribute detecting means.

2. The image processing apparatus according to claim 1, wherein the integration threshold setting means also creates a histogram in the direction orthogonal to the direction in which the histogram was created, and sets, for each column (or row), a second integration threshold separate from the first integration threshold.

3. The image processing apparatus according to claim 1, wherein one of the feature amounts of the rectangle is information on the input image corresponding to the rectangular region, and the information is distribution information of image signal levels.

4. The image processing apparatus according to claim 1, wherein the integration processing means integrates two adjacent foregrounds when the background run between the two adjacent foregrounds is smaller than the first integration threshold.

5. The image processing apparatus according to claim 1, wherein the integration processing means integrates two adjacent foregrounds when the background run between the two adjacent foregrounds is smaller than the first integration threshold, the run lengths of the two adjacent foregrounds are equal to or less than a predetermined third integration threshold, and the ratio of the run lengths of the two adjacent foregrounds is equal to or greater than a predetermined fourth integration threshold.

6. The image processing apparatus according to claim 1, wherein the input image is a color image signal, the signal binarized by the binarizing means is the luminance signal of the color image signal, and the information of the input image referred to by the attribute detecting means is information of the color image signal including the luminance signal.
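The integration conditions recited in claims 4 and 5 can be illustrated for a single row with the sketch below. Several points are assumptions made for illustration only: the row is a list of 0 (background) and 1 (foreground); the thresholds `t1`, `t3`, and `t4` are supplied by the caller rather than derived from a background-run histogram; the ratio of run lengths is taken as shorter over longer; and every gap is judged against the original, unmerged foreground runs in a single pass.

```python
def runs(row):
    """Return (value, start, length) for each run of equal values in a row."""
    out = []
    start = 0
    for i in range(1, len(row) + 1):
        if i == len(row) or row[i] != row[start]:
            out.append((row[start], start, i - start))
            start = i
    return out


def integrate_row(row, t1, t3=None, t4=None):
    """Fill background gaps between adjacent foreground runs in one row.

    t1: first integration threshold; a gap is filled only if shorter than t1.
    t3: optional upper limit on the length of both neighbouring foreground runs.
    t4: optional lower limit on the ratio shorter / longer of those runs.
    With t3 and t4 left as None only the claim 4 condition is applied.
    """
    result = list(row)
    r = runs(row)
    # Only interior runs can be gaps with a foreground run on each side.
    for k in range(1, len(r) - 1):
        value, start, gap = r[k]
        if value != 0:
            continue
        left_len, right_len = r[k - 1][2], r[k + 1][2]
        if gap >= t1:
            continue
        if t3 is not None and (left_len > t3 or right_len > t3):
            continue
        if t4 is not None:
            ratio = min(left_len, right_len) / max(left_len, right_len)
            if ratio < t4:
                continue
        result[start:start + gap] = [1] * gap
    return result
```

For example, with `t1=3` the row `[1, 1, 0, 0, 1, 1, 0, 0, 0, 0, 1]` has its two-pixel gap filled while its four-pixel gap is kept. Adding `t3` and `t4` prevents a short run from being merged into a much longer neighbour.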