JP3113785B2

JP3113785B2 - Image classification device

Info

Publication number: JP3113785B2
Application number: JP06325282A
Authority: JP
Inventors: 督士天野; 敦史小野
Original assignee: Sharp Corp
Current assignee: Sharp Corp
Priority date: 1994-12-27
Filing date: 1994-12-27
Publication date: 2000-12-04
Anticipated expiration: 2015-12-04
Also published as: JPH08185477A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

【Field of Industrial Application】 The present invention relates to an image classification device, and in particular to an image classification device that determines the attributes of an input image at the time of input to an image database and automatically classifies and registers the image.

[0002]

【Prior Art】 JP-A-4-316180 and JP-A-62-71379 disclose methods that take an image a human has already judged to be a document image, automatically divide its interior into regions, and determine the attribute of each region.

[0003] The former divides a binarized input image into regions and determines attributes, such as whether each region is a photograph region or a character region, from the number of black-white inversions and the proportion of black pixels in that region. The latter likewise divides a binarized input image into regions and determines such attributes from the run-length features and the black-pixel ratio of each region.

[0004]

【Problems to Be Solved by the Invention】 However, the prior art targets document images and only determines the attribute of each region obtained by dividing the interior of the image. No method or device for determining whether the input image itself is a document image has been disclosed so far.

[0005] Consequently, when an image is registered in a multimedia database that receives various kinds of image data, not only document images, a human has had to classify and judge the content of the input image and attach an appropriate keyword.

[0006] Moreover, when the techniques of JP-A-4-316180 and JP-A-62-71379 are applied to such a multimedia database, a human must first determine whether the input image is a document image before those techniques can be applied.

[0007] Furthermore, the prior art offers no device that classifies only business card images out of image data. When processing such as address book creation is performed on business card images using, for example, character recognition, a human has had to go to the trouble of picking the business card images out of various images and running character recognition on each of them.

[0008] The present invention has been made to solve the above problems. Its object is to provide an image classification device that can automatically determine whether the input image itself is a document image, classify it (attach a keyword) and register it in an image database without human effort, and automatically perform processing suited to the content of the image, for example address book creation using character recognition.

[0009]

【Means for Solving the Problems】 The image classification device according to claim 1 comprises: identification means for identifying, in input image data, connected components each consisting of a plurality of adjacent pixels representing an object; extraction means for extracting, from the shapes of the connected components in the input image data, a feature quantity representing character-likeness; and discrimination means for determining, based on the extracted feature quantity, whether the input image data is a document image. The input image data is grayscale image data, and the extraction means includes edge detection means for detecting data regions with strong edges in the input grayscale image data, and extracts the feature quantity from the area of the connected components and the area of the regions where the connected components overlap those data regions.

[0010]

[0011] The image classification device according to claim 2 comprises: identification means for identifying, in input image data, connected components each consisting of a plurality of adjacent pixels representing an object; extraction means for extracting, from the shapes of the connected components in the input image data, a feature quantity representing character-likeness; and discrimination means for determining, based on the extracted feature quantity, whether the input image data is a document image. The extraction means includes calculation means for calculating the area of the circumscribed rectangle of each connected component, and extracts the feature quantity from the area of the connected components and the area of the circumscribed rectangles.

[0012]

[0013]

[0014]

[0015]

[0016]

[0017] The image classification device according to claim 3 is the image classification device according to claim 1 or 2, wherein the identification means identifies, as a connected component, those pixels that satisfy a predetermined condition among the plurality of adjacent pixels representing an object in the input image data.

[0018]

【Operation】 The image classification device according to claim 1 identifies, in input image data, connected components each consisting of a plurality of adjacent pixels representing an object, extracts a feature quantity representing character-likeness from the shapes of the connected components in the input image data, and determines, based on the extracted feature quantity, whether the input image data is a document image. It also takes grayscale image data as input, detects data regions with strong edges in the input grayscale image data, and extracts the feature quantity from the area of the connected components and the area of the regions where the connected components overlap those data regions.

[0019]

[0020] The image classification device according to claim 2 calculates the area of the circumscribed rectangle of each connected component, and extracts the feature quantity from the area of the connected components and the area of the circumscribed rectangles.

[0021]

[0022]

[0023]

[0024]

[0025]

[0026] In addition to the operation of the image classification device according to claim 1 or 2, the image classification device according to claim 3 identifies, as a connected component, those pixels that satisfy a predetermined condition among the plurality of adjacent pixels representing an object in the input image data.

[0027]

【Embodiments】

(First Embodiment) FIG. 1 is a block diagram showing the system configuration of an image classification device according to a first embodiment of the present invention.

[0028] Referring to the figure, the image classification device broadly consists of an image input device 11 for inputting images, which includes a camera, a scanner and the like; an image processing device 12 that processes the input image to classify it (assign a keyword) as a document image or a non-document image; and an image database device 13 that stores the images classified by the image processing device 12.

[0029] The devices are connected by a bus 15. The image processing device 12 includes a control device 121, built around a CPU or the like, that processes and classifies the input image and controls the operation of the entire system; a ROM 122 that stores various programs, constants and the like; and a RAM 123 that stores images, processing data and the like.

[0030] The ROM 122 includes an image input processing unit 1221 used for image input; an image binarization processing unit 1222 used to binarize images; a region extraction processing unit 1223 that identifies regions in the binarized image data; a labeling processing unit 1224 that labels (assigns label numbers to) the connected components of data in the binarized image data; a character-likeness feature extraction processing unit 1225 that extracts character-likeness features from the binarized image data; an image classification processing unit 1226 that classifies images; and an image registration processing unit 1227 that classifies image data and registers it in the image database device 13.

[0031] The RAM 123 includes a multi-valued image memory 1231 that stores the input image data as 256-gradation image data; a binary image memory 1232 that stores image data represented by the two values "0" and "1"; a label image memory 1233 that stores the label number of each connected component of data; a fillet coordinate memory 1234 that stores the fillet coordinates and area of each connected component; a differential image memory 1235 that stores the differential image of the image data in the multi-valued image memory 1231; a differential binary image memory 1236 that stores image data obtained by binarizing the image data of the differential image memory 1235 with a threshold; and an AND image memory 1237 that stores the logical product image (AND image) of the image data in the binary image memory 1232 and the image data in the differential binary image memory 1236.

[0032] With this system configuration, the image processing device 12 examines the feature quantities of the various images input from the image input device 11, automatically classifies each image as a document image or a non-document image, and records it in the image database device 13.

[0033] For example, a document image is an image containing many characters, as shown in FIG. 4, and a non-document image is a landscape image or the like, as shown in FIG. 5.

[0034] FIGS. 2 and 3 are flowcharts showing the processing routine of the control device 121 of FIG. 1.

[0035] The flowcharts of FIGS. 2 and 3 are joined at the point marked A in the figures and together show one continuous sequence of processing.

[0036] In step S201, the control device 121 inputs image data through the image input device 11 and stores it in the multi-valued image memory 1231. As shown in FIG. 6, the image data stored in the multi-valued image memory 1231 consists of 512 pixels in the X direction and 432 pixels in the Y direction, and each pixel has a luminance value in 256 gradations (8 bits) from "0" (black pixel) to "255". The top-left pixel of the image data as drawn in the figure is taken as the origin (0, 0), with coordinates 0 to 511 assigned in the X direction and 0 to 431 in the Y direction.

[0037] The multi-valued image memory 1231 is organized as shown in FIG. 7. That is, it holds an 8-bit memory cell that stores the 256-gradation luminance value of each pixel identified by its X and Y coordinates. Starting with the luminance value of the pixel at the origin (0, 0), luminance values are stored sequentially in order of increasing X coordinate. The Y coordinate is then incremented by 1 and luminance values are again stored from X coordinate 0 in order of increasing X coordinate, until the luminance values of all pixels up to coordinates (511, 431) have been stored.
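The storage order described above is ordinary row-major order. The following is a minimal sketch of the mapping between coordinates and storage position, assuming the position is counted as a zero-based offset from the first stored pixel (the actual addresses of FIG. 7 are not given in the text; the function names are hypothetical).

```python
WIDTH, HEIGHT = 512, 432  # image size stated in the text


def pixel_offset(x, y, width=WIDTH):
    """Zero-based offset of pixel (x, y) when rows are stored one after another."""
    return y * width + x


def pixel_coords(offset, width=WIDTH):
    """Inverse mapping: offset back to (x, y)."""
    y, x = divmod(offset, width)
    return x, y
```

Under this assumption the origin sits at offset 0 and the last pixel (511, 431) at offset 512 × 432 − 1.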

[0038] In step S202, the control device 121 binarizes the whole of the image data with 256-gradation luminance values (multi-valued image data) stored in the multi-valued image memory 1231. The binarized image data (binary image data) is stored in the binary image memory 1232.

[0039] Specifically, the control device 121 scans the entire multi-valued image data in the memory 1231 and sets a luminance threshold for binarization; pixels with luminance at or above the threshold then become white pixels (data "1"), and pixels with luminance below the threshold become black pixels (data "0"). The threshold can be set by, for example, the methods described in Hideyuki Tamura (supervising ed.), Japan Industrial Technology Center (ed.), "Introduction to Computer Image Processing," Soken Shuppan, 1988, pp. 66-69 (discriminant analysis method, p-tile method, mode method, etc.). This embodiment uses the discriminant analysis method.

[0040] The discriminant analysis method supposes that a set of pixels with differing luminance values is divided by a threshold (th) into two classes (th or more, and less than th), and sets the threshold (th) so that the variance between the two classes is maximized.
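The discriminant analysis method can be sketched as an exhaustive search over candidate thresholds. The code below is an illustrative sketch, not the cited textbook's procedure: it assumes integer luminance values in 0 to 255 given as a flat list, uses the unnormalized between-class variance (which has the same maximizer), and keeps the lowest threshold on ties. The names `otsu_threshold` and `binarize` are hypothetical.

```python
def otsu_threshold(pixels, levels=256):
    """Return th that maximizes the between-class variance of {< th} and {>= th}."""
    hist = [0] * levels
    for value in pixels:
        hist[value] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_th, best_var = 0, -1.0
    count_low = 0  # number of pixels with value < th
    sum_low = 0    # sum of those values
    for th in range(1, levels):
        count_low += hist[th - 1]
        sum_low += (th - 1) * hist[th - 1]
        count_high = total - count_low
        if count_low == 0 or count_high == 0:
            continue  # one class is empty, so no split exists at this th
        mean_low = sum_low / count_low
        mean_high = (sum_all - sum_low) / count_high
        # Proportional to the between-class variance w0 * w1 * (m0 - m1)^2.
        between = count_low * count_high * (mean_low - mean_high) ** 2
        if between > best_var:
            best_th, best_var = th, between
    return best_th


def binarize(pixels, th):
    """White (1) at or above the threshold, black (0) below it."""
    return [1 if value >= th else 0 for value in pixels]
```

For an image with two well-separated luminance populations, any threshold between them gives the same split, so the tie-breaking rule only affects which representative value is returned.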

[0041] After the threshold has been determined, the control device 121 scans the multi-valued image data stored in the multi-valued image memory 1231 from the origin, reads the luminance value of each pixel, and compares it with the threshold that was set. If the luminance value is below the threshold, the pixel is stored in the binary image memory 1232 with luminance value "0" (black pixel); if the luminance value is at or above the threshold, the pixel is stored with luminance value "1" (white pixel).

[0042] The address structure of the binary image memory 1232 is the same as that of the multi-valued image memory 1231 shown in FIG. 7, except that 1-bit data, "0" or "1", is recorded as the luminance value.

[0043] In step S203, the control device 121 determines whether the number of white pixels in the binary image data of the binary image memory 1232 is greater than 1/3 of the total number of pixels. If YES, the "0" data and "1" data of the binary image data in the binary image memory 1232 are inverted in step S204.

[0044] If NO in step S203, the processing of step S204 is not performed.

[0045] Steps S203 and S204 are performed for the following reason. Image data is normally processed with black pixels as the background and white pixels as the object, so when image data consisting of black-pixel objects on a white-pixel background, such as that shown in FIG. 4, is input, the black and white of the image data must be inverted. Accordingly, when white pixels account for 1/3 or more of all pixels, the background is judged to be white pixels and the objects black pixels, and the white and black pixels are inverted.
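The polarity check of steps S203 and S204 can be sketched as follows. The text states the condition once as "greater than 1/3" and once as "1/3 or more"; the hypothetical `strict` flag below exposes both readings rather than choosing between them. The function name and the flat-list representation are assumptions for illustration.

```python
def normalize_polarity(binary, strict=True):
    """Invert a 0/1 image when white pixels are too numerous to be the objects.

    strict=True  : invert when white count >  total / 3 (wording of step S203)
    strict=False : invert when white count >= total / 3 (wording of the rationale)
    """
    white = sum(binary)
    total = len(binary)
    # Compare 3 * white with total to stay in integer arithmetic.
    too_white = white * 3 > total if strict else white * 3 >= total
    if too_white:
        return [1 - value for value in binary]
    return list(binary)
```

After this step, white pixels ("1") are treated as objects and black pixels ("0") as background in both cases.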

[0046] In step S205, the connected components of white pixels in the binary image data of the binary image memory 1232 are extracted, and each is labeled.

[0047] A connected component is a group of white pixels ("1" data) in the image data that are contiguous vertically, horizontally, or diagonally. For example, if the image data is the data shown in FIG. 8, four connected components 20a to 20d are extracted as groups in which "1" data is contiguous vertically, horizontally, or diagonally, and a label number identifying each connected component is set as shown in FIG. 9. In FIG. 9, the label numbers "1", "2", "3", and "4" are set for the connected components 20a, 20b, 20c, and 20d of FIG. 8, respectively. For black pixels, which have no label number, the data "0" is set in place of a label number.
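Labeling with vertical, horizontal, and diagonal adjacency corresponds to 8-connected component labeling. The sketch below uses a breadth-first flood fill and numbers components in raster-scan order of their first pixel; this is an illustrative choice and is not claimed to be the labeling method of the reference cited in the text. The image is assumed to be a list of rows of 0/1 values, and `label_components` is a hypothetical name.

```python
from collections import deque


def label_components(binary):
    """Label 8-connected groups of 1-pixels; background pixels keep label 0."""
    height, width = len(binary), len(binary[0])
    labels = [[0] * width for _ in range(height)]
    count = 0
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or labels[y][x] != 0:
                continue  # background, or already part of a labeled component
            count += 1
            labels[y][x] = count
            queue = deque([(x, y)])
            while queue:
                cx, cy = queue.popleft()
                # Visit the 8 neighbours (the centre is skipped: already labeled).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        nx, ny = cx + dx, cy + dy
                        if (0 <= nx < width and 0 <= ny < height
                                and binary[ny][nx] == 1
                                and labels[ny][nx] == 0):
                            labels[ny][nx] = count
                            queue.append((nx, ny))
    return labels, count
```

The returned label image has the same layout as the input, which mirrors the statement that the label image memory shares the address structure of the binary image memory.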

[0048] The label numbers are recorded in the label image memory 1233. The label image memory 1233 has the address structure shown in FIG. 10, which is the same as that of the binary image memory 1232. Each address has a 16-bit memory cell that stores the label number assigned to the pixel corresponding to that address. Each memory cell needs enough bits to store the largest label number; in this embodiment 16 bits are considered sufficient.

[0049] As the labeling method, the method described in, for example, Junichiro Toriwaki, "Digital Image Processing for Image Understanding (II)," Shokodo, 1988, pp. 45-46 can be used.

[0050] In step S206, the control device 121 extracts the fillet coordinates and area of each labeled connected component. The fillet coordinates are the upper-left and lower-right coordinates of the smallest rectangle circumscribing one connected component in the image. For a connected component such as that shown in FIG. 11, the upper-left fillet coordinate is (x_sp, y_sp) and the lower-right fillet coordinate is (x_ep, y_ep).

[0051] That is, among the X and Y coordinates of the pixels with the same label number, the pair of the largest X coordinate and the largest Y coordinate is the lower-right fillet coordinate, and the pair of the smallest X coordinate and the smallest Y coordinate is the upper-left fillet coordinate.

[0052] The area of a connected component is the number of pixels with the same label number, that is, the number of pixels in that connected component.
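The fillet coordinates and area defined above can be gathered in a single pass over a label image. This is a minimal sketch, assuming the label image is a list of rows with 0 for background; the names x_sp, y_sp, x_ep, y_ep follow the text, while `component_stats` and the dictionary layout are hypothetical.

```python
def component_stats(labels):
    """Map each label to [x_sp, y_sp, x_ep, y_ep, area]."""
    stats = {}
    for y, row in enumerate(labels):
        for x, label in enumerate(row):
            if label == 0:
                continue  # background pixel, no label number
            if label not in stats:
                stats[label] = [x, y, x, y, 0]
            entry = stats[label]
            entry[0] = min(entry[0], x)  # smallest X: upper-left X
            entry[1] = min(entry[1], y)  # smallest Y: upper-left Y
            entry[2] = max(entry[2], x)  # largest X: lower-right X
            entry[3] = max(entry[3], y)  # largest Y: lower-right Y
            entry[4] += 1                # area = number of pixels
    return stats
```

Each entry holds five values per label, matching the five fields the text describes for the fillet coordinate memory.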

[0053] For convenience of explanation, FIG. 11 shows the connected component in black and the background in white. The fillet coordinates and area of each connected component are stored in the fillet coordinate memory 1234, which is organized as shown in FIG. 12. That is, in label-number order, the X and Y coordinates of the upper-left fillet coordinate, the X and Y coordinates of the lower-right fillet coordinate, and the area of the connected component are recorded. Each coordinate and the area is stored in 16 bits, so each label has an 80-bit memory area.

[0054] In step S207, unnecessary connected components are removed. Unnecessary connected components are the following.

[0055]
(1) Minute components: the area of the connected component is 3 pixels or less.
(2) Giant components: the X-direction width of the rectangle circumscribing the connected component is greater than one-eighth of the X-direction width of the whole image data (512 pixels), or the Y-direction width of the circumscribing rectangle is greater than one-eighth of the Y-direction width of the whole image data (432 pixels).
(3) Elongated components: either the X-direction or the Y-direction width of the rectangle circumscribing the connected component is greater than four times the other.
These three kinds of connected component are considered not to be character information and are therefore deleted from the binary image memory 1232.

[0056] For example, suppose the binary image memory 1232 contains the six connected components labeled 1 to 6 shown in FIG. 13, and the fillet coordinate memory 1234 correspondingly holds the data shown in FIG. 14. In this case, the connected component of label 2 is too wide in the Y direction and is removed as a giant component as described above.

[0057] The connected component of label 4 has an area of 3 pixels or less and is removed as a minute component.

[0058] The connected component of label 6 has a width in the X-axis direction that is four times or more its width in the Y-axis direction and is removed as an elongated component.

[0059] As a result of removing these connected components, the connected components of labels 1, 3, and 5 remain in the binary image memory 1232, as shown in FIG. 15.
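The three removal rules of step S207 can be sketched as a single predicate on a component's fillet coordinates and area. This is an illustrative sketch: the text does not say how the width of the circumscribing rectangle is counted, so the inclusive pixel count (x_ep − x_sp + 1) used here is an assumption, and the elongation test follows the "greater than four times" wording of rule (3). The function name is hypothetical, and the data of FIG. 14 is not reproduced because it is not given in the text.

```python
def is_unnecessary(x_sp, y_sp, x_ep, y_ep, area, image_w=512, image_h=432):
    """True if a connected component should be removed in step S207."""
    box_w = x_ep - x_sp + 1  # assumed: inclusive width in pixels
    box_h = y_ep - y_sp + 1  # assumed: inclusive height in pixels
    minute = area <= 3
    giant = box_w > image_w / 8 or box_h > image_h / 8
    elongated = box_w > 4 * box_h or box_h > 4 * box_w
    return minute or giant or elongated
```

With the 512 × 432 image of this embodiment, the giant-component limits work out to 64 pixels in X and 54 pixels in Y.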

[0060] In step S208, the control device 121 applies spatial differentiation to the multi-valued image data in the multi-valued image memory 1231. The differentiated image (differential image) is recorded in the differential image memory 1235.

[0061] Differentiation of the image is performed to detect the parts of the image where the density is highly discontinuous.

[0062] Parts of the image where the density is highly discontinuous (where the density value changes greatly) are recognized as the boundaries (edge portions) of objects.

[0063] The differentiation is carried out as follows. Referring to FIG. 16, let (i, j) be the coordinates of the pixel to be differentiated in the multi-valued image memory 1231. The pixels used for the differentiation are the pixel at (i, j) together with the pixels adjoining it vertically, horizontally, and diagonally, a block of 3 × 3 pixels in total. The luminance value of each of these 3 × 3 pixels is multiplied by the corresponding coefficient shown in FIG. 17, and the sum of all nine products is denoted $\delta_x F(i, j)$.

[0064] Likewise, the luminance value of each of the 3 × 3 pixels is multiplied by the corresponding coefficient shown in FIG. 18, and the sum of all nine products is denoted $\delta_y F(i, j)$.

[0065] As the differential value, the value given by, for example, $|\delta_x F(i, j)| + |\delta_y F(i, j)|$ can be used.

[0066] The coefficients of FIGS. 17 and 18 are called the Sobel operator and are disclosed in Hideyuki Tamura (supervising ed.), Japan Industrial Technology Center (ed.), "Introduction to Computer Image Processing," Soken Shuppan, 1988, pp. 118-125. Other methods may also be used to obtain the differential value.

[0067] The pixels in the multi-valued image memory 1231 are scanned with the Sobel operator of FIGS. 17 and 18, and the differential value of each pixel is recorded in the differential image memory 1235.

[0068] The Sobel operator cannot be applied to the parts of the multi-valued image memory 1231 that lie on the border of the image data (where the X coordinate is 0 or 511, or the Y coordinate is 0 or 431), so data such as "0" is recorded there.
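The differentiation of step S208 can be sketched as a 3 × 3 weighted sum followed by the sum of absolute values. The coefficient tables of FIGS. 17 and 18 are not reproduced in the text, so the sketch assumes the commonly used Sobel kernels below, including which one is the X-direction kernel; the sign convention does not affect the result because absolute values are taken. `sobel_magnitude` is a hypothetical name and the image is assumed to be a list of rows.

```python
# Assumed coefficients (standard Sobel kernels); FIGS. 17 and 18 are not shown.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def sobel_magnitude(image):
    """Return |dxF| + |dyF| per pixel; border pixels are left at 0."""
    height, width = len(image), len(image[0])
    out = [[0] * width for _ in range(height)]
    for j in range(1, height - 1):
        for i in range(1, width - 1):
            dx = dy = 0
            for v in range(3):
                for u in range(3):
                    value = image[j + v - 1][i + u - 1]
                    dx += SOBEL_X[v][u] * value
                    dy += SOBEL_Y[v][u] * value
            out[j][i] = abs(dx) + abs(dy)
    return out
```

Note that with 8-bit input the result can exceed 255, so a later thresholding step has to accept that wider range.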

[0069] In step S209, the differential image recorded in the differential image memory 1235 is binarized with a threshold and recorded in the differential binary image memory 1236. The binarization is the same as the method described for step S202, so its description is omitted here.

[0070] As a result of this processing, the places in the input image where the luminance value changes greatly are recorded in the differential binary image memory 1236 as white pixels ("1" data).

[0071] In step S210, an AND operation is performed on the image data of the binary image memory 1232 and the image data of the differential binary image memory 1236, and the result is recorded in the AND image memory 1237. The AND operation gives "1" data only to pixels for which the data at the same coordinate position in both binary images is "1" (white pixel), and gives "0" data to all other pixels. As a result of this processing, the regions where the binary image data of the valid connected components overlaps the binary image data of the parts with large luminance change are recorded in the AND image memory 1237 as "1" data.
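The AND operation of step S210 is a pixel-wise logical product of two 0/1 images of the same size. A minimal sketch, assuming both images are lists of rows of 0/1 values (the name `and_image` is hypothetical):

```python
def and_image(binary, edges):
    """Pixel-wise AND of two equally sized 0/1 images."""
    return [[a & b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(binary, edges)]
```

Only pixels that are "1" in both the connected-component image and the edge image survive, which is the overlap region used in the next step.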

[0072] FIG. 19 shows one connected component of the document image of FIG. 4 as recorded in the binary image memory 1232.

[0073] For convenience of explanation, FIG. 19 shows white pixels as the background (the "0" data) and black pixels as the object (the "1" data). In the memory, however, white pixels are actually recorded as the object and black pixels as the background (that is, inverted relative to FIG. 19). The same applies through FIG. 24.

[0074] FIG. 20 shows the same connected component as FIG. 19 as recorded in the differential binary image memory 1236. As the figure shows, the boundary between the object and the background, where the luminance value changes greatly, is recorded as "1" data.

[0075] FIG. 21 illustrates the same connected component as FIG. 19 as recorded in the AND image memory 1237. The AND image memory 1237 holds the result of the AND operation on the image of FIG. 19 and the image of FIG. 20.

[0076] FIGS. 22 to 24 show one connected component of the landscape image of FIG. 5 as recorded in each of the memories, and correspond to FIGS. 19 to 21.

【００７７】風景画像と文書画像とを比較すると以下の
傾向がある。文書画像では２値画像メモリ１２３２に記
録される連結成分の面積に対するＡＮＤ画像メモリ１２
３７に記憶される当該連結成分の面積の比は比較的大き
く、風景画像では２値画像メモリ１２３２に記憶される
連結成分の面積に対するＡＮＤ画像メモリ１２３７に記
憶される当該連結成分の面積の比は比較的小さい。これ
は文字の連結成分はエッジ情報が多いことに基づくもの
であり、この差により入力画像が文書画像かそれ以外か
を判定することができる。Comparing landscape images with document images reveals the following tendency. For a document image, the ratio of the area of a connected component stored in the AND image memory 1237 to the area of the same connected component recorded in the binary image memory 1232 is relatively large; for a landscape image, this ratio is relatively small. This is because the connected components of characters carry a large amount of edge information, and the difference makes it possible to determine whether the input image is a document image or not.

【００７８】ステップＳ２１１において、制御装置１２
１はステップＳ２０７で求められた有効な連結成分の面
積全体に対するステップＳ２１０で求められたＡＮＤ画
像メモリ１２３７の面積全体の割合edge ratioを計算す
る。edge ratioは（１）式により計算される。In step S211, the control device 121 calculates the edge ratio, that is, the ratio of the total area recorded in the AND image memory 1237 obtained in step S210 to the total area of the effective connected components obtained in step S207. The edge ratio is calculated by equation (1).

【００７９】 edge ratio＝（Σedge area ／Σbinary area ）×１００ …（１）なお（１）式においてedge area は、ＡＮＤ画像メモリ
１２３７に記録されている各領域の面積であり、binary
area は２値画像メモリ１２３２に含まれている各領域
の面積である。Σはメモリ中のすべての面積を合計する
ことを示している。
edge ratio = (Σ edge area / Σ binary area) × 100 … (1)
In equation (1), edge area is the area of each region recorded in the AND image memory 1237, and binary area is the area of each region contained in the binary image memory 1232. Σ indicates summation over all regions in the memory.

【００８０】なお各々の領域でedge ratioを計算し、最
後にedge ratio全体の平均を取るなどしてもよい。It is also possible to calculate the edge ratio in each area and finally take the average of the entire edge ratio.

【００８１】ステップＳ２１２において、制御装置１２
１はステップＳ２１１で求められたedge ratioを設定さ
れているしきい値と比較し、edge ratioがしきい値以上
か判定する。In step S212, the control device 121 compares the edge ratio obtained in step S211 with the preset threshold value and determines whether the edge ratio is equal to or greater than the threshold value.

【００８２】ステップＳ２１２でＹＥＳであれば、ステ
ップＳ２１３において当該画像データは文書画像である
と判断される。If YES in step S212, it is determined in step S213 that the image data is a document image.

【００８３】ステップＳ２１２でＮＯであれば、ステッ
プＳ２１５において当該画像データは非文書であると判
定される。If NO in step S212, the image data is determined to be non-document in step S215.

【００８４】このしきい値はたとえば６０％という値が
用いられるが、ユーザの判断により変更させることもで
きる。The threshold value is, for example, 60%, but can be changed by the user.
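
As a rough sketch of steps S211 to S213 and S215, equation (1) and the threshold comparison might be written as below. The function names, the list-of-areas input, and the handling of an image with no effective connected components are assumptions made for illustration; only the formula, the "equal to or greater than" comparison, and the example threshold of 60% come from the text.

```python
def edge_ratio(edge_areas, binary_areas):
    """Equation (1): 100 * (sum of edge areas) / (sum of binary areas)."""
    total_binary = sum(binary_areas)
    if total_binary == 0:
        # Assumption: an image without effective components gets ratio 0.
        return 0.0
    return 100.0 * sum(edge_areas) / total_binary


def classify_by_edge_ratio(ratio, threshold=60.0):
    """Steps S212, S213, S215: document if ratio >= threshold."""
    return "document" if ratio >= threshold else "non-document"


# Hypothetical areas (in pixels) of two effective connected components.
ratio = edge_ratio(edge_areas=[30, 40], binary_areas=[40, 60])
print(ratio, classify_by_edge_ratio(ratio))
```

Passing a different `threshold` corresponds to the user changing the value as the text allows.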

【００８５】ステップＳ２１４において、画像は判定さ
れた文書または非文書というキーワード（属性）ととも
に画像データベース装置１３に記憶される。In step S214, the image is stored in the image database device 13 together with the keyword (attribute) "document" or "non-document" that was determined for it.

【００８６】図２５は本実施例における効果を説明する
ための図である。横軸はedge ratioを、縦軸はedge rat
ioの所定の範囲に分類されるサンプル画像の数を示す。
edge ratioは５％の範囲で区切られており、その範囲に
分類される文書、非文書のサンプル画像の数がヒストグ
ラムとして示されている。FIG. 25 is a diagram for explaining the effect of this embodiment. The horizontal axis shows the edge ratio, and the vertical axis shows the number of sample images falling within each range of the edge ratio. The edge ratio is divided into ranges of 5%, and the numbers of document and non-document sample images classified into each range are shown as a histogram.

【００８７】グラフを参照して、文書画像はedge ratio
が６０％より大きく、非文書画像はedge ratioが６０％
より小さいという傾向がある。これにより第３のステッ
プＳ２１２でのしきい値として６０％という値を用いる
ことで、文書画像と非文書画像の分類を行なうことが可
能であることがわかる。Referring to the graph, document images tend to have an edge ratio greater than 60%, and non-document images tend to have an edge ratio smaller than 60%. It can therefore be seen that document images and non-document images can be classified by using a value of 60% as the threshold value in step S212.

【００８８】以上のように本実施例では文字領域にはエ
ッジ情報が多いという特徴を利用して、有効な連結画像
中の微分２値画像（エッジ情報を表わす）の面積の割合
を調べることで、文字領域を多く含む文書画像と非文書
画像とを精度よく分類することができる。As described above, this embodiment exploits the fact that character regions contain a large amount of edge information. By examining the proportion of the area of the effective connected components that is occupied by the differential binary image (which represents edge information), document images, which contain many character regions, can be accurately distinguished from non-document images.

【００８９】したがってこのような画像分類装置は画像
データベースやファクシミリ装置の入力装置として用い
ることができる。たとえば画像データベースの入力装置
として用いた場合は、文書と非文書を自動分類すること
により、入力された画像に対し「文書」または「非文
書」というキーワードを人手を介することなく自動的に
付与することが可能になる。Therefore, such an image classification device can be used as an input device of an image database or a facsimile device. For example, when used as an input device of an image database, a document and a non-document are automatically classified, so that a keyword of “document” or “non-document” is automatically added to an input image without manual intervention. It becomes possible.

【００９０】また文書画像と分類された画像にはさらに
文字認識を適用することなどにより画像中の文書から自
動的にキーワードを抽出し、付与することが可能とな
る。Further, a keyword can be automatically extracted from a document in an image by applying character recognition to the image classified as a document image, and can be added.

【００９１】さらに本発明の画像分類装置を画像データ
ベースの入力時に用いることにより、ユーザは１つのシ
ステムに対し文書画像と非文書画像を意識することなく
入力でき、検索時には文書画像と非文書画像を個別に検
索することが可能となる。Further, by using the image classification device of the present invention at the time of input to an image database, a user can input images to a single system without being conscious of whether they are document images or non-document images, and at search time document images and non-document images can be searched separately.

【００９２】さらにファクシミリ装置の入力装置として
本発明を用いることによって、入力された画像を文書、
非文書に分類することが可能となり、たとえば印刷のモ
ードを自動選択することが可能となる。Further, by using the present invention as an input device of a facsimile machine, input images can be classified as documents or non-documents, which makes it possible, for example, to select the print mode automatically.

【００９３】さらに本発明に係る画像分類装置を、文書
と写真や図などが混合している画像を領域分割した後で
それぞれの領域に対して適用することにより、各領域の
属性判定に用いることが可能である。Further, by dividing an image in which text is mixed with photographs, figures, and the like into regions and then applying the image classification device according to the present invention to each region, the device can be used to determine the attribute of each region.

【００９４】（第２の実施例）図２６は本発明の第２の
実施例における画像分類装置の装置構成を示すブロック
図である。(Second Embodiment) FIG. 26 is a block diagram showing a device configuration of an image classification device according to a second embodiment of the present invention.

【００９５】図を参照して本ブロック図は、図１のブロ
ック図から微分２値画像メモリ１２３６とＡＮＤ画像メ
モリ１２３７とを除いたものである。Referring to the drawing, this block diagram is obtained by removing the differential binary image memory 1236 and the AND image memory 1237 from the block diagram of FIG.

【００９６】本実施例において行なわれる基本的処理は
第１の実施例と同じであるが、文書画像データと非文書
画像データとを分類する基準として、連結成分の外接矩
形の面積とその内部の有効な連結成分の面積との比を用
いることを特徴としている。The basic processing performed in this embodiment is the same as that of the first embodiment, but this embodiment is characterized in that the ratio between the area of the circumscribed rectangle of a connected component and the area of the effective connected component inside it is used as the criterion for classifying document image data and non-document image data.

【００９７】本実施例における画像分類装置における処
理は以下のように行なわれる。まず図２のステップＳ２
０１からＳ２０７までの処理が行なわれる。この処理は
第１の実施例で説明した処理と実質的に同一であるので
ここでの説明を繰返さない。The processing in the image classification device according to the present embodiment is performed as follows. First, the processing from step S201 to step S207 in FIG. 2 is performed. This processing is substantially the same as the processing described in the first embodiment, so its description is not repeated here.

【００９８】ステップＳ２０７の処理が行なわれた後、
図２７に示されるフローチャートの処理が行なわれる。After the processing of step S207 is performed,
The processing of the flowchart shown in FIG. 27 is performed.

【００９９】ステップＳ４０１において、制御装置１２
１は有効な連結成分の面積とその外接矩形の面積との割
合を計算する。有効な連結成分の面積はフィレ座標メモ
リ１２３４から読出され、外接矩形の面積はフィレ座標
メモリ１２３４のフィレ座標から求められる。In step S401, the control device 121 calculates the ratio between the area of the effective connected components and the area of their circumscribed rectangles. The area of each effective connected component is read from the fillet coordinate memory 1234, and the area of its circumscribed rectangle is obtained from the fillet coordinates stored in the fillet coordinate memory 1234.

【０１００】有効な連結成分の外接矩形の面積（filet
area）に対する有効な連結成分の面積（binary area ）
の割合（bin ratio ）は、（２）式により求められる。The ratio (bin ratio) of the area of the effective connected components (binary area) to the area of their circumscribed rectangles (filet area) is obtained by equation (2).

【０１０１】 bin ratio ＝（Σbinary area ／Σfilet area）×１００ …（２）（２）式においてΣは、画像全体の和であることを示
す。
bin ratio = (Σ binary area / Σ filet area) × 100 … (2)
In equation (2), Σ indicates summation over the entire image.

【０１０２】なお本実施例では有効な連結成分の面積の
合計と有効な連結成分の外接矩形の面積の合計との割合
をbin ratio として用いることとしたが、各々の有効な
連結成分ごとにその外接矩形の面積に対する有効な連結
成分の面積の割合を求めた後、全体の平均値を取るなど
の方法を用いてもよい。In this embodiment, the ratio of the total area of the effective connected components to the total area of the circumscribed rectangles of the effective connected components is used as the bin ratio. After obtaining the ratio of the area of the effective connected component to the area of the circumscribed rectangle, a method of taking the average value of the whole may be used.

【０１０３】ステップＳ４０２において、制御装置１２
１によりステップＳ４０１において求められたbin rati
o が予め記憶されている所定のしきい値未満であるか判
定される。所定のしきい値としてたとえば後述するよう
に５５％という値が用いられるが、しきい値はユーザの
判断により任意に設定できるようにしてもよい。そのと
きユーザによりしきい値を設定する処理ルーチンは、Ｒ
ＯＭ１２２に記録される。In step S402, the control device 121 determines whether the bin ratio obtained in step S401 is less than a predetermined threshold value stored in advance. A value of 55%, for example, is used as the predetermined threshold value, as described later, but the threshold value may also be made freely settable by the user. In that case, the processing routine that lets the user set the threshold value is recorded in the ROM 122.

【０１０４】ステップＳ４０２でＹＥＳであれば、ステ
ップＳ４０３において入力画像は文書画像であると判断
される。If YES in step S402, it is determined in step S403 that the input image is a document image.

【０１０５】ステップＳ４０２でＮＯであれば、ステッ
プＳ４０４において入力画像は非文書画像であると判断
される。If NO in step S402, it is determined in step S404 that the input image is a non-document image.
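
Steps S401 to S404 can be sketched as follows, covering both equation (2) and the per-component averaging that the text mentions as an alternative. This is an illustration only: the function names, the `(area, fillet)` tuple layout, and the assumption that fillet coordinates are inclusive pixel positions (hence the `+ 1` in the rectangle area) are not taken from the patent.

```python
def rect_area(fillet):
    """Area of the circumscribed rectangle (x_sp, y_sp, x_ep, y_ep).

    Assumption: both corners are inclusive pixel coordinates.
    """
    x_sp, y_sp, x_ep, y_ep = fillet
    return (x_ep - x_sp + 1) * (y_ep - y_sp + 1)


def bin_ratio(components):
    """Equation (2): 100 * (sum of binary areas) / (sum of rectangle areas)."""
    total_rect = sum(rect_area(f) for _, f in components)
    if total_rect == 0:
        return 0.0  # assumption: no effective components
    return 100.0 * sum(a for a, _ in components) / total_rect


def mean_bin_ratio(components):
    """Alternative from the text: average of the per-component ratios."""
    ratios = [100.0 * a / rect_area(f) for a, f in components]
    return sum(ratios) / len(ratios) if ratios else 0.0


def classify_by_bin_ratio(ratio, threshold=55.0):
    """Steps S402 to S404: document if ratio < threshold."""
    return "document" if ratio < threshold else "non-document"


# Hypothetical components: (binary area, fillet coordinates).
components = [(30, (0, 0, 9, 9)), (20, (10, 0, 19, 4))]
print(bin_ratio(components), mean_bin_ratio(components))
```

Note that the comparison direction is the opposite of the edge ratio test: a small bin ratio indicates a document.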

【０１０６】ステップＳ４０５において画像データは、
文書／非文書というキーワード（属性）とともに画像デ
ータベース装置１３に記録される。In step S405, the image data is recorded in the image database device 13 together with the keyword (attribute) "document" or "non-document".

【０１０７】図２８は文書画像における２値画像データ
の例、図２９は風景画像（非文書画像）における２値画
像データの例である。FIG. 28 shows an example of binary image data in a document image, and FIG. 29 shows an example of binary image data in a landscape image (non-document image).

【０１０８】図中、点線は各々の連結成分における外接
矩形を示す。また説明の便宜上、図では対象物を黒、背
景を白で示している。In the figure, dotted lines indicate circumscribed rectangles in each connected component. Also, for convenience of explanation, the object is shown in black and the background is shown in white.

【０１０９】図を参照して、文書画像は非文書画像より
も外接矩形の面積に対する有効な連結成分の面積の割合
が小さくなっている。これにより文書画像、非文書画像
を分類することができる。Referring to the figure, the ratio of the area of the effective connected component to the area of the circumscribed rectangle is smaller in the document image than in the non-document image. As a result, document images and non-document images can be classified.

【０１１０】図３０は文書画像、非文書画像の持つbin
ratio を説明するための図である。横軸はbin ratio を
５％刻みの範囲で区切ったものであり、縦軸は５％刻み
で区切られた各々のbin ratio の中に分類されるサンプ
ル画像の数を示す。FIG. 30 is a diagram for explaining the bin ratio of document images and non-document images. The horizontal axis shows the bin ratio divided into ranges of 5%, and the vertical axis shows the number of sample images classified into each 5% range.

【０１１１】図を参照して文書画像はbin ratio が５５
％未満であり、非文書画像はbin ratio が５５％以上で
ある。したがって図２７のステップＳ４０２でのしきい
値を５５％とし、しきい値未満を文書画像、しきい値以
上を非文書画像と判定することにより、文書画像と非文
書画像とを分類できることがわかる。Referring to the figure, document images have a bin ratio of less than 55%, and non-document images have a bin ratio of 55% or more. It can therefore be seen that document images and non-document images can be classified by setting the threshold value in step S402 of FIG. 27 to 55% and judging images below the threshold value to be document images and images at or above it to be non-document images.

【０１１２】以上のようにこの実施例では、線図形であ
るためその外接矩形中の連結成分の割合が小さいという
文字領域の特徴を利用し、外接矩形の面積に対する有効
な連結成分の面積の割合を用いることで文字領域が多く
並んでいるという文書画像の特徴をうまく抽出し、精度
よく分類することができる。As described above, this embodiment exploits a characteristic of character regions, namely that characters are line figures and therefore occupy only a small proportion of their circumscribed rectangles. By using the ratio of the area of the effective connected components to the area of their circumscribed rectangles, the feature of document images that many character regions are lined up can be extracted well, and images can be classified with high accuracy.

【０１１３】また第１および第２の実施例で用いたedge
ratioやbin ratio を含めた複数の文字らしさを示す特
徴量を組合せて判定することによりさらに精度よく画像
データを分類することが可能である。Image data can be classified still more accurately by basing the judgment on a combination of several feature values that indicate character-likeness, including the edge ratio and bin ratio used in the first and second embodiments.

【０１１４】これはたとえば１つの画像データにおいて
edge ratioによる文書画像らしさと、bin ratio による
文書画像らしさの各々により得点を付け、その合計得点
により画像データの分類を行なう方法や、一方の特徴量
において、しきい値の近くに判定された画像データにつ
いては他方の特徴量を用いて分類するなどの方法で実現
することが可能である。This can be realized, for example, by assigning one image a score for its document-likeness according to the edge ratio and another for its document-likeness according to the bin ratio and classifying the image by the total score, or by classifying image data that one feature value places close to its threshold value by means of the other feature value.
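
One possible reading of the second method (deferring to the other feature value when the first lies near its threshold) is sketched below. The margin of 5 points, the choice of bin ratio as the primary feature, and the function name are all assumptions; the text fixes none of them.

```python
def classify_combined(edge_ratio, bin_ratio,
                      edge_threshold=60.0, bin_threshold=55.0, margin=5.0):
    """Use bin ratio first; fall back to edge ratio near the threshold.

    `margin` is a hypothetical parameter defining "close to the threshold".
    """
    if abs(bin_ratio - bin_threshold) > margin:
        # Bin ratio is decisive: small values indicate a document.
        return "document" if bin_ratio < bin_threshold else "non-document"
    # Bin ratio is ambiguous: decide by edge ratio instead.
    return "document" if edge_ratio >= edge_threshold else "non-document"


print(classify_combined(edge_ratio=80.0, bin_ratio=57.0))
```

In the call above the bin ratio alone would suggest a non-document, but because it lies within the margin the edge ratio decides the result.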

【０１１５】（第３の実施例）図３１は本発明の第３の
実施例における画像分類装置の処理を説明するための図
である。(Third Embodiment) FIG. 31 is a diagram for explaining the processing of the image classification device according to the third embodiment of the present invention.

【０１１６】本発明における画像分類装置の装置構成は
第１の実施例と同じであるので説明を省略する。The configuration of the image classification device according to the present invention is the same as that of the first embodiment, and a description thereof will be omitted.

【０１１７】本実施例における画像分類装置において行
なわれる処理は以下のとおりである。The processing performed in the image classification device according to the present embodiment is as follows.

【０１１８】図２のステップＳ２０１からＳ２０７の処
理が実行された後、図３１の処理が行なわれる。After the processing of steps S201 to S207 of FIG. 2 is performed, the processing of FIG. 31 is performed.

【０１１９】すなわちステップＳ５０１において、有効
な連結成分の文字らしさを示す特徴量が抽出される。特
徴量は画像中の有効な連結成分全体あるいは各々の有効
な連結成分について抽出される。また特徴量は画像全体
で１つの値を持つデータである。文字らしさを表わす特
徴量は、多値画像メモリ１２３１、２値画像メモリ１２
３２、ラベル画像メモリ１２３３、フィレ座標メモリ１
２３４の内容により求められる。That is, in step S501, a feature value indicating the character-likeness of the effective connected components is extracted. The feature value is extracted either for the effective connected components in the image as a whole or for each effective connected component, and it is data that takes a single value for the entire image. The feature value representing character-likeness is obtained from the contents of the multi-valued image memory 1231, the binary image memory 1232, the label image memory 1233, and the fillet coordinate memory 1234.

【０１２０】ステップＳ５０２において、処理装置は抽
出された特徴量と所定のしきい値とを比較する。所定の
しきい値はＲＯＭに記録するようにしてもよいし、外部
からユーザが指定できるようにしてもよい。また異なる
２以上の方法で求めた２以上の特徴量を比較に用いるよ
うにしてもよい。In step S502, the processing device compares the extracted feature value with a predetermined threshold value. The predetermined threshold value may be recorded in the ROM, or may be externally specified by the user. Further, two or more feature amounts obtained by two or more different methods may be used for comparison.

【０１２１】ステップＳ５０３において、比較された特
徴量と所定のしきい値とから特徴量が文字らしさを示す
条件を満たしているかが判定され、ＹＥＳであればステ
ップＳ５０４において、入力画像は文書と判断される。In step S503, it is determined from the compared characteristic amount and a predetermined threshold value whether the characteristic amount satisfies the condition indicating character-likeness. If YES, the input image is determined to be a document in step S504. Is done.

【０１２２】ステップＳ５０３においてＮＯであれば、
ステップＳ５０６において入力画像は非文書と判断され
る。If NO in step S503, the input image is determined to be a non-document in step S506.

【０１２３】ステップＳ５０５において、入力画像は画
像データベースに記録されるがそのとき文書画像である
と判断された画像には、「文書画像」というキーワード
（属性）が、非文書であると判断された画像には「非文
書画像」というキーワードが併わせて記録される。In step S505, the input image is recorded in the image database. At that time, the keyword (attribute) "document image" is recorded together with an image determined to be a document image, and the keyword "non-document image" is recorded together with an image determined to be a non-document.

【０１２４】これにより画像データベースから画像デー
タを検索するときにはキーワードを利用し、効率のよい
検索を行なうことができる。Thus, when searching for image data from the image database, an efficient search can be performed using the keyword.

【０１２５】（第４の実施例）図３２は本発明の第４の
実施例における画像分類装置のシステム構成を示すブロ
ック図である。(Fourth Embodiment) FIG. 32 is a block diagram showing a system configuration of an image classification device according to a fourth embodiment of the present invention.

【０１２６】図を参照して画像分類装置は、大きくはカ
メラはスキャナなどを含む画像入力のための画像入力装
置５１と、入力された画像を処理することにより画像を
名刺画像、それ以外の画像に分類（キーワード付け）す
る画像処理装置５２と、画像処理装置５２により分類さ
れた画像を記憶する画像データベース装置５３とから構
成される。各々の装置はバス１５により接続されてい
る。画像処理装置５２は、入力された画像を処理、分類
し、かつシステム全体の動作を制御するためのＣＰＵな
どで構成された制御装置５２１と、各種プログラマブル
や定数などを記憶するＲＯＭ５２２と、画像や処理デー
タなどを記憶するＲＡＭ５２３とを含む。Referring to the figure, the image classification device broadly comprises an image input device 51 for inputting images, which includes a camera, a scanner, or the like; an image processing device 52 that processes the input image and thereby classifies it (assigns a keyword) as a business card image or another kind of image; and an image database device 53 that stores the images classified by the image processing device 52. The devices are connected by a bus 15. The image processing device 52 includes a control device 521 composed of a CPU and the like for processing and classifying input images and controlling the operation of the entire system, a ROM 522 for storing various programs, constants, and the like, and a RAM 523 for storing images, processing data, and the like.

【０１２７】ＲＯＭ５２２は、画像入力の処理のために
用いられる画像処理部５２２１と、画像を２値化するた
めに用いられる画像２値化処理部５２２２と、２値化さ
れた画像データの中の領域を識別するために用いられる
領域抽出処理部５２２３と、２値化された画像データの
中のデータの連結成分にラベリング（ラベル番号を付す
処理）を行なうラベリング処理部５２２４と、画像デー
タの領域の並び方の特徴を抽出する領域の並び方の特徴
抽出処理部５２２５と、画像の分類を行なう画像分類処
理部５２２６と、画像データベース装置５３に画像デー
タの登録を行なう画像登録処理部５２２７とを含む。The ROM 522 includes an image processing unit 5221 used for image input processing, an image binarization processing unit 5222 used for binarizing images, a region extraction processing unit 5223 used for identifying regions in the binarized image data, a labeling processing unit 5224 that labels (assigns label numbers to) the connected components of the data in the binarized image data, a region arrangement feature extraction processing unit 5225 that extracts features of how the regions of the image data are arranged, an image classification processing unit 5226 that classifies images, and an image registration processing unit 5227 that registers image data in the image database device 53.

【０１２８】ＲＡＭ５２３は、入力された画像データを
２５６階調の画像データとして記憶する多値画像メモリ
５２３１と、“０”および“１”の２値で表わされる画
像データを記憶する２値画像メモリ５２３２と、データ
の連結成分の各々のラベル番号を記憶するラベル画像メ
モリ５２３３と、各々の連結成分のフィレ座標および面
積を記憶するフィレ座標メモリ５２３４と、有効な連結
成分の重心点の座標を記憶する重心メモリ５２３５と、
重心点と他の重心点との方向頻度を記憶する方向頻度ヒ
ストグラムメモリ５２３６と、多値画像メモリ１２３１
の画像データの微分画像を記憶する微分画像メモリ５２
３７と、微分画像の濃度をＸ，Ｙ方向に投影した濃度の
ヒストグラムを記憶する濃度投影ヒストグラムメモリ５
２３８と、有効な連結成分数を記憶する有効な連結成分
数メモリ５２３９と、ヒストグラムの微分値を記憶する
ヒストグラム微分値メモリ５２４０とを含む。The RAM 523 includes a multi-valued image memory 5231 that stores input image data as 256-gradation image data, a binary image memory 5232 that stores image data represented by the two values "0" and "1", a label image memory 5233 that stores the label number of each connected component of the data, a fillet coordinate memory 5234 that stores the fillet coordinates and area of each connected component, a center-of-gravity memory 5235 that stores the coordinates of the center of gravity of each effective connected component, a direction frequency histogram memory 5236 that stores the frequency of directions from each center of gravity to other centers of gravity, a differential image memory 5237 that stores the differential image of the image data in the multi-valued image memory 1231, a density projection histogram memory 5238 that stores histograms of the density of the differential image projected in the X and Y directions, an effective connected component count memory 5239 that stores the number of effective connected components, and a histogram differential value memory 5240 that stores the differential values of the histograms.

【０１２９】このシステム構成により、画像入力装置５
１から入力されたさまざまな画像は画像処理装置５２に
おいて画像の持つ特徴量が調べられることにより自動的
に名刺画像か否かに分類されて、画像データベース装置
５３に記憶されることになる。With this system configuration, the various images input from the image input device 51 are automatically classified as business card images or not by examining their feature values in the image processing device 52, and are then stored in the image database device 53.

【０１３０】本実施例での特徴量を定める基準として、
名刺画像は文字の数が少なく行間が広く開いているとい
う特徴が利用される。The feature values in this embodiment are based on the characteristic that business card images contain few characters and have wide line spacing.

【０１３１】図３３から図３５は本実施例における制御
装置５２１の処理を示すフローチャートである。FIGS. 33 to 35 are flowcharts showing the processing of the control device 521 in this embodiment.

【０１３２】フローチャートは図中ＡまたはＢの部分で
連結し一連の処理を示す。ステップＳ６０１からＳ６０
７での処理は図２のステップＳ２０１からＳ２０７での
処理と実質的に同一であるのでここでの説明は繰返さな
い。The flowcharts are joined at the points marked A and B in the figures and together show one continuous series of processing. The processing in steps S601 to S607 is substantially the same as the processing in steps S201 to S207 of FIG. 2, so its description is not repeated here.

【０１３３】ステップＳ６０８において、制御装置５２
１は２値画像メモリ５２３２中に含まれている有効な連
結成分の数を計測し、その数を有効な連結成分メモリ５
２３９に記録する。名刺画像は文字数が少ないため、有
効な連結成分が少ないという特徴を有する。そのため名
刺画像を判定する視標の１つとして、有効な連結成分の
数を計測するのである。In step S608, the control device 521 counts the effective connected components contained in the binary image memory 5232 and records the count in the effective connected component count memory 5239. Because a business card image contains few characters, it has few effective connected components. The number of effective connected components is therefore counted as one of the indicators for identifying a business card image.

【０１３４】ステップＳ６０９において、有効な連結成
分それぞれの重心点が判定され、重心メモリ５２３５に
記録される。In step S 609, the center of gravity of each valid connected component is determined and recorded in the center of gravity memory 5235.

【０１３５】重心座標はフィレ座標メモリ５２３４に記
録されている各々の有効な連結成分についてのフイレ座
標について求められる。The center-of-gravity coordinates are obtained from the fillet coordinates of each effective connected component recorded in the fillet coordinate memory 5234.

【０１３６】有効な連結成分の各々の左上のフィレ座標
を（ｘ＿ｓｐ［ｉ］，ｙ＿ｓｐ［ｉ］）、右下のフィレ
座標を（ｘ＿ｅｐ［ｉ］，ｙ＿ｅｐ［ｉ］）とすると、
重心座標ｇｒａｖ［ｉ］は（３）式により求められる。If the upper-left fillet coordinates of each effective connected component are (x_sp[i], y_sp[i]) and the lower-right fillet coordinates are (x_ep[i], y_ep[i]), the center-of-gravity coordinates grav[i] are obtained by equation (3).

【０１３７】ｇｒａｖ［ｉ］＝｛（ｘ＿ｅｐ［ｉ］−ｘ＿ｓｐ［ｉ］）／２，（ｙ＿ｅｐ［ｉ］−ｙ＿ｓｐ［ｉ］）／２｝ …（３）重心メモリ５２３５には、図３６に示されるように有効
な連結成分のラベル番号順にそのラベルの重心点のＸ座
標，Ｙ座標が記録される。
grav[i] = {(x_ep[i] − x_sp[i]) / 2, (y_ep[i] − y_sp[i]) / 2} … (3)
As shown in FIG. 36, the center-of-gravity memory 5235 records the X and Y coordinates of the center of gravity of each label, in order of the label numbers of the effective connected components.
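
Equation (3) can be evaluated as in the following sketch. Note that, taken literally, it yields half the width and half the height of the circumscribed rectangle; the second helper, which adds the upper-left corner to obtain a position in image coordinates, is an assumption that the text does not state. The function names, the dictionary layout, and the sample values are hypothetical.

```python
def grav(x_sp, y_sp, x_ep, y_ep):
    """Equation (3) exactly as written in the text."""
    return ((x_ep - x_sp) / 2, (y_ep - y_sp) / 2)


def grav_in_image(x_sp, y_sp, x_ep, y_ep):
    """Assumption: shift the result of equation (3) by the upper-left corner."""
    gx, gy = grav(x_sp, y_sp, x_ep, y_ep)
    return (x_sp + gx, y_sp + gy)


# Fillet coordinates keyed by label number; labels need not be consecutive.
fillets = {1: (10, 20, 30, 60), 4: (0, 0, 8, 2)}

# Recorded in label-number order, mirroring the center-of-gravity memory.
grav_memory = {label: grav(*f) for label, f in sorted(fillets.items())}
print(grav_memory)
```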

【０１３８】ステップＳ６０７において除去された連結
成分に対する重心点は記録されないため、ラベル番号は
必ずしも連続でない。Since the center of gravity of the connected component removed in step S607 is not recorded, the label numbers are not always continuous.

【０１３９】また重心点のＸ，Ｙ座標を記録するメモリ
は、各々１６ビットのメモリ容量を持つため、１つのラ
ベルに対して３２ビットのメモリ領域が確保されている
ことになる。Since the memories for recording the X and Y coordinates of the center of gravity each have a memory capacity of 16 bits, a 32-bit memory area is secured for one label.

【０１４０】ステップＳ６１０において制御装置５２１
は、各々の重心点に最も近い他の重心点を調べ、その方
向を方向頻度ヒストグラムメモリ５２３６に記録する。In step S610, the control device 521 finds, for each center of gravity, the nearest other center of gravity and records its direction in the direction frequency histogram memory 5236.

【０１４１】図３７は図３４のステップＳ６１０で行な
われる重心点の処理の具体的ルーチンを示したフローチ
ャートである。FIG. 37 is a flowchart showing the specific routine of the center-of-gravity processing performed in step S610 of FIG. 34.

【０１４２】ステップＳ８０１において、探索範囲の大
きさ（画素単位）を表わす変数“ｓｉｚｅ”の値が１と
して設定される。In step S801, the value of a variable “size” representing the size (pixel unit) of the search range is set to 1.

【０１４３】ステップ８０２において、探索範囲の矩形
幅を表わす変数“ｗｓ”の値がｗｓ＝ｓｉｚｅ×２の式
により設定される。In step S802, the value of a variable “ws” representing the rectangular width of the search range is set by the equation ws = size × 2.

【０１４４】ステップＳ８０３において、注目している
重心点を中心とする１辺ｗｓの矩形の辺上に他の重心点
があるか否か判定される。In step S803, it is determined whether or not there is another barycenter on the side of the rectangle of one side ws centered on the barycenter of interest.

【０１４５】ステップＳ８０４で、重心点がない（Ｎ
Ｏ）と判定されたのであれば、ステップ８０９におい
て、変数“ｓｉｚｅ”が１インクリメントされる。If it is determined in step S804 that there is no center of gravity (NO), the variable “size” is incremented by one in step S809.

【０１４６】ステップＳ８１０で、変数“ｓｉｚｅ”の
値が最大しきい値であるＭＡＸ＿ＳＩＺＥより小さいか
判定される。ＭＡＸ＿ＳＩＺＥはたとえば１００という
値が設定される。In step S810, it is determined whether the value of the variable “size” is smaller than the maximum threshold value MAX_SIZE. MAX_SIZE is set to, for example, a value of 100.

【０１４７】ステップＳ８１０でＹＥＳであれば、ステ
ップＳ８０２からの処理が繰返される。If YES in step S810, the processing from step S802 is repeated.

【０１４８】ステップＳ８０４でＹＥＳであれば、ステ
ップＳ８０５において中心の重心点から辺上の重心点へ
の方向が計算される。このとき矩形上に複数の重心点が
存在する場合は、すべての重心点に対して方向が計算さ
れる。方向θは、中心から矩形上の重心点への方向ベク
トルが（ｄｘ，ｄｙ）であるとすると、（４）式により
求められる。If YES in step S804, the direction from the center of gravity to the center of gravity on the side is calculated in step S805. At this time, if there are a plurality of centroid points on the rectangle, the directions are calculated for all the centroid points. The direction θ is obtained by Expression (4), assuming that the direction vector from the center to the center of gravity on the rectangle is (dx, dy).

【０１４９】 θ＝ｔａｎ^-1（ｄｘ／ｄｙ） …（４）図３８は、ｓｉｚｅ＝３，ｗｓ＝６の状態で注目してい
る重心点の右下に他の重心点が見つかった状態を示した
図である。図中の斜線部は１辺６の矩形を示している。
θ = tan⁻¹(dx / dy) … (4)
FIG. 38 shows a state in which, with size = 3 and ws = 6, another center of gravity has been found to the lower right of the center of gravity of interest. The hatched portion in the figure indicates a rectangle with a side of 6.

【０１５０】中心の重心点から矩形上の重心点への方向
ベクトルはそれぞれの重心点の座標の差で表され、この
例では（ｄｘ，ｄｙ）＝（３，２）であり、θ＝ｔａｎ
^-1（ｄｘ／ｄｙ）≒０．９８［ｒａｄ］となる。The direction vector from the central center of gravity to the center of gravity on the rectangle is given by the difference between their coordinates. In this example, (dx, dy) = (3, 2), so θ = tan⁻¹(dx / dy) ≈ 0.98 [rad].

【０１５１】ステップＳ８０６において、求められた方
向が量子化され、方向頻度ヒストグラムメモリ５２３６
に記録される。In step S806, the obtained direction is quantized and recorded in the direction frequency histogram memory 5236.

【０１５２】量子化の範囲は、図３９に示されるとおり
であり、水平方向（Ｘ軸方向）を基準として、θが−π
／２からπ／２（ｒａｄ）の間は、π／４ずつの角度を
もつ領域０から領域３までの４つの領域に分けられてい
る。求められた方向はこれらの領域の中のいずれかに分
類され記録される。たとえば図３８の方向ベクトルθ≒
０．９８であるので、領域３に分類されることになる。The quantization ranges are as shown in FIG. 39: taking the horizontal direction (X-axis direction) as the reference, the range of θ from −π/2 to π/2 (rad) is divided into four regions, region 0 to region 3, each spanning an angle of π/4. The obtained direction is classified into one of these regions and recorded. For example, the direction in FIG. 38 has θ ≈ 0.98 and is therefore classified into region 3.

【０１５３】なお方向ベクトルが図３９の領域に分類さ
れない場合（θ＞π／２，θ＜−π／２）は、ベクトル
の方向を逆転させたものをカウントする。If the direction vector does not fall into any of the regions in FIG. 39 (θ > π/2 or θ < −π/2), the vector with its direction reversed is counted instead.

【０１５４】したがって本実施例では頻度ヒストグラム
メモリ５２３６は領域０から領域３のそれぞれの方向頻
度をカウントするための４つのメモリ領域を持つ。各々
のメモリ領域は１６ビット用意されている。Therefore, in this embodiment, the frequency histogram memory 5236 has four memory areas for counting the directional frequencies of the areas 0 to 3, respectively. Each memory area has 16 bits.
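
The quantization of step S806 might be sketched as follows. Several points are assumptions rather than statements from the text: `math.atan2(dx, dy)` is used so that the ratio matches dx / dy in equation (4) while still allowing angles outside ±π/2; region 0 is taken to start at −π/2 (FIG. 39 is not reproduced here, but this numbering agrees with the worked example, where θ ≈ 0.98 falls in region 3); and an angle of exactly π/2 is clamped into region 3.

```python
import math


def quantize_direction(dx, dy):
    """Map a direction vector to one of regions 0..3, each pi/4 wide."""
    theta = math.atan2(dx, dy)  # same ratio dx / dy as equation (4)
    if theta > math.pi / 2 or theta < -math.pi / 2:
        # Outside the quantized range: count the reversed vector instead.
        theta = math.atan2(-dx, -dy)
    region = int((theta + math.pi / 2) // (math.pi / 4))
    return min(region, 3)  # assumption: theta == pi/2 joins region 3


def direction_histogram(vectors):
    """Four counters, one per region, as in the direction frequency memory."""
    counts = [0, 0, 0, 0]
    for dx, dy in vectors:
        counts[quantize_direction(dx, dy)] += 1
    return counts


print(quantize_direction(3, 2))
print(direction_histogram([(3, 2), (-3, -2), (1, 3)]))
```

The vector (−3, −2) is reversed to (3, 2) before counting, so it lands in the same region as the FIG. 38 example.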

【０１５５】ステップＳ８０７において、処理は次の重
心点へ移る。ステップＳ８０８において、すべての重心
点について最近の重心点の方向を調べたか判定され、Ｙ
ＥＳであれば本ルーチンを終了する。In step S807, the processing moves to the next center of gravity. In step S808, it is determined whether the direction of the nearest center of gravity has been examined for every center of gravity; if YES, this routine ends.

【０１５６】ステップＳ８０８でＮＯであれば、ステッ
プＳ８０１からの処理を繰返す。ステップＳ８１０でＮ
Ｏであれば、処理はステップＳ８０７へ移る。If NO in step S808, the processing from step S801 is repeated. If NO in step S810, the processing proceeds to step S807.

【０１５７】以上の処理により方向頻度ヒストグラムメ
モリ５２３６にすべての重心点についての最近重心点の
方向の累積頻度が記録される。Through the above processing, the cumulative frequencies of the directions to the nearest center of gravity, over all centers of gravity, are recorded in the direction frequency histogram memory 5236.
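
The search loop of steps S801 to S810 can be sketched as below. This is a simplified illustration, assuming integer center-of-gravity coordinates held in a dictionary keyed by label; a point lies on the perimeter of the square of side ws = size × 2 exactly when the larger of |dx| and |dy| equals `size`. The function names and the returned data structure are hypothetical.

```python
MAX_SIZE = 100  # example value given in the text


def on_square(center, others, size):
    """Points on the perimeter of the square of side 2 * size around center."""
    cx, cy = center
    return [(x, y) for x, y in others
            if max(abs(x - cx), abs(y - cy)) == size]


def nearest_directions(centroids, max_size=MAX_SIZE):
    """For each label, vectors (dx, dy) to the points on the first hit square."""
    result = {}
    for label, center in centroids.items():
        others = [p for other, p in centroids.items() if other != label]
        found = []
        size = 1                      # S801
        while size < max_size:        # S810
            found = on_square(center, others, size)  # S802 to S804
            if found:
                break
            size += 1                 # S809
        # S805: one vector per point found (empty if none within range).
        result[label] = [(x - center[0], y - center[1]) for x, y in found]
    return result


centroids = {1: (10, 10), 2: (13, 12), 4: (40, 10)}
print(nearest_directions(centroids))
```

For label 1 the first square that contains another point has size 3, which reproduces the (dx, dy) = (3, 2) example of FIG. 38.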

【０１５８】図３４のステップＳ６１１において、制御
装置５２１は入力画像の文字らしい領域の間隔や数の特
徴をエッジ部分の並び方の情報から抽出するために、多
値画像メモリ５２３１に記憶された多値画像データに空
間微分処理を施し、１画素８ビットの微分画像を作成す
る。作成された微分画像は微分画像メモリ５２３７に記
憶される。微分画像の作成方法は、ソーベルオペレータ
を用いる第１の実施例と同じであるためここでの説明は
繰返さない。In step S611 of FIG. 34, in order to extract features such as the spacing and number of character-like regions in the input image from information on how edge portions are arranged, the control device 521 applies spatial differentiation to the multi-valued image data stored in the multi-valued image memory 5231 and creates a differential image with 8 bits per pixel. The created differential image is stored in the differential image memory 5237. The differential image is created with the Sobel operator in the same way as in the first embodiment, so the description is not repeated here.

【０１５９】ステップＳ６１２において、制御装置５２
１は微分画像メモリ５２３７に記憶された微分画像に基
づいて、Ｘ，Ｙ軸に対して投影された濃度データである
濃度投影ヒストグラムを作成する。濃度投影ヒストグラ
ムは濃度投影ヒストグラムメモリ５２３８に記憶され
る。In step S612, the control device 521 creates density projection histograms, which are density data projected onto the X and Y axes, based on the differential image stored in the differential image memory 5237. The density projection histograms are stored in the density projection histogram memory 5238.

【０１６０】画像をある軸に投影するということは、そ
の軸に垂直な方向の直線に沿った画素の濃度を逐次足し
合わせ、その合計を求める操作を、直線の位置を平行移
動させて繰返すことであり、これにより１次元の濃度の
並び（波形）が得られる。たとえば図４０に示されるよ
うな濃度がすべての部分で一定の円を想定すると、Ｘ
軸、Ｙ軸への各々の濃度投影ヒストグラムは図示される
ように半楕円形のヒストグラムとなる。Projecting an image onto an axis means summing the densities of the pixels along a straight line perpendicular to that axis, and repeating this operation while translating the line; the result is a one-dimensional sequence of densities (a waveform). For example, for a circle of uniform density as shown in FIG. 40, the density projection histograms onto the X axis and the Y axis are each semi-elliptical, as illustrated.
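
The projection described above can be sketched in a few lines. The image is assumed to be a list of rows of density values; the function name and the sample data are hypothetical.

```python
def density_projections(image):
    """Return (x_hist, y_hist) for an image given as a list of rows.

    x_hist[x] sums the column at X coordinate x (a line perpendicular
    to the X axis); y_hist[y] sums the row at Y coordinate y.
    """
    x_hist = [sum(column) for column in zip(*image)]
    y_hist = [sum(row) for row in image]
    return x_hist, y_hist


# Hypothetical 2-row, 3-column differential image.
image = [[0, 10, 0],
         [5, 10, 5]]
x_hist, y_hist = density_projections(image)
print(x_hist, y_hist)
```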

【０１６１】なお図４１（ａ）に示されるように濃度投
影ヒストグラムメモリ５２３８は、Ｘ軸，Ｙ軸各座標の
順番に並んだアドレス構成を取り、各座標に対する濃度
投影値１６ビットを１単位としてＸ座標とＹ座標の数の
和（この場合は５１２＋４３２＝９４４）だけのメモリ
領域を持つ。As shown in FIG. 41 (a), the density projection histogram memory 5238 has an address configuration in which the X-axis and Y-axis coordinates are arranged in order, and the density projection value 16 bits for each coordinate is defined as one unit. It has a memory area of the sum of the number of X coordinates and Y coordinates (512 + 432 = 944 in this case).

【０１６２】画像中の文字列の数が少なければ（すなわ
ち空白部分が多ければ）、文字列の方向への濃度投影ヒ
ストグラムは変化が少なく、値が小さくなる。If the number of character strings in the image is small (that is, if there are many blank portions), the density projection histogram in the direction of the character string changes little and its value becomes small.

【０１６３】ステップＳ６１３において、制御装置５２
１は濃度投影ヒストグラムを微分し、ヒストグラム微分
値メモリ５２４０に記録する。In step S613, the control device 521 differentiates the density projection histograms and records the result in the histogram differential value memory 5240.

[0164] This processing emphasizes changes in the density projection histograms so that they are easier to discern. Its details are described below.

[0165] For the density projection histograms on the X and Y axes recorded in the density projection histogram memory 5238, the control device 521 calculates the differential values x_hist'[x] and y_hist'[y] using equations (5) and (6), and records them in the histogram differential value memory 5240.

[0166]
x_hist'[x] = (x_hist[x+1] - x_hist[x-1]) / 2,  1 ≤ x ≤ 510  ... (5)
y_hist'[y] = (y_hist[y+1] - y_hist[y-1]) / 2,  1 ≤ y ≤ 430  ... (6)
Here x_hist[i] and y_hist[i] are the data (density projection values) at coordinate i in the density projection histogram memory 5238.

[0167] The processing of step S613 is described with reference to FIG. 41. As shown in FIG. 41(b), the histogram differential value memory 5240 is addressed in order of the X and Y coordinates. Each entry is a 16-bit differential value of the density projection value for one coordinate, and the memory holds as many entries as the number of X coordinates plus the number of Y coordinates (here 512 + 432 = 944). As shown in FIG. 41, the control device 521 calculates, for example, a differential value from the contents at X = 0 and X = 2 of the density projection histogram memory 5238 using equations (5) and (6), and records it at the address X = 1 of the histogram differential value memory 5240. This calculation is performed for all values on the X and Y axes. Because the calculation requires a coordinate on each side of the coordinate of interest, no differential value is calculated for the first and last coordinates (here, coordinate 0 and coordinate 511 or 431). Consequently, as shown in FIG. 41(b), no data is recorded at the addresses X = 0, X = 511, Y = 0, and Y = 431 of the histogram differential value memory 5240.

[0168] Business card images generally contain few characters, but the characters are neatly aligned in one direction and the spacing between lines is large. Compared with other images, a business card image therefore tends to have fewer valid connected components, a higher concentration of the valid connected components along a particular direction (that is, the frequencies of the directions between centroids are concentrated in a particular direction), and a smaller average value of the differential density projection histogram. Whether the input image is a business card image is determined from these tendencies.

[0169] In step S614, it is determined whether the number of valid connected components stored in step S608 is equal to or less than a threshold. The threshold is set to, for example, 200 and is stored in the ROM 522.

[0170] If YES in step S614, it is determined in step S615 whether the frequency of the direction from each centroid to its nearest centroid is equal to or greater than a threshold. The threshold is set, for example, as a ratio of 50% of the mode (most frequent value) of the direction vectors to the number of valid connected components.

[0171] If YES in step S615, it is determined in step S616 whether the larger and the smaller of the X-axis average and the Y-axis average of the histogram recorded in the histogram differential value memory 5240 are both equal to or less than their thresholds. The thresholds are set to, for example, 4.5 × 10³ for the larger average and 1.5 × 10³ for the smaller average, and are stored in the ROM 522.

[0172] Although each threshold is described as being stored in the ROM 522, the thresholds may instead be freely settable by the user. In that case, the setting routine may be stored in the ROM 522.

[0173] If YES in step S616, the input image is determined in step S617 to be a business card image.

[0174] If NO in any of steps S614, S615, and S616, the input image is determined in step S619 not to be a business card image.

[0175] In step S618, the input image is recorded in the image database device 53 together with a keyword (attribute) indicating whether or not it is a business card image.

[0176] The assigned business card attribute can then be used when images are retrieved. In addition, character recognition can be applied to images determined to be business cards, enabling applications such as address book creation.

[0177] Next, specific examples of the differential values of the differential density projection histogram of a business card image are described.

[0178] FIG. 44 is a graph showing an example of the differential values of the differential density projection histogram of a business card image, and FIG. 45 is a graph showing an example of the differential values of the differential density projection histogram of a document image other than a business card.

[0179] Referring to the figures, for the business card image the larger of the average values of the density projection histograms for the X and Y axes is 2.26 × 10³ and the smaller is 0.86 × 10³, whereas for the document image other than a business card the larger is 4.3 × 10³ and the smaller is 2.5 × 10³.

[0180] Thus, the average of the differential values of the differential density projection histogram is smaller for a business card image than for other document images.

[0181] Actual classification results are shown in FIG. 42. The X axis of the graph in FIG. 42 is the number of valid connected components in the input image, and the Y axis is the frequency ratio of the nearest-centroid direction of the valid connected components in the input image.

[0182] As is clear from the figure, business card images tend to have a small number of valid connected components and a large frequency ratio in the nearest-centroid direction.

[0183] The graph in FIG. 43 plots only those sample images that fall within the hatched region of the graph in FIG. 42 (200 or fewer valid connected components and a nearest-centroid direction frequency ratio of 50% or more). The X axis of the graph is the smaller of the averages of the differential values of the differential density projection histograms for the X and Y axes of the input image, and the Y axis of the graph is the larger. In this example, business card images can be classified by using thresholds of 1.5 × 10³ on the X axis and 4.5 × 10³ on the Y axis of FIG. 43.

[0184] As described above, this embodiment exploits the characteristics of business card images and classifies them accurately from the valid connected component image and the density projection histogram of the differential image. When an image is determined to be a business card image, it also becomes possible to proceed automatically to other uses or processing, such as address book creation.

[0185] This embodiment uses three kinds of data to identify business card images: the number of valid connected components, the direction from each valid connected component to another nearby valid connected component, and the average of the differential values of the differential density projection histogram. Discrimination is still possible using only one or two of these, although accuracy is reduced.

[0186] (Fifth Embodiment) FIG. 46 is a flowchart showing the processing of the control device of the image classification device according to the fifth embodiment of the present invention.

[0187] The device configuration in this embodiment is the same as that of the image classification device of the fourth embodiment shown in FIG. 32, so its description is not repeated here.

[0188] The processing performed by the image classification device of this embodiment is as follows.

[0189] After the processing of steps S601 to S607 in FIG. 33 has been performed, the routine shown in FIG. 46 is executed.

[0190] In step S608, the control device 521 regards the valid connected components in the image as character-like regions, counts the valid connected components in the binary image memory 5232 as one of the feature amounts for classifying business card images, and stores the count in the valid connected component count memory 5239. In step S609, the control device 521 then extracts feature amounts describing how the valid connected components in the binary image memory 5232 are arranged. Feature amounts that clearly reflect the nature of business card images are used for this purpose, for example those described in the fourth embodiment of the invention: the degree to which the valid connected components are concentrated along a particular direction, and the average of the differential values of the density projection histogram of the differential image of the input image.

[0191] In step S610, to determine whether the input image is a business card image, the control device 521 first checks whether the number of valid connected components is equal to or less than a threshold. Because a business card image contains few characters, it also has few valid connected components. If the number of valid connected components is not equal to or less than the threshold (NO in step S610), the input image is determined in step S614 not to be a business card image.

[0192] If the number of valid connected components is equal to or less than the threshold (YES in step S610), the feature amounts describing the arrangement of the valid connected components are then compared with predetermined thresholds in step S611. The thresholds may be stored in the ROM 522 or may be specifiable by the user from outside. A single feature amount or a combination of several feature amounts may be compared. Correspondingly, a single threshold or a combination of thresholds is used, defining the threshold conditions that a document image must satisfy. These processing procedures are stored in the ROM 522.

[0193] In step S612, the control device 521 determines, from the comparison of the feature amounts with the predetermined thresholds, whether the input image satisfies the conditions for being a business card image. If the conditions are satisfied (YES in step S612), the input image is determined to be a business card image; otherwise (NO in step S612), it is determined not to be a business card image.

[0194] After the determination, in step S615 the control device 521 registers the input image in the image database device 53. An image determined to be a business card image is given the keyword (attribute) "business card image", which is registered in the image database device 53 together with the input image data. The assigned business card attribute can then be used when images are retrieved.

[0195] As described above, according to the present invention, the number of character-like regions and their arrangement are examined to determine whether the image as a whole meets the conditions that a business card image should satisfy. This captures the characteristic features of business card images, namely few characters and wide line spacing, and allows them to be classified accurately.

[0196] When an image is classified as a business card image, further applications become possible, such as automatically creating an address book by applying character recognition.

[0197] In the embodiments of the present invention the input image data is multi-valued image data, but binary image data may be input instead.

[0198]

[Effects of the Invention] The image classification device according to the present invention can classify images into document images and other images.

[0199]

[0200]

[0201]

[0202]

[0203]

[0204]

[0205]

[0206] The image classification device according to claim 3 provides, in addition to the effects of claim 1 or 2, further improved determination accuracy, because among the plurality of adjacent pixels representing an object, those satisfying a predetermined condition are identified as a connected component.

[Brief description of the drawings]

FIG. 1 is a block diagram showing the system configuration of the image classification device according to the first embodiment of the present invention.

FIG. 2 is a flowchart showing the processing routine of the control device 121 of FIG. 1.

FIG. 3 is a flowchart continuing from the flowchart of FIG. 2.

FIG. 4 is a diagram showing an example of a document image.

FIG. 5 is a diagram showing an example of a landscape image, which is a non-document image.

FIG. 6 is a diagram for explaining the coordinate system of the image data stored in the multi-valued image memory 1231.

FIG. 7 is a diagram for explaining the address configuration of the multi-valued image memory 1231.

FIG. 8 is a diagram showing an example of binary image data and explaining connected components.

FIG. 9 is a diagram showing the result of labeling the connected components of FIG. 8.

FIG. 10 is a diagram showing the address configuration of the label image memory.

FIG. 11 is a diagram for explaining the method of setting fillet coordinates.

FIG. 12 is a diagram showing the address configuration of the fillet coordinate memory 1234.

FIG. 13 is a diagram showing an example of connected components.

FIG. 14 is a diagram showing the state in which the connected components of FIG. 13 are recorded in the fillet coordinate memory.

FIG. 15 is a diagram showing the connected components that remain after unnecessary connected components have been removed from FIG. 13.

FIG. 16 is a diagram for explaining the data used to apply spatial differentiation to the multi-valued image data in the multi-valued image memory.

FIG. 17 is a first diagram for explaining the coefficients used for differentiation.

FIG. 18 is a second diagram for explaining the coefficients used for differentiation.

FIG. 19 is a diagram showing the state in which one connected component in the document image of FIG. 4 is recorded in the binary image memory 1232.

FIG. 20 is a diagram showing the state in which the same connected component as that shown in FIG. 19 is recorded in the differential binary image memory 1234.

FIG. 21 is a diagram for explaining the same connected component as in FIG. 19 as recorded in the AND image memory 1237.

FIG. 22 is a diagram showing the state in which one connected component in the landscape image of FIG. 5 is recorded in the binary image memory 1232.

FIG. 23 is a diagram showing the state in which the same connected component as that shown in FIG. 21 is recorded in the differential binary image memory 1234.

FIG. 24 is a diagram for explaining the same connected component as in FIG. 23 as recorded in the AND image memory 1237.

FIG. 25 is a diagram for explaining the effects of the first embodiment of the present invention.

FIG. 26 is a block diagram showing the device configuration of the image classification device according to the second embodiment of the present invention.

FIG. 27 is a flowchart showing the processing performed by the control device of the image classification device according to the second embodiment.

FIG. 28 is a diagram showing an example of binary image data for a document image.

FIG. 29 is a diagram showing an example of binary image data for a landscape image (non-document image).

FIG. 30 is a diagram for explaining the bin ratio of document images and non-document images.

FIG. 31 is a diagram for explaining the processing of the image classification device according to the third embodiment of the present invention.

FIG. 32 is a block diagram showing the system configuration of the image classification device according to the third embodiment of the present invention.

FIG. 33 is a flowchart showing the processing performed by the control device 521 of the image classification device according to the fourth embodiment of the present invention.

FIG. 34 is a flowchart continuing from FIG. 33.

FIG. 35 is a flowchart continuing from FIG. 34.

FIG. 36 is a diagram showing the address configuration of the centroid memory 5235.

FIG. 37 is a flowchart showing the specific routine for the centroid processing performed in step S610 of FIG. 34.

FIG. 38 is a diagram showing the state in which, during the processing of FIG. 37 with size = 3 and ws = 6, another centroid has been found to the lower right of the centroid of interest.

FIG. 39 is a diagram for explaining the regions used to quantize the obtained direction.

FIG. 40 is a diagram for explaining an example of density projection of image data onto the X axis and the Y axis.

FIG. 41 is a diagram for explaining the density projection histogram memory 5238 and the histogram differential value memory 5240.

FIG. 42 is a diagram for explaining the effects of the fourth embodiment of the present invention.

FIG. 43 is a diagram for explaining the average of the differential values of the differential density projection histograms of the image data within the hatched region of FIG. 42.

FIG. 44 is a diagram for explaining the density projection histograms in the X-axis and Y-axis directions for a business card image.

FIG. 45 is a diagram for explaining the density projection histograms in the X-axis and Y-axis directions for a document image other than a business card image.

FIG. 46 is a flowchart showing the processing of the control device according to the fifth embodiment of the present invention.

[Explanation of symbols]

11, 51 Image input device
12, 52 Image processing device
13, 53 Image database device
121, 521 Control device
122, 522 ROM
123, 523 RAM
1231, 5231 Multi-valued image memory
1232, 5232 Binary image memory
1233, 5233 Label image memory
1234, 5234 Fillet coordinate memory
1235, 5237 Differential image memory
1236 Differential binary image memory
1237 AND image memory
5235 Centroid memory
5236 Direction frequency histogram memory
5238 Density projection histogram memory
5239 Valid connected component count memory
5240 Histogram differential value memory

Continuation of front page: (58) Fields searched (Int. Cl.⁷, DB name): G06K 9/20, H04N 1/40

Claims

(57) [Claims]

1. An image classification device comprising: identification means for identifying, from input image data, a connected component consisting of a plurality of adjacent pixels representing an object; extraction means for extracting a feature amount representing character-likeness based on the shape of the connected component in the input image data; and determination means for determining whether or not the input image data is a document image based on the extracted feature amount, wherein the input image data is grayscale image data, and the extraction means includes edge detection means for detecting, from the input grayscale image data, a data region having strong edges, and extracts the feature amount from the area of the connected component and the area of the region where the connected component and the data region overlap.

2. An image classification device comprising: identification means for identifying, from input image data, a connected component consisting of a plurality of adjacent pixels representing an object; extraction means for extracting a feature amount representing character-likeness based on the shape of the connected component in the input image data; and determination means for determining whether or not the input image data is a document image based on the extracted feature amount, wherein the extraction means includes calculation means for calculating the area of a circumscribed rectangle for each connected component, and extracts the feature amount from the area of the connected component and the area of the circumscribed rectangle.

3. The image classification device according to claim 1 or 2, wherein the identification means identifies, as the connected component, those pixels that satisfy a predetermined condition among the plurality of adjacent pixels representing the object in the input image data.