JP2000175038A

JP2000175038A - Image binary processing system on area basis

Info

Publication number: JP2000175038A
Application number: JP11343264A
Authority: JP
Inventors: Yongchun Lee; リーヨンチュン; Peter Rudak; ルダックピーター
Original assignee: Eastman Kodak Co
Current assignee: Eastman Kodak Co
Priority date: 1998-12-04
Filing date: 1999-12-02
Publication date: 2000-06-23
Anticipated expiration: 2019-12-02
Also published as: US6393150B1; DE19956158A1; JP4261005B2

Abstract

PROBLEM TO BE SOLVED: To provide a top-down segmentation method that finds photographic regions on the basis of global pixel connectivity. SOLUTION: A region-based binarization system applies adaptive thresholding and image rendering to a grayscale image to generate first and second binary images. The grayscale image is subsampled to obtain a low-resolution image, and the positions of photographic images are detected in the low-resolution image. Photographic images having a rectangular boundary are then identified among the detected photographic images, and a classification map that distinguishes the pixels of the rectangular photographic images from the other pixels is produced. A final binary image is then formed from the first and second binary images on the basis of the classification map.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

FIELD OF THE INVENTION The present invention relates to a region-based binarization system that provides optimal binary image quality for mixed-format documents.

[0002]

DESCRIPTION OF THE RELATED ART Magazine pages often have photographs printed alongside text, line drawings and graphics. When such a page is electronically captured by a scanner, a binarization process is needed to convert the captured grayscale image into a bitonal representation of the image at the output. Image binarization techniques generally fall into two types. One, referred to as the adaptive thresholding technique, works well for documents that mainly contain text and line art. The other is the dither or error diffusion technique, which reproduces shades of gray in a binary format and is effective for binarizing photographic images. For a mixed-format document whose captured image contains both text and photographs, neither of the two binarization methods can by itself produce satisfactory image quality for both the text and the photographs. A known solution to this problem is to segment the captured digital image into photographic and text regions so that a separate binarization process can be applied to each region for optimal image quality.
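The two families of binarization named above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the text does not say which adaptive thresholding or error diffusion variant is used, so a local-mean threshold and Floyd-Steinberg diffusion are swapped in here, and the function names, the `radius` and `offset` parameters, the 0-255 gray range and the convention that 1 means a black pixel are all hypothetical.

```python
def adaptive_threshold(gray, radius=1, offset=10):
    """Return a binary image: 1 (black) where a pixel is clearly darker than its local mean.

    Assumption: a plain local-mean rule stands in for the unspecified
    adaptive thresholding technique.
    """
    height, width = len(gray), len(gray[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            rows = range(max(0, y - radius), min(height, y + radius + 1))
            cols = range(max(0, x - radius), min(width, x + radius + 1))
            window = [gray[j][i] for j in rows for i in cols]
            local_mean = sum(window) / len(window)
            if gray[y][x] < local_mean - offset:
                out[y][x] = 1
    return out


def error_diffusion(gray):
    """Return a binary image using Floyd-Steinberg error diffusion (an assumed variant)."""
    height, width = len(gray), len(gray[0])
    work = [[float(v) for v in row] for row in gray]  # working copy that accumulates error
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            old = work[y][x]
            new = 255.0 if old >= 128 else 0.0
            out[y][x] = 0 if new == 255.0 else 1  # 1 marks a black dot
            err = old - new
            # Push the quantization error onto the unprocessed neighbours.
            if x + 1 < width:
                work[y][x + 1] += err * 7 / 16
            if y + 1 < height:
                if x > 0:
                    work[y + 1][x - 1] += err * 3 / 16
                work[y + 1][x] += err * 5 / 16
                if x + 1 < width:
                    work[y + 1][x + 1] += err * 1 / 16
    return out
```

On a flat mid-gray patch the local-mean rule marks nothing, so shading is lost, while error diffusion produces a mix of black and white dots. That is the trade-off the paragraph describes.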

[0003] One known segmentation method divides a mixed-format document into 4 × 4 blocks, classifies each block as text or image, and refines the classification by eliminating short runs of blocks (see, for example, U.S. Pat. No. 4,668,995 to Chen et al.). After the blocks of image lines have been classified, a different binarization process is applied to each. Another known method segments an image by extracting the run lengths for each scan line, constructing rectangles from those run lengths, classifying each rectangle as text or non-text, and finally merging associated text blocks into text regions (see, for example, U.S. Pat. No. 5,335,290 to Cullen et al.).

[0004]

PROBLEMS TO BE SOLVED BY THE INVENTION The two segmentation methods described above are bottom-up segmentation methods, which start by segmenting the information pixel by pixel or small block by small block and then grow the result into regions. These methods are not robust and are prone to classification errors, because the classification as text or non-text is based only on local information.

[0005] It is an object of the present invention to provide a top-down segmentation method that finds photographic regions based on global pixel connectivity, and to propose a region-based binarization system that uses the segmentation result to obtain optimal binary image quality.

[0006]

SUMMARY OF THE INVENTION The present invention relates to a region-based binarization system that separately applies adaptive thresholding and image rendering, such as error diffusion (or dither), to generate two binary images from a grayscale image; detects the positions of photographic images in a low-resolution image; identifies photographic images having a rectangular shape or boundary; generates a classification bitmap that marks photographic pixels as "1" and non-photographic pixels as "0"; and assembles a final binary image from the two stored binary images based on the classification map.

[0007] The photo detection process comprises the steps of converting a low-resolution grayscale image into a binary image using global thresholding, performing a binary image erosion process to remove thin lines and the majority of characters, applying connected component analysis to find objects, and eliminating small objects using a size filter. The positions of the large objects are taken to be the positions of the photographs.

[0008] The present invention also relates to a region-based binarization process comprising the steps of: converting a grayscale image into first and second binary images; detecting the position of a photographic image in the grayscale image; identifying, among the detected photographic images, a photographic image having a rectangular boundary; generating a classification map that distinguishes the pixels in the photographic image having the rectangular boundary from the remaining pixels; and forming a final binary image from the first and second binary images based on the classification map.

[0009] The present invention further relates to a region-based binarization process comprising the steps of: capturing an image; detecting the position of a photographic image in the captured image; identifying, among the detected photographic images, a photographic image having a rectangular boundary; generating a classification map that distinguishes photographic pixels having a rectangular boundary in the photographic image from non-photographic pixels; and forming a final binary image based on the classification map.

[0010] The present invention further relates to an image capture assembly comprising: an image capture portion that captures an image; a conversion portion that converts the captured image into digital image information representing the captured image; and a processing portion that processes the digital image information to detect the position of a photographic image in the captured image, identifies, among the detected photographic images, a photographic image having a rectangular boundary, and generates a classification map that distinguishes the pixels within the photographic image having the rectangular boundary from the remaining pixels.

[0011]

EMBODIMENTS OF THE INVENTION Referring to the drawings, in which the same reference numerals denote the same or corresponding parts throughout the figures, FIG. 1 shows a block diagram of the region-based image binarization method. Given digital grayscale image data as input, the method works as follows. Adaptive image thresholding (step 15a) is applied to convert the grayscale image (G) into a binary image (B1) that exhibits good image quality for text and line art. Image rendering such as error diffusion or dither (step 15b) is applied to the same grayscale image (G) to obtain a rendered binary image (B2) that exhibits good image quality in the photographic portions of the image. In step 15c, the grayscale image is subsampled to provide a subsampled image (Gs). In step 16, the positions of rectangular photographic images within the subsampled image (Gs) are detected, while in step 17 photographic images having a rectangular shape or boundary are detected. In step 18, a classification map is generated that marks the rectangular photographic images with "1" and all other pixels with "0". The final binary image (B) is the result of image composition of the two binary images B1 and B2 based on the generated classification map. That is, if the pixel at position (i, j) in the generated classification map is marked "1", indicating a photographic pixel, the pixel at position (i, j) in image B2 is copied to the binary image B. If, on the other hand, the pixel at position (i, j) in the classification map is "0", indicating a text pixel, the pixel at position (i, j) in image B is a copy of the corresponding pixel of the binary image (B1).
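The composition rule stated above (take B2 where the classification map is "1", B1 where it is "0") can be written in a few lines. This is a minimal sketch, assuming the three images are equally sized lists of rows holding 0/1 values; the function name `compose_final_image` is hypothetical and not taken from the text.

```python
def compose_final_image(b1, b2, class_map):
    """Build the final binary image B from B1, B2 and the classification map.

    Where class_map[i][j] == 1 (photographic pixel) the pixel is copied from b2,
    otherwise (text pixel) it is copied from b1.
    """
    return [
        [b2[i][j] if class_map[i][j] == 1 else b1[i][j] for j in range(len(row))]
        for i, row in enumerate(class_map)
    ]


# Tiny usage example: only the right-hand column is marked as photographic.
b1 = [[1, 1], [1, 1]]
b2 = [[0, 0], [0, 0]]
class_map = [[0, 1], [0, 1]]
final = compose_final_image(b1, b2, class_map)  # [[1, 0], [1, 0]]
```

Because both binary images are computed for the whole page in advance, the composition is a per-pixel selection and needs no further image processing.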

[0012] FIG. 2 is a schematic diagram of an image capture assembly 300 that processes a captured image in accordance with the described features of the present invention. The image capture assembly 300 can be a scanner that includes an image capture portion 301, for example in the form of a charge-coupled device, which captures an image, and a conversion portion 303, for example in the form of an A/D converter, which converts the captured image into digital information representing the captured image. The digital information is sent to an image processor 305, which processes it in the manner described with respect to FIG. 1 and further described with respect to FIGS. 3 and 4.

[0013] FIG. 3 shows the details of detecting photographic images in a mixed-format document (step 16). First, in the subsampling step (15c in FIG. 1), the grayscale image is subsampled at every Nth pixel and every Nth scan line to obtain a low-resolution grayscale image (Gs). A fixed threshold value is applied (global thresholding, step 20) to convert the grayscale image (Gs) into a binary image (Bs). A 3 × 3 binary erosion operation (step 21) is applied over all pixels of the binary image to remove thin lines and other fine objects such as characters. The image resulting from the erosion operation is saved as image (Es). Connected component analysis (step 22) is applied to the image (Es) to group connected pixels. Every group of connected pixels is taken as an object, and the boundary coordinates of an object define its position. Based on a size filter (step 23), an object whose bounding size is greater than a size threshold value is considered a photograph. As an example, the size filter may depend on the scan resolution.

Illustration of the steps of the method by example

FIG. 5 is a printout of a scanned mixed-format magazine page containing text, lines, rectangular photographs and a non-rectangular graphic (a sunglasses graphic). Applying adaptive thresholding (step 15a, FIG. 1) to the grayscale image of FIG. 5 produces a binary image (B1). The binary image (B1) of FIG. 6 shows clear, crisp characters and lines, but the shading detail of the photographs is lost. Applying the error diffusion technique (step 15b in FIG. 1) to the same grayscale image gives the binary image (B2) shown in FIG. 7, in which the image detail of the photographic regions is retained and is close to true photographic quality. The text image quality, however, is blurred. Comparing the two binary images (B1) and (B2) leads to the conclusion that producing a good binary image of a mixed-format document requires a combination of adaptive thresholding for the text regions and error diffusion for the photographic regions. Accomplishing this requires detection of the photographic regions.
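The first three stages of the detection process (subsampling, global thresholding and 3 × 3 erosion) can be sketched as below. This is a sketch under stated assumptions rather than the patented implementation: images are lists of rows, 1 denotes a black pixel, pixels outside the image are treated as white during erosion, and the function names and the threshold parameter `t` are hypothetical.

```python
def subsample(gray, n):
    """Keep every n-th pixel of every n-th scan line (G -> Gs)."""
    return [row[::n] for row in gray[::n]]


def global_threshold(gray, t):
    """Apply one fixed threshold to the whole image (Gs -> Bs); 1 = black."""
    return [[1 if v < t else 0 for v in row] for row in gray]


def erode3x3(binary):
    """3 x 3 binary erosion (Bs -> Es).

    A pixel stays black only if all nine pixels of its 3 x 3 neighbourhood are
    black. Border pixels are cleared, which assumes the outside is white.
    """
    height, width = len(binary), len(binary[0])
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            if all(binary[y + dy][x + dx] for dy in (-1, 0, 1) for dx in (-1, 0, 1)):
                out[y][x] = 1
    return out
```

A one-pixel-thick line never has a fully black 3 × 3 neighbourhood, so it disappears, while a solid block only loses its outer ring. This is why erosion removes thin lines and most characters but leaves photographic regions largely intact.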

[0014] In the detection process (step 16 in FIG. 1 and the flowchart of FIG. 3), the grayscale image is first subsampled to generate the small grayscale image (Gs) shown in FIG. 8. A binary image (Bs) is then generated by thresholding the grayscale image (Gs) with a fixed threshold value. The resulting binary image (Bs) is shown in FIG. 9. As shown in FIG. 10, applying the binary image erosion operation (FIG. 3, step 21) yields the image Es, in which small characters and thin lines have been removed and most of the remaining black pixels lie within the photographic regions. The bounding box of each object in the image (Es) is detected by connected component analysis (FIG. 3, step 22), which groups the connected black pixels of the binary image into individual objects (see, for example, U.S. patent application Ser. No. 08/739,076). After the small objects are eliminated (FIG. 3, step 23), the potential photo bounding boxes 100 remain, as shown in FIG. 11. The boundary coordinates are converted to full resolution and are shown in FIG. 12. The four bounding boxes 100 are the positions of the detected photographs. Not all of the photographs within the detected bounding boxes are necessarily rectangular.
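The remaining detection stages (connected component analysis, size filtering and conversion of the box coordinates back to full resolution) might look like the following sketch. Several details are assumptions not given in the text: 8-connectivity, a single `size_threshold` applied to both box height and box width, inclusive `(top, left, bottom, right)` coordinates, and the mapping of subsampled index k to the full-resolution block starting at k * n. All function names are hypothetical.

```python
from collections import deque


def find_bounding_boxes(binary):
    """Return one (top, left, bottom, right) box per 8-connected group of 1-pixels."""
    height, width = len(binary), len(binary[0])
    seen = [[False] * width for _ in range(height)]
    boxes = []
    for y in range(height):
        for x in range(width):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:  # breadth-first flood fill over one object
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and binary[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def size_filter(boxes, size_threshold):
    """Keep boxes whose height and width both exceed size_threshold (assumed rule)."""
    return [b for b in boxes
            if b[2] - b[0] + 1 > size_threshold and b[3] - b[1] + 1 > size_threshold]


def to_full_resolution(boxes, n, height, width):
    """Scale boxes found in the n:1 subsampled image back to full-resolution coordinates."""
    return [(t * n, l * n, min((b + 1) * n - 1, height - 1), min((r + 1) * n - 1, width - 1))
            for t, l, b, r in boxes]
```

Since the text notes that the size filter may depend on the scan resolution, `size_threshold` would in practice be chosen relative to the resolution of the subsampled image.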

[0015] The next step is to detect rectangular photographic objects within the positions of the four bounding boxes 100 (FIG. 1, step 17). This detection is performed by checking for any characters inside the detected bounding boxes in the binary image (B1) of FIG. 6. If any character is present inside a bounding box, the photograph in that bounding box is classified as non-rectangular. If, on the other hand, no character is present inside the bounding box, the photograph in that bounding box is classified as a rectangular photograph. In the above example, as shown in FIG. 13, there are characters inside the bounding box 100' for the sunglasses graphic. The sunglasses graphic is therefore considered a non-rectangular photograph. Since the other three bounding boxes contain no characters, the photographs they contain are classified as rectangular. The details of detecting photographic images having a rectangular shape or boundary (FIG. 1, step 17) are shown in FIG. 4. As shown in FIG. 4, the information on the boundary coordinates of the potential photographic images, together with the binary image (B1), is taken into account in the connected component analysis (step 30). In step 30, connected component analysis is performed to extract the objects within each bounding box. Note that the largest object is considered to be a pictorial image, and small objects are classified as characters or noise. In step 33, a check is made to determine whether there are any small objects (characters) that are not geometrically located outside the boundary of the largest object. If the answer to step 33 is YES, the object is a photographic image having a non-rectangular boundary. If the answer to step 33 is NO, the object is a photographic image having a rectangular boundary.
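A rough sketch of the rectangularity check follows. It implements only the simpler rule stated in the paragraph, namely that a bounding box holds a rectangular photograph when no character is found inside it, and it does not reproduce the geometric test of step 33. The assumptions are: the largest connected object in the box is taken as the picture, any other object with at least `min_char_pixels` pixels counts as a character, smaller objects are treated as noise, and 8-connectivity is used. The pixel-count rule and all names are hypothetical.

```python
from collections import deque


def components_in_box(b1, box):
    """Return the 8-connected groups of 1-pixels of b1 that lie inside box, as pixel lists."""
    top, left, bottom, right = box
    seen = set()
    components = []
    for y in range(top, bottom + 1):
        for x in range(left, right + 1):
            if b1[y][x] != 1 or (y, x) in seen:
                continue
            seen.add((y, x))
            queue = deque([(y, x)])
            pixels = []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (top <= ny <= bottom and left <= nx <= right
                                and b1[ny][nx] == 1 and (ny, nx) not in seen):
                            seen.add((ny, nx))
                            queue.append((ny, nx))
            components.append(pixels)
    return components


def is_rectangular_photo(b1, box, min_char_pixels=2):
    """True if the box contains a picture and no character-sized object besides it."""
    components = components_in_box(b1, box)
    if not components:
        return False  # nothing inside the box, so there is no picture to classify
    components.sort(key=len, reverse=True)
    small_objects = components[1:]  # everything except the largest object
    return not any(len(c) >= min_char_pixels for c in small_objects)
```

With this rule, a box such as 100' around the sunglasses graphic, which also encloses surrounding characters, would return False, while a box filled by a single photograph would return True.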

[0016] Next, the classification map (FIG. 1, step 18) is generated by filling the bounding regions of the three rectangular photographs with black pixels, as shown in FIG. 14. The final binary image (B) is assembled from the binary images (B1) and (B2) based on the classification map. In the text regions of the classification map, the pixels of image (B) are copies of image (B1), and in the photographic regions (the black regions of the classification map), image (B) is a copy of image (B2). The result is shown in FIG. 15.
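Generating the classification map amounts to filling each rectangular photo's bounding region with "1" on an otherwise "0" bitmap. A minimal sketch, assuming inclusive `(top, left, bottom, right)` boxes in full-resolution coordinates; the function name `build_classification_map` is hypothetical.

```python
def build_classification_map(height, width, rect_boxes):
    """Return a height x width bitmap with 1 inside every rectangular photo box, 0 elsewhere."""
    class_map = [[0] * width for _ in range(height)]
    for top, left, bottom, right in rect_boxes:
        for y in range(top, bottom + 1):
            for x in range(left, right + 1):
                class_map[y][x] = 1  # black pixel in the map: photographic region
    return class_map


# Usage example: one 2 x 3 photo region on a 4 x 5 page.
example_map = build_classification_map(4, 5, [(1, 1, 2, 3)])
```

Boxes classified as non-rectangular are simply left out of `rect_boxes`, so their pixels stay "0" and are taken from image (B1) in the final composition.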

[Brief description of the drawings]

FIG. 1 is a flowchart showing the steps of a region-based binarization system for mixed-format documents.

FIG. 2 is a schematic diagram of an apparatus according to the present invention.

FIG. 3 is a flowchart outlining the steps of the photo detection process.

FIG. 4 is a flowchart outlining the steps of detecting a photographic image having a rectangular shape or boundary.

FIG. 5 is an example of a digitally printed grayscale mixed-format document (G).

FIG. 6 is a thresholded image (B1) of the image (G) of FIG. 5, obtained using the adaptive thresholding method.

FIG. 7 is a thresholded image (B2) of the image (G) of FIG. 5, obtained using the error diffusion method.

FIG. 8 is a subsampled image (Gs) of the image (G) of FIG. 5, obtained using a 4:1 size reduction.

FIG. 9 is a thresholded image (Bs) of the image (Gs) of FIG. 8, obtained using fixed thresholding.

FIG. 10 is the binary image (Es) resulting from image erosion of the image (Bs) of FIG. 9.

FIG. 11 shows the bounding boxes of the large objects detected in the binary image (Es) of FIG. 10.

FIG. 12 shows the positions of the photographs detected in the image (B1) of FIG. 6.

FIG. 13 is an example of a bounding box of a non-rectangular photograph that contains text.

FIG. 14 is a bitmap showing the regions of the detected rectangular photographs.

FIG. 15 is the final binary image obtained using the method of the present invention.

[Explanation of symbols]

100: potential photo bounding box; 300: image capture assembly; 301: image capture; 303: A/D converter; 305: image processor

Claims

[Claims]

1. A region-based binarization process comprising the steps of: converting a grayscale image into first and second binary images; detecting a position of a photographic image in the grayscale image; identifying, among the detected photographic images, a photographic image having a rectangular boundary; generating a classification map that distinguishes pixels in the photographic image having the rectangular boundary from the remaining pixels; and forming a final binary image from the first and second binary images based on the classification map.

2. The method of claim 1, further comprising the step of subsampling the grayscale image to obtain a subsampled image, wherein the detecting step comprises detecting a position of a photographic image in the subsampled image.

3. The method of claim 1, wherein the grayscale image has been captured from a document that includes at least a photographic portion and a text portion.

4. The method of claim 1, wherein the converting step comprises: applying an adaptive thresholding technique to the grayscale image to obtain one of the first and second binary images; and applying an image rendering technique to the grayscale image to obtain the other of the first and second binary images.

5. The method of claim 4, wherein said image rendering technique comprises an error diffusion process.

6. The method of claim 4, wherein said image rendering technique comprises a dither process.

7. The method of claim 2, wherein said sub-sampled image is a low resolution image.

8. The method of claim 2, wherein the detecting step further comprises: converting the subsampled image into a further binary image; removing thin lines and characters from the further binary image; performing connected component analysis on the further binary image to group the connected pixels in the further binary image, wherein each group of connected pixels is identified as an object in the further binary image; and designating an object having a size greater than a threshold value as a photographic image in the further binary image.

9. The method of claim 8, wherein the designating step includes using a size filter to designate objects larger than the threshold value as photographic images and to exclude objects smaller than the threshold value.