JP2010056827A

JP2010056827A - Apparatus and program for processing image

Info

Publication number: JP2010056827A
Application number: JP2008219069A
Authority: JP
Inventors: Toshimasa Dobashi; 外志正土橋; Hiroyuki Mizutani; 博之水谷
Original assignee: Toshiba Corp; Toshiba Solutions Corp
Current assignee: Toshiba Corp; Toshiba Digital Solutions Corp
Priority date: 2008-08-28
Filing date: 2008-08-28
Publication date: 2010-03-11

Abstract

PROBLEM TO BE SOLVED: To optimize the balance between the readability (legibility) of a document image and its file size.

SOLUTION: The image processing apparatus has: a layout analysis part 13 for acquiring layout information of the document image stored in an image memory 12; a separation-object-region determination part 14 for determining foreground-background separation object regions in the document image using the layout information; a character size estimation part 15 for estimating the character size in those foreground-background separation object regions that contain characters; a resolution conversion part 16 for varying the resolution of a region according to the character size; a region separation part 17 for separating the resolution-converted regions and the foreground-background separation object regions from the original document image; a region image generation part 18 for generating a foreground image and a background image from each separated region; an image compression part 19 for compressing the foreground image and the background image; and an image integration part 20 for generating a document image arranged in accordance with the layout information.

COPYRIGHT: (C)2010, JPO&INPIT

Description

The present invention relates to, for example, an image processing apparatus and an image processing program.

With the spread of multifunction peripherals (MFPs) and color scanners and the increasing use of color in documents, document images are more and more often stored in color. Because color images tend to have large file sizes, they are often compressed before being stored.

JPEG compression is commonly used for color images. However, document images contain many high-frequency components arising from character outlines, so JPEG compresses them inefficiently, and raising the compression ratio degrades image quality.

For this reason, a compression technique for document images has been developed that separates the character component and the remaining component (the image with the characters removed) from the document image as a foreground component and a background component, respectively, and achieves high compression by compressing each with a different method (see, for example, Patent Document 1). This technique is hereinafter referred to as the "compression technique using foreground/background separation".

In the compression technique using foreground/background separation, character regions are extracted. Because characters are usually a single color, at least on a per-character basis, a region judged to contain characters is color-reduced while kept at high resolution and is compressed with a method such as MMR.

The image obtained by removing the characters, that is, by interpolating the pixels extracted as a character region from the surrounding pixel values (the background component), is smooth and compresses efficiently with JPEG or a similar method, so it is compressed with a method such as JPEG.

In doing so, the compression ratio can be raised further by lowering the resolution of the background component (reducing the image). By displaying the foreground component superimposed on the background component generated in this way, the image can be converted into one with a small file size whose appearance is almost unchanged from the original.
Patent Document 1: Japanese Patent No. 2611012

When an image is compressed with the compression technique using foreground/background separation, character regions in which the characters are very small or the resolution of the input image is relatively low suffer image-quality degradation: the character shapes collapse when the foreground component is color-reduced, so that the characters become hard or impossible to read. Conversely, rendering large characters at the same high resolution as small characters is excessive and inflates the file size.

The present invention has been made to solve these problems, and its object is to provide an image processing apparatus and an image processing program capable of optimizing the balance between the readability (legibility) of the characters in a document image and the file size.

To solve the above problems, an image processing apparatus according to the present invention comprises: an image memory that holds a document image in which characters are entered; a layout analysis unit that performs layout analysis on the document image held in the image memory to obtain layout information including the positions of the elements constituting the document image and attributes indicating the types of the elements; a region determination unit that determines, using the layout information obtained by the layout analysis unit, the regions to be separated from the document image; a size estimation unit that selects, from among the regions determined by the region determination unit and on the basis of the attributes, a region containing an element subject to resolution conversion and estimates the size of the element in the selected region; a resolution conversion unit that converts the resolution of the region according to the element size estimated by the size estimation unit; a separation unit that separates the image of the region whose resolution has been converted by the resolution conversion unit from the document image; an image generation unit that generates a foreground image and a background image by performing processing including binarization on the image separated by the separation unit; an image compression unit that compresses the foreground image and the background image generated by the image generation unit individually; and an image integration unit that generates a document image in which the foreground image and the background image compressed by the image compression unit are arranged according to the layout information.

An image processing program according to the present invention causes an image processing apparatus to execute processing, and causes the image processing apparatus to function as: an image memory that holds a document image in which characters are entered; a layout analysis unit that performs layout analysis on the document image held in the image memory to obtain layout information including the positions of the elements constituting the document image and attributes indicating the types of the elements; a region determination unit that determines, using the layout information obtained by the layout analysis unit, the regions to be separated from the document image; a size estimation unit that selects, from among the regions determined by the region determination unit and on the basis of the attributes, a region containing an element subject to resolution conversion and estimates the size of the element in the selected region; a resolution conversion unit that converts the resolution of the region according to the element size estimated by the size estimation unit; a separation unit that separates the image of the region whose resolution has been converted by the resolution conversion unit from the document image; an image generation unit that generates a foreground image and a background image by performing processing including binarization on the image separated by the separation unit; an image compression unit that compresses the foreground image and the background image generated by the image generation unit individually; and an image integration unit that generates a document image in which the foreground image and the background image compressed by the image compression unit are arranged according to the layout information.

According to the present invention, the balance between the readability (legibility) of the characters in a document image and the file size can be optimized.

Embodiments of the present invention are described in detail below with reference to the drawings. FIG. 1 is a block diagram showing the configuration of an image processing apparatus according to one embodiment of the present invention.

As shown in FIG. 1, the image processing apparatus comprises an image capturing device 1 such as a digital camera or an image scanner; an image output device 3 such as a printer or a monitor; and a computer 2 (hereinafter referred to as the PC 2) to which the image capturing device 1 and the image output device 3 are connected.

The image capturing device 1 reads an image in which characters are entered, for example a document image, and inputs it to the PC 2. That is, the image capturing device 1 functions as image acquisition means that images the surface of a form subject to character recognition and thereby reads and acquires an image.

The form is assumed to carry predetermined information such as characters. The form images obtained by the image capturing device 1 may be color images such as RGB, black-and-white images, grayscale images, or a mixture of these. Hereinafter, "entered" covers both handwritten and printed characters.

The image information may be read by a digital camera, an image scanner, or the like, or may be obtained by conversion from a document file. The PC 2 may also acquire image information stored on another computer through a network such as a LAN.

The PC 2 has hardware including a CPU, memory such as RAM and ROM, an auxiliary storage device such as a hard disk drive, an input device such as a keyboard, a pointing device such as a mouse, a display device such as a monitor, and an interface board for the image scanner.

The hard disk drive of the PC 2 stores programs such as an operating system (hereinafter referred to as the OS) and image processing software. When these programs are loaded into memory by the CPU, either automatically at startup or through a start operation by the user, the PC 2 functions as an image processing apparatus. In other words, the functions of the PC 2 are realized by the cooperation of the above hardware with the programs loaded into memory, such as the OS and the image processing software.

The PC 2, functioning as an image processing apparatus, includes an image memory 12, a layout analysis unit 13, a separation target region determination unit 14, a character size estimation unit 15, a resolution conversion unit 16 serving as a resolution-converted image generation unit, a region separation unit 17, a region image generation unit 18, an image compression unit 19, an image output device 3, and so on. The image memory 12 holds (stores) the document image input from the image capturing device 1.

The layout analysis unit 13 performs layout analysis on the elements constituting the document image read from the image memory 12, and thereby obtains layout information (information such as which area of the document image the pixels of each element occupy) that includes the position of each element on the page and a character attribute (character, figure, image, line drawing, and so on) used to judge whether the element is a separation target.

Here, layout analysis means extracting the elements that constitute a document image: obtaining the unit character regions contained in the document, the character line regions and character string regions obtained by merging them, photograph regions and chart regions, and the attributes of all of these.

For a character string region, for example, the attributes extracted include the position, width, and height of the region and the character lines it contains. For a character line, they include the position, width, and height of the line, the character string region that contains the line, and the unit character regions contained in the line.

For a unit character region, the attributes extracted include the position, width, and height of the character and the character line region that contains it. Layout analysis of document images is a common technique in fields such as character recognition.

That is, the layout analysis unit 13 performs layout analysis on the document image held in the image memory and obtains layout information that includes the positions of the elements constituting the document image (characters, line drawings, photographs, and so on) and attributes indicating the type of each element (whether it is a character, a line drawing, a photograph, and so on).

The separation target region determination unit 14 functions as a determination unit that uses the layout information obtained by the layout analysis unit 13 to determine the regions (foreground/background separation target regions) that contain elements to be separated from the document image (character strings that become foreground; line drawings, photographs, and the like that become background).

That is, the separation target region determination unit 14 determines the foreground/background separation target regions in the document image using the layout information obtained by the layout analysis unit 13.

The character size estimation unit 15 holds in advance attribute information for the elements subject to resolution conversion (here, characters). From among the foreground/background separation target regions determined by the separation target region determination unit 14, it selects the regions that contain such elements (characters), based on the attributes included in the layout information, and estimates the size (in points) of the elements (characters) in each selected region. Here, estimation means calculating the character size (in points) from the height of the region.
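As a rough illustration of calculating a point size from a region height, the sketch below assumes that the scan resolution in dots per inch is known and uses the typographic convention that one point is 1/72 inch. The function name `estimate_point_size` and the `dpi` parameter are illustrative and do not appear in the source.

```python
def estimate_point_size(region_height_px: float, dpi: float) -> float:
    """Convert a region height in pixels to points (1 pt = 1/72 inch).

    Assumption: the height of the region approximates the character height,
    and dpi is the resolution at which the document image was captured.
    """
    if dpi <= 0:
        raise ValueError("dpi must be positive")
    return region_height_px * 72.0 / dpi


if __name__ == "__main__":
    # A text line 50 pixels high in a 300 dpi scan.
    print(estimate_point_size(50, 300))
```

Under these assumptions a 50-pixel-high line scanned at 300 dpi corresponds to 12-point text.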

The resolution conversion unit 16 compares the character size estimated by the character size estimation unit 15 with a preset threshold (allowable range). If the character size falls outside the threshold, it converts the resolution of the region containing the characters to a value at which the character size falls within the threshold.

When the character size is smaller than a preset threshold, the resolution conversion unit 16 changes the resolution of the foreground/background separation target region upward. When the character size is larger than a preset threshold, it changes the resolution of the region downward.

That is, the resolution conversion unit 16 changes the resolution used for foreground/background separation according to the character size estimated by the character size estimation unit 15.
The region separation unit 17 separates from the document image the image of the foreground/background separation target region whose resolution has been converted by the resolution conversion unit 16 according to the character size so that the size falls within the allowable range.

The region separation unit 17 separates the foreground and the background from the document image by performing processing including binarization on the image of the separated foreground/background separation target region.

For example, the region separation unit 17 treats the pixel region with the darker of the two binarized values as the foreground and the rest as the background.

To estimate the color of the foreground image (foreground color) or of the background image (background color), the pixel values of the regions separated as foreground and background are looked up in the original document image, and the mean or the mode of each of the R, G, and B channels is taken. The color estimation technique described here is only an example, and other techniques may be used.
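The per-channel mean or mode described above might be sketched as follows. The image is assumed to be a list of rows of (R, G, B) tuples and the layer a Boolean mask of the same shape; `estimate_color` and its parameters are hypothetical names chosen for this illustration, not part of the source.

```python
from collections import Counter


def estimate_color(image, mask, use_mode=False):
    """Estimate one RGB color for the pixels selected by mask.

    image: rows of (r, g, b) tuples taken from the original document image.
    mask:  rows of booleans, True where the pixel belongs to the layer
           (foreground or background) whose color is being estimated.
    """
    selected = [px
                for img_row, mask_row in zip(image, mask)
                for px, keep in zip(img_row, mask_row) if keep]
    if not selected:
        raise ValueError("mask selects no pixels")
    channels = list(zip(*selected))  # three tuples: all R, all G, all B
    if use_mode:
        # Most frequent value of each channel, taken independently.
        return tuple(Counter(ch).most_common(1)[0][0] for ch in channels)
    # Arithmetic mean of each channel, rounded to an integer level.
    return tuple(round(sum(ch) / len(ch)) for ch in channels)
```

Calling the function once with the foreground mask and once with its complement gives the two color estimates. Taking the mode per channel rather than per whole color follows the wording of the text.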

The region image generation unit 18 generates a foreground image and a background image based on the image separated by the region separation unit 17, the estimated foreground color information, and the estimated background color information.

The image compression unit 19 performs predetermined compression processing on the foreground image and the background image generated by the region image generation unit 18.
In the predetermined compression processing, the foreground image and the background image may be compressed with the same compression technique or with different ones. Here they are compressed with different techniques: the foreground image with, for example, MMR compression, and the background image with, for example, JPEG compression.

The image integration unit 20 combines and integrates the images compressed by the image compression unit 19 (the foreground image and the background image) according to the layout information, generates a document image with the same arrangement as the original document image, and outputs the generated document image to the image output device 3.

The operation of this image processing apparatus is described below with reference to FIGS. 2 to 10.
An image scanner serving as the image capturing device 1 captures an image from a form, such as a document, placed on its reading surface and generates electronic image data containing character information (hereinafter referred to as a "document image").

The document image captured by the image capturing device 1 is input from the image capturing device 1 to the PC 2 (step S101) and held in the image memory 12. FIG. 3 shows an example of a document image input to the PC 2.

As shown in FIG. 3, the document image 40 is assumed to contain a TITLE, a section bar, characters in a large font, a photograph (picture), characters in a small font, and so on.

Next, the layout analysis unit 13 performs layout analysis on the document image 40 of FIG. 3 (step S102).

As a result of the layout analysis, the position coordinates (extents) of the TITLE region 41, the section bar region 42, the large-font character region 43, the photograph (picture) region 44, the small-font character region 45, and so on of the document image 40 are detected, as shown in FIG. 4. The information resulting from this layout analysis is referred to as layout information.

As the position coordinates (extent) of a region, at least two points are detected: the x, y coordinates of the upper-left corner and the x, y coordinates of the lower-right corner of the rectangle that bounds the region. The coordinates of all four corners of the rectangle may be used instead.

Next, the separation target region determination unit 14 uses the layout information to detect, for example, regions in which characters are entered and regions of photographs and line drawings, and determines the foreground/background separation target regions (step S103).

Here, based on the layout information of the document image obtained by the layout analysis unit 13, the row-direction regions in which multiple characters are lined up horizontally (character line regions) are taken as the separation target regions.

Although character line regions are used as the foreground/background separation target regions in this example, unit character regions or character string regions may be used instead. Line drawing regions, ruled line regions, and the like may also be included among the foreground/background separation target regions.
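One possible way to represent the layout information and to pick out the separation target regions is sketched below. The `Region` record, its field names, and the attribute strings such as "text_line" are assumptions made for illustration; the source specifies only that each region has a bounding rectangle given by its upper-left and lower-right corners and an attribute indicating its type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """One layout element: bounding rectangle plus a type attribute."""
    left: int    # x of the upper-left corner
    top: int     # y of the upper-left corner
    right: int   # x of the lower-right corner
    bottom: int  # y of the lower-right corner
    kind: str    # hypothetical labels, e.g. "text_line", "photo", "ruled_line"

    @property
    def height(self) -> int:
        return self.bottom - self.top

    @property
    def width(self) -> int:
        return self.right - self.left


def select_separation_targets(layout, kinds=("text_line",)):
    """Return the regions whose attribute is among the target kinds."""
    return [region for region in layout if region.kind in kinds]
```

Passing a wider `kinds` tuple, for example one that also names line drawings or ruled lines, corresponds to the variation mentioned in the preceding paragraph.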

Next, the character size estimation unit 15 selects, from among the foreground/background separation target regions determined by the separation target region determination unit 14, the regions that contain characters and estimates the character size in each (step S104). FIGS. 5 and 6 show an example of character size estimation performed on the document image of FIG. 3.

Here, the character size estimation unit 15 detects the regions 41, 43, and 45 (their heights and widths), determined by the separation target region determination unit 14, in which characters are present, and estimates the character size (in points) for each of the regions 41, 43, and 45. For the region 43, for example, the character size estimation unit 15 detects multiple character string regions E1 to E5 from the distribution of pixels in the region 43, the continuity of the pixel arrangement, and so on, as shown in FIG. 5, and judges (estimates) how many points the height of each region corresponds to by comparing it with preset information on character point sizes.

The character size estimation unit 15 then assigns a character line attribute to each of the character string regions E1 to E5 in turn and obtains the position (extent) and size of the unit characters for each character line. For example, from the vertical extent (height) of the character string region E1, which is a character line, it obtains the position (extent) and size of each of the unit character regions A1 to A5, ... that can be placed in that line, as shown in FIG. 6.

In this case, the mean or the mode of the sizes of the unit characters contained in the character line may be used as the estimate, or the height of the character line may be used as the estimate.
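The three alternatives for the estimate (mean of the unit character sizes, their mode, or the line height) can be expressed with the standard `statistics` module. The function name `line_character_size` and the `method` keyword are illustrative only.

```python
from statistics import mean, mode


def line_character_size(unit_sizes, line_height, method="mean"):
    """Pick a character size estimate for one character line.

    unit_sizes:  sizes of the unit character regions in the line.
    line_height: height of the character line region itself.
    method:      "mean", "mode", or "line" (use the line height directly).
    """
    if method == "line":
        return line_height
    if not unit_sizes:
        raise ValueError("no unit character sizes given")
    if method == "mean":
        return mean(unit_sizes)
    if method == "mode":
        return mode(unit_sizes)
    raise ValueError(f"unknown method: {method}")
```

The mode is less sensitive than the mean to a few outliers, such as punctuation marks or merged characters, which may be one reason to prefer it.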

The character size may be estimated separately for each foreground/background separation target region, or neighboring foreground/background separation target regions may be grouped and the character size estimated for the group. No character size estimation is performed for regions that contain no characters, such as line drawing regions and ruled line regions.

Next, the resolution conversion unit 16 converts the resolution of each foreground/background separation target region according to the character size estimated by the character size estimation unit 15 (step S105). In this resolution conversion, the character size is compared with a preset threshold (range), and depending on the result (whether the size is inside or outside the threshold range and, if outside, whether it is above or below the range) the resolution is left unchanged, raised, or lowered.

That is, let Tu be the character size threshold that decides whether to perform resolution conversion by enlargement, and let Td be the character size threshold that decides whether to perform resolution conversion by reduction. If the estimated foreground character size Te satisfies Te < Tu, the resolution is raised; if Te > Td, the resolution is lowered.

The thresholds Tu and Td may be fixed values, may be varied according to the application or the size needed to achieve the required character recognition quality, or may be determined dynamically. For example, if the average size of the characters in the document is Ta, the thresholds can be determined dynamically by setting Tu = Ta / 2 and Td = Ta × 3.
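The threshold comparison and the dynamic choice of Tu and Td can be sketched as below. The function names and the returned strings are assumptions made for illustration; the value 24 used for Td in the example is hypothetical, since the source gives a concrete value only for Tu.

```python
def dynamic_thresholds(ta):
    """Derive (Tu, Td) from the average character size Ta: Tu = Ta/2, Td = Ta*3."""
    return ta / 2, ta * 3


def conversion_direction(te, tu, td):
    """Decide how to convert a region whose estimated character size is Te.

    Returns "enlarge" when Te < Tu, "reduce" when Te > Td, otherwise "keep".
    """
    if tu > td:
        raise ValueError("Tu must not exceed Td")
    if te < tu:
        return "enlarge"
    if te > td:
        return "reduce"
    return "keep"


if __name__ == "__main__":
    tu, td = 8, 24  # Td = 24 is a hypothetical value for this example
    for size in (6, 10, 30):
        print(size, conversion_direction(size, tu, td))
```

With these values a 6-point region is enlarged, a 10-point region is left as it is, and a 30-point region is reduced.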

The resolution may be set by any suitable method. With R the resolution of the image before conversion and R' the resolution after conversion, examples are: adding a value Ra to the resolution of the original image (R' = R + Ra); multiplying the resolution of the original image by a value Rm (R' = R × Rm); or choosing the resolution so that the character size after conversion becomes a value Tt (R' = R × Tt / Te).
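The three ways of setting R' translate directly into small functions. The function names are illustrative; the symbols R, Ra, Rm, Tt, and Te are those of the text.

```python
def resolution_additive(r, ra):
    """R' = R + Ra: shift the resolution by a fixed amount."""
    return r + ra


def resolution_multiplicative(r, rm):
    """R' = R * Rm: scale the resolution by a fixed factor."""
    return r * rm


def resolution_for_target_size(r, tt, te):
    """R' = R * Tt / Te: make the converted character size equal Tt."""
    if te <= 0:
        raise ValueError("estimated character size Te must be positive")
    return r * tt / te


if __name__ == "__main__":
    # 6-point characters in a 200 dpi region, target size 12 points.
    print(resolution_for_target_size(200, 12, 6))
```

In the third method the scale factor R' / R equals Tt / Te, so small characters (Te < Tt) are enlarged and large ones (Te > Tt) are reduced by the same rule.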

After the resolution has been set, the image of the region to be converted is enlarged or reduced by a factor of (R' / R). A generally known pixel value interpolation method such as bilinear or bicubic interpolation is used for the enlargement or reduction.
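A minimal bilinear resampler for a grayscale region, written without external libraries, might look like the following. It assumes the region is a non-empty list of equal-length rows of numbers and maps pixel centers between the two grids; in practice an imaging library would normally be used, and when reducing by a large factor some smoothing before sampling would be advisable. The name `resize_bilinear` is illustrative.

```python
def resize_bilinear(img, scale):
    """Resize a 2-D grayscale image by `scale` (= R'/R) with bilinear interpolation."""
    if scale <= 0:
        raise ValueError("scale must be positive")
    src_h, src_w = len(img), len(img[0])
    dst_h = max(1, round(src_h * scale))
    dst_w = max(1, round(src_w * scale))
    out = []
    for y in range(dst_h):
        # Map the destination pixel center back into source coordinates
        # and clamp it to the valid range.
        sy = min(max((y + 0.5) / scale - 0.5, 0.0), src_h - 1)
        y0 = int(sy)
        y1 = min(y0 + 1, src_h - 1)
        fy = sy - y0
        row = []
        for x in range(dst_w):
            sx = min(max((x + 0.5) / scale - 0.5, 0.0), src_w - 1)
            x0 = int(sx)
            x1 = min(x0 + 1, src_w - 1)
            fx = sx - x0
            # Interpolate horizontally on the two neighboring rows,
            # then vertically between the two results.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        out.append(row)
    return out
```

For a color region the same function can be applied to each channel separately.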

If the threshold Tu is preset to, for example, 8 points, then in the example of FIG. 7 the region EE at the lower right, whose character size is 6 points, is subject to resolution conversion by enlargement, and the region AA at the top, whose character size is 30 points, is subject to resolution conversion by reduction.

Subsequently, the region separation unit 17 separates each foreground/background separation target region, whether its resolution was converted by the resolution conversion unit 16 according to the character size or left unconverted because the character size was within the threshold range (step S106).

First, the region separation unit 17 binarizes the image of the foreground/background separation target region. A widely known binarization technique, such as Otsu's method, is used.

Next, the region separation unit 17 determines which of the two binarized values (0 or 1) should be the foreground and which the background. In most documents, dark content such as characters and images is written on a light background such as white. The region separation unit 17 therefore separates the pixels of the darker binarized value as the foreground and the rest as the background.

Alternatively, because within a region containing characters the background (the part other than the characters) usually has a larger area (number of pixels) than the foreground (the black pixels of the characters), the pixels of the binarized value that occupies the smaller area may be selected as the foreground.

In addition, because the edges of a character string region are usually background, the binarized value that occupies less of the edge of the character string region may be chosen as the foreground, which allows the foreground/background separation to handle inverted characters as well.
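The binarization and foreground-selection heuristics above can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions: the image is 8-bit grayscale stored as a list of rows, a mask value of 1 marks the darker class, and all function names are hypothetical.

```python
def otsu_threshold(pixels):
    """Return the threshold t that maximizes between-class variance (Otsu)."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def binarize(gray, t):
    """Mask with 1 for dark pixels (value <= t) and 0 for light pixels."""
    return [[1 if p <= t else 0 for p in row] for row in gray]


def foreground_value_by_area(mask):
    """Pick the mask value covering fewer pixels as the foreground."""
    ones = sum(sum(row) for row in mask)
    total = sum(len(row) for row in mask)
    return 1 if ones <= total - ones else 0


def foreground_value_by_border(mask):
    """Pick the mask value that occupies less of the region border."""
    h, w = len(mask), len(mask[0])
    border = [mask[y][x] for y in range(h) for x in range(w)
              if y in (0, h - 1) or x in (0, w - 1)]
    ones = sum(border)
    return 1 if ones <= len(border) - ones else 0


if __name__ == "__main__":
    # Inverted text: light glyph pixels (230) on a dark background (20).
    gray = [[20, 20, 20, 20],
            [20, 230, 230, 20],
            [20, 230, 230, 20],
            [20, 20, 20, 20]]
    t = otsu_threshold([p for row in gray for p in row])
    mask = binarize(gray, t)
    print(foreground_value_by_border(mask))
```

For the inverted example, the "darker value is foreground" rule would pick the wrong class, while both the area rule and the border rule select the light pixels.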

Here, the region separation unit 17 estimates the foreground color by looking up, in the original document image, the pixel values of the region separated as the foreground. The foreground color can be estimated by taking the mean or the mode of those pixel values for each of the R, G, and B channels.

Subsequently, the region separation unit 17 estimates the background color by looking up, in the original document image, the pixel values of the region separated as the background. The background color can be estimated by taking the mean or the mode of the background pixel values for each of the R, G, and B channels.

Subsequently, the region separation unit 17 estimates the background color by referring to the pixel values in the original document image that correspond to the background. To do so, it reads the original document image stored in the image memory 12 and obtains the mean or the mode of the background pixel values for each of the R, G, and B channels.
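The color estimation for either class can be sketched in a few lines. This is an illustrative sketch only: the function name, the mask convention, and the rounding of the mean to an integer are assumptions made here.

```python
from collections import Counter


def estimate_color(image, mask, value, method="mean"):
    """Estimate one RGB color for the pixels where mask == value.

    image: list of rows of (r, g, b) tuples taken from the original image.
    mask:  list of rows of 0/1 values with the same shape.
    method: "mean" or "mode", computed independently per channel.
    """
    pixels = [image[y][x]
              for y, row in enumerate(mask)
              for x, m in enumerate(row) if m == value]
    if not pixels:
        raise ValueError("no pixels with the requested mask value")
    channels = list(zip(*pixels))  # regroup as three per-channel sequences
    if method == "mean":
        return tuple(round(sum(c) / len(c)) for c in channels)
    if method == "mode":
        return tuple(Counter(c).most_common(1)[0][0] for c in channels)
    raise ValueError("method must be 'mean' or 'mode'")


if __name__ == "__main__":
    image = [[(0, 0, 100), (255, 255, 255)],
             [(0, 0, 110), (250, 250, 250)]]
    mask = [[1, 0],
            [1, 0]]
    print(estimate_color(image, mask, 1))          # foreground estimate
    print(estimate_color(image, mask, 0, "mean"))  # background estimate
```

The mode tends to be less sensitive to anti-aliased edge pixels than the mean, which may matter for thin strokes.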

Subsequently, the region image generation unit 18 generates a foreground image and a background image, each in its estimated color, from the foreground and background separated by the region separation unit 17 (step S107).

A foreground image is generated by merging the foreground pixel components of foreground/background separation target regions that have similar foreground colors and equal resolutions. When the foreground images and the background image are integrated, the arrangement information obtained beforehand by layout analysis (which foreground image is placed at which position in the background image) is used.

This makes it possible to place each foreground image at the correct position when the foreground and background images are integrated. The background image can be generated by replacing, in the original document image, the value of each pixel that constitutes the foreground with the background color of the foreground region containing that pixel.
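The background generation step can be illustrated with a minimal sketch, assuming a single region, a foreground mask where 1 marks foreground pixels, and a hypothetical function name.

```python
def make_background(image, fg_mask, background_color):
    """Return a copy of `image` with foreground pixels painted over.

    image:            list of rows of (r, g, b) tuples (original image).
    fg_mask:          list of rows of 0/1, 1 marking foreground pixels.
    background_color: estimated background color of this region.
    """
    return [[background_color if m == 1 else px
             for px, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, fg_mask)]


if __name__ == "__main__":
    white, near_white, black = (255, 255, 255), (250, 250, 250), (0, 0, 0)
    image = [[white, black, near_white],
             [white, black, white]]
    fg_mask = [[0, 1, 0],
               [0, 1, 0]]
    for row in make_background(image, fg_mask, white):
        print(row)
```

Only the stroke pixels change; the surrounding background pixels keep their original values, so gradients or images behind the text are preserved.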

As a result of this processing, the background image becomes a smooth image containing no characters, and image efficiency can be improved. If a smaller background image is desired, resolution conversion that reduces the background image may also be performed. An example of the generated foreground and background images is shown in FIG. 8.

Subsequently, the image compression unit 19 compresses each of the foreground and background images generated by the region image generation unit 18 (step S108), producing foreground images 91, 92, and 93 of differing resolutions and a background image 94, as shown in FIG. 9. Note that resolution conversion may not be performed when the character size is within the thresholds, so the resolutions are not necessarily all different.

The foreground image 91 is generated with a resolution of 200 dpi, color RGB (0, 0, 100), coordinate information x1, and so on. The foreground image 92 is generated with a resolution of 300 dpi, color RGB (0, 0, 0), coordinate information x2, and so on. The foreground image 93 is generated with a resolution of 600 dpi, color RGB (0, 0, 0), coordinate information x3, and so on. The background image 94 is generated with a resolution of 150 dpi and so on.

The foreground images 91, 92, and 93 are compressed with MMR compression, which is suited to binary images, and the background image 94 is compressed with JPEG compression, which is suited to natural images. This raises the compression ratio. The compression methods used here are not limited to MMR and JPEG compression; other compression methods may be used.

Subsequently, the image integration unit 20 integrates (rearranges) the foreground images 91, 92, and 93 and the background image 94 compressed by the image compression unit 19, thereby generating a document image 40a whose layout is almost the same as that of the original document image, as shown in FIG. 10, and outputs it to the image output device 3 (step S109).

Here, the case where Portable Document Format (hereinafter "PDF") is chosen as the output format is described. PDF is a format in which multiple image formats and resolutions can be mixed and in which a mask attribute can be assigned to a binary image.

That is, the image integration unit 20 generates a document image in which the foreground images are placed on top of the background image, and gives each foreground image a per-pixel mask attribute. By displaying the foreground color set for each foreground image only at pixels whose mask is ON (1), a document that looks the same as the original image can be output from the foreground and background images.
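The effect of the mask attribute can be illustrated by compositing the layers in memory. This sketch is not PDF generation: it assumes all layers have already been brought to the same resolution as the background, and the layer representation (mask, color, position) and function name are hypothetical.

```python
def composite(background, layers):
    """Paint each foreground layer over a copy of the background.

    background: list of rows of (r, g, b) tuples.
    layers:     iterable of (mask, color, (left, top)); the layer color is
                shown only where the mask is ON (1), elsewhere the
                background stays visible.
    """
    height, width = len(background), len(background[0])
    out = [list(row) for row in background]
    for mask, color, (left, top) in layers:
        for dy, mask_row in enumerate(mask):
            for dx, bit in enumerate(mask_row):
                y, x = top + dy, left + dx
                if bit == 1 and 0 <= y < height and 0 <= x < width:
                    out[y][x] = color
    return out


if __name__ == "__main__":
    white, black = (255, 255, 255), (0, 0, 0)
    background = [[white] * 3 for _ in range(3)]
    layer = ([[1, 0], [0, 1]], black, (1, 1))
    for row in composite(background, [layer]):
        print(row)
```

In a PDF, the viewer performs the equivalent operation at render time, which is why each foreground layer can keep its own resolution in the file.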

Thus, according to the image processing apparatus of this embodiment, when a document image contains small characters, a document can be generated that is easy to read, with little degradation in image quality, even though the file size is small.

The image quality improvement achieved by this embodiment is now described with reference to FIGS. 11 and 12. FIG. 11 shows the image obtained when the original color image is binarized as it is, and FIG. 12 shows the image obtained when the resolution is increased according to this embodiment.

As shown in FIG. 11, when a 300 dpi color image 95 of the foreground portion (hereinafter "original image 95") is binarized at its original resolution, gradation information is lost through binarization, so the small characters in the binarized image 96 show severe visible degradation. The character outlines and stroke widths also look distorted.

In contrast, as shown in FIG. 12, when the image processing apparatus of this embodiment converts the original image 95 to a higher resolution (300 dpi → 600 dpi) to generate a high-resolution image 97 and then binarizes the image 97, the binarized image 98 shows little visible degradation even though gradation information is lost.

Comparing the two images (the image 96 in FIG. 11 and the image 98 in FIG. 12), the improvement in image quality is clear for the image 98 generated by the image processing apparatus of this embodiment (obtained by separating the foreground and background, converting the resolution of the foreground image, and then integrating them).

In the image processing apparatus of this embodiment, increasing the resolution of a foreground that contains characters may increase the file size of the foreground component. However, when the foreground component is compressed with MMR compression, the increase in file size is small compared with the increase in the number of pixels, so the benefit of improved image quality outweighs it.

In addition, when the image contains large characters, lowering the resolution of the foreground component of that region has little effect on image quality, so the file size of the output image can be made even smaller by converting the region to a lower resolution.

The present invention is not limited to the embodiment described above, and its constituent elements may be modified at the implementation stage without departing from the gist of the invention. Various inventions can also be formed by appropriately combining the constituent elements disclosed in the embodiment.

For example, some constituent elements may be removed from the full set of constituent elements shown in the embodiment. Constituent elements from different embodiments may also be combined as appropriate.

FIG. 1 is a block diagram showing the configuration of an image processing apparatus according to an embodiment of the present invention.
FIG. 2 is a flowchart showing the operation of the image processing apparatus.
FIG. 3 is a diagram showing an example of a document image to be processed.
FIG. 4 is a diagram showing the regions detected by layout analysis.
FIG. 5 is a diagram showing an example in which character string regions are detected in the character regions of the document image of FIG. 3.
FIG. 6 is a diagram showing an example in which the unit character regions contained in the character string regions of FIG. 5 are detected.
FIG. 7 is a diagram showing the specified character size of each region of the document image.
FIG. 8 is a diagram showing a foreground image.
FIG. 9 is a diagram showing the foreground images and the background image obtained by converting the resolution of the original document image.
FIG. 10 is a diagram showing a document image generated with almost the same layout as the original document image.
FIG. 11 is a diagram showing the image quality when the original color image is binarized as it is.
FIG. 12 is a diagram showing the image quality when the original color image is binarized after resolution conversion.

Explanation of symbols

1: image capture device, 2: computer (PC), 3: image output device, 12: image memory, 13: layout analysis unit, 14: separation target region determination unit, 15: character size estimation unit, 16: resolution conversion unit, 17: region separation unit, 18: region image generation unit, 19: image compression unit, 20: image integration unit.

Claims

An image processing apparatus comprising:
an image memory that holds a document image containing characters;
a layout analysis unit that obtains layout information, including the position of each element constituting the document image and an attribute indicating the type of the element, by performing layout analysis on the document image held in the image memory;
a region determination unit that determines regions to be separated from the document image using the layout information obtained by the layout analysis unit;
a size estimation unit that selects, based on the attribute, a region containing an element subject to resolution conversion from among the regions determined by the region determination unit, and estimates the size of the element in the selected region;
a resolution conversion unit that converts the resolution of the region according to the size of the element estimated by the size estimation unit;
a separation unit that separates, from the document image, an image of the region whose resolution has been converted by the resolution conversion unit;
an image generation unit that generates a foreground image and a background image by performing processing including binarization on the image separated by the separation unit;
an image compression unit that individually compresses the foreground image and the background image generated by the image generation unit; and
an image integration unit that generates a document image in which the foreground image and the background image compressed by the image compression unit are arranged according to the layout information.

The image processing apparatus according to claim 1, wherein the resolution conversion unit changes the resolution of the region in the direction of increasing the resolution when the size of the element estimated by the size estimation unit is smaller than a preset threshold.

The image processing apparatus according to claim 1, wherein the resolution conversion unit changes the resolution of the region in the direction of decreasing the resolution when the size of the element estimated by the size estimation unit is larger than a preset threshold.

An image processing program for causing an image processing apparatus to execute processing, the program causing the image processing apparatus to function as:
an image memory that holds a document image containing characters;
a layout analysis unit that obtains layout information, including the position of each element constituting the document image and an attribute indicating the type of the element, by performing layout analysis on the document image held in the image memory;
a region determination unit that determines regions to be separated from the document image using the layout information obtained by the layout analysis unit;
a size estimation unit that selects, based on the attribute, a region containing an element subject to resolution conversion from among the regions determined by the region determination unit, and estimates the size of the element in the selected region;
a resolution conversion unit that converts the resolution of the region according to the size of the element estimated by the size estimation unit;
a separation unit that separates, from the document image, an image of the region whose resolution has been converted by the resolution conversion unit;
an image generation unit that generates a foreground image and a background image by performing processing including binarization on the image separated by the separation unit;
an image compression unit that individually compresses the foreground image and the background image generated by the image generation unit; and
an image integration unit that generates a document image in which the foreground image and the background image compressed by the image compression unit are arranged according to the layout information.

The image processing program according to claim 4, wherein the resolution conversion unit changes the resolution of the region in the direction of increasing the resolution when the size of the element estimated by the size estimation unit is smaller than a preset threshold.

The image processing program according to claim 4, wherein the resolution conversion unit changes the resolution of the region in the direction of decreasing the resolution when the size of the element estimated by the size estimation unit is larger than a preset threshold.