JPH04356873A

JPH04356873A - Adaptive encoding system for color document image

Info

Publication number: JPH04356873A
Application number: JP3164377A
Authority: JP
Inventors: Toshio Shirasawa; 寿夫白沢
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1991-02-06
Filing date: 1991-07-04
Publication date: 1992-12-10
Anticipated expiration: 2015-11-13
Also published as: JP3108133B2

Abstract

PURPOSE: To reproduce characters clearly and obtain high-quality images free of density unevenness in the background by extracting the background area in addition to the characters, encoding the characters and the background area as binary images, and applying natural image encoding to the remaining picture areas. CONSTITUTION: Image data from a scanner 1 is stored in a buffer 2 and output to an image area separation unit 3 in blocks of 8×8 pixels, where each block is judged to be a character/background block or a picture block. According to the result from the separation unit 3, a switch 4 supplies the image data of character/background blocks to a binary image encoding unit 5 and the image data of picture blocks to a natural image encoding unit 6. The encoding unit 5 binarizes the image data with a prescribed threshold and encodes it as a binary image. The encoding unit 6 treats all blocks other than character/background blocks as picture areas and applies natural image encoding to them. As a result, characters are rendered clearly and high-quality images are obtained without density unevenness in the background.

Description

[Detailed description of the invention]

[0001]

[Field of Industrial Application] The present invention relates to an adaptive encoding system for color document images in which characters and pictures are mixed, and is well suited to various input/output devices for color document images, such as electronic filing systems, color facsimile machines, and color copiers.

[0002]

[Prior Art] In recent years, research on encoding systems for color document images in which characters and pictures are mixed has become active. Such systems are needed for two reasons. First, a color document image contains far more data than a conventional monochrome document image, so efficient compression is indispensable when building a database from image data captured by an image input device such as a scanner. Second, if the ADCT (Adaptive Discrete Cosine Transform) method, a band-compression method that cuts high-frequency components, is adopted as the standard natural image encoding method, mosquito noise appears at edges such as those of characters and the degradation in image quality becomes conspicuous. Lossless entropy coding is better suited to images in which resolution matters, such as characters, whereas lossy multi-level coding such as band compression is better suited to images in which gradation matters, such as natural images.

[0003] A method has therefore been proposed that extracts the edge components of characters from the original image, encodes the edge components with dynamic arithmetic coding, and encodes the remainder of the original image, after the edge components have been removed, with ADCT ("An Encoding Method for Mixed Text/Image Documents," Proceedings of the 1990 National Conference of the Institute of Image Electronics Engineers, pp. 131-136).

[0004] In this method, the edge components of characters are first extracted and binarized to generate a character pattern, which is then arithmetic coded. Next, the density of the extracted character pattern is replaced with the average density of the surrounding background to produce an image from which the edge components have been removed, and this image is encoded with ADCT.

[0005] The edge components are separated and encoded adaptively in this way because, as noted above, ADCT is a compression method intended for natural images and works basically by cutting high-frequency components. When it is applied to images such as characters, the image quality around edges degrades unless the compression ratio is lowered to reduce the amount of high-frequency content that is cut. If the edge components are separated, binarized, and losslessly encoded, ADCT is applied to an image from which the high-frequency components of the edges have already been removed, so image quality can be improved for the same overall code size.

[0006]

[Problems to Be Solved by the Invention] In the conventional method described above, the blank background of the paper on which the color document image is printed is also regarded as part of the natural image and encoded with ADCT. However, for the white originals typically used for color documents, the background carries no intentional information, and the density unevenness observed when it is read by an image scanner stems from the quality of the paper.

[0007] For such background areas, replacing the input values with a constant density, rather than faithfully reproducing the input gradation, eliminates the density unevenness and improves image quality. It is also considerably more efficient in terms of coding than encoding the background as a natural image. An object of the present invention is therefore to achieve higher compression and higher image quality than the conventional method by extracting the background in addition to the characters.

[0008] Another object of the present invention, noting that natural images often do not require very high resolution, is to achieve high compression by encoding picture areas at a reduced resolution. A further object, noting that within the picture area left after the character pattern is removed from the original image, natural image areas and background areas have very different characteristics, is to achieve high compression by using different encoding methods for the natural image areas and the background areas.

[0009]

[Means for Solving the Problems] The adaptive encoding system for color document images according to the present invention judges, in blocks of M×N pixels, whether each block of a color document image in which natural images with gradation and characters are mixed is a character/background area or a picture area. Based on this judgment, the image data of blocks judged to be character/background areas is encoded in a manner suited to binary images, and the image data of blocks judged to be picture areas is encoded in a manner suited to picture areas.

[0010]

[Operation] In the adaptive encoding system for color document images according to the present invention, the background area is extracted in addition to the characters; the character/background areas are encoded as binary images and the remaining picture areas are encoded as natural images. Compared with the conventional method, which treated the background as a picture area, characters are reproduced clearly and the background is free of density unevenness, yielding a high-quality image.

[0011]

[Embodiment] FIG. 1 is a block diagram showing an embodiment of the adaptive encoding system for color document images according to the present invention. In FIG. 1, a scanner 1 reads the target color document image and, because characters must be legible, has a resolution of, for example, about 400 dpi. The image data output from the scanner 1 consists of three planes, one for each of the primary colors R, G, and B, with 256 gradations per color; a white pixel is [R, G, B] = [0, 0, 0] and a black pixel is [R, G, B] = [256, 256, 256].

[0012] The image data from the scanner 1 is output in raster-scan order and temporarily stored in a buffer 2, which holds several lines of image data. The image data stored in the buffer 2 is output block by block to an image area separation unit 3, which judges whether each block is a character/background block or a picture block (photograph, halftone, solid fill, and so on). One block consists of 8×8 or 16×16 pixels, matching the ADCT method. The configuration and judgment operation of the image area separation unit 3 are described later.

[0013] Here, "characters" means characters about as black as printed text; faint characters such as pencil writing and colored characters are treated as pictures. "Background" means paper about as white as copy paper and does not include low-luminance backgrounds such as newsprint. Highlight portions of photographs and halftones are treated as pictures and are not included in the background.

[0014] Based on the judgment of the image area separation unit 3, a switch unit 4 supplies the image data to a binary image encoding unit 5 if the block is a character/background block and to a natural image encoding unit 6 if it is a picture block. Because natural image encoding is applied only to picture areas, the compression ratio can be raised substantially for images with few picture areas.

[0015] The binary image encoding unit 5 binarizes the supplied image data of character/background blocks with a predetermined threshold and performs binary image encoding. To keep the stroke width of characters from varying with the threshold, binarization is applied to edge-enhanced data. Because binary areas are given in units of blocks, each block is raster scanned into a one-dimensional data sequence, which is then entropy coded with, for example, Huffman or arithmetic coding. The scanning methods are described in detail later.
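As an illustration of the binarization and per-block raster scan described for the binary image encoding unit 5, the following Python sketch thresholds a small block and flattens it into a one-dimensional sequence. The threshold of 128, the 4×4 block, and the function names are assumptions made for illustration; the edge enhancement step and the entropy coder are omitted.

```python
def binarize_block(block, threshold=128):
    """Binarize density values: 1 = dark (character), 0 = light (background).

    The threshold is a hypothetical value, not one given in the source.
    """
    return [[1 if v >= threshold else 0 for v in row] for row in block]


def raster_scan(block):
    """Flatten a 2-D block row by row into a one-dimensional sequence."""
    return [v for row in block for v in row]


# Density values (0 = white), following the convention of paragraph [0011].
block = [
    [10, 12, 200, 11],
    [9, 210, 220, 8],
    [11, 205, 13, 10],
    [12, 10, 9, 11],
]
bits = raster_scan(binarize_block(block))
print(bits)
```

The resulting bit sequence is what an entropy coder such as a Huffman or arithmetic coder would then consume.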

[0016] The natural image encoding unit 6 treats blocks other than character/background blocks as picture areas and applies natural image encoding to them. Because picture areas also include halftone areas, smoothing is applied to remove the halftone dot structure before orthogonal transform coding such as ADCT. The dot structure is removed beforehand for two reasons: halftoning exists to improve apparent gradation using few tone levels (although some low-screen-ruling halftones are exceptions), so the dot shape itself need not be preserved; and ADCT compresses halftone images very inefficiently.
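The source does not specify the smoothing filter used to remove the halftone dot structure. As a rough sketch under that caveat, the Python example below applies a 3×3 mean filter (an assumed choice) to a checkerboard that stands in for a fine halftone pattern, showing how the dot-to-dot contrast collapses before transform coding.

```python
def box_smooth(img):
    """3x3 mean filter; border pixels average over the neighbours that exist."""
    h, w = len(img), len(img[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            vals = [img[j][i]
                    for j in range(max(0, y - 1), min(h, y + 2))
                    for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = sum(vals) / len(vals)
    return out


# A checkerboard of 0/255 values stands in for a fine halftone pattern.
halftone = [[255 if (x + y) % 2 else 0 for x in range(6)] for y in range(6)]
smoothed = box_smooth(halftone)

# Neighbouring interior pixels differ by 255 before smoothing and far less after.
print(abs(halftone[2][2] - halftone[2][3]), abs(smoothed[2][2] - smoothed[2][3]))
```

A real implementation would likely match the filter size to the screen ruling of the halftone; that choice is outside what the source states.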

[0017] A region information encoding unit 7 encodes, as region information, the information on the areas judged by the image area separation unit 3. This region information is per-block and binary (character/background area or other), so its data volume is quite small. Even assuming simply that it has the same autocorrelation as the character/background data, its data volume is about 1/256. Because the region information is also binary, it is entropy coded in the same way as the character/background areas.
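The 1/256 figure is consistent with one flag bit per 16×16-pixel block, since 16 × 16 = 256; that reading is an interpretation, as the paragraph does not spell out the block size. A small Python check under this assumption, using a hypothetical page size:

```python
def region_info_bits(width, height, block=16):
    """Number of one-bit region flags needed: one per block x block pixels."""
    blocks_x = -(-width // block)   # ceiling division
    blocks_y = -(-height // block)
    return blocks_x * blocks_y


width, height = 1600, 2240          # hypothetical page size in pixels
binary_bits = width * height        # 1 bit per pixel if the whole page were binary
flag_bits = region_info_bits(width, height)
print(flag_bits, binary_bits // flag_bits)
```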

[0018] The data encoded by the encoding units 5 to 7 is converted by a formatting unit 8 into a format suited to the application and then stored in an external storage device 9.

[0019] Next, the image area separation unit 3 is described with reference to the block diagram in FIG. 2. The image area separation unit 3 supplies the per-block image data from the buffer 2 to an MTF (Modulation Transfer Function) correction unit 10 and a white background extraction unit 11. The MTF correction unit 10 applies MTF correction to the image data, binarizes it, and outputs the result to an edge extraction unit 12 as resolution-oriented data.

[0020] The white background extraction unit 11 judges, pixel by pixel, whether the per-block image data supplied from the buffer 2 is white background, and extracts the white background areas. The per-pixel judgment simply uses density information, but density unevenness in the background itself leaves many non-white pixels inside background areas, so a block-level dilation is applied. Because non-white pixels in the background are thought to occur in random patterns, the judgment is based on the number of white pixels in the block. For example, with a block size of 16×16 pixels, if about 80% of the pixels are white, the whole block is treated as a white background block. To avoid misjudging photograph areas as white background, however, a block containing even one color pixel is treated as a picture block. Even so, white parts of photographs and highlight parts of halftones are still judged to be white background, so halftone detection is performed to keep halftone areas from being misjudged as background: a white background block that contains an area judged to be halftone is changed to a picture block. White background areas are extracted through these steps.
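The block-level rules for white background extraction can be sketched in Python as follows. This is a minimal illustration, not the circuit described: the per-pixel white, color, and halftone detectors are assumed to exist and to deliver 0/1 masks for the block, and the function name, the return labels, and the exact 0.8 ratio are assumptions.

```python
SIZE = 16  # block edge length in pixels


def classify_white_block(white_mask, color_mask, halftone_mask, ratio=0.8):
    """Classify one block as 'white', 'picture', or 'undecided'.

    Each mask is a 2-D list of 0/1 flags for the same block.
    """
    total = sum(len(row) for row in white_mask)
    white = sum(map(sum, white_mask))
    if any(map(any, color_mask)):       # a single colour pixel forces "picture"
        return "picture"
    if white >= ratio * total:
        # A white block that contains detected halftone is changed to "picture".
        return "picture" if any(map(any, halftone_mask)) else "white"
    return "undecided"                  # left to the edge-based judgment


def make_mask(count):
    """16x16 mask whose first `count` pixels in raster order are set."""
    return [[1 if y * SIZE + x < count else 0 for x in range(SIZE)]
            for y in range(SIZE)]


empty = make_mask(0)
print(classify_white_block(make_mask(210), empty, empty))         # 210/256 is about 82%
print(classify_white_block(make_mask(210), make_mask(1), empty))  # one colour pixel
```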

[0021] The edge extraction unit 12 separates the edges of characters. Because per-pixel edge extraction alone produces many separation errors, the judgment is basically made per block. Judging simply by the number of edge pixels is conceivable, but to raise extraction accuracy the judgment uses the shape patterns of edge pixels, because character edges generally have continuity. Patterns that are continuous in the vertical, horizontal, and diagonal directions are prepared according to the block size. The solid black interiors of large characters, which have recently become common with DTP (desktop publishing), are difficult to separate accurately from solid black picture areas and are therefore treated as picture areas.

[0022] Based on the results from the white background extraction unit 11 and the edge extraction unit 12, a comprehensive judgment unit 13 combines the edge areas and white background areas obtained and judges them to be character/background areas. Because encoding is performed in units of 16×16-pixel blocks, character/background areas are judged in blocks of this size. A block is judged to be a character/background area if it contains an edge block or a white background block. However, if color pixels exist in a part of the block that is neither character nor white background, the block is treated as a picture area. Finally, to change white background blocks surrounded by picture blocks inside photographs into picture blocks, the picture blocks as a whole are dilated by one block up, down, left, and right.
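The final dilation step operates on the block map rather than on pixels. A minimal Python sketch, assuming the block map is a 2-D list of booleans (True = picture block) and that "up, down, left, and right" means the four direct neighbors:

```python
def dilate_picture_blocks(is_picture):
    """Grow picture blocks by one block in the four axis directions."""
    h, w = len(is_picture), len(is_picture[0])
    out = [row[:] for row in is_picture]
    for y in range(h):
        for x in range(w):
            if is_picture[y][x]:
                for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    ny, nx = y + dy, x + dx
                    if 0 <= ny < h and 0 <= nx < w:
                        out[ny][nx] = True
    return out


# A ring of picture blocks around one white block, inside a 5x5 block map.
ring = [[1 <= y <= 3 and 1 <= x <= 3 and (y, x) != (2, 2) for x in range(5)]
        for y in range(5)]
grown = dilate_picture_blocks(ring)
print(grown[2][2])  # the enclosed white block is now a picture block
```

Note that the dilation also grows the outer boundary of the picture region by one block, which is the stated behavior of the step rather than a side effect of this sketch.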

[0023] FIG. 3 is a block diagram showing another embodiment of the image area separation unit 3. In this embodiment, the image data supplied from the buffer 2 is fed to a black character extraction unit 20 and a white pixel extraction unit 21, which perform black character extraction and white pixel extraction, respectively. The black character extraction unit 20 extracts black characters as the characters. The solid black interiors of large characters, however, are difficult to separate accurately from solid black picture areas and are treated as picture areas. As the black character extraction algorithm, for example, the image area separation method of "Color Image Processing Apparatus" (Japanese Patent Application No. 2-214140), filed earlier by the present applicant, is used. Because the density level of the white background is expected to be raised around characters, the character area is dilated so that it includes the surrounding background. This prevents picture areas from appearing between characters and the white background. The extraction result is stored in a buffer memory 22 as binary logic, with "1" for character areas and "0" for non-character areas.

[0024] The white pixel extraction unit 21 judges each pixel to be a non-white pixel if the density level of its G (green) component is at or above a threshold Th, and a white pixel if it is below Th. When the background density of originals is limited to that of copy paper or lower, that is, when the backgrounds of newspapers and low-quality printed originals are not to be detected as background areas, the density level of highlight areas in pictures (photographs and halftones) is usually higher than that of the bare background of the original. Consequently, even with this simple per-pixel threshold, a characteristic difference emerges: white pixels are dense in background areas and only sparsely distributed in highlight areas. The extraction result is stored in a buffer memory 23, with logic "1" for white pixels and logic "0" for non-white pixels.

[0025] The extraction results of the black character extraction unit 20 and the white pixel extraction unit 21, stored in the buffer memories 22 and 23, are ORed by an OR circuit 24 and stored in a buffer memory 25. For each pixel, a logic "1" result indicates a character/background area and a logic "0" result indicates a picture area.
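The per-pixel white test on the G component and the OR with the black character mask can be sketched in Python as below. The threshold value of 40 and the sample data are assumptions; the source gives only the symbol Th, and the black character mask is taken as given because its extraction algorithm is defined in a separate application.

```python
def white_pixel_mask(g_plane, th):
    """1 where the G density is below th (white pixel), else 0."""
    return [[1 if g < th else 0 for g in row] for row in g_plane]


def or_masks(a, b):
    """Pixel-wise logical OR of two 0/1 masks of equal size."""
    return [[p | q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


g_plane = [[10, 200], [120, 35]]   # G density values, 0 = white
char_mask = [[0, 1], [0, 0]]       # 1 = black character pixel (assumed given)

white = white_pixel_mask(g_plane, th=40)   # th=40 is a hypothetical value
combined = or_masks(char_mask, white)      # 1 = character/background, 0 = picture
print(white, combined)
```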

[0026] The ADCT method for encoding natural still images generally encodes in blocks of 8×8 or 16×16 pixels, so character/background areas must also be judged with a block size suited to that encoding method. A comprehensive judgment unit 26 therefore reads the pixel data stored in the buffer memory 25 in units of, for example, 16×16 pixels and judges per block whether it is a character/background area or a picture area. The judgment uses the number of black character pixels and white pixels in the block. As noted above, white pixels are sparsely distributed in highlight areas and dense in background areas. To avoid misjudging highlight areas as character/background areas, a threshold is set on the number of black character pixels and white pixels in a block (about 180 to 230 pixels), and a block containing at least that many is treated as a character/background block. Finally, to prevent separation errors inside pictures, character/background blocks surrounded by picture blocks are changed to picture blocks. Character/background areas are extracted by the above processing.
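The count-based block judgment can be sketched in Python as follows. The threshold of 200 is one assumed value inside the stated range of about 180 to 230 pixels, the image dimensions are assumed to be multiples of the block size, and the final step that reclassifies enclosed blocks is left out.

```python
def classify_blocks(mask, block=16, threshold=200):
    """Return a block map: True = character/background block, False = picture.

    `mask` is the per-pixel OR result (1 = black character or white pixel).
    """
    h, w = len(mask), len(mask[0])
    return [[sum(mask[y][x]
                 for y in range(by, by + block)
                 for x in range(bx, bx + block)) >= threshold
             for bx in range(0, w, block)]
            for by in range(0, h, block)]


# Left block: all 256 pixels set (plain background).
# Right block: only 100 pixels set (sparse white pixels, as in a highlight).
mask = [[1 if x < 16 or y * 16 + (x - 16) < 100 else 0 for x in range(32)]
        for y in range(16)]
print(classify_blocks(mask))
```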

[0027] Next, the scanning methods used when encoding the image data stored block by block in the buffer 2 are described with reference to FIGS. 4(a) to 4(c). The method shown in FIG. 4(a) scans and encodes the pixel data one block at a time. Character/background blocks are encoded while being raster scanned within the block, and picture blocks are block coded with, for example, ADCT. Because processing is per block, this method needs little memory, but it cannot be used with coding that requires reference pixels, such as dynamic arithmetic coding.

[0028] The method shown in FIG. 4(b) encodes in units of consecutive blocks with the same attribute. For character/background blocks, several blocks that are consecutive in the row direction are grouped and encoded in raster-scan order; picture blocks are encoded one block at a time, as in FIG. 4(a). The number of blocks processed at once is therefore variable. When reference pixels are used with this method, as in dynamic arithmetic coding, pixels belonging to picture blocks are given a fixed value of either "0" or "1".

[0029] The method shown in FIG. 4(c) encodes one block row at a time. Character/background blocks are encoded in raster-scan order across each block row, and pixels belonging to picture blocks are skipped and not encoded. Picture blocks are encoded one block at a time, as in FIG. 4(a). When dynamic arithmetic coding is used with this method, pixels belonging to picture blocks are given a fixed value of either "0" or "1", as in FIG. 4(b).
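The block-row scan of FIG. 4(c) can be sketched in Python as below: pixels are visited in raster order across the whole block row, and those that fall inside picture blocks are skipped. The tiny block width of 2 and the placeholder value 9 for picture pixels are chosen only to keep the example readable; the function name is hypothetical.

```python
def scan_block_row(rows, is_picture, block=8):
    """Raster-scan one block row, skipping pixels that lie in picture blocks.

    rows       -- the pixel lines of the block row (each a list of 0/1 values)
    is_picture -- one flag per block along the row (True = picture block)
    """
    out = []
    for row in rows:
        for x, v in enumerate(row):
            if not is_picture[x // block]:
                out.append(v)
    return out


rows = [
    [1, 0, 9, 9, 0, 1],
    [0, 0, 9, 9, 1, 1],
]
sequence = scan_block_row(rows, is_picture=[False, True, False], block=2)
print(sequence)
```

When a context-based coder is used, the skipped pixels still need a defined value for use as reference pixels, which is why the text assigns them a fixed "0" or "1".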

[0030] Next, the binary image encoding unit 5 for the case where the scanning method of FIG. 4(c) is applied is described with reference to the block diagram in FIG. 5. In FIG. 5, the image data supplied from the switch unit 4 (FIG. 1) is MTF corrected by an MTF correction unit 30, binarized by comparison with a predetermined threshold in a binarization circuit 31, and stored in a line buffer memory 32.

[0031] In the line buffer memory 32, if the image area separation unit 3 has judged the image data to belong to a character/background area, the data binarized by the binarization circuit 31 is written to the corresponding address; if the data belongs to a picture area, logic "1" is written. This write control is performed by an address controller 33.

[0032] When one block row of data has been written to the line buffer memory 32, the data of the pixel of interest and its reference pixels is read from the memory 32 under the control of the address controller 33 and supplied to an encoding unit 34, where it is encoded. The reference pixels are extracted with a template such as that shown in FIG. 6. A JBIG QM-Coder is used as the encoding unit 34.

[0033] In more detail, the line buffer memory 32 stores one bit per pixel, holds two block rows of data, and is organized to allow pipelined operation. For example, if one block row is 1600 pixels wide and 16 pixels high, two line buffer memories of 1604×16 bits are required. The extra 4 pixels of width provide a 2-pixel margin on each of the left and right sides for computing reference pixels. Call the two line buffer memories Ba and Bb. At the start of processing, all buffer memory values are set to "0". When the first write data arrives, it is written in raster order starting from position (2, 0) of buffer memory Bb. Once all the data has been written to Bb, the next data is written to line buffer memory Ba while the reference pixels and pixel of interest are read from Bb. For the first and second lines, the reference pixels on the lines one and two above are taken from lines 15 and 16 of buffer memory Ba. For example, when (2, 0) of buffer memory Bb is the pixel of interest, the reference pixels are (0, 0) and (1, 0) of Bb and (0, 15), (1, 15), (2, 15), (3, 15), (4, 15), (1, 14), (2, 14), and (3, 14) of Ba.
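FIG. 6 is not reproduced here, but the example positions in the text imply a ten-pixel template: two pixels to the left on the current line, five on the line above, and three on the line two above. The Python sketch below derives the reference positions under that inferred template, wrapping lines above the top of Bb into the bottom lines of Ba. The offset table and the function name are assumptions drawn from the worked example, not from the figure.

```python
LINES = 16  # lines per block row in each line buffer

# (dx, dy) offsets relative to the pixel of interest, inferred from the
# example positions given for pixel (2, 0) of Bb.
OFFSETS = [(-2, 0), (-1, 0),
           (-2, -1), (-1, -1), (0, -1), (1, -1), (2, -1),
           (-1, -2), (0, -2), (1, -2)]


def reference_positions(x, y):
    """Return (buffer, x, y) for each reference pixel of Bb pixel (x, y)."""
    refs = []
    for dx, dy in OFFSETS:
        ry = y + dy
        if ry >= 0:
            refs.append(("Bb", x + dx, ry))
        else:
            # Lines above the top of Bb come from the previous block row in Ba.
            refs.append(("Ba", x + dx, ry + LINES))
    return refs


for ref in reference_positions(2, 0):
    print(ref)
```

For the pixel of interest (2, 0) this yields the same ten positions listed in the paragraph, which is a useful consistency check on the inferred template.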

【００３４】次に、図７を参照してこの発明の他の符号
化方式について説明する。この方式は絵柄領域の解像度
を低くして符号化する方式である。すなわち、カラー文
書画像をスキャナで読み取る場合、現在の技術では一文
書の全面を同じ解像度でしか読み取ることが出来ないた
め、一般には小さな文字が読み取れるように４００ｄｐ
ｉ　以上の解像度が必要になってくる。しかし、自然画
像にはそれほど高い解像度は必要としないことが多い。ところが、これまでは絵柄領域に対して入力時の解像度
のままで符号化を行っている。これは、自然画像符号化
方式として用いているＡＤＣＴ方式では、８×８画素ま
たは１６×１６画素を最小のブロックサイズとしている
ため、絵柄領域の解像度変換を行おうとすると像域分離
の判定のためにブロックサイズを大きくしなければなら
なくなり、バッファメモリの容量が大きくなってしまう
からである。Next, another encoding method of the present invention will be described with reference to FIG. 7. In this method, the picture area is encoded at a reduced resolution. When a color document image is read with a scanner, current technology can only read the entire document at a single resolution, so a resolution of 400 dpi or more is generally needed for small characters to be legible. Natural images, however, often do not require such a high resolution. Nevertheless, picture areas have so far been encoded at the input resolution. This is because the ADCT method used for natural image encoding has a minimum block size of 8 × 8 or 16 × 16 pixels, so converting the resolution of the picture area would require a larger block size for the image area separation decision, which in turn increases the buffer memory capacity.

【００３５】そこで、この方式では、像域分離の判定ブ
ロックサイズを大きくせずに絵柄領域の解像度を低くし
て符号化するようにしている。図７の例では、絵柄領域
の必要解像度を１００ｄｐｉ　、文字・地肌領域を４０
０ｄｐｉ　として符号化している。従って、絵柄領域は
縦横それぞれ４分の１に縮小されることになる。すなわ
ち、８×８画素を１ブロックとすると、絵柄領域１ブロ
ック当たりで符号化を行う画素数は２×２画素の計４画
素となり、データ量は１６分の１に圧縮される。縮小処
理は単純に平滑化処理で行う。符号化は絵柄領域に対し
てはＤＰＣＭのような可逆かつ画素単位の符号化を用い
る。文字・地肌領域に対しては動的算術符号のような２
値画像符号化を用いる。いずれも周辺画素を参照する方
式であることから１ブロック行毎に行う。Therefore, this method encodes the picture area at a lower resolution without enlarging the decision block used for image area separation. In the example of FIG. 7, the required resolution is 100 dpi for the picture area and 400 dpi for the character/background area. The picture area is therefore reduced to one quarter in both the vertical and horizontal directions. That is, if one block is 8 × 8 pixels, the number of pixels encoded per picture-area block is 2 × 2, or 4 pixels in total, and the amount of data is reduced to 1/16. The reduction is performed simply by smoothing. The picture area is encoded with a reversible, pixel-by-pixel method such as DPCM. The character/background area is encoded with a binary image encoding such as dynamic arithmetic coding. Because both methods refer to neighboring pixels, encoding is performed one block row at a time.
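The 4:1 reduction described above can be sketched in a few lines of Python. This is only an illustration: the text says the reduction is done "by smoothing" without naming a filter, so the box average over each 4 × 4 tile, the function name `reduce_block`, and the sample values are all assumptions.

```python
def reduce_block(block, factor=4):
    """Shrink a square block by averaging each factor x factor tile.

    Assumption: the "smoothing" in the text is modelled as a plain box
    average; the source does not specify the filter.
    """
    size = len(block)
    reduced = []
    for by in range(0, size, factor):
        row = []
        for bx in range(0, size, factor):
            total = sum(block[y][x]
                        for y in range(by, by + factor)
                        for x in range(bx, bx + factor))
            row.append(total // (factor * factor))  # integer mean of the tile
        reduced.append(row)
    return reduced


# Hypothetical 8 x 8 block whose four 4 x 4 quadrants are each constant.
block = [[(x // 4) * 100 + (y // 4) * 10 for x in range(8)] for y in range(8)]
reduced = reduce_block(block)
print(reduced)  # 2 x 2 result: 64 input pixels become 4
```

An 8 × 8 block becomes 2 × 2, which is the 1/16 data reduction stated in the text.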

【００３６】図７において、実線で囲んだ正方形の部分
が１ブロックを表している。絵柄ブロックは平滑化によ
り破線で示すように１ブロック４画素に縮小されている
。絵柄ブロックの符号化は画素ａ１１，ａ１２，ｂ１１
，ｂ１２，…の順に行い、最右端の画素を符号化すると
次のラインの符号化を始め、画素ａ２１，ａ２２，ｂ２
１，ｂ２２，…の順に符号化する。注目画素値の予測に
周辺画素を用いる場合は、周辺画素が文字・地肌領域に
属する場合は文字・地肌領域に含まれる画素値を「０」
とみなして注目画素を予測し符号化する。文字・地肌領
域の符号化は図に矢印で示すように、１画素単位でラス
タースキャンの順に符号化する。符号化方式として動的
算術符号化のように参照画素を用いる場合は、絵柄領域
に含まれる画素値を「０」とみなして注目画素を予測し
符号化する。In FIG. 7, each square enclosed by a solid line represents one block. The picture blocks have been reduced by smoothing to 4 pixels per block, as shown by the broken lines. Picture blocks are encoded in the order of pixels a11, a12, b11, b12, ...; once the rightmost pixel has been encoded, encoding moves to the next line and proceeds in the order a21, a22, b21, b22, .... When neighboring pixels are used to predict the value of the target pixel, any neighboring pixel that belongs to the character/background area is regarded as having the value "0" for prediction and encoding. The character/background area is encoded pixel by pixel in raster-scan order, as shown by the arrows in the figure. When the encoding method uses reference pixels, as dynamic arithmetic coding does, pixels in the picture area are regarded as having the value "0" when predicting and encoding the target pixel.
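The rule that a neighbor outside the current area counts as "0" can be illustrated for the picture area with a minimal DPCM sketch. The previous-pixel predictor, the function name `dpcm_picture_row`, and the sample row are assumptions made for illustration; the text only names DPCM as one possible method and does not fix the predictor.

```python
def dpcm_picture_row(values, is_picture):
    """Previous-pixel DPCM over one row, encoding picture pixels only.

    Assumption: the predictor is the left neighbour. If that neighbour
    belongs to the character/background area (or does not exist), it is
    treated as 0, as the text describes.
    """
    residuals = []
    for i, value in enumerate(values):
        if not is_picture[i]:
            continue  # character/background pixels go to the binary coder
        left_is_picture = i > 0 and is_picture[i - 1]
        prediction = values[i - 1] if left_is_picture else 0
        residuals.append(value - prediction)
    return residuals


# Hypothetical row: the third pixel belongs to a character/background block.
values = [120, 124, 255, 90, 95]
is_picture = [True, True, False, True, True]
print(dpcm_picture_row(values, is_picture))
```

Because the decoder knows each block's area attribute, it can apply the same substitution and reproduce the predictions exactly.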

【００３７】この方式によれば、絵柄領域の画像に対し
て適切な解像度で符号化を行っているので、原画像デー
タのままで符号化するよりも高い圧縮率が可能となる。Because this method encodes the picture area at an appropriate resolution, it achieves a higher compression ratio than encoding the original image data as is.

【００３８】次に、図８に示すブロック図を参照して自
然画像符号化部６の他の実施例について説明する。この
実施例では、入力画像から文字パターンを取り除いた絵
柄領域のうちでも自然画領域と背景領域とではその特性
が大きく異なる点に着目し、自然画領域と背景領域とで
符号化方式を異ならせることでより高い符号化効率を得
るようにしている。すなわち、自然画領域は色の変化や
テクスチャーを含んでいるため、直交変換の結果、ＤＣ
（直流）成分以外の低周波成分が比較的多く現れる。こ
れに対して背景領域では局所的に見てほぼ濃度が一定で
あるため、空間周波数領域では殆どＤＣ成分のみとなっ
てしまう。Next, another embodiment of the natural image encoding section 6 will be described with reference to the block diagram of FIG. 8. This embodiment focuses on the fact that, within the picture area left after removing character patterns from the input image, natural image areas and background areas have markedly different characteristics, and it obtains higher encoding efficiency by using different encoding for each. Because a natural image area contains color changes and texture, an orthogonal transform yields relatively many low-frequency components other than the DC (direct current) component. In a background area, by contrast, the density is nearly constant locally, so in the spatial frequency domain almost only the DC component remains.

【００３９】そこで、この実施例では、入力画像から文
字パターンを取り除いた残りの画像を自然画領域と背景
領域とに分離し、それぞれの領域に応じて適応的に符号
化方式を変化させることにより、高圧縮な符号化を行う
ようにしている。符号化方式を変えるには、符号化方式
そのものを変える方法とパラメータのみを変える方法と
があるが、この実施例では、ハードウェアの共有化を考
慮して直交変換を行った後のハフマン符号のテーブルを
切り換えることによって圧縮率の向上を実現している。Therefore, in this embodiment, the image remaining after the character patterns are removed from the input image is separated into natural image areas and background areas, and the encoding is changed adaptively for each area to achieve highly compressed encoding. The encoding can be changed either by switching the encoding method itself or by changing only its parameters; in this embodiment, with hardware sharing in mind, the compression ratio is improved by switching the Huffman code table applied after the orthogonal transform.

【００４０】図８において、像域分離部４０は原画像デ
ータから文字パターンを除去した後の画像データをＭ×
Ｎ画素のブロック単位で背景領域か自然領域かを判定す
る。この判定には白画素の密度を用いる。すなわち、ブ
ロック内に輝度成分の濃度レベルが閾値Ｔｈ以下の画素
が、例えば９０％以上含まれるならば、そのブロックは
背景ブロックとする。この結果は、そのブロックの属性
情報として領域情報符号化部７（図１）に送られると共
に、ハフマン符号化テーブル４１または４２の一方を選
択するためのスイッチ部４３の制御端子に供給される。ハフマン符号化テーブル４１は自然画用のテーブルであ
り、ハフマン符号化テーブル４２は背景用のテーブルで
ある。In FIG. 8, the image area separation section 40 determines, for each block of M × N pixels of the image data obtained by removing character patterns from the original image data, whether the block is a background area or a natural image area. The density of white pixels is used for this decision. That is, if, for example, 90% or more of the pixels in a block have a luminance-component density level at or below the threshold Th, the block is taken to be a background block. This result is sent to the area information encoding section 7 (FIG. 1) as the block's attribute information, and is also supplied to the control terminal of the switch section 43, which selects either Huffman encoding table 41 or 42. Huffman encoding table 41 is the table for natural images, and Huffman encoding table 42 is the table for backgrounds.
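The block decision described above amounts to a simple density test. The following sketch assumes the block is given as a list of rows of luminance density levels; the function names, the table labels, and the sample threshold are hypothetical, and the 90% ratio is the example figure from the text.

```python
def is_background_block(density_block, th, min_ratio=0.9):
    """True if at least min_ratio of the pixels have density level <= th."""
    pixels = [p for row in density_block for p in row]
    light = sum(1 for p in pixels if p <= th)
    return light >= min_ratio * len(pixels)


def select_table(density_block, th):
    """Pick the Huffman table label for a block (labels are illustrative)."""
    if is_background_block(density_block, th):
        return "table_42_background"
    return "table_41_natural"


def make_block(dark_count, size=8):
    """Hypothetical helper: a block with dark_count dense pixels, rest light."""
    flat = [200] * dark_count + [5] * (size * size - dark_count)
    return [flat[i * size:(i + 1) * size] for i in range(size)]


print(select_table(make_block(6), th=32))   # 58 of 64 pixels are light
print(select_table(make_block(7), th=32))   # 57 of 64 pixels are light
```

With 64 pixels per block, the 90% criterion is met from 58 light pixels upward, so the two sample blocks fall on opposite sides of the boundary.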

【００４１】ＡＤＣＴ符号化部４４では、図９に示すよ
うに、離散コサイン変換部５０で入力画像に対しブロッ
ク単位（８×８画素）で２次元離散コサイン変換（ＤＣ
Ｔ）を施し、空間周波数領域に変換する。ＤＣＴ演算を
行った結果は、ブロックの左上がＤＣ（直流）成分とな
り、右下に行くほど高周波成分となる。自然画像の場合
には画素間の相関が高いため左上の方の低周波成分に大
きなＤＣＴ係数が現れることが多い。そこで、量子化部
５１でＤＣＴ係数の低周波成分を小さな値で割り、高周
波成分を大きな値で割ることによって圧縮を行っている
。このときの各周波数成分に対する除数は、量子化テー
ブル５２に予め与えられている。量子化後の高周波成分
域の量子化係数値は「０」が多く続くため、図１０に示
すようにジグザグスキャンの順に走査して「０」のラン
レグスと「０」以外の成分値に変換し、これに対してハ
フマン符号化部５３でハフマン符号化テーブル５４を参
照しながらハフマン符号化を行う。このハフマン符号化
テーブル５４は、図８に示すハフマン符号化テーブル４
１および４２に対応する。In the ADCT encoding section 44, as shown in FIG. 9, the discrete cosine transform section 50 applies a two-dimensional discrete cosine transform (DCT) to the input image block by block (8 × 8 pixels), converting it to the spatial frequency domain. In the DCT result, the upper left of the block is the DC (direct current) component, and the frequency rises toward the lower right. Because natural images have high correlation between pixels, large DCT coefficients tend to appear in the low-frequency components at the upper left. The quantization section 51 therefore compresses the data by dividing the low-frequency DCT coefficients by small values and the high-frequency coefficients by large values. The divisor for each frequency component is given in advance in the quantization table 52. Since the quantized coefficients in the high-frequency region contain long runs of "0", they are scanned in zigzag order as shown in FIG. 10 and converted into run lengths of "0" and non-"0" component values, which the Huffman encoding section 53 then Huffman-encodes with reference to the Huffman encoding table 54. This Huffman encoding table 54 corresponds to the Huffman encoding tables 41 and 42 shown in FIG. 8.
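The quantization and zigzag run-length steps can be sketched as follows. This is an illustrative sketch, not the circuit of FIG. 9: the function names and the sample block are assumptions, rounding to the nearest integer is assumed for the division, and runs longer than 15 zeros are not given any special treatment here.

```python
def quantize(coeffs, table):
    """Divide each DCT coefficient by its table entry (rounding assumed)."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]


def zigzag_order(n=8):
    """(row, col) positions of an n x n block in zigzag scan order."""
    order = []
    for s in range(2 * n - 1):
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run from bottom-left to top-right, odd ones reversed.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def ac_events(quantized):
    """Turn the AC coefficients into (zero run, value) events plus EOB."""
    scan = [quantized[r][c] for r, c in zigzag_order(len(quantized))]
    events, run = [], 0
    for value in scan[1:]:  # scan[0] is the DC component, coded separately
        if value == 0:
            run += 1
        else:
            events.append((run, value))
            run = 0
    if run > 0:
        events.append("EOB")  # everything after the last non-zero is zero
    return events


# Hypothetical quantized block with non-zero values at zigzag positions
# 0 (DC), 1, 6 and 17.
order = zigzag_order()
block = [[0] * 8 for _ in range(8)]
for index, value in [(0, 50), (1, 3), (6, 3), (17, 1)]:
    r, c = order[index]
    block[r][c] = value
print(ac_events(block))
```

The sample block is constructed so that its events are (0,3), (4,3), (10,1), EOB, the same sequence the description uses when explaining the AC code words.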

【００４２】ハフマン符号化は出現確率の高い事象に対
し短い符号を割り当てるエントロピー符号化の一種であ
る。ハフマン符号化はＤＣ成分とＡＣ成分とに分けて行
う。ＤＣ成分の符号化は、基本的にＤＣ成分値そのもの
ではなく前のブロックのＤＣ成分値との差分を符号化す
る。全てのデータに対しハフマン符号を割り当てると、
そのテーブルを格納するために膨大な記憶領域を必要と
するため、ＤＣ成分値をカテゴリーと付加ビットとに分
解する。カテゴリーは成分値の集合体で、成分値を表現
するのに必要な最小ビット数Ｃに等しい。付加ビットは
成分値が正の場合はその値の下位Ｃビットで、成分値が
負の場合はその成分値から「１」を引いた値の下位Ｃビ
ットになる。例えば、ＤＣ成分の差分値が「１０」（１
０進数）の場合は、カテゴリーが「４」で、付加ビット
は“００１０”となる。このカテゴリーに対しハフマン
符号が割り当てられることになる。図１１に、輝度信号
の場合のハフマン符号テーブルの例を示す。Huffman encoding is a form of entropy encoding that assigns short codes to events with a high probability of occurrence. Huffman encoding is performed separately for the DC component and the AC components. For the DC component, what is encoded is basically not the DC component value itself but its difference from the DC component value of the preceding block. Assigning a Huffman code to every possible value would require an enormous storage area for the table, so the DC component value is decomposed into a category and additional bits. A category is a set of component values, and its number equals the minimum number of bits C needed to represent those values. When the component value is positive, the additional bits are the lower C bits of the value; when it is negative, they are the lower C bits of the value obtained by subtracting "1" from it. For example, when the DC difference value is "10" (decimal), the category is "4" and the additional bits are "0010". The Huffman code is assigned to this category. FIG. 11 shows an example of a Huffman code table for the luminance signal.
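The category and additional-bit rule stated above can be written directly in Python. The function names and the sample values are illustrative assumptions; the rule itself (minimum bit count C, lower C bits of the value, or of the value minus 1 when negative) is taken from the text.

```python
def category(value):
    """Minimum number of bits C needed to represent the magnitude of value."""
    return abs(value).bit_length()


def additional_bits(value):
    """Lower C bits of value (positive) or of value - 1 (negative)."""
    c = category(value)
    if c == 0:
        return ""  # a zero difference carries no additional bits
    if value < 0:
        value -= 1
    # Masking a negative Python int behaves like two's complement here.
    return format(value & ((1 << c) - 1), "0{}b".format(c))


for diff in (5, -5, 1, -1, 0):
    print(diff, category(diff), additional_bits(diff))
```

Only the category is Huffman-coded, so the table needs one entry per category rather than one per possible difference value.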

【００４３】ＡＣ成分の場合の符号語は、「０」のラン
レングスと非「０」のＡＣ成分のカテゴリー値で構成さ
れる。例えば、図１０に示すような量子化係数が得られ
た場合には、ジグザグスキャンの結果、（０，３）、（
４，３）、（１０，１）、ＥＯＢ（Ｅｎｄ　Ｏｆ　Ｂｌ
ｏｃｋ）という事象が送られる。カッコ内の最初の数値
は「０」のランレングスを示し、後の数値は非「０」の
ＡＣ成分の値である。この事象から「ＮＮＮＮＳＳＳＳ
＋付加ビット」を構成する。「ＮＮＮＮ」はランレング
スを示す。ランレングスは最大１５までとなっているので４ビット
で表される。次の「ＳＳＳＳ」は非「０」の値のカテゴ
リーである。これも４ビットで表すため合計８ビットで
ハフマン符号化の符号語が構成される。付加ビットの作
り方はＤＣ成分と同様である。“ＥＯＢ”はブロックの
以下の値が全て零であることを示し、“０ｘ００”で表
される。ＡＣ成分のハフマン符号化テーブルは、この「
ＮＮＮＮＳＳＳＳ」に対し出現頻度の高い符号語に短い
符号語を割り当てて圧縮を行う。The code word for an AC component is built from the run length of "0"s and the category of the non-"0" AC component. For example, when the quantized coefficients shown in FIG. 10 are obtained, the zigzag scan yields the events (0,3), (4,3), (10,1), and EOB (End Of Block). The first number in each pair of parentheses is the run length of "0"s, and the second is the value of the non-"0" AC component. From each event, "NNNNSSSS + additional bits" is formed. "NNNN" is the run length; since the run length is at most 15, it is represented in 4 bits. "SSSS" is the category of the non-"0" value. It is also represented in 4 bits, so the symbol to be Huffman-encoded is 8 bits in total. The additional bits are formed in the same way as for the DC component. "EOB" indicates that all remaining values in the block are zero, and is represented by "0x00". The Huffman encoding table for the AC components compresses the data by assigning shorter code words to the "NNNNSSSS" symbols that occur more frequently.
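As a small illustration of the "NNNNSSSS" symbol, the sketch below packs each event into one byte, with the run length in the upper 4 bits and the category in the lower 4 bits. The function name and the string marker "EOB" are assumptions made for this example; the event list is the one quoted in the text.

```python
def run_size_symbol(event):
    """Pack a (zero run, value) event into the 8-bit NNNNSSSS symbol."""
    if event == "EOB":
        return 0x00
    run, value = event
    if not 0 <= run <= 15:
        raise ValueError("run length must fit in 4 bits")
    size = abs(value).bit_length()  # category of the non-zero value
    return (run << 4) | size


events = [(0, 3), (4, 3), (10, 1), "EOB"]
symbols = ["0x{:02X}".format(run_size_symbol(e)) for e in events]
print(symbols)
```

The value 3 falls in category 2 and the value 1 in category 1, so the four events map to the symbols 0x02, 0x42, 0xA1, and 0x00.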

【００４４】自然画像では輝度成分の量子化係数の非「
０」の値がＡＣ成分の第１成分や第２成分にも比較的高
い確率で現れるため、“０ｘ０１”、“０ｘ０２”、“
０ｘ０３”、“０ｘ１１”の符号語に対しても２ビット
から４ビットの短い符号語が割り当てられている。しか
し、背景領域では第１成分や第２成分にさえも非「０」
の値は殆ど現れない。すなわち、“ＥＯＢ”の出現確率
が極端に高くなる。したがって、背景領域では“ＥＯＢ
”に最も近い符号を割り当てることによって大幅に圧縮
率を向上することが出来る。この理由から“ＥＯＢ”に
割り当てる符号は１ビットないし２ビットが適当であろ
う。In natural images, non-"0" quantized coefficients of the luminance component appear with relatively high probability even in the first and second AC components, so the symbols "0x01", "0x02", "0x03", and "0x11" are also assigned short code words of 2 to 4 bits. In background areas, however, non-"0" values hardly ever appear, even in the first and second components. That is, the probability of "EOB" becomes extremely high. Therefore, in background areas the compression ratio can be improved greatly by assigning the shortest code to "EOB". For this reason, a code of 1 or 2 bits would be appropriate for "EOB".

【００４５】図８に戻り、ＡＤＣＴ符号化部４４では、
先に述べたようにブロックの属性に応じて適応的にハフ
マン符号化テーブル４１および４２を切り換える。した
がって、テーブル以外のＤＣＴ変換部５０、量子化部５
１、ハフマン符号化部５３等は全てのブロックデータに
対して共通である。符号化の結果は他の符号化部の出力
と合成されて外部記憶装置９（図１）に記録される。Returning to FIG. 8, the ADCT encoding section 44 switches adaptively between Huffman encoding tables 41 and 42 according to the block's attribute, as described above. Consequently, everything other than the tables, including the DCT section 50, the quantization section 51, and the Huffman encoding section 53, is shared by all block data. The encoding result is combined with the outputs of the other encoding sections and recorded in the external storage device 9 (FIG. 1).

【００４６】この実施例によれば、原画像から文字を除
去した画像から自然画領域と背景領域とを分離し、符号
化パラメータ（符号化テーブル）を変えることによって
それぞれに適した符号化を行っているので、簡易な構成
で符号化効率を向上することが出来る。According to this embodiment, a natural image area and a background area are separated from an image obtained by removing characters from the original image, and encoding is performed appropriately for each by changing the encoding parameters (encoding table). Therefore, encoding efficiency can be improved with a simple configuration.

【００４７】こうして外部記憶装置９に記憶された画像
データの復号化処理は、図１２に示す復号化装置によっ
て行われる。復号化は符号化の逆の処理を行う。ただし
、像域分離は行う必要がない。まず、外部記憶装置９か
らデフォーマット部６０にデータを読み出し、デフォー
マットして２値画像符号は２値画像復号化部６１に、自
然画像符号は自然画像復号化部６２に、領域情報は領域
情報復号化部６３にそれぞれ出力する。２値画像復号化
部６１および自然画像復号化部６２における２値画像符
号および自然画像符号の復号は、領域情報復号化部６３
で復元された領域情報に従って制御される。そして、最
終的な復号はアプリケーションの要求に合わせて総合復
号部６４でそれに適した形で行う。例えば、プリンタ６
５に出力する場合は画像をラスター順に復元する。また
、ディスプレイ６６に全体を表示する場合は間引きなど
を行った縮小画像を再生する。The image data thus stored in the external storage device 9 is decoded by the decoding device shown in FIG. 12. Decoding reverses the encoding process, except that image area separation is not needed. First, data is read from the external storage device 9 into the deformatting section 60 and deformatted; the binary image code is output to the binary image decoding section 61, the natural image code to the natural image decoding section 62, and the area information to the area information decoding section 63. Decoding of the binary image code and the natural image code in sections 61 and 62 is controlled according to the area information restored by the area information decoding section 63. The final decoding is then performed by the overall decoding section 64 in a form suited to the application's requirements. For example, for output to the printer 65, the image is restored in raster order. To display the whole image on the display 66, a reduced image produced by thinning or similar processing is reproduced.

【００４８】[0048]

【発明の効果】この発明によれば、文字以外に地肌領域
も抽出するので、文字および白地領域は２値画像として
符号化でき、残りの絵柄領域だけを自然画像符号化を行
えばよいため、絵柄領域の少ない文書では大幅に圧縮率
を向上できる。また、文字・地肌領域は２値画像に変換
されるため文字が鮮明に再生され、また地肌も濃度ムラ
がなくなり高画質な画像を得ることができる。According to the present invention, the background area is extracted in addition to the characters, so the characters and white background can be encoded as a binary image and only the remaining picture area needs natural image encoding; the compression ratio is therefore greatly improved for documents with few picture areas. Furthermore, because the character/background area is converted into a binary image, characters are reproduced sharply and the background is free of density unevenness, yielding a high-quality image.

【００４９】また、ブロック単位の白地検出と網点検出
の結果を組み合わせて地肌領域を抽出しているので、絵
柄が網点で表現されている印刷原稿などにおいて精度良
く地肌領域を求めることができる。さらに、最終的に絵
柄領域を膨張させているので、文字・地肌ブロックの周
囲が絵柄領域で囲まれているような場合に、その文字・
地肌ブロックを絵柄ブロックに変更することにより絵柄
領域の誤判定を減らし、より高画質な復元画像を得るこ
とができる。Furthermore, because the background area is extracted by combining the results of block-wise white background detection and halftone dot detection, the background area can be determined accurately in printed originals and the like in which pictures are rendered with halftone dots. In addition, because the picture area is finally dilated, a character/background block that is surrounded by picture area is changed to a picture block, which reduces misclassification of picture areas and yields a higher-quality restored image.

【００５０】また、黒文字と白画素を画素単位で個別に
求めた後、ブロック内に含まれる黒文字ないし白画素の
密度によってブロック単位の像域分離を行うため、簡単
な処理で高精細度に文字・地肌領域を抽出することがで
きる。また、この発明によれば、文字・地肌ブロックを
１ブロック単位で符号化するため記憶容量の少ないバッ
ファメモリで符号化できる。さらに、連続する複数ブロ
ック単位で、または、１ブロック行単位で符号化するた
め、効率のよい符号化を行うことが出来る。Furthermore, because black characters and white pixels are first found individually pixel by pixel, and block-wise image area separation is then performed according to the density of black-character or white pixels in each block, the character/background area can be extracted with high definition by simple processing. Also, according to the present invention, character/background blocks are encoded one block at a time, which allows encoding with a buffer memory of small capacity. Furthermore, encoding in units of several consecutive blocks, or of one block row, allows efficient encoding.

【００５１】また、この発明によれば、自然画像に対し
て適切な解像度で符号化を行っているため、原画像のま
まで符号化する場合に比べ高圧縮が可能となる。また、
この発明によれば、原画像から文字を除去した画像に対
し背景領域と自然画領域とを分離し、符号化パラメータ
を変えることによってそれぞれに適した符号化を行って
いるので、高圧縮な符号化を行うことが出来る。Furthermore, according to the present invention, natural images are encoded at an appropriate resolution, so higher compression is possible than when the original image is encoded as is. Also, according to the present invention, the background area and the natural image area are separated in the image obtained by removing characters from the original image, and each is encoded in a suitable way by changing the encoding parameters, so highly compressed encoding can be performed.

[Brief explanation of drawings]

【図１】この発明による適応符号化方式の一実施例を示
すブロック図である。FIG. 1 is a block diagram showing an embodiment of an adaptive encoding method according to the present invention.

【図２】図１における像域分離部の一実施例を示すブロ
ック図である。FIG. 2 is a block diagram showing one embodiment of the image area separation section in FIG. 1.

【図３】像域分離部の他の実施例を示すブロック図であ
る。FIG. 3 is a block diagram showing another embodiment of the image area separation unit.

【図４】ブロック単位で画像データを符号化する際のス
キャン方式を説明するための図である。FIG. 4 is a diagram for explaining a scanning method when encoding image data in units of blocks.

【図５】２値画像符号化部のブロック図である。FIG. 5 is a block diagram of a binary image encoding section.

【図６】参照画素を抽出するためのテンプレートを示す
図である。FIG. 6 is a diagram showing a template for extracting reference pixels.

【図７】絵柄領域の解像度を低くして符号化する方式を
示す図である。FIG. 7 is a diagram showing a method of encoding a picture area by lowering its resolution.

【図８】自然画像符号化部の他の実施例を示す図である
。FIG. 8 is a diagram showing another embodiment of the natural image encoding section.

【図９】図８におけるＡＤＣＴ符号化部のブロック図で
ある。FIG. 9 is a block diagram of the ADCT encoding section in FIG. 8.

【図１０】量子化係数値のジグザグスキャンを示す図で
ある。FIG. 10 is a diagram showing a zigzag scan of quantization coefficient values.

【図１１】輝度信号のハフマン符号テーブルの例を示す
図である。FIG. 11 is a diagram showing an example of a Huffman code table for a luminance signal.

【図１２】符号化した画像データを復号する復号化装置
の一例を示すブロック図である。FIG. 12 is a block diagram showing an example of a decoding device that decodes encoded image data.

[Explanation of symbols]

１　スキャナ
２　バッファ
３　像域分離部
５　２値画像符号化部
６　自然画像符号化部
７　領域情報符号化部
８　フォーマット部
９　外部記憶装置
１０　ＭＴＦ補正部
１１　白地抽出部
１２　エッジ抽出部
１３，２６　総合判定部
２０　黒文字抽出部
２１　白画素抽出部
1 Scanner
2 Buffer
3 Image area separation section
5 Binary image encoding section
6 Natural image encoding section
7 Area information encoding section
8 Formatting section
9 External storage device
10 MTF correction section
11 White background extraction section
12 Edge extraction section
13, 26 Overall determination section
20 Black character extraction section
21 White pixel extraction section

Claims

[Claims]

Claim 1: An adaptive encoding method for color document images, characterized in that a color document image in which natural images with gradation and characters are mixed is judged, in units of blocks of M × N pixels, to be either a character/background area or a picture area, and, based on the judgment result, the image data of blocks judged to be character/background areas is encoded with encoding suited to binary images and the image data of blocks judged to be picture areas is encoded with encoding suited to picture areas.

2. The adaptive encoding method for color document images according to claim 1, characterized in that the background area is determined by judging for each pixel whether it is a white pixel and counting the white pixels in each block, treating a block as a white background block if its number of white pixels is at or above a predetermined threshold and it contains no color pixels, and treating that white background block as a background area if it contains no halftone dot area.

3. The adaptive encoding method for color document images according to claim 1, characterized in that the image area separation between the character/background area and the picture area is performed by extracting character areas and background areas, then judging a block to be a character/background area if an edge area or a background area exists in the block and the block contains no color pixels, and thereafter dilating the picture area.

4. The adaptive encoding method for color document images according to claim 1, characterized in that the image area separation between the character/background area and the picture area is performed by extracting pixels around characters and extracting white pixels, obtaining character or background pixels by taking the logical OR of these extraction results, and, when judging character/background areas in units of M × N pixel blocks, judging a block to be a character/background block if the density of character or white pixels in the block is high.

5. The adaptive encoding method for color document images according to claim 1, wherein the encoding of the character/background area is performed in units of the blocks.

6. The adaptive encoding method for color document images according to claim 1, wherein the character/background area is encoded in units of a plurality of character/background blocks that are continuous in the row direction.

7. The adaptive encoding method for color document images according to claim 1, wherein the character/background area is encoded in units of character/background blocks of one block row.

8. The adaptive encoding method for color document images according to claim 6 or 7, characterized in that, when the character/background blocks are encoded by a method that refers to surrounding pixels, pixels belonging to picture blocks are regarded as having a predetermined constant value.

9. The adaptive encoding method for color document images according to claim 1, wherein the picture area is encoded after being converted to a resolution lower than that of the character/background area.

10. An adaptive encoding method for color document images, characterized in that a character/background area and a picture area are separated from a color document image in which natural images with gradation and characters are mixed, a background area is further separated from the picture area, the character/background area is encoded with encoding suited to binary images, and the background area and the picture area from which the background area has been removed are encoded with encoding suited to picture areas using different parameters for each.

11. The adaptive encoding method for color document images according to claim 10, characterized in that the encoding suited to picture areas is ADCT encoding, and the different parameters are a Huffman encoding table suited to background areas and a Huffman encoding table suited to natural images.