JP2006093880A

JP2006093880A - Image processing apparatus and control method thereof, computer program, and computer-readable storage medium

Info

Publication number: JP2006093880A
Application number: JP2004273971A
Authority: JP
Inventors: Shigeo Fukuoka; 茂雄福岡
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2004-09-21
Filing date: 2004-09-21
Publication date: 2006-04-06

Abstract

PROBLEM TO BE SOLVED: To generate a data file with a high compression ratio that keeps edges clear for both printed characters and handwritten images in a document, while also preserving the gradation of a natural image when one is present.

SOLUTION: A region identifying section 104 distinguishes character regions made up of printed type from non-character regions in the input image data and generates coordinate data for each. A handwritten region determining section determines handwritten regions based on character/line-drawing information output by a pixel determining section 109 and on the non-character region coordinate data. A filling section 106 fills the character portions within the regions indicated by the character region coordinate data and the handwritten region coordinate data with the background color, producing image data for the whole page, natural image included, that is free of high-frequency components. The result is encoded by a JPEG encoder 108. The character regions and handwritten regions are encoded by MMR encoders 112 and 115, which preserve character edges.

COPYRIGHT: (C)2006,JPO&NCIPI

Description

The present invention relates to a technique for inputting image data, encoding it, and outputting the result.

In recent years, the spread of scanners has accelerated the digitization of documents. Scanning a single A4 page at 300 dpi in full color produces as much as 24 Mbytes of image data. Data of that size strains memory when it is stored and managed, and it becomes a network traffic problem when it is sent to other devices over a network, for example by e-mail. Compression techniques for image data are therefore important.
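The 24 Mbyte figure above can be checked with a short calculation. The sketch below assumes an A4 page of 210 x 297 mm and 3 bytes per pixel (8 bits each for R, G, and B); the function name is illustrative and not taken from the patent.

```python
MM_PER_INCH = 25.4

def raw_scan_bytes(width_mm, height_mm, dpi, bytes_per_pixel=3):
    """Return the uncompressed size in bytes of a scan at the given resolution."""
    width_px = round(width_mm / MM_PER_INCH * dpi)
    height_px = round(height_mm / MM_PER_INCH * dpi)
    return width_px * height_px * bytes_per_pixel

# A4 at 300 dpi, 24-bit color: roughly 2480 x 3508 pixels.
size = raw_scan_bytes(210, 297, 300)
print(f"{size / 2**20:.1f} MiB uncompressed")
```

The result lands just under 25 MiB, which is consistent with the order of magnitude quoted in the text.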

JPEG, for example, is a well-known full-color image compression method. It is known to be highly effective for natural images such as photographs and to give good image quality. Using such a compression technique eases the data volume problem described above.

On the other hand, for image content with high-frequency components, such as the text portions of a document, JPEG produces a form of image degradation known as mosquito noise, and its compression ratio is also poor.

A known technique therefore performs region segmentation: from an image in which photographic regions and character regions are mixed, the character components are removed and the remaining background is JPEG-compressed, while the removed character components are MMR-compressed as components carrying color information and character positions (Patent Documents 1 and 2).
Patent Document 1: JP 2002-77633 A
Patent Document 2: JP 2003-309727 A

The compression methods above are intended for printed matter that was created on a computer and output by a printer. Consequently, when handwritten corrections or the like are added to text in a document, the handwritten portions are treated as non-text and are processed as JPEG compression targets. As explained above, JPEG is effective for natural images but tends to degrade characters and line drawings. As a result, characters on which a handwritten image has been superimposed can look visibly worse than the surrounding characters (that is, characters with no handwriting superimposed).

The present invention was made in view of this problem. Its object is to provide a technique that, when encoding an image in which printed characters, handwritten images, and natural images are mixed, keeps the edges of the characters and handwritten images clear, preserves the gradation of a natural image when one is present, and generates a data file with a high compression ratio.

To solve this problem, an image processing apparatus of the present invention has, for example, the following configuration. That is, it comprises:
character region determination means for determining character regions in an input image and generating character region coordinate information and non-character region coordinate information;
handwritten region determination means for receiving attribute information indicating whether or not a pixel of interest belongs to a character or line drawing, determining handwritten regions based on that attribute information and the non-character region coordinate information, and generating handwritten region coordinate information;
filling means for filling the characters and handwritten portions in the input image with the surrounding color, based on the character region coordinate information and the handwritten region coordinate information;
first encoding means for encoding the entire image obtained by the filling means with encoding means for gradation images;
second encoding means for encoding the characters and handwritten portions with encoding means for characters and line drawings; and
output means for outputting the encoded data obtained by the first and second encoding means together in a single data file.

According to the present invention, it is possible to generate a data file with a high compression ratio in which edges are kept clear for both printed characters and handwritten images in a document and, when a natural image is present, its gradation is also preserved.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings.

FIG. 1 is a system configuration diagram of the embodiment. In the figure, reference numeral 100 denotes a multifunction peripheral (MFP) that functions as a network scanner, a network printer, and a copier. Reference numeral 200 denotes a file server that stores files such as document image data read by the multifunction peripheral 100, 301 and 302 denote client PCs such as personal computers, and 400 denotes a network connecting these devices.

In this configuration, the client PCs use the multifunction peripheral 100 as a network printer and store files shared among the clients on the network in the file server 200.

As noted above, the multifunction peripheral 100 of this embodiment also functions as a network printer and a copier, but those functions are not directly related to the present invention. The following description therefore covers only the scanner function of the multifunction peripheral 100.

FIG. 2 shows an example of a document to be read. As illustrated, the document contains a natural image and text, and part of the text has been corrected by hand. The text need not be black only; colored characters that a printer can ordinarily produce, such as red or blue, may be mixed in. In the figure, the printed phrase 「開発の新しい技術」 ("new technology of development") has been corrected by hand with a red pen to read 「新規開発技術」 ("newly developed technology"). The word "correction" is used here on the assumption that the user intends to amend the original text, but the handwriting could equally be a signature or a marking. Any situation in which a handwritten image is superimposed on (written onto) the document falls within the intent of this embodiment.

This document is read by the multifunction peripheral 100 (with 256 gradations, 8 bits each for R, G, and B) and compression-encoded. Whether a portion is text is determined with a method used in known character recognition processing. Specifically, histograms of the character image dots are created in the horizontal and vertical directions, the circumscribed rectangle of each character is detected, rectangles that line up along the direction in which the detected rectangles are arranged are connected to one another, and lines whose spacing is within a predetermined distance are also joined as text lines. The result is judged to be a rectangular character region.
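The projection-histogram approach described above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: the page is a small list-of-lists binary image (1 = ink), characters are merged along horizontal lines only, and the `max_gap` parameter is a hypothetical stand-in for the "predetermined distance". Joining adjacent text lines, which the text also mentions, is omitted.

```python
def runs(profile):
    """Return (start, end) index pairs (end exclusive) where the profile is non-zero."""
    out, start = [], None
    for i, v in enumerate(profile):
        if v and start is None:
            start = i
        elif not v and start is not None:
            out.append((start, i))
            start = None
    if start is not None:
        out.append((start, len(profile)))
    return out

def text_rects(binary, max_gap=2):
    """Return rectangles (left, top, right, bottom) covering runs of nearby characters."""
    rects = []
    row_profile = [sum(row) for row in binary]           # horizontal histogram
    for top, bottom in runs(row_profile):                # one band per text line
        band = binary[top:bottom]
        col_profile = [sum(col) for col in zip(*band)]   # vertical histogram of the band
        chars = runs(col_profile)                        # one interval per character
        left, right = chars[0]
        for l, r in chars[1:]:
            if l - right <= max_gap:                     # close enough: same text run
                right = r
            else:
                rects.append((left, top, right, bottom))
                left, right = l, r
        rects.append((left, top, right, bottom))
    return rects

page = [
    [1, 1, 0, 1, 0, 0, 0, 0, 1],
    [1, 0, 0, 1, 0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0, 0, 0, 0, 0],
]
print(text_rects(page))
```

Each rectangle is stored as two corner coordinates, which matches the upper-left/lower-right representation used for character regions in this embodiment.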

In this case, if the handwritten phrase 「新規開発技術」 were written as neatly as printed type, it would be recognized as text; here, however, it is assumed that it was not recognized as text (although it does not matter if some of its characters are). It is also assumed that the printed characters 「開発の新しい技術」 in the body text were not judged to be text because handwritten strokes were drawn over them, and that the characters crossed by the leader line were likewise not judged to be text.

Under these conditions, the regions judged to be character regions are as shown in FIG. 3. Because character regions are handled as rectangles, six character regions are identified, as illustrated. Character regions are treated as rectangles because a rectangle can conveniently be managed with just two coordinates, the upper-left and lower-right corners on its diagonal. The regions are not limited to rectangles (quadrilaterals), however, and polygons with more vertices may also be allowed.

When encoding the character regions shown in FIG. 3, it is generally desirable for character images to keep sharp edges. In this embodiment, character regions are therefore encoded with a lossless method such as MMR (another technique such as JBIG may be used instead). Because MMR encodes binary images, each character region shown in FIG. 3 is simply binarized using the midpoint of the density (or luminance) range (128 for 8-bit data) and then encoded. The color of each individual character in the region must also be taken into account, so when MMR encoding is performed, color information indicating what color the pixels that became "1" after binarization had in the original image data is held separately. The color information could be stored as R, G, B values, but since text never has as many colors as a natural image, it is instead assigned to a palette. That is, the color distribution of the original (multi-valued color) image at the positions binarized to "1" is examined to determine how many colors are present, and a palette number is assigned to each. For example, black text (R≈G≈B≈0) is assigned palette number "0", and red text (for example R≥200, G≈B≈0) is assigned palette number "1". Adjacent characters are likely to share the same palette entry, so run-length encoding the palette numbers as well preserves the color information at a high compression ratio. For decoding, a table relating palette numbers to actual RGB values is prepared separately so that the text colors can be reproduced: the significant "1" pixels obtained by decoding the binary image are restored to the multi-valued pixel data indicated by the palette number. Text may also have its own background color; this is discussed later.
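The idea of run-length encoding the palette numbers, and of restoring colors through a palette table on decoding, can be illustrated with a minimal sketch. The palette values and the sequence of palette numbers below are made-up examples, and the (number, run length) pair format is an assumption; the patent does not specify the run-length format.

```python
from itertools import groupby

def rle_encode(palette_numbers):
    """Collapse consecutive identical palette numbers into (number, run_length) pairs."""
    return [(number, len(list(group))) for number, group in groupby(palette_numbers)]

def rle_decode(pairs):
    """Expand (number, run_length) pairs back into the original sequence."""
    return [number for number, count in pairs for _ in range(count)]

# Hypothetical palette table: 0 = black text, 1 = red text.
palette = {0: (0, 0, 0), 1: (255, 0, 0)}

# One palette number per character, in reading order (example data).
numbers = [0, 0, 0, 0, 1, 1, 0, 0, 0]
encoded = rle_encode(numbers)

# Decoding restores the palette numbers, which the table maps back to RGB values.
colors = [palette[n] for n in rle_decode(encoded)]
print(encoded)
```

Because neighbouring characters usually share a color, long runs are common and the encoded list stays short.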

Next, the encoding of everything other than the character regions is described. If corrections (superimposed marks or annotations) made with a colored pen are left out of consideration, the regions other than the character regions are natural image regions. JPEG is known as an encoding technique well suited to natural images.

If everything outside the character regions were simply JPEG-encoded, the image of FIG. 4, which includes the hand-corrected portions of FIG. 2, would become the JPEG encoding target. JPEG is a lossy encoding that works well for gradation images, but when character images are JPEG-encoded their edges blur, so legibility would suffer even for handwriting. Moreover, printed characters altered by handwriting, handwritten strokes, and handwritten characters contain many high-frequency components, so a high compression ratio cannot be obtained.

In this embodiment, therefore, the areas that were not judged to be text because of handwriting (hereinafter, handwritten regions) are extracted from outside the character regions, as shown in FIG. 5, and the handwritten regions are encoded in almost the same way as the character regions. The handwritten regions are encoded as follows.

First, the number of colors present in an area judged to be a non-character region tends to be the same as, or smaller than, the number of colors in the character regions. This is because handwritten corrections are likely to be made in a few places at most, and in practice only one or two pen ink colors are used. The image within the handwritten region is therefore first reduced in color. In this embodiment R, G, and B are each represented by 8 bits (256 gradations), so each is reduced to 2 bits. Put simply, only the two most significant bits of each component are used.

The color distribution of the color-reduced image is then examined and the number of colors is calculated. Since R, G, and B are each 2 bits, a color is expressed in 2 + 2 + 2 = 6 bits, giving at most 2^6 = 64 colors. This is enough to represent hand-corrected characters and the ink colors of the pens used.
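The color reduction described above amounts to keeping the top two bits of each 8-bit component. A minimal sketch follows; the packing of the three 2-bit values into a single 6-bit index is an illustrative choice, not something the patent specifies.

```python
def reduce_color(r, g, b):
    """Keep only the two most significant bits of each 8-bit component."""
    return (r >> 6, g >> 6, b >> 6)

def pack6(rr, gr, br):
    """Pack three 2-bit components into one 6-bit index in the range 0..63."""
    return (rr << 4) | (gr << 2) | br

# A reddish pen stroke and a white pixel, as 8-bit RGB values (example data).
print(reduce_color(200, 30, 10), pack6(*reduce_color(200, 30, 10)))
print(reduce_color(255, 255, 255), pack6(*reduce_color(255, 255, 255)))
```

With 2 bits per component the index never exceeds 63, matching the maximum of 64 colors stated in the text.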

If the color distribution shows that N colors are present, a binary plane is generated for each of the colors C_0, C_1, ..., C_{N-1}, and each plane is MMR-encoded. A palette is assigned so that each plane can be associated with its color.

Next, the JPEG encoding is described.

In this embodiment, the JPEG encoding target is not just the natural image region shown in FIG. 2 but the whole of one page (one image), that is, an image the same size as the original input image. This is done in order to take into account the background color of the printed characters, handwritten characters, and handwritten strokes that were present in the character regions and handwritten regions. When the document of FIG. 2 (excluding the handwritten parts) is created and printed by an application running on a computer, setting a background color for text is common practice, so including that background color keeps the result faithful to the original image. Put another way, the encoding of the character regions and non-character regions described so far does not cover such background colors.

The issue here is that JPEG's compression ratio drops when the image contains high-frequency components such as text. Therefore, as shown in FIG. 7, for each character in the character regions and handwritten regions, the significant pixels 500 that make up the character (whatever their color) are removed from the original image and the resulting hole is filled with the background color (501), leaving only the ground color or background color. This is done for the significant pixels in all character regions and handwritten regions. FIG. 6 shows the result. As illustrated, the character regions and handwritten regions outside the natural image region are given a nearly uniform luminance distribution, and the entire page is then JPEG-encoded. Because the page is encoded with its high-frequency components removed, the amount of encoded data for the page does not become drastically larger than when only the natural image region is JPEG-encoded.
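The filling step can be sketched as below. The patent only says that the removed pixels are filled with the nearby background color; averaging the non-significant neighbours within a growing square window is an assumption made for this illustration, as are the function and parameter names.

```python
def fill_with_background(image, mask, radius=1):
    """Replace masked (significant) pixels with the mean color of nearby unmasked pixels.

    image: 2-D list of (R, G, B) tuples; mask: 2-D list of 0/1 with 1 = significant pixel.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]          # leave the input untouched
    for y in range(h):
        for x in range(w):
            if not mask[y][x]:
                continue
            r = radius
            while True:
                # Collect background pixels inside the current window.
                neigh = [image[j][i]
                         for j in range(max(0, y - r), min(h, y + r + 1))
                         for i in range(max(0, x - r), min(w, x + r + 1))
                         if not mask[j][i]]
                if neigh or r > max(h, w):
                    break
                r += 1                       # widen the window until background is found
            if neigh:
                out[y][x] = tuple(sum(c) // len(neigh) for c in zip(*neigh))
    return out

# A white 3x3 patch with one black "character" pixel in the middle.
image = [[(255, 255, 255)] * 3 for _ in range(3)]
image[1][1] = (0, 0, 0)
mask = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
filled = fill_with_background(image, mask)
```

After filling, the patch is uniformly white, which is the kind of low-frequency content that JPEG compresses well.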

The encoding processing of the embodiment has been described above; it is summarized in FIG. 8, which is a conceptual diagram of the processing in the embodiment.

The scanned image 500 contains a character region 501, a handwritten region 502, and a natural image region 503.

For the character region 501, the color of each character in the region is detected and a palette is generated (process 504). The character image region is then binarized and MMR-encoded (process 505) to generate character region encoded data 506. This data consists of the coordinate data of the character region, the palette, and the MMR encoded data.

For the handwritten region 502, color reduction is performed and a palette is generated (process 507). MMR encoding is then performed for each binary plane indicated by a palette entry (process 508) to generate handwritten region encoded data 509. This data consists of the coordinates of each handwritten region, the palette, and the MMR encoded data for each plane.

For the whole page, including the natural image, the significant pixels (pixels that are not background pixels) in the character regions and handwritten regions are removed and the holes are filled with the nearby background color. This generates a page image 500' made up of an image 501' containing only the background of the character region, an image 502' containing only the background of the handwritten region, and the natural image region 503.

The page image 500' is then JPEG-encoded to generate background & natural image region encoded data 511. The structure of the JPEG encoded data is the usual one and needs no explanation.

Finally, a single file 512 in a predetermined format is generated that contains the character region encoded data 506, the handwritten region encoded data 509, and the background & natural image region encoded data 511, and this file is transferred to the file server 200.

FIG. 9 shows a specific configuration for realizing the processing described above. The processing is described below.

The original image 101 is image data read by an image scanner, in which each pixel is represented by 8 bits each for R, G, and B. The image binarization unit 102 extracts only the luminance component from the original image 101 (for example, the L component obtained by RGB→Lab conversion may be used), binarizes it with a threshold derived from the luminance distribution, and outputs binary image data 103. The region identifying unit 104 then processes the binary image data 103 in the same way as ordinary character recognition, detecting the distribution of character circumscribed rectangles, merging the rectangles, and so on, to determine character regions and non-character regions. It generates character region coordinate data 117 for the character regions and non-character region coordinate data 105 for the non-character regions.
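A rough sketch of the binarization step follows. The patent leaves the details open, so two substitutions are made here and should be read as assumptions: Rec. 601 luma is used in place of the Lab L component, and Otsu's method is used as one common way of deriving a threshold from the luminance distribution. Dark pixels are mapped to "1" so that ink becomes the significant value.

```python
def luma(r, g, b):
    """Integer approximation of Rec. 601 luma (stand-in for the Lab L component)."""
    return (299 * r + 587 * g + 114 * b) // 1000

def otsu_threshold(values):
    """Return the threshold t (0..255) that maximises the between-class variance."""
    hist = [0] * 256
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * n for i, n in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = sum0 = 0
    for t in range(256):
        w0 += hist[t]                 # pixels with value <= t
        sum0 += t * hist[t]
        w1 = total - w0               # pixels with value > t
        if w0 == 0:
            continue
        if w1 == 0:
            break
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def binarize(pixels):
    """Map a flat list of (R, G, B) pixels to 1 (dark, significant) or 0 (background)."""
    lum = [luma(*p) for p in pixels]
    t = otsu_threshold(lum)
    return [1 if v <= t else 0 for v in lum]

print(binarize([(0, 0, 0), (255, 255, 255), (10, 10, 10)]))
```

Any other distribution-based threshold could be dropped in for `otsu_threshold` without changing the rest of the pipeline.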

Note that the non-character region coordinate data 105 generated by the region identifying unit 104 does not distinguish between handwritten regions and natural image regions; the coordinates of both are mixed together.

The filling unit 106 takes the pixels that lie within a character region, according to the character region coordinate data, and that are "1" (significant) pixels in the binary image data 103, removes them from the original image 101, and fills the resulting holes with the color of the nearby pixels that are "0" in the binary image data. Likewise, using the handwritten region coordinate data 119 described later, it removes the original image pixels that lie within a handwritten region and are "1" (significant) pixels in the binary image data 103, and fills the holes with the color of the nearby "0" pixels.

As a result, the data output from the filling unit 106 is multi-valued image data like that shown in FIG. 6.

The character color extraction unit 113 treats pixels that lie within the regions indicated by the character region coordinate data 117 and are "1" (significant) in the binary image data as pixels that make up characters, and repeatedly extracts the color information of the pixels at the corresponding positions in the color image data. This detects the colors of all characters in the character regions, and the result is output as a palette. Binary image plane data is then generated for each color and encoded by the MMR encoding unit 115. The palette, the MMR encoded data, and information indicating the relationship between each plane and the palette are output as encoded data 121. In other words, the character color extraction unit 113 and the MMR encoding unit 115 encode the character regions shown in FIG. 3.

The pixel determination unit 109 detects pixels at which the luminance component of the input color image data changes steeply (the difference between the luminance components of adjacent pixels is at or above a predetermined threshold) and outputs the result to the handwritten region determination unit 110. This determination signal also indicates whether or not a pixel belongs to a character/line-drawing area.
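The steep-change test can be sketched as a comparison of each pixel's luminance with its four neighbours. The threshold value of 64 and the choice of a 4-neighbourhood are assumptions for illustration; the patent only states that the difference between adjacent pixels is compared with a predetermined threshold.

```python
def edge_attribute(lum, threshold=64):
    """Return a 0/1 map marking pixels whose luminance differs steeply from a neighbour.

    lum: 2-D list of luminance values (0..255).
    """
    h, w = len(lum), len(lum[0])
    attr = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0), (0, -1), (-1, 0)):
                ny, nx = y + dy, x + dx
                if 0 <= ny < h and 0 <= nx < w and abs(lum[y][x] - lum[ny][nx]) >= threshold:
                    attr[y][x] = 1
                    break
    return attr

# One dark pixel on a white background (example data).
lum = [
    [255, 255, 255, 255],
    [255,   0, 255, 255],
]
print(edge_attribute(lum))
```

This simple test marks pixels on both sides of an edge, so the background pixels next to a stroke are flagged too; combining the map with the binary image, as the handwritten region determination does, keeps only the stroke pixels.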

Based on the result from the pixel determination unit 109, the handwritten region determination unit 110 can tell whether the pixel of interest belongs to a character or line drawing or to something else. If it outputs only the pixel data for pixels that are "1" in the binary image data 103 and that fall within the non-character region coordinate data 105, the output consists of the pixels that make up hand-corrected characters and strokes. The coordinates of those pixels are therefore output as handwritten region coordinate data 119.
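The selection rule described above combines three conditions: the pixel is "1" in the binary image, it has the character/line-drawing attribute, and it lies inside a non-character region. A minimal sketch, assuming rectangles are given as (left, top, right, bottom) with exclusive right and bottom edges (a convention chosen for this example):

```python
def handwriting_pixels(binary, attr, non_text_rects):
    """Return (x, y) coordinates of pixels judged to belong to handwriting.

    binary: 2-D 0/1 binary image; attr: 2-D 0/1 character/line-drawing attribute map;
    non_text_rects: list of (left, top, right, bottom) rectangles for non-character regions.
    """
    def inside(x, y):
        return any(l <= x < r and t <= y < b for l, t, r, b in non_text_rects)

    return [(x, y)
            for y, row in enumerate(binary)
            for x, v in enumerate(row)
            if v and attr[y][x] and inside(x, y)]

binary = [[1, 0, 1],
          [0, 1, 1]]
attr = [[1, 1, 1],
        [1, 1, 0]]
print(handwriting_pixels(binary, attr, [(1, 0, 3, 2)]))
```

Natural image pixels inside a non-character region are mostly excluded because they rarely carry the character/line-drawing attribute.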

The color reduction & palette generation unit 111 reduces the colors of the handwritten region pixel data supplied by the handwritten region determination unit 110 as described above (to 2 bits each for R, G, and B in this embodiment), counts the reduced colors, and obtains their distribution (histogram). The histogram is split at its valleys, and one palette number is assigned to the color at the center of each segment. This yields, for example, N palette entries C_0, C_1, ..., C_{N-1} (N < 64). The MMR encoding unit 112 then MMR-encodes the pixels having the color of palette entry C_0 as a binary image, and does the same for C_1, ..., C_{N-1}. Each set of MMR encoded data is associated with its palette number, and the palette numbers together with the reduced color information are output as palette & MMR encoded data 120.

As a result, the processing units generate data 117 through 121, which are combined into a single file 130 in a predetermined format and output.

FIG. 10 shows the details of the color reduction & palette generation unit 111 and the MMR encoding unit 112 of the embodiment. Their processing is described below.

The color reduction unit 1003 is supplied with the character/line-drawing attribute information 1001 determined by the pixel determination unit 109, the original image data 1002, and the non-character region coordinate data 105.

When the pixel of interest has the character/line-drawing attribute and lies within the non-character region coordinates, the color reduction unit 1003 reduces the R, G, and B values (8 bits each) at that pixel position in the input original image to 2 bits each (6 bits in total), and outputs the reduced data to both the color information counting unit 1004 and the binarization unit 1008.

The color information counting unit 1004 contains counters that count the occurrences of each color expressed by the 6 bits in total (2 bits each for R, G, and B). Denoting each counter as CT(R (2 bits), G (2 bits), B (2 bits)), when the reduced color of the pixel of interest is (Rr, Gr, Br), the unit computes
CT(Rr, Gr, Br) ← CT(Rr, Gr, Br) + 1
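The counter update above maps naturally onto a dictionary keyed by the reduced color. The sketch below uses Python's `collections.Counter` as a stand-in for the bank of counters CT; the sample pixel values are invented for illustration.

```python
from collections import Counter

def count_colors(pixels):
    """Count occurrences of each 2-bit-per-channel color among 8-bit (R, G, B) pixels."""
    ct = Counter()
    for r, g, b in pixels:
        reduced = (r >> 6, g >> 6, b >> 6)   # (Rr, Gr, Br)
        ct[reduced] += 1                      # CT(Rr, Gr, Br) <- CT(Rr, Gr, Br) + 1
    return ct

# Two slightly different reds and one blue (example data).
ct = count_colors([(255, 0, 0), (250, 10, 5), (0, 0, 255)])
print(ct)
```

Slightly different scanned shades of the same ink fall into the same counter, which is what keeps the number of distinct colors small.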

When this processing has been carried out until the image input for one page is complete, the counters CT() hold the post-reduction frequencies of the significant pixels that make up printed characters that were not recognized as text because of handwriting, handwritten strokes, and handwritten characters (more precisely, handwritten characters that were not recognized as text).

The sorting unit 1005 sorts the counter values obtained as above in the three-dimensional R, G, B color space in descending order of frequency. The palette table generation unit 1006 then assigns palette numbers "0", "1", ... to the colors in descending order of frequency and generates a palette table 1007 indicating the relationship between palette numbers and colors. In the embodiment, since color reduction is to 2 bits per RGB component, the palette number is at most "63"; no palette number is assigned to a color whose frequency is 0.
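The sorting and palette assignment can be sketched in Python as follows. The function name build_palette is hypothetical, and the tie-breaking rule (ordering equal counts by color value) is an assumption, since the source does not say how ties are ordered.

```python
def build_palette(counts):
    """Assign palette numbers 0, 1, ... in descending order of frequency.

    `counts` maps a reduced (R, G, B) color to its pixel count. Colors with
    a count of 0 get no palette number. Ties are broken by color value
    (an assumption; the source does not say how ties are ordered).
    """
    used = [(color, n) for color, n in counts.items() if n > 0]
    used.sort(key=lambda item: (-item[1], item[0]))
    return {number: color for number, (color, _) in enumerate(used)}
```

For example, build_palette({(0, 0, 3): 9, (1, 0, 0): 5, (2, 2, 2): 0}) gives palette number 0 to (0, 0, 3) and number 1 to (1, 0, 0), and omits (2, 2, 2).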

The binarization unit 1008 performs binarization based on the color information of each pixel after color reduction from the color reduction unit 1003 and the information from the palette table generation unit 1006. Specifically, when (1, 0, 0) is set as the color (R, G, B) of palette number "0", only pixels whose output data from the color reduction unit 1003 is (1, 0, 0), that is, only pixels whose color matches palette "0", are output as "1" (1 bit), and all others are output as "0".

By performing this processing for every generated palette number, a binary image data group 1009 containing one binary image per palette number is generated.
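The per-palette binarization can be illustrated with the following Python sketch. It assumes the reduced image is held as rows of (R, G, B) tuples, with None for pixels that were not selected, and that the palette is a mapping from palette number to color; the function name binarize_by_palette and these data layouts are assumptions.

```python
def binarize_by_palette(reduced_image, palette):
    """Produce one binary image per palette number.

    `reduced_image` is a list of rows of reduced (R, G, B) tuples (or None
    for pixels that were not selected); `palette` maps palette number to
    color. A pixel is 1 where its color equals the palette color, else 0.
    """
    return {
        number: [[1 if px == color else 0 for px in row] for row in reduced_image]
        for number, color in palette.items()
    }
```

Each resulting plane is a bi-level image, which is the kind of input a bi-level coder such as MMR operates on.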

The MMR encoding unit 112 generates encoded data 1010 by MMR-encoding each binary image in the binary image data group 1009.

As described above, according to this embodiment, when handwritten corrections (overwriting or annotation) are made to a document created on a computer and printed by a printer, and the corrected result is digitized, the type characters originally printed by the printer and characters comparable to them, as well as portions no longer recognized as characters because of handwritten correction, handwritten line segments, and handwritten characters, are not subjected to encoding for natural images (JPEG in the embodiment) but are encoded by lossless encoding (MMR in the embodiment). The edges of handwritten characters and line segments, and of type characters no longer recognized as characters because of handwriting, are therefore preserved, and good encoded image data can be obtained. In addition, character colors can be reproduced with a number of colors that covers nearly all commonly used colors, so that even if a character color is not 100% faithful to the original image, it can be reproduced in a color close to it.

<Second Embodiment>
FIG. 11 shows a configuration that replaces that of FIG. 9 in the first embodiment. FIG. 11 differs from FIG. 9 in that a reduction unit 107 is placed immediately before the JPEG encoding unit 108.

What is JPEG-encoded is information comprising the natural image area and the areas from which characters have been extracted and filled with the background color; it contains no characters or line segments. In other words, it is an image in which gradation takes priority over resolution. Therefore, provided the reading resolution of the image scanner is sufficient, lowering the resolution has little effect on the image quality of the page as a whole. A reduction unit that reduces the filled image is therefore provided immediately before encoding by the JPEG encoding unit 108. As a result, the amount of JPEG-encoded data can be reduced significantly. Of course, information indicating the reduction ratio must be stored in the header (not shown) of the output file 130 so that the decoding process can handle it.
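As an illustration of such a reduction step, the following Python sketch halves the width and height of an image by averaging each 2x2 block of pixels. The source specifies neither the reduction ratio nor the resampling method, so the factor of 2, the box-filter averaging, and the function name shrink_half are all assumptions.

```python
def shrink_half(image):
    """Halve width and height by averaging each 2x2 block (box filter).

    `image` is a list of rows of (R, G, B) tuples with even dimensions.
    The 1/2 ratio and the averaging rule are assumptions for illustration.
    """
    out = []
    for y in range(0, len(image), 2):
        row = []
        for x in range(0, len(image[y]), 2):
            block = [image[y][x], image[y][x + 1],
                     image[y + 1][x], image[y + 1][x + 1]]
            row.append(tuple(sum(p[c] for p in block) // 4 for c in range(3)))
        out.append(row)
    return out
```

Under this assumed ratio the encoder receives one quarter as many pixels, and a decoder would need the stored ratio to enlarge the decoded image back to page size.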

<Third Embodiment>
Some multifunction peripherals already have, within the document reading unit, a configuration equivalent to the pixel determination unit 109 described in the first embodiment. In that case, it suffices to input that determination result, so the configuration shown in FIG. 12 can be used. FIG. 12 differs from FIG. 9 in that the pixel determination unit 109 is removed and the corresponding determination result, attribute information 125, is input instead. The configuration shown in the second embodiment may also be applied to the configuration of this third embodiment.

The embodiments of the present invention above have been described as applied to a multifunction peripheral, but it is clear that the invention can also be realized by executing, on a computer connected to an image input device such as an image scanner, a program that performs processing substantially equivalent to that shown in FIG. 8. Furthermore, a computer program is normally stored in a computer-readable storage medium such as a CD-ROM and becomes executable when the medium is set in a computer and the program is copied or installed into the system; such a computer-readable storage medium therefore naturally falls within the scope of the present invention.

FIG. 1 is an overall system configuration diagram according to the embodiment.
FIG. 2 is a diagram illustrating an example of a document image to be read in the embodiment.
FIG. 3 is a diagram showing the character areas in the document image.
FIG. 4 is a diagram showing the handwritten area and the natural image area in the document image.
FIG. 5 is a diagram showing the handwritten area in the embodiment.
FIG. 6 is a diagram showing the data format of the entire page to be JPEG-encoded.
FIG. 7 is a diagram showing the processing performed by the filling unit.
FIG. 8 is a diagram showing the flow of image data processing in the embodiment.
FIG. 9 is a block diagram relating to the digitization of image data in the first embodiment.
FIG. 10 is a diagram showing the detailed configuration of the color reduction & palette generation unit and the MMR encoding unit in the first embodiment.
FIG. 11 is a block diagram relating to the digitization of image data in the second embodiment.
FIG. 12 is a block diagram relating to the digitization of image data in the third embodiment.

Claims

An image processing apparatus comprising:
character region determination means for determining a character region in an input image and generating character region coordinate information and non-character region coordinate information;
handwritten region determination means for inputting attribute information indicating whether or not a target pixel belongs to a character / line drawing, determining a handwritten region based on the attribute information and the non-character region coordinate information, and generating handwritten region coordinate information;
filling means for filling, based on the character region coordinate information and the handwritten region coordinate information, the character and handwritten portions in the input image with surrounding colors;
first encoding means for encoding the entire image obtained by the filling performed by the filling means, using encoding means for gradation images;
second encoding means for encoding the character and handwritten portions using encoding means for character / line drawings; and
output means for including the encoded data obtained by the first and second encoding means in a single data file and outputting the data file.

The image processing apparatus according to claim 1, wherein the second encoding means includes:
means for generating a binary image corresponding to each color and generating a palette indicating the color of each binary image; and
lossless encoding means for losslessly encoding the generated binary images,
and wherein a table indicating the relationship between the generated palette and the colors, and the encoded data obtained by the lossless encoding means, are output as the encoding result.

The image processing apparatus according to claim 1, wherein the second encoding means includes color reduction means and encodes the image data after color reduction by the color reduction means.

The image processing apparatus according to claim 1, wherein the first encoding means includes reduction means for reducing the entire image obtained by the filling performed by the filling means to a predetermined size, and encodes the image reduced by the reduction means.

The image processing apparatus according to claim 1, wherein the attribute determination means outputs attribute information indicating a character / line drawing when the luminance difference or density difference between adjacent pixels of the input image is larger than a predetermined threshold.

An image processing method comprising:
a character region determination step of determining a character region in an input image and generating character region coordinate information and non-character region coordinate information;
a handwritten region determination step of inputting attribute information indicating whether or not a target pixel belongs to a character / line drawing, determining a handwritten region based on the attribute information and the non-character region coordinate information, and generating handwritten region coordinate information;
a filling step of filling, based on the character region coordinate information and the handwritten region coordinate information, the character and handwritten portions in the input image with surrounding colors;
a first encoding step of encoding the entire image obtained by the filling step, using encoding means for gradation images;
a second encoding step of encoding the character and handwritten portions using encoding means for character / line drawings; and
an output step of including the encoded data obtained by the first and second encoding steps in a single data file and outputting the data file.

A computer program that, when read and executed by a computer, causes the computer to function as:
character region determination means for determining a character region in an input image and generating character region coordinate information and non-character region coordinate information;
handwritten region determination means for inputting attribute information indicating whether or not a target pixel belongs to a character / line drawing, determining a handwritten region based on the attribute information and the non-character region coordinate information, and generating handwritten region coordinate information;
filling means for filling, based on the character region coordinate information and the handwritten region coordinate information, the character and handwritten portions in the input image with surrounding colors;
first encoding means for encoding the entire image obtained by the filling performed by the filling means, using encoding means for gradation images;
second encoding means for encoding the character and handwritten portions using encoding means for character / line drawings; and
output means for including the encoded data obtained by the first and second encoding means in a single data file and outputting the data file.

A computer-readable storage medium storing the computer program according to claim 7.