JPS61107876A

JPS61107876A - Picture processing device

Info

Publication number: JPS61107876A
Application number: JP59229266A
Authority: JP
Inventors: Hiroshi Tanioka; 宏谷岡
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 1984-10-31
Filing date: 1984-10-31
Publication date: 1986-05-26
Anticipated expiration: 2011-06-12
Also published as: JP2505402B2

Abstract

PURPOSE: To compress a binary picture signal with high efficiency, store it, and later decode it, by having the storing means hold, in addition to the picture data for each area, the area classification and the size of the smallest area. CONSTITUTION: A picture read by a reading device 1 with a solid-state image pickup element is binarized in a binarization and coordinate-arranging part 2. If the picture was input at an inclination, it is rotated so that the character rows match the address-space coordinates of a page memory 5, and it is stored in the memory 5 after this coordinate shaping. In a mesh-dividing coding part 3, halftone areas and pattern or line-picture areas are MH-coded at the dot level, and the compressed coded data are stored in a data storing means 4. In a decoding part 8, printing-type fonts stored in a font ROM 9 are read out according to the mesh size and the coded data, reproduced successively as a row of type in a line memory 7, and output as a visible image by an output device 6.

Description

DETAILED DESCRIPTION OF THE INVENTION

[Technical Field] The present invention relates to an image processing device that block-encodes and stores a binarized image signal and later decodes it.

[Prior Art] When an image is read by a reading device such as a copier, binarized, and then filed on, for example, an optical disk device, it is desirable to encode it for data compression. However, the redundancy of an image signal varies with the image tone, so compressing an entire image that mixes text, photographs, graphics, and so on with a single encoding method, as in the past, gives poor compression efficiency.

[Objective] The present invention has been made in view of the above drawbacks of the prior art, and its object is to provide an image processing device that compresses a binarized image signal with high efficiency, stores it, and later decodes it.

[Embodiment] In outline, the present invention divides an image area of a binarized image signal that contains character rows of a fixed size into meshes, determines a mesh size such that each character fits within one mesh, performs character recognition mesh by mesh, and encodes the recognized characters.

The invention is further characterized in that characters that do not fit the mesh, that is, characters of a different size, as well as graphic and photographic areas, are separated out, and conventional pixel-based encoding is applied to those image areas.

With these features in mind, an embodiment of the present invention is described concretely below with reference to the drawings.

FIG. 1 is a block diagram of an image processing apparatus according to one embodiment.

Reference numeral 1 denotes an image reading unit using a solid-state image sensor such as a CCD. The read image is binarized in unit 2. In addition, so that character rows line up with the address-space coordinates of the page memory, an image that was input at an angle is rotated to straighten its coordinates and is then stored in page memory 5. Reference numeral 3 denotes the part characteristic of the present invention, which may be called the mesh-division encoding processing unit. The encoded data is stored in data storage means 4.

Meanwhile, decoding unit 8 reads out type fonts stored in font ROM 9 on the basis of the mesh size and the encoded data and reproduces them in sequence as a row of type in line memory 7, and output device 6 outputs the result as a visible image.

Next, an outline of the image processing in the mesh-division image processing unit 3 of this embodiment is explained step by step with reference to the flowchart of FIG. 2.

<Step 20> Determination of mesh size
From one page of image data D(x, y) held in page memory 5, histograms of the number of black dots are obtained in both the x and y directions, where x and y are suitable orthogonal coordinate axes in page memory 5.

To obtain the x-direction histogram, the black dots at all y coordinate values are counted for a given x coordinate value, and this is repeated for every x coordinate value. The y-direction histogram is created likewise, by counting the black dots at all x coordinate values for each y coordinate value.
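The two projection histograms described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page is represented as a list of rows with 1 for a black dot and 0 for white, and the function name `projection_histograms` is hypothetical, not taken from the patent.

```python
def projection_histograms(page):
    """Return (hist_x, hist_y) for a binary page.

    page[y][x] is 1 for a black dot and 0 for white (assumed layout).
    hist_x[x] counts black dots over all y for that x;
    hist_y[y] counts black dots over all x for that y.
    """
    height = len(page)
    width = len(page[0]) if height else 0
    hist_x = [sum(page[y][x] for y in range(height)) for x in range(width)]
    hist_y = [sum(row) for row in page]
    return hist_x, hist_y


# Small example page: two "characters" on the first text line, one on the second.
example_page = [
    [0, 1, 1, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
]
example_hist_x, example_hist_y = projection_histograms(example_page)
```

Columns or rows whose count is zero correspond to the blank gaps between characters and between lines.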

When this method is applied to the lines of text shown in FIG. 3, the x-direction histogram of FIG. 4(a) and the y-direction histogram of FIG. 4(b) are obtained. The "valleys" in the histograms of FIGS. 4(a) and 4(b) can be taken as the blanks between characters and between lines, respectively. For text whose character size is roughly constant, as in FIG. 3, the histograms are periodic, as FIGS. 4(a) and 4(b) show. When characters of different sizes are mixed, or when figures and the like are included, the periodicity of the histograms breaks down.

In general, about 80% of the characters on a page are of the same size. Therefore, applying predetermined thresholds Sx and Sy to the black-dot count distributions in each direction shown in FIG. 3 yields the character position coordinates $(x_1, y_1), (x_2, y_2), (x_3, y_3), \ldots$. The differences

$(x_2 - x_1),\ (x_3 - x_2),\ (x_4 - x_3),\ \ldots,\ (x_n - x_{n-1})$

and

$(y_2 - y_1),\ (y_3 - y_2),\ (y_4 - y_3),\ \ldots,\ (y_n - y_{n-1})$

are then computed and turned into histograms, giving FIGS. 5(a) and 5(b). If the values with the highest frequency are Mx and My, the mesh size sought in step 20 is Mx pixels in the x direction and My pixels in the y direction. When the text is partitioned with a mesh of this size, most characters fall one to a mesh.
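As a rough sketch of this pitch estimation, the following Python thresholds a one-dimensional projection histogram, records where each character (or line) begins, and takes the most frequent spacing between successive starts. The function names and the choice of the rising edge as the "character position" are assumptions made for illustration; the patent does not specify which edge is used.

```python
from collections import Counter


def rising_edges(hist, threshold):
    """Positions where the histogram first exceeds the threshold after a gap."""
    edges = []
    prev_above = False
    for pos, count in enumerate(hist):
        above = count > threshold
        if above and not prev_above:
            edges.append(pos)
        prev_above = above
    return edges


def mesh_pitch(hist, threshold):
    """Most frequent spacing between successive character starts, or None."""
    edges = rising_edges(hist, threshold)
    diffs = [b - a for a, b in zip(edges, edges[1:])]
    if not diffs:
        return None
    return Counter(diffs).most_common(1)[0][0]


# Four characters at a pitch of 4 and one wider character at a pitch of 6.
example_hist = [0, 5, 5, 5, 0, 5, 5, 5, 0, 5, 5, 5, 0, 5, 5, 5, 5, 5, 0, 5, 5, 5, 0]
example_mx = mesh_pitch(example_hist, threshold=0)
```

Calling the same function on the y-direction histogram with threshold Sy would give My in the same way.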

The character size itself can also be determined accurately: the values $(x_1 - x'_1),\ (x_2 - x'_2),\ \ldots,\ (x_n - x'_n)$ and $(y_1 - y'_1),\ (y_2 - y'_2),\ \ldots,\ (y_n - y'_n)$ are computed, histogrammed in the same way, and their peak is found.

Let the character area determined in this way be Mx' × My' in FIG. 6.

FIG. 6 shows the text of FIG. 3 divided into meshes of size Mx × My (it also shows Mx' × My', the area recognized as a single character). As FIG. 6 makes clear, a mesh according to the present invention contains one character and also the blank background around it. As described later, because encoding covers the blank area together with the character, the compression ratio of the encoding method of the present invention improves greatly.

The encoding method described above can be expected to give a very high compression ratio for documents whose character size is uniform. In ordinary documents, however, the character size is seldom uniform as it is in FIG. 3, and most documents also contain graphic or photographic areas, so dividing and encoding the whole page with the mesh described above cannot be expected to improve the compression ratio.

The next step, step 21, therefore describes in detail an algorithm for detecting the areas to which the mesh cannot be applied.

<Step 21> Determination of image areas unsuited to mesh division
The following are image areas for which division by the mesh is not appropriate.

(1) Characters (text) of a different size
(2) Graphic and photographic areas
(3) Character areas whose background is not white (background plus characters)
(4) Proportionally spaced printed documents

Step 21 is a method of identifying images (1) to (4) when they are mixed into regular character rows such as those of FIG. 6, and is explained below.

When the character rows of FIG. 6 are mixed with characters of a different size, item (1) (FIG. 7), applying the mesh division Mx × My obtained by the method of step 20 to the rows of differently sized characters may divide them as shown by M-1 to M-4 in FIG. 7.

Taking mesh M-1 as an example, part of a character extends into the blank portion at the bottom of the mesh (the interline space). Therefore, if the character area within the mesh is specified and the part outside it is checked for black pixels, image areas containing characters of a different size can be identified. The character area within the mesh may be taken as a square whose side is the smaller of Mx and My (Mx × Mx in FIG. 6). Alternatively, for greater accuracy, the presence of black pixels outside the character area may be checked using the character area Mx' × My' obtained directly, as described above, rather than from the character pitch alone.

Even with characters of a different size, there may be no black dots outside the character area, as in M-4 of FIG. 7. The adjacent mesh M-3, however, can clearly be identified as not conforming to the mesh. That is, step 21 judges conformity mesh by mesh, and the following step 22 judges the nonconforming meshes two-dimensionally so that the nonconforming area can be determined.
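The per-mesh conformity test of step 21 might look like the following Python sketch. It assumes, purely for illustration, that the character area is anchored at the top-left corner of the mesh and that a mesh is a list of rows with 1 for black; the patent does not state where the character area sits inside the mesh, and the function name is hypothetical.

```python
def has_black_outside_char_area(mesh, char_w, char_h):
    """True if any black dot lies outside the char_w x char_h character area.

    mesh[y][x] is 1 for black, 0 for white. The character area is assumed
    to occupy the top-left corner of the mesh (an illustrative assumption).
    """
    for y, row in enumerate(mesh):
        for x, dot in enumerate(row):
            if dot and (x >= char_w or y >= char_h):
                return True
    return False


# A regular-size character: black dots stay inside the 3 x 3 character area.
regular_mesh = [
    [1, 1, 1, 0],
    [1, 0, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]
# Part of a larger character: strokes spill into the inter-character
# and interline blank.
oversize_mesh = [
    [1, 1, 1, 1],
    [1, 0, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 1, 1],
]
```

A mesh for which the function returns True would be marked nonconforming; an all-white mesh returns False and so counts as conforming.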

Graphic and photographic areas (2) and areas with image information in the background (3) can likewise be judged nonconforming by the processing described above.

In FIG. 6, however, if a straight line parallel to the x axis lies within the character areas of adjacent meshes, the processing above cannot identify the area as nonconforming. Mx and My are therefore compared, and it is judged whether several black dots lie on the mesh lines of the longer side, which in this embodiment run in the y-axis direction. If they do, the mesh areas divided by that line are judged nonconforming, which allows the straight line to be identified.
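One way to picture this boundary check is the Python sketch below: it counts black dots along the vertical mesh boundary between two horizontally adjacent meshes. The function name, the page layout (list of rows, 1 for black), and the `min_dots` threshold are all assumptions for illustration; the patent says only that "several" black dots on the boundary indicate a line.

```python
def boundary_has_black(page, boundary_x, y0, mesh_h, min_dots=1):
    """True if at least min_dots black dots lie on the column boundary_x
    between rows y0 and y0 + mesh_h - 1.

    In regular text this column falls in the inter-character blank, so black
    dots here suggest a stroke (such as a horizontal rule) crossing meshes.
    """
    dots = sum(page[y][boundary_x] for y in range(y0, y0 + mesh_h))
    return dots >= min_dots


# Two 4-wide meshes side by side, 6 rows high; row 2 holds a horizontal rule.
ruled_page = [[0] * 8 for _ in range(6)]
ruled_page[2] = [1] * 8
blank_page = [[0] * 8 for _ in range(6)]
```

Here `boundary_x` would be a multiple of Mx, and a True result marks the meshes on both sides of that boundary as nonconforming.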

<Step 22> Separation of encoding areas
In this step, on the basis of the nonconformity judged in step 21, one image is divided into two kinds of area according to the encoding method to be used:

(1) Areas to be encoded at the pixel-dot level, using MH (Modified Huffman), MR (Modified READ), or the like
(2) Areas to be character-coded on the basis of the mesh division described above

That is, with the present invention most of a page (including blank areas) is divided into meshes that each contain one character, according to the character size, so the code-based encoding described later is possible; for halftone, graphic, and line-drawing areas it is preferable to apply existing dot-level encoding.

For example, suppose that a page is divided into meshes as shown in FIG. 8 and that the meshes judged nonconforming in step 21 (marked ■) are scattered over it. The encoding areas can then be separated as follows.

The rows of meshes running in the x direction are named mesh line Y1, mesh line Y2, ..., mesh line Y28. If a mesh line contains even one nonconforming mesh, that mesh line is encoded at the dot level, method (1), along the x direction. This embodiment uses MH encoding.

Accordingly, in FIG. 8, mesh lines Y2 to Y4, Y6, Y14 to Y18, and one further mesh line whose number is illegible in the source ("Y2[i") are MH-encoded, and all the others undergo the mesh-division character encoding of the present invention.

In the mesh division of the present invention, areas without black dots are treated as conforming meshes, which improves the compression ratio. To improve the accuracy of the separation, it is also proposed that the two conforming mesh lines adjoining a nonconforming mesh line on either side in the y direction be treated as nonconforming and MH-encoded as well.
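The mesh-line classification of step 22, including the optional widening to neighboring mesh lines, can be sketched as follows. The grid representation (a list of mesh lines, each a list of booleans that are True for a nonconforming mesh) and the labels "MH" and "CHAR" are assumptions chosen for this example.

```python
def classify_mesh_lines(nonconforming, widen=True):
    """Label each mesh line "MH" or "CHAR".

    nonconforming[row][col] is True if that mesh failed the step 21 test.
    A mesh line with any nonconforming mesh is MH-coded; with widen=True the
    mesh lines directly above and below it are MH-coded as well.
    """
    flagged = [any(row) for row in nonconforming]
    if widen:
        widened = list(flagged)
        for i, bad in enumerate(flagged):
            if bad:
                if i > 0:
                    widened[i - 1] = True
                if i + 1 < len(flagged):
                    widened[i + 1] = True
        flagged = widened
    return ["MH" if bad else "CHAR" for bad in flagged]


example_grid = [
    [False, False],
    [False, True],   # one nonconforming mesh in this mesh line
    [False, False],
    [False, False],
]
```

Widening trades a little compression for safety: a large character or figure whose edge barely enters a neighboring mesh line is still handled by the dot-level coder.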

<Step 23> Mesh-division character encoding
By the end of steps 21 and 22 the page has been divided into Mx × My meshes and each mesh has been judged as to whether it holds a single character. In step 23, therefore, nonconforming meshes are MH-encoded, and for conforming meshes the character in each mesh is recognized mesh by mesh.

Various methods have already been proposed for this kind of recognition, and in principle any of them can be used. This embodiment uses DP (Dynamic Pattern) matching. DP matching is a pattern-matching technique based on dynamic programming: when the distance between an input pattern and a registered dictionary pattern is calculated, the pattern is stretched and compressed nonlinearly so that the distance over the pattern as a whole is minimized.
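To make the idea of nonlinear stretching concrete, here is a minimal dynamic-programming alignment between two one-dimensional feature sequences (the classic dynamic time warping recurrence). It is a generic sketch, not the patent's recognizer: the feature representation (for example, per-column black-dot counts of a character) and the function name are assumptions.

```python
def dp_matching_distance(a, b):
    """Minimum cumulative |a[i] - b[j]| over monotonic alignments of a and b.

    Each element of one sequence may be matched to one or more consecutive
    elements of the other, which is what lets the pattern stretch or shrink.
    """
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch b
                                 cost[i][j - 1],      # stretch a
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]


# The input is a horizontally stretched copy of the first dictionary pattern.
input_pattern = [0, 3, 3, 0]
dictionary = {"A": [0, 3, 0], "B": [0, 0, 0]}
best_label = min(dictionary, key=lambda k: dp_matching_distance(input_pattern, dictionary[k]))
```

Recognition then amounts to picking the dictionary entry with the smallest distance, as the last line does.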

The dictionary patterns used are the approximately 2,000 Joyo kanji and other fonts, and each recognized character is encoded into 2 bytes, for example two ASCII codes.

<Step 24> Storage of data
Data is stored in the data storage means page by page, and each page is further divided into one record per mesh line. The per-page parameters are the mesh size Mx, My; the per-mesh-line parameter is a code type flag, placed at the head of each mesh line, that indicates whether the character encoding was applied. Since in this embodiment MH encoding is the pixel-dot encoding applied, two kinds of encoded data, switched mesh line by mesh line, suffice.
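A possible byte layout for such a page is sketched below: a page header carrying Mx and My, followed by one record per mesh line, each starting with a code type flag. The field widths, the flag values, and the explicit length field are hypothetical choices made so the example round-trips; the patent itself delimits MH data with its own codes and does not define a byte format.

```python
import struct

FLAG_CHAR = 0  # mesh line holds 2-byte character codes (assumed value)
FLAG_MH = 1    # mesh line holds MH-coded pixel data (assumed value)


def pack_page(mx, my, records):
    """Serialize a page: header (Mx, My) then (flag, length, payload) records."""
    out = bytearray(struct.pack(">HH", mx, my))
    for flag, payload in records:
        out += struct.pack(">BH", flag, len(payload))
        out += payload
    return bytes(out)


def unpack_page(blob):
    """Inverse of pack_page; returns (mx, my, [(flag, payload), ...])."""
    mx, my = struct.unpack_from(">HH", blob, 0)
    offset = 4
    records = []
    while offset < len(blob):
        flag, length = struct.unpack_from(">BH", blob, offset)
        offset += 3
        records.append((flag, blob[offset:offset + length]))
        offset += length
    return mx, my, records


example_blob = pack_page(24, 32, [(FLAG_CHAR, b"AB"), (FLAG_MH, b"\x01\x02\x03")])
```

Storing Mx and My once per page and a single flag per mesh line is what lets the decoder switch between the two decoders without inspecting the payload.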

Next, the control flow of FIG. 2 is explained more concretely with reference to the flowchart of FIG. 9.

<Step 100> The mesh sizes Mx and My are determined by the method described above.

<Step 102> The image signal for one page in page memory 5 is divided into Mx × My meshes.

<Step 104> One line of meshes, with width My in the y direction, is taken out.

<Step 106> The image signal is taken out one mesh at a time, in order, from the line taken out in step 104.

<Step 108> It is judged whether there are black dots outside the character area of the mesh.

This distinguishes characters of a different size and images such as photographs from characters of the regular size. If there are black dots (YES), the mesh is judged nonconforming in step 114.

<Step 110> If step 108 finds no black dots outside the character area (NO), step 110 further checks for black dots on the y axis to judge whether the mesh contains a straight line parallel to the x axis. If it does, the mesh is judged nonconforming (step 114).

If there are no black dots in either case, the mesh is judged conforming (step 112).

<Step 116> Step 116 judges whether every mesh in the mesh line of width My has been judged conforming or nonconforming. If not, processing returns to step 106 and the flow above is repeated.

<Step 118> When every mesh of the mesh line has been judged, step 118 examines the results. If even one mesh is nonconforming, MH encoding is performed (step 126), and an encoding type flag indicating MH encoding, together with the Terminating Code and Make-up Code, is created (step 128).

<Steps 120 to 124> If every mesh in the mesh line was judged conforming, mesh-division character encoding is performed (step 120): the character in each mesh is recognized by the DP matching described above and converted into a 2-byte ASCII code.

<Steps 130 to 134> The encoding type code, Terminating Code, and so on for each mesh line, plus Mx and My if the mesh line is the first line of the page, are added as data and stored in data storage means 4.

<Steps 136 to 140> Step 104 and the following steps are repeated until the whole page has been processed.

Decoding proceeds as follows. The record for each mesh line of each page is read from data storage means 4. On the basis of the mesh size data Mx, My stored with the first line, a line memory is prepared; in this embodiment it holds (number of pixels in the x direction) × My. MH-encoded mesh lines are decoded one line at a time. Mesh lines that were mesh-division character-encoded are read 2 bytes at a time as character codes; the corresponding character is taken from the font ROM, converted to a size that fits within the mesh, and rendered at the dot level. Everything in a mesh outside the character is decoded as white.
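Decoding one character-coded mesh line can be illustrated with the short Python sketch below. The font table is a hypothetical stand-in for the font ROM, glyphs are assumed to be already scaled to fit the mesh (scaling is not shown), and the line memory is modeled as a list of My rows.

```python
def render_char_line(codes, font, mx, my):
    """Render character codes into a line memory of my rows and mx * len(codes) columns.

    font maps a character code to a glyph bitmap (list of rows, 1 = black).
    Glyphs are drawn at the top-left of each mesh and clipped to the mesh;
    every other dot in the mesh stays white (0). Unknown codes leave the
    mesh blank.
    """
    line = [[0] * (mx * len(codes)) for _ in range(my)]
    for i, code in enumerate(codes):
        glyph = font.get(code)
        if glyph is None:
            continue
        for y, row in enumerate(glyph[:my]):
            for x, dot in enumerate(row[:mx]):
                line[y][i * mx + x] = dot
    return line


# Hypothetical 2 x 2 glyph for code 0x41; code 0x20 (space) has no glyph.
example_font = {0x41: [[1, 1], [1, 0]]}
example_line = render_char_line([0x41, 0x20], example_font, mx=3, my=3)
```

An MH-coded mesh line would bypass this routine and be expanded directly into the same line memory by an MH decoder.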

This processing is repeated for each mesh line to decode one page.

As explained above, this embodiment exploits the fact that most characters in a single document are of the same size, and encodes after character recognition using a mesh that contains the character together with the interline blank. As a modification, after the encoding areas are separated in step 22, an area judged to need dot-level encoding but containing only characters of a different size can be given a second mesh division and recognized again. This second mesh-division character encoding gives still more efficient encoding.

For proportionally spaced text, the present invention can also be applied if, after the character sizes are recognized, the characters are allocated to meshes and re-edited.

[Effects] As explained above, the image processing device of the present invention efficiently stores image data with a higher data compression ratio, and because the area type and area size are known at decoding time, decoding can also be fast.

Furthermore, the present invention can also be applied to character segmentation for OCR: characters can be cut out accurately and selectively from pages of unknown format, from newspapers to magazines, which in turn improves the recognition rate. Data read at the pixel level can also be segmented by another method, or shown on a display as reject (unreadable) characters.

[Brief explanation of drawings]

FIG. 1 is a block diagram of an embodiment of the present invention; FIG. 2 is a schematic diagram of the processing flow of the embodiment; FIG. 3 is a visual representation of an input document in the page memory; FIGS. 4(a) and 4(b) are histograms in the x and y directions, respectively; FIGS. 5(a) and 5(b) show how the mesh size is determined in the x and y directions, respectively; FIG. 6 shows the input document divided by the determined mesh; FIG. 7 shows the same mesh applied to characters of a different size; FIG. 8 shows an example of the conforming/nonconforming judgment for the mesh-divided image signal of one page; and FIG. 9 is a control flowchart of the control unit. In the figures: 5, page memory; 3, mesh-division encoding processing unit; 4, data storage means; 9, font ROM.

Claims

[Claims]

(1) An image processing device comprising: dividing means for dividing a read document image into at least two kinds of areas according to image tone; encoding means for encoding each area by a different encoding method; and storage means for storing the encoded image data, wherein the storage means stores, in addition to the image data for each area, the type of the area and the size of the smallest area among the areas.

(2) An image processing device in which the image data stored in the storage means is read out and decoded, with the decoding switched area by area according to the encoding method.