JP5067182B2

JP5067182B2 - Image processing apparatus and image processing program

Info

Publication number: JP5067182B2
Application number: JP2008026635A
Authority: JP
Inventors: 俊一木村; 雅則関野
Original assignee: Fuji Xerox Co Ltd; Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2008-02-06
Filing date: 2008-02-06
Publication date: 2012-11-07
Anticipated expiration: 2028-02-06
Also published as: JP2009187292A

Abstract

PROBLEM TO BE SOLVED: To provide an image processing apparatus that, when an image is described by drawing information, increases the data reduction achieved by compression and reduces the data amount of that drawing information.

SOLUTION: Representative pixel block determining means of the image processing apparatus determines, from among similar pixel blocks in an image, a pixel block that represents those pixel blocks. Representative pixel block drawing information generating means then generates, using relative positions, the drawing information of the representative pixel block so determined. Individual pixel block drawing information generating means generates the drawing information of each individual pixel block by adding an absolute position in the image to the drawing information generated by the representative pixel block drawing information generating means, and reversible compression means reversibly compresses the drawing information generated by the individual pixel block drawing information generating means.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to an image processing apparatus and an image processing program.

Electronic document formats include the Portable Document Format (PDF, registered trademark), SVG (Scalable Vector Graphics), which is a format that represents images as vectors, PostScript (registered trademark), which is a page description language, and the document formats used by various word processors.
In an electronic document described in one of these formats, character shapes and graphic shapes can be expressed using drawing commands for straight lines, curves, and the like.
Normally, a user draws straight lines and curves in application software, and the application software converts them into drawing commands and then into one of the electronic document formats above.
There are also techniques that, when the input is an image, analyze the input image and convert the image information into drawing commands. That is, the input image can be converted into drawing commands by re-describing the outline of each closed region of black pixels with drawing commands and then filling the closed region (with a fill command).
Re-describing an image with drawing commands in this way yields an image with few jaggies even when the image is enlarged.

As a related technique, Patent Document 1, for example, aims at conversion into an outline font that is closer to the actual character contour. It discloses that a CPU determines whether data read from a RAM is straight-line data or curve data; when the data is determined to be curve data, the CPU determines whether the curve formed by that data has an extremum; when it does, the CPU divides the curve at the position of the extremum; and the data of the curves obtained by the division, together with the data of the lines determined to be straight lines, is stored in the RAM.
Japanese Patent Laid-Open No. 08-115419

Drawing information in an electronic document is often written as text codes (for example, ASCII codes). In that case, multiple text codes are needed to express a single straight line or curve.
An object of the present invention is to provide an image processing apparatus and an image processing program that, when an image is described by drawing information, increase the data reduction achieved by compression and reduce the data amount of that drawing information.

The gist of the present invention for achieving this object lies in the inventions of the following items.
The invention of claim 1 is an image processing apparatus comprising: representative pixel block determining means for determining, from among similar pixel blocks in an image, a pixel block that represents those pixel blocks; representative pixel block drawing information generating means for generating, using relative positions, drawing information of the representative pixel block determined by the representative pixel block determining means; individual pixel block drawing information generating means for generating drawing information of individual pixel blocks by adding absolute positions in the image to the drawing information generated by the representative pixel block drawing information generating means; and reversible compression means for reversibly compressing the drawing information generated by the individual pixel block drawing information generating means, wherein the individual pixel block drawing information generating means places the pixel blocks corresponding to the representative pixel block determined by the representative pixel block determining means close to one another in drawing order.

The invention of claim 2 is an image processing apparatus comprising: representative pixel block determining means for determining, from among similar pixel blocks in an image, a pixel block that represents those pixel blocks; representative pixel block drawing information generating means for generating, using relative positions, drawing information of the representative pixel block determined by the representative pixel block determining means; individual pixel block drawing information generating means for generating drawing information of individual pixel blocks by adding absolute positions in the image to the drawing information generated by the representative pixel block drawing information generating means; and reversible compression means for reversibly compressing the drawing information generated by the individual pixel block drawing information generating means, wherein the individual pixel block drawing information generating means causes the pixel blocks corresponding to the representative pixel block determined by the representative pixel block determining means to be drawn consecutively.

The invention of claim 3 is the image processing apparatus according to claim 1 or 2, wherein the individual pixel block drawing information generating means generates the drawing information of the individual pixel blocks based on the result of compression by the reversible compression means.

The invention of claim 4 is the image processing apparatus according to any one of claims 1 to 3, wherein, for pixel blocks that become similar under an affine transformation as well, the individual pixel block drawing information generating means generates the drawing information of the individual pixel blocks so as to include the drawing information generated by the representative pixel block drawing information generating means and drawing information for the affine transformation.

The invention of claim 5 is an image processing program that causes a computer to function as: representative pixel block determining means for determining, from among similar pixel blocks in an image, a pixel block that represents those pixel blocks; representative pixel block drawing information generating means for generating, using relative positions, drawing information of the representative pixel block determined by the representative pixel block determining means; individual pixel block drawing information generating means for generating drawing information of individual pixel blocks by adding absolute positions in the image to the drawing information generated by the representative pixel block drawing information generating means; and reversible compression means for reversibly compressing the drawing information generated by the individual pixel block drawing information generating means, wherein the individual pixel block drawing information generating means places the pixel blocks corresponding to the representative pixel block determined by the representative pixel block determining means close to one another in drawing order.
The invention of claim 6 is an image processing program that causes a computer to function as: representative pixel block determining means for determining, from among similar pixel blocks in an image, a pixel block that represents those pixel blocks; representative pixel block drawing information generating means for generating, using relative positions, drawing information of the representative pixel block determined by the representative pixel block determining means; individual pixel block drawing information generating means for generating drawing information of individual pixel blocks by adding absolute positions in the image to the drawing information generated by the representative pixel block drawing information generating means; and reversible compression means for reversibly compressing the drawing information generated by the individual pixel block drawing information generating means, wherein the individual pixel block drawing information generating means causes the pixel blocks corresponding to the representative pixel block determined by the representative pixel block determining means to be drawn consecutively.

According to the image processing apparatus of claim 1, when an image is described by drawing information, the data reduction achieved by compression can be increased and the data amount of the drawing information can be reduced. The data reduction achieved by compression can be increased still more, and the data amount of the drawing information can be reduced.

According to the image processing apparatus of claim 2, when an image is described by drawing information, the data reduction achieved by compression can be increased and the data amount of the drawing information can be reduced. The data reduction achieved by compression can be increased further, and the data amount of the drawing information can be reduced.

According to the image processing apparatus of claim 3, the generation and the compression of the drawing information of the individual pixel blocks can be made faster.

According to the image processing apparatus of claim 4, the data reduction achieved by compression can be increased further, and the data amount of the drawing information can be reduced.

According to the image processing program of claim 5, when an image is described by drawing information, the data reduction achieved by compression can be increased and the data amount of the drawing information can be reduced. The data reduction achieved by compression can be increased still more, and the data amount of the drawing information can be reduced.
According to the image processing program of claim 6, when an image is described by drawing information, the data reduction achieved by compression can be increased and the data amount of the drawing information can be reduced. The data reduction achieved by compression can be increased further, and the data amount of the drawing information can be reduced.

First, an outline of the present embodiment is given.
A pixel block in this embodiment includes at least a region of pixels that are contiguous under 4-connectivity or 8-connectivity, and also includes a set of such pixel regions. A set of such pixel regions means a plurality of pixel regions, each contiguous under 4-connectivity or the like, that lie near one another. Regions that lie near one another include, for example, pixel regions that are close to each other in distance, image regions obtained by projecting a line of text vertically or horizontally and cutting it at blank positions so that characters are cut out one by one, and image regions cut out at fixed intervals.
In many cases one pixel block is the image of one character. However, it need not be a pixel region that a human can actually recognize as a character. It may be part of a character or a pixel region that does not form a character; any cluster of pixels will do. For example, it may be a figure such as a circle or a line. Hereinafter, "character" and "character image" are used to mean "pixel block" unless otherwise noted.
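The notion of a region that is contiguous under 4-connectivity or 8-connectivity can be illustrated with a small sketch. It assumes, purely for illustration, that a binary image is given as a set of black-pixel (x, y) coordinates; the function name `connected_regions` is hypothetical and not taken from the patent.

```python
from collections import deque

def connected_regions(pixels, connectivity=4):
    """Split a set of black-pixel (x, y) coordinates into connected regions."""
    if connectivity == 4:
        steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    elif connectivity == 8:
        steps = [(dx, dy) for dx in (-1, 0, 1) for dy in (-1, 0, 1)
                 if (dx, dy) != (0, 0)]
    else:
        raise ValueError("connectivity must be 4 or 8")
    remaining = set(pixels)
    regions = []
    while remaining:
        seed = remaining.pop()
        region = {seed}
        queue = deque([seed])
        while queue:  # breadth-first flood fill from the seed pixel
            x, y = queue.popleft()
            for dx, dy in steps:
                neighbour = (x + dx, y + dy)
                if neighbour in remaining:
                    remaining.remove(neighbour)
                    region.add(neighbour)
                    queue.append(neighbour)
        regions.append(region)
    return regions

# Three pixels on a diagonal touch only at their corners.
diagonal = {(0, 0), (1, 1), (2, 2)}
print(len(connected_regions(diagonal, 4)), len(connected_regions(diagonal, 8)))
```

Under 4-connectivity the diagonal pixels fall into separate regions, while under 8-connectivity they form a single region, which is why the choice of connectivity affects what counts as one pixel block.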

This embodiment targets images that contain a plurality of characters. In an ordinary document image, the same character may appear many times. This embodiment reduces the data size by giving the same drawing commands to occurrences of the same character.
However, simply writing out the same drawing-command description repeatedly does not reduce the data size.
In this embodiment, the drawing commands are written out and that description is then reversibly compressed (for example, by LZ compression, including variants derived from the LZ compression algorithm). When an LZ compression algorithm finds a byte sequence that has already appeared, it does not write that byte sequence out again but converts it into a shorter code, so the file size can be reduced. In the following, LZ compression is used as the example of reversible compression.
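The effect described here can be tried with a small sketch. It assumes a hypothetical PostScript-style text description in which each occurrence of a character is an absolute `moveto` followed by an identical body of relative `rlineto` segments, and it uses Python's `zlib` (DEFLATE, an LZ77-based scheme) as a stand-in for the LZ compression. The outline values and the names `GLYPH_BODY` and `describe` are invented for illustration.

```python
import zlib

# Hypothetical outline of one glyph, written only with relative operators,
# so the text is byte-for-byte identical wherever the glyph is placed.
GLYPH_BODY = ("0 12 rlineto 7 3 rlineto 5 -9 rlineto "
              "-4 -6 rlineto -8 0 rlineto closepath fill\n")

def describe(positions, body=GLYPH_BODY):
    """Emit one absolute 'moveto' per occurrence, then the shared relative body."""
    return "".join(f"{x} {y} moveto\n{body}" for x, y in positions)

positions = [(40 * i, 700) for i in range(50)]  # 50 occurrences of the same glyph
text = describe(positions).encode("ascii")
packed = zlib.compress(text, 9)

print(len(text), len(packed))          # uncompressed size vs. compressed size
assert zlib.decompress(packed) == text  # the compression is reversible
```

Because only the short `moveto` line differs between occurrences, the compressor can replace almost every repeated body with a back-reference; writing the body with absolute coordinates instead would make each occurrence textually different and remove that benefit.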

To create this situation, identical characters must be found.
In this embodiment, characters are first cut out of the image and the degree of difference between characters is measured. For example, if the degree of difference is at or below a predetermined threshold, the characters are judged to be similar. Similar characters are given the same drawing commands. Similar characters naturally include identical characters.
There are various ways to create the drawing commands. For example, one character may be selected from the characters judged to be similar, and the drawing commands created using the image of that character as the representative image. Alternatively, an average image of the characters judged to be the same may be computed, and the drawing commands created using that average image as the representative image. Weighting may also be applied when computing the average image. These methods are described in detail later.
As described above, giving similar characters the same drawing commands and then applying LZ compression makes it possible to reduce the file size.

Furthermore, in LZ compression the size of the buffer that holds already-compressed byte sequences may be limited.
To accommodate encoders with such limited resources, the same characters may be compressed together.
For example, the character order is changed so that similar characters are written consecutively, and the result is then compressed. Changing the character order in this way makes byte sequences in the dictionary or buffer more likely to be matched, so the compression ratio can be raised (the file size reduced) even when resources are limited.
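The benefit of reordering under a limited buffer can be sketched as follows. This is an illustration under stated assumptions, not the patent's implementation: the glyph outlines are pseudo-random placeholders, the text format is a hypothetical PostScript-style one, and `zlib` with a 1 KiB window (`wbits=10`) stands in for a resource-limited LZ encoder.

```python
import random
import zlib

def glyph_body(seed, segments=30):
    """Hypothetical relative outline: pseudo-random 'dx dy rlineto' segments."""
    rng = random.Random(seed)
    lines = [f"{rng.randint(-99, 99)} {rng.randint(-99, 99)} rlineto\n"
             for _ in range(segments)]
    return "".join(lines) + "fill\n"

def page_text(occurrences, bodies):
    """Write each (glyph, x, y) as an absolute moveto plus the glyph's body."""
    return "".join(f"{x} {y} moveto\n{bodies[g]}"
                   for g, x, y in occurrences).encode("ascii")

def deflate_small_window(data, wbits=10):
    """DEFLATE with a 1 KiB history window (assumed resource limit)."""
    encoder = zlib.compressobj(9, zlib.DEFLATED, wbits)
    return encoder.compress(data) + encoder.flush()

bodies = {g: glyph_body(g) for g in range(8)}
# Reading order 0..7, 0..7, ...: repeats of a glyph are far apart in the text.
reading_order = [(g, 12 * n, 700)
                 for n, g in enumerate(list(range(8)) * 4)]
# Grouped order: occurrences of the same glyph are adjacent; each occurrence
# keeps its own absolute position, so the rendered page is unchanged.
grouped_order = sorted(reading_order, key=lambda occ: occ[0])

in_reading = deflate_small_window(page_text(reading_order, bodies))
in_grouped = deflate_small_window(page_text(grouped_order, bodies))
print(len(in_reading), len(in_grouped))
```

In reading order the previous copy of a glyph has already left the small window by the time the glyph recurs, so the encoder cannot reference it. Grouping keeps the previous copy inside the window, which is the situation the paragraph above describes.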

Hereinafter, examples of various preferred embodiments for realizing the present invention are described with reference to the drawings.
FIG. 1 is a conceptual module configuration diagram of a configuration example of the first embodiment.
A module generally refers to a logically separable component such as software (a computer program) or hardware. Accordingly, a module in this embodiment refers not only to a module in a computer program but also to a module in a hardware configuration, and this embodiment therefore also serves as a description of a computer program, a system, and a method. For convenience of explanation, the terms "store", "cause to store", and their equivalents are used; when the embodiment is a computer program, these terms mean storing in a storage device or performing control so as to store in a storage device. Modules correspond almost one-to-one to functions, but in implementation one module may consist of one program, a plurality of modules may consist of one program, or conversely one module may consist of a plurality of programs. A plurality of modules may be executed by one computer, or one module may be executed by a plurality of computers in a distributed or parallel environment. One module may contain another module. In the following, "connection" includes logical connections (exchange of data, instructions, reference relationships between data, and so on) as well as physical connections.
A system or apparatus may be configured by connecting a plurality of computers, pieces of hardware, apparatuses, and the like through communication means such as a network (including one-to-one communication connections), or may be realized by a single computer, piece of hardware, apparatus, or the like. "Apparatus" and "system" are used as synonyms.

As shown in FIG. 1, this embodiment has a character segmentation processing module 110, a similar character search processing module 120, a representative character determination processing module 130, a representative character drawing command generation processing module 140, an individual character drawing command generation processing module 150, and a reversible compression processing module 160.

This embodiment accepts a binary image. When the accepted image is not a binary image, that is, when a color image or a gray image is accepted, a binarization processing module may be added as preprocessing for the character segmentation processing module 110. This module performs character pixel extraction (processing that extracts character images as distinct from photographic images and the like) and generates a binary image in which character portions are 1 and non-character portions are 0. The operation of this embodiment can then be used as is. Various methods can be used for character pixel extraction, for example those disclosed in Japanese Patent Laid-Open Nos. 2005-39771, 2005-63445, 2005-175641, 2005-184402, and 2005-184404.
Accepting an image includes, for example, accepting an image input by a scanner or the like, receiving an image by fax, and reading an image stored on a hard disk (including one built into the computer as well as one connected via a network) or the like.

The character segmentation processing module 110 is connected to the similar character search processing module 120. It cuts out the individual characters in the accepted binary image and passes the cut-out characters to the similar character search processing module 120.
For the character segmentation performed by the character segmentation processing module 110, various methods can be used, for example those described in Japanese Patent Laid-Open Nos. 5-28301, 5-120481, 5-143776, 5-174185, 6-44406, 6-187489, 6-348911, 7-13994, 7-160810, 8-161432, 8-297718, 10-69524, 10-134145, 10-261047, 2000-57261, 2001-43314, and 2004-78531.

The character segmentation processing module 110 accepts, for example, a target image 300 as shown in FIG. 3. On accepting such a target image 300, the character segmentation processing module 110 cuts out the characters. FIG. 4 shows the result of cutting out each character. As the cutout result, a rectangle enclosing each character is displayed. This rectangle may be the circumscribed rectangle.

The similar character search processing module 120 is connected to the character segmentation processing module 110 and the representative character determination processing module 130. It receives the cut-out characters from the character segmentation processing module 110, searches for characters that are similar to one another, and passes those characters to the representative character determination processing module 130.
In the example shown in FIG. 4 there are 18 characters, and these 18 characters are received from the character segmentation processing module 110. For each character, similar characters are searched for among the other 17.
For example, 8 of the other 17 characters are similar to the first character, "あ". In total there are nine similar characters "あ". These nine "あ" characters are registered as similar characters.

There are various ways to determine whether characters are similar. An example of a method for determining the degree of difference, or the similarity, of characters follows.
The following is one method for checking the similarity of characters.
(A1) Two binary images, each a cut-out character image, are input.
(A2) The centroids of the black pixels of the two input images are aligned.
(A3) An XOR (exclusive OR) operation is performed on the two centroid-aligned input images.
(A4) The images are additionally shifted by small amounts (a few pixels up, down, left, right, and diagonally), and for each shift the number of pixels that are 1 in the XOR result (differing pixels) is counted. If the minimum of these counts over all shifts, including no shift, is at or below a predetermined threshold, the character images are judged to be similar. The predetermined threshold may be a fixed value determined in advance or a value proportional to the number of pixels in the character image.
(A5) For each cut-out character image, the similarity to the other character images is computed by the above method, and similar character images are successively gathered together (grouped).
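Steps (A1) to (A5) can be sketched as follows. The sketch assumes that each character image is given as a non-empty set of black-pixel (x, y) coordinates, so the XOR of two images is the symmetric difference of the two sets. The one-pixel shift range, the rounding of the centroid, the greedy grouping against the first member of each group, and all function names are illustrative assumptions rather than details from the patent.

```python
def center(pixels):
    """(A2) Translate a glyph so that its rounded black-pixel centroid is at (0, 0)."""
    n = len(pixels)
    cx = round(sum(x for x, _ in pixels) / n)
    cy = round(sum(y for _, y in pixels) / n)
    return {(x - cx, y - cy) for x, y in pixels}

def difference(a, b, max_shift=1):
    """(A3)/(A4) Minimum XOR pixel count over small shifts, including no shift."""
    a0, b0 = center(a), center(b)
    counts = []
    for dx in range(-max_shift, max_shift + 1):
        for dy in range(-max_shift, max_shift + 1):
            shifted = {(x + dx, y + dy) for x, y in b0}
            counts.append(len(a0 ^ shifted))  # set XOR = differing pixels
    return min(counts)

def group_similar(glyphs, threshold):
    """(A5) Greedily group glyph indices whose difference is within the threshold."""
    groups = []
    for i, glyph in enumerate(glyphs):
        for group in groups:
            if difference(glyphs[group[0]], glyph) <= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups

bar = {(0, 0), (0, 1), (0, 2)}
moved_bar = {(5, 7), (5, 8), (5, 9)}          # same shape at another position
noisy_bar = bar | {(1, 1)}                    # one extra pixel
square = {(0, 0), (1, 0), (0, 1), (1, 1)}
print(group_similar([bar, square, moved_bar, noisy_bar], threshold=1))
```

With a threshold of one differing pixel, the three bar-like glyphs end up in one group and the square in another. A threshold proportional to the pixel count, as the text allows, could be passed in place of the fixed value.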

Whether character images are similar may also be determined in other ways. For example, each cut-out character image may be treated as a single vector, clustering performed, and character images similar to a given character image extracted. In this case, when the distance (for example, the Euclidean distance) between the vector representing the given character image and the vector representing the character image under test is at or below a predetermined value (that is, when the two vectors are close), the two character images are judged to be similar.
Furthermore, a dilated image may be generated from the image that results from a logical operation on the two character image patterns, and similarity determined from the proportion of overlap with that dilated image. In other words, the degree of difference may be determined from the amount of dilation (the dilation radius) at which the two coincide.
In addition, the methods described in Japanese Patent Laid-Open No. 07-200745 and in I. H. Witten, A. Moffat, and T. C. Bell, "Managing Gigabytes", Morgan Kaufmann Publishers, pp. 320-332, and the like may be used.

Note that the similarity is a quantity that is largest when two images are congruent and decreases as the images differ more.
Instead of the similarity, a quantity that is smallest when two images are congruent and increases as the images differ more may be used. Such a quantity is called a "distance" or a "degree of difference". When images are represented as vectors (for example, using the pixel values themselves or image feature values as the vector), each image is placed as a point in a space, and the distance is the separation between images in that space. Examples include the Euclidean distance, the Manhattan distance, the Hausdorff distance, the Mahalanobis distance, the angle θ between vectors, cos θ, and the square of cos θ.
That is, a "distance" or "degree of difference" may be used instead of the similarity when searching for similar characters.
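The distance measures named above can be illustrated with a minimal sketch. It assumes each character image has been flattened into a list of pixel values of equal length; the function names, the sample glyphs, and the threshold are hypothetical and are not taken from the embodiment.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length pixel vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def manhattan(a, b):
    """Manhattan (city-block) distance between two pixel vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))

def cosine_similarity(a, b):
    """cos(theta) between two non-zero pixel vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def is_similar(a, b, threshold):
    """Assumed rule: similar when the Euclidean distance is at most the threshold."""
    return euclidean(a, b) <= threshold

# Two hypothetical 3x3 binary glyphs, flattened row by row; they differ in one pixel.
glyph_a = [0, 1, 1, 0, 1, 0, 0, 1, 1]
glyph_b = [0, 1, 1, 0, 1, 0, 0, 1, 0]
```

With these glyphs both distances equal 1, so any threshold of 1 or more would place them in the same group.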

In the similar character search described above, the similarity of each character to every other character is measured. To speed this up, the computationally expensive similarity calculation may be performed only between characters whose character rectangles are roughly the same size.
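The size-based pre-filter can be sketched as follows. This is only an illustration: the tolerance of two pixels and the (width, height) representation of a character rectangle are assumptions, since the text says only that the sizes should be roughly the same.

```python
def roughly_same_size(size_a, size_b, tolerance=2):
    """True when both width and height differ by at most `tolerance` pixels (assumed rule)."""
    (width_a, height_a), (width_b, height_b) = size_a, size_b
    return abs(width_a - width_b) <= tolerance and abs(height_a - height_b) <= tolerance

def candidate_pairs(sizes, tolerance=2):
    """Index pairs that pass the cheap size check and deserve the costly comparison."""
    pairs = []
    for i in range(len(sizes)):
        for j in range(i + 1, len(sizes)):
            if roughly_same_size(sizes[i], sizes[j], tolerance):
                pairs.append((i, j))
    return pairs

# Hypothetical character rectangles as (width, height).
sizes = [(20, 24), (21, 24), (40, 48), (20, 23)]
pairs = candidate_pairs(sizes)
```

Here the third rectangle is about twice the size of the others, so it is never passed to the expensive similarity calculation.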

FIG. 5 is an explanatory diagram showing an example in which similar characters have been grouped.
As a result of the similar character search by the similar character search processing module 120, similar characters are grouped as in the example shown in FIG. 5. In this example there are five groups: FIG. 5(a) shows the first group (nine character images similar to the character image "あ"), FIG. 5(b) shows the second group (three character images similar to the character image "い"), FIG. 5(c) shows the third group (two character images similar to the character image "う"), FIG. 5(d) shows the fourth group (three character images similar to the character image "え"), and FIG. 5(e) shows the fifth group (a single character image "お").

The representative character determination processing module 130 is connected to the similar character search processing module 120 and the representative character drawing command generation processing module 140. It receives the similar character search result from the similar character search processing module 120, determines for each group a representative character that represents the character images in that group, and passes the representative character to the representative character drawing command generation processing module 140.
Various methods can be used to determine the representative character. For example, the character that appears first may be selected as the representative character. Alternatively, all similar characters may be shifted to the positions that minimize the result of the XOR operation described earlier, the pixel-wise average may then be taken, and the average may be binarized to create the representative character.
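The averaging and binarization step can be sketched as below. The sketch assumes the glyphs in a group have already been aligned and cropped to the same size, and it assumes a binarization threshold of 0.5; neither detail is specified in the text.

```python
def average_and_binarize(glyphs, threshold=0.5):
    """Pixel-wise mean over aligned binary glyphs, then binarization.

    `glyphs` is a list of equally sized 2D lists of 0/1 values.
    The threshold of 0.5 is an assumption made for this illustration.
    """
    count = len(glyphs)
    rows, cols = len(glyphs[0]), len(glyphs[0][0])
    representative = []
    for r in range(rows):
        row = []
        for c in range(cols):
            mean = sum(glyph[r][c] for glyph in glyphs) / count
            row.append(1 if mean >= threshold else 0)
        representative.append(row)
    return representative

# Three hypothetical, already aligned 3x3 glyphs of the same character.
glyphs = [
    [[0, 1, 0], [1, 1, 1], [0, 1, 0]],
    [[0, 1, 0], [1, 1, 0], [0, 1, 0]],
    [[0, 1, 1], [1, 1, 1], [0, 1, 0]],
]
representative = average_and_binarize(glyphs)
```

A pixel set in only one of the three glyphs is dropped, while a pixel set in two of them is kept, so isolated scanning noise tends to be voted out.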

Alternatively, the following representative character determination process may be used.
First, all similar characters are enlarged. After enlargement, the positions at which the error between all the characters (which may be a squared error or an absolute error) is minimized are extracted. At those positions, the average of each pixel of the character images is taken and then binarized to create the representative character. This enlargement process is described in more detail below.

An example of processing by the character cutout processing module 110, the similar character search processing module 120, the representative character determination processing module 130, and the representative character drawing command generation processing module 140 will be described with reference to FIG. 6. Here the representative character determination processing module 130 performs the enlargement process.
The character cutout processing module 110 takes as target images the character image 611, the character image 612, and the character image 613 in the input image 610, which contains several instances of the character "2". It cuts out each character image at the resolution of the character image 611. It may also extract the character size / character position data 650 of each character image and pass it to the representative character drawing command generation processing module 140.
The similar character search processing module 120 searches for similar character images and generates a group consisting of the multiple character images of "2".
The representative character determination processing module 130 then obtains the centroid of each of the character images 611, 612, and 613 (the intersection of centroid lines such as 611A) and shifts the phases so that the centroids coincide, generating the high-resolution character image 620. The module may also assign character code data 640 to the character image, for example the code for "2". The character code may be assigned by character recognition processing.
The representative character drawing command generation processing module 140 generates, for example, font data 630, which is outline information, from the high-resolution character image 620 received from the representative character determination processing module 130.
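The centroid-based alignment can be illustrated with a short sketch. It assumes binary glyphs stored as 2D lists with x increasing to the right and y increasing downward; the function names and sample glyphs are hypothetical.

```python
def centroid(glyph):
    """Centre of gravity (x, y) of the set pixels of a binary glyph."""
    total = sum_x = sum_y = 0
    for y, row in enumerate(glyph):
        for x, value in enumerate(row):
            total += value
            sum_x += x * value
            sum_y += y * value
    if total == 0:
        raise ValueError("glyph has no set pixels")
    return sum_x / total, sum_y / total

def phase_shift(reference, other):
    """Sub-pixel offset that moves the centroid of `other` onto that of `reference`."""
    (ref_x, ref_y), (oth_x, oth_y) = centroid(reference), centroid(other)
    return ref_x - oth_x, ref_y - oth_y

# The same hypothetical shape, sampled one pixel apart horizontally.
glyph_1 = [[0, 1, 1], [0, 1, 0], [0, 1, 0]]
glyph_2 = [[1, 1, 0], [1, 0, 0], [1, 0, 0]]
shift = phase_shift(glyph_1, glyph_2)
```

In practice the offset is generally fractional, which is what allows several low-resolution samples of one character to fill in a finer grid.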

An example of how the representative character determination processing module 130 generates a representative character image, that is, the enlargement process that produces a high-resolution character image, will be described with reference to FIG. 7.
FIG. 7(A) shows the sampling grids at the resolution of the input image 610 (the first resolution), namely the first sampling grids 701, 702, 703, and 704, together with the centroid positions of the character images (centroids 701A, 702A, 703A, and 704A).
First, as shown in FIG. 7(B), the representative character determination processing module 130 shifts the phases of the four sampling grids based on the centroids of the character images.

FIGS. 7(C) and 7(D) illustrate an example of a method for setting a sampling grid at a second resolution higher than the first resolution. The circled numbers (1, 2, 3, 4) in FIG. 7(C) exemplify values of the character images at the first resolution. The character images are plotted so that each circled number lies on a grid point of the sampling grid at the first resolution.
In FIG. 7(D), the second sampling grid 706 is the sampling grid of the high-resolution image.
Once the phases of the four sampling grids at the first resolution have been shifted, the representative character determination processing module 130 sets the sampling grid at the second resolution as shown in FIG. 7(C) and, as shown in FIG. 7(D), shifts the phase of the sampling grid at the second resolution so that the centroids of the character images coincide.

FIG. 7(E) illustrates an example of a method for calculating the values of the character image at the second resolution. The circled numbers at the centers of the second sampling grids 706A, 706B, 706C, and 706D exemplify values of the character image at the second resolution. The character image at the second resolution is drawn so that each central circled number lies on a grid point of the sampling grid at the second resolution.
The representative character determination processing module 130 then interpolates the pixel values of the character image at the second resolution from the pixel values of the individual character images, based on the phase of each character image at the first resolution. In this example, the module applies nearest-neighbor interpolation: of the four values of the character images at the first resolution (the circled numbers 1, 2, 3, and 4 in FIG. 7(E)), it selects the one closest to the sampling grid point at the second resolution and uses it as the value of the character image at the second resolution. Specifically, in the second sampling grid 706A the value closest to the center is "1", so "1" is adopted (the circled number is 1).
The processing of the representative character determination processing module 130 is not limited to this method; other interpolation methods, such as linear interpolation or cubic convolution, may be applied.
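The nearest-neighbor step can be sketched as follows. The sketch assumes the phase-shifted samples of all character images have been pooled into one list of (x, y, value) tuples in first-resolution coordinates, and that each second-resolution grid point sits at the center of its cell; both conventions are assumptions made for illustration.

```python
def nearest_neighbor_upsample(samples, width, height, scale):
    """Fill a (height*scale) x (width*scale) grid from scattered low-resolution samples.

    `samples` holds (x, y, value) tuples in first-resolution coordinates,
    already shifted so that the character centroids coincide.
    """
    output = []
    for j in range(height * scale):
        row = []
        for i in range(width * scale):
            # Position of the second-resolution grid point in first-resolution units.
            px = (i + 0.5) / scale
            py = (j + 0.5) / scale
            _, _, value = min(
                samples, key=lambda s: (s[0] - px) ** 2 + (s[1] - py) ** 2
            )
            row.append(value)
        output.append(row)
    return output

# Four hypothetical samples, one from each of four character images,
# all falling inside the same first-resolution pixel.
samples = [(0.25, 0.25, 1), (0.75, 0.25, 2), (0.25, 0.75, 3), (0.75, 0.75, 4)]
high_res = nearest_neighbor_upsample(samples, width=1, height=1, scale=2)
```

Each of the four second-resolution cells picks up the sample closest to it, mirroring the selection of "1" for grid 706A in the figure.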

The representative character drawing command generation processing module 140 is connected to the representative character determination processing module 130 and the individual character drawing command generation processing module 150. Using relative positions, it generates drawing information consisting of drawing commands for the representative character image determined by the representative character determination processing module 130, and passes that drawing information to the individual character drawing command generation processing module 150.
A drawing command in the drawing information is an instruction to draw a straight line, a curve, or the like. Generating the drawing commands means tracing the outline of the character image and approximating it. The result need not be an approximation; the character image may be traced exactly. A process that fills the interior of the outline may also be included. Various methods exist for generating drawing commands and any of them may be used, for example those described in JP-A-5-35862, JP-A-6-223176, and JP-A-8-115419. Alternatively, each pixel may simply be regarded as a square, and drawing commands may be generated that draw straight lines tracing the sides of that square. The drawing command generation process is also called outlining or vectorization.

The drawing command generation process performed by the representative character drawing command generation processing module 140 generates drawing commands that use relative positions. That is, instead of absolute positions in the original target image, it uses positions relative to a reference point (origin) within the representative character image, such as the drawing start position or the center of the image.

The individual character drawing command generation processing module 150 is connected to the representative character drawing command generation processing module 140 and the lossless compression processing module 160. It adds an absolute position in the target image to the drawing information generated by the representative character drawing command generation processing module 140 to generate the drawing information of each individual character image. In other words, it converts the drawing commands of the representative character image into drawing commands for the individual character images.

The drawing commands generated by the representative character drawing command generation processing module 140 and the individual character drawing command generation processing module 150 are described below using the examples shown in FIGS. 8 and 9. FIG. 8 contains two images of triangles with the same shape (drawing target triangles 801 and 802). Because the triangles have the same shape, they can be drawn from a single representative character.

The drawing commands used here have the following properties.
(B1) A drawing start position is specified. If no drawing start position is specified, the end point of the immediately preceding drawing operation may be used as the drawing start position.
(B2) End points, control points, and the like are specified as positions relative to the drawing start position. Control points are needed when drawing curves.
The drawing commands of a representative character are built from such commands. Only the relative positions (B2) are described in the drawing commands of a representative character; the initial drawing start position (B1) is not specified. It may be specified, but the individual character drawing command generation processing module 150 will ignore it.

When the representative character is as in the example shown in FIG. 9 (a triangle with vertices A, B, and C), the representative character drawing command generation processing module 140 generates the following drawing commands (C1) to (C3) for the representative character.
(C1) With the drawing start position A as the origin, draw a straight line from A to B.
(C2) Draw a straight line from B to C.
(C3) Draw a straight line from C to A. Alternatively, a command that returns to the origin along a straight line may be used.
Because the drawing start position A is the origin, the drawing commands of the representative character are described using relative positions only.

Next, in the drawing commands for the individual characters generated by the individual character drawing command generation processing module 150, the relative positions described when generating the drawing commands of the representative character are converted into absolute positions.
To do so, the coordinates of the drawing start position of each triangle in FIG. 8 (D and E) are used. These are absolute positions in the electronic document 800. The character positions extracted by the character cutout processing module 110 may be used as the drawing start positions.
(D1) Drawing the first triangle (drawing target triangle 801)
Set the drawing start position to D.
Copy the drawing commands of the representative character (the drawing commands (C1) to (C3)) here.
(D2) Drawing the second triangle (drawing target triangle 802)
Set the drawing start position to E.
Copy the drawing commands of the representative character (the drawing commands (C1) to (C3)) here.
As in (D1) and (D2), the two triangles (drawing target triangles 801 and 802) can be described by specifying the drawing start position as an absolute position and then writing the drawing commands of the representative character, which are specified in relative coordinates.
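Steps (D1) and (D2) can be sketched using SVG path syntax as one possible notation: "M x y" moves to an absolute position, "l dx dy" draws a line by a relative offset, and "z" closes the path back to its start. All coordinates below are hypothetical, and in SVG each "l" offset is relative to the current point rather than to the drawing start position, so the notation differs slightly from property (B2).

```python
# Representative triangle with A as origin, B = A + (10, 0), C = A + (5, 8).
# "l 10 0" draws A to B, "l -5 8" draws B to C, "z" returns from C to A.
REPRESENTATIVE = "l 10 0 l -5 8 z"

def individual_command(start, representative):
    """Prefix the copied representative commands with an absolute start position."""
    x, y = start
    return f"M {x} {y} {representative}"

# Hypothetical absolute positions of D and E in the electronic document.
start_positions = [(20, 30), (120, 30)]
commands = [individual_command(p, REPRESENTATIVE) for p in start_positions]
```

The two resulting strings differ only in their "M" prefix; the rest is a byte-for-byte copy of the representative commands.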

The example above uses absolute coordinates for the start position of the drawing commands, but absolute coordinates may be given in other ways. For example, as shown in FIG. 4, the coordinates of a vertex (for example, the upper left vertex) of the circumscribed rectangle of the character "あ" may be specified as the absolute coordinates. In that case, the drawing commands of the representative character need only express the move to the drawing start position as a relative position.
The drawing start position (the position of D or E) may be given as an absolute position, or as a position relative to the immediately preceding drawing end point. All that matters is that the drawing start position can be identified in the electronic document.
A character code may also be included in the drawing commands.

The lossless compression processing module 160 is connected to the individual character drawing command generation processing module 150. It receives the drawing commands from that module and compresses them losslessly. That is, it losslessly compresses the text string in which the drawing commands are written to generate the final electronic document. LZ compression is well suited to this; one example is described in JP-A-8-237138. Many other variants of LZ compression exist, and any of them may be used.
In LZ compression, when a byte string that appeared earlier appears again, the second and later occurrences are not encoded as they are; instead, a reference to the earlier identical byte string is encoded. Because a reference requires less code than encoding the byte string directly, the byte string is compressed.
In this embodiment, the drawing commands of a representative character are copied and written multiple times. Because these copies are identical commands (identical byte strings), they can be encoded as references to the earlier byte string as described above, and a high compression ratio can be obtained.
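The effect of the repeated byte strings can be observed with Python's standard zlib module, whose DEFLATE format combines LZ77-style back-references with Huffman coding. This is a stand-in chosen for illustration, not the compressor of the embodiment, and the command strings and positions are hypothetical.

```python
import zlib

# Hypothetical relative drawing commands shared by every individual character.
REPRESENTATIVE = "l 10 0 l -5 8 z"

def build_document(start_positions):
    """One line per individual character: absolute start, then the copied commands."""
    lines = [f"M {x} {y} {REPRESENTATIVE}" for x, y in start_positions]
    return "\n".join(lines).encode("ascii")

start_positions = [(20 * i, 30) for i in range(50)]
document = build_document(start_positions)
compressed = zlib.compress(document)
restored = zlib.decompress(compressed)
```

Because every line ends with the same byte string, the compressor can replace most of each line with a reference to an earlier one, and decompression recovers the drawing commands exactly.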

FIG. 2 is a flowchart showing an example of processing according to the first embodiment.
In step S202, the character cutout processing module 110 receives the target image.
In step S204, the character cutout processing module 110 cuts out the characters in the target image received in step S202. It may also extract the character size, character position, and so on of each character.
In step S206, the similar character search processing module 120 searches the characters cut out in step S204 for similar characters and groups them.
In step S208, the representative character determination processing module 130 determines a representative character from the similar characters grouped in step S206. A character code may also be assigned to the representative character.
In step S210, the representative character drawing command generation processing module 140 generates drawing commands using relative positions for the representative character determined in step S208.
In step S212, the individual character drawing command generation processing module 150 uses the drawing commands generated in step S210 to generate drawing commands for the individual characters using absolute positions.
In step S214, the lossless compression processing module 160 losslessly compresses the drawing commands generated in step S212.
In step S216, the lossless compression processing module 160 generates an electronic document using the drawing commands and other data compressed in step S214. This electronic document reproduces the image received in step S202.

Next, a second embodiment will be described. Its configuration is the same as that of the first embodiment, except that the individual character drawing command generation processing module 150 performs the processing described below, which concerns how the drawing commands of the individual characters are written.
Normally the characters are arranged as in the example of FIG. 4, so the drawing commands would be written in that order (that is, "あ", "い", "う", "あ", "え", "お").
However, writing the drawing commands in this order may lower the compression ratio of LZ compression. LZ compression must store past byte strings, and the size of the buffer that stores them is limited, so when the buffer overflows the oldest byte strings are discarded. The buffer may be one that simply stores byte strings in order, or it may be a memory area in which byte strings are registered as a dictionary.
A smaller buffer also has the advantage of faster compression and decompression.
Once an old byte string has been discarded as described above, it can no longer be referenced, so the compression ratio cannot be raised.
The second embodiment therefore makes it possible to raise the compression ratio even when the buffer is small.

The first embodiment did not specify the order in which the individual character drawing command generation processing module 150 writes the characters; the second embodiment defines that order.
Suppose the similar character search produced the result shown in the example of FIG. 5. In the second embodiment, similar characters are written consecutively. For example, based on the search result from the similar character search processing module 120, the characters are arranged so that those corresponding to the same representative character image follow one another, and the drawing commands of the individual characters are written in that order.
As a result, identical byte strings, namely the drawing commands of a representative character, lie close to one another, so the hit rate for byte strings improves even when the LZ compression buffer is small.
In the example above similar characters are written consecutively, but they need not be strictly consecutive. Simply reordering the characters so that similar characters are closer together than in their order of appearance in the image improves the compression ratio of LZ compression.
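The reordering can be sketched as a stable sort by group. The sketch assumes each character is a (group identifier, start position) pair listed in order of appearance; the identifiers and positions are hypothetical.

```python
def reorder_by_group(characters):
    """Place members of the same group next to each other.

    Groups are ordered by first appearance, and the stable sort keeps the
    order of appearance within each group.
    """
    first_seen = {}
    for index, (group, _) in enumerate(characters):
        first_seen.setdefault(group, index)
    return sorted(characters, key=lambda character: first_seen[character[0]])

# Order of appearance following the FIG. 4 example, with hypothetical positions.
characters = [
    ("a", (0, 0)), ("i", (10, 0)), ("u", (20, 0)),
    ("a", (30, 0)), ("e", (40, 0)), ("o", (50, 0)),
]
reordered = reorder_by_group(characters)
```

Since every drawing command carries its own absolute start position, changing the writing order does not change the rendered page.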

Next, a third embodiment will be described. Its configuration is the same as that of the first embodiment, except that the individual character drawing command generation processing module 150 performs the processing described below, which concerns how the drawing commands of the individual characters are written. The individual character drawing command generation processing module 150 generates the drawing commands of the individual characters based on the compression result of the lossless compression processing module 160.
In the first and second embodiments, LZ compression is performed after the text string of drawing commands has been formed, but the processing need not be performed in this order.
LZ compression requires a search that looks for an identical byte string among the past byte strings.
In the third embodiment, however, it is known in advance that the drawing commands of a representative character form identical byte strings, so the earlier byte string can be referenced without the search that LZ compression would otherwise perform.
That is, for the second and subsequent occurrences of a representative character's drawing commands, the individual character drawing command generation processing module 150 of the third embodiment outputs LZ compression reference information instead of writing the text string of the drawing commands. The first occurrence of a representative character's drawing commands is processed by the individual character drawing command generation processing module 150 and then by the lossless compression processing module 160. For the second and subsequent occurrences, processing returns to the individual character drawing command generation processing module 150, which writes the LZ compression reference information and passes processing to the lossless compression processing module 160.
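The idea of emitting references without searching can be sketched with a toy token stream. The token format used here, ("lit", bytes) for literal data and ("ref", distance, length) for a back-reference, is invented for this illustration and does not correspond to any particular LZ variant or to the format of the embodiment.

```python
def encode(characters, representatives):
    """Emit literals for first occurrences and back-references for repeats.

    No search is performed: the offset of each representative's first copy
    is simply remembered when it is written.
    """
    tokens = []
    position = 0          # number of bytes the decoder will have produced so far
    first_offset = {}     # group id -> offset of the first copy of its commands
    for group, (x, y) in characters:
        header = f"M {x} {y} ".encode("ascii")
        body = representatives[group].encode("ascii")
        tokens.append(("lit", header))
        position += len(header)
        if group in first_offset:
            tokens.append(("ref", position - first_offset[group], len(body)))
        else:
            first_offset[group] = position
            tokens.append(("lit", body))
        position += len(body)
    return tokens

def decode(tokens):
    """Rebuild the byte string by copying literals and resolving references."""
    output = bytearray()
    for token in tokens:
        if token[0] == "lit":
            output += token[1]
        else:
            _, distance, length = token
            start = len(output) - distance
            output += output[start:start + length]
    return bytes(output)

representatives = {"tri": "l 10 0 l -5 8 z\n"}  # hypothetical commands
characters = [("tri", (20, 30)), ("tri", (120, 30))]
tokens = encode(characters, representatives)
```

Decoding the tokens yields the same text that writing out both copies would have produced, while the encoder never scanned earlier output for a match.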

Next, a fourth embodiment will be described. Its configuration is the same as that of the first embodiment, except that the similar character search processing module 120 and the individual character drawing command generation processing module 150 perform the processing described below, which concerns how the drawing commands of the individual characters are written. For characters that become similar after an affine transformation, the individual character drawing command generation processing module 150 generates the drawing commands of the individual character by including both the drawing commands generated by the representative character drawing command generation processing module 140 and a drawing command for the affine transformation.
The first to third embodiments assume that the representative character and the individual characters are the same size.
However, they need not be the same size. The similar character search processing module 120 also treats as similar those characters that become similar after an affine transformation (such as enlargement or reduction).
For such characters too, the individual character drawing command generation processing module 150 copies the drawing commands of the representative character and adds an affine transformation command as a further drawing command to generate the drawing commands of the individual character.
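One way to picture the added affine transformation command is an SVG path element whose transform attribute carries the translation and scaling while the path data remains an unchanged copy of the representative commands. The use of SVG, the scale factors, and the positions are assumptions made for this sketch.

```python
# Hypothetical relative drawing commands of the representative character.
REPRESENTATIVE = "l 10 0 l -5 8 z"

def individual_element(start, scale, representative):
    """SVG path whose transform places and scales a copy of the representative.

    The `d` attribute is identical for every individual character, so the
    repeated byte string is preserved even when the sizes differ.
    """
    x, y = start
    return (
        f'<path transform="translate({x} {y}) scale({scale})" '
        f'd="M 0 0 {representative}"/>'
    )

same_size = individual_element((20, 30), 1, REPRESENTATIVE)
double_size = individual_element((120, 30), 2, REPRESENTATIVE)
```

Only the short transform attribute varies between the two elements, so the compressibility of the repeated commands is retained for enlarged or reduced characters.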

A hardware configuration example of the first to fourth embodiments will be described with reference to FIG. 10. The configuration shown in FIG. 10 is implemented by, for example, a personal computer (PC), and includes a data reading unit 1017 such as a scanner and a data output unit 1018 such as a printer.

The CPU (Central Processing Unit) 1001 is a control unit that executes processing according to a computer program describing the execution sequences of the modules described in the above embodiments, namely the character segmentation processing module 110, the similar character search processing module 120, the representative character determination processing module 130, the representative character drawing command generation processing module 140, the individual character drawing command generation processing module 150, and the lossless compression processing module 160.

The ROM (Read Only Memory) 1002 stores programs, calculation parameters, and the like used by the CPU 1001. The RAM (Random Access Memory) 1003 stores programs used in execution by the CPU 1001 and parameters that change as appropriate during that execution. These components are interconnected by a host bus 1004 composed of a CPU bus or the like.

The host bus 1004 is connected via a bridge 1005 to an external bus 1006 such as a PCI (Peripheral Component Interconnect/Interface) bus.

The keyboard 1008 and the pointing device 1009, such as a mouse, are input devices operated by an operator. The display 1010 is a liquid crystal display device, a CRT (Cathode Ray Tube), or the like, and displays various kinds of information as text and image information.

The HDD (Hard Disk Drive) 1011 has a built-in hard disk, drives the hard disk, and records or reproduces information and programs executed by the CPU 1001. The hard disk stores received images, drawing commands, and the like, as well as various computer programs such as other data processing programs.

The drive 1012 reads data or programs recorded on a mounted removable recording medium 1013, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, and supplies them to the RAM 1003, which is connected via the interface 1007, the external bus 1006, the bridge 1005, and the host bus 1004. The removable recording medium 1013 can also be used as a data recording area in the same way as the hard disk.

The connection port 1014 is a port for connecting an external connection device 1015 and has a connection unit such as USB or IEEE 1394. The connection port 1014 is connected to the CPU 1001 and other components via the interface 1007, the external bus 1006, the bridge 1005, the host bus 1004, and the like. The communication unit 1016 is connected to a network and executes data communication processing with external devices. The data reading unit 1017 is, for example, a scanner, and executes document reading processing. The data output unit 1018 is, for example, a printer, and executes document data output processing.

The hardware configuration shown in FIG. 10 is only one configuration example. The embodiments are not limited to the configuration shown in FIG. 10; any configuration capable of executing the modules described in the embodiments may be used. For example, some modules may be implemented in dedicated hardware (such as an Application Specific Integrated Circuit (ASIC)), some modules may reside in an external system connected via a communication line, and a plurality of the systems shown in FIG. 10 may be connected to one another via communication lines so as to operate in cooperation. The configuration may also be incorporated in a copying machine, a fax machine, a scanner, a printer, a multifunction machine (an image processing apparatus having two or more of the functions of a scanner, a printer, a copying machine, a fax machine, and the like), or a similar device.

In the above embodiments, LZ compression was given as an example of lossless compression. However, any other lossless compression algorithm whose compression ratio improves when identical information sequences are present may be used instead.
The individual character drawing command generation processing modules 150 of the first to fourth embodiments may also be combined.
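The effect of identical information sequences on the compression ratio can be checked with a small Python experiment. This is only a sketch under stated assumptions: the text-based command syntax (`translate`, `L x y`) is invented for illustration, and zlib's DEFLATE (an LZ77-based scheme with Huffman coding) stands in for the LZ compression named above, which the text does not specify further.

```python
import random
import zlib


def random_path(rng: random.Random, segments: int = 40) -> str:
    """Build a made-up outline as a sequence of 'L x y' line commands."""
    return " ".join(
        f"L {rng.randint(0, 999)} {rng.randint(0, 999)}" for _ in range(segments)
    )


def individual_command(glyph_path: str, x: int, y: int) -> str:
    """Prefix a relative-coordinate outline with its absolute page position."""
    return f"translate {x} {y}\n{glyph_path}"


rng = random.Random(0)  # fixed seed so the experiment is repeatable
positions = [(i * 12, 100) for i in range(50)]

# Case 1: every character reuses one representative outline verbatim.
shared_path = random_path(rng)
reused = "\n".join(individual_command(shared_path, x, y) for x, y in positions)

# Case 2: every character carries its own, different outline.
distinct = "\n".join(
    individual_command(random_path(rng), x, y) for x, y in positions
)

reused_bytes = reused.encode("ascii")
distinct_bytes = distinct.encode("ascii")

reused_size = len(zlib.compress(reused_bytes))
distinct_size = len(zlib.compress(distinct_bytes))

print(f"reused outline:    {len(reused_bytes)} -> {reused_size} bytes")
print(f"distinct outlines: {len(distinct_bytes)} -> {distinct_size} bytes")
```

In the reused case the compressor can replace each repeated outline with a short back-reference to the first copy, so the output should be markedly smaller than in the distinct case even though both inputs have similar uncompressed lengths.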

The program described above may be provided stored on a recording medium, or it may be provided by communication means. In that case, for example, the program described above may be regarded as an invention of a "computer-readable recording medium on which a program is recorded".
The "computer-readable recording medium on which a program is recorded" refers to a computer-readable recording medium on which a program is recorded and which is used for installing, executing, or distributing the program.
Examples of the recording medium include: digital versatile discs (DVDs), such as DVD-R, DVD-RW, and DVD-RAM, which are standards established by the DVD Forum, and DVD+R and DVD+RW, which are standards established by DVD+RW; compact discs (CDs), such as read-only memory (CD-ROM), CD recordable (CD-R), and CD rewritable (CD-RW); magneto-optical disks (MO); flexible disks (FD); magnetic tape; hard disks; read-only memory (ROM); electrically erasable and rewritable read-only memory (EEPROM); flash memory; and random access memory (RAM).
The program, or a part of it, may be recorded on the recording medium for storage or distribution. It may also be transmitted by communication over a transmission medium such as a wired network used for a local area network (LAN), a metropolitan area network (MAN), a wide area network (WAN), the Internet, an intranet, or an extranet, a wireless communication network, or a combination of these, or it may be carried on a carrier wave.
Furthermore, the program may be a part of another program, or may be recorded on a recording medium together with a separate program. It may also be divided and recorded on a plurality of recording media. It may be recorded in any form, such as compressed or encrypted, as long as it can be restored.

FIG. 1 is a conceptual module configuration diagram of a configuration example of the first to fourth embodiments.
FIG. 2 is a flowchart showing a processing example according to the first embodiment.
FIG. 3 is an explanatory diagram showing an example of a target image.
FIG. 4 is an explanatory diagram showing an example in which characters are cut out from the target image.
FIG. 5 is an explanatory diagram showing an example in which similar characters are grouped.
FIG. 6 is an explanatory diagram showing a processing example according to the first embodiment.
FIG. 7 is an explanatory diagram showing an example of generation of high-resolution character image data by the representative character determination processing module.
FIG. 8 is an explanatory diagram showing an example of drawing a representative character.
FIG. 9 is an explanatory diagram showing an example of drawing commands.
FIG. 10 is a block diagram showing a hardware configuration example of a computer that implements the first to fourth embodiments.

Explanation of symbols

110 ... Character segmentation processing module
120 ... Similar character search processing module
130 ... Representative character determination processing module
140 ... Representative character drawing command generation processing module
150 ... Individual character drawing command generation processing module
160 ... Lossless compression processing module

Claims

1. An image processing apparatus comprising:
representative pixel block determining means for determining, from among similar pixel blocks in an image, a pixel block that represents those pixel blocks;
representative pixel block drawing information generating means for generating, using relative positions, drawing information of the representative pixel block determined by the representative pixel block determining means;
individual pixel block drawing information generating means for generating drawing information of individual pixel blocks by adding absolute positions in the image to the drawing information generated by the representative pixel block drawing information generating means; and
reversible compression means for reversibly compressing the drawing information generated by the individual pixel block drawing information generating means,
wherein the individual pixel block drawing information generating means places the pixel blocks corresponding to the representative pixel block determined by the representative pixel block determining means close to one another in the drawing order.

2. An image processing apparatus comprising:
representative pixel block determining means for determining, from among similar pixel blocks in an image, a pixel block that represents those pixel blocks;
representative pixel block drawing information generating means for generating, using relative positions, drawing information of the representative pixel block determined by the representative pixel block determining means;
individual pixel block drawing information generating means for generating drawing information of individual pixel blocks by adding absolute positions in the image to the drawing information generated by the representative pixel block drawing information generating means; and
reversible compression means for reversibly compressing the drawing information generated by the individual pixel block drawing information generating means,
wherein the individual pixel block drawing information generating means causes the pixel blocks corresponding to the representative pixel block determined by the representative pixel block determining means to be drawn consecutively.

3. The image processing apparatus according to claim 1 or 2, wherein the individual pixel block drawing information generating means generates the drawing information of the individual pixel blocks based on a compression result of the reversible compression means.

4. The image processing apparatus according to any one of claims 1 to 3, wherein, for a pixel block that becomes similar under an affine transformation, the individual pixel block drawing information generating means generates the drawing information of the individual pixel block from the drawing information generated by the representative pixel block drawing information generating means and drawing information for the affine transformation.

5. An image processing program causing a computer to function as:
representative pixel block determining means for determining, from among similar pixel blocks in an image, a pixel block that represents those pixel blocks;
representative pixel block drawing information generating means for generating, using relative positions, drawing information of the representative pixel block determined by the representative pixel block determining means;
individual pixel block drawing information generating means for generating drawing information of individual pixel blocks by adding absolute positions in the image to the drawing information generated by the representative pixel block drawing information generating means; and
reversible compression means for reversibly compressing the drawing information generated by the individual pixel block drawing information generating means,
wherein the individual pixel block drawing information generating means places the pixel blocks corresponding to the representative pixel block determined by the representative pixel block determining means close to one another in the drawing order.

6. An image processing program causing a computer to function as:
representative pixel block determining means for determining, from among similar pixel blocks in an image, a pixel block that represents those pixel blocks;
representative pixel block drawing information generating means for generating, using relative positions, drawing information of the representative pixel block determined by the representative pixel block determining means;
individual pixel block drawing information generating means for generating drawing information of individual pixel blocks by adding absolute positions in the image to the drawing information generated by the representative pixel block drawing information generating means; and
reversible compression means for reversibly compressing the drawing information generated by the individual pixel block drawing information generating means,
wherein the individual pixel block drawing information generating means causes the pixel blocks corresponding to the representative pixel block determined by the representative pixel block determining means to be drawn consecutively.