JPH10178638A - Character area encoding method, decoding method, character area encoder and decoder

Info

Publication number: JPH10178638A
Application number: JP33838996A
Authority: JP
Inventors: Yutaka Watanabe (渡辺 裕); Kazuto Kamikura (上倉 一人); Hirotaka Jiyosawa (如沢 裕尚)
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 1996-12-18
Filing date: 1996-12-18
Publication date: 1998-06-30

Abstract

PROBLEM TO BE SOLVED: To greatly reduce the amount of encoded information for the character portion of an image and to dramatically improve the image compression ratio. SOLUTION: First, a character area extraction unit 102 extracts a character area 103 from an input image 101. In a character recognition unit 104, the character portion of the pixel values of the character area 103 is vectorized and recognized as a character, and a character code 105 is output as a result. The various fonts 107 corresponding to the character code 105 are retrieved from a font database 106 and input to a character matching unit 108 together with the image of the character area 103. The character matching unit 108 searches for the font type 111, character size 112, and character color 113 that best approximate the pixels of the character area 103. These data are output as encoded data together with the start position 109 of the character portion, the character interval 110, and the character code 105.

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Technical field of the invention] The present invention relates to a method and an apparatus for encoding an image containing characters.

[0002]

[Prior art] Conventionally, as a method of encoding an image containing characters, a technique has been proposed in which the region containing the character portion is extracted from the image and encoded using PCM or DPCM, while transform coding such as DCT is applied to the non-character portion. For example, as shown in FIG. 5, transform coding is used for the basic image processing and only the character portion is encoded by DPCM.

[0003] Transform coding is effective for images that do not contain high-frequency components, such as natural images. When it is applied to a region containing steep signals that are rich in high-frequency components, such as characters, the loss of the high-frequency components produces block-shaped coding noise. Avoiding this block-shaped coding noise requires a large amount of information, and as a result high compression cannot be achieved. For this reason, methods have been proposed that apply to the character portion a process such as PCM or DPCM which, unlike transform coding, does not produce block-shaped coding noise even when the signal changes steeply.
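
The effect described in paragraph [0003] can be illustrated with a small numerical sketch. The patent gives no transform size or coefficient selection rule, so the 8-point orthonormal DCT, the choice to keep only the two lowest coefficients, and the function names below are all assumptions made for illustration. The sketch only compares the reconstruction error of a steep edge with that of a smooth ramp; it does not model block boundaries.

```python
from math import cos, pi, sqrt

def dct(x):
    """Orthonormal DCT-II of a 1-D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        scale = sqrt(1 / n) if k == 0 else sqrt(2 / n)
        out.append(scale * sum(x[i] * cos(pi * (2 * i + 1) * k / (2 * n))
                               for i in range(n)))
    return out

def idct(c):
    """Inverse of dct() above (orthonormal DCT-III)."""
    n = len(c)
    out = []
    for i in range(n):
        s = sqrt(1 / n) * c[0]
        s += sum(sqrt(2 / n) * c[k] * cos(pi * (2 * i + 1) * k / (2 * n))
                 for k in range(1, n))
        out.append(s)
    return out

def truncation_mse(x, keep):
    """Mean squared error after zeroing all but the first `keep` coefficients."""
    c = dct(x)
    c = c[:keep] + [0.0] * (len(c) - keep)
    y = idct(c)
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

# Hypothetical 8-pixel rows: a character-like edge and a smooth gradient.
step = [0, 0, 0, 0, 255, 255, 255, 255]
ramp = [32 * i for i in range(8)]

step_error = truncation_mse(step, keep=2)
ramp_error = truncation_mse(ramp, keep=2)
```

Under these assumptions the edge loses far more energy than the ramp when the same coefficients are discarded, which is the motivation the paragraph gives for treating character regions separately.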

[0004]

[Problems to be solved by the invention] Because the conventional character area encoding method described above uses a pixel-by-pixel encoding method such as PCM or DPCM, the amount of encoded data grows sharply as the characters become more complex, and this limits the achievable compression ratio.

[0005] An object of the present invention is to provide a character area encoding method and apparatus, and a corresponding character area decoding method and apparatus, that greatly reduce the amount of encoded information for the character portion and dramatically improve the image compression ratio.

[0006]

[Means for solving the problems] In the first character area encoding method of the present invention, when the pixels of a character portion included in an image are encoded, the character portion is extracted as a character area, character recognition is performed, and the result is converted to a character code. Among the various character fonts corresponding to that character code, the font closest to the pixels of the character portion is searched for, and the parameters of that font and the parameters necessary for reproducing the character portion are encoded.

[0007] The first character area decoding method of the present invention corresponds to the first character area encoding method. It outputs the font corresponding to the character code, decodes both sets of parameters, and reproduces and outputs the pixels of the character area.

[0008] When the pixels of a character portion included in an image are encoded, the character portion is extracted as a region and character recognition is performed. Any recognition technique may be used; for example, the technique of Reference [1] can be used. The character recognition rate need not be 100%. A variety of character vector fonts are stored as a data library on both the encoding side and the decoding side.

[0009] The size, color, and interval of the various character vector fonts corresponding to the code of the recognized character are varied, and the match with the character portion is computed. Any matching criterion may be used, such as the mean squared error, the mean absolute error, or the correlation at the pixel-value level in RGB or YCrCb. The matching finds the font type, size, and interval with the highest degree of approximation.
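
The three matching criteria named in paragraph [0009] can be written out as follows. This is a minimal sketch on flat lists of single-channel pixel values; the function names, the sample data, and the convention of returning 0.0 for a constant input are assumptions for illustration, not details given in the patent.

```python
from math import sqrt

def mean_squared_error(a, b):
    """Mean squared error between two equal-length pixel sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def mean_absolute_error(a, b):
    """Mean absolute error between two equal-length pixel sequences."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

def correlation(a, b):
    """Pearson correlation; returns 0.0 if either input is constant (assumed convention)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)

# Hypothetical data: pixels of an extracted character area and two rendered candidates.
region = [0, 0, 255, 255, 0, 255]
candidate_a = [0, 0, 250, 255, 5, 255]
candidate_b = [255, 0, 0, 255, 255, 0]
```

With the error measures a smaller value means a better match, while with correlation a larger value does, so a search loop has to treat the two kinds of criterion with opposite signs.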

[0010] The data to be encoded are the start position of the character portion, the character code, the font type, the character size, the color, and the character interval. Variable-length coding such as Huffman coding or arithmetic coding is applied as needed, and the result is output as compressed data.
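
Paragraph [0010] mentions Huffman coding as one optional variable-length code for these parameters. The sketch below builds a Huffman code for a sequence of parameter values and round-trips it. The patent does not specify a code table or bitstream syntax, so the symbol alphabet (a hypothetical list of character intervals) and all function names are assumptions.

```python
import heapq
from collections import Counter

def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries are (weight, tie_breaker, {symbol: code_so_far});
    # the unique tie_breaker keeps the dicts from ever being compared.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    counter = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, counter, merged))
        counter += 1
    return heap[0][2]

def encode(symbols, code):
    """Concatenate the code words of all symbols."""
    return "".join(code[s] for s in symbols)

def decode(bits, code):
    """Decode a bit string produced by encode() with the same code."""
    inverse = {v: k for k, v in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out

# Hypothetical character intervals, in pixels, for eight consecutive characters.
intervals = [12, 12, 12, 12, 14, 12, 12, 10]
code = huffman_code(intervals)
bits = encode(intervals, code)
```

Frequent values receive short code words, so a string set at a nearly constant pitch costs close to one bit per character for this parameter.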

[0011] The decoding side decodes the start position of the character portion, the character code, the font type, the character size, the color, and the character interval, restores the font corresponding to the character code at the specified size and color, and reproduces the character string by arranging the characters from the start position of the character portion in the image at the decoded character interval.

[0012] In the second character area encoding method of the present invention, when the pixels of a character portion included in an image are encoded, the character portion is extracted as a character area, character recognition is performed, and the result is converted to a character code. Among the various character fonts corresponding to that character code, the font closest to the pixels of the character portion is searched for, and the parameters of that font and the parameters necessary for reproducing the character portion are encoded and output. An approximate image of the character portion is then generated from the character font corresponding to the character code and both sets of parameters and is used as the predicted image for the input image; the prediction error is obtained and encoded.

[0013] The second character area decoding method of the present invention corresponds to the second character area encoding method. It outputs the font corresponding to the character code, decodes both sets of parameters, obtains the pixel values of the approximate image of the character portion from the font and the decoded parameters, decodes the prediction error, adds the decoded prediction error to the pixel values of the approximate image, and reproduces and outputs the pixels of the character area.

[0014] The processing up to finding the font type, character size, and character interval by matching is the same as in the first method. An approximate image of the character string, determined by the resulting start position of the character portion, character code, font type, size, color, and character interval, is then generated. This approximate image is used as the predicted image for the input image, and the prediction error is obtained. The prediction error is encoded using predictive coding such as DPCM or transform coding such as DCT.

[0015] The encoding side encodes the start position of the character portion, the character code, the font type, the character size, the color, and the character interval, and also encodes and outputs the prediction error.

[0016] The decoding side decodes the start position of the character portion, the character code, the font type, the character size, the color, and the character interval, restores the font corresponding to the character code at the specified size and color, arranges the characters from the start position of the character portion in the image at the decoded character interval, and then adds the decoded prediction error to reproduce the image of the character area.
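
The second method of paragraphs [0012] to [0016] comes down to a subtraction at the encoder and an addition at the decoder. The sketch below shows that round trip on small 2-D lists. It assumes the prediction error is carried without loss and that the approximate image is identical on both sides; the function names and pixel values are hypothetical.

```python
def prediction_error(region, approx):
    """Encoder side: input character area minus the font-based approximate image."""
    return [[p - q for p, q in zip(region_row, approx_row)]
            for region_row, approx_row in zip(region, approx)]

def reconstruct(approx, error):
    """Decoder side: approximate image plus the decoded prediction error."""
    return [[q + e for q, e in zip(approx_row, error_row)]
            for approx_row, error_row in zip(approx, error)]

# Hypothetical 2x2 character area and its rendered approximation.
region = [[200, 3], [190, 210]]
approx = [[200, 0], [200, 200]]

error = prediction_error(region, approx)
restored = reconstruct(approx, error)
```

The better the font match, the closer the error values are to zero, which is what makes the residual cheaper to encode than the original pixels.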

[0017] The first character area encoding apparatus of the present invention comprises: character area extraction means for extracting a character portion from an input image as a character area; a font database storing the various fonts corresponding to character codes; character recognition means for recognizing a character from the character area, outputting a character code, and searching the font database; and character matching means for receiving the character area and the fonts output from the font database, searching for the font closest to the pixels of the character area, and encoding and outputting the parameters of that font and the parameters necessary for reproducing the character portion.

[0018] The second character area encoding apparatus of the present invention comprises: character area extraction means for extracting a character portion from an input image as a character area; a font database storing the various fonts corresponding to character codes; character recognition means for recognizing a character from the character area, outputting a character code, and searching the font database; character matching means for receiving the character area and the fonts output from the font database, searching for the font closest to the pixels of the character area, and encoding and outputting the parameters of that font and the parameters necessary for reproducing the character portion; approximate character generation means for receiving both sets of parameters output from the character matching means and the font output from the font database and generating an approximate image of the character portion; difference means for taking the difference between the character area and the approximate image and outputting it as a prediction error; and prediction error encoding means for encoding the prediction error.

[0019] The first character area decoding apparatus, which corresponds to the first character area encoding apparatus, comprises: a font database that stores various fonts and outputs the font corresponding to the character code; and character generation means for decoding both sets of parameters and reproducing and outputting the pixels of the character area from the decoded parameters and the font.

[0020] The second character area decoding apparatus, which corresponds to the second character area encoding apparatus, comprises: a font database that stores various fonts and outputs the font corresponding to the character code; approximate character generation means for decoding both sets of parameters and obtaining the pixel values of the approximate image of the character portion from the decoded parameters and the font; prediction error decoding means for decoding the prediction error; and addition means for adding the decoded prediction error to the pixel values of the approximate image and reproducing and outputting the pixels of the character area.

[0021]

[Embodiments of the invention] Embodiments of the present invention will now be described with reference to the drawings.

[0022] FIG. 1 is a block diagram of a character area encoding apparatus according to a first embodiment of the present invention.

[0023] The character area encoding apparatus of this embodiment comprises a character area extraction unit 102, a character recognition unit 104, a font database 106, and a character matching unit 108.

[0024] First, the character area extraction unit 102 extracts a character area 103 from the input image 101. In the character recognition unit 104, the character portion of the pixel values of the character area 103 is vectorized and recognized as a character, and a character code 105 is output as a result. The various fonts 107 corresponding to the character code 105 are retrieved from the font database 106 and input to the character matching unit 108 together with the image of the character area 103.

[0025] The character matching unit 108 searches the various fonts 107 for the font type 111, character size 112, and character color 113 that best approximate the pixels of the character area 103. The start position 109 of the character portion and the character interval 110 are output as encoded data together with these data.
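
The search performed by the character matching unit 108 can be sketched as an exhaustive loop over candidate font types, sizes, and colors. The patent does not describe the search strategy or the font storage format, so everything concrete here is an assumption: the toy font database keyed by (font type, size) and holding ready-made 0/1 bitmaps, the fixed background value, the mean squared error criterion, and the shortcut of skipping candidates whose bitmap size differs from the region.

```python
from itertools import product

def mse(a, b):
    """Mean squared error between two equal-sized 2-D pixel arrays."""
    flat_a = [p for row in a for p in row]
    flat_b = [p for row in b for p in row]
    return sum((x - y) ** 2 for x, y in zip(flat_a, flat_b)) / len(flat_a)

def render(bitmap, color, background=0):
    """Turn a 0/1 glyph bitmap into pixel values (background value is assumed)."""
    return [[color if bit else background for bit in row] for row in bitmap]

def best_match(region, font_db, colors):
    """Return (font_type, size, color, error) with the smallest error, or None."""
    best = None
    for (font_type, size), color in product(font_db, colors):
        bitmap = font_db[(font_type, size)]
        if len(bitmap) != len(region) or len(bitmap[0]) != len(region[0]):
            continue  # toy shortcut: only same-sized candidates are compared
        err = mse(region, render(bitmap, color))
        if best is None or err < best[3]:
            best = (font_type, size, color, err)
    return best

# Hypothetical glyphs for one character code in two typefaces.
font_db = {
    ("mincho", 2): [[1, 0], [1, 1]],
    ("gothic", 2): [[1, 1], [1, 1]],
    ("gothic", 3): [[1, 1, 1], [1, 0, 1], [1, 1, 1]],
}
region = [[200, 0], [190, 210]]
match = best_match(region, font_db, colors=[100, 200, 255])
```

A real matcher would rasterize vector fonts at each trial size rather than look up fixed bitmaps, and could replace the exhaustive loop with a coarse-to-fine search.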

[0026] FIG. 2 is a block diagram of a character area decoding apparatus corresponding to the character area encoding apparatus of FIG. 1.

[0027] The character area decoding apparatus of this embodiment comprises a character generation unit 201 and a font database 202.

[0028] The various fonts 203 corresponding to the character code 105 sent from the character area encoding apparatus are retrieved from the font database 202 and input to the character generation unit 201 together with the encoded data also sent from the character area encoding apparatus (the start position 109 of the character portion, the character interval 110, the font type 111, the character size 112, and the character color 113).

[0029] The character generation unit 201 calculates the pixel values of the characters by the following procedure. From the various fonts 203 retrieved for the character code 105, the one corresponding to the received font type 111 is selected. For example, if the received character code 105 represents "あ" and the font database 202 holds Mincho and Gothic typefaces, the Mincho and Gothic fonts for "あ" are sent to the character generation unit 201.

[0030] Suppose the received font type 111 is Mincho and the received character size 112 is 20 points. The character generation unit 201 then determines that the shape of the character, including its stroke width, is that of a 20-point Mincho "あ". The size and position of the character are determined by the received start position 109 of the character portion and the character interval 110.

[0031] The start position of the characters may consist of multiple data items. This example shows the case where the character size can be determined from the start position 109 of the character portion and the character interval 110. If the characters are not evenly spaced, the character magnification, the character width, or the character height must be received as encoded information.

[0032] The pixel values of a character are determined by the character color 113, which includes density. The information of the character color 113 may be received decomposed into chrominance-only information and luminance-only information.

[0033] The shape, location, size, and color (luminance) of the characters are thus calculated, so when a sample point of the image falls on a character, its value can be computed.

[0034] In this way, the font with the specified size and color is restored from the various fonts 203 corresponding to the character code 105, the characters are arranged from the start position of the character portion in the image at the decoded character interval to reproduce the character string, and the result is output as the pixels 204 of the character area.
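
The layout step of paragraphs [0029] to [0034] can be sketched as painting glyph bitmaps onto a canvas. Several details are assumptions rather than statements from the patent: glyphs are given as already scaled 0/1 bitmaps, the start position is the top-left corner of the first character, the character interval is the horizontal pitch between the left edges of consecutive characters, and pixels outside the canvas are silently dropped.

```python
def render_string(canvas, glyphs, start, interval, color):
    """Paint glyph bitmaps onto canvas from left to right and return the canvas.

    canvas   : list of rows of pixel values (modified in place)
    glyphs   : list of 2-D 0/1 bitmaps, one per character, already scaled
    start    : (x, y) of the top-left corner of the first character (assumed meaning)
    interval : horizontal pitch between consecutive characters (assumed meaning)
    color    : pixel value written where a glyph bitmap is 1
    """
    x0, y0 = start
    for index, glyph in enumerate(glyphs):
        glyph_x = x0 + index * interval
        for dy, row in enumerate(glyph):
            for dx, bit in enumerate(row):
                y, x = y0 + dy, glyph_x + dx
                if bit and 0 <= y < len(canvas) and 0 <= x < len(canvas[0]):
                    canvas[y][x] = color
    return canvas

# Hypothetical 2x2 glyph repeated twice on a 3x8 canvas.
glyph = [[1, 0], [1, 1]]
canvas = [[0] * 8 for _ in range(3)]
render_string(canvas, [glyph, glyph], start=(1, 0), interval=3, color=9)
```

Unevenly spaced text would need a per-character position or width instead of a single interval, as paragraph [0031] notes.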

[0035] FIG. 3 is a block diagram of a character area encoding apparatus according to a second embodiment of the present invention.

[0036] This embodiment is configured by adding an approximate character generation unit 114, a difference unit 116, and a prediction error encoding unit 118 to the character area encoding apparatus of FIG. 1.

[0037] In this embodiment, the approximate character generation unit 114 calculates the pixel values 115 of the approximate character area from the fonts 107, the start position 109 of the character portion, the character interval 110, the font type 111, the character size 112, and the character color 113, using the same procedure as the character generation unit 201. These pixel values 115 are used as the predicted values of the pixel values of the character area 103 in the input image. The difference unit 116 takes the difference to obtain the prediction error 117. The prediction error 117 is encoded in the prediction error encoding unit 118, for example by DPCM, and the encoded prediction error 119 is output.
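
Paragraph [0037] names DPCM as one way to encode the prediction error 117. The sketch below shows the simplest lossless form, in which each sample is predicted by the previous one and only the difference is kept. The patent does not specify the predictor, the scan order, or any quantization, so the previous-sample predictor, the initial prediction of 0, and the function names are assumptions.

```python
def dpcm_encode(samples):
    """Replace each sample by its difference from the previous sample (first is vs. 0)."""
    previous, diffs = 0, []
    for sample in samples:
        diffs.append(sample - previous)
        previous = sample
    return diffs

def dpcm_decode(diffs):
    """Invert dpcm_encode() by accumulating the differences."""
    previous, samples = 0, []
    for diff in diffs:
        previous += diff
        samples.append(previous)
    return samples

# Hypothetical prediction-error values along one scan line of the character area.
errors = [0, 3, 3, 4, -10, -10]
diffs = dpcm_encode(errors)
```

The differences would then typically be passed to an entropy coder; a lossy variant would quantize them and predict from the reconstructed samples instead of the originals.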

[0038] The encoding apparatus outputs the following as encoded data: the character code 105, the start position 109 of the character portion, the character interval 110, the font type 111, the character size 112, the character color 113, and the encoded prediction error 119.

[0039] FIG. 4 is a block diagram of a character area decoding apparatus corresponding to the character area encoding apparatus of FIG. 3.

[0040] The character area decoding apparatus of this embodiment comprises a font database 202, an approximate character generation unit 205, a prediction error decoding unit 207, and an adder 209.

[0041] The various fonts 203 corresponding to the character code 105 sent from the character area encoding apparatus are retrieved from the font database 202 and input to the approximate character generation unit 205 together with the encoded data also sent from the character area encoding apparatus (the start position 109 of the character portion, the character interval 110, the font type 111, the character size 112, and the character color 113).

[0042] The approximate character generation unit 205 decodes the start position 109 of the character portion, the character interval 110, the font type 111, the character size 112, and the character color 113, and calculates the pixel values 206 of the approximate character area in the same way as the approximate character generation unit 114. The prediction error decoding unit 207 decodes the encoded prediction error 119 sent from the character area encoding apparatus. The adder 209 adds the pixel values 206 of the approximate character area and the decoded prediction error 208, and the pixels 210 of the character area are output.

[0043]

[Effects of the invention] As described above, because the transmitting and receiving sides share knowledge about fonts, once the characters in the character portion of an image are recognized they can be converted to character codes and reproduced, which makes a large reduction in the amount of information possible.

[Brief description of the drawings]

FIG. 1 is a block diagram of a character area encoding apparatus according to a first embodiment of the present invention.

FIG. 2 is a block diagram of a character area decoding apparatus corresponding to the character area encoding apparatus of FIG. 1.

FIG. 3 is a block diagram of a character area encoding apparatus according to a second embodiment of the present invention.

FIG. 4 is a block diagram of a character area decoding apparatus corresponding to the character area encoding apparatus of FIG. 3.

FIG. 5 is a diagram showing an example of a conventional character area encoding method.

[Explanation of symbols]

101: input image
102: character area extraction unit
103: character area
104: character recognition unit
105: character code
106: font database
107: various fonts corresponding to character code 105
108: character matching unit
109: start position of character portion
110: character interval
111: font type
112: character size
113: character color
114: approximate character generation unit
115: pixel values of approximate character area
116: difference unit
117: prediction error
118: prediction error encoding unit
119: encoded prediction error
201: character generation unit
202: font database
203: various fonts corresponding to character code 105
204: pixels of character area
205: approximate character generation unit
206: pixel values of approximate character area
207: prediction error decoding unit
208: decoded prediction error
209: adder
210: pixels of character area

Claims

[Claims]

1. A character area encoding method wherein, when pixels of a character portion included in an image are encoded, the character portion is extracted as a character area, character recognition is performed and the result is converted to a character code, the font closest to the pixels of the character portion is searched for among the various character fonts corresponding to the character code, and the parameters of that font and the parameters necessary for reproducing the character portion are encoded.

2. A character area encoding method wherein, when pixels of a character portion included in an image are encoded, the character portion is extracted as a character area, character recognition is performed and the result is converted to a character code, the font closest to the pixels of the character portion is searched for among the various character fonts corresponding to the character code, the parameters of that font and the parameters necessary for reproducing the character portion are encoded and output, an approximate image of the character portion is generated from the character font corresponding to the character code and both sets of parameters and is used as a predicted image for the input image, a prediction error is obtained, and the prediction error is encoded.

3. The character area encoding method according to claim 1 or 2, wherein the parameters of the font are a font type, a character size, and a character color, and the parameters necessary for reproducing the character portion are a start position of the character portion and a character interval.

4. A character area encoding apparatus comprising: character area extraction means for extracting a character portion from an input image as a character area; a font database storing various fonts corresponding to character codes; character recognition means for recognizing a character from the character area, outputting a character code, and searching the font database; and character matching means for receiving the character area and the fonts output from the font database, searching for the font closest to the pixels of the character area, and encoding and outputting the parameters of that font and the parameters necessary for reproducing the character portion.

5. A character area encoding apparatus comprising: character area extraction means for extracting a character portion from an input image as a character area; a font database storing various fonts corresponding to character codes; character recognition means for recognizing a character from the character area, outputting a character code, and searching the font database; character matching means for receiving the character area and the fonts output from the font database, searching for the font closest to the pixels of the character area, and encoding and outputting the parameters of that font and the parameters necessary for reproducing the character portion; approximate character generation means for receiving both sets of parameters output from the character matching means and the font output from the font database and generating an approximate image of the character portion; difference means for taking the difference between the character area and the approximate image and outputting it as a prediction error; and prediction error encoding means for encoding the prediction error.

6. The character area encoding apparatus according to claim 4 or 5, wherein the parameters of the font are a font type, a character size, and a character color, and the parameters necessary for reproducing the character portion are a start position of the character portion and a character interval.

7. A character area decoding method corresponding to the character area encoding method according to claim 1, wherein the font corresponding to the character code is output, both sets of parameters are decoded, and the pixels of the character area are reproduced and output.

8. A character area decoding method corresponding to the character area encoding method according to claim 2, wherein the font corresponding to the character code is output, both sets of parameters are decoded, the pixel values of an approximate image of the character portion are obtained from the font and the decoded parameters, the prediction error is decoded, and the decoded prediction error is added to the pixel values of the approximate image to reproduce and output the pixels of the character area.

9. A character area decoding apparatus corresponding to the character area encoding apparatus according to claim 4, comprising: a font database that stores various fonts and outputs the font corresponding to the character code; and character generation means for decoding both sets of parameters and reproducing and outputting the pixels of the character area from the decoded parameters and the font.

10. A character area decoding apparatus corresponding to the character area encoding apparatus according to claim 5, comprising: a font database that stores various fonts and outputs the font corresponding to the character code; approximate character generation means for decoding both sets of parameters and obtaining the pixel values of an approximate image of the character portion from the decoded parameters and the font; prediction error decoding means for decoding the prediction error; and addition means for adding the decoded prediction error to the pixel values of the approximate image and reproducing and outputting the pixels of the character area.