JP3193140B2 - Image and code data compression

Info

Publication number: JP3193140B2
Application number: JP21234692A
Authority: JP
Inventors: 雅岳大森
Original assignee: Ricoh Co Ltd
Current assignee: Ricoh Co Ltd
Priority date: 1992-07-17
Filing date: 1992-07-17
Publication date: 2001-07-30
Anticipated expiration: 2016-07-30
Also published as: JPH0638048A

Description

DETAILED DESCRIPTION OF THE INVENTION

[0001]

[Field of Industrial Application] The present invention relates to an image and code data compression method that selectively compresses binary image data and code data such as character codes.

[0002]

[Prior Art] Data compression is commonly performed when data is transmitted or stored. Known coding schemes for data compression include arithmetic coding, which is one form of predictive coding, and the Ziv-Lempel universal code.

[0003] Arithmetic coding is suited to data from a Markov information source and can adapt to the characteristics of the data, so it is widely used to compress binary image data. The universal code, in contrast, compresses well when long runs of an identical symbol pattern recur in the data, so it is widely used to compress code data such as character codes.

[0004] There are cases in which one wants to select either image data or code data at will and compress it.

[0005] Conventionally this was generally handled by using two coding schemes, such as the arithmetic code and the universal code mentioned above, each for its own type of data. The apparatus therefore had to contain two kinds of coding means, which made it complex.

[0006] On the other hand, a scheme has been proposed, for example in Japanese Patent Application Laid-Open No. H3-70268, that compresses image data and character code data with a single coding means.

[0007] In that proposal, character code data is compressed byte by byte with the universal code. Image data is first coded by the MH (Modified Huffman), MR (Modified Relative Element Address Designate), or MMR (Modified MR) scheme. Fixed-length codes are then assigned to the resulting run-length codes and mode information, and those fixed-length codes are compressed with the universal code. This processing absorbs the periodicity of the character codes and the fixed-length codes, giving a relatively high compression effect.

[0008] However, that proposal requires a separate coding means for MH, MR, or MMR, so the apparatus is complex, just as before.

[0009] Moreover, the universal code compresses well when long sequences of identical symbols recur in the data being processed, but its compression effect falls when the identical sequences are short or recur only rarely.

[0010]

[Problem to Be Solved by the Invention] Thus, conventionally, obtaining a high compression effect for both image data and code data required multiple coding means, which made the apparatus complex.

[0011] An object of the present invention is to solve the above problem and provide an image and code data compression method that compresses both image data and code data with a simple apparatus configuration while achieving a high compression effect.

[0012]

[Means for Solving the Problem] To this end, the present invention provides a single known arithmetic coding means that arranges the input data in parallel rows of a fixed number of bits, predicts the symbol occurrence probability of the bit of interest from the states of the bits surrounding it, and compresses the data by coding the correspondence between the predicted value and the actual symbol. When binary image data is compressed, the fixed number is set to the number of pixels in one image line and the data is compressed by that arithmetic coding means. When code data is compressed, the fixed number is set to the number of bits in one unit of the code data and the data is compressed by the same arithmetic coding means.
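As an illustration of the arrangement step only, the following Python sketch splits a serial bit stream into rows whose width is either the pixel count of one image line or the bit count of one code unit. The function names, the MSB-first bit order, and the `kind` strings are assumptions made for this sketch and do not appear in the specification.

```python
from typing import List


def bytes_to_bits(data: bytes) -> List[int]:
    """Serialize bytes into a flat bit list, MSB first (bit order is an assumption)."""
    return [(byte >> shift) & 1 for byte in data for shift in range(7, -1, -1)]


def select_line_width(kind: str, pixels_per_line: int = 0, bits_per_code: int = 8) -> int:
    """Choose the 'fixed number': pixels per line for images, bits per unit for codes."""
    if kind == "image":
        if pixels_per_line <= 0:
            raise ValueError("pixels_per_line must be positive for image data")
        return pixels_per_line
    if kind == "code":
        if bits_per_code <= 0:
            raise ValueError("bits_per_code must be positive for code data")
        return bits_per_code
    raise ValueError(f"unknown data kind: {kind!r}")


def arrange_rows(bits: List[int], line_width: int) -> List[List[int]]:
    """Split a serial bit stream into consecutive rows of line_width bits."""
    if line_width <= 0:
        raise ValueError("line_width must be positive")
    return [bits[i:i + line_width] for i in range(0, len(bits), line_width)]


if __name__ == "__main__":
    width = select_line_width("code", bits_per_code=8)
    for row in arrange_rows(bytes_to_bits(b"AS"), width):
        print(row)
```

The same `arrange_rows` call serves both data types; only the width passed in changes, which mirrors the single-coder idea of the paragraph above.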

[0013]

[Operation] Because only one coding means is needed, the apparatus configuration is simple. Image data is compressed by the arithmetic coding means exactly as before, so a high compression effect is obtained. For code data, the data is arranged in parallel rows of one code unit each when the symbol occurrence probability is predicted, so the correlation between the bit of interest and the surrounding bits becomes strong. This improves prediction accuracy, and a high compression effect is obtained for code data as well.

[0014]

[Embodiment] An embodiment of the present invention is described in detail below with reference to the accompanying drawings.

[0015] FIG. 1 is a block diagram of an encoding apparatus according to one embodiment of the present invention. In the figure, P/S conversion means 1 converts character code data supplied as a parallel signal into serial data. Input switching means 2 selects whether image data or character code data is input. Reference symbol acquisition means 3 arranges the input data in parallel rows of a fixed number of bits and determines the states of the bits surrounding the bit of interest. It contains one-line bit count switching means 3a, which switches that fixed number between two settings.

[0016] Probability table 4 is a data table holding the various probability values that represent symbol occurrence probabilities. Predicted value / probability data selection table 5 is a data table that holds, for each bit state determined above, a predicted symbol value and a pointer designating one probability value in probability table 4. Prediction judgment means 6 judges whether the predicted symbol value matches the actual value, that is, whether the prediction was correct. Put another way, the judgment result indicates whether the data symbol is the MPS (more probable symbol) or the LPS (less probable symbol). Arithmetic code generation means 7 generates the arithmetic code from that judgment result and the probability value that was read out.

[0017] Table rewriting means 8 rewrites the pointers in predicted value / probability data selection table 5 according to how often the symbol predictions succeed. Identification code insertion means 9 inserts, ahead of each block of coded image or character data, an identification code indicating which kind of data it is.

[0018] FIG. 2 is a block diagram of a decoding apparatus that restores the coded data generated by the above encoding apparatus to the original data. In the figure, identification code detection/removal means 10 detects the identification code in the incoming coded data and removes it. Arithmetic code decoding means 11 determines, from the coded data and the probability value, whether each data symbol is the MPS or the LPS.

[0019] Reference symbol acquisition means 3, probability table 4, predicted value / probability data selection table 5, prediction judgment means 6, and arithmetic code generation means 7 each have the same function as in FIG. 1. In the decoding apparatus, however, prediction judgment means 6 reproduces the data symbol from the MPS/LPS determination and the predicted symbol value.

[0020] Output switching means 12 switches the signal path depending on whether image data or character code data has been reproduced. S/P conversion means 13 converts serial data into a parallel signal.

[0021] With the above configuration, the operation of the encoding apparatus of this embodiment is described next.

[0022] Either image data or character code data is input to the encoding apparatus, in any order, together with a data type signal. The data type signal indicates whether the incoming data is image data or character code data. The image data is binary data obtained by reading a document image at a fixed resolution with main scanning and sub-scanning, and is input one line at a time. The character code data is character string information such as alphanumeric text, and is input sequentially as ASCII codes.

[0023] As shown in FIG. 3, when the data type signal is input, the encoding apparatus uses it to determine the type of the incoming data (process 101). If the input is image data ("image" at process 101), input switching means 2 selects the image data side (process 102), and identification code insertion means 9 outputs an identification code indicating an image (process 103). One-line bit count switching means 3a then sets the number of bits per line to the number of pixels of the document image in the main scanning direction (process 104).

[0024] If instead the input is character code data ("character" at process 101), input switching means 2 selects the P/S conversion means 1 side (process 105), and identification code insertion means 9 outputs an identification code indicating characters (process 106). One-line bit count switching means 3a then sets the number of bits per line to 8, the size of one ASCII code unit (process 107).

[0025] The prescribed coding process then starts. When image data is input, it is passed unchanged to reference symbol acquisition means 3. When character code data is input, it is converted to a serial signal by P/S conversion means 1 and then passed to reference symbol acquisition means 3.

[0026] As shown in FIG. 4, reference symbol acquisition means 3 arranges the incoming data in successive parallel rows, each holding the number of bits per line set above. Taking each incoming bit in turn as the bit of interest X, it extracts, according to a preset template T, the 10 bits of data at fixed positions relative to X. The extracted bits are the 2 bits D0 and D1 on the same line, the 5 bits D2 to D6 on the previous line, and the 3 bits D7 to D9 on the line before that.

[0027] Because the extracted data D0 to D9 comprise 10 bits, the symbol pattern can take 1024 distinct states, as shown in FIG. 5(a). Reference symbol acquisition means 3 classifies the symbol pattern of the extracted data D0 to D9 as one of states 0 to 1023.
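A minimal sketch of turning the 10 template bits into a state number is given below. FIG. 4 is not reproduced in this text, so the exact offsets in `TEMPLATE` are an assumption (2 bits to the left on the current row, 5 bits centered on the previous row, 3 bits centered two rows up). Treating positions outside the data as 0 and packing D0 into the least significant bit are likewise assumptions of this sketch.

```python
from typing import List, Sequence, Tuple

# (row offset, column offset) for D0..D9 relative to the bit of interest X.
# Hypothetical layout: D0-D1 same row, D2-D6 previous row, D7-D9 two rows up.
TEMPLATE: Tuple[Tuple[int, int], ...] = (
    (0, -1), (0, -2),
    (-1, -2), (-1, -1), (-1, 0), (-1, 1), (-1, 2),
    (-2, -1), (-2, 0), (-2, 1),
)


def context_state(rows: Sequence[Sequence[int]], r: int, c: int) -> int:
    """Return the state number (0..1023) for the bit at row r, column c."""
    state = 0
    for k, (dr, dc) in enumerate(TEMPLATE):
        rr, cc = r + dr, c + dc
        inside = 0 <= rr < len(rows) and 0 <= cc < len(rows[rr])
        bit = rows[rr][cc] if inside else 0  # assumption: outside bits read as 0
        state |= (bit & 1) << k              # assumption: D0 is the least significant bit
    return state


if __name__ == "__main__":
    rows: List[List[int]] = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
    print(context_state(rows, 0, 0))  # no neighbours inside the data
    print(context_state(rows, 2, 1))  # most neighbours inside the data
```

Because the function only needs the row list, it works unchanged whether the rows are image lines or 8-bit code units.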

[0028] As shown in FIG. 5(b), predicted value / probability data selection table 5 holds an MPS predicted value and a pointer value for each of the symbol pattern states 0 to 1023. It outputs the predicted value and the pointer value for the state determined above. As shown in FIG. 5(c), probability table 4 holds various probability values, each assigned a serial number. It outputs the probability value whose number is given by the pointer value.
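The two-table lookup can be pictured with the following sketch. The probability values, the initial table contents, and all names are hypothetical; the specification does not list the actual table entries, and here each probability is taken to be the probability that the prediction is correct.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical probability table: index = serial number, value = P(prediction correct).
PROBABILITY_TABLE: List[float] = [0.5, 0.75, 0.875, 0.9375, 0.96875]


@dataclass
class ContextEntry:
    mps: int      # predicted symbol value for this state (0 or 1)
    pointer: int  # serial number of an entry in PROBABILITY_TABLE


def make_selection_table(n_states: int = 1024) -> List[ContextEntry]:
    """Build the selection table with one entry per state (initial values assumed)."""
    return [ContextEntry(mps=0, pointer=0) for _ in range(n_states)]


def lookup(table: List[ContextEntry], state: int) -> Tuple[int, float]:
    """Return (predicted value, probability value) for a state number."""
    entry = table[state]
    return entry.mps, PROBABILITY_TABLE[entry.pointer]


if __name__ == "__main__":
    table = make_selection_table()
    table[953] = ContextEntry(mps=1, pointer=2)
    print(lookup(table, 953))
```

Keeping only a small pointer per state, rather than a full probability, is what lets the adaptation step change a state's probability by rewriting one index.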

[0029] Prediction judgment means 6 compares the MPS predicted value with the symbol of the input data and judges whether the prediction was correct. Put another way, the result indicates whether the input symbol is the MPS or the LPS. Arithmetic code generation means 7 generates the arithmetic code from this result and the probability value by a known computation and outputs it externally (the above constitutes process 108).
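The judgment step and its inverse on the decoding side can be sketched as follows. The arithmetic coder itself is described only as a known computation, so it is not implemented here; `ideal_cost_bits` merely reports the information-theoretic cost of each decision and is an illustrative addition, as are all function names.

```python
import math


def judge(predicted: int, actual: int) -> str:
    """Encoder side: classify the actual symbol as MPS (hit) or LPS (miss)."""
    return "MPS" if predicted == actual else "LPS"


def restore(predicted: int, decision: str) -> int:
    """Decoder side: rebuild the symbol from the prediction and the MPS/LPS decision."""
    if decision not in ("MPS", "LPS"):
        raise ValueError(f"unknown decision: {decision!r}")
    return predicted if decision == "MPS" else 1 - predicted


def ideal_cost_bits(decision: str, p_mps: float) -> float:
    """Ideal code length in bits for one decision, given P(MPS) strictly between 0 and 1."""
    if not 0.0 < p_mps < 1.0:
        raise ValueError("p_mps must lie strictly between 0 and 1")
    p = p_mps if decision == "MPS" else 1.0 - p_mps
    return -math.log2(p)


if __name__ == "__main__":
    decision = judge(predicted=1, actual=0)
    print(decision, restore(1, decision), ideal_cost_bits(decision, 0.875))
```

The round trip `restore(p, judge(p, x)) == x` holds for any binary `p` and `x`, which is why the decoder can share the prediction tables with the encoder.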

[0030] In parallel with the coding operation, table rewriting means 8 counts the number of correct symbol predictions and the number of incorrect ones (process 109), and judges whether either count has reached a predetermined number (process 110). If not (N at process 110), it checks whether the data type signal indicates a change of data (process 111). If no change is indicated (N at process 111), it checks whether the data has ended (process 112); if not (N at process 112), operation continues (return to process 108).

[0031] Suppose image data is being input. In reference symbol acquisition means 3 the image data is arranged one line per row, as shown in FIG. 4, and template T extracts the data D0 to D9 around the bit of interest X. For image data it is well known that D0 to D9 correlate strongly with X. Because the coding process predicts the symbol of X from the strongly correlated data D0 to D9, prediction accuracy is high and so is the compression ratio.

[0032] Next suppose character code data is being input. In reference symbol acquisition means 3 the character code data is arranged in order with 8 bits per line, and the data D0 to D9 around the bit of interest X are extracted in the same way as above.

[0033] Suppose, for example, that the input character code data is the string "ASCII...". ASCII codes use 8 bits per character, so when arranged in reference symbol acquisition means 3 they appear as in FIG. 6. That is, "A" is "01000001", "S" is "01010011", "C" is "01000011", and "I" is "01001001", and the upper 3 to 4 bits are the same for many of the characters. The data D0 to D9 extracted by template T therefore correlate strongly with the bit of interest X.
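The column-wise similarity in this example can be checked with a few lines of Python. The helper names are made up for this sketch, and counting matches between vertically adjacent bits is just one simple way to make the correlation visible; it is not a measure used in the specification.

```python
from typing import List


def code_rows(text: str) -> List[str]:
    """Return one 8-bit binary string per ASCII character, MSB first."""
    return [format(byte, "08b") for byte in text.encode("ascii")]


def vertical_matches(rows: List[str]) -> List[int]:
    """For each bit column, count how often a bit equals the bit directly above it."""
    width = len(rows[0])
    return [
        sum(1 for upper, lower in zip(rows, rows[1:]) if upper[col] == lower[col])
        for col in range(width)
    ]


if __name__ == "__main__":
    rows = code_rows("ASCII")
    for ch, row in zip("ASCII", rows):
        print(ch, row)
    # Columns are listed from the most significant bit to the least significant bit.
    print(vertical_matches(rows))
```

For "ASCII" there are four vertically adjacent pairs per column, and the three most significant columns match in all four, which is the regularity a context-based predictor can exploit.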

[0034] Consequently, when character code data is coded, prediction accuracy is high and the compression ratio is high, just as for image data.

[0035] When the number of correct or incorrect symbol predictions reaches the predetermined number (Y at process 110), the pointer value stored in predicted value / probability data selection table 5 is rewritten. If the count of correct predictions reached the predetermined number, the pointer value is rewritten to designate a higher probability value in probability table 4. Conversely, if the count of incorrect predictions reached it, the pointer value is rewritten to designate a lower probability value (process 113). The coding thereby adapts to the characteristics of the data being coded, which further improves symbol prediction accuracy.
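The pointer rewriting rule can be sketched as below. The threshold value, keeping one pair of counters per context, resetting only the counter that triggered the rewrite, and clamping the pointer at the table ends are all assumptions of this sketch. The probability table is assumed to be ordered from lowest to highest probability, so "higher probability" means a larger index.

```python
class ContextAdapter:
    """Tracks hits and misses for one context and moves its probability pointer."""

    def __init__(self, table_size: int, threshold: int = 4, pointer: int = 0) -> None:
        if table_size <= 0 or threshold <= 0:
            raise ValueError("table_size and threshold must be positive")
        self.table_size = table_size
        self.threshold = threshold  # the 'predetermined number' (value assumed)
        self.pointer = pointer
        self.hits = 0
        self.misses = 0

    def update(self, hit: bool) -> int:
        """Record one prediction result and return the (possibly rewritten) pointer."""
        if hit:
            self.hits += 1
            if self.hits >= self.threshold:
                # Enough correct predictions: point at a higher probability value.
                self.pointer = min(self.pointer + 1, self.table_size - 1)
                self.hits = 0
        else:
            self.misses += 1
            if self.misses >= self.threshold:
                # Enough wrong predictions: point at a lower probability value.
                self.pointer = max(self.pointer - 1, 0)
                self.misses = 0
        return self.pointer


if __name__ == "__main__":
    adapter = ContextAdapter(table_size=5, threshold=2)
    print([adapter.update(h) for h in (True, True, False, False)])
```

An encoder and a decoder that apply the same updates in the same order stay synchronized without transmitting any table contents.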

[0036] Next suppose the data type signal indicates a change of data. In that case (Y at process 111), the data type is determined and the above processing is carried out in the same way (return to process 101). As shown in FIG. 7, the coded image data and coded character data are thus output in sequence, each together with an identification code indicating its data type. When the input data runs out (Y at process 112), the coding process ends.

[0037] Next, the operation of the decoding apparatus of this embodiment is described.

[0038] The data signal coded by the above encoding apparatus is input to the decoding apparatus. As shown in FIG. 8, when the data signal is input, identification code detection/removal means 10 reads the identification code and determines the data type (process 201).

[0039] If the data signal is image data ("image" at process 202), one-line bit count switching means 3a sets the number of bits per line to the number of pixels of the document image in the main scanning direction (process 203). If it is character code data ("character" at process 202), one-line bit count switching means 3a sets the number of bits per line to 8 (process 204). Output switching means 12 then selects the image output signal line for image data, or the S/P conversion means 13 side for character data (process 205).

[0040] The prescribed decoding process is then carried out. Arithmetic code decoding means 11 decodes the incoming coded data, using the probability value output from probability table 4, into information indicating whether the original data symbol was the MPS or the LPS. Prediction judgment means 6 restores the original data symbol from that information and the predicted value output from predicted value / probability data selection table 5. For image data, the restored data is output as is. For character code data, it is converted to a parallel signal by S/P conversion means 13 and output. Reference symbol acquisition means 3, predicted value / probability data selection table 5, probability table 4, and arithmetic code generation means 7 each operate as in the encoding apparatus described above (the above constitutes process 206).

[0041] In parallel with the decoding operation, the symbol prediction results are judged as in FIG. 3, and the pointer values in predicted value / probability data selection table 5 are rewritten accordingly (processes 109, 110, and 113).

[0042] During the above operation, the input data is monitored for an identification code (process 207). If none is detected (N at process 207), the apparatus checks whether the input data has ended (process 208); if not (N at process 208), decoding continues (return to process 206). When the input data runs out (Y at process 208), the decoding process ends.

[0043] As described above, the encoding apparatus of this embodiment has a single arithmetic coding means and uses it to compress both image data and character code data.

[0044] Unlike the prior art, two kinds of coding means are not needed, so the apparatus configuration is simple. The decoding apparatus likewise needs only one corresponding arithmetic decoding means, so its configuration is simple as well.

[0045] Arithmetic coding is a known coding scheme well suited to image data, so a high compression effect is obtained. When character code data is compressed, reference symbol acquisition means 3 arranges the input data in parallel rows of 8 bits, the code length of one character, and extracts the bits around the bit of interest.

[0046] With character code data, bits at the same position tend to be the same even when the characters differ, so the correlation between the bit of interest and the surrounding bits is strong. This raises the hit rate of the symbol occurrence probability prediction, so a high compression effect is obtained for character code data as well.

[0047] In an experiment by the inventor, compressing the character string data of a C source program with the encoding apparatus of this embodiment was confirmed to reduce the data volume to 40% or less.

[0048] In a C source program identical character strings recur frequently. Even for real-valued data such as "0.12586 2.13658 5.23698 ...", in which identical strings hardly ever recur, the upper bits of the characters still match, so the encoding apparatus of this embodiment achieves a high compression effect.

[0049] In this embodiment, an identification code indicating whether the data is image data or character data is inserted at the head of the generated coded data. At decoding time the decoding apparatus can therefore determine the data type from the identification code and automatically carry out the appropriate operation.

[0050] The embodiment above describes coding ASCII character string information as the code data, but other character codes may be used, and code information other than characters can be coded in the same way. In that case, the number of bits per line used for the parallel arrangement in reference symbol acquisition means 3 is set to the length of one code. For JIS codes, for example, each character is 2 bytes, so the number of bits per line is set to 2 bytes. The number of bits per line need not equal one code length; it may be set to an integer multiple of the code length.

[0051] The identification code inserted into the coded data indicates only the data type in this embodiment, but it may also indicate the number of bits per line.

[0052] Coded data produced by arithmetic coding is pseudo-random, so there is no data pattern that never appears in it. A particular code therefore cannot simply be reserved as the identification code. Instead, for example, first, second, and third codes are defined. The identification code is defined as the first code followed immediately by the second code, and the encoder is required to insert the third code whenever the first code appears in the coded data. Then, when the first code is detected and the second code is detected immediately after it, the sequence can be judged to be an identification code.
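One way to realize the three-code convention is byte stuffing, sketched below. The specification does not give concrete code values, so `FIRST`, `THIRD`, and the `MARKERS` values are hypothetical, as is the choice of byte-sized codes.

```python
from typing import List, Tuple

FIRST = 0xFF                             # hypothetical first code
THIRD = 0x00                             # hypothetical third code (stuffed after FIRST in data)
MARKERS = {"image": 0x01, "text": 0x02}  # hypothetical second codes, one per data type


def stuff(payload: bytes) -> bytes:
    """Insert THIRD after every FIRST that occurs inside the coded data."""
    out = bytearray()
    for byte in payload:
        out.append(byte)
        if byte == FIRST:
            out.append(THIRD)
    return bytes(out)


def frame(kind: str, payload: bytes) -> bytes:
    """Prefix the stuffed coded data with its identification code (FIRST + second code)."""
    return bytes([FIRST, MARKERS[kind]]) + stuff(payload)


def parse(stream: bytes) -> List[Tuple[str, bytes]]:
    """Split a stream into (data type, coded data) segments, undoing the stuffing."""
    kinds = {value: name for name, value in MARKERS.items()}
    segments: List[Tuple[str, bytearray]] = []
    i = 0
    while i < len(stream):
        byte = stream[i]
        if byte != FIRST:
            if not segments:
                raise ValueError("data before the first identification code")
            segments[-1][1].append(byte)
            i += 1
            continue
        if i + 1 >= len(stream):
            raise ValueError("stream ends in the middle of a code pair")
        follower = stream[i + 1]
        if follower == THIRD and segments:
            segments[-1][1].append(FIRST)           # stuffed pair: literal FIRST in the data
        elif follower in kinds:
            segments.append((kinds[follower], bytearray()))  # identification code
        else:
            raise ValueError("invalid byte after the first code")
        i += 2
    return [(kind, bytes(data)) for kind, data in segments]


if __name__ == "__main__":
    stream = frame("image", b"\x12\xff\x01") + frame("text", b"\xff")
    print(parse(stream))
```

Because every `FIRST` inside the coded data is followed by `THIRD`, the pair `FIRST` plus a second code can only come from `frame`, which is the property the paragraph above relies on.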

[0053]

[Effect of the Invention] As described above, according to the present invention a single arithmetic coding means codes both image data and code data, so the apparatus configuration is simple. In addition, when code data is compressed, the data is arranged in parallel rows whose width is the number of bits in one unit of the code data, and the occurrence probability of the symbol of interest is predicted from the symbol pattern of the bits surrounding the bit of interest. This raises the hit rate of the prediction, so a high compression effect is obtained for both image data and code data.

[Brief description of the drawings]

FIG. 1 is a block diagram of an encoding apparatus according to one embodiment of the present invention.

FIG. 2 is a block diagram of a decoding apparatus according to one embodiment of the present invention.

FIG. 3 is an operation flowchart of the coding process.

FIG. 4 is an explanatory diagram showing how input data is arranged and the individual bits are extracted.

FIG. 5 is an explanatory diagram showing the symbol prediction operation.

FIG. 6 is an explanatory diagram showing how character code data is arranged.

FIG. 7 is an explanatory diagram of the output data of the encoding apparatus.

FIG. 8 is an operation flowchart of the decoding process.

[Explanation of symbols]

1: P/S conversion means
2: Input switching means
3: Reference symbol acquisition means
3a: One-line bit count switching means
4: Probability table
5: Predicted value / probability data selection table
6: Prediction judgment means
7: Arithmetic code generation means
8: Table rewriting means
9: Identification code insertion means
10: Identification code detection/removal means
11: Arithmetic code decoding means
12: Output switching means
13: S/P conversion means

Claims

(57) [Claims]

1. An image and code data compression method that selectively receives binary image data having a constant number of pixels per line and code data having a constant number of bits per unit, and compresses the received data, characterized in that: a single arithmetic coding means is provided that arranges the input data in parallel rows of a fixed number of bits, predicts the symbol occurrence probability of the bit of interest from the states of the bits surrounding the bit of interest, and compresses the data by coding the correspondence between the predicted value and the actual symbol; when the binary image data is compressed, the fixed number used for the parallel arrangement is set to the number of pixels in one line and the binary image data is compressed by the arithmetic coding means; and when the code data is compressed, the fixed number is set to the number of bits in one unit of the code data and the code data is compressed by the arithmetic coding means.

2. The image and code data compression method according to claim 1, wherein said code data is a character code.

3. The image and code data compression method according to claim 1, wherein, when compression of the binary image data or the code data is started, an identification code indicating the type of the data is output first and the coded data is output after it.

4. The image and code data compression method according to claim 3, wherein said identification code includes information indicating said constant number in which said data is arranged in parallel.