JPWO2002054757A1 - Data encoding method and apparatus, and data encoding program

Info

Publication number: JPWO2002054757A1
Application number: JP2002555519A
Authority: JP
Inventors: 坂無　英徳; 樋口　哲也
Original assignee: National Institute of Advanced Industrial Science and Technology AIST
Current assignee: National Institute of Advanced Industrial Science and Technology AIST
Priority date: 2000-12-28
Filing date: 2001-12-26
Publication date: 2004-05-13
Anticipated expiration: 2021-12-26
Also published as: WO2002054758A1; CA2433163C; EP1357735A4; JPWO2002054758A1; JP3986012B2; US20050100232A1; US7254273B2; CA2433163A1; WO2002054757A1; JP3986011B2; EP1357735A1

Abstract

The present invention relates to a template-based data compression method. Input data is divided into homogeneous regions. For each region, a template optimization means that applies an artificial intelligence technique (such as a genetic algorithm) raises prediction accuracy, and the data is compressed using the optimized result. A database is also updated so that compression efficiency and speed improve on subsequent runs. Because the database is updated with the templates obtained, templates that contribute to high prediction accuracy can eventually be obtained quickly without applying the artificial intelligence technique.

Description

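Technical Field
The present invention relates to data encoding methods in which, for example, the value of each pixel of image information is predicted from the states of the surrounding pixels and the image information is encoded on the basis of that prediction. More specifically, it relates to a method for optimizing, quickly and accurately, the pattern of surrounding pixel positions to be referenced, in order to improve the compression effect.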
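Background Art
In predictive coding, the more accurately each pixel value is predicted, the better the compression. Prediction accuracy improves as the number of pixels referenced during prediction (reference pixels) increases. It can also be raised by optimizing the positional pattern of the reference pixels.
Fig. 2 shows the reference pixel position pattern of the JBIG (Joint Bi-level Image Experts Group) method, the international standard for encoding bi-level images, in which every pixel takes only the value 0 or 1. (Such a reference pixel position pattern is hereinafter called a template.)
In the figure, the shaded square is the pixel of interest, and the squares labeled p1 to p9 and A are reference pixels. Of the ten reference pixels, the one labeled A is called the AT (Adaptive Template) pixel. Depending on the nature of the image, it is allowed to move to any position within the region of roughly 256 × 256 pixels shown in the figure. Finding the best position in such a large region is very difficult and computationally expensive, however, so many systems that implement JBIG restrict the AT pixel to the range of eight pixels marked with crosses in the figure. The JBIG recommendation suggests choosing the surrounding pixel with the strongest correlation as the way to determine the AT pixel position.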
一般に、文字や図形のような画像を圧縮符号化処理する場合、全参照画素が注目画素の近くに密集配置されているテンプレートが予測精度向上に有効である（電子情報通信学会論文誌ｖｏｌ．Ｊ７０−Ｂ、ｎｏ．７に掲載された加藤らの論文である″参照画素のダイナミック選択による２値画像の適応マルコフモデル符号化″）。よって、上記ＪＢＩＧ方式のテンプレートが有効に機能し、高い精度で注目画素の画素値を予測することができ、結果として高い圧縮率を得ることができる。
しかし自然画の場合には、参照画素を広範囲に分散配置させた方が高い予測精度を得られることが多い。さらに、どのように分散配置すべきかということは画像の性質に大きく依存する。たとえば、クラスター型のディザ法で２値化された画像では、一定周期で相関の強い画素が出現するため、参照画素配置には、絵柄に起因する特性だけでなく、ディザ法に起因する周期性を反映しなければ高い圧縮効率を得ることができない。また、渦巻き型、ベイヤー型、誤差拡散など、他のディザ法で生成された画像にも、それぞれ固有の特性があるため、高い圧縮効率を得るためには、それぞれに応じたテンプレートを使用しなければならない。
ところが、与えられた２値の自然画が、どのようなディザ法で作られたものであるかを判定することは非常に難しい。その上、画像の絵柄に起因する特性も考慮して、最適なテンプレートを簡単な計算により求めることは不可能である。ゆえに、これまでに、高い圧縮効率を得るためのテンプレートを決定する多くの方式が提案されてきた。
たとえば、宇都宮大の加藤らの方式や、米特許５０２３６１１（ルーセント）では、２値画像中で周囲画素のとる値が注目画素のとる値と同一であった回数を数え上げ、圧縮の最中にその値が一定条件を満たした場合、値の大きい周囲画素から順に参照画素としてテンプレートに組み入れる方法を採用している。
特開平６−９０３６３号公報（リコー）では、上記のように逐次的にテンプレートを変更する方式の他に、圧縮する前に画面全体にわたって周囲画素の相関の強さをあらかじめ求めておき、それに基づいてテンプレートを決定する方法や、あらかじめ用意したテンプレートから画像情報に適したものを選択する方式について言及している。
上記３方式のように、注目画素との相関の強さだけに基づく方式では、クラスター型のディザ法で２値化された画像において最適なテンプレートを決定することができない。なぜなら、実際には複数のクラスター間の相関を考慮する必要があるにもかかわらず、１つのクラスター内における画素同士の相関が強いため、単純に相関強度だけに基づいて参照画素を選び出すと、図１（ａ）のように密集したテンプレートとなってしまうためである。
これに対し、特開平５−３０３６２号公報（富士通）では、注目画素までのラン（同一ライン上で同一の画素値が連続する領域）と直前の同一色ランの距離差を周期性として用いる方法が記されている。しかし、この方式は周期性を持たない画像では有効に機能しない。
特開平１１−２４３４９１号公報（三菱重工）では、テンプレートの最適化のために、重回帰分析と、圧縮率を評価関数とした遺伝的アルゴリズムを用いて、網点構造と画像が表現する絵柄に起因する性質の両方への対応を図っている。遺伝的アルゴリズムは、自然界にみられる生物の進化や適応をモデル化した計算方式で、人工知能の強力な探索手法である。これにより膨大な可能性の中から、適切なテンプレートを選び出すことが可能となる。
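This method, however, is extremely slow, and the reason lies in how the genetic algorithm is implemented. A genetic algorithm prepares a population of candidate solutions, evaluates each of them, and generates a new population on the basis of those evaluations. This process constitutes one generation and is repeated for many generations until a stopping condition is met. One run therefore requires [population size] × [number of generations] evaluations.
In the method of JP-A-11-243491, the evaluation is a compression ratio calculation, so every single evaluation requires encoding the target image data once to obtain the compression ratio. Optimizing a template with that method therefore takes [population size] × [number of generations] times as long as compressing without template optimization. With a population size of 30 and 100 generations, for example, obtaining an optimized template takes 3000 times as long as compressing without the genetic algorithm.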
ファクシミリ装置のような画像伝送システムへの適用を鑑みた場合、このように膨大な時間をかけて遺伝的アルゴリズムでテンプレートの最適化をしている間に、無圧縮のデータ転送が完了してしまう。また、印刷用画像データのように数百ギガバイトの画像の圧縮を完了するためには莫大な計算時間が必要となり、とても現実的ではない。
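The object of the present invention is to solve the above problems in predictive coding and to provide a fast adaptive method of adjusting the template (reference pixel position pattern) so that high data compression efficiency is always obtained for all kinds of image information.
Disclosure of the Invention
To this end, the present invention is characterized in that (A) input image information is divided on the basis of its local similarity, (B) prediction accuracy is raised for each divided region by template optimization that uses a database, analysis processing, and an artificial intelligence technique equipped with a speed-up mechanism, and (C) the database is updated with the optimization result so that compression efficiency and speed improve from the next run onward. These elements are described in turn below.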
まず、（Ａ）について述べる。入力された画像情報を適切に分割するということは、同一の領域内では均質で、隣り合う領域同士では異なる性質を持つ領域単位で画像を切り分けることである。分割の例を図１４に示す。こうすることで、同一領域内では無条件に同じテンプレートを用いることができるようになる。さらに、適切に画像分割することで、木目細かなテンプレートの最適化を行えるようになり、分割画像ブロック内での画素値予測の精度が向上するため、結果として全体の圧縮効率が向上する。
分割の仕方を決定する方法は、１パスと２パスに大きく分類される。２パスとは、画像全体を最初に読み込んで、全体に対する解析などを行って分割方法を決定した後、あらためて分割領域ごとに読み込む方式である。画像全体の情報を利用できるため、精度の高い分割の仕方を得ることができる反面、２度にわたってデータの読み込みを行わねばならず、処理速度が低下するという問題がある。逆に１パスでは、画像を読み込みながら、逐次的に分割の仕方を決定する。画像全体の性質を反映した処理を行えないため、必ずしも最適な画像分割は行えないが、処理速度の低下を避けることができる。本発明では、１パス方式をベースに、ラスター方向に読み込んだ画像データをライン毎に分析することにより、適切な画像分割を行う。その詳細および具体的な処理方法については後述の実施例において説明する。
続いて（Ｂ）について述べる。本特許では、従来手法とは異なり、圧縮対象となる画像を生成したディザ方式を仮定しながら統計的処理および分析を行うことで、精度の高いテンプレートを生成する。また、あらかじめ想定されるディザ法とテンプレートの関係を格納したデータベースを用いて、圧縮符号化効率だけでなく、テンプレートを決定するために要する時間を飛躍的に短縮することが可能である。さらに、上記の方法で決定されたテンプレートをベースに、人工知能技術（遺伝的アルゴリズム等）を適用して、画像の絵柄に応じて適応的に調整することによって、より高い予測精度に寄与するテンプレートを得ることができる。画像の大局的な特性を損なうことなく一部分だけを切り出すことで、テンプレート決定の高速化も同時に行う。
最後に（Ｃ）について述べる。（Ｂ）で得られたテンプレートをフィードバックし、データベースを更新することで、使い込むほどに賢くなるデータ圧縮システムを構築できる。すなわち、同様な性質を持つ画像データを繰り返し処理することにより、人工知能技術を適用せずとも、高予測精度に寄与するテンプレートを高速に得ることができるようになる。この仕組みは、自律的に学習するエキスパートシステムと見なすことも可能である。
Best Mode for Carrying Out the Invention
Embodiment 1
以下、添付図面を参照しながら、本発明の実施例を詳細に説明する。図６は、本発明の一実施例に係る画像伝送装置の送信側の構成図を示している。同図において、画像バッファ３は、画像データ入力線２からの入力される画像データ１を一時保存しておくためのものである。保存された画像データは、後述するブロック判定器６からのブロック情報に基づいて、ブロック判定器６、コンテクスト生成器５、テンプレート発生器８、圧縮符号化器１１（いずれも後述）へと、画像データ信号線４を通して出力される。
ブロック判定器６は、入力される画像データを、特徴的なブロック単位に分割し、以降の処理をブロックごとに行わせることで、圧縮符号化効率を高めるためのものである。ブロック判定器６へは、画像データ信号線４を介して画像データが入力され、これを基にブロック分割の仕方が決定され、ブロック情報（各ブロックの大きさと位置）がブロック情報信号線７を通して出力される。なお、ブロック分割の仕方は、ブロック判定器制御信号線１７を通してブロック判定器制御信号を入力することにより、外部から決定・制御することも可能である。
テンプレート発生器８は、入力される画像データに対して、最大の圧縮率に寄与できる最適なテンプレートを決定する。ここへは、画像データ信号線４を経た画像データと、圧縮データ信号線１２を介した圧縮データが入力される。圧縮データは、テンプレート発生器が出力したテンプレートの良さを示す指標として利用される。なお、テンプレート発生器制御信号線１６を通して、外部からテンプレートを強制的に指定し、圧縮符号化を行わせることも可能である。
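The context generator 5 uses the template received over template signal line 9 to obtain the pattern of values taken by the reference pixels (the context) in the image data received over image data signal line 4.
Here, a context is the vector formed by extracting and lining up the values of the pixels at the positions specified by the template. For example, in the template of Fig. 15(a), where the shaded square is the pixel of interest and p1 to p4 are the reference pixel positions, the pixels corresponding to p1 to p4 in Fig. 15(d) take the values p1 = 0, p2 = 1, p3 = 1, and p4 = 1. The context is therefore <0111>, obtained by lining up these values.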
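The definition of a context above can be sketched in a few lines of Python. This is only an illustration: the template offsets and the small image below are hypothetical stand-ins (the actual layout of Fig. 15 is not reproduced in this text), and reading out-of-image pixels as 0 follows the convention used later for the joint probability density.

```python
from typing import List, Sequence, Tuple

Offset = Tuple[int, int]  # (dx, dy) relative to the pixel of interest


def extract_context(image: Sequence[Sequence[int]], x: int, y: int,
                    template: Sequence[Offset]) -> List[int]:
    """Return the values of the reference pixels named by `template`.

    Reference pixels that fall outside the image are read as 0.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    context = []
    for dx, dy in template:
        px, py = x + dx, y + dy
        inside = 0 <= px < width and 0 <= py < height
        context.append(image[py][px] if inside else 0)
    return context


# Hypothetical 4-pixel template: three pixels on the line above (p1..p3)
# and the pixel immediately to the left (p4).
TEMPLATE = [(-1, -1), (0, -1), (1, -1), (-1, 0)]
IMAGE = [
    [0, 1, 1, 0],
    [1, 0, 1, 1],
]

print(extract_context(IMAGE, 1, 1, TEMPLATE))
```

With this made-up image, the context of the pixel at (1, 1) happens to be the same vector <0111> as in the example in the text.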
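The context generator can be made faster by making good use of a shift register. In the template of Fig. 15(a), p1, p2, and p3 lie side by side. When the template moves one bit to the right, the value of p2 moves to p1 and the value of p3 moves to p2. If a shift register is used for p1 to p3, only the two pixels p3 and p4 need to be newly read from memory when the template moves to the right. This reduces the number of memory address calculations and so increases processing speed.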
圧縮符号化器１１は、コンテクスト信号線１０を介して入力されるコンテクストを用いて、画像データ信号線４を介して入力される画像データを圧縮符号化し、圧縮データを生成するためのものである。
圧縮符号化済みデータ合成器１３は、圧縮データに対して、それを生成するために使用したテンプレートと、圧縮データの元となったブロック情報を付加して、圧縮符号化済みデータ１５を生成するためのものである。ここへは、ブロック情報信号線７を介してブロック情報、テンプレート信号線９を介してテンプレート、圧縮データ信号線１２を介して圧縮データが入力される。また、当該システムの外部から、画像データに関する属性情報（画像データの解像度、線数、スクリーン角度、画像データを生成したシステム名とバージョン、画像データの作成日時など）を、属性情報入力線１８を介して圧縮符号化済みデータ合成器１３に入力し、圧縮符号化済みデータ１５のヘッダに記録することもできる。
図６に基づいて、画像データの圧縮および圧縮符号化済みデータ１５の出力までの手順を簡単に説明する。
まず、当該システムの外部から、画像データに関する属性情報が属性情報入力線１８を介して圧縮符号化済みデータ合成器１３に入力され、圧縮符号化済みデータのヘッダとして記録される。この属性情報はなくても構わない。
続いて、圧縮対象となる画像データ１が画像データ入力線２を通して画像バッファ３へと入力される。このデータは画像データ信号線４を通してブロック判定器６へと送られる。このとき、画像データの大きさ（縦横サイズ）が判明するので、ブロック情報信号線７を介して圧縮符号化済みデータ合成器１３に送られ、圧縮符号化済みデータ１５のヘッダに記録される。
ブロック判定器６では画像に対するブロック分割の仕方が計算される。なお、たとえば新聞の紙面のように文字領域や写真領域の位置が紙面編集の段階で確定しているような場合には、ブロック判定器で最適なブロック分割の仕方を計算する必要はなく、これを行わないことで処理速度を向上させることができる。すなわち、ブロック判定器制御信号線１７を通して、紙面配置情報をブロック判定器６が読み込むことにより、この処理を高速化可能である。また、紙面配置情報をベースに、さらにブロックを適切に細分化することで、より高い圧縮効率が期待できる。この具体的な処理方法は後述する。
上記の手順で決定されたブロック分割情報は、画像バッファ３へと送られ、画像バッファ３からはブロック単位で画像データがテンプレート発生器８へと送られる。テンプレート発生器８では、図８のフローチャートに示す手順に基づいて、ブロック毎に最適なテンプレートが計算される。このときの処理方法については後述する。なお、フォントサイズの小さな文字領域については、図１（ａ）のように注目画素の周囲に参照画素が密集した形状のテンプレートが有効に機能することが知られているため、テンプレート最適化を行う必要はない。そこで、このように特定のブロックに対する最適なテンプレートがあらかじめ判明している場合には、テンプレート発生器制御信号線１６を介して入力されるテンプレート発生器制御信号により、自動的にテンプレートが決定される。
コンテクスト生成器５では、テンプレート発生器８で計算されたテンプレートを用いて、画像バッファ３から画像データ信号線４を通して送られてくるブロック分割された画像データを走査しながら、各画素毎にコンテクストを抽出する。
圧縮符号化器１１では、画像バッファ３から画像データ信号線４を通して送られてくるブロック分割された画像データの各画素と、上記の手順で得られたコンテクストを用いて圧縮データを生成する。得られた圧縮データは圧縮データ信号線１２を通して圧縮符号化済みデータ合成器１３へと送られ、ブロック判定器６からブロック情報信号線７を通して送られてくるブロック情報と、テンプレート発生器８からテンプレート信号線９を通して送られてくるテンプレートと合成され、各ブロックに関する圧縮情報単位で連結され、圧縮符号化済みデータ１５として圧縮符号化済みデータ出力線１４を通して受信側へと送信される。
なお、圧縮符号化器１１から生成される圧縮データはテンプレート発生器８におけるテンプレート最適化処理において圧縮率の計算に使用されるため、圧縮データ信号線１２を通してテンプレート発生器へも送られる。
図７は、本発明の一実施例に係る画像伝送装置の受信側の構成図を示している。同図において、データ解析器２２は、圧縮符号化済みデータ入力線２１から入力される圧縮符号化済みデータ２０から、ヘッダ部分と、ブロックごとに圧縮された各ブロックに関する圧縮情報（テンプレート、ブロック情報、圧縮データ）を分離し、ブロック情報に基づいてテンプレートと圧縮データを出力するためのものである。
コンテクスト生成器２７は、テンプレート信号線２５を介して入力されるテンプレートを用いて、復号データ信号線２９を介して入力される復号データにおける参照画素のとる値のパターン（コンテクスト）を取得するためのものである。
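The compressed-code decoder 26 uses the context received over context signal line 28 to decode the compressed data received over compressed data signal line 24 and generate decoded data.
The compressed-code decoder 26 outputs the block-divided decoded data piecemeal, like a patchwork, so the original image cannot be restored unless the pieces are arranged correctly. The expanded data synthesizer 30 therefore receives the decoded data from the compressed-code decoder 26 over decoded data signal line 29. On the basis of the block information received from the data analyzer 22 over block information signal line 23, it accumulates the decoded data while placing each piece in position, thereby synthesizing the expanded data, which it outputs externally over expanded data output line 28.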
図７に基づいて，圧縮データの復号および伸張データの出力までの手順を簡単に説明する．
まず、圧縮符号化済みデータ入力線２１を介してデータ解析器２２へと圧縮符号化済みデータ２０が入力される。圧縮符号化済みデータ２０のフォーマットは図２４を用いて後述する。データ解析器２２では、圧縮符号化済みデータ２０を解析し、ヘッダ部分と、各ブロックに関する圧縮情報（ブロック情報、テンプレート、圧縮データ）が分離して取り出される。
ヘッダ部分に記録されていた原画像の属性情報は、伸張データ合成器３０へと送られ、そのまま伸張画像ヘッダとなる。また、ヘッダ部分には必ず原画像データの大きさ（縦横サイズ）が記録されており、伸張データ合成器３０はこの情報を用いて、復号データから伸張データの復元を行う。
圧縮データは圧縮データ信号線２４を通して圧縮符号復号化器２６へと送られ、これに対応するテンプレートがテンプレート信号線２５を介してコンテクスト生成器２７へと送られる。圧縮符号復号化器２６では、コンテクスト生成器２７からコンテクスト信号線２８を介してコンテクストを受け取り、これを用いて圧縮データを復号化して、復号データ信号線２９を介して出力する。出力された復号データはコンテクスト生成器２７へと送られ、テンプレートを用いてコンテクストを生成するために使用される。
伸張データ合成器３０へは、圧縮符号復号化器２６から復号データ信号線２９を介して伸張データが入力され、データ解析器２２からブロック情報信号線２３を介して入力されるブロック情報を用いて、パッチワーク状に断片化された画像データとしての復号データの再配置が行われ、伸張データが合成される。伸張データは伸張データ出力線２８を介して外部へと出力される。
図３４は、本発明にかかわる技術をハードウェアも使用して実現した例である。本技術を用いた画像データ圧縮ハードウェアは、プロセッサ、メインメモリ、各種レジスタ、圧縮／伸張処理部から構成される。圧縮時における各要素の動作は以下のとおりである。プロセッサでは、ブロックの判定、テンプレートの生成、およびホスト計算機とのインタフェースを行う。メインメモリは、プロセッサで処理するブロック判定およびテンプレート生成用プログラム、さらに発見されたテンプレートを格納するために用いる。各種レジスタは、プロセッサと圧縮処理部のインタフェースとして使用される。圧縮処理部では、コンテクスト生成、圧縮符号化処理、圧縮符号化済みデータ合成が行われる。なお、圧縮符号化済みデータ合成は、外部のホスト計算機で行ってもよい。
伸張時における各要素の動作は以下のとおりである。プロセッサは、伸張の際にはほとんど使用されず、圧縮データ先頭部分のデータ解析と、ホスト計算機とのインタフェースを行うために使用される。メインメモリには、圧縮時と同様に、プロセッサが実行するデータ解析プログラムを格納される。各種レジスタは、プロセッサと圧縮／伸張処理部とのインタフェースとして働く。圧縮伸張処理部では、コンテクスト生成、圧縮符号復号化処理および伸張データの合成が行われる。なお、伸張データの合成は、外部のホスト計算機で行ってもよい。
以下では、動作について説明する。まず、圧縮の開始時に、ホスト計算機に格納されている圧縮対象画像データに関して、その大きさや解像度などの情報が、画像データ圧縮ハードウェアのプロセッサと圧縮／伸張処理部に送られる。圧縮／伸張処理部では、これらのデータを圧縮符号化済みデータのヘッダに記録する。プロセッサでは、圧縮／伸張処理部と協調して、これらのデータを用いて、ブロック判定およびテンプレート最適化が行われる。
ソフトウェアによる適応型テンプレート調整方式の動作手順を図８に示す。はじめに、全体の流れを概説し、その後、各処理について詳しく説明する。
入力画像データに対して、はじめに、性質の似通った領域単位で別々に圧縮処理を行うために画像分割ｓ１を行う。ここで、画像全体をブロックに分割してしまってもよいし、一部分だけをブロックに分割し、残った部分を後からブロックに分割してもよい。前者は２パス、後者は１パスの処理に相当する。ここでは１パス処理を行うこととする。
画像の一部分が複数のブロックに分割された後、分割された画像ブロックのそれぞれに対してテンプレート最適化ｓ２が行われる。
画像分割ｓ１で生成された画像ブロックの全てに対してテンプレート最適化が行われた後、画像末端に達するまで、残りの部分のブロック分割とブロックごとのテンプレート最適化が行われる。
図９にブロック判定のための画像分割（ｓ１）手続きを示す。水平方向分割ｓ１２では、プロセッサがブロック境界を検知するまで、図３５に示す圧縮／伸張処理部の画像データメモリに１ラインずつ逐次的に読み込まれる。ブロック境界の検知方法は以下のとおりである。
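Fig. 10 is a flowchart of the horizontal division s12. First, in step s121, the image is scanned one line at a time. The lengths of all runs of the same color are recorded, and the feature value of the line is computed at the same time. The recorded run lengths are used for block division in the vertical direction, and the line feature values for block division in the horizontal direction. Feature values include the average run length and the KL (Kullback-Leibler) divergence based on the joint probability density (for every pixel in the referenceable range, the probability that it has the same value as the pixel of interest).
The feature value of the line of interest is then compared with that of the preceding line (s122). If the difference between the two is at or above a predetermined threshold, the lines up to just before the line of interest are taken to form one block, and a block boundary is detected. Even if the threshold is not exceeded, the end of the image is treated as a block boundary when it is reached.
The joint probability density is explained with reference to Fig. 15. In Fig. 15(a), the four pixels p1 to p4 are taken to be all the pixels in the referenceable range, and the shaded square is the pixel of interest. The second line of Fig. 15(b) is the line currently under consideration. To obtain the joint probability density, the line of interest is scanned as in Fig. 15(c) to (e), and the probability that each reference pixel takes the same value as the pixel of interest is computed. In Fig. 15(c), reference pixels p1 and p4 fall outside the domain of the data. If such pixels are taken to be 0, only p3 has the same value as the pixel of interest. Likewise, the three pixels p2, p3, and p4 match in Fig. 15(d), and the three pixels p1, p2, and p3 match in Fig. 15(e). Scanning the whole line through Fig. 15(c) to (e), the numbers of times p1 to p4 had the same value as the pixel of interest are 7, 7, 7, and 6, respectively. These counts represent the correlation strength of each reference pixel with the pixel of interest. Normalizing them gives the joint probability density: 7/27, 7/27, 7/27, and 6/27.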
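The counting and normalization just described might look as follows in Python. This is a sketch under assumptions: the function names, the template offsets, and the sample image are invented for illustration, and the image of Fig. 15 is not reproduced, so only the final normalization step is checked against the numbers in the text.

```python
def match_counts(image, y, template):
    """For line `y`, count how often each reference pixel equals the pixel of interest.

    `template` is a list of (dx, dy) offsets; pixels outside the image count as 0.
    """
    height, width = len(image), len(image[0])
    counts = [0] * len(template)
    for x in range(width):
        target = image[y][x]
        for i, (dx, dy) in enumerate(template):
            px, py = x + dx, y + dy
            inside = 0 <= px < width and 0 <= py < height
            value = image[py][px] if inside else 0
            if value == target:
                counts[i] += 1
    return counts


def joint_probability_density(counts):
    """Normalize the match counts so that they sum to 1."""
    total = sum(counts)
    if total == 0:
        return [0.0] * len(counts)
    return [c / total for c in counts]


# The counts quoted in the text (7, 7, 7, 6) normalize to 7/27, 7/27, 7/27, 6/27.
print(joint_probability_density([7, 7, 7, 6]))

# Hypothetical image and template, used only to exercise match_counts.
image = [[0, 1, 1, 0],
         [1, 0, 1, 1]]
template = [(-1, -1), (0, -1), (1, -1), (-1, 0)]
print(match_counts(image, 1, template))
```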
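The KL divergence based on the joint probability density is computed as follows. Let Xi be the feature value of the current line and Yi the feature value of the preceding line. The KL divergence KLD is then
KLD = KL(X) + KL(Y)
where
KL(X) = Σ{Xi × log(Xi / Yi)}
KL(Y) = Σ{Yi × log(Yi / Xi)}
and each sum runs over all pixels i in the referenceable range.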
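As a rough illustration, the symmetric measure KLD and the threshold test of step s122 could be written as below. The small constant `eps`, the use of the natural logarithm, and the function names are assumptions made for this sketch; the text does not specify how zero probabilities or the logarithm base are handled.

```python
import math


def symmetric_kl(x, y, eps=1e-12):
    """KLD = KL(X) + KL(Y) for two feature vectors of equal length.

    `eps` is an assumed smoothing constant that keeps the logarithm finite
    when an entry of either vector is zero.
    """
    kl_x = sum(xi * math.log((xi + eps) / (yi + eps)) for xi, yi in zip(x, y))
    kl_y = sum(yi * math.log((yi + eps) / (xi + eps)) for xi, yi in zip(x, y))
    return kl_x + kl_y


def is_block_boundary(current, previous, threshold):
    """Report a block boundary when the difference is at or above the threshold."""
    return symmetric_kl(current, previous) >= threshold


same = [7 / 27, 7 / 27, 7 / 27, 6 / 27]
print(symmetric_kl(same, same))              # identical lines: essentially 0
print(symmetric_kl([0.5, 0.5], [0.9, 0.1]))  # dissimilar lines: clearly positive
```

Because the two directed terms are added, the measure is symmetric in the two lines, so it does not matter which line is treated as X.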
なお、特徴量の比較を行う際に、前ラインだけの特徴量ではなく、当該ブロックから前ラインまでの全てのラインに関する特徴量を計算しておき、その平均や移動平均、加重平均などと、当該ラインの特徴量と比較することにより精度の高い特徴量比較を行うことができる。
また、特徴量比較における規範として、ＫＬ情報量の他にＭＤＬ（Ｍｉｎｉｍｕｍ　Ｄｅｓｃｒｉｐｔｉｏｎ　Ｌｅｎｇｔｈ）基準、ＡＩＣ（Ａｋａｉｋｅ’ｓ　Ｉｎｆｏｒｍａｔｉｏｎ　Ｃｒｉｔｅｒｉｏｎ）、ベクトル間の距離などを使用してもよい。
さらに、特徴量を求める際に、参照可能範囲内にある全ての画素に関して計算をするのではなく、直前のブロックで使用したテンプレートに含まれる参照画素だけに関して計算を行うことで、高速化を図ることもできる。
水平方向のブロック境界が検知された後、そのブロックをさらに垂直方向に分割することも可能である。図１１は、垂直方向分割（ｓ１３）の詳細な手続きを示す。
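First, in step s131, candidate vertical division positions are selected. By referring to the same-color run lengths recorded for each line in step s121, candidate vertical division positions can be picked out on every line.
Fig. 33 shows an example of an image block 16 pixels wide and 8 lines high. On the first line, the eight positions (a), (c), (d), (f), (i), (l), (n), and (o) are candidate vertical division positions. On the second line, the four positions (b), (f), (j), and (n) are candidates. Repeating this for every line yields the candidates for each line.
Next, in step s132, every candidate is examined to decide whether it is actually selected as a vertical division position of the block. A majority-vote approach is used. Suppose the selection threshold ratio given in advance is 0.8. Seven of the eight lines have a run break at (b), so 7/8 of the lines, a ratio of at least 0.8, can be regarded as supporting (b) as a vertical division position, and (b) is selected. Similarly, every line has a run break at (f) and at (n) (a ratio of 1.0), so these are also selected as vertical division positions of the block.
In the example above the minimum run length is 1, so the break after any run of length 1 or more is a candidate, but the minimum run length may be set to 2 or more. The smaller the minimum run length, the more sensitive the selection and the larger the number of vertical divisions. The larger it is, the lower the sensitivity and the smaller the number of divisions. In fact, in the example above, no vertical division takes place once the minimum run length is 2 or more.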
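The majority vote over run breaks can be sketched as follows. The block below is a made-up 8-line example rather than the one in Fig. 33, and the way `min_run` is applied (a break counts only if the run ending there is at least that long) is one plausible reading of the text, not a confirmed detail.

```python
def run_boundaries(line, min_run=1):
    """Positions x where a run of at least `min_run` pixels ends and a new value starts."""
    boundaries = set()
    run_start = 0
    for x in range(1, len(line)):
        if line[x] != line[x - 1]:
            if x - run_start >= min_run:
                boundaries.add(x)
            run_start = x
    return boundaries


def vertical_split_positions(block, support_ratio=0.8, min_run=1):
    """Keep the candidate positions supported by at least `support_ratio` of the lines."""
    votes = {}
    for line in block:
        for pos in run_boundaries(line, min_run):
            votes[pos] = votes.get(pos, 0) + 1
    needed = support_ratio * len(block)
    return sorted(pos for pos, n in votes.items() if n >= needed)


# Hypothetical block: 7 of 8 lines break at x = 3, one line breaks only at x = 7.
block = [[0, 0, 0, 1, 1, 1, 1, 1]] * 7 + [[0, 0, 0, 0, 0, 0, 0, 1]]
print(vertical_split_positions(block))             # 7/8 >= 0.8, so x = 3 is kept
print(vertical_split_positions(block, min_run=4))  # fewer candidates, no split
```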
なお、圧縮／伸張処理部内に専用の計算ユニットを配置し、上記の水平および垂直ブロック分割によるブロック判定をプロセッサではなく、この専用計算ユニットで行ってもよい。
ブロック判定の後は、それぞれのブロックに対してテンプレートの最適化（ｓ２）が行われる。テンプレート最適化の計算手続きは図２８の通りである。
まず、ステップｓ２１にてデータベース検索を行う。図６におけるメモリ８Ｍに格納されるデータベースには、あらかじめ登録されたテンプレートのほか、後述のテンプレート探索で得られたテンプレートが記録されている。
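If the database search finds a registered template, processing proceeds directly to step s28 (s26).
If no matching template is registered in the database, the image analysis s22 is performed next. Here a template suited to the image data is found by statistical processing and simple calculation, with reference to properties such as the periodicity of the data. The template resulting from the image analysis s22 is then used as a search key for the database re-search s27. Whether or not a matching template is registered in the database, processing then proceeds to step s28, which decides whether to perform a template search.
Step s28 decides whether to perform a template search. The criteria are described later. If the decision is not to perform a template search, processing proceeds to step s25 and template optimization ends.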
テンプレート探索ｓ２３では、人工知能に基づく探索的手法を用いて、テンプレートの最適化を行う。最後に、テンプレート探索で得られたテンプレートをデータベースに適切に格納する。ここで格納されたテンプレートは、上述のデータベース検索で使用され、精度の高いテンプレートを高速に得るために利用される。
なお、この上記手続きは、上記ブロック分割で得られた分割画像ブロックの全てに対して行われる。また、計算速度を向上させるため、全ての分割画像ブロックではなく、一部の分割画像ブロックを代表例として処理し、その結果を単純に他の分割ブロックに適用するということも可能である。さらに高速化の必要がある場合には、画像解析やテンプレート探索を必要に応じて省略することも可能である。
以降、図２８における各処理ステップについて順に説明する。
図２９は、データベース検索ｓ２１の処理フローである。データベース検索は、ステップｓ２１１において、属性情報が登録内容と照合される。図１６は、テンプレートデータベース構造の一例である。この場合、検索キーとして符号化対象である画像データを生成したシステム、そのバージョン、画像データの解像度、線数、スクリーン角度が使用され、あらかじめ定められた閾値以上に一致度が高い登録テンプレートが選択される。画像解析結果（入力テンプレート）は、図２８におけるデータベース再検索ｓ２７で使用するための検索キーで、ここでは使用しない。データベース検索ｓ２１の結果、適合するテンプレートが発見された場合、ステップｓ２６から、テンプレート探索を行うかどうかの判断ステップｓ２８に進む。ここでの判断規範としては、基準テンプレートを用いた場合と比較して、（ａ）圧縮率が一定閾値以上高いかどうか、（ｂ）テンプレートに含まれる各画素の相関強度（後述）の総和が一定閾値以上高いかどうか、などを用いる。通常は（ａ）を用いるが、少しでも処理時間を短縮したい場合には（ｂ）を使用してもよい。
ここで、基準テンプレートとしては、ＪＢＩＧおよびＪＢＩＧ２方式のデフォルトテンプレートを使用する。データベース検索で２番目に適合したテンプレート、あるいは、既に処理済のブロックで最終的に使用されたテンプレートなどを使用してもよい。
このとき、計算時間を短縮するために、画像の対象ブロックの一部分だけを切り出して生成される画像、行および列に関して一定規則に基づいて間引いて生成された画像を使用してもよい。
なお、強制的にテンプレート探索を行わせるか、行わせないかどうかの判断を、オペレータや外部システムにより下すことも可能である。
ステップｓ２８において、テンプレート探索を行うという判断が下されたときは、そのままステップｓ２３でテンプレート探索を行い、行わないという判断が下された場合は、ステップｓ２５に進んで、テンプレート最適化処理を完了する。テンプレート探索については後述する。
データベース検索ｓ２１の結果、適合するテンプレートが発見されなかった場合は、図２８のステップｓ２６を経て、画像解析ｓ２２に進む。
図３０は、画像解析の処理フローである。ここでは、ディザ分析ｓ２２１を試み、それが可能だった場合はそのまま終了し、それができなかった場合には相関分析ｓ２２３を行ってから終了する。
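Dither analysis exploits the periodicity that results from the dither matrix used to generate a dithered image. The procedure is as follows. First, the correlation with the pixel of interest is computed for every pixel in the referenceable range. Next, among the surrounding pixels at least a certain distance away from the pixel of interest, the single pixel with the strongest correlation is selected.
When the image to be encoded is a dithered image generated with a dither matrix, strongly correlated pixels lie at the vertices of squares that are tilted by a fixed angle and have a particular side length. The distance and direction from the pixel of interest to the selected pixel are therefore very likely to satisfy the relationship of Fig. 18(a). In that figure, the shaded square is the pixel of interest, the square marked with a cross is the pixel with the greatest correlation strength among those at least a certain distance from the pixel of interest, and the blank squares are pixels that this relationship suggests are strongly correlated (hereinafter called ideally placed pixels). Choosing reference pixels from among the ideally placed pixels makes it possible to determine a template more accurately than selecting reference pixels by correlation strength alone.
If the number m of ideally placed pixels in the referenceable range is greater than the number n of reference pixels that make up the template, n ideally placed pixels are selected in order of increasing distance from the pixel of interest and adopted as the reference pixels of the initial template. For example, with ideally placed pixels as in Fig. 18(a) and n = 10 reference pixels, the ideally placed pixels p1 to p10 are selected as reference pixels.
Conversely, if the number m of ideally placed pixels in the referenceable range is smaller than the number n of reference pixels that make up the template, reference pixels are selected, in order of increasing distance from the pixel of interest, from among the pixels adjacent to the pixel of interest and the pixels adjacent to the ideally placed pixels. For example, with ideally placed pixels as in Fig. 18(a) and n = 16 reference pixels, the ideally placed pixels p1 to p13 are adopted into the template, and three further reference pixels are selected in order of decreasing correlation strength to make up the shortfall.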
ところが、参照可能範囲が狭すぎて十分な数の理想配置画素数を包含できない場合、あるいは参照可能範囲に対する理想配置画素間の距離が極めて大きい場合、一般に相関強度の強い画素は注目画素近傍に集まりやすいため、理想配置画素以外の参照画素は全て注目画素近傍から選ばれることになる可能性が高い。
図１８（ｂ）は参照可能範囲が狭すぎる場合の例である。ここで参照画素数ｎ＝１６とすると、理想配置画素ｐ１〜ｐ７以外の９個の参照画素は全て注目画素近傍から選ばれる可能性が高い。同図（ｂ）中では、注目画素近傍の添え字のない空白の四角で示される画素が、不足分を補充するための参照画素として選択される。しかし、これでは参照画素同士が密集しすぎているため、自然画に対して有効に機能できない。
このような状況を避けるため、以下に説明するような方式でテンプレートを決定する。すなわち、まず、注目画素に隣接する画素のうちで注目画素との相関が最も強い画素を参照画素として採用する。次に、注目画素から近い順に理想配置画素を選び、それと隣接する画素のうちで注目画素との相関が最も強い画素を参照画素として採用する。図１８（ｂ）の例の場合、ここまでの手順で、注目画素の隣接画素、７個の理想配置画素とそれぞれ１つずつの隣接画素の、合計１５（＝１＋７×２）個の参照画素が決定される。これでもまだ総参照画素数よりも少ない場合は、注目画素に隣接する画素から２つ目の参照画素を採用し、さらに理想配置画素に隣接する画素からもそれぞれ２つめの参照画素を採用するプロセスを、採用した参照画素数がテンプレートを構成する総参照画素数に達するまで繰り返す。
なお、上記ディザ分析において、相関強度の代わりに、画像分割を行うときに計算してある同時確率密度を使用し、計算時間を大幅に短縮することができる。
以上の手続きにより、ディザ分析が完了する。なお、あらかじめディザマトリクスが判明している場合には、その大きさから周期性や相関パターンを明確に検出することができるので、理想画素配置の計算を省略することが可能である。
図３０のステップｓ２２３では、ディザ分析が上手く機能しない場合に相関分析を行う。なお、相関強度を計算する際、通常は分割画像ブロックを構成する全ての画素を走査するが、計算を省略することで高速化することが可能である。すなわち、数画素おき、数ラインおきに走査すること、あるいは一部の領域内だけを走査することにより計算速度を向上させられる。高速化のためには、分割画像ブロック内の、さらに一部分だけを限定して走査してもよい。
相関分析ｓ２２３では、まず、参照可能範囲にある全ての画素について、注目画素との相関を計算する。そして、テンプレートがｎ画素から構成される場合、相関の強い順にｎ個の画素を選び出して初期テンプレートに採用し、画像解析手続きを完了する。
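In an image with very strong high-frequency components, a selection criterion based simply on correlation strength packs the reference pixels tightly around the pixel of interest, as in Fig. 1(a). As already noted, such a template is effective for text regions but unsuitable for natural images. What is needed is a way of spreading the reference pixels more widely while still selecting pixels in order of correlation strength.
One such method is explained with reference to Fig. 17(a). In the figure, the shaded square is the pixel of interest, blank squares are pixels with a correlation strength of 0, and squares containing a number are pixels whose correlation strength is that number. In this example the template consists of three reference pixels. If reference pixels were chosen on correlation strength alone, the pixels with strengths 100, 90, and 80 would make up the template. A different template results, however, if the correlation strength of the pixels adjacent to each selected reference pixel is forcibly reduced.
Suppose the correlation strength of the neighbors of a selected reference pixel is reduced to 90% each time. After the two pixels with strengths 100 and 90 have been selected, the state is as shown in Fig. 17(b). The pixel whose strength was 80 in Fig. 17(a) is reduced to 80 × 0.9 × 0.9 = 64.8 because the two pixels with strengths 100 and 90 were selected. The third reference pixel chosen in this case is therefore the pixel with strength 70.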
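The neighbor-damping selection can be sketched as a greedy loop. The coordinates below are hypothetical (Fig. 17 is not reproduced here) and are arranged so that the pixel of strength 80 sits next to both the 100 and the 90 pixel. Treating the eight surrounding pixels as "adjacent" is also an assumption of this sketch.

```python
def select_spread_template(strength, n, decay=0.9):
    """Greedily pick `n` reference pixels, damping the neighbors of each pick.

    `strength` maps (x, y) positions to correlation strengths. After each pick,
    the strengths of its (assumed 8-connected) neighbors are multiplied by `decay`.
    """
    remaining = dict(strength)
    chosen = []
    for _ in range(min(n, len(remaining))):
        best = max(remaining, key=remaining.get)
        chosen.append(best)
        del remaining[best]
        bx, by = best
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                neighbor = (bx + dx, by + dy)
                if neighbor in remaining:
                    remaining[neighbor] *= decay
    return chosen


# Hypothetical layout: the 80 pixel lies between the 100 and 90 pixels.
strength = {(0, 0): 100, (2, 0): 90, (1, 0): 80, (5, 0): 70}
print(select_spread_template(strength, 3))             # 80 is damped twice, 70 wins
print(select_spread_template(strength, 3, decay=1.0))  # plain ranking by strength
```

With `decay=0.9` the third pick is the pixel of strength 70, as in the worked example. With `decay=1.0` the procedure reduces to the plain ranking that yields 100, 90, and 80.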
なお、上記相関分析において、相関強度の代わりに、画像分割を行うときに計算してある同時確率密度を使用し、計算時間を大幅に短縮することができる。
検索キーとしての画像解析結果（入力テンプレート）に適合するテンプレートがデータベース中に登録されているということは、現在圧縮処理しようとしている画像データと、非常に統計的性質の似通った画像データに関して、これまでに圧縮処理を行った経験があるということを意味している。すなわち、そのときのテンプレート探索結果である登録テンプレートを利用することによって、テンプレート探索結果に要する計算時間を省きつつ、画像解析結果そのままより有効に機能するテンプレートを得ることができる。
ただし、ある画像で有効に機能したテンプレートが、その画像と統計的性質が非常に似通った画像でも有効に機能するということの蓋然性は高いものの、保証されるものではない。たとえば、誤差拡散法で処理されたディザ画像では、絵柄によらず、統計的手法で決定した参照画素は図１（ａ）のように密集配置されやすい。そこで、そのような状況があらかじめ想定される場合には、データベース検索における適合度の閾値を高く設定しておけばよい。また、ディザ方式により異なる閾値を設定しておけば、自動的に上記問題を解消することも可能である。
ここまでが、図２８における画像解析ｓ２２の動作説明に関する説明である。つづいて、データベース再検索ｓ２７について述べる。
上述のデータベース検索ｓ２１とは異なり、ここでは検索キーとして画像解析結果（入力テンプレート）だけを使用し（図１６参照）、一定閾値以上適合した登録テンプレートを出力する。
データベース再検索の後は、ステップｓ２８において、テンプレート探索を行うかどうかの判断が下されるが、この処理手順は上述のとおりである。
つづいて、テンプレート探索ｓ２３について述べる。
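As illustrated in the flowchart of Fig. 31, the template search s23 consists of two stages. In the first stage, the genetic algorithm s231 is run with the template obtained by the image analysis s22 as its initial state, and the template is optimized. In the second stage, the resulting template is further adjusted by the local search s232, which completes the process.
The genetic algorithm s231 is described first. A reference on genetic algorithms is, for example, David E. Goldberg, "Genetic Algorithms in Search, Optimization, and Machine Learning," Addison-Wesley Publishing Company, Inc., 1989.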
一般的な遺伝的アルゴリズムでは、まず遺伝子を持つ仮想的な生物の集団を設定し、あらかじめ定めた環境に適応している個体が、その適応度の高さに応じて生存し、子孫を残す確率が増えるようにする。そして、遺伝的操作と呼ばれる手順で親の遺伝子を子に継承させる。このような世代交代を実行し、遺伝子および生物集団を進化させることにより、高い適応度を持つ個体が生物集団の大勢を占めるようになる。そしてその際の遺伝的操作としては、実際の生物の生殖においても生じる、遺伝子の交叉、および突然変異等が用いられる。
図１２は、かかる遺伝的アルゴリズムの概略手順を示すフローチャートであり、ここでは、初めにステップｓ２３１１で、個体の染色体を決定する。すなわち、世代交代の際に親の個体から子孫の個体に、どのような内容のデータをどのような形式で伝えるかを定める。図１９に染色体を例示する。ここでは、対象とする最適化問題の変数ベクトルｘを、Ｍ個の記号Ａｉ（ｉ＝１、２、…、Ｍ）の列で表わすことにし、これをＭ個の遺伝子からなる染色体とみなす。図１９中、Ｃｈは染色体、Ｇｓは遺伝子を示し、遺伝子の個数Ｍは５である。遺伝子の値Ａｉとしては、ある整数の組、ある範囲の実数値、単なる記号の列などを問題に応じて定める。図１９の例では、ａ〜ｅのアルファベットが遺伝子である。このようにして記号化された遺伝子の集合が個体の染色体である。
上記ステップｓ２３１１では次に、各個体が環境にどの程度適応しているかを表わす適応度の計算方法を決定する。その際、対象とする最適化問題の評価関数の値がより高い変数あるいはより低い変数ほど、それに対応する個体の適応度が高くなるように設計する。またその後に行う世代交代では、適応度の高い個体ほど、生き残る確率あるいは子孫を作る確率が他の適応度の低い個体よりも高くなるようにする。逆に、適応度の低い個体は、環境にうまく適応していない個体とみなして、消滅させる。これは、進化論における自然淘汰の原理を反映したものである。すなわち適応度は、生存の可能性という面から見て各個体がどの程度優れているかを表わす尺度となる。
遺伝的アルゴリズムでは、探索開始時においては、対象とする問題は一般にまったくのブラックボックスであり、どのような個体が望ましいかはまったく不明である。このため通常、初期の生物集団は乱数を用いてランダムに発生させる。従ってここにおける手順でも、ステップｓ２３１２で処理を開始した後のステップｓ２３１３では、初期の生物集団は乱数を用いてランダムに発生させる。なお、探索空間に対して何らかの予備知識がある場合は、評価値が高いと思われる部分を中心にして生物集団を発生させるなどの処理を行うこともある。ここで、発生させる個体の総数を、集団の個体数という。
次にステップｓ２３１４で、生物集団中の各個体の適応度を、先にステップｓ２３１１で決めた計算方法に基づいて計算する。各個体について適応度が求めた後、次にステップｓ２３１５で、次世代の個体の基となる個体を集団から選択淘汰する。しかしながら選択淘汰を行うだけでは、現時点で最も高い適応度を持つ個体が生物集団中に占める割合が高くなるだけで、新しい探索点が生じないことになる。このため、次に述べる交叉と突然変異と呼ばれる操作を行う。
すなわち、次のステップｓ２３１６では、選択淘汰によって生成された次世代の個体の中から、所定の発生頻度で二つの個体のペアをランダムに選択し、染色体を組み変えて子の染色体を作る（交叉）。ここで、交叉が発生する確率を、交叉率と呼ぶ。交叉によって生成された子孫の個体は、親にあたる個体のそれぞれから形質を継承した個体である。この交叉の処理によって、個体の染色体の多様性が高まり進化が生じる。
交叉処理後は、次のステップｓ２３１７で、個体の遺伝子を一定の確率で変化させる（突然変異）。ここで、突然変異が発生する確率を突然変異率と呼ぶ。遺伝子の内容が低い確率で書き換えられるという現象は、実際の生物の遺伝子においても見られる現象である。ただし、突然変異率を大きくしすぎると、交叉による親の形質の遺伝の特徴が失われ、探索空間中をランダムに探索することと同様になるので注意を必要とする。
以上の処理によって次世代の集団が決定され、ここでは次に、ステップｓ２３１８で、生成された次世代の生物集団が探索を終了するための評価基準を満たしているか否かを調べる。この評価基準は、問題に依存するが、代表的なものとして次のようなものがある。
・生物集団中の最大の適応度が、ある閾値より大きくなった。
・生物集団全体の平均の適応度が、ある閾値より大きくなった。
・生物集団の適応度の増加率が、ある閾値以下の世代が一定の期間以上続いた。
・世代交代の回数が、あらかじめ定めた回数に到達した。
上述の如き終了条件の何れかが満たされた場合は、探索を終了し、その時点での生物集団中で最も適応度の高い個体を、求める最適化問題の解とする。終了条件が満たされない場合は、ステップｓ２３１４の各個体の適応度の計算の処理に戻って探索を続ける。このような世代交代の繰り返しによって、集団の個体数を一定に保ちつつ、個体の適応度を高めることが出来る。以上が一般的な遺伝的アルゴリズムの概略である。
上で述べた遺伝的アルゴリズムの枠組みは、実際のプログラミングの詳細を規定しない緩やかなものとなっており、個々の問題に対する詳細なアルゴリズムを規定するものではない。このため、遺伝的アルゴリズムを本実施例のテンプレート最適化に用いるには、以下の項目をテンプレート最適化用に実現する必要がある。
（ａ）　染色体の表現方法
（ｂ）　初期個体集団の発生方法
（ｃ）　個体の評価関数
（ｄ）　選択淘汰方法
（ｅ）　交叉方法
（ｆ）　突然変異方法
（ｇ）　探索終了条件
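Fig. 20 shows how chromosomes are represented in this embodiment, for a template of n reference pixels and a referenceable range of 256 × 256 pixels. The size of the referenceable range is arbitrary in this embodiment, however. Any range may be used as long as it is fixed in advance.
The chromosome is divided into n parts, each corresponding to one reference pixel. Each part is further divided in two, specifying the x and y coordinates of the reference pixel. In the example the referenceable range is 256 in both the x and y directions, so each coordinate is expressed as an 8-bit binary number.
The coordinates of the reference pixels in the chromosome may also be expressed in Gray code instead of plain binary.
Furthermore, instead of representing the chromosome with binary or Gray-coded bits from {0, 1}, it may be built from integers giving the coordinates of the reference pixel positions. In that case the x and y coordinate values in Fig. 20(b) each take a value in {0, ..., 255}.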
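The bit-string representation described above (16 bits per reference pixel, 8 for x and 8 for y, optionally Gray-coded) might be encoded and decoded as in this sketch. The function names and the choice to store x before y, most significant bit first, are assumptions made for illustration.

```python
def to_gray(value):
    """Convert a non-negative integer to its reflected binary (Gray) code."""
    return value ^ (value >> 1)


def from_gray(code):
    """Invert to_gray."""
    value = 0
    while code:
        value ^= code
        code >>= 1
    return value


def encode_template(template, gray=False):
    """Flatten [(x, y), ...] with 0 <= x, y <= 255 into a list of bits (16 per pixel)."""
    bits = []
    for x, y in template:
        for coord in (x, y):
            code = to_gray(coord) if gray else coord
            bits.extend((code >> shift) & 1 for shift in range(7, -1, -1))
    return bits


def decode_template(bits, gray=False):
    """Rebuild [(x, y), ...] from a bit list produced by encode_template."""
    template = []
    for start in range(0, len(bits), 16):
        coords = []
        for offset in (0, 8):
            code = 0
            for bit in bits[start + offset:start + offset + 8]:
                code = (code << 1) | bit
            coords.append(from_gray(code) if gray else code)
        template.append(tuple(coords))
    return template


pixels = [(0, 0), (12, 200), (255, 1)]
chromosome = encode_template(pixels, gray=True)
print(len(chromosome), decode_template(chromosome, gray=True) == pixels)
```

The integer representation mentioned in the text corresponds to skipping the bit expansion and keeping the (x, y) tuples themselves as genes.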
本実施例では、図２８における画像解析ｓ２２において、分割画像ブロックごとに、分析的・解析的な手法を用いて初期テンプレートを決定している。そこで、この初期テンプレートを個体集団に埋め込んで、遺伝的アルゴリズムにおける初期個体集団を生成する。
図２６は、６つの染色体から構成される個体集団への、初期テンプレートの埋めこみ方の例を示す模式図である。同図（ａ）は、全ての染色体をランダムに初期化する方法であり、通常の遺伝的アルゴリズムと同じである。同図（ｂ）は、初期テンプレートを染色体表現に変換したものをマスター染色体とし、これを個体集団の第１染色体にコピーして、残る５つの染色体をランダムに初期化する方法である。マスター染色体を第１染色体だけでなく、個体集団内の任意の個数の染色体にコピーしてもよい。
図２６（ｂ）の方法で個体集団を初期化する場合、マスター染色体を個体集団にコピーする数が多いほど探索速度が向上するが、反面、個体集団の多様性が小さくなるために、最適なテンプレートを発見する以前に進化が止まってしまう可能性が高くなる。このような場合、数多くのマスター染色体を個体集団にコピーするのではなく、マスター染色体に突然変異を施した染色体をコピーすることで、個体集団の多様性を維持することができる。
同図（ｃ）は上記の方法の一例を示す模式図である。ここでは、マスター染色体を突然変異させた染色体を３つ生成し、個体集団の第１染色体にはマスター染色体をコピーし、第２〜第４染色体にはマスター染色体を突然変異させて得られた３つの染色体をそれぞれコピーし、残る２つの染色体だけをランダムに初期化する方法である。個体集団内の複数の染色体にマスター染色体をコピーしてもよく、マスター染色体を突然変異させて得られる染色体の個数も任意であり、どちらもパラメータで制御可能としておく。
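In this embodiment, the evaluation function F of an individual in the genetic algorithm is the reciprocal of the size of the compressed data obtained by compressing the input image with the template that the individual specifies. For example, if the compressed data is 1 KByte in size, the fitness is 1/(1 × 1024).
Because a genetic algorithm acts to maximize the evaluation function F, maximizing the reciprocal of the compressed data size means searching for the individual that represents the template minimizing the compressed data size.
The fitness of a chromosome need not be the reciprocal of the compressed data size. Any measure that grows as the compressed data shrinks will do. For example, the value obtained by subtracting the compressed data size from some fixed value can be used as the fitness.
Using the compressed data size to compute fitness is computationally expensive. Instead of compressing the whole input image for each fitness calculation, a reduced image thinned out every few pixels and every few lines may be compressed. Alternatively, part of the input image may be cropped and only the cropped region compressed to compute the fitness. Thinning and cropping lower the accuracy of the evaluation function but save a great deal of computation time.
This loss of accuracy can be limited by shifting the thinning pattern or the cropped region a little in each generation. For example, when an m × m region is cropped for evaluation and shifted by one bit vertically and horizontally at each change of generation, the (m-1) × (m-1) region that makes up most of the cropped region overlaps the region used in the previous generation (the shaded region in Fig. 3). A chromosome (template) rated highly in the previous generation is therefore likely to be rated highly on the new cropped region as well. Moreover, compared with not shifting the region, an area larger by (2 × m - 1) has been used for evaluation. Repeating this process brings an ever wider area into the evaluation and so limits the loss of accuracy caused by cropping.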
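The figure of (2 × m - 1) follows from subtracting the overlap from the area of the shifted region. The short derivation below spells this out. The symbols R_k (the cropped region in generation k) and G (the number of generation changes) are notation introduced here for illustration, and m = 256 is the crop size this embodiment uses.

```latex
\[
  m^2 - (m-1)^2 = 2m - 1,
  \qquad
  \text{e.g. } m = 256:\; 2 \cdot 256 - 1 = 511 \text{ new pixels per generation.}
\]
\[
  \Bigl|\,\bigcup_{k=0}^{G} R_k \Bigr| = m^2 + G\,(2m-1),
  \qquad
  R_k = [k,\,k+m) \times [k,\,k+m).
\]
```

The second line assumes a shift of exactly one pixel in both directions per generation, so the new row and column of each region have not been covered by any earlier region.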
上記の例では、ずらし量を１ビットとしているが、この値が大きいほど、より広い領域を評価に使用することができ、評価関数の精度低下も小さくなる。ただし、その分だけ、前世代で切り出した領域との重なりも小さくなってしまうため、探索速度が低下してしまう。よって、切り出し領域の大きさに応じて、ずらし量も変更する必要がある。
さらに、切り出しと間引きの両者を併用することで、評価関数の精度低下をさらに抑えることもできる。すなわち、対象としている分割画像ブロックの大きさに対して、切り出し領域の大きさが小さすぎる場合、ブロック全体を包含するように切り出し領域をずらし、走査するためには、非常に長い時間がかかる。その点、間引き方式では、隣接画素間の関係性が失われてしまうが、大域的な統計的性質を保持しやすいという利点がある。よって、両者を併用して組み合わせることで、お互いの欠点を補うことができる上、切り出し領域を小さくし、間引き間隔を広げることによる評価関数の精度低下を大きく抑制できる。
本実施例では、切り出し領域サイズを２５６×２５６、ずらし量を１ビットとしている。また、間引き間隔については、切り出し領域の縦，横サイズをそれぞれＨ、Ｗとすると、縦方向に（Ｈ／２５６）、横方向に（Ｗ／２５６）としている。さらに、間引きによるサンプリングの偏りを抑えるため、オフセット（１ライン目の位置）を縦横方向それぞれランダムに１ビットずつ移動させている。
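Entropy may be used as the evaluation function in place of the compressed data size. In information-theoretic terms, minimizing the entropy is equivalent to minimizing the compressed data size, so the two are indistinguishable in the ideal case. Thinning and cropping as described above may also be used in the entropy calculation.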
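One way to picture an entropy-based evaluation is to sum the ideal code length of every pixel given its context. The sketch below does this with static, whole-image statistics, which is a simplification: an adaptive arithmetic coder such as the one used for the actual compression updates its estimates as it goes. The function names and the `1 + bits` form of the fitness (which avoids division by zero) are assumptions for this sketch.

```python
import math
from collections import defaultdict


def conditional_entropy_bits(image, template):
    """Ideal code length in bits when each pixel is coded given its context.

    `template` is a list of (dx, dy) offsets; out-of-image pixels count as 0.
    """
    height, width = len(image), len(image[0])
    counts = defaultdict(lambda: [0, 0])  # context -> [number of 0s, number of 1s]
    for y in range(height):
        for x in range(width):
            context = []
            for dx, dy in template:
                px, py = x + dx, y + dy
                inside = 0 <= px < width and 0 <= py < height
                context.append(image[py][px] if inside else 0)
            counts[tuple(context)][image[y][x]] += 1
    bits = 0.0
    for zeros, ones in counts.values():
        total = zeros + ones
        for c in (zeros, ones):
            if c:
                bits -= c * math.log2(c / total)
    return bits


def fitness(image, template):
    """Grows as the estimated code length shrinks (assumed form, not the patent's F)."""
    return 1.0 / (1.0 + conditional_entropy_bits(image, template))


stripes = [[0, 1], [0, 1], [0, 1]]
print(conditional_entropy_bits(stripes, []))         # no context: 6 bits
print(conditional_entropy_bits(stripes, [(0, -1)]))  # pixel above as context: fewer bits
```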
選択処理においては、集団から適応度に比例した確率で個体を選び出す作業を、集団の個体数分だけ行う（ルーレット選択）。これにより、新しい個体集団が生成される。ルーレット選択のほか、トーナメント選択やランク選択と呼ばれる手法を用いてもよい。
交叉処理では、集団からランダムに選ばれた２つの親個体Ａ、Ｂに対して、図２１（ａ）の説明図に示す方法を用いる。これは染色体をランダムな位置で座標値を一塊として部分的に入れ替える操作であり、一点交叉と呼ばれる手法である。図２１（ａ）では、Ｃｈ１およびＣｈ２が親個体Ａ、Ｂの染色体であり、ここにおける交叉処理では、これらの染色体を、ランダムに選んだ交叉位置ＣＰで切断する。図２１（ａ）の例では、左から２番目の遺伝子と３番目の遺伝子の間を交叉位置としている。そして、切断した部分的な遺伝子型を入れ替えることによって、染色体Ｃｈ３およびＣｈ４をそれぞれ持つ子個体Ａ’、Ｂ’を生成する。
なお、交叉時に対として選択された２つの染色体が全く同じビット列である場合、１つの染色体集団に同一の染色体が含まれていることによる探索速度低下を避けるため、一方だけに、後述する突然変異を施す。
ところで、テンプレートは図２０のように染色体表現されているため、同図（ｂ）に示すように、１６ビット単位で意味のある情報ブロックが染色体に含まれていると見なすことができる。そこで、交叉時には、この上記情報ブロックの切れ目以外の箇所にＣＰが選ばれる確率を低く抑えることにより、意味のある情報ブロックが交叉により壊されてしまうのを防ぐことができるようになり、探索速度が向上するものと期待される。
しかし、その反面、交叉によって新しい情報ブロックが生成される確率も低下するため、探索が収束しやすいという副作用が生じる。そこで、情報ブロックの切れ目以外の箇所にＣＰが選ばれる確率を、ＧＡによる探索の初期段階では大きくしておき、世代の進行に従って小さくし、最終的には全てのＣＰ候補位置が等確率で選択されるようにすることで、探索初期段階での意図しない収束を避けることができる。なお、以上の方法は、一点交叉だけでなく、二点交叉や多点交叉，一様交叉においても使用することが可能である。
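Search speed can be raised further by taking into account, during crossover, the reference pixels shared by the templates of the two paired chromosomes. Consider the template of Fig. 4(a), in which the square marked "?" is the pixel of interest and the shaded squares are reference pixels. This template is represented by the chromosome in Fig. 4(b) (values are shown per information block, and the crossover point CP is restricted to the boundaries between information blocks). The chromosome in (c) represents exactly the same template. Panel (d) shows the result of crossing these two chromosomes with the CP at the boundary between the second and third information blocks. Although the parents represent exactly the same template, the chromosomes produced by crossover are entirely different, which slows the search.
To avoid this, the information blocks corresponding to reference pixels that the two templates do not share are extracted from the chromosomes, crossover is applied to those parts only, and the results are written back into the original chromosomes. The operation is explained with reference to Fig. 5. Suppose the two chromosomes in Fig. 5(a) are to be crossed. They share reference pixels 19 and 28 (see Fig. 4(a) for the reference pixel numbers). Fig. 5(b) shows only the information blocks that remain after the blocks for these shared reference pixels are set aside. The extracted partial chromosomes are then crossed. Fig. 5(c) shows the result of one-point crossover with the CP at the boundary between the second and third information blocks. Finally, the extracted and crossed partial chromosomes are written back into the original chromosomes, giving the result in Fig. 5(d).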
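The extract, cross, and write-back steps can be sketched with chromosomes held as lists of reference pixel numbers. The two parents below are invented (only the shared pixels 19 and 28 come from the text), and the sketch assumes both parents have the same length and contain no repeated pixel.

```python
def crossover_unshared(parent_a, parent_b, cut):
    """One-point crossover applied only to the genes the two parents do not share.

    Each parent is a list of reference pixel numbers (one per information block).
    Shared pixels stay where they are; the unshared ones are extracted, crossed
    at index `cut`, and written back into their original slots.
    """
    shared = set(parent_a) & set(parent_b)
    slots_a = [i for i, gene in enumerate(parent_a) if gene not in shared]
    slots_b = [i for i, gene in enumerate(parent_b) if gene not in shared]
    sub_a = [parent_a[i] for i in slots_a]
    sub_b = [parent_b[i] for i in slots_b]

    new_a = sub_a[:cut] + sub_b[cut:]
    new_b = sub_b[:cut] + sub_a[cut:]

    child_a, child_b = list(parent_a), list(parent_b)
    for slot, gene in zip(slots_a, new_a):
        child_a[slot] = gene
    for slot, gene in zip(slots_b, new_b):
        child_b[slot] = gene
    return child_a, child_b


# Hypothetical parents that share reference pixels 19 and 28.
a = [19, 5, 28, 7, 11]
b = [3, 28, 9, 19, 14]
print(crossover_unshared(a, b, cut=2))

# Two orderings of the same template yield empty partial chromosomes,
# so the children equal the parents.
print(crossover_unshared([1, 2, 3], [3, 1, 2], cut=1))
```

When the extracted part is empty, the two parents encode the same template, which is the case in which the text applies mutation to one of them instead.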
なお、上記のテンプレート表現に基づく交叉において、抜き出された部分染色体の長さが０である場合、２つの染色体は全く同一のテンプレートを表現していることを示す。このような場合、前述のように、１つの集団に全く同一のテンプレートが含まれていることによる探索速度低下を避けるため、一方だけに、後述の突然変異を施すことにする。
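The mutation performed after crossover inverts the value of every gene of every chromosome with a probability given by the mutation rate. Fig. 21(b) shows an example: in chromosome Ch5, the value of the second gene has been inverted with the probability set by the mutation rate.
When the chromosome is expressed in integers rather than the binary values {0, 1}, bits cannot be inverted, so a normal random number drawn from the Gaussian distribution N(0, σ) is added to each gene of the chromosome. Other distributions, such as the Cauchy distribution, may be used instead of the Gaussian.
Mutation may also be applied per information block rather than per bit. In the case of Fig. 20(b), each information block consists of 16 bits and so represents an integer value from 0 to 65535. Mutation then increases or decreases each information block by ±(1 to α) at a fixed mutation rate. The smaller α is, the weaker the effect of mutation and the faster the GA search, but if α is too small the search converges in its early stages and a good template cannot be obtained. Conversely, a large α avoids unintended convergence but slows the search. α is therefore made large in the early stages of the search and gradually reduced as the generations advance. This suppresses unintended convergence early in the search while increasing search speed toward the end.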
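A block-wise mutation with a shrinking α might be sketched as follows. The linear schedule, the start and end values of α, and the wrap-around at 65536 are all assumptions chosen for illustration. The text only states that α starts large and is reduced as the generations advance.

```python
import random


def alpha_for_generation(generation, max_generations, alpha_start=64, alpha_end=1):
    """Assumed linear schedule: large alpha early, small alpha late."""
    fraction = generation / max(1, max_generations - 1)
    return max(alpha_end, round(alpha_start + (alpha_end - alpha_start) * fraction))


def mutate_blocks(chromosome, rate, alpha, rng):
    """With probability `rate`, shift each 16-bit information block by +/-(1..alpha)."""
    mutated = []
    for block in chromosome:
        if rng.random() < rate:
            delta = rng.randint(1, alpha) * rng.choice((-1, 1))
            block = (block + delta) % 65536  # assumed wrap-around at the range limits
        mutated.append(block)
    return mutated


rng = random.Random(0)
chromosome = [100, 200, 300]
for generation in (0, 50, 99):
    alpha = alpha_for_generation(generation, 100)
    print(generation, alpha, mutate_blocks(chromosome, 0.5, alpha, rng))
```

Applying the same idea separately to the x and y halves of each block, as the following paragraph of the text suggests, would only change what counts as a "block" in `mutate_blocks`.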
さらに、同様の処理を、各情報ブロックを構成するｘ座標とｙ座標のそれぞれに対して、別々に施すことも可能である。こうすることで、より微妙な突然変異率の調整が可能となる。
安定状態（Ｓｔｅａｄｙ　Ｓｔａｔｅ）遺伝的アルゴリズムを使用することも可能である。すなわち、集団中から最良の個体とランダムに選択された個体の２個体を選び出し、それらが交叉して生成される子個体の一方に突然変異を施したものを、集団中で最も評価の低い個体と置換する方式である。各世代で１個体しか置き換わらないため、少ない個体数でも、局所解に陥りにくいという性質を持つ。さらに、交叉と突然変異で生成された個体の適応度が、集団中で最も評価の低い個体の適応度よりも高い場合にのみ、置換を許すことにより、探索速度を高めることも可能である。
テンプレート最適化のために、遺伝的アルゴリズムの代わりに、進化戦略や山登り法、焼きなまし法、タブーサーチ、貪欲法などの所謂ブラインドサーチ手法を適用することもできる。特に山登り法と焼きなまし法は遺伝的アルゴリズムほどの高い探索能力を持たないが、処理が高速というメリットがある。なお、上記ブラインドサーチ方式において、評価関数として圧縮データの大きさとエントロピどちらを用いてもよい。
なお、進化戦略の参考文献としては、例えば、出版社Ｊｏｈｎ　Ｗｉｌｅｙ　＆　Ｓｏｎｓが１９９５年に出版した、Ｈ．Ｐ．Ｓｃｈｗｅｆｅｌ著の「Ｅｖｏｌｕｔｉｏｎ　ａｎｄ　Ｏｐｔｉｍｕｍ　Ｓｅｅｋｉｎｇ」がある。
また焼きなまし方の詳細は、例えば、出版社Ｊｏｈｎ　Ｗｉｌｅｙ　＆　Ｓｏｎｓが１９９５年に出版した、Ｅ．Ａａｒｔｓ　ａｎｄ　Ｊ．Ｋｏｒｓｔ著の「Ｓｉｍｕｌａｔｅｄ　Ａｎｎｅａｌｉｎｇ　ａｎｄ　Ｂｏｌｔｚｍａｎｎ　Ｍａｃｈｉｎｅｓ」に開示されている。
つづいて、図３１における局所探索ｓ２３２の説明を行う。
遺伝的アルゴリズムは、非常に強力な探索手法であるが、探索手続きの終盤で、最適解あるいは局所解近傍において探索速度が低下するという問題がある。すなわち、図２２に例示するように、探索の序盤から中盤にかけて、世代数が進むごとに適応度の増加率が低下してゆき、終盤では殆ど適応度が改善されなくなる。そこで、本実施例では、遺伝的アルゴリズムによる探索で得られた最良のテンプレートに対して、山登り方を用いて最終調整を行う。この処理を導入することにより、遺伝的アルゴリズムにおける終盤の無駄な探索を行わずにすむため、全体として探索速度が向上する。
なお、完全に単色のブロックにおいては、どのようなテンプレートを使用してもまったく同じ圧縮率となってしまうため、テンプレート最適化を行う必要性はまったく無い。そこで、本技術では、ステップｓ１の画像分割を行う際に、各ブロックが単色であるかどうかの検査を同時に行い、単色であると判明した際には、テンプレート最適化を省略する。なお、圧縮／伸張処理部内に専用の計算ユニットを配置し、テンプレート決定をプロセッサではなく、この専用計算ユニットで行ってもよい。
ここまでが、図２８のテンプレート探索ｓ２３に関する説明で、次にデータベース更新ｓ２４に関する説明を行う。
データベースは、圧縮符号化対象である画像データに関する先験的知識が明らかな場合に、画像解析ｓ２２やテンプレート探索ｓ２３を実行せずにテンプレートを決定するために用いられる。とくに印刷用画像の場合、ディザ法をベースとした画像生成手続きが、その画像生成ソフトウェアを作成したメーカーごとに決まっていて、それぞれのメーカー毎に、ディザマトリクスの作り方などに独自の特徴がある。一般にそのような特徴が公開されることはないが、本実施例のようにテンプレート探索の結果をデータベースに蓄積することで、テンプレート決定に関する重要な手がかりとして再利用できるようになる。また、データベースに登録された情報を初期状態としてテンプレート探索ｓ２３を行うことにより、より優れたテンプレートを発見することもある。
図１３は、データベース更新ｓ２４の処理手続きを示すフローチャートである。
まず、ステップｓ２４１において、図１６に構造例を示すテンプレートデータベースを検索し、全く同じ画像生成システムで生成された、解像度や線数やスクリーン角度の同じ画像に対するテンプレートが登録されているか否かをまず検索する。
もし登録されていない場合は、ステップｓ２４４において、新しく得られたテンプレートを新規に登録して処理を完了する。このように登録しておくことで、テンプレート最適化ｓ２のデータベース検索ｓ２１において、新しいテンプレートを再利用することができる。
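If step s241 finds that a template is already registered, step s242 checks whether the new template was obtained by template optimization starting from an initial template retrieved by the template database search. If so, the new template is expected to be of higher quality than the registered one, and step s243 replaces the registered template with it. Here, higher quality means that predictive coding with the template yields higher compression efficiency.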
以上の手続きで、分割された各ブロックごとに最適なテンプレートが決定され、次に、このテンプレートを用いて、各ブロックの圧縮処理が開始される。ブロック分割された画像データは、必要な部分だけを切り出され、リファレンスバッファへと送られる。ここで、必要な部分の大きさについて、その高さは、参照画素として取り得る、注目画素から垂直方向に最も離れた画素までのライン数、その幅は、画素から水平方向に最も離れた両端の画素の距離であるものとする。
つづいて、コンテクスト抽出機において、テンプレートに基づいて、リファレンスバッファからコンテクスト（すなわち、参照画素の値により構成されるベクトルデータ）が抽出される。
コンテクストと、注目画素の値は、それぞれ後段のＦＩＦＯを介して、圧縮用のＭＱ−Ｃｏｄｅｒ（Ｃ）へと送られ、圧縮データが計算される。なお、図３５のＭＱ−Ｃｏｄｅｒ（Ｄ）は伸張用のＭＱ−Ｃｏｄｅｒである。この例では、実装時に容易に高速化できるように、圧縮用のＭＱ−Ｃｏｄｅｒ（Ｃ）と伸張用のＭＱ−Ｃｏｄｅｒ（Ｄ）を別に搭載しているが、もともとＭＱ−Ｃｏｄｅｒは可逆処理が可能であるため、ＬＳＩのゲート数を小さくするなどの目的で、圧縮と伸張の両方を実行可能なＭＱ−Ｃｏｄｅｒを１つだけ使用してもよい。なお、ＭＱ−Ｃｏｄｅｒの構造や処理手続きはＪＢＩＧ２方式に関する国際規格において明確に定義されている。さらに、ＱＭ−ＣｏｄｅｒやＪｏｎｓｏｎ符号化器のように、ＭＱ−Ｃｏｄｅｒ以外のエントロピ符号化器を使用することも可能である。
ＭＱ−Ｃｏｄｅｒ（Ｃ）の処理が完了すると、リファレンスバッファへは、右へ１ビット分ずらした領域が切り出され、コンテクストの抽出、ＭＱ−Ｃｏｄｅｒ（Ｃ）での圧縮データ計算が行われる。以降、画像末端まで同じ処理が繰り返される。
画像末端まで達すると、今度は画像データの左端で、下へ１ビット分ずらした領域がリファレンスバッファへと切り出され、ブロック下端まで上記の手続きが繰り返される。
なお、リファレンスバッファへの領域切り出し、コンテクスト抽出、ＭＱ−Ｃｏｄｅｒ（Ｃ）での圧縮データ計算をパイプライン的に行うことで、上記の手続きを高速化することが可能である。
この処理は、リファレンスバッファを２つ用意し、それらを交互に並列動作させることで高速化することができる。図３６に基づいて説明する。ブロックの大きさは同図（ａ）であるものとする。
また、この例では、参照画素の最大可動範囲を８画素×４ラインとすると、後段のコンテクスト抽出機で必要とするサイズは、最低で８画素×４ラインなので、リファレンスバッファの高さは４ラインとなり、横幅は８画素以上であれば何画素でもよい。ここでは１６画素であるものとする。
手順では、まず、１つ目のリファレンスバッファにブロック左上端を読み込む（同図（ｂ）における白抜きの長方形で示す領域）。このリファレンスバッファの幅は、コンテクスト抽出に必要な領域の２倍の１６であるため、シフト演算を８回行うことで、８回分のコンテクスト抽出を行うことができる。そこで、この８回のコンテクスト抽出を行っている間に、もうひとつのリファレンスバッファに、右へ８ビット分ずらした領域を読み込む（同図（ｃ）における網掛けの長方形で示す領域）。そして、第１のリファレンスバッファが８回分のコンテクスト抽出に必要な領域データを供給し終わった後を継いで、第２のリファレンスバッファが領域データの供給を開始する。以降、この動作を交互に繰り返すことで、リファレンスバッファへの領域データの書き込み時間を完全に隠すことが可能となる。
なお、この場合のリファレンスバッファの幅を、コンテクスト抽出に必要な領域の何倍に設定するかということは、リファレンスバッファへの書き込み時間と、コンテクスト抽出に必要な時間に応じて、適切に設定しておく必要がある。
以上が、図３４で示す画像データ圧縮ハードウェアの動作手順である。伸張時は、これまでに説明した圧縮手順の逆の動作を行うことで、原画像が復元される。
Embodiment 2
図３２は、並列化処理を大幅に導入して、圧縮および伸張処理の高速化を図ったハードウェア構成のブロック図である。なお、同図中、圧縮伸張処理ユニットは、図３５から、画像データメモリと制御レジスタを取り除いたものと等価な回路である。また、本構成では、ホスト計算機から画像データおよび圧縮データをＤＭＡ転送で授受するため、ＤＭＡプロセッサを使用している。
この大きな特徴は、画像メモリに読み込まれた画像データを強制的に垂直方向に分割し、それぞれを並列に独立して圧縮する点にある。同図は２並列の例であるため、圧縮および伸張に要する時間が約１／２となる。画像メモリ、圧縮伸張ユニット、圧縮データ入出力ＦＩＦＯをｎ個ずつ用意することでｎ並列とすることができ、動作速度も約ｎ倍になる。
同図中、２つの画像データメモリは、画像入出力ＦＩＦＯからは１つの連続したメモリとして見える。そのため、圧縮時には、実施例１と同様に画像データを１ラインずつ逐次的に書き込んでゆくことになる。
圧縮時における、画像ブロック分割とテンプレート最適化については、垂直方向に強制分割された画像領域ごとに独立で行ってもよいし、１つの大きなブロックと見なして、実施例１と全く同じ手続きで画像ブロック分割とテンプレート最適化だけを行い、実際の圧縮処理だけを各領域で独立に並行処理してもよい。前者の方が高い圧縮率を得られるものと期待されるが、後者の方が早い処理を実現できる。
なお、このハードウェア構成を用いた伸張時には、画像メモリ１と画像メモリ２で同一ラインの処理が完了していることを必ず確認してから、当該ラインを画像入出力ＦＩＦＯへ書き出す必要がある。
Embodiment 3
これまでに述べた実施例による適応型予測符号化および復号化方法をコンピュータによって実行するためのプログラムを、ハードディスク、フレキシブルディスク、ＣＤ−ＲＯＭまたはＤＶＤ−ＲＯＭ等の記録媒体に記録しておけば、このような記録媒体をコンピュータに読み込ませることにより、コンピュータを利用して、実施例１による適応型予測符号化および復号化方法を簡単に実施することができる。
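As shown in FIG. 23, the adaptive predictive encoding program causes the computer to execute:
(1) a step of reading the image data to be compressed (s601);
(2) a step of generating and outputting the header of the compression-encoded data (s602);
(3) a step of dividing the input image data into characteristic blocks (s603);
(4) a step of generating an optimal template for each block (s604);
(5) a step of generating a context for every pixel in the block, in scanning order (s605);
(6) a step of generating compressed data using every pixel in the block and the context generated for that pixel in step s605 (s606); and
(7) a step of combining the block information, template information, and compressed data obtained in the preceding steps to generate and output the compression-encoded data (s607).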
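In general, software execution often has more memory available than hardware. Instead of reading data and dividing the image simultaneously as in Example 1 (one pass), the program may therefore read all of the image data first, divide it into blocks, and then perform compression encoding block by block (two pass).
FIG. 24(a) is a schematic diagram showing an example format of the compression-encoded data when the image data to be compressed has been divided into n blocks. It consists of a header portion and the compression information for the n image blocks. The header portion records the size of the image data to be compressed (height and width), the resolution, the line count, the screen angle, the name and version of the image generation system, and so on.
(b) of the figure is a schematic diagram showing an example structure of the compression information for the i-th image block. It consists of a block header that records the size and position of the block (block information), the template used to compress and encode the corresponding image data, the compressed data, and a terminator symbol marking the end of the i-th compression information.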
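As shown in FIG. 25, the decoding program causes the computer to execute:
(1) a step of reading the compression-encoded data (s701);
(2) a step of analyzing the compression-encoded data, separating it into the overall size of the decompressed image (height and width) and the compression information for each of the divided blocks, and extracting the template used to compress each block and the compressed data (s702);
(3) a step of sequentially generating contexts from the beginning of the decompressed image data (s703);
(4) a step of generating decoded data from the compressed data using the contexts (s704); and
(5) a step of combining the decoded data for each block to generate a single set of decompressed image data (s706).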
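The present invention works effectively for all ordinary binary images (character images, line images, images binarized by dithering or error diffusion, mixtures of these, and so on). It can also be applied to multi-level images by decomposing them into multiple binary images through bit slicing or image-width expansion. Bit slicing means, for example in the case of FIG. 27(a), where each pixel is represented by 8 bits, decomposing one 8-bit image into eight binary images, from the binary image made up of only the first bit of each pixel to the binary image made up of only the eighth bit.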
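Bit slicing as described above can be sketched in a few lines. This is a minimal illustration, assuming the image is a list of rows of integers; the function names and the choice to put the most significant bit in the first plane are assumptions, not taken from the source.

```python
def bit_planes(image, bits=8):
    """Split a multi-level image into `bits` binary images (bit planes).

    planes[0] holds the most significant bit of every pixel; this ordering
    is an assumption made for the sketch.
    """
    return [
        [[(px >> b) & 1 for px in row] for row in image]
        for b in range(bits - 1, -1, -1)
    ]


def merge_planes(planes):
    """Inverse of bit_planes: rebuild the multi-level image from its planes."""
    bits = len(planes)
    height, width = len(planes[0]), len(planes[0][0])
    return [
        [sum(planes[k][y][x] << (bits - 1 - k) for k in range(bits))
         for x in range(width)]
        for y in range(height)
    ]


image = [[0, 255, 128], [1, 127, 64]]
planes = bit_planes(image)
```

Each of the eight planes is a binary image and could be passed to the binary encoder separately; `merge_planes` shows that the decomposition loses no information.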
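Bit slicing produces binary images that range from ones containing many high-frequency components to ones consisting mainly of low-frequency components. Because images dominated by high-frequency components are generally hard to compress, each pixel value may be converted to reflected binary (Gray code) before bit slicing, which resolves this problem. Furthermore, when the original image uses a limited set of colors, efficient bit slicing can be achieved by reordering the binary bit assignment based on the correlation between the colors.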
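The benefit of converting to Gray code before bit slicing can be seen from a short example. The sketch below uses the standard reflected-binary conversion `v ^ (v >> 1)`; the helper names are hypothetical.

```python
def to_gray(v):
    """Convert a non-negative integer to its reflected binary (Gray) code."""
    return v ^ (v >> 1)


def from_gray(g):
    """Convert a Gray code back to the ordinary binary value."""
    v = 0
    while g:
        v ^= g
        g >>= 1
    return v


def differing_bits(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


# Neighbouring grey levels 127 and 128 differ in all 8 bits in plain binary,
# so every bit plane changes at that edge; in Gray code only one plane changes.
plain = differing_bits(127, 128)
gray = differing_bits(to_gray(127), to_gray(128))
```

Because adjacent levels always differ in exactly one Gray-code bit, smooth gradations disturb fewer bit planes, which is why the planes tend to be easier to predict.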
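Image-width expansion means, for example in the case of FIG. 27(b), again for a 256-level (8-bit) image as in the example above, virtually making the image width eight times larger and simply laying out the bits that make up each pixel side by side on the same line.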
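Image-width expansion can be sketched as follows. The function name and the most-significant-bit-first ordering within each pixel are assumptions made for illustration.

```python
def expand_width(image, bits=8):
    """Turn each multi-level pixel into `bits` binary pixels on the same line.

    The resulting binary image has the same height and `bits` times the width.
    Bits are emitted most significant first (an assumption of this sketch).
    """
    return [
        [(px >> b) & 1 for px in row for b in range(bits - 1, -1, -1)]
        for row in image
    ]


expanded = expand_width([[5, 255]])
```

Unlike bit slicing, this yields a single binary image, so the existing one-image encoding path can be reused unchanged.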
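The invention has been described above on the basis of the illustrated examples, but it is not limited to those examples and also covers other configurations and methods that a person skilled in the art could readily devise within the scope of the claims. For example, although the examples use image data as the input data, the data encoding scheme of the present invention is applicable to any data with correlation between bits, including text files and the data files of any application.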
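Industrial applicability
In the present invention, dividing the input image information into homogeneous regions allows the same template to be used unconditionally within each region. Moreover, appropriate image division enables fine-grained template optimization and improves the accuracy of pixel value prediction within each divided image block, which in turn improves overall compression efficiency.
Unlike conventional methods, the present invention generates highly accurate templates by performing statistical processing and analysis while hypothesizing the dither method that generated the image to be compressed. In addition, using a database that stores the relationship between anticipated dither methods and templates makes it possible not only to improve compression encoding efficiency but also to dramatically shorten the time required to determine a template.
Furthermore, starting from the template determined in this way, applying an artificial intelligence technique (such as a genetic algorithm) and adjusting adaptively to the picture pattern of the image yields a template that contributes to higher prediction accuracy. Template determination is also sped up by cutting out only part of the image without impairing its global characteristics.
Moreover, by feeding back the obtained template and updating the database, a data compression system that becomes smarter the more it is used can be built. That is, by repeatedly processing image data of a similar nature, templates that contribute to high prediction accuracy can be obtained quickly even without applying artificial intelligence techniques.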
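[Brief description of the drawings]
FIG. 1 is an explanatory diagram showing various known templates.
FIG. 2 is an explanatory diagram showing the template used in the JBIG scheme.
FIG. 3 is a schematic diagram of the process of shifting the cut-out region.
FIG. 4 is an example in which the same template is represented by different chromosomes.
FIG. 5 is an explanatory diagram of crossover based on the template representation.
FIG. 6 is a block diagram of the transmitting side of an image transmission apparatus according to one embodiment of the present invention.
FIG. 7 is a block diagram of the receiving side of an image transmission apparatus according to one embodiment of the present invention.
FIG. 8 is an operation flowchart of the template optimization scheme according to one embodiment of the present invention.
FIG. 9 is an operation flowchart of the image division scheme according to one embodiment of the present invention.
FIG. 10 is a flowchart showing the procedure for horizontal block division.
FIG. 11 is a flowchart showing the procedure for vertical block division.
FIG. 12 is an operation flowchart of the genetic algorithm.
FIG. 13 is an operation flowchart of the template data update according to one embodiment of the present invention.
FIG. 14 is a schematic diagram showing the method of dividing an image into blocks.
FIG. 15 is a schematic diagram showing the method of calculating the joint probability density.
FIG. 16 is an explanatory diagram showing an example structure of the template database used in the image analysis scheme.
FIG. 17 is an explanatory diagram showing the operating principle of the correlation analysis performed in the image analysis scheme.
FIG. 18 is an explanatory diagram showing the operating principle of the periodicity estimation performed in the image analysis scheme.
FIG. 19 is a schematic diagram showing the relationship between chromosomes and genes.
FIG. 20 is an explanatory diagram of the chromosome representation in the genetic algorithm.
FIG. 21 is an explanatory diagram showing the operating principle of crossover and mutation in the genetic algorithm.
FIG. 22 is an example of a typical learning curve of a genetic algorithm.
FIG. 23 is a flowchart showing the operating procedure of the encoding program.
FIG. 24 is a schematic diagram showing an example format of the compression-encoded data.
FIG. 25 is a flowchart showing the operating procedure of the decoding program.
FIG. 26 is a schematic diagram showing an example method of initializing the population in the genetic algorithm.
FIG. 27 is a schematic diagram showing example methods for making the present invention applicable to multi-level images.
FIG. 28 is a flowchart showing the operating procedure of template optimization.
FIG. 29 is a flowchart showing the operating procedure of the database search.
FIG. 30 is a flowchart showing the operating procedure of image analysis.
FIG. 31 is a flowchart showing the operating procedure of the template search.
FIG. 32 is a block diagram of the parallelized image data compression hardware.
FIG. 33 is a schematic diagram showing how vertical division position candidates are determined.
FIG. 34 is a block diagram of the image data compression hardware.
FIG. 35 is a block diagram of the compression/decompression processing unit.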
FIG. 36 is a schematic diagram showing the processing procedure of the parallelized reference buffers.
Technical field
The present invention relates to data encoding schemes that, for example, predict the value of each pixel of image information from the state of its surrounding pixels and encode the image information based on the prediction result. More specifically, it relates to a method for quickly and accurately optimizing the pattern of surrounding pixel positions that are referenced, in order to improve compression.
Background art
In the prediction encoding method, the higher the prediction accuracy of the value taken by each pixel, the higher the compression effect. The prediction accuracy is improved by increasing the number of pixels (reference pixels) referred to at the time of prediction. In addition, by optimizing the position pattern of the reference pixel, the prediction accuracy can be improved.
FIG. 2 shows the reference pixel position pattern of the JBIG (Joint Bi-level Image Experts Group) scheme, the international standard for encoding binary images, in which each pixel takes only the value 0 or 1. (Hereinafter, such a reference pixel position pattern is called a template.)
In the figure, the shaded square indicates the pixel of interest, and the squares labeled p1 to p9 and A indicate the reference pixels. Of the ten reference pixels, the one labeled A is called the AT (Adaptive Template) pixel, and it is allowed to move to any position within the region of approximately 256 × 256 pixels shown in the figure, depending on the nature of the image. However, finding the optimal position in such a large region is very difficult and computationally expensive, so many systems that implement the JBIG scheme restrict the AT pixel to the range of eight pixels marked with crosses in the figure. The JBIG recommendation suggests selecting the surrounding pixel with the strongest correlation as the way to determine the AT pixel position.
In general, when compressing and encoding images such as characters or figures, a template in which all reference pixels are densely arranged near the pixel of interest is effective for improving prediction accuracy (Kato et al., "Adaptive Markov Model Coding of Binary Images by Dynamic Selection of Reference Pixels," Transactions of the Institute of Electronics, Information and Communication Engineers, vol. J70-B, no. 7). The JBIG template therefore works well for such images: the value of the pixel of interest can be predicted with high accuracy, and a high compression ratio is obtained as a result.
For natural images, however, higher prediction accuracy is often obtained by dispersing the reference pixels over a wide range, and how they should be dispersed depends heavily on the properties of the image. For example, in an image binarized by a cluster-type dither method, strongly correlated pixels appear at a fixed period. High compression efficiency therefore cannot be obtained unless the reference pixel arrangement accounts not only for the characteristics of the picture pattern but also for the periodicity introduced by the dither method. Images generated by other dither methods, such as spiral, Bayer, and error diffusion, also have their own characteristics, and a template suited to each must be used to achieve high compression efficiency.
However, it is very difficult to determine which dither method was used to produce a given binarized natural image. Moreover, once the characteristics arising from the picture pattern are also taken into account, the optimal template cannot be obtained by simple calculation. Many schemes have therefore been proposed for determining a template that yields high compression efficiency.
For example, the method of Kato et al. of Utsunomiya University and the method of U.S. Pat. adopt an approach in which, when the value satisfies a certain condition, surrounding pixels having larger values are incorporated into the template as reference pixels in order.
Japanese Patent Application Laid-Open No. 6-90363 (Ricoh), in addition to the method of sequentially changing the template as described above, describes a method in which the strength of correlation with surrounding pixels is computed over the entire image before compression and the template is determined on that basis, as well as a method of selecting a template suited to the image information from templates prepared in advance.
Methods that rely only on the strength of correlation with the pixel of interest, like the three above, cannot determine an optimal template for an image binarized by a cluster-type dither method. Although the correlation between multiple clusters is what actually needs to be considered, the correlation between pixels within a single cluster is strong, so the template becomes dense as shown in FIG.
On the other hand, JP-A-5-30362 (Fujitsu) describes a method that takes as the periodicity the distance between the run leading up to the pixel of interest (a run being a region in which the same pixel value continues on one line) and the immediately preceding run of the same color. However, this method does not work effectively for images that have no periodicity.
Japanese Patent Application Laid-Open No. H11-243491 (Mitsubishi Heavy Industries) optimizes the template using multiple regression analysis and a genetic algorithm whose evaluation function is the compression ratio, in an attempt to handle both the properties arising from the dot structure and those arising from the picture pattern of the image. A genetic algorithm is a computational method modeled on the evolution and adaptation of living things in nature, and is a powerful search technique from artificial intelligence. It makes it possible to select an appropriate template from among an enormous number of possibilities.
However, this method is extremely slow, and the reason lies in how the genetic algorithm is implemented. A genetic algorithm prepares a population of multiple candidate solutions, evaluates each of them, and generates a new population of candidates based on the evaluations. This constitutes one generation, and the process is repeated over generations until a stopping condition is met. One run therefore requires [population size] × [number of generations] evaluations.
In the method disclosed in Japanese Patent Application Laid-Open No. H11-243493, the evaluation is the compression ratio, so each evaluation requires the image data to be compressed to be encoded once to obtain that ratio. In other words, optimizing a template by this method takes [population size] × [number of generations] times as long as compression without template optimization. For example, with a population size of 30 and 100 generations, obtaining an optimized template takes 3000 times as long as when no genetic algorithm is used.
In an image transmission system such as a facsimile machine, transfer of the uncompressed data would finish while the genetic algorithm is still spending this enormous amount of time optimizing the template. Likewise, compressing images of several hundred gigabytes, such as image data for printing, would take an impractically long computation time.
The object of the present invention is to solve the above problem in predictive coding by providing a fast adaptive adjustment method for the template (reference pixel position pattern) that consistently achieves high data compression efficiency for various kinds of image information.
Disclosure of the invention
To this end, the present invention (A) divides the image based on the local similarity of the input image information, (B) optimizes the template for each divided region using a database, an analysis process, and a speed-up mechanism, and (C) updates the database with the optimization result to improve compression efficiency and speed from the next time onward. These elements are described in order below.
First, (A) is described. Dividing the input image information appropriately means dividing the image into regions that are homogeneous internally and differ in their properties from adjacent regions. FIG. 14 shows an example of such a division. This allows the same template to be used unconditionally within a region. Moreover, appropriate image division enables fine-grained template optimization and improves the accuracy of pixel value prediction within each divided image block, which improves overall compression efficiency.
The method of determining the division method is roughly classified into one pass and two passes. The two-pass method is a method in which an entire image is first read, an analysis is performed on the entire image, a division method is determined, and the image is read again for each divided region. Since information on the entire image can be used, a highly accurate division method can be obtained, but data must be read twice, which causes a problem that the processing speed is reduced. Conversely, in one pass, the way of division is determined sequentially while reading the image. Since processing that reflects the properties of the entire image cannot be performed, optimal image division cannot always be performed, but a reduction in processing speed can be avoided. In the present invention, appropriate image division is performed by analyzing image data read in the raster direction line by line based on the one-pass method. Details and a specific processing method will be described in an embodiment described later.
Next, (B) is described. Unlike conventional methods, the present invention generates a highly accurate template by performing statistical processing and analysis while hypothesizing the dither method that generated the image to be compressed. In addition, by using a database that stores the relationship between anticipated dither methods and templates, it is possible not only to improve compression encoding efficiency but also to dramatically shorten the time required to determine a template. Furthermore, starting from the template determined in this way, an artificial intelligence technique (such as a genetic algorithm) is applied to adjust it adaptively to the picture pattern of the image, which yields a template that contributes to higher prediction accuracy. Template determination is also sped up by cutting out only part of the image without impairing its global characteristics.
Finally, (C) is described. By feeding back the template obtained in (B) and updating the database, a data compression system that becomes smarter the more it is used can be built. In other words, by repeatedly processing image data with similar properties, a template that contributes to high prediction accuracy can be obtained quickly without applying artificial intelligence techniques. This mechanism can be regarded as an expert system that learns autonomously.
BEST MODE FOR CARRYING OUT THE INVENTION
Example 1
Hereinafter, embodiments of the present invention will be described in detail with reference to the accompanying drawings. FIG. 6 is a block diagram of the transmitting side of an image transmission apparatus according to one embodiment of the present invention. In the figure, the image buffer 3 temporarily stores the image data 1 input from the image data input line 2. Based on block information from the block determiner 6, the stored image data is output through the image data signal line 4 to the block determiner 6, the context generator 5, the template generator 8, and the compression encoder 11 (all described later).
The block determiner 6 divides the input image data into characteristic blocks so that the subsequent processing can be performed block by block, thereby improving compression encoding efficiency. Image data is input to the block determiner 6 via the image data signal line 4, the block division is determined from that image data, and block information (the size and position of each block) is output through the block information signal line 7. The block division can also be determined and controlled externally by inputting a block determiner control signal through the block determiner control signal line 17.
The template generator 8 determines an optimal template that can contribute to the maximum compression ratio for the input image data. Here, image data via the image data signal line 4 and compressed data via the compressed data signal line 12 are input. The compressed data is used as an index indicating the goodness of the template output by the template generator. Note that it is also possible to forcibly specify a template from the outside through the template generator control signal line 16 and perform compression encoding.
The context generator 5 uses the template input via the template signal line 9 to obtain the pattern (context) of values taken by the reference pixels in the image data input via the image data signal line 4.
Here, the context is a vector formed by extracting and arranging the values of the pixels at the positions designated by the template. For example, in a template such as that of FIG. 15A, where the shaded square indicates the pixel of interest and p1 to p4 indicate the positions of the reference pixels, the pixels corresponding to p1 to p4 in the figure take the values p1 = 0, p2 = 1, p3 = 1, and p4 = 1. The context is therefore <0111>, obtained by arranging these values.
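Context extraction as described above can be sketched as follows. The template layout (offsets relative to the pixel of interest), the sample image, and the function names are all hypothetical, since the figure itself is not reproduced here; pixels outside the image are treated as 0, which is also an assumption of the sketch.

```python
# Hypothetical template: p1, p2, p3 on the line above, p4 to the left.
TEMPLATE = [(-1, -1), (0, -1), (1, -1), (-1, 0)]  # (dx, dy) offsets


def pixel_at(image, x, y):
    """Value of pixel (x, y), or 0 when the position lies outside the image."""
    if 0 <= y < len(image) and 0 <= x < len(image[y]):
        return image[y][x]
    return 0


def extract_context(image, x, y, template=TEMPLATE):
    """Arrange the values of the template's reference pixels into a context."""
    return tuple(pixel_at(image, x + dx, y + dy) for dx, dy in template)


image = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
]
context = extract_context(image, 1, 1)  # reference pixels around pixel (1, 1)
```

With this sample image, the reference pixels of pixel (1, 1) take the values 0, 1, 1, 1, matching the shape of the <0111> context in the text.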
In the context generator, the processing speed can be improved by effectively using the shift register. In the example of the template in FIG. 15A, p1, p2, and p3 are arranged side by side. If the template is shifted one bit to the right, the value of p2 moves to p1, and the value of p3 moves to p2. That is, by using the shift registers corresponding to p1 to p3, only two pixels of p3 and p4 need to be newly read from the memory when the template is shifted to the right. As a result, the number of memory address operations can be reduced, so that the processing speed can be improved.
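The saving in memory reads from the shift register can be checked with a small model. The template layout, class name, and function names below are hypothetical; the sketch only assumes, as in the text, that p1, p2, and p3 sit side by side and that p4 lies elsewhere.

```python
# Hypothetical template: p1, p2, p3 side by side on the line above, p4 to the left.
TEMPLATE = [(-1, -1), (0, -1), (1, -1), (-1, 0)]


class CountingImage:
    """Binary image that counts how many pixel reads are issued."""

    def __init__(self, rows):
        self.rows = rows
        self.reads = 0

    def get(self, x, y):
        self.reads += 1
        if 0 <= y < len(self.rows) and 0 <= x < len(self.rows[y]):
            return self.rows[y][x]
        return 0  # outside the image is treated as 0 (assumption)


def contexts_naive(img, y):
    """Read all four reference pixels from memory for every pixel of line y."""
    return [tuple(img.get(x + dx, y + dy) for dx, dy in TEMPLATE)
            for x in range(len(img.rows[y]))]


def contexts_shift_register(img, y):
    """Reuse p2 and p3 as the next p1 and p2; read only p3 and p4 per step."""
    p1, p2, p3 = img.get(-1, y - 1), img.get(0, y - 1), img.get(1, y - 1)
    p4 = img.get(-1, y)
    out = [(p1, p2, p3, p4)]
    for x in range(1, len(img.rows[y])):
        p1, p2 = p2, p3               # shift: no memory access needed
        p3 = img.get(x + 1, y - 1)    # newly read
        p4 = img.get(x - 1, y)        # newly read
        out.append((p1, p2, p3, p4))
    return out


rows = [[0, 1, 1, 0, 1], [1, 0, 0, 1, 1]]
naive_img, fast_img = CountingImage(rows), CountingImage(rows)
naive = contexts_naive(naive_img, 1)
fast = contexts_shift_register(fast_img, 1)
```

For a line of width w, the naive version issues 4w reads while the shift-register version issues 4 + 2(w - 1), and both produce identical contexts.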
The compression encoder 11 compresses and encodes the image data input via the image data signal line 4, using the context input via the context signal line 10, to generate compressed data.
The compression-encoded data synthesizer 13 generates the compression-encoded data 15 by attaching to the compressed data the template used to generate it and the block information of the block it came from. Block information is input via the block information signal line 7, the template via the template signal line 9, and the compressed data via the compressed data signal line 12. In addition, attribute information about the image data (resolution, line count, screen angle, name and version of the system that generated the image data, creation date and time, and so on) may be input from outside the system to the compression-encoded data synthesizer 13 via the attribute information input line 18 and recorded in the header of the compression-encoded data 15.
Based on FIG. 6, the procedure up to the compression of the image data and the output of the compressed and encoded data 15 will be briefly described.
First, from outside the system, attribute information on image data is input to the compression-encoded data synthesizer 13 via the attribute information input line 18 and recorded as a header of the compression-encoded data. This attribute information need not be present.
Next, the image data 1 to be compressed is input to the image buffer 3 through the image data input line 2 and sent to the block determiner 6 through the image data signal line 4. At this point the size of the image data (height and width) is known, so it is sent to the compression-encoded data synthesizer 13 via the block information signal line 7 and recorded in the header of the compression-encoded data 15.
The block determiner 6 calculates how to divide the image into blocks. When the positions of text regions and photograph regions are already fixed at the page-editing stage, as with newspaper pages, the block determiner need not calculate the optimal block division, and skipping that calculation improves processing speed. That is, processing can be sped up by having the block determiner 6 read the page layout information through the block determiner control signal line 17. Higher compression efficiency can also be expected if the blocks are further subdivided appropriately on the basis of the page layout information. The specific processing method is described later.
The block division information determined by the above procedure is sent to the image buffer 3, and image data is sent from the image buffer 3 to the template generator 8 block by block. The template generator 8 calculates the optimal template for each block following the procedure shown in the flowchart of FIG. The processing method is described later. For a text region with a small font size, a template in which the reference pixels are densely arranged around the pixel of interest, as shown in FIG., is known to be effective, so optimization is unnecessary. Thus, when the optimal template for a particular block is known in advance, the template is determined automatically by the template generator control signal input via the template generator control signal line 16.
The context generator 5 scans the block-divided image data sent from the image buffer 3 through the image data signal line 4, using the template calculated by the template generator 8, and extracts a context for each pixel.
The compression encoder 11 generates compressed data using each pixel of the block-divided image data sent from the image buffer 3 through the image data signal line 4 and the context obtained by the above procedure. The compressed data is sent to the compression-encoded data synthesizer 13 through the compressed data signal line 12, where it is combined with the block information sent from the block determiner 6 through the block information signal line 7 and the template sent from the template generator 8 through the template signal line 9. The results are concatenated as per-block compression information and transmitted to the receiving side as the compression-encoded data 15 through the compression-encoded data output line 14.
Note that the compressed data generated from the compression encoder 11 is used for calculating the compression ratio in the template optimizing process in the template generator 8, and thus is also sent to the template generator through the compressed data signal line 12.
FIG. 7 is a block diagram of the receiving side of the image transmission apparatus according to one embodiment of the present invention. In the figure, the data analyzer 22 separates the compression-encoded data 20, input from the compression-encoded data input line 21, into a header portion and the compression information for each block (template, block information, and compressed data), and outputs the template and compressed data based on the block information.
The context generator 27 uses the template input via the template signal line 25 to obtain the pattern (context) of values taken by the reference pixels in the decoded data input via the decoded data signal line 29.
The compression code decoder 26 decodes the compressed data input via the compressed data signal line 24, using the context input via the context signal line 28, to generate decoded data.
Because the compression encoder / decoder 26 outputs the block-divided decoded data in separate pieces, like a patchwork, the original image cannot be restored unless the pieces are arranged correctly. The decompressed data synthesizer 30 therefore receives the decoded data from the compression encoder / decoder 26 through the decoded data signal line 29, arranges and stores it according to the block information received from the data analyzer 22 through the block information signal line 23, synthesizes the decompressed data, and outputs it to the outside through the decompressed data output line 28.
The procedure from decoding the compressed data to outputting the decompressed data is briefly described below with reference to FIG. 7.
First, the compression-encoded data 20 is input to the data analyzer 22 via the compression-encoded data input line 21. The format of the compression-encoded data 20 is described later with reference to FIG. 24. The data analyzer 22 analyzes the compression-encoded data 20 and separates and extracts the header portion and the compression information for each block (block information, template, and compressed data).
The attribute information of the original image recorded in the header portion is sent to the decompressed data synthesizer 30 and directly becomes the decompressed image header. Also, the size (vertical and horizontal size) of the original image data is always recorded in the header portion, and the decompressed data synthesizer 30 uses this information to restore the decompressed data from the decoded data.
The compressed data is sent to a compression encoder / decoder 26 via a compressed data signal line 24, and a corresponding template is sent to a context generator 27 via a template signal line 25. The compression encoder / decoder 26 receives the context from the context generator 27 via the context signal line 28, decodes the compressed data using the context, and outputs the decoded data via the decoded data signal line 29. The output decoded data is sent to the context generator 27, and is used to generate a context using the template.
The decompressed data synthesizer 30 receives the decoded data from the compression encoder / decoder 26 via the decoded data signal line 29 and the block information from the data analyzer 22 via the block information signal line 23. It then rearranges the decoded data, which is image data fragmented like a patchwork, and synthesizes the decompressed data. The decompressed data is output to the outside via the decompressed data output line 28.
FIG. 34 is an example in which the technology according to the present invention is also realized using hardware. Image data compression hardware using the present technology includes a processor, a main memory, various registers, and a compression / decompression processing unit. The operation of each element during compression is as follows. The processor performs block determination, template generation, and interface with the host computer. The main memory is used to store a block determination and template generation program to be processed by the processor, and a discovered template. Various registers are used as an interface between the processor and the compression processing unit. In the compression processing unit, context generation, compression encoding processing, and compression-encoded data synthesis are performed. The compression-encoded data may be synthesized by an external host computer.
The operation of each element at the time of decompression is as follows. The processor is hardly used at the time of decompression, but is used to analyze the data at the head of the compressed data and to interface with the host computer. The data analysis program executed by the processor is stored in the main memory as in the compression. The various registers function as an interface between the processor and the compression / decompression processing unit. The compression / decompression processing unit performs context generation, compression / code decoding processing, and synthesis of decompressed data. Note that the synthesis of the decompressed data may be performed by an external host computer.
The operation will be described below. First, at the start of compression, information such as the size and resolution of compression target image data stored in the host computer is sent to the processor and the compression / decompression processing unit of the image data compression hardware. The compression / decompression processing unit records these data in the header of the compression-encoded data. In the processor, block determination and template optimization are performed using these data in cooperation with the compression / decompression processing unit.
FIG. 8 shows an operation procedure of the adaptive template adjustment method by software. First, the overall flow will be outlined, and then each process will be described in detail.
First, image division s1 is performed on input image data in order to separately perform compression processing in units of regions having similar properties. Here, the entire image may be divided into blocks, or only a part may be divided into blocks, and the remaining part may be divided into blocks later. The former corresponds to two-pass processing, and the latter corresponds to one-pass processing. Here, one-pass processing is performed.
After a part of the image is divided into a plurality of blocks, template optimization s2 is performed on each of the divided image blocks.
After template optimization is performed on all the image blocks generated in the image division s1, block division of the remaining portion and template optimization for each block are performed until the end of the image is reached.
FIG. 9 shows an image division (s1) procedure for block determination. In the horizontal division s12, lines are sequentially read line by line into the image data memory of the compression / decompression processing unit shown in FIG. 35 until the processor detects a block boundary. The detection method of the block boundary is as follows.
FIG. 10 shows a flowchart of the horizontal division s12. First, in step s121, each line is scanned in turn; the lengths of the runs in which the same color continues are recorded, and the feature value of the line is computed at the same time. The recorded run lengths are used for vertical block division, and the line's feature value is used for horizontal block division. As the feature value, the average run length or the KL (Kullback-Leibler) information based on the joint probability density (the probability that each pixel within the referenceable range has the same value as the pixel of interest) is used.
The feature value of the line of interest is then compared with that of the preceding line (s122). If the difference between the two is greater than or equal to a predetermined threshold, the lines up to the one just before the line of interest are regarded as one block, and a block boundary is detected. Even if the threshold is not reached, the end of the image is treated as a block boundary when it is reached.
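The horizontal boundary detection in steps s121 and s122 can be sketched using the average run length as the feature value. The function names, the sample image, and the threshold value are hypothetical; lines are assumed to be non-empty lists of 0/1 values.

```python
def run_lengths(line):
    """Lengths of the runs of identical values in a non-empty line (step s121)."""
    runs, count = [], 1
    for prev, cur in zip(line, line[1:]):
        if cur == prev:
            count += 1
        else:
            runs.append(count)
            count = 1
    runs.append(count)
    return runs


def mean_run_length(line):
    """Feature value of a line: its average run length."""
    runs = run_lengths(line)
    return sum(runs) / len(runs)


def horizontal_boundaries(image, threshold):
    """Line indices at which a new block starts, plus the end of the image.

    A boundary is placed before line y when its feature value differs from
    that of line y - 1 by at least `threshold` (step s122).
    """
    boundaries = []
    prev = mean_run_length(image[0])
    for y in range(1, len(image)):
        cur = mean_run_length(image[y])
        if abs(cur - prev) >= threshold:
            boundaries.append(y)
        prev = cur
    boundaries.append(len(image))  # the image end always closes a block
    return boundaries


image = [[0] * 8, [0] * 8, [0, 1] * 4, [0, 1] * 4]
blocks = horizontal_boundaries(image, threshold=2)
```

Here the first two lines have an average run length of 8 and the last two an average of 1, so a boundary is detected before line 2 and the image end closes the second block.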
The joint probability density is described with reference to FIG. 15. In (a) of the figure, the four pixels p1 to p4 are all of the pixels within the referenceable range, and the hatched square is the pixel of interest. The second line in (b) of the figure is assumed to be the line currently of interest. To obtain the joint probability density, the line of interest is scanned as shown in (c) to (e) of the figure, and the probability that each reference pixel has the same value as the pixel of interest is calculated. In (c), the reference pixels p1 and p4 lie outside the data; if they are treated as 0, only p3 has the same value as the pixel of interest. Similarly, in (d) the three pixels p2, p3, and p4 have the same value as the pixel of interest, and in (e) the three pixels p1, p2, and p3 do. After the whole line has been scanned, the numbers of times that p1 to p4 had the same value as the pixel of interest are 7, 7, 7, and 6, respectively; these counts represent the strength of each reference pixel's correlation with the pixel of interest. Normalizing them gives the joint probability density: 7/27, 7/27, 7/27, and 6/27.
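The counting and normalization described above can be sketched as follows. The reference pixel offsets and the sample image are hypothetical (the figure's pixel data is not reproduced here), and pixels outside the image are treated as 0, as in the text's example.

```python
# Hypothetical referenceable range: p1, p2, p3 on the line above, p4 to the left.
OFFSETS = [(-1, -1), (0, -1), (1, -1), (-1, 0)]


def pixel_at(image, x, y):
    """Value of pixel (x, y), or 0 when the position lies outside the image."""
    if 0 <= y < len(image) and 0 <= x < len(image[y]):
        return image[y][x]
    return 0


def match_counts(image, y, offsets=OFFSETS):
    """For each reference pixel, count how often it equals the pixel of interest."""
    counts = [0] * len(offsets)
    for x in range(len(image[y])):
        target = image[y][x]
        for k, (dx, dy) in enumerate(offsets):
            if pixel_at(image, x + dx, y + dy) == target:
                counts[k] += 1
    return counts


def joint_probability_density(image, y, offsets=OFFSETS):
    """Normalize the match counts so that they sum to 1."""
    counts = match_counts(image, y, offsets)
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


image = [
    [0, 1, 1, 0],
    [1, 1, 0, 0],
]
counts = match_counts(image, 1)
density = joint_probability_density(image, 1)
```

For this sample the counts are 0, 2, 4, and 2, so the normalized values are 0, 0.25, 0.5, and 0.25, in the same way that 7, 7, 7, and 6 become 7/27, 7/27, 7/27, and 6/27 in the text.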
The procedure for calculating the KL information amount based on the joint probability density is as follows. Letting $X_i$ be the feature amount of the line of interest and $Y_i$ the feature amount of the previous line, the KL information amount KLD is calculated as
$$\mathrm{KLD} = \mathrm{KL}(X) + \mathrm{KL}(Y),$$
where
$$\mathrm{KL}(X) = \sum_i X_i \log\frac{X_i}{Y_i}, \qquad \mathrm{KL}(Y) = \sum_i Y_i \log\frac{Y_i}{X_i}.$$
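As a hedged sketch of this calculation, the symmetric KL information amount and the threshold comparison of step s122 might be written as below. The names `symmetric_kl` and `is_block_boundary` and the small constant `eps`, added only to avoid division by zero and log of zero, are assumptions not found in the description.

```python
import math

def symmetric_kl(x, y, eps=1e-12):
    """KLD = KL(X) + KL(Y) for two feature vectors of equal length."""
    kl_x = sum(xi * math.log((xi + eps) / (yi + eps)) for xi, yi in zip(x, y))
    kl_y = sum(yi * math.log((yi + eps) / (xi + eps)) for xi, yi in zip(x, y))
    return kl_x + kl_y

def is_block_boundary(x, y, threshold):
    """A boundary is detected when the difference reaches the threshold."""
    return symmetric_kl(x, y) >= threshold
```

Identical feature vectors give a value of 0, and the value grows as the two distributions diverge, so a larger threshold produces fewer horizontal block boundaries.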
When comparing the feature amounts, the comparison need not be limited to the feature amount of the previous line. The feature amounts of all lines from the start of the block to the previous line may be calculated, and their average, moving average, weighted average, or the like compared with the feature amount of the line of interest, which allows a more accurate feature amount comparison.
Besides the KL information amount, the minimum description length (MDL) criterion, AIC (Akaike's Information Criterion), a distance between vectors, or the like may be used as the measure in the feature amount comparison.
Furthermore, the feature amount calculation can be sped up by performing it only for the reference pixels included in the template used in the immediately preceding block, instead of for all pixels within the referenceable range.
After a horizontal block boundary is detected, the block can be further divided vertically. FIG. 11 shows a detailed procedure of the vertical division (s13).
First, in step s131, vertical division position candidates are selected. By referring to the lengths of the runs of the same color in each line recorded in step s121, vertical division position candidates can be selected for every line.
FIG. 33 shows an example of an image block having a width of 16 pixels and a height of 8 lines. In the first row, eight positions (a), (c), (d), (f), (i), (l), (n), and (o) are vertical division position candidates. Similarly, in the second row, four locations (b), (f), (j), and (n) are vertical division position candidates. By repeating the same process for all lines, a vertical division position candidate is selected for each line.
Subsequently, in step s132, each vertical division position candidate is checked to decide whether it is actually adopted as a vertical division position of the block. A majority method is used for this. Suppose that the selection reference rate, given in advance as a threshold, is 0.8. At (b), 7 of the 8 lines have a break between runs, a ratio of 7/8 = 0.875, which is at least 0.8, so the lines can be regarded as supporting (b) as a vertical division position. Therefore, (b) is selected as a vertical division position. Similarly, at (f) and (n), all lines have a break between runs at the same position (a ratio of 1.0), so these are also selected as vertical division positions of the block.
In the above example, the minimum run length is set to 1, and a break between runs having a length of 1 or more is selected as a vertical division position candidate. However, the minimum run length may be set to 2 or more. The smaller the minimum run length, the higher the sensitivity of vertical division position selection, and therefore the number of vertical divisions increases. Conversely, the longer the minimum run length, the lower the sensitivity, and thus the smaller the number of vertical divisions. Actually, in the case of the above example, when the minimum run length is 2 or more, the vertical division is not performed.
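The candidate selection and majority vote can be sketched as follows. This is an illustration under stated assumptions: the names `run_breaks` and `vertical_division_positions` are hypothetical, a position is the index of the first pixel of a run, and the minimum run length is interpreted as requiring both runs adjoining a break to be at least that long.

```python
def run_breaks(line, min_run=1):
    """Positions where a new run starts, between runs of at least min_run pixels."""
    runs = []  # (start, length) of each run of equal values
    start = 0
    for x in range(1, len(line) + 1):
        if x == len(line) or line[x] != line[start]:
            runs.append((start, x - start))
            start = x
    breaks = set()
    for i in range(1, len(runs)):
        # Assumed rule: both neighbouring runs must reach the minimum length.
        if runs[i - 1][1] >= min_run and runs[i][1] >= min_run:
            breaks.add(runs[i][0])
    return breaks

def vertical_division_positions(block, rate=0.8, min_run=1):
    """Adopt a candidate when the fraction of supporting lines reaches the rate."""
    votes = {}
    for line in block:
        for pos in run_breaks(line, min_run):
            votes[pos] = votes.get(pos, 0) + 1
    return sorted(p for p, v in votes.items() if v / len(block) >= rate)
```

With a rate of 0.8 and 8 lines, a position supported by 7 lines is adopted and one supported by 3 lines is not; raising `min_run` removes candidates and so reduces the number of vertical divisions.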
A dedicated calculation unit may be arranged in the compression / decompression processing unit, and the block determination based on the horizontal and vertical block division may be performed by the dedicated calculation unit instead of the processor.
After the block determination, template optimization (s2) is performed on each block. The calculation procedure of template optimization is as shown in FIG.
First, a database search is performed in step s21. In the database stored in the memory 8M in FIG. 6, in addition to templates registered in advance, templates obtained by a template search described later are recorded.
If a registered template is found as a result of the database search, the process proceeds directly to step s28 (s26).
If the corresponding template is not registered in the database, image analysis s22 is performed next. Here, a template suitable for the image data is obtained by statistical processing or by simple calculation with reference to the periodicity of the data. Subsequently, using the template obtained by the image analysis s22 as a search key, the database is searched again (s27). Whether or not a matching template is registered in the database, the process then proceeds to step s28, which determines whether to perform a template search.
In step s28, it is determined whether to perform a template search. The criteria will be described later. If it is determined that the template search is not to be performed, the process proceeds to step s25, and the template optimization processing is completed.
In the template search s23, the template is optimized using a search technique based on artificial intelligence. Finally, the template obtained by the template search is appropriately stored in the database. The template stored here is used in the above-described database search, and is used to quickly obtain a template with high accuracy.
This procedure is performed for all of the divided image blocks obtained by the block division. In order to improve the calculation speed, it is also possible to process some of the divided image blocks as a representative example instead of all the divided image blocks, and simply apply the result to another divided block. If it is necessary to further increase the speed, the image analysis and the template search can be omitted as necessary.
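The order of the steps s21 to s28 can be summarized in the following sketch. It is only an outline of the control flow: the class `TemplateDatabase`, the function `optimize_template`, and the callables `analyze`, `should_search`, and `search` are hypothetical placeholders, and the database lookups use exact matching where the description uses a degree of matching compared with a threshold.

```python
class TemplateDatabase:
    """Toy database keyed by attribute information and by input template."""
    def __init__(self):
        self.by_attributes = {}
        self.by_template = {}

    def lookup_by_attributes(self, attributes):
        return self.by_attributes.get(attributes)

    def lookup_by_template(self, template):
        return self.by_template.get(template)

    def register(self, attributes, input_template, result):
        self.by_attributes[attributes] = result
        self.by_template[input_template] = result

def optimize_template(block, attributes, db, analyze, should_search, search):
    input_template = None
    template = db.lookup_by_attributes(attributes)        # s21: database search
    if template is None:                                  # s26: nothing registered
        input_template = analyze(block)                   # s22: image analysis
        registered = db.lookup_by_template(input_template)  # s27: re-search
        template = registered if registered is not None else input_template
    if should_search(block, template):                    # s28: search needed?
        template = search(block, template)                # s23: template search
        key = input_template if input_template is not None else template
        db.register(attributes, key, template)            # store for later blocks
    return template                                       # s25: done
```

Once a block has been processed with a template search, later blocks with the same attributes or the same image analysis result obtain the stored template without repeating the search.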
Hereinafter, each processing step in FIG. 28 will be described in order.
FIG. 29 is a processing flow of the database search s21. In the database search, in step s211, attribute information is compared with the registered contents. FIG. 16 is an example of a template database structure. Here, the search keys are the system that generated the image data to be encoded, its version, the resolution of the image data, the number of lines, and the screen angle, and a registered template whose degree of matching is equal to or higher than a predetermined threshold is selected. The image analysis result (input template) is a search key for use in the database re-search s27 in FIG. 28, and is not used here. If a matching template is found as a result of the database search s21, the process proceeds from step s26 to step s28, which determines whether to perform a template search. The criterion used here is, in comparison with the case where a reference template is used, either (a) whether the compression ratio is higher than a certain threshold, or (b) whether the sum of the correlation strengths (described later) of the pixels included in the template is higher than a certain threshold. Usually (a) is used, but (b) may be used when the processing time is to be reduced even slightly.
Here, a default template of the JBIG or JBIG2 system is used as the reference template. Alternatively, the template with the second-highest degree of matching in the database search, or the template finally used in a block that has already been processed, may be used.
At this time, in order to reduce the calculation time, an image generated by cutting out only a part of the target block, or an image generated by thinning out rows and columns according to a fixed rule, may be used.
It is also possible for an operator or an external system to determine whether or not to forcibly perform the template search.
If it is determined in step s28 that the template search is to be performed, the template search is performed directly in step s23. If it is determined that the template search is not to be performed, the process proceeds to step s25 to complete the template optimization process. The template search will be described later.
If no matching template is found as a result of the database search s21, the flow advances to image analysis s22 via step s26 in FIG.
FIG. 30 is a processing flow of image analysis. Here, the dither analysis s221 is tried, and if it is possible, the process is terminated. If the dither analysis s221 is not possible, the correlation analysis s223 is performed and the process is terminated.
The dither analysis utilizes the periodicity caused by the dither matrix used when generating a dither image. The procedure is as follows. First, the correlation between every pixel in the referenceable range and the pixel of interest is calculated. Next, the pixel having the strongest correlation is selected from the surrounding pixels that are at least a certain distance from the target pixel.
If the image to be coded is a dither image generated using a dither matrix, pixels having strong correlation are arranged at the vertices of squares having a certain angle and a specific side length, so it is highly probable that the distance and the direction from the pixel of interest to the selected pixel satisfy the relationship shown in the figure. In the figure, the shaded square indicates the target pixel, the square marked with a cross indicates the pixel having the largest correlation strength among the pixels separated from the target pixel by a certain distance or more, and the blank squares indicate pixels expected to have strong correlation based on these relationships (these pixels are hereinafter referred to as ideally arranged pixels). Therefore, by selecting reference pixels from the ideally arranged pixels, a template can be determined with higher accuracy than by a method of selecting reference pixels based on simple correlation strength.
If the number m of ideally arranged pixels existing in the referenceable range is larger than the number n of reference pixels forming the template, n ideally arranged pixels are selected in ascending order of distance from the pixel of interest and used as the reference pixels constituting the initial template. For example, when the ideally arranged pixels are as shown in FIG. 18A and the number of reference pixels is n = 10, the ideally arranged pixels p1 to p10 are selected as the reference pixels.
Conversely, when the number m of ideally arranged pixels existing in the referenceable range is smaller than the number n of reference pixels constituting the template, the missing reference pixels are selected from the pixels adjacent to the target pixel and the pixels adjacent to the ideally arranged pixels, in ascending order of distance from the pixel of interest. For example, when the ideally arranged pixels are as shown in FIG. 18A and the number of reference pixels is n = 16, the ideally arranged pixels p1 to p13 are adopted for the template, and three more reference pixels are selected in order to make up the shortage.
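For the first case, where there are more ideally arranged pixels than reference pixels, one possible reading of the procedure is sketched below. The lattice construction is an assumption: the ideally arranged pixels are taken to be the points a·v + b·v⊥, where v = (dx, dy) is the offset of the most strongly correlated distant pixel and v⊥ is v rotated by 90 degrees, restricted to pixels that precede the target pixel in raster order. The names `ideal_pixels` and `initial_template` and the square range parameter `reach` are hypothetical, and v is assumed to be nonzero.

```python
def ideal_pixels(v, reach):
    """Vertices of the square lattice spanned by v, limited to already-coded pixels."""
    dx, dy = v
    px, py = -dy, dx  # v rotated by 90 degrees
    points = set()
    for a in range(-reach, reach + 1):
        for b in range(-reach, reach + 1):
            x, y = a * dx + b * px, a * dy + b * py
            # Negative y is a line above; y == 0 with x < 0 is to the left.
            already_coded = y < 0 or (y == 0 and x < 0)
            if already_coded and abs(x) <= reach and y >= -reach:
                points.add((x, y))
    # Ascending order of distance from the pixel of interest.
    return sorted(points, key=lambda p: (p[0] ** 2 + p[1] ** 2, p))

def initial_template(v, reach, n):
    """Take the n ideally arranged pixels nearest to the target pixel."""
    return ideal_pixels(v, reach)[:n]
```

When fewer than n ideally arranged pixels exist, this sketch simply returns them all; supplementing the shortage from adjacent pixels is not covered here.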
However, when the referenceable range is too narrow to include a sufficient number of ideally arranged pixels, or when the distance between the ideally arranged pixels with respect to the referenceable range is extremely large, pixels having a strong correlation strength generally gather near the target pixel. Therefore, there is a high possibility that all the reference pixels other than the ideally arranged pixels will be selected from the vicinity of the target pixel.
FIG. 18B shows an example in which the referenceable range is too narrow. Here, if the number of reference pixels is n = 16, there is a high possibility that all nine reference pixels other than the ideally arranged pixels p1 to p7 are selected from the vicinity of the target pixel. In FIG. 18B, the pixels indicated by blank squares without a subscript near the target pixel are selected as reference pixels to make up the shortage. In this case, however, the reference pixels are too dense, so the template cannot function effectively for a natural image.
To avoid such a situation, the template is determined by the method described below. First, among the pixels adjacent to the target pixel, the pixel having the strongest correlation with the target pixel is adopted as a reference pixel. Next, the ideally arranged pixels are selected in order of increasing distance from the pixel of interest, and for each of them the adjacent pixel having the strongest correlation with the pixel of interest is also adopted as a reference pixel. In the example of FIG. 18B, the procedure up to this point determines a total of 15 (= 1 + 7 × 2) reference pixels: one pixel adjacent to the target pixel, the seven ideally arranged pixels, and one adjacent pixel for each of them. If this is still fewer than the total number of reference pixels constituting the template, a second reference pixel is adopted from the pixels adjacent to the target pixel, and then a second reference pixel from the pixels adjacent to each ideally arranged pixel; this process is repeated until the number of adopted reference pixels reaches the total number of reference pixels constituting the template.
In the above dither analysis, the joint probability density calculated when performing image division may be used instead of the correlation strength, which can significantly reduce the calculation time.
With the above procedure, the dither analysis is completed. When the dither matrix is known in advance, the periodicity and the correlation pattern can be clearly detected from the size thereof, so that the calculation of the ideal pixel arrangement can be omitted.
In step s223 of FIG. 30, correlation analysis is performed when the dither analysis is not possible. When calculating the correlation strength, all the pixels constituting the divided image block are normally scanned, but part of the calculation can be omitted to increase the speed: the scan can be made every few pixels or every few lines, or limited to only a part of the divided image block.
In the correlation analysis s223, first, the correlation with the target pixel is calculated for all the pixels in the referenceable range. When the template is composed of n pixels, n pixels are selected in descending order of correlation strength and adopted as the initial template, and the image analysis procedure is completed.
In an image having a very strong high-frequency component, however, if the selection is based simply on the magnitude of the correlation strength, the reference pixels become concentrated around the target pixel, as shown in the figure. As described above, such a template is effective in a character area but is not appropriate for a natural image. It is therefore necessary to disperse the reference pixels widely while still basing the selection on pixels having strong correlation.
An example of such a method will be described with reference to the figure. In the figure, the shaded square is the pixel of interest, a blank square is a pixel having a correlation strength of 0, and a square containing a number is a pixel having that number as its correlation strength. In this example, the template is assumed to be composed of three reference pixels. If the reference pixels are selected simply by the magnitude of the correlation strength, the pixels having correlation strengths of 100, 90, and 80 constitute the template. However, a different template is generated by forcibly reducing the correlation strength of the pixels adjacent to each pixel selected as a reference pixel.
That is, assume that each time a pixel is selected as a reference pixel, the correlation strength of the pixels adjacent to it is reduced to 90%. FIG. 17 (b) shows the state after the two pixels having correlation strengths of 100 and 90 have been selected. The pixel that had a correlation strength of 80 in FIG. 17A is adjacent to both of the selected pixels, so its correlation strength decreases to 80 × 0.9 × 0.9 = 64.8. Therefore, the pixel having a correlation strength of 70 is selected as the third reference pixel in this case.
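The attenuation procedure can be sketched as follows. This is a minimal illustration with assumptions: the function name `select_dispersed` is hypothetical, correlation strengths are given as a dictionary keyed by (x, y) offsets, and "adjacent" is taken to mean the eight surrounding pixels.

```python
def select_dispersed(strengths, n, decay=0.9):
    """Pick n reference pixels, weakening the neighbours of each pick."""
    remaining = dict(strengths)  # work on a copy; the input is left unchanged
    chosen = []
    for _ in range(min(n, len(remaining))):
        best = max(remaining, key=lambda p: (remaining[p], p))
        chosen.append(best)
        del remaining[best]
        bx, by = best
        for (x, y) in remaining:
            # 8-neighbourhood adjacency is an assumption of this sketch.
            if max(abs(x - bx), abs(y - by)) == 1:
                remaining[(x, y)] *= decay
    return chosen
```

With strengths 100, 90, 80, and 70, where the pixel of strength 80 is adjacent to both of the first two picks, its strength falls to 64.8 and the pixel of strength 70 is chosen third, as in the example. Setting `decay` to 1.0 reproduces plain selection by correlation strength.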
In the above-described correlation analysis, the joint probability density calculated when performing image division may be used instead of the correlation strength, which can significantly reduce the calculation time.
The fact that a template matching the image analysis result (input template) used as the search key is registered in the database means that image data whose statistical properties are very similar to those of the image data currently being compressed has been compressed before. That is, by using the registered template, which is the template search result from that time, a template that functions more effectively than the image analysis result used as it is can be obtained while saving the calculation time required for the template search.
However, it is highly probable, but not guaranteed, that a template that works well for one image will also work well for another image whose statistical properties are very similar. For example, in a dither image processed by the error diffusion method, reference pixels determined by a statistical method are likely to be densely arranged, as shown in the figure. Therefore, when such a situation is expected in advance, the threshold for the degree of matching in the database search may be set high. If different thresholds are set according to the dither method, the above problem can be solved automatically.
Up to here, the operation of the image analysis s22 in FIG. 28 has been described. Subsequently, the database re-search s27 will be described.
Unlike the database search s21 described above, here only the image analysis result (input template) is used as a search key (see FIG. 16), and a registered template whose degree of matching is equal to or higher than a certain threshold is output.
After the database re-search, it is determined in step s28 whether or not to perform a template search. The processing procedure is as described above.
Next, the template search s23 will be described.
As exemplified in the flowchart of FIG. 31, the processing procedure of the template search s23 includes two-stage processing. In the first stage, the genetic algorithm s231 is executed using the template obtained in the image analysis s22 as an initial state, and the template is optimized. Then, in the second stage, the obtained template is further adjusted in the local search s232, and the processing is completed.
First, the genetic algorithm s231 will be described. References on the genetic algorithm include, for example, David E. Goldberg, "Genetic Algorithms in Search, Optimization, and Machine Learning," published in 1989 by ADDISON-WESLEY PUBLISHING COMPANY, INC.
In a general genetic algorithm, a population of virtual organisms having genes is first set up, and individuals that have adapted to a predetermined environment survive and increase their probability of leaving offspring according to their degree of fitness. The child then inherits the genes of the parent through procedures called genetic operations. By repeating such generational changes and evolving the genes and the population, individuals with high fitness come to dominate the population. The genetic operations used are gene crossover, mutation, and the like, which also occur in the reproduction of actual organisms.
FIG. 12 is a flowchart showing a schematic procedure of the genetic algorithm. Here, in step s2311, the chromosome of the individual is determined. That is, it is determined what data is to be transmitted and in what format from the parent individual to the offspring at the time of generation change. FIG. 19 illustrates a chromosome. Here, the variable vector x of the target optimization problem is represented by a sequence of M symbols Ai (i = 1, 2,..., M), and this is regarded as a chromosome composed of M genes. In FIG. 19, Ch indicates a chromosome, Gs indicates a gene, and the number M of genes is 5. As the gene value Ai, a certain set of integers, a certain range of real values, a simple symbol sequence, and the like are determined according to the problem. In the example of FIG. 19, the alphabets a to e are genes. The set of genes thus encoded is the chromosome of an individual.
Next, in step s2311, a method of calculating the fitness, which indicates how well each individual has adapted to the environment, is determined. The design is such that the higher (or lower) the value of the evaluation function of the target optimization problem, the higher the fitness of the corresponding individual. In the subsequent generation changes, individuals with higher fitness have a higher probability of surviving or producing offspring than individuals with lower fitness. Conversely, individuals with low fitness are regarded as not well adapted to the environment and are eliminated. This reflects the principle of natural selection in evolution. That is, the fitness is a measure of how superior each individual is from the viewpoint of its chance of survival.
In the genetic algorithm, the problem at hand is generally a complete black box at the start of the search, and it is entirely unknown what kind of individual is desirable. For this reason, the initial population is usually generated randomly using random numbers. In this procedure as well, after the processing is started in step s2312, an initial population is generated randomly using random numbers in step s2313. If there is some prior knowledge about the search space, the population may instead be generated mainly in the region considered to have high evaluation values. The total number of individuals generated is referred to as the population size.
Next, in step s2314, the fitness of each individual in the population is calculated based on the calculation method determined in step s2311. After the fitness has been determined for each individual, the individuals that form the basis of the next generation are selected from the population in step s2315. However, selection alone only increases the proportion of the individuals that currently have the highest fitness and does not generate new search points. Therefore, the operations called crossover and mutation, described below, are performed.
That is, in the next step s2316, pairs of individuals are randomly selected, with a predetermined frequency of occurrence, from the next-generation individuals produced by the selection, and their chromosomes are recombined to form child chromosomes (crossover). The probability that crossover occurs is called the crossover rate. The offspring generated by crossover inherit traits from each of the parents. This crossover process increases the diversity of the chromosomes and causes evolution.
After the crossover process, in the next step s2317, the genes of the individuals are changed with a certain probability (mutation). The probability that a mutation occurs is called the mutation rate. The rewriting of the content of a gene with a low probability is a phenomenon also seen in the genes of actual organisms. However, if the mutation rate is set too high, the traits inherited from the parents through crossover are lost, and the search becomes similar to a random search of the search space.
The next-generation population is determined by the above processing. Here, in step s2318, it is determined whether the generated next-generation organism population satisfies the evaluation criteria for ending the search. This evaluation criterion depends on the problem, but typical ones are as follows.
・ The maximum fitness value in the biological population has exceeded a certain threshold.
・ The average fitness of the whole population has exceeded a certain threshold.
・ Generations in which the rate of increase in the fitness of the biological population is below a certain threshold have continued for a certain period of time.
・ The number of generation changes has reached the predetermined number.
If any of the above termination conditions is satisfied, the search is terminated, and the individual with the highest fitness in the population at that time is taken as the solution to the optimization problem. If no termination condition is satisfied, the process returns to step s2314, the fitness of each individual is calculated, and the search is continued. By repeating such generational changes, the fitness of the individuals can be increased while the population size is kept constant. The above is the outline of the general genetic algorithm.
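The general procedure of steps s2313 to s2318 can be sketched as below. This is a generic illustration on bit-string chromosomes, not the template-specific algorithm: the function name `genetic_algorithm` and all parameter names and default values are assumptions, selection is done by roulette selection, and only two of the listed termination conditions (maximum fitness threshold and generation count) are implemented. The fitness function is assumed to return positive values and `n_genes` to be at least 2.

```python
import random

def genetic_algorithm(fitness, n_genes, pop_size=20, crossover_rate=0.6,
                      mutation_rate=0.01, max_generations=50,
                      target_fitness=None, seed=0):
    rng = random.Random(seed)
    # s2313: random initial population
    population = [[rng.randint(0, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    for _ in range(max_generations):
        scores = [fitness(c) for c in population]            # s2314
        # s2315: selection with probability proportional to fitness
        population = [list(c) for c in
                      rng.choices(population, weights=scores, k=pop_size)]
        # s2316: one-point crossover on consecutive (randomly drawn) pairs
        for i in range(0, pop_size - 1, 2):
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_genes - 1)
                a, b = population[i], population[i + 1]
                a[cut:], b[cut:] = b[cut:], a[cut:]
        # s2317: mutation, flipping each gene with a small probability
        for c in population:
            for j in range(n_genes):
                if rng.random() < mutation_rate:
                    c[j] ^= 1
        # s2318: termination check on the new generation
        if target_fitness is not None and \
                max(fitness(c) for c in population) >= target_fitness:
            break
    return max(population, key=fitness), population
```

A call such as `genetic_algorithm(lambda c: sum(c) + 1, n_genes=8)` searches for a chromosome with many 1 bits; the returned pair is the fittest individual of the final population and the population itself.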
The framework of the genetic algorithm described above is a loose one that does not specify actual programming details, and does not specify a detailed algorithm for each problem. Therefore, in order to use the genetic algorithm for the template optimization of the present embodiment, it is necessary to implement the following items for the template optimization.
(A) Chromosome representation method
(B) Method of generating initial population
(C) Individual evaluation function
(D) Selection method
(E) Crossover method
(F) Mutation method
(G) Search end condition
FIG. 20 shows a chromosome expression method in this example. The figure shows an example in which the number of reference pixels constituting the template is n and the referenceable range is an area of 256 × 256 pixels. However, the range of the referenceable range in the present embodiment is arbitrary, and any referenceable range may be used as long as it is determined in advance.
At this time, the chromosome is divided into n parts each corresponding to one reference pixel. Each part is further divided into two parts, and the x coordinate and the y coordinate position of the reference pixel are specified. In the example, since the size of the referenceable range is 256 in both the x and y directions, each is represented by an 8-bit binary number.
Note that the coordinate position of the reference pixel on the chromosome can be represented by a gray code instead of a binary number.
Further, instead of expressing the chromosome with a binary number of {0, 1} or a gray code, the chromosome can be constituted by an integer indicating the coordinates of the reference pixel position. That is, in this case, both the x coordinate value and the y coordinate value in FIG.
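The binary and Gray code chromosome representations can be illustrated as follows. This is a sketch under assumptions: the names `encode`, `decode`, `to_gray`, and `from_gray` are hypothetical, and each coordinate is taken to be an integer from 0 to 255 measured from the origin of the referenceable range.

```python
def to_gray(v):
    """Convert a non-negative integer to its Gray code."""
    return v ^ (v >> 1)

def from_gray(g):
    """Convert a Gray code back to the ordinary integer."""
    v = 0
    while g:
        v ^= g
        g >>= 1
    return v

def encode(template, bits=8, gray=False):
    """Concatenate the x and y coordinates of every reference pixel as bits."""
    chromosome = []
    for x, y in template:
        for c in (x, y):
            c = to_gray(c) if gray else c
            chromosome.extend((c >> i) & 1 for i in reversed(range(bits)))
    return chromosome

def decode(chromosome, bits=8, gray=False):
    """Recover the list of (x, y) reference pixel coordinates."""
    values = []
    for start in range(0, len(chromosome), bits):
        v = 0
        for bit in chromosome[start:start + bits]:
            v = (v << 1) | bit
        values.append(from_gray(v) if gray else v)
    return list(zip(values[0::2], values[1::2]))
```

A template of n reference pixels thus becomes a chromosome of 16n bits. With Gray code, neighbouring coordinate values differ in a single bit, so a one-bit mutation can move a reference pixel by one position.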
In the present embodiment, an initial template is determined for each divided image block by a statistical or analytical method in the image analysis s22 in FIG. 28. The initial template is therefore embedded in the individual population when the initial individual population of the genetic algorithm is generated.
FIG. 26 is a schematic diagram showing examples of how to embed an initial template into an individual population composed of six chromosomes. FIG. 26 (a) shows a method of randomly initializing all chromosomes, which is the same as in a normal genetic algorithm. FIG. 26 (b) shows a method in which the initial template is converted into a chromosome representation and used as a master chromosome, which is copied to the first chromosome of the individual population, while the remaining five chromosomes are randomly initialized. The master chromosome may be copied not only to the first chromosome but to any number of chromosomes in the individual population.
When the individual population is initialized by the method of FIG. 26 (b), the more copies of the master chromosome are placed in the individual population, the faster the search becomes. On the other hand, the diversity of the individual population decreases, so evolution is more likely to stop before the optimal template is discovered. In such a case, the diversity of the individual population can be maintained by copying mutated versions of the master chromosome instead of copying many identical master chromosomes into the individual population.
FIG. 26 (c) is a schematic diagram showing an example of this method. Here, the master chromosome is copied to the first chromosome of the individual population, three chromosomes obtained by mutating the master chromosome are copied to the second through fourth chromosomes, and only the remaining two chromosomes are randomly initialized. The master chromosome may be copied to a plurality of chromosomes in the individual population, the number of chromosomes obtained by mutating the master chromosome is arbitrary, and both can be controlled by parameters.
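The three initialization methods can be expressed with one routine, sketched below. The function names `mutate` and `initial_population` and the parameters `n_copies`, `n_mutants`, and `rate` are hypothetical; chromosomes are assumed to be lists of bits.

```python
import random

def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability rate."""
    return [g ^ 1 if rng.random() < rate else g for g in chromosome]

def initial_population(master, size, n_copies=1, n_mutants=3, rate=0.05, rng=None):
    """Embed the master chromosome, add mutated copies, fill the rest randomly."""
    rng = rng or random.Random(0)
    population = [list(master) for _ in range(n_copies)]
    population += [mutate(master, rate, rng) for _ in range(n_mutants)]
    while len(population) < size:
        population.append([rng.randint(0, 1) for _ in master])
    return population[:size]
```

With `size=6`, the settings `n_copies=0, n_mutants=0` give fully random initialization, `n_copies=1, n_mutants=0` gives one master copy and five random chromosomes, and `n_copies=1, n_mutants=3` gives one master copy, three mutated copies, and two random chromosomes.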
In this embodiment, it is assumed that the individual evaluation function F in the genetic algorithm is the reciprocal of the size of the compressed data obtained by compressing and encoding the input image using the template specified by the individual. For example, when the size of the compressed data is 1 KByte, the value of the fitness is 1 / (1 × 1024).
Since the genetic algorithm behaves so as to maximize the evaluation function F, the reciprocal of the size of the compressed data is maximized, and the search is directed toward an individual expressing a template that minimizes the size of the compressed data.
Note that the index indicating the fitness of the chromosome is not limited to the reciprocal of the size of the compressed data, but may be any index having a property of increasing as the compressed data decreases. For example, a value obtained by subtracting the size of the compressed data from a specific value can be used as the fitness.
Note that using the size of the compressed data to calculate the fitness requires a large calculation cost. Instead of compressing the entire input image for the fitness calculation, it is therefore also possible to compress a reduced image obtained by thinning out the input every few pixels and every few lines. Alternatively, a part of the input image may be cut out and only the cut-out area compressed to calculate the fitness. Although such thinning or cutting out reduces the accuracy of the evaluation function, it can greatly reduce the calculation time.
This decrease in accuracy can be suppressed by shifting the thinning positions or the cutout area little by little for each generation. For example, when an m × m area is cut out for evaluation and is shifted by one bit in both the vertical and horizontal directions at every generation change, an (m−1) × (m−1) area, which is the majority of the cutout area, overlaps the cutout area used in the previous generation (the shaded area in FIG. 3). Therefore, a chromosome (template) evaluated as excellent in the previous generation is likely to be evaluated highly in the new cutout area as well. In addition, an area wider by (2 × m − 1) pixels is used for the evaluation than when the cutout area is not shifted. By repeating this processing, a progressively wider area is used for evaluation, and the decrease in the accuracy of the evaluation function caused by the cutout processing can be suppressed.
In the above example, the shift amount is 1 bit. The larger this value, the wider the area that can be used for evaluation and the smaller the decrease in the accuracy of the evaluation function. However, the overlap with the area cut out in the previous generation is reduced correspondingly, so the search speed decreases. Therefore, the shift amount needs to be chosen according to the size of the cutout area.
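The trade-off between overlap and newly covered area can be checked with a short calculation. In this sketch the names `shift_stats` and `cutout_origin` are hypothetical, and wrapping the cutout position around when it reaches the edge of the block is an assumption not stated in the description.

```python
def shift_stats(m, shift=1):
    """Overlap with the previous cutout and newly covered pixels per generation."""
    overlap = (m - shift) ** 2
    new_pixels = m * m - overlap  # equals 2*m*shift - shift**2
    return overlap, new_pixels

def cutout_origin(generation, block_w, block_h, m, shift=1):
    """Top-left corner of the m x m cutout for a generation (wrap-around assumed)."""
    span_x = max(block_w - m, 0) + 1
    span_y = max(block_h - m, 0) + 1
    return (generation * shift) % span_x, (generation * shift) % span_y
```

For m = 256 and a shift of 1, the overlap is 255 × 255 pixels and 2 × 256 − 1 = 511 new pixels enter the evaluation at each generation; a larger shift adds more new pixels but reduces the overlap.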
Further, by using both the cutout and the thinning, it is possible to further suppress a decrease in the accuracy of the evaluation function. That is, when the size of the cutout area is too small with respect to the size of the target divided image block, it takes a very long time to shift the cutout area so as to cover the entire block and to scan. In this regard, the thinning-out method loses the relationship between adjacent pixels, but has an advantage that it is easy to maintain global statistical properties. Therefore, by combining the two, it is possible to compensate for each other's drawbacks, and it is possible to greatly suppress a decrease in the accuracy of the evaluation function due to a reduction in the cutout area and an increase in the thinning interval.
In the present embodiment, the cutout area size is 256 × 256, and the shift amount is 1 bit. The thinning interval is (H / 256) in the vertical direction and (W / 256) in the horizontal direction, assuming that the vertical and horizontal sizes of the cutout area are H and W, respectively. Further, in order to suppress the sampling bias due to the thinning, the offset (the position of the first line) is moved one bit at a time in each of the vertical and horizontal directions.
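A minimal sketch of thinning with a moving offset follows. It assumes the image is held as a list of rows, that the offset advances by one per generation, and that it wraps around at the thinning interval; the function name `thinned_sample` is hypothetical.

```python
def thinned_sample(image, step_y, step_x, generation):
    """Keep every step_y-th line and every step_x-th pixel of the image.

    The position of the first kept line and pixel moves by one per
    generation (wrapping at the interval), to reduce sampling bias.
    """
    off_y = generation % step_y
    off_x = generation % step_x
    return [row[off_x::step_x] for row in image[off_y::step_y]]
```

For example, with an interval of 4 in both directions, generation 0 samples lines 0, 4, 8, ... and generation 1 samples lines 1, 5, 9, ....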
Note that entropy can be used as the evaluation function instead of the size of the compressed data. In information theory, minimizing entropy and minimizing the size of compressed data are equivalent, so that they are not distinguished in an ideal state. Furthermore, in the entropy calculation, the above-described thinning or clipping may be used.
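One way to use entropy as the evaluation function is to sum, over all pixels, the empirical conditional information of each pixel given the context selected by the template. The sketch below is an illustration under assumptions: the image is a list of rows of 0/1 values, the template is a list of (dy, dx) offsets, pixels outside the image are read as 0, and the function name is hypothetical.

```python
import math
from collections import Counter


def template_entropy(image, template):
    """Total empirical conditional entropy (bits) of pixels given contexts.

    A lower value means the template predicts the pixels better.
    """
    h, w = len(image), len(image[0])

    def px(y, x):
        # Pixels outside the image are treated as 0 (assumption).
        return image[y][x] if 0 <= y < h and 0 <= x < w else 0

    joint = Counter()  # counts of (context, pixel value)
    for y in range(h):
        for x in range(w):
            ctx = tuple(px(y + dy, x + dx) for dy, dx in template)
            joint[(ctx, image[y][x])] += 1

    ctx_total = Counter()  # counts of each context
    for (ctx, _), n in joint.items():
        ctx_total[ctx] += n

    return -sum(n * math.log2(n / ctx_total[ctx])
                for (ctx, _), n in joint.items())
```

The same function can be applied to a thinned or cut-out image to reduce the calculation time.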
In the selection process, an operation of selecting individuals from the population with a probability proportional to the fitness is performed for the number of individuals in the population (roulette selection). Thereby, a new population of individuals is generated. In addition to roulette selection, a technique called tournament selection or rank selection may be used.
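Roulette selection as described above can be sketched as follows, assuming non-negative fitness values that are not all zero. The function name is hypothetical, and the standard-library `random.Random.choices` is used for the weighted draw.

```python
import random


def roulette_select(population, fitnesses, rng):
    """Draw len(population) individuals, each with probability
    proportional to its fitness, to form the next population."""
    return rng.choices(population, weights=fitnesses, k=len(population))
```

An individual with a high fitness may be drawn several times, while one with zero fitness is never drawn.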
In the crossover process, the method shown in the explanatory diagram of FIG. 21A is used for two parent individuals A and B randomly selected from a group. This is an operation of partially replacing the chromosomes at random positions with one block of coordinate values, and is a method called one-point crossover. In FIG. 21A, Ch1 and Ch2 are chromosomes of parent individuals A and B, and in the crossover process, these chromosomes are cut at a crossover position CP selected at random. In the example of FIG. 21A, the crossing position is between the second gene and the third gene from the left. Then, by replacing the cut partial genotypes, offspring individuals A ′ and B ′ having chromosomes Ch3 and Ch4, respectively, are generated.
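The one-point crossover can be sketched as below. Chromosomes are assumed to be equal-length sequences (strings or lists), and the function name and the returned CP value are illustrative.

```python
import random


def one_point_crossover(ch1, ch2, rng):
    """Cut both parents at one random position CP and swap the tails."""
    cp = rng.randint(1, len(ch1) - 1)  # CP lies between two genes
    child_a = ch1[:cp] + ch2[cp:]
    child_b = ch2[:cp] + ch1[cp:]
    return child_a, child_b, cp
```

With CP = 2, as in the example of FIG. 21A, the first two genes of each child come from one parent and the rest from the other.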
When the two chromosomes selected as a pair at the time of crossover have exactly the same bit sequence, the mutation described later is applied to only one of them instead, in order to avoid the decrease in search speed that occurs when one population contains identical chromosomes.
Incidentally, since the template is represented as a chromosome as shown in FIG. 20, the chromosome can be regarded as consisting of meaningful information blocks in 16-bit units as shown in FIG. Therefore, by suppressing the probability that the CP is selected at a position other than a break between information blocks at the time of crossover, meaningful information blocks are prevented from being destroyed by the crossover, and the search speed is expected to improve.
However, on the other hand, the probability that a new information block is generated due to the crossover also decreases, which has a side effect that the search is likely to converge. Therefore, the probability that a CP is selected at a position other than the break of an information block is increased in the initial stage of the search by the GA, and is reduced as the generation progresses. Finally, all CP candidate positions are selected with equal probability. By doing so, unintended convergence at the initial stage of the search can be avoided. Note that the above method can be used not only for one-point crossover but also for two-point crossover, multipoint crossover, and uniform crossover.
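Selecting the CP with a controllable preference for information block boundaries can be sketched as follows. The parameter `p_inside` (the probability of allowing a CP inside an information block) and the function name are assumptions of this sketch; how `p_inside` is changed over the generations is left to the caller.

```python
import random


def choose_crossover_point(n_bits, block_bits, p_inside, rng):
    """Pick a crossover position between 1 and n_bits - 1.

    With probability p_inside the position falls inside an information
    block; otherwise it is restricted to the breaks between blocks.
    Requires n_bits > block_bits so that at least one break exists.
    """
    boundaries = list(range(block_bits, n_bits, block_bits))
    inside = [p for p in range(1, n_bits) if p % block_bits != 0]
    if inside and rng.random() < p_inside:
        return rng.choice(inside)
    return rng.choice(boundaries)
```

For a chromosome of four 16-bit information blocks, `p_inside = 0.0` only ever yields positions 16, 32, and 48.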
Furthermore, the search speed can be improved by performing the crossover in consideration of the reference pixels shared by the templates that the two paired chromosomes represent. As an example, consider a template as shown in FIG. In the same figure, the "?" mark indicates the target pixel, and the shaded squares indicate the reference pixels. This template is represented by the chromosome shown in FIG. 6B (values are shown in units of information blocks, and the crossover point CP is limited to the breaks between information blocks). Further, (c) is another chromosome representing the same template. Here, (d) shows an example in which the two chromosomes are crossed using the break between the second and third information blocks as the CP. As shown, even though both chromosomes represent the same template, the chromosomes generated by the crossover are completely different, which reduces the search speed.
Therefore, for the templates represented by the two chromosomes, the information blocks corresponding to reference pixels that are not shared are extracted from each chromosome, crossover is applied only to those portions, and the results are written back to the original chromosomes. This operation will be described with reference to FIG. It is assumed that two chromosomes are crossed as shown in FIG. Both share reference pixels 19 and 28 (refer to FIG. 4A for the reference pixel numbers). Therefore, only the information blocks other than those corresponding to these shared reference pixels are extracted from FIG. Subsequently, the extracted partial chromosomes are crossed. FIG. 9C shows the result of a one-point crossover using the break between the second and third information blocks as the CP. Finally, the crossed partial chromosomes are written back to the original chromosomes. The result is shown in FIG.
In the crossover based on the template expression, if the length of the extracted partial chromosome is 0, it indicates that the two chromosomes represent exactly the same template. In such a case, as described above, in order to avoid a decrease in the search speed due to the fact that one group contains exactly the same template, only one of them is subjected to the mutation described later.
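The crossover based on the template expression can be sketched as below. For illustration, each chromosome is assumed to be a list of distinct reference pixel numbers (one per information block), both chromosomes are assumed to have the same length, and the function name and example values other than the shared pixels 19 and 28 are hypothetical.

```python
def template_aware_crossover(ch1, ch2, cp):
    """Cross only the information blocks not shared by both templates.

    Returns None when both chromosomes represent the same template, in
    which case the caller would apply mutation to one of them instead.
    """
    shared = set(ch1) & set(ch2)
    idx1 = [i for i, g in enumerate(ch1) if g not in shared]
    idx2 = [i for i, g in enumerate(ch2) if g not in shared]
    part1 = [ch1[i] for i in idx1]  # extracted partial chromosomes
    part2 = [ch2[i] for i in idx2]
    if not part1:
        return None
    new1 = part1[:cp] + part2[cp:]  # one-point crossover on the parts
    new2 = part2[:cp] + part1[cp:]
    child1, child2 = list(ch1), list(ch2)
    for i, g in zip(idx1, new1):    # write back to the original positions
        child1[i] = g
    for i, g in zip(idx2, new2):
        child2[i] = g
    return child1, child2


children = template_aware_crossover([19, 3, 28, 7, 11], [28, 5, 9, 19, 14], 2)
```

Because the shared reference pixels are never moved, each child keeps pixels 19 and 28 and contains no duplicated reference pixel.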
The mutation performed following the crossover is an operation of inverting the value of each gene on every chromosome with a probability given by the mutation rate. FIG. 21B shows an example of the mutation. In this figure, on the chromosome Ch5, the value of the second gene is inverted with a probability based on the mutation rate.
Note that if the chromosome is represented by integers instead of the binary values {0, 1}, the bits cannot be inverted. In that case, a normal random number generated according to the Gaussian distribution N (0, σ) is added to each gene of the chromosome. Distributions other than the Gaussian distribution, such as the Cauchy distribution, may also be used.
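Both mutation variants can be sketched as follows. The function names are hypothetical, and rounding the Gaussian random number to an integer is an assumption made so that integer genes remain integers.

```python
import random


def bit_flip_mutation(bits, rate, rng):
    """Invert each binary gene independently with probability `rate`."""
    return [b ^ 1 if rng.random() < rate else b for b in bits]


def gaussian_mutation(genes, sigma, rng):
    """Add a rounded N(0, sigma) random number to each integer gene."""
    return [g + round(rng.gauss(0.0, sigma)) for g in genes]
```

A mutation rate of 0 leaves the chromosome unchanged, and a rate of 1 inverts every gene.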
Mutation can also be performed on a per-information-block basis rather than on a bit basis. That is, in the case of FIG. 20B, each information block is composed of 16 bits and therefore represents a non-negative integer from 0 to 65535. In the mutation, each information block is increased or decreased by ± (1 to α) at a constant mutation rate. The smaller the value of α, the weaker the influence of the mutation and the higher the search speed of the GA; however, if α is too small, the search converges in its initial stage and a good template cannot be obtained. Conversely, if α is large, unintended convergence of the search can be avoided, but the search speed decreases. Therefore, the value of α is set large at the initial stage of the search and is gradually reduced as the generations progress. As a result, the search speed at the end of the search can be improved while unintended convergence at the initial stage of the search is suppressed.
Further, the same processing can be separately applied to each of the x coordinate and the y coordinate constituting each information block. This allows for a more subtle adjustment of the mutation rate.
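A sketch of the per-information-block mutation with a shrinking α follows. The linear decay of α, the clamping to the 16-bit range, and all names are assumptions for illustration; the embodiment only states that α is large at first and is gradually reduced.

```python
import random


def alpha_schedule(generation, max_generation, alpha_start, alpha_end):
    """Linearly reduce alpha from alpha_start to alpha_end (assumption)."""
    frac = min(generation / max_generation, 1.0)
    return max(1, round(alpha_start + (alpha_end - alpha_start) * frac))


def block_mutation(blocks, rate, alpha, rng, max_value=65535):
    """Increase or decrease each block by 1..alpha with probability `rate`.

    Results are clamped to 0..max_value (an assumption of this sketch).
    """
    out = []
    for b in blocks:
        if rng.random() < rate:
            delta = rng.randint(1, alpha) * rng.choice((-1, 1))
            b = min(max(b + delta, 0), max_value)
        out.append(b)
    return out
```

To treat the x and y coordinates separately, the same function could be applied to two lists of coordinate values with different `rate` and `alpha` settings.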
It is also possible to use a steady-state genetic algorithm. In this method, two individuals, the best individual and a randomly selected individual, are chosen from the population, one of the offspring generated by crossing them is mutated, and the individual with the lowest evaluation in the population is replaced with it. Since only one individual is replaced in each generation, the search is unlikely to fall into a local solution even with a small population. Furthermore, the search speed can be increased by permitting the replacement only when the fitness of the individual generated by crossover and mutation is higher than the fitness of the lowest-evaluated individual in the population.
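One generation of the steady-state procedure, including the conditional replacement, can be sketched as below. The function name and the callable parameters `fitness`, `crossover`, and `mutate` are hypothetical placeholders supplied by the caller.

```python
import random


def steady_state_step(population, fitness, crossover, mutate, rng):
    """Replace the worst individual with one mutated offspring of the best
    individual and a randomly selected individual, but only if the
    offspring is fitter than the worst individual."""
    scores = [fitness(ind) for ind in population]
    best_i = max(range(len(population)), key=scores.__getitem__)
    worst_i = min(range(len(population)), key=scores.__getitem__)
    mate = rng.choice(population)
    # crossover returns a pair of offspring; one of them is used.
    child = mutate(crossover(population[best_i], mate, rng)[0], rng)
    if fitness(child) > scores[worst_i]:
        population[worst_i] = child
    return population
```

Because at most one individual changes per call, the lowest fitness in the population never decreases.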
For template optimization, a so-called blind search method such as an evolution strategy, a hill-climbing method, an annealing method, a tabu search, or a greedy method can be applied instead of the genetic algorithm. In particular, the hill-climbing method and the annealing method do not have as high a search capability as the genetic algorithm, but have the advantage of high-speed processing. In the blind search method, either the size of the compressed data or the entropy may be used as the evaluation function.
References for the evolution strategy include, for example, "Evolution and Optimum Seeking" by H.-P. Schwefel, published by John Wiley & Sons in 1995.
The details of the annealing method are disclosed in, for example, "Simulated Annealing and Boltzmann Machines" by E. Aarts and J. Korst, published by John Wiley & Sons in 1995.
Subsequently, the local search s232 in FIG. 31 will be described.
The genetic algorithm is a very powerful search method, but has a problem that the search speed is reduced near the optimal solution or the local solution at the end of the search procedure. That is, as illustrated in FIG. 22, from the beginning to the middle of the search, the increasing rate of the fitness decreases as the number of generations advances, and the fitness hardly improves at the end. Therefore, in the present embodiment, the final adjustment is performed on the best template obtained by the search using the genetic algorithm by using the hill-climbing method. By introducing this processing, it is unnecessary to perform useless search at the end of the genetic algorithm, so that the search speed is improved as a whole.
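The final adjustment by hill climbing can be sketched as follows. The neighbourhood (moving one reference pixel by one position), the causality check, and the names are assumptions of this sketch; `evaluate` is a caller-supplied cost such as the compressed size or the entropy, where lower is better.

```python
def is_causal(offset):
    """True if the pixel precedes the pixel of interest in scan order."""
    dy, dx = offset
    return dy < 0 or (dy == 0 and dx < 0)


def hill_climb(template, evaluate, max_iters=1000):
    """Repeatedly move to the best neighbouring template until no
    neighbour lowers the cost. template is a list of (dy, dx) offsets."""
    best = list(template)
    best_cost = evaluate(best)
    for _ in range(max_iters):
        neighbours = []
        for i, (dy, dx) in enumerate(best):
            for cand in ((dy - 1, dx), (dy + 1, dx), (dy, dx - 1), (dy, dx + 1)):
                if is_causal(cand) and cand not in best:
                    neighbours.append(best[:i] + [cand] + best[i + 1:])
        if not neighbours:
            break
        cand = min(neighbours, key=evaluate)
        cost = evaluate(cand)
        if cost >= best_cost:
            break  # local optimum reached
        best, best_cost = cand, cost
    return best, best_cost
```

In use, `template` would be the best template found by the genetic algorithm, so that only a few cheap local steps remain.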
In a completely single-color block, the compression ratio is exactly the same no matter what template is used, so there is no need to perform template optimization at all. Therefore, in the present technology, each block is inspected for being a single color at the same time as the image division in step s1, and template optimization is omitted for any block determined to be a single color. Note that a dedicated calculation unit may be arranged in the compression / decompression processing unit, and the template may be determined by this dedicated calculation unit instead of the processor.
This concludes the description of the template search s23.
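The single-color inspection can be sketched in a few lines, assuming a block is held as a non-empty list of rows of pixel values; the function name is hypothetical.

```python
def is_single_color(block):
    """True if every pixel in the block has the same value."""
    first = block[0][0]
    return all(p == first for row in block for p in row)
```

A block for which this returns True can skip template optimization and use any fixed template.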
The database is used to determine a template without performing the image analysis s22 and the template search s23 when a priori knowledge regarding image data to be compression-encoded is clear. In particular, in the case of a print image, an image generation procedure based on the dither method is determined for each maker that creates the image generation software, and each maker has a unique feature in a method of creating a dither matrix. Generally, such features are not disclosed, but by accumulating the results of the template search in the database as in the present embodiment, they can be reused as important clues regarding template determination. Further, a better template may be found by performing the template search s23 with the information registered in the database as an initial state.
FIG. 13 is a flowchart showing a processing procedure of the database update s24.
First, in step s241, the template database, whose structure example is shown in FIG. 16, is searched to determine whether a template is registered for an image that has the same resolution, the same number of lines, and the same screen angle and that was generated by exactly the same image generation system.
If no such template is registered, the newly obtained template is registered in step s244, and the process is completed. By registering it in this way, the new template can be reused in the database search s21 of the template optimization s2.
If it is determined in step s241 that the template is registered, it is checked in step s242 whether the template is obtained by template optimization processing based on the initial template obtained by searching the template database. If it is obtained, the template is expected to have higher quality than the registered template. Therefore, in step s243, the registered template is updated with the template. Here, “excellent quality” means that high compression efficiency can be obtained by performing predictive encoding using this template.
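The decision flow of steps s241 to s244 can be sketched with a dictionary standing in for the template database. The key layout, the return strings, and the behaviour when a registered template was not the starting point of the optimization (left unchanged here) are assumptions of this sketch.

```python
def update_database(db, key, template, derived_from_db):
    """Register or update a template.

    key: e.g. (system name, resolution, number of lines, screen angle).
    derived_from_db: True if the template was obtained by optimization
    starting from the template found in the database.
    """
    if key not in db:        # s241: nothing registered for this key
        db[key] = template   # s244: new registration
        return "registered"
    if derived_from_db:      # s242: refined from the registered template
        db[key] = template   # s243: update the registered template
        return "updated"
    return "unchanged"
```

The stored entry can later serve either as the final template or as the initial state of a further template search.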
With the above procedure, an optimal template is determined for each of the divided blocks, and the compression processing of each block is then started using this template. Only the necessary portion of the image data obtained by the block division is cut out and sent to the reference buffer. Regarding the size of the necessary portion, the height is the number of lines from the pixel of interest to the farthest pixel that can be taken as a reference pixel, and the width is the distance between the pixels at the two horizontal extremes, farthest from the pixel of interest, that can be taken as reference pixels.
Subsequently, in the context extractor, the context (that is, vector data composed of the value of the reference pixel) is extracted from the reference buffer based on the template.
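Context extraction can be sketched as packing the reference pixel values selected by the template into one integer. This software sketch reads the image directly instead of a reference buffer; treating pixels outside the image as 0, the bit order, and the function name are assumptions.

```python
def extract_context(image, y, x, template):
    """Build the context of pixel (y, x) as an integer.

    template is a list of (dy, dx) offsets; the first offset becomes the
    most significant bit. Pixels outside the image are read as 0.
    """
    h, w = len(image), len(image[0])
    bits = 0
    for dy, dx in template:
        yy, xx = y + dy, x + dx
        value = image[yy][xx] if 0 <= yy < h and 0 <= xx < w else 0
        bits = (bits << 1) | value
    return bits
```

The resulting integer, together with the value of the pixel of interest, is what an arithmetic coder such as the MQ-Coder would receive.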
The context and the value of the pixel of interest are sent to the MQ-coder (C) for compression via the FIFO at the subsequent stage, and the compressed data is calculated. Note that the MQ-Coder (D) in FIG. 35 is an MQ-Coder for decompression. In this example, the MQ-Coder (C) for compression and the MQ-Coder (D) for decompression are separately installed so that the speed can be easily increased at the time of mounting, but originally MQ-Coder can perform reversible processing. Therefore, only one MQ-coder that can execute both compression and decompression may be used for the purpose of reducing the number of gates of the LSI. Note that the structure and processing procedure of the MQ-Coder are clearly defined in international standards relating to the JBIG2 system. Furthermore, it is also possible to use an entropy encoder other than the MQ-Coder, such as a QM-Coder or a Johnson encoder.
When the processing of the MQ-Coder (C) is completed, an area shifted by one bit to the right is cut out in the reference buffer, and the context is extracted and the compressed data is calculated by the MQ-Coder (C). Thereafter, the same processing is repeated up to the end of the image.
When reaching the end of the image, an area shifted one bit downward at the left end of the image data is cut out to the reference buffer, and the above procedure is repeated up to the lower end of the block.
Note that the above procedure can be speeded up by performing area extraction to the reference buffer, context extraction, and calculation of compressed data in the MQ-coder (C) in a pipeline manner.
This processing can be sped up by preparing two reference buffers and operating them alternately in parallel. This will be described with reference to FIG. It is assumed that the size of the block is as shown in FIG.
In this example, assuming that the maximum movable range of the reference pixel is 8 pixels × 4 lines, the size required by the subsequent context extractor is at least 8 pixels × 4 lines, so the height of the reference buffer is 4 lines. The width may be any number of pixels as long as it is 8 pixels or more. Here, it is assumed that there are 16 pixels.
In the procedure, first, the upper left corner of the block is read into the first reference buffer (the area indicated by a white rectangle in FIG. 3B). Since the width of this reference buffer is 16 which is twice the area required for context extraction, eight times of context extraction can be performed by performing the shift operation eight times. Therefore, while the eight context extractions are being performed, an area shifted by 8 bits to the right is read into another reference buffer (an area indicated by a shaded rectangle in FIG. 9C). Then, after the first reference buffer finishes supplying the area data necessary for the context extraction for eight times, the second reference buffer starts supplying the area data. Thereafter, by repeating this operation alternately, it becomes possible to completely hide the time for writing the area data to the reference buffer.
Note that how many times wider the reference buffer should be than the area required for context extraction depends on the time required for writing to the reference buffer and the time required for context extraction; this should be kept in mind.
The above is the operation procedure of the image data compression hardware shown in FIG. At the time of decompression, the original image is restored by performing the reverse operation of the compression procedure described above.
Example 2
FIG. 32 is a block diagram of a hardware configuration in which parallel processing is largely introduced to increase the speed of compression and decompression processing. Note that, in the figure, the compression / decompression processing unit is a circuit equivalent to the one obtained by removing the image data memory and the control register from FIG. In this configuration, a DMA processor is used to transfer image data and compressed data from the host computer by DMA transfer.
The major feature is that the image data read into the image memory is forcibly divided in the vertical direction, and the divided parts are compressed independently in parallel. Since the figure shows an example of two-way parallel processing, the time required for compression and decompression is reduced to about 1/2. By providing n each of the image memories, compression / decompression units, and compressed data input / output FIFOs, n-way parallel operation becomes possible, and the operating speed increases by about n times.
In the figure, the two image data memories appear as one continuous memory to the image input / output FIFO. Therefore, at the time of compression, image data is sequentially written line by line as in the first embodiment.
At the time of compression, image block division and template optimization may be performed independently for each image region that has been forcibly divided in the vertical direction. Alternatively, the whole image may be regarded as one large block, image block division and template optimization may be performed in exactly the same procedure as in the first embodiment, and only the actual compression processing may be performed independently in parallel in each region. The former is expected to obtain a higher compression ratio, but the latter can realize faster processing.
At the time of decompression using this hardware configuration, it is necessary to always confirm that the processing of the same line has been completed in the image memory 1 and the image memory 2 and then write the line to the image input / output FIFO.
Example 3
If a program for causing a computer to execute the adaptive predictive encoding and decoding method according to the embodiments described above is recorded on a recording medium such as a hard disk, a flexible disk, a CD-ROM, or a DVD-ROM, the adaptive predictive encoding and decoding method according to the first embodiment can easily be implemented on a computer by having the computer read the recording medium.
The adaptive predictive coding program, as shown in FIG., causes the computer to execute the following procedures:
(1) a procedure for reading the image data to be compressed (s601);
(2) a procedure for generating and outputting a header of the compression-encoded data (s602);
(3) a procedure for dividing the input image data into characteristic blocks (s603);
(4) a procedure for generating an optimal template for each block (s604);
(5) a procedure for generating a context for all pixels in the block in the scanning direction (s605);
(6) a procedure for generating compressed data using all the pixels in the block and the contexts corresponding to those pixels generated in step s605 (s606); and
(7) a procedure for synthesizing the block information, template information, and compressed data obtained by the above procedures to generate and output the compression-encoded data (s607).
In general, when the processing is executed by software, there is often more memory capacity available than in hardware. Therefore, instead of performing data reading and image division simultaneously (one pass) as in the first embodiment, block division may be performed after all the image data has been read, and compression encoding may then be performed for each block (two passes).
FIG. 24A is a schematic diagram showing an example of the format of the compression-encoded data when the image data to be compressed is divided into n blocks. The data is composed of a header portion and compression information corresponding to each of the n image blocks. The header portion records the size (vertical and horizontal size) of the image data to be compressed, the resolution, the number of lines, the screen angle, the image generation system name and its version, and the like.
FIG. 2B is a schematic diagram showing a configuration example of the compression information corresponding to the i-th image block. It is composed of a block header in which the size and position of the block (block information) are recorded, the template used for compression-encoding the corresponding image data, the compressed data, and a terminal symbol indicating the end of the i-th compression information.
The decoding program, as shown in FIG., causes the computer to execute the following procedures:
(1) a procedure for reading the compression-encoded data (s701);
(2) a procedure for analyzing the compression-encoded data, separating it into the size (vertical and horizontal size) of the entire decompressed image and the compression information corresponding to each of the divided blocks, and extracting the template that was used when each block was compressed and the compressed data (s702);
(3) a procedure for sequentially generating a context from the beginning of the decompressed image data (s703);
(4) a procedure for generating decoded data from the compressed data using the context (s704); and
(5) a procedure for synthesizing the decoded data corresponding to each block to generate one decompressed image data (s706).
The present invention functions effectively for all ordinary binary images (character images, line images, images binarized by a dither method, an error diffusion method, or the like, and mixtures thereof). Further, the present invention is also applicable to a multi-valued image by performing bit slice processing and image width enlargement processing to decompose the image into a plurality of binary images. Here, the bit slice processing is, for example, in the case of FIG. 27A, where each pixel is represented by 8 bits, a process of decomposing one 8-bit image into eight binary images, from the binary image composed of only the first bits to the binary image composed of only the eighth bits.
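Bit slice processing can be sketched as follows, assuming the multi-valued image is a list of rows of integers. Placing the most significant bit in the first binary image is an assumption, as is the function name.

```python
def bit_slice(image, bits=8):
    """Decompose an image of `bits`-bit pixels into `bits` binary images.

    Plane 0 holds the most significant bit of every pixel (assumption).
    """
    return [
        [[(p >> (bits - 1 - k)) & 1 for p in row] for row in image]
        for k in range(bits)
    ]
```

Each resulting binary image can then be divided into blocks and compressed with its own template.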
Bit slicing produces binary images ranging from ones that mainly contain low-frequency components to ones that contain many high-frequency components. Since images containing many high-frequency components are generally difficult to compress, each pixel value may be converted to a reflected binary code (Gray code) before bit slicing is performed. Thereby, the above problem can be solved. Furthermore, when the original image uses a limited number of colors, efficient bit slicing can be performed by converting the binary bit arrangement based on the correlation between the colors.
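The Gray code conversion applied before bit slicing, and its inverse for decoding, can be sketched with the standard bitwise formulas. The function names are hypothetical.

```python
def to_gray(value):
    """Convert a non-negative integer to its reflected binary (Gray) code."""
    return value ^ (value >> 1)


def from_gray(gray):
    """Recover the original integer from its Gray code."""
    value = 0
    while gray:
        value ^= gray
        gray >>= 1
    return value
```

Neighbouring tone values such as 127 and 128 differ in all eight bits in plain binary but in only one bit after the conversion, so fewer bit planes change at such a boundary.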
The image width enlargement processing is, for example, as shown in FIG. 27B for a 256-tone (8-bit) image like the one in the above example, a process of virtually increasing the image width eightfold and simply arranging the bits constituting each pixel, as they are, on the same line.
As described above, the present invention has been described based on the illustrated examples. However, the present invention is not limited to the above examples, but includes other configurations and methods that can be easily modified by those skilled in the art within the scope of the claims. For example, the embodiments disclose an example in which image data is used as the input data. However, the data encoding method of the present invention is applicable to any data in which there is a correlation between bits, such as a text file or a data file of any application.
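Image width enlargement can be sketched as follows, again assuming a list of rows of integers. Writing the bits of each pixel from the most significant bit first is an assumption, as is the function name.

```python
def widen(image, bits=8):
    """Replace each `bits`-bit pixel by its bits, laid out on the same
    line, so that the image becomes `bits` times wider and binary."""
    return [
        [(p >> (bits - 1 - k)) & 1 for p in row for k in range(bits)]
        for row in image
    ]
```

Unlike bit slicing, this yields a single binary image, so one set of templates covers all bit positions.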
Industrial applicability
In the present invention, by dividing the input image information into homogeneous regions, the same template can be used unconditionally within each region. Further, by dividing the image appropriately, fine-grained template optimization becomes possible, the accuracy of pixel value prediction in each divided image block improves, and the overall compression efficiency is thereby improved.
In the present invention, unlike the conventional method, a high-accuracy template is generated by performing statistical processing and analysis under an assumption about the dither method that generated the image to be compressed. In addition, by using a database that stores the relationship between the presumed dither method and templates, it is possible not only to improve the compression-encoding efficiency but also to dramatically reduce the time required to determine a template.
Further, by applying an artificial intelligence technique (such as a genetic algorithm) to the template determined by the above-described method and adaptively adjusting it according to the picture pattern, a template that contributes to higher prediction accuracy can be obtained. By cutting out only a part of the image without impairing its global characteristics, the speed of template determination is also increased.
Further, by feeding back the obtained template and updating the database, a data compression system that becomes smarter the more it is used can be constructed. In other words, by repeatedly processing image data having similar properties, a template that contributes to high prediction accuracy can be obtained at high speed without applying artificial intelligence technology.
[Brief description of the drawings]
FIG. 1 is an explanatory diagram showing various known templates.
FIG. 2 is an explanatory diagram showing a template used in the JBIG method.
FIG. 3 is a schematic diagram of the shift processing of the cutout area.
FIG. 4 shows an example in which the same template is expressed by different chromosomes.
FIG. 5 is an explanatory diagram relating to crossover based on a template expression.
FIG. 6 is a block diagram of the transmission side of the image transmission apparatus according to one embodiment of the present invention.
FIG. 7 is a block diagram of the receiving side of the image transmission apparatus according to one embodiment of the present invention.
FIG. 8 is an operation flowchart of the template optimization method according to one embodiment of the present invention.
FIG. 9 is an operation flowchart of an image division method according to an embodiment of the present invention.
FIG. 10 is a flowchart illustrating a processing procedure of horizontal block division.
FIG. 11 is a flowchart illustrating a processing procedure of vertical block division.
FIG. 12 is an operation flowchart of the genetic algorithm.
FIG. 13 is an operation flowchart for updating template data according to an embodiment of the present invention.
FIG. 14 is a schematic diagram showing a method of dividing an image into blocks.
FIG. 15 is a schematic diagram illustrating a method for calculating the joint probability density.
FIG. 16 is an explanatory diagram showing an example of the structure of a template database used in the image analysis method.
FIG. 17 is an explanatory diagram illustrating the operation principle of the correlation analysis performed by the image analysis method.
FIG. 18 is an explanatory diagram showing the operation principle of periodicity estimation performed by the image analysis method.
FIG. 19 is a schematic diagram showing the relationship between chromosomes and genes.
FIG. 20 is an explanatory diagram of chromosome expression in the genetic algorithm.
FIG. 21 is an explanatory diagram showing the operation principle of crossover and mutation in the genetic algorithm.
FIG. 22 is an example of a general learning curve of a genetic algorithm.
FIG. 23 is a flowchart showing the operation procedure of the encoding program.
FIG. 24 is a schematic diagram illustrating a format example of compression-encoded data.
FIG. 25 is a flowchart showing the operation procedure of the decoding program.
FIG. 26 is a schematic diagram illustrating an example of an initialization method for an individual population in the genetic algorithm.
FIG. 27 is a schematic diagram showing an example of a method for making the present invention applicable to a multivalued image.
FIG. 28 is a flowchart showing an operation procedure of template optimization.
FIG. 29 is a flowchart showing the operation procedure of the database search.
FIG. 30 is a flowchart showing the operation procedure of the image analysis.
FIG. 31 is a flowchart showing the operation procedure of the template search.
FIG. 32 is a block diagram of parallelized image data compression hardware.
FIG. 33 is a schematic diagram illustrating a method for determining a vertical division position candidate.
FIG. 34 is a block diagram of the image data compression hardware.
FIG. 35 is a block diagram of a compression / decompression processing unit.
FIG. 36 is a schematic diagram illustrating a processing procedure of the parallelized reference buffer.

Claims

A data encoding apparatus comprising template generating means that includes:
database means for storing template information together with data attribute parameters;
search means for searching the database means for template information using an attribute parameter as a search key;
determining means for determining whether or not to perform optimization by evaluating the performance of the template output from the search means;
template optimization means for generating an optimal template by an optimization method using one or both of the code compression rate and entropy as an evaluation function; and
registration means for registering the template information output from the template optimization means in the database means.

The data encoding apparatus according to claim 1, wherein the optimization method employed in the template optimization means is any one of an enumeration method, a genetic algorithm, an evolution strategy, a hill-climbing method, an annealing method, a tabu search method, and a greedy method.

The data encoding apparatus according to claim 1, further comprising:
temporary template generating means for analyzing the input data and generating a temporary template; and
re-search means for searching the database means based on the temporary template information.

The data encoding apparatus according to claim 1, further comprising:
blocking means for dividing the input data into one or more blocks according to features;
compression means for compressing the input data based on a template output from the template generating means; and
combining means for combining the output data of the compression means, the template information, and the block information.

A data encoding method comprising encoding data using a template generated by a template generation method that includes:
a step of searching a database for template information using an attribute parameter as a search key;
a step of determining whether or not to perform optimization by evaluating the performance of the template obtained as the search result;
a step of, when optimization is to be performed, generating an optimal template by an optimization method using one or both of a code compression rate and entropy as an evaluation function; and
a step of registering the optimized template information in the database.

The data encoding method according to claim 5, wherein the optimization method is an enumeration method, a genetic algorithm, an evolution strategy, a hill-climbing method, an annealing method, or a tabu search method.

A data encoding program for causing a computer to function as template generating means comprising:
database means for storing template information together with data attribute parameters;
search means for searching the database means for template information using an attribute parameter as a search key;
determining means for determining whether or not to perform optimization by evaluating the performance of the template output from the search means;
template optimization means for generating an optimal template by an optimization method using one or both of a code compression rate and entropy as an evaluation function; and
registration means for registering the template information output from the template optimization means in the database means.