JP4003410B2

JP4003410B2 - Encoding selection apparatus, encoding apparatus and method thereof

Info

Publication number: JP4003410B2
Application number: JP2001175939A
Authority: JP
Inventors: 太郎横瀬
Original assignee: Fuji Xerox Co Ltd; Fujifilm Business Innovation Corp
Current assignee: Fujifilm Business Innovation Corp
Priority date: 2001-06-11
Filing date: 2001-06-11
Publication date: 2007-11-07
Anticipated expiration: 2021-06-11
Also published as: JP2002369198A

Description

【０００１】
【発明の属する技術分野】
本発明は圧縮の選択技術に関するものであり、特に簡易に可逆符号化と非可逆符号化の選択を行う装置に関するものである。
【０００２】
【従来の技術】
画像データは一般にデータ量が膨大になるので、通信、蓄積などを行う際には圧縮してデータ量を削減することが多い。また画像以外の用途でも、特に大きなデータ量を扱う場合には圧縮技術は不可欠の技術といっていい。
【０００３】
圧縮を行うための符号化技術には、いくつもの手法が存在する。大別すれば、復号したときに入力を完全に再現する可逆方式と、何らかの損失を伴う非可逆方式がある。さらにこの２つの方式を基本的アルゴリズムやパラメータなどで分類すれば、いくつもの手法に細分できる。これらの圧縮手法を、以下では符号化と称する。
【０００４】
このようなさまざまな符号化は、同一の入力であってもそれぞれ異なる符号量を出力する。これを仮に符号化の入力依存性と呼ぶ。どの入力に対しても効率よく圧縮できるような、単一の符号化は原理的に存在しない。そこで、入力依存性を排したシステムが必要な場合には、入力に応じて符号化を使い分ける機構が必要になる。
【０００５】
このようなシステムでは事前に符号量を予測して符号化を選択するか、全ての符号化で圧縮してみて符号量を確かめてから選択するかの２つの方法が考えられる。後者は正確だが負荷が重いので、一般的なシステムには向かない。ここにおいて符号化の選択技術は符号量の予測技術に依存することになる。
【０００６】
一般に符号量の予測は難しい。特にＤＣＴ（ＤｉｓｃｒｅｔｅＣｏｓｉｎｅＴｒａｎｓｆｏｒｍ、離散コサイン変換）を使った国際標準ＪＰＥＧ（ＪｏｉｎｔＰｈｏｔｏｇｒａｐｈｉｃＥｘｐｅｒｔｓＧｒｏｕｐ）のような変換符号化の場合、変換処理なしに符号量を推定することは不可能に近い。これは変換符号化は変換後のデータを符号化対象とするためである。
【０００７】
そこでこのような変換符号化を含めた２種類の符号化を選択する従来技術である特開平１０−２４３３８８号公報の手法を従来例として説明する。この従来例は変換結果を分析することによって、変換符号化と可逆符号化のうちから、最適な符号化を選択する技術である。
【０００８】
なお、従来例では非可逆符号化を選択したとき画質が劣化しやすい画像がある旨記述されている。本発明では画質についての観点は持たないので、一見従来例と目的が違うように思われるかもしれない。しかし非可逆符号化では一般に画質と符号量がトレードオフの関係にあることを考えると、従来例が問題としているケースは非可逆符号化で画質を維持しようとしたときの符号量増加の問題でもあることがわかる。つまり、従来例が問題としている点を符号量の問題としてとらえることが可能であり、その場合は本発明が関わる問題と本質的に同一なのである。
【０００９】
図１１は従来例の符号化選択装置の構成例である。本発明の説明の趣旨に沿うように用語を一部変更しているが、従来例の手法の本質に関わるものではない。図中、１０はデータ入力部、３０は第１の符号量推定部、４０は符号化選択部、５０は選択結果出力部、１０１０はＡ／Ｄ変換器、１０２０はフレームメモリ、３０１０はラスターブロック変換部、３０２０はＤＣＴ変換部、１１０は入力データ、１３２はブロックデータ、１３３はＤＣＴデータ、１４０は選択結果データである。
【００１０】
図１１の各部について説明する。図１１の符号化選択装置は以下の構成よりなる。Ａ／Ｄ変換器１０１０は外部からデータを受け取り、入力データ１１０としてフレームメモリ１０２０へ送出する。フレームメモリ１０２０は入力データ１１０の全体を格納してから、改めて入力データ１１０としてラスターブロック変換部３０１０へ送出する。ラスターブロック変換部３０１０は入力データのスキャン順を変換してブロック分割を行い、これをブロックデータ１３２としてＤＣＴ変換部３０２０へ送出する。ＤＣＴ変換部３０２０はブロックデータ１３２にＤＣＴ処理を行い、ＤＣＴデータ１３３として符号化選択部４０へ送出する。符号化選択部４０はＤＣＴデータ１３３に基づいて最適な符号化を選択し、これを選択結果データ１４０として選択結果出力部５０に送出する。選択結果出力部５０は外部に選択結果データ１４０を送出する。
【００１１】
以上の構成の中で、選択結果出力部５０は従来例の記載に含まれないが、説明の都合上追加した。また符号化選択部４０は従来例では圧縮方式判定器という名前であるが、これも説明の都合上変更した。また従来例には実際の符号化を行う手段も含まれているが、これも従来例の特徴を明瞭にするために便宜上省略した。
【００１２】
以上の構成に基づいた従来例の動作について説明する。図１２は従来例の符号化選択装置の動作を示すフローチャートである。以下、図１２を用いて従来例の動作について説明する。
【００１３】
Ｓ１０ではＡ／Ｄ変換器１０１０およびフレームメモリ１０２０においてデータの入力を行う。Ｓ３１ではラスターブロック変換部３０１０においてブロック分割を行う。Ｓ３２ではＤＣＴ変換部３０２０においてＤＣＴ処理を行う。Ｓ４０では符号化選択部４０において符号化選択を行う。Ｓ５０では選択結果出力部５０において選択結果の出力を行う。
【００１４】
以上の動作の中で、符号化選択部４０における符号化選択はＤＣＴデータ１３３の周波数分布を分析して画像の性質を判別し、最適と思われる符号化を選択する。具体的には高周波成分の量を評価することによって、ＤＣＴによる変換符号化とＤＰＣＭ（ＤｉｆｆｅｒｅｎｔｉａｌＰｕｌｓｅＣｏｄｅＭｏｄｕｌａｔｉｏｎ）による可逆符号化を自動的に切り替える。従来例では高周波成分の多い画像ではＤＣＴによる圧縮率を高くすることができないと記述している。これは本発明の観点からすれば、符号量こそ求めていないものの、大まかな符号量予測と同等の処理をしているに他ならない。
【００１５】
次に従来例の問題点について述べる。画像圧縮処理の中でも、ＤＣＴ処理は比較的重い処理である。例えば従来例で選択対象となっているＤＰＣＭは１画素につき１度の減算しか必要ないのに対して、８×８のＤＣＴで原理的には１画素あたり１６回の乗算と１４回の加算を必要とする。実装上の工夫でこの値は減らすことができるが、それでもＤＰＣＭに比べると圧倒的に処理が重い。さらにＤＣＴの対象となるブロックを切り出すためにブロックラインメモリが必要となるので、この点でも規模の大きい処理装置が必要となる。
【００１６】
従来例の問題は結果的にＤＰＣＭが選ばれる画像に対しても、この重いＤＣＴ処理を行わなければならない点である。これは符号化の選択にＤＣＴ処理の結果を必要とすることに起因する。冒頭に述べたように本発明の観点から言えば、実際に符号量を求めることなしに符号化を切り替えるのは処理負荷を軽くするためであり、これを実現するためにＤＣＴ処理を行うのは目的に矛盾する。
【００１７】
【発明が解決しようとする課題】
以上で述べてきたように、従来例の問題点として、実際の符号化処理に比較して符号化選択処理そのものの負荷が重くなってしまうことがあげられる。
【００１８】
本発明は上述の事情に鑑みてなされたもので、符号化処理に比較して処理の十分軽い符号量予測処理ならびに符号化選択処理を提供することを目的とする。
【００１９】
【課題を解決するための手段】
本発明によれば上述の目的を達成するために特許請求の範囲に記載のとおりの構成を採用している。ここでは、特許請求の範囲の記載内容について補充的に説明を行う。
【００２０】
本発明の一側面によれば、符号化選択装置において、入力データを入力するデータ入力手段と、上記データ入力手段により入力したデータに基づき少なくとも１つ以上の所定の符号化に関して符号量を推定する第１の符号量推定手段と、少なくとも１つ以上の所定の符号化に関して符号量を推定する第２の符号量推定手段と、上記第１の符号量推定手段および上記第２の符号量推定手段によって推定された符号量の比較に基づいて符号化方式を決定する符号化選択手段と、上記符号化選択手段の結果を外部へ出力する選択結果出力手段とを具備し、上記第２の符号量推定手段における符号量推定処理は上記入力データと無関係に行うようにしている。
【００２１】
この構成においては、上記第２の符号量推定手段による符号量推定処理は上記入力データの性質と無関係であるので軽い処理で行うことができる。たとえば、可逆符号化と変換符号化（非可逆符号化）とを選択する場合を考える。この場合、変換符号化の従来の符号量推定手法は高い負荷である反面、その符号量は、可逆符号化に比べ入力データの性質に対して比較的安定している。可逆符号化の従来の符号量推定は比較的小さな負荷で済む反面、その符号量は、入力データの性質に対して大幅に変化する。したがって、可逆符号化については通常の符号量推定を行い、変換符号化については入力データに依存しない推定を行い、負荷を抑えながら符号量が増加するのを抑えることができる。
【００２２】
また、この構成においては、符号化パラメータを入力するパラメータ入力手段をさらに設け、上記第２の符号量推定手段は上記入力データによらず、上記パラメータ入力手段により入力したパラメータのみに基づいて符号量を推定するようにしてもよい。また、推定符号量の比較には重み付けを行ってもよい。
【００２３】
パラメータによらず、一定の符号量（入力データの大きさに比例する）として推定してもよい。あるいは入力データの大きさに依存する関数で推定を行ってもよい。
【００２４】
また、上記第１の符号量推定手段で行われる符号量推定処理は、先に述べたように、可逆符号化に対して行うものであってもよい。
【００２５】
また、上記第１の符号量推定手段で行われる符号量推定処理は、対象となる符号化のソースコーディングの部分的な処理もしくはその簡略処理の結果に基づいて符号量を推定するものであってもよい。簡易シンボルを用いて推定するようにしてもよい。
【００２６】
また、上記第１の符号量推定手段で行われる符号量推定処理は、ソースコーディングの部分的な処理もしくはその簡略処理の結果と符号量の関係を統計的にまとめた結果を表または式などの形式で参照し、必要な場合にはこれを補間を加えて推定符号量を算出するようにしてもよい。
【００２７】
また、上記第２の符号量推定手段で行われる符号量推定処理は、先に触れたように、非可逆符号化に対して行うものであってもよい。
【００２８】
また、上記第２の符号量推定手段で行われる符号量推定処理は、入力される符号化パラメータと符号量の関係を統計的にまとめた結果を参照し、また必要な場合はこれに補間を加えて推定符号量を算出するようにしてもよい。
【００２９】
また、上記パラメータ入力手段において入力されたパラメータに対して四則演算をはじめとする所定の補正を行うことで、いずれかの符号化を優先的に選択させるようにしてもよい。
【００３０】
また、上記第１の符号量推定手段または上記第２の符号量推定手段において算出した推定符号量のうち少なくとも１つに対して、四則演算をはじめとする所定の補正を行うことで、いずれかの符号化を優先的に選択させるようにしてもよい。
【００３１】
また、上記符号化選択手段で入力された推定符号量のうち少なくとも１つに対して、四則演算をはじめとする所定の補正を行うことで、いずれかの符号化を優先的に選択させるようにしてもよい。
【００３２】
また、上記第１の符号量推定手段を、上記データ入力手段によって入力されたデータを予測する予測手段と、上記予測が上記入力データと一致する回数を計数する予測一致計数手段と、上記予測一致回数と符号量との関係を保持する符号量保持手段と、上記符号量保持手段によって与えられる推定符号量に上記予測一致回数に応じた補間を行う補間手段とから構成するようにしてもよい。
【００３３】
また、上記第２の符号量推定手段を、上記パラメータ入力手段によって入力されたパラメータと符号量との関係を保持する符号量保持手段と、上記符号量保持手段によって与えられる推定符号量に上記入力パラメータに応じた補間を行う補間手段とから構成するようにしてもよい。
【００３４】
また、上記予測手段によって行われる予測は少なくとも２つ以上の予測手法によって行い、上記予測一致計数手段は、上記予測手法のうち１つでも一致したときは予測一致回数として計数するようにしてもよい。
【００３５】
また、上記データ入力手段は入力されるデータを部分的に選択して第１の符号量推定手段に送出するようにしてもよい。
【００３６】
また、上記第１の符号量推定手段および上記第２の符号量推定手段における上記符号量保持部はそれぞれ対応する符号化を用いて、それぞれ上記入力データ、上記入力パラメータにおける、符号量との関係を事前に統計的に求めた結果を保持するようにしてもよい。
【００３７】
また、上記第１の符号量推定手段および上記第２の符号量推定手段における上記符号量保持部に格納するデータは、それぞれ上記入力データ、上記入力パラメータと符号量との関係が特に非線形の部分については細かい間隔で保持するようにしてもよい。
【００３８】
また、上記第１の符号量推定手段および上記第２の符号量推定手段における上記符号量保持手段は、それぞれ上記入力データ、上記入力パラメータに対して、最も近いデータか、または内輪と外輪のそれぞれにおいて最も近いデータの両方を選択してそれぞれの上記補間手段に送出するようにしてもよい。
【００３９】
また、本発明は符号化装置としても実現できる。
【００４０】
また、本発明は装置またはシステムに実装されるのみでなく、方法の態様でも実現可能であり、少なくともその一部をコンピュータプログラムとして構成できることはもちろんである。
【００４１】
【発明の実施の形態】
以下本発明の実施例について詳細に説明する。
【００４２】
［基本的な原理］
本発明の実施例の具体的な説明の前に、本発明の基本的な原理について述べる。本発明は大きく２つの原理からなる。
【００４３】
以下、第１の原理について説明する。本発明は可逆符号化と非可逆符号化の選択に関するものである。そこでまずこの両者の特性の違いについて説明する。可逆符号化には理論的な圧縮限界が存在する。これは一般に情報量などと呼ばれるが、数値としてはエントロピーがその例である。従って入力データが含む情報量によってその限界値が極端に変化し得る。結果として得られる符号量も極端に変化することになる。
【００４４】
これに対して非可逆符号化は必ず量子化処理またはその同等処理を内部に含む。例えば前出のＪＰＥＧの場合はＤＣＴ処理された結果に線形量子化を行う。この量子化は一般に画質に影響が少ないと思われる情報をより粗く量子化するように行われる。ＪＰＥＧの例でいえば、高周波成分は低周波成分よりもより粗く量子化される。このため例えば画像の情報量が高周波成分に多く含まれる場合には、量子化の効果で符号量が大きくなりづらい。つまり可逆符号化に比較して、より符号量が安定する。
【００４５】
非可逆符号化の符号量が大きく変動するのは、符号化パラメータが変化した場合である。このパラメータには量子化処理を制御する値が含まれていることが多い。ＪＰＥＧの例では量子化テーブルをパラメータとして与えることができる。つまり、符号化パラメータが一定の場合、入力データによらず非可逆符号化の符号量は可逆符号化よりも安定していることが多い、というのがより正確な記述になる。
【００４６】
この性質を確かめるために行った実験の結果を図１３に示す。可逆符号化には予測符号化の一種を使用した。また非可逆符号化はＪＰＥＧを使用し、１から順に量子化を粗くした。この結果、可逆符号化は画像に応じて符号量が１００倍以上に変化したが、非可逆符号化は最も変化が激しかった非可逆符号化１でも７倍程度と安定していた。以上が本発明の第１の原理である。
【００４７】
本発明の第２の原理を説明するために、可逆符号化の符号量予測について述べる。可逆符号化は情報を欠くことなく復元する必要があるため、誤差を含む可能性のある計算、例えば一般的な周波数変換は処理に含むことができない。これは原理的に可逆符号化は浮動小数による演算をほとんど含まないことを意味する。同様に除算も扱いづらく、一般には入力データそのままか、加減算を加える程度が主である。
【００４８】
符号量予測に実際の符号化の一部の処理か、またはそれを簡易化した処理が必要な点では、可逆符号化の事情は非可逆符号化と同様である。しかし上のように可逆符号化は一般に軽い処理が多いので、非可逆符号化と比較すると符号量を比較的簡単に予測することができる。これが本発明を構成する第２の原理である。
【００４９】
以上、以下の２原理が明らかとなった。
【００５０】
第１の原理：非可逆符号化の符号量は可逆符号化より入力画像への依存性が少ない、
第２の原理：可逆符号化の符号量予測は非可逆符号化のそれより容易である。
【００５１】
そこで本発明は、可逆符号化による符号量を符号量予測で推定し、これを画像によらず一定と仮定した非可逆符号化による符号量と比較することにより、より符号量の少ないと思われる符号化を選択する。このとき非可逆符号化の符号量については、事前に統計的に調べた符号化パラメータとの関係を使った補正を加えることで精度を向上する。
【００５２】
本発明の具体的な例については実施例において説明する。以下、本発明の実施例として、
（１）一般的な例
（２）予測符号化とＪＰＥＧの選択に適用した例について述べる。
【００５３】
［実施例１］
本発明の実施例１として、まず一般的な例を述べる。以下、実施例１の具体的な説明を行う。図１は実施例１における符号化選択装置を示すブロック図である。図中、図１１と同様の部分には同一の符号を付して説明を省略する。図中、２０はパラメータ入力部、３１は第２の符号量推定部、１２０はパラメータデータ、１３０、１３１は推定符号量データである。
【００５４】
図１の各部について説明する。データ入力部１０は外部から符号化すべきデータを入力し、入力データ１１０として第１の符号量推定部３０へ送出する。パラメータ入力部２０は外部からパラメータを入力し、パラメータデータ１２０として第２の符号量推定部３１へ送出する。第１の符号量推定部３０は入力データ１１０を解析して所定の可逆符号化による符号量を推定し、推定符号量データ１３０として符号化選択部４０へ送出する。第２の符号量推定部３１はパラメータデータ１２０から所定の非可逆符号化による符号量を推定し、推定符号量データ１３１として符号化選択部４０へ送出する。符号化選択部４０は推定符号量データ１３０および１３１に基づき、符号化手法を選択して選択結果データ１４０として選択結果出力部５０へ送出する。選択結果出力部５０は選択結果データ１４０を外部へ出力する。
【００５５】
以上の構成に基づいて本発明の実施例１の動作について説明する。図２は実施例１における符号化動作を示すフローチャートである。図中、図１２と同様の部分には同一の符号を付して説明を省略する。ただし若干異なる部分については説明を加える。
【００５６】
Ｓ１０ではデータ入力部１０およびパラメータ入力部２０において、外部からそれぞれデータとパラメータの入力を受け付ける。Ｓ２０では第１の符号量推定部３０において所定の可逆符号化による符号量を推定する。Ｓ３０では第２の符号量推定部３１において所定の非可逆符号化による符号量を推定する。Ｓ４０ではＳ２０およびＳ３０で推定された符号量に基づいて、符号化方式を選択する。
【００５７】
以上の動作の中で、説明の都合上Ｓ１０でデータとパラメータを同時に入力するように説明したが、これらはそれぞれの符号量推定処理であるＳ２０およびＳ３０に間に合えば良いので、特に同期を取る必要はない。またＳ２０とＳ３０はＳ４０に間に合えば良いので、実際にはこの順序が逆でも構わないし、あるいは並行して行われても良い。
【００５８】
次に第１の符号量推定部３０で行われる可逆符号化の符号量推定処理について説明する。符号化処理は一般に前段のソースコーディングと後段のエントロピーコーディングからなる。ごく一般的な定義で言えば、ソースコーディングは入力に対して何らかの仮定あるいはモデリングをした変換処理で、エントロピーコーディングは統計的な圧縮処理である。前出のＪＰＥＧの例でいえばソースコーディングはＤＣＴと量子化を、またエントロピーコーディングはハフマン符号化を採用している。
【００５９】
一般に符号量推定はソースコーディングの出力を観察することによって得ることが多い。これは入力データの違いによる影響がソースコーディングに出やすいためである。これに対してエントロピーコーディングは入力によらず、比較的安定した圧縮率を示すことが多い。例えばＪＰＥＧの場合、量子化の結果０にならなかった変換係数の個数などから符号量を推定する事ができる。第１の符号量推定部３０で行われる符号量推定処理も、例えばこのようなソースコーディングの結果に基づくものであってよい。
【００６０】
また別の例としてソースコーディング自体ではなく、それを簡略化した処理やその代替処理などの結果から符号量を推定してもよい。このような例については実施例２において詳細に説明する。
【００６１】
ソースコーディングの結果から符号量を求めるには、事前に統計的な実験をしておく必要がある。ここでいう統計的な実験とはそのシステムに入力され得る画像を多く集め、これらのソースコーディングの結果と符号量の関係を統計的に処理することを指す。この統計的処理はあるソースコーディングの結果が得られたときに対応する符号量がいくつになるかを推定するのが目的なので、最も単純な場合は平均を用いればよい。もちろん公知の統計的技術によって重み付けや偏差による補正が行われてもよい。
【００６２】
このような統計処理の結果は第１の符号量推定部３０に保持する必要がある。それは表の形式で保持してもよいし、線形または非線形の式で近似してもよい。もちろんこれらの組み合わせでも構わない。例として代表値を表の形式で持ち、これらの間を補間して求める構成を実施例２において詳細に説明する。またこの部分は入力とすべきデータなどの細部を除いて第２の符号量推定部３１と共通する処理なので、以下で改めて説明する。
【００６３】
次に第２の符号量推定部３１で行われる非可逆符号化の符号量推定処理について説明する。この処理は入力データ１１０を参照することなく行う。例えば統計結果として図１３を得た場合、これら４画像についての推定符号量は統計値、例えば平均、最大値、最小値、最頻値あるいは中間値などから算出しておき、実際の符号量推定処理はこれらを参照して行う。これらの統計値は符号化パラメータ別に求めておき、例えば表のかたちで第２の符号量推定部３１に保持する。図３は符号化パラメータが２種類ある場合における、このような表の概念図である。
【００６４】
表にないパラメータが入力された場合は、線形もしくは非線形の補間を行って、該当する値を算出する。この場合、表にある中で近いパラメータで代用してもよいが、あまり好ましくない。それは非可逆符号化の符号量がこのようなパラメータへ強く依存するためである。
【００６５】
また逆にパラメータを固定して圧縮を運用するような場合がある。例えば画質が厳しく問われるような用途では、結果的にではあるが使えるパラメータの範囲に制限を生じるので、事実上は固定のパラメータで設計してしまっても構わない。このような場合には第２の符号量推定部３１はパラメータにも入力画像にもよらない、固定の符号量を送出する。このときパラメータ入力部２０は本実施例の構成から省くことができる。
【００６６】
最後に符号化選択部４０における選択処理は、基本的には推定符号量データ１３０および１３１のうち、小さい方に対応する符号化を選択する。しかしこの比較に何らかの重みをつけてもよい。ここでいう重みづけとは、推定符号量に何らかの値を加えたり乗じたりする処理を指す。例えば画質の問題で符号量の差がＤ以下の場合は可逆符号化を選択したいような場合、推定符号量データ１３０からＤを減じたものを推定符号量データ１３１と比較する事によって選択処理を行えばよい。
【００６７】
このような補正処理はもっと間接的に行うこともできる。例えば第１の符号量推定部３０もしくは第２の符号量推定部３１にそのような重みづけをする機能を加えてもよい。さらにパラメータ入力部２０の内部でパラメータを調整すれば、それ以外の構成を変更しなくても同等の目的を実現することができる。これについて以下に説明する。
【００６８】
本実施例は実際の符号化を行う部分とは独立なので、パラメータ入力部２０から送出するパラメータデータ１２０は実際に符号化するときのパラメータでなくても構わない。そこで上述のように可逆符号化を優先したい場合、このパラメータを非可逆符号化の圧縮率が悪くなる方に調整する。この調整の度合いは理論的に算出できることもあるし、それができなくても予め統計処理などで求めておける。すると推定符号量データ１３１は実際の符号量よりも多めになるので、結局符号化選択部４０で行われる選択処理を、見かけ上可逆を優先するような処理にすることができる。パラメータ入力部２０は入力インターフェースの部分に実装できるので、例えば本実施例をハードウェアで実装したような場合にも、デバイスドライバなどのハードウェアの制御部分や、ハードウェアを起動するアプリケーションなどでの実装が可能である。
【００６９】
以上の説明の中で説明のための便宜上、選択対象となる非可逆符号化および可逆符号化は１つづつであるかのように説明したが、これがそれぞれ２つ以上であっても構わない。そのような場合の本実施例の拡張については、以上の説明より明らかなので説明を省略する。
【００７０】
以上で説明したように、実施例１によれば非可逆符号化の符号量を画像によらず一定と仮定するので、符号量予測を可逆符号化についてのみ行えばよく、ごく高速かつ低負荷で符号化選択処理を行うことができる。
【００７１】
なお、実施例１の符号化選択装置を用いた符号化装置は図４に示すように構成される。この図では、符号化選択装置（図１）の選択結果出力部５０からの選択結果に基づいて符号化部６０の第１の符号化ユニット６１および第２の符号化ユニット６２を選択して利用するようになっている。第１の符号化ユニット６１は第１の符号量推定部３０に対応し、第２の符号化ユニット６２は第２の符号量推定部３１に対応する。もちろん、第１符号化ユニット６１および第２符号化ユニット６２が符号化部６０全体に対応してもよいし、その一部のステージに対応してもよい。
【００７２】
［実施例２］
本発明の実施例２として、本発明を非可逆符号化であるＪＰＥＧと、可逆符号化である特開平０９−２２４２５３号公報に開示された予測符号化との選択に適用した例について説明する。
【００７３】
以下、実施例２の具体的な説明を行う。図５は実施例２における符号化選択装置を示すブロック図である。図中、図１および図１１と同様の部分には同一の符号を付して説明を省略する。図中、３０３０は予測部、３０４０は予測一致計数部、３０５０は補間部、３０５１は符号量保持部、３０６０は符号量保持部、３０７０は補間部、１３４は予測データ、１３５は予測一致データ、１３６、１３７は推定符号量データである。
【００７４】
図５の各部について説明する。予測部３０３０は所定の１つ以上の予測処理を入力データ１１０に対して行い、その結果を予測データ１３４として予測一致計数部３０４０へ送出する。予測一致計数部３０４０は予測データ１３４と入力データ１１０が一致した回数を計数し、その結果を予測一致データ１３５として補間部３０５０および符号量保持部３０５１へ送出する。符号量保持部３０５１は各予測一致回数に対応する推定符号量を保持し、予測一致データ１３５に基づいて適当な推定符号量を推定符号量データ１３６として補間部３０５０へ送出する。補間部３０５０は予測一致データ１３５に基づいて、必要であれば所定の補間処理を行って推定符号量データ１３０を符号化選択部４０へ送出する。符号量保持部３０６０は各パラメータに対応する推定符号量を保持し、パラメータデータ１２０に基づいて適当な推定符号量を推定符号量データ１３７として補間部３０７０へ送出する。補間部３０７０はパラメータデータ１２０に基づいて、必要であれば所定の補間処理を行って推定符号量データ１３１を符号化選択部４０へ送出する。
【００７５】
詳細な動作については実施例１の説明などから明らかなので、省略する。
【００７６】
以上の構成において、まず第１の符号量推定部３０の詳細について説明する。予測部３０３０における予測は特開平０９−２２４２５３号公報に開示された予測符号化において行われる予測のうち、一部または全部を行う。一部の予測を行う場合、どの予測を行うかについては事前に各予測の一致率と符号量との関係を調べ、より相関性が高い予測を優先的に採用すればよい。また予測が複数の場合、予測別に一致を計数してもよいし、いずれかの予測が一致した回数を計数することも考えられる。これらの選択についても、より符号量と相関性が高くなるような値を優先する。
【００７７】
これらの相関性は事前に統計的な処理によって求めることができる。もちろん、理論的に算出できるような場合は、そのようにしても構わない。例えば特開平０９−２２４２５３号公報に開示された技術の場合、複数の予測から一致した予測を選択するように符号化するので、いずれかの予測が一致した回数だけを計数すれば十分と理論的に決めることができる。
【００７８】
本実施例において、第１の符号量推定部３０で行われる処理は特開平０９−２２４２５３号公報に開示された符号化処理をごく単純化したものである。つまり実施例１の説明で触れた、ソースコーディングを簡略化した処理で符号量を推定する例にあたる。一般にはこうした符号量の推定は実際にソースコーディングを行って符号量を推定するものほど精度が高くないが、予測一致データ１３５と実際の符号量との相関が高ければ、本実施例の目的に供せられる程度の精度は確保することができる。図６はこれを確かめる実験結果である。横軸が予測の一致率、縦軸が特開平０９−２２４２５３号公報に開示された技術による符号量を示している。図６より両者の相関性は明らかに高い。
【００７９】
またこの処理をより簡略化するために、画像をサンプリングして処理することが考えられる。例えば画像からＮラインのサンプルを取り出して、これについての予測一致率をとるだけでも符号量との相関をある程度とることができる。Ｎの値は必要とする精度によって異なるが、例えば全ラインの１／１０程度でもいいし、さらに高速化が必要な場合は入力画像の解像度が高ければ１／１０００程度でも比較的高い相関性を維持できる。またこのサンプルは画像の局所性を避けるために、なるべく画像全体に散っていることが望ましい。
【００８０】
なお以上の説明では都合上、本実施例では特開平０９−２２４２５３号公報に開示された技術を取り上げたが、他の可逆符号化への応用も容易である。例えば差分符号化に関しては、差分をとる対象となる画素値と処理しようとする画素値との相関をとることで符号量推定が可能である。同様にマルコフモデル符号化では各マルコフモデルの出現確率の測定から、ブロックソーティング符号化では条件付き確率の測定から、ＬＺ符号化の場合は周辺画素との相関から、それぞれ符号量の推定が可能である。これらの詳細については本実施例の本質から外れるので省略する。
【００８１】
また本実施例では第１の符号量推定部３０の処理を簡略化したものについて説明してきたが、処理負荷の増加が許容できるのであれば、前述したソースコーディングそのままでももちろん構わない。この場合の構成は以上の説明から容易に類推可能なので、説明を省略する。
【００８２】
なお、以上の説明では補間部３０５０および符号量保持部３０５１の詳細について説明していないが、これは以下に述べる補間部３０７０および符号量保持部３０６０の詳細から容易に類推可能なので、ここでは説明を省略する。
【００８３】
次に第２の符号量推定部３１について説明する。符号量保持部３０６０は保持している推定符号量のうち、入力されたパラメータデータ１２０に近いものを選択する。図３は本実施例においては符号量保持部３０６０が保持する推定符号量の表に相当するが、簡単のため本実施例では符号化パラメータが１つの場合について詳細に説明する。本実施例で仮定しているＪＰＥＧの場合、スケーリングファクタと呼ばれるパラメータをこれに対応させることができる。図７はそのような推定符号量の表の概念図である。いまスケーリングファクタの昇順に、左から並べられているものとする。例えば入力されたパラメータデータ１２０がＳｎより大きく、かつＳｎ＋１より小さかった場合、Ｓｎ、Ｓｎ＋１、Ｃｎ、Ｃｎ＋１を推定符号量データ１３７として補間部３０７０へ送出する。図８はこのときの推定符号量データ１３７のフォーマット例である。
【００８４】
図７におけるスケーリングファクタの間隔は、望ましい符号量の推定精度と保持可能な表のサイズを勘案して決める。スケーリングファクタ間の間隔は一定でなくても構わないので、スケーリングファクタと推定符号量の関係に非線形性が強い部分にサンプル数を多くするのが一般には好ましい。図９はそのような表の一例であるが、これは符号量保持部３０５１における例であって、図６に例示した予測一致率と符号量との関係を示している。
【００８５】
さて以上の説明は上述の通り、符号量保持部３０５１についても同様にあてはまるが、入力データが異なるので注意が必要である。つまり符号量保持部３０６０への入力が符号化パラメータであるのに対して、符号量保持部３０５１への入力は予測の一致回数である。この値は画像の大きさによって異なるので、補正が必要になる。例えば図７のスケーリングファクタを予測の一致率に置き換え、符号量保持部３０５１の内部で予測一致回数を画像サイズで除することで予測一致確率に正規化すれば、画像サイズによらない参照が可能となる。もちろんこのような正規化を予測一致計数部３０４０側で行っても構わない。
【００８６】
次に補間部３０７０での補間について述べる。ここで行われる補間は符号量保持部３０６０が保持する推定符号量が、符号化パラメータに対して十分細かい単位でとられていれば、例えば線形補間のような単純な補間で構わない。この場合、パラメータＳに対して次のように推定符号量Ｃを求める。
【００８７】
【数１】
\[ C = C_n + \frac{(C_{n+1} - C_n)\,(S - S_n)}{S_{n+1} - S_n} \]
【００８８】
推定符号量の表で符号化パラメータの間隔が広く、かつスケーリングファクタと推定符号量の関係が非線形の場合には、もっと複雑な多次の補間式がよい。具体例については公知の技術が多く、また本実施例の本質から外れるので説明を省略する。
【００８９】
本実施例の効果を確認するために、コンピュータ上で本実施例のシミュレーションを行った。図１０は実験結果である。またこのときの本実施例の処理時間は、ＪＰＥＧに比較して約１／４０、特開平０９−２２４２５３号公報に開示された予測符号化に比較しても約１／１０であった。この結果から本実施例の効果は明らかである。
【００９０】
以上で説明したように、実施例２によれば非可逆符号化と可逆符号化の選択を、軽い処理負荷で実現することができる。
【００９１】
【発明の効果】
以上の説明から明らかなように、本発明によれば複数の非可逆符号化と可逆符号化から符号量の意味で最適なものを選択する符号化選択装置において、十分な精度でしかも軽い処理負荷の符号化選択処理を実現することができる。
【図面の簡単な説明】
【図１】本発明の符号化選択装置の実施例１を示す構成図である。
【図２】本発明の符号化選択装置の実施例１における動作の一例を示すフローチャートである。
【図３】本発明の符号化選択装置の実施例１の符号量推定処理において保持する推定符号量の表の概念図である。
【図４】実施例１の符号化選択装置を採用した符号化装置を示す構成図である。
【図５】本発明の符号化選択装置の実施例２を示す構成図である。
【図６】本発明の符号化選択装置の実施例２における可逆符号化の予測一致率と符号量の関係の一例を示す説明図である。
【図７】本発明の符号化選択装置の実施例２の符号量推定処理において保持する推定符号量の表の概念図である。
【図８】本発明の符号化選択装置の実施例２の符号量推定処理において使用する推定符号量データ１３７の概念図である。
【図９】本発明の符号化選択装置の実施例２の符号量推定処理において保持する推定符号量の表の例である。
【図１０】本発明の符号化選択装置の実施例２による実験結果の一例を示す説明図である。
【図１１】従来例の符号化選択装置を示す構成図である。
【図１２】従来例の符号化選択装置の動作の一例を示すフローチャートである。
【図１３】非可逆符号化と可逆符号化との性質の違いを説明する実験結果の説明図である。
【符号の説明】
１０データ入力部
２０パラメータ入力部
３０第１の符号量推定部
３１第２の符号量推定部
４０符号化選択部
５０選択結果出力部
６０符号化部
６１第１の符号化ユニット
６２第２の符号化ユニット
１１０入力データ
１２０パラメータデータ
１３０推定符号量データ
１３１推定符号量データ
１３２ブロックデータ
１３３ＤＣＴデータ
１３４予測データ
１３５予測一致データ
１３６推定符号量データ
１３７推定符号量データ
１４０選択結果データ
１０１０Ａ／Ｄ変換器
１０２０フレームメモリ
３０１０ラスターブロック変換部
３０２０ＤＣＴ変換部
３０３０予測部
３０４０予測一致計数部
３０５０補間部
３０５１符号量保持部
３０６０符号量保持部
３０７０補間部
[0001]
[Technical Field of the Invention]
The present invention relates to a technique for selecting a compression method, and more particularly to an apparatus that selects between lossless and lossy encoding by simple means.
[0002]
[Prior art]
Image data is generally enormous in volume, so it is often compressed to reduce the amount of data for communication or storage. In applications other than images as well, compression is indispensable, particularly when large amounts of data are handled.
[0003]
There are many encoding techniques for compression. Broadly, they divide into lossless (reversible) methods, which reproduce the input exactly when decoded, and lossy (irreversible) methods, which involve some loss. Classifying these two families further by basic algorithm, parameters, and so on yields many individual methods. These compression methods are hereinafter referred to as encodings.
[0004]
These various encodings produce different code amounts even for the same input. This is referred to here as the input dependency of an encoding. In principle, no single encoding can compress every input efficiently. Therefore, a system that needs to be free of this input dependency requires a mechanism that switches between encodings according to the input.
[0005]
In such a system, two approaches are conceivable: predict the code amount in advance and select an encoding on that basis, or actually compress with every encoding, check the resulting code amounts, and then select. The latter is accurate but imposes a heavy load, so it is unsuitable for general systems. Consequently, encoding selection depends on code amount prediction.
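The exhaustive approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the method of this patent: three Python standard-library compressors stand in for the candidate encodings (all three are lossless, whereas the candidates discussed here also include lossy encodings), and the name `select_by_trial` is hypothetical.

```python
import bz2
import lzma
import zlib

# Stand-in candidate encodings (assumption: standard-library lossless
# compressors replace the unspecified encodings discussed in the text).
CANDIDATES = {
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def select_by_trial(data):
    """Run every candidate encoder and pick the one with the smallest output.

    Returns (name_of_best, {name: compressed_size_in_bytes}).
    """
    sizes = {name: len(encode(data)) for name, encode in CANDIDATES.items()}
    best = min(sizes, key=sizes.get)
    return best, sizes


if __name__ == "__main__":
    best, sizes = select_by_trial(b"abcabcabc" * 1000)
    print(best, sizes)
```

The cost is evident from the structure: every candidate encoder runs to completion on the full input before any decision is made, which is the load that prediction-based selection tries to avoid.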
[0006]
In general, code amounts are difficult to predict. In particular, for transform coding such as the international standard JPEG (Joint Photographic Experts Group), which uses the DCT (Discrete Cosine Transform), it is almost impossible to estimate the code amount without performing the transform. This is because transform coding encodes the transformed data.
[0007]
The method of Japanese Patent Laid-Open No. 10-243388, a conventional technique for selecting between two types of encoding including such transform coding, is therefore described as a conventional example. This conventional example selects the optimal encoding from transform coding and lossless coding by analyzing the transform result.
[0008]
The conventional example states that some images are prone to image quality degradation when lossy encoding is selected. Since the present invention does not take image quality into consideration, its purpose may at first seem different from that of the conventional example. However, given that image quality and code amount are generally in a trade-off relationship in lossy encoding, the case the conventional example is concerned with is also a problem of increased code amount when image quality is to be maintained under lossy encoding. In other words, the issue addressed by the conventional example can be regarded as a code amount problem, in which case it is essentially the same as the problem the present invention addresses.
[0009]
FIG. 11 shows a configuration example of the encoding selection apparatus of the conventional example. Some terminology has been changed to suit the description of the present invention, but this does not affect the essence of the conventional technique. In the figure, 10 is a data input unit, 30 is a first code amount estimation unit, 40 is an encoding selection unit, 50 is a selection result output unit, 1010 is an A / D converter, 1020 is a frame memory, 3010 is a raster block conversion unit, 3020 is a DCT conversion unit, 110 is input data, 132 is block data, 133 is DCT data, and 140 is selection result data.
[0010]
Each part of FIG. 11 will be described. The encoding selection apparatus in FIG. 11 has the following configuration. The A / D converter 1010 receives data from the outside and sends it as input data 110 to the frame memory 1020. The frame memory 1020 stores the whole of the input data 110 and then sends it on, again as input data 110, to the raster block conversion unit 3010. The raster block conversion unit 3010 converts the scan order of the input data to divide it into blocks, and sends the result as block data 132 to the DCT conversion unit 3020. The DCT conversion unit 3020 performs DCT processing on the block data 132 and sends the result to the encoding selection unit 40 as DCT data 133. The encoding selection unit 40 selects an optimal encoding based on the DCT data 133 and sends this to the selection result output unit 50 as selection result data 140. The selection result output unit 50 sends the selection result data 140 to the outside.
[0011]
In the above configuration, the selection result output unit 50 is not included in the description of the conventional example, but is added for convenience of explanation. The encoding selection unit 40 is named as a compression method determiner in the conventional example, but this is also changed for convenience of explanation. The conventional example also includes means for performing actual encoding, but this is also omitted for the sake of convenience in order to clarify the features of the conventional example.
[0012]
The operation of the conventional example based on the above configuration will be described. FIG. 12 is a flowchart showing the operation of the conventional encoding selection apparatus. The operation of the conventional example will be described below with reference to FIG. 12.
[0013]
In S10, data is input via the A / D converter 1010 and the frame memory 1020. In S31, the raster block conversion unit 3010 performs block division. In S32, the DCT conversion unit 3020 performs DCT processing. In S40, the encoding selection unit 40 performs encoding selection. In S50, the selection result output unit 50 outputs the selection result.
[0014]
Among the operations described above, the coding selection in the coding selection unit 40 is performed by analyzing the frequency distribution of the DCT data 133 to determine the nature of the image and selecting the coding that seems to be optimal. Specifically, by evaluating the amount of high frequency components, transform coding by DCT and lossless coding by DPCM (Differential Pulse Code Modulation) are automatically switched. In the conventional example, it is described that the compression rate by DCT cannot be increased for an image having many high-frequency components. From the viewpoint of the present invention, although the amount of code is not calculated, this is nothing but processing equivalent to rough code amount prediction.
[0015]
Next, the problems of the conventional example are described. Among image compression processes, DCT processing is relatively heavy. For example, DPCM, one of the selection candidates in the conventional example, requires only one subtraction per pixel, whereas an 8 × 8 DCT in principle requires 16 multiplications and 14 additions per pixel. Implementation techniques can reduce these figures, but the processing remains far heavier than DPCM. Furthermore, a block line memory is needed to cut out the blocks subjected to the DCT, so a larger processing apparatus is required in this respect as well.
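The figures of 16 multiplications and 14 additions per pixel are consistent with a separable 8 × 8 DCT computed as two passes (rows, then columns) of direct 8-point matrix multiplication. The text does not state its derivation, so the following count is one plausible reading rather than the author's own calculation; the function name is hypothetical.

```python
def separable_dct_ops_per_pixel(n=8):
    """Per-pixel operation count for an n x n DCT computed as two passes
    of direct n-point matrix multiplication (no fast algorithm).

    Returns (multiplications_per_pixel, additions_per_pixel).
    """
    mults_per_output = n        # one multiply per input sample of the 1-D transform
    adds_per_output = n - 1     # adding up n products takes n - 1 additions
    outputs_per_pass = n * n    # each pass produces every value of the block
    passes = 2                  # row pass followed by column pass
    pixels = n * n
    mults = passes * outputs_per_pass * mults_per_output // pixels
    adds = passes * outputs_per_pass * adds_per_output // pixels
    return mults, adds


# DPCM, by contrast, needs a single subtraction per pixel.
DPCM_SUBTRACTIONS_PER_PIXEL = 1

if __name__ == "__main__":
    print(separable_dct_ops_per_pixel(8))
```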
[0016]
The problem with the conventional example is that this heavy DCT processing must be performed even for images for which DPCM is ultimately selected. This is because the selection requires the result of DCT processing. As stated at the outset, from the viewpoint of the present invention the reason for switching encodings without actually computing the code amount is to lighten the processing load; performing DCT processing to achieve this contradicts that purpose.
[0017]
[Problems to be solved by the invention]
As described above, the problem with the conventional example is that the load of the encoding selection process itself becomes heavier than the actual encoding process.
[0018]
The present invention has been made in view of the above circumstances, and an object of the present invention is to provide a code amount prediction process and an encoding selection process that are sufficiently lighter than the encoding process.
[0019]
[Means for Solving the Problems]
According to the present invention, in order to achieve the above-mentioned object, the configuration as described in the claims is adopted. Here, supplementary explanation will be given for the contents described in the claims.
[0020]
According to one aspect of the present invention, an encoding selection apparatus comprises: data input means for inputting input data; first code amount estimation means for estimating, based on the data input by the data input means, a code amount for at least one predetermined encoding; second code amount estimation means for estimating a code amount for at least one predetermined encoding; encoding selection means for determining an encoding method based on a comparison of the code amounts estimated by the first and second code amount estimation means; and selection result output means for outputting the result of the encoding selection means to the outside. The code amount estimation processing in the second code amount estimation means is performed independently of the input data.
[0021]
In this configuration, the code amount estimation by the second code amount estimation means is independent of the nature of the input data and can therefore be performed with light processing. For example, consider selecting between lossless encoding and transform coding (lossy encoding). Conventional code amount estimation for transform coding imposes a high load, but the resulting code amount is relatively stable with respect to the nature of the input data compared with lossless encoding. Conventional code amount estimation for lossless encoding needs only a relatively small load, but the resulting code amount varies greatly with the nature of the input data. Therefore, by performing ordinary code amount estimation for lossless encoding and input-independent estimation for transform coding, an increase in code amount can be suppressed while the load is kept low.
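A minimal sketch of this split is given below, under loudly stated assumptions. The lossless estimator shown (counting pixels equal to their left neighbour and charging a made-up cost per match and per miss) and the constant 0.25 bytes per pixel for the lossy side are illustrative placeholders only; the text prescribes neither. What the sketch preserves is the structure: only the lossless estimate reads the pixel data, while the lossy estimate depends on the data size alone.

```python
def estimate_lossy_bytes(num_pixels, bytes_per_pixel=0.25):
    """Input-independent estimate: uses only the size of the data.
    The 0.25 bytes/pixel figure is a hypothetical placeholder."""
    return num_pixels * bytes_per_pixel


def estimate_lossless_bytes(pixels):
    """Cheap data-dependent estimate (hypothetical cost model):
    a pixel equal to its predecessor costs 1/8 byte, any other costs 1 byte."""
    if not pixels:
        return 0.0
    matches = sum(1 for prev, cur in zip(pixels, pixels[1:]) if prev == cur)
    misses = len(pixels) - matches
    return misses * 1.0 + matches * 0.125


def select_encoding(pixels):
    """Choose the encoding whose estimated code amount is smaller."""
    lossless = estimate_lossless_bytes(pixels)
    lossy = estimate_lossy_bytes(len(pixels))
    return "lossless" if lossless <= lossy else "lossy"


if __name__ == "__main__":
    print(select_encoding(bytes(1000)))            # flat data
    print(select_encoding(bytes(range(256)) * 4))  # no repeated neighbours
```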
[0022]
In this configuration, parameter input means for inputting an encoding parameter may further be provided, and the second code amount estimation means may estimate the code amount based only on the parameter input by the parameter input means, without depending on the input data. The comparison of estimated code amounts may also be weighted.
[0023]
It may be estimated as a constant code amount (proportional to the size of the input data) regardless of the parameters. Alternatively, the estimation may be performed using a function that depends on the size of the input data.
[0024]
Further, the code amount estimation processing performed by the first code amount estimation means may be performed for lossless encoding as described above.
[0025]
The code amount estimation processing performed by the first code amount estimation means may estimate the code amount based on the result of part of the source coding of the target encoding, or of a simplified version of that processing. The estimation may also use simple symbols.
[0026]
Further, the code amount estimation processing performed by the first code amount estimation means may refer to a statistical summary, held in a form such as a table or a formula, of the relationship between the code amount and the result of partial source coding or its simplified processing, and may calculate the estimated code amount by applying interpolation where necessary.
[0027]
Also, the code amount estimation process performed by the second code amount estimation means may be performed for lossy encoding as described above.
[0028]
The code amount estimation processing performed by the second code amount estimation means may refer to a statistical summary of the relationship between the input encoding parameter and the code amount, and may calculate the estimated code amount by applying interpolation where necessary.
[0029]
Also, one of the encodings may be made to be selected preferentially by applying a predetermined correction, such as the four basic arithmetic operations, to the parameter input by the parameter input means.
[0030]
Further, one of the encodings may be made to be selected preferentially by applying a predetermined correction, such as the four basic arithmetic operations, to at least one of the estimated code amounts calculated by the first code amount estimation means or the second code amount estimation means.
[0031]
Also, one of the encodings may be made to be selected preferentially by applying a predetermined correction, such as the four basic arithmetic operations, to at least one of the estimated code amounts input to the encoding selection means.
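Such a correction can be as simple as an offset or a scale factor applied before the comparison. The sketch below assumes two hypothetical knobs: `margin`, subtracted from the lossless estimate so that lossless encoding is chosen whenever the estimated difference is within that margin, and `lossy_scale`, a multiplier on the lossy estimate. Both names and the example values are illustrative.

```python
def select_with_bias(lossless_estimate, lossy_estimate, margin=0.0, lossy_scale=1.0):
    """Compare corrected estimates and return the preferred encoding.

    margin      -- amount subtracted from the lossless estimate (favours lossless)
    lossy_scale -- factor applied to the lossy estimate (> 1 favours lossless)
    """
    adjusted_lossless = lossless_estimate - margin
    adjusted_lossy = lossy_estimate * lossy_scale
    return "lossless" if adjusted_lossless <= adjusted_lossy else "lossy"


if __name__ == "__main__":
    print(select_with_bias(1100, 1000))                   # plain comparison
    print(select_with_bias(1100, 1000, margin=200))       # offset correction
    print(select_with_bias(1100, 1000, lossy_scale=1.2))  # multiplicative correction
```

Where the correction is applied (on the parameter, inside an estimator, or in the selector) does not change the comparison itself, which is why the three variants described above are interchangeable in effect.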
[0032]
Further, the first code amount estimation means may comprise prediction means for predicting the data input by the data input means, prediction match counting means for counting the number of times the prediction matches the input data, code amount holding means for holding the relationship between the number of prediction matches and the code amount, and interpolation means for interpolating the estimated code amount given by the code amount holding means according to the number of prediction matches.
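The four components named above (predictor, match counter, held table, interpolator) can be sketched as follows. The previous-pixel predictor and every number in `MATCH_RATE_TABLE` are assumptions made for illustration; in practice the table would be filled from statistics gathered in advance with the actual lossless encoder.

```python
from bisect import bisect_right

# Hypothetical held table: (prediction match rate, estimated bits per pixel),
# sorted by match rate. The values are invented for illustration.
MATCH_RATE_TABLE = [(0.0, 8.0), (0.5, 4.5), (0.8, 2.0), (0.95, 0.6), (1.0, 0.1)]


def count_prediction_matches(pixels):
    """Predict each pixel as equal to its predecessor and count the hits."""
    return sum(1 for prev, cur in zip(pixels, pixels[1:]) if prev == cur)


def interpolate(table, x):
    """Piecewise-linear interpolation over a table sorted by its first column."""
    xs = [entry[0] for entry in table]
    if x <= xs[0]:
        return table[0][1]
    if x >= xs[-1]:
        return table[-1][1]
    i = bisect_right(xs, x)          # xs[i - 1] <= x < xs[i]
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    return y0 + (y1 - y0) * (x - x0) / (x1 - x0)


def estimate_lossless_bits(pixels):
    """Estimate the lossless code amount from the prediction match rate."""
    if len(pixels) < 2:
        return 8.0 * len(pixels)
    rate = count_prediction_matches(pixels) / (len(pixels) - 1)
    return interpolate(MATCH_RATE_TABLE, rate) * len(pixels)


if __name__ == "__main__":
    print(estimate_lossless_bits(bytes(100)))
    print(estimate_lossless_bits(bytes(range(100))))
```

Dividing the match count by the number of predictions turns it into a rate, so the same table can serve images of any size.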
[0033]
Further, the second code amount estimation means may comprise code amount holding means for holding the relationship between the parameter input by the parameter input means and the code amount, and interpolation means for interpolating the estimated code amount given by the code amount holding means according to the input parameter.
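A sketch of such a parameter-only estimator follows. It assumes a single encoding parameter (a JPEG-style scaling factor where larger values mean coarser quantization) and an invented table of bits per pixel; neither the values nor the function name come from the source. Note that the function never receives pixel data.

```python
# Hypothetical held table: (scaling factor, estimated bits per pixel),
# sorted by scaling factor. The values are invented for illustration.
SCALING_TABLE = [(0.5, 2.4), (1.0, 1.5), (2.0, 0.9), (4.0, 0.5)]


def estimate_lossy_bits(scaling_factor, num_pixels, table=SCALING_TABLE):
    """Estimate the lossy code amount from the parameter and data size only."""
    if scaling_factor <= table[0][0]:
        bits_per_pixel = table[0][1]
    elif scaling_factor >= table[-1][0]:
        bits_per_pixel = table[-1][1]
    else:
        for (s0, c0), (s1, c1) in zip(table, table[1:]):
            if s0 <= scaling_factor <= s1:
                # Linear interpolation between the two surrounding entries.
                bits_per_pixel = c0 + (c1 - c0) * (scaling_factor - s0) / (s1 - s0)
                break
    return bits_per_pixel * num_pixels


if __name__ == "__main__":
    print(estimate_lossy_bits(1.5, 1000))
```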
[0034]
The prediction performed by the prediction means may use at least two prediction methods, and the prediction match counting means may count a prediction match whenever at least one of the prediction methods matches.
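The counting rule can be illustrated with two simple predictors. The choice of the left and upper neighbours as predictors is an assumption for this sketch; the source does not fix which predictors are used.

```python
def count_any_match(image, width):
    """Count pixels matched by at least one predictor.

    image -- flat sequence of integer pixel values in raster order
    width -- number of pixels per line
    Predictors (assumed): left neighbour and upper neighbour.
    """
    matches = 0
    for i, value in enumerate(image):
        left = image[i - 1] if i % width != 0 else None   # no left neighbour at line start
        above = image[i - width] if i >= width else None  # no upper neighbour on first line
        if value == left or value == above:
            matches += 1  # counted once even if both predictors match
    return matches


if __name__ == "__main__":
    sample = [1, 1, 2,
              1, 5, 2]
    print(count_any_match(sample, width=3))
```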
[0035]
Further, the data input means may partially select input data and send it to the first code amount estimation means.
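One way to realize such partial selection is to pass only a subset of scan lines to the estimator, spread evenly so that no single region of the image dominates. The helper below is a hypothetical sketch of that idea using integer arithmetic only.

```python
def sample_row_indices(height, n):
    """Return n row indices spread evenly over range(height).

    If n exceeds height, every row is returned. Assumes height >= 1.
    """
    n = max(1, min(n, height))
    # Centre of each of n equal slices, computed without floating point.
    return [(2 * i + 1) * height // (2 * n) for i in range(n)]


def sample_rows(rows, n):
    """Pick the sampled lines from a list of rows."""
    return [rows[i] for i in sample_row_indices(len(rows), n)]


if __name__ == "__main__":
    print(sample_row_indices(100, 10))
```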
[0036]
In addition, the code amount holding units in the first and second code amount estimation means may each hold results obtained statistically in advance, using the corresponding encoding, for the relationship between the code amount and the input data or the input parameter, respectively.
[0037]
The data stored in the code amount holding units of the first and second code amount estimation means may be held at finer intervals in regions where the relationship between the code amount and the input data or the input parameter, respectively, is particularly nonlinear.
[0038]
Further, the code amount holding means in the first and second code amount estimation means may each select, for the input data or the input parameter respectively, either the single closest entry or both the closest entry on the inner side and the closest entry on the outer side (that is, the nearest entries on either side of the input value), and send the selection to the corresponding interpolation means.
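The two selection modes can be sketched as follows, assuming the held data is a list of (key, code amount) pairs sorted by key with at least two entries. The function names and table values are hypothetical.

```python
from bisect import bisect_left


def nearest_entry(table, key):
    """Return the single table entry whose key is closest to the input."""
    return min(table, key=lambda entry: abs(entry[0] - key))


def bracketing_entries(table, key):
    """Return the two neighbouring entries that surround the input key.

    For keys outside the table range, the two outermost entries on that
    side are returned so that a caller may still interpolate or clamp.
    """
    keys = [k for k, _ in table]
    i = bisect_left(keys, key)           # first index with keys[i] >= key
    i = min(max(i, 1), len(table) - 1)   # keep both i - 1 and i valid
    return table[i - 1], table[i]


if __name__ == "__main__":
    table = [(0.5, 2.4), (1.0, 1.5), (2.0, 0.9)]
    print(nearest_entry(table, 1.1))
    print(bracketing_entries(table, 1.5))
```

Sending the bracketing pair gives the interpolation stage everything it needs; sending only the nearest entry is cheaper but leaves nothing to interpolate.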
[0039]
The present invention can also be realized as an encoding device.
[0040]
In addition, the present invention is not only implemented in an apparatus or system, but can also be realized in a method aspect, and at least a part of the present invention can be configured as a computer program.
[0041]
DETAILED DESCRIPTION OF THE INVENTION
Examples of the present invention will be described in detail below.
[0042]
[Basic principles]
Prior to specific description of the embodiments of the present invention, the basic principle of the present invention will be described. The present invention mainly consists of two principles.
[0043]
Hereinafter, the first principle will be described. The present invention relates to selection between lossless encoding and lossy encoding. First, the difference in characteristics between the two will be described. Lossless encoding has a theoretical compression limit. This limit is generally called the amount of information, and entropy is one example of a numerical measure of it. The limit can therefore change drastically depending on the amount of information contained in the input data, and the resulting code amount changes just as drastically.
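As a minimal illustration of why the lossless limit depends so strongly on the input, the following sketch computes the zeroth-order Shannon entropy of a byte sequence. The function name and the two sample inputs are assumptions introduced here for illustration and are not part of the original description; real images also have inter-pixel correlation, which this simple measure ignores.

```python
import math
from collections import Counter


def entropy_bits_per_symbol(data: bytes) -> float:
    """Zeroth-order Shannon entropy in bits per byte (0.0 for empty input)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical inputs: a perfectly flat "image" and one that uses every value equally.
flat = bytes([128] * 1024)
varied = bytes(range(256)) * 4

flat_entropy = entropy_bits_per_symbol(flat)      # nothing to encode per symbol
varied_entropy = entropy_bits_per_symbol(varied)  # a full 8 bits per symbol
```

The two inputs have the same length, yet their lower bounds differ from zero to eight bits per symbol, which is the kind of input dependence the first principle refers to.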
[0044]
On the other hand, lossy encoding always includes quantization or equivalent processing. For example, in the case of JPEG described above, linear quantization is applied to the result of DCT processing. This quantization is generally designed to quantize coarsely the information that appears to have little influence on image quality. In the JPEG example, high-frequency components are quantized more coarsely than low-frequency components. For this reason, when much of the information in an image lies in its high-frequency components, for example, the quantization keeps the code amount from growing. That is, the code amount is more stable than with lossless encoding.
[0045]
However, the code amount of lossy encoding varies greatly when the encoding parameter changes. This parameter often includes a value that controls the quantization process. In the example of JPEG, a quantization table can be given as a parameter. The more accurate statement is therefore that, when the encoding parameter is held constant, the code amount of lossy encoding is often more stable than that of lossless encoding regardless of the input data.
[0046]
FIG. 13 shows the result of an experiment conducted to confirm this property. A kind of predictive coding was used for lossless encoding. JPEG was used for lossy encoding, with the quantization made progressively coarser starting from lossy encoding 1. In lossless encoding, the code amount varied by a factor of 100 or more depending on the image, whereas in lossy encoding the variation stayed at about a factor of 7 even for lossy encoding 1, where it was largest. The above is the first principle of the present invention.
[0047]
In order to explain the second principle of the present invention, code amount prediction for lossless encoding will be described. Because lossless encoding must allow the data to be restored without any loss of information, its processing cannot include calculations that may introduce errors, such as general frequency transforms. In principle, this means that lossless encoding involves almost no floating-point operations. Division is likewise difficult to handle, so the input data is generally used as it is, or at most with addition and subtraction applied.
[0048]
Lossless encoding is the same as lossy encoding in that code amount prediction requires part of the actual encoding or a simplified version of it. However, as described above, lossless encoding generally consists of lightweight processing, so its code amount can be predicted relatively easily compared with lossy encoding. This is the second principle constituting the present invention.
[0049]
As described above, the following two principles have been clarified.
[0050]
First principle: The code amount of lossy encoding depends less on the input image than that of lossless encoding.
Second principle: The code amount prediction of lossless encoding is easier than that of lossy encoding.
[0051]
Therefore, in the present invention, the code amount of lossless encoding is estimated by code amount prediction and compared with the code amount of lossy encoding, which is assumed to be constant regardless of the image, and the encoding expected to yield the smaller code amount is selected. At this time, the accuracy of the lossy-encoding code amount is improved by applying a correction based on its relationship with the encoding parameter, which is examined statistically in advance.
[0052]
Specific examples of the present invention are described in the Examples. The following are described below as examples of the present invention:
(1) A general example
(2) An example applied to the selection between predictive coding and JPEG
[0053]
[Example 1]
As Example 1 of the present invention, a general example is first described. Hereinafter, specific description of the first embodiment will be given. FIG. 1 is a block diagram illustrating an encoding selection apparatus according to the first embodiment. In the figure, parts similar to those in FIG. In the figure, 20 is a parameter input unit, 31 is a second code amount estimation unit, 120 is parameter data, and 130 and 131 are estimated code amount data.
[0054]
Each part of FIG. 1 will be described. The data input unit 10 receives data to be encoded from the outside and sends it as input data 110 to the first code amount estimation unit 30. The parameter input unit 20 receives parameters from the outside and sends them as parameter data 120 to the second code amount estimation unit 31. The first code amount estimation unit 30 analyzes the input data 110 to estimate the code amount of a predetermined lossless encoding, and sends the estimated code amount data 130 to the encoding selection unit 40. The second code amount estimation unit 31 estimates the code amount of a predetermined lossy encoding from the parameter data 120 and sends the estimated code amount data 131 to the encoding selection unit 40. The encoding selection unit 40 selects an encoding method based on the estimated code amount data 130 and 131 and sends it to the selection result output unit 50 as selection result data 140. The selection result output unit 50 outputs the selection result data 140 to the outside.
[0055]
The operation of the first embodiment of the present invention will be described based on the above configuration. FIG. 2 is a flowchart showing the encoding operation in the first embodiment. In the figure, parts similar to those in FIG. However, explanations will be added for slightly different parts.
[0056]
In S10, the data input unit 10 and the parameter input unit 20 receive data and parameter inputs from the outside. In S20, the first code amount estimation unit 30 estimates a code amount by a predetermined lossless encoding. In S30, the second code amount estimation unit 31 estimates a code amount by predetermined lossy encoding. In S40, an encoding method is selected based on the code amount estimated in S20 and S30.
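The flow S10 to S40 can be sketched as follows. This is only an illustrative outline under stated assumptions: the function names, the two toy estimators, and the tie-breaking rule (lossless is chosen when the estimates are equal) are hypothetical and are not specified in the description.

```python
from typing import Callable, Sequence


def select_encoding(
    data: Sequence[int],
    param: float,
    estimate_lossless: Callable[[Sequence[int]], float],
    estimate_lossy: Callable[[float], float],
) -> str:
    """S10: data and param arrive; S20/S30: estimate; S40: pick the smaller."""
    lossless_amount = estimate_lossless(data)  # S20: looks at the input data
    lossy_amount = estimate_lossy(param)       # S30: looks only at the parameter
    # S40: assumed tie-break in favour of lossless encoding
    return "lossless" if lossless_amount <= lossy_amount else "lossy"


def toy_lossless_estimate(data: Sequence[int]) -> float:
    """Toy stand-in: one unit of code per pixel that differs from its predecessor."""
    return float(sum(1 for a, b in zip(data, data[1:]) if a != b))


def toy_lossy_estimate(param: float) -> float:
    """Toy stand-in: code amount depends on the parameter only, not on the data."""
    return 100.0 / param


flat_choice = select_encoding([5] * 50, 10.0, toy_lossless_estimate, toy_lossy_estimate)
busy_choice = select_encoding(list(range(50)), 10.0, toy_lossless_estimate, toy_lossy_estimate)
```

Because the lossy estimate never touches the input data, S30 can run before, after, or in parallel with S20.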
[0057]
In the above operation, for the sake of explanation, data and parameters were described as being input simultaneously in S10. However, since each need only be available in time for the corresponding code amount estimation process S20 or S30, no particular synchronization is required. Likewise, since S20 and S30 need only be completed in time for S40, their order may be reversed or they may be performed in parallel.
[0058]
Next, the code amount estimation processing of lossless encoding performed by the first code amount estimation unit 30 will be described. In general, the encoding process includes source coding at the previous stage and entropy coding at the subsequent stage. In a very general definition, source coding is a conversion process with some assumptions or modeling of the input, and entropy coding is a statistical compression process. In the above JPEG example, source coding employs DCT and quantization, and entropy coding employs Huffman coding.
[0059]
In general, code amount estimation is often obtained by observing the output of source coding. This is because the influence of the difference in input data is likely to appear in source coding. On the other hand, entropy coding often shows a relatively stable compression rate regardless of input. For example, in the case of JPEG, the code amount can be estimated from the number of transform coefficients that have not become 0 as a result of quantization. The code amount estimation process performed by the first code amount estimation unit 30 may also be based on the result of such source coding, for example.
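As a rough sketch of the JPEG example above, the following counts the transform coefficients that remain nonzero after linear quantization. The coefficient values and quantization steps are made-up numbers for illustration, and the function name is hypothetical; a real JPEG encoder works on 8x8 DCT blocks with a full quantization table.

```python
from typing import Sequence


def count_nonzero_after_quantization(
    coeffs: Sequence[float], quant_steps: Sequence[int]
) -> int:
    """Count coefficients whose linearly quantized value is not zero."""
    if len(coeffs) != len(quant_steps):
        raise ValueError("coeffs and quant_steps must have the same length")
    return sum(1 for c, q in zip(coeffs, quant_steps) if round(c / q) != 0)


# Hypothetical coefficients, ordered from low to high frequency.
coeffs = [120.0, -30.0, 7.0, 3.0]
fine_steps = [16, 11, 10, 16]
coarse_steps = [32, 22, 20, 32]

fine_count = count_nonzero_after_quantization(coeffs, fine_steps)
coarse_count = count_nonzero_after_quantization(coeffs, coarse_steps)
```

Coarser quantization leaves fewer nonzero coefficients, which is why this count can serve as an indicator of the code amount.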
[0060]
As another example, the code amount may be estimated not from source coding itself but from the result of simplified processing or alternative processing thereof. Such an example will be described in detail in the second embodiment.
[0061]
In order to obtain the code amount from the result of source coding, a statistical experiment must be conducted in advance. The statistical experiment here refers to collecting a large number of images that may be input to the system and statistically processing the relationship between the source coding results and the code amounts. Since the purpose of this statistical processing is to estimate the code amount corresponding to a given source coding result, an average may be used in the simplest case. Of course, correction by weighting or by deviation may be performed using known statistical techniques.
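The simplest form of this statistical processing, averaging the measured code amounts for each source coding result, might look like the sketch below. The sample pairs are invented for illustration, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple


def build_code_amount_table(samples: Iterable[Tuple[int, float]]) -> Dict[int, float]:
    """Average the measured code amount for each observed source coding result."""
    buckets: Dict[int, List[float]] = defaultdict(list)
    for source_result, code_amount in samples:
        buckets[source_result].append(code_amount)
    return {result: mean(amounts) for result, amounts in sorted(buckets.items())}


# Hypothetical (source coding result, measured code amount) pairs from training images.
training_samples = [(10, 100.0), (10, 120.0), (20, 50.0)]
table = build_code_amount_table(training_samples)
```

Weighting or deviation-based correction, as mentioned above, would replace `mean` with a different reducer.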
[0062]
The result of such statistical processing needs to be held in the first code amount estimation unit 30. It may be kept in tabular form or approximated by a linear or non-linear equation. Of course, these combinations may be used. As an example, a configuration having representative values in the form of a table and interpolating between them will be described in detail in the second embodiment. Since this part is a process common to the second code amount estimation unit 31 except for details such as data to be input, it will be described below again.
[0063]
Next, the code amount estimation process for lossy encoding performed by the second code amount estimation unit 31 will be described. This process is performed without referring to the input data 110. For example, when FIG. 13 is obtained as a statistical result, an estimated code amount is calculated from a statistic of the code amounts for these four images, such as the average, maximum, minimum, mode, or median, and the actual code amount estimation refers to this value. These statistics are obtained for each encoding parameter and stored in the second code amount estimation unit 31, for example in the form of a table. FIG. 3 is a conceptual diagram of such a table when there are two types of encoding parameters.
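A minimal sketch of reducing measured code amounts to one representative value per encoding parameter is shown below. The measurements are invented numbers, not the values of FIG. 13, and the function and variable names are assumptions made for this example.

```python
from statistics import mean, median
from typing import Dict, List


def summarize_lossy_code_amounts(
    measured: Dict[float, List[float]], statistic: str = "mean"
) -> Dict[float, float]:
    """Reduce the code amounts measured per parameter to one value per parameter."""
    reducers = {"mean": mean, "median": median, "max": max, "min": min}
    pick = reducers[statistic]
    return {param: float(pick(amounts)) for param, amounts in measured.items()}


# Hypothetical code amounts for four test images at two parameter settings.
measured = {
    1.0: [400.0, 520.0, 610.0, 470.0],
    2.0: [250.0, 300.0, 330.0, 280.0],
}
mean_table = summarize_lossy_code_amounts(measured, "mean")
max_table = summarize_lossy_code_amounts(measured, "max")
median_table = summarize_lossy_code_amounts(measured, "median")
```

Choosing the maximum instead of the mean makes the lossy estimate pessimistic, which biases the later selection toward lossless encoding.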
[0064]
When a parameter not listed in the table is input, linear or nonlinear interpolation is performed to calculate a corresponding value. In this case, parameters close to those in the table may be substituted, but this is not preferable. This is because the code amount of lossy encoding strongly depends on such parameters.
[0065]
Conversely, there are cases where compression is performed with fixed parameters. For example, in applications with strict image quality requirements, the range of usable parameters is limited, and as a result the system may in practice be designed with fixed parameters. In such a case, the second code amount estimation unit 31 sends out a fixed code amount that depends on neither a parameter nor the input image. The parameter input unit 20 can then be omitted from the configuration of the present embodiment.
[0066]
Finally, the selection processing in the encoding selection unit 40 basically selects the encoding corresponding to the smaller of the estimated code amount data 130 and 131. However, this comparison may be weighted. The weighting here refers to adding a value to, or multiplying, an estimated code amount. For example, if lossless encoding should be selected for image quality reasons whenever the difference in code amount is D or less, the selection may be performed by comparing the estimated code amount data 130 minus D with the estimated code amount data 131.
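The weighted comparison with the margin D can be sketched as follows. The function name and the numeric values are hypothetical; only the rule itself (compare the lossless estimate minus D with the lossy estimate) comes from the description.

```python
def select_with_margin(
    lossless_amount: float, lossy_amount: float, margin_d: float = 0.0
) -> str:
    """Prefer lossless encoding unless it is more than margin_d larger than lossy."""
    return "lossless" if lossless_amount - margin_d <= lossy_amount else "lossy"


# Hypothetical estimates corresponding to data 130 (lossless) and 131 (lossy).
no_margin = select_with_margin(105.0, 100.0)                   # plain comparison
within_margin = select_with_margin(105.0, 100.0, margin_d=10.0)
beyond_margin = select_with_margin(120.0, 100.0, margin_d=10.0)
```

A multiplicative weight could be handled the same way by scaling one of the two estimates before the comparison.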
[0067]
Such correction processing can be performed more indirectly. For example, a function for performing such weighting may be added to the first code amount estimation unit 30 or the second code amount estimation unit 31. Furthermore, if parameters are adjusted inside the parameter input unit 20, an equivalent purpose can be realized without changing other configurations. This will be described below.
[0068]
Since the present embodiment is independent of the actual encoding part, the parameter data 120 sent from the parameter input unit 20 need not be the parameter used for actual encoding. Therefore, when priority is to be given to lossless encoding as described above, this parameter is adjusted so that the compression rate of lossy encoding becomes worse. The degree of adjustment may be calculated theoretically or, if that is not possible, obtained in advance by statistical processing or the like. The estimated code amount data 131 then becomes larger than the actual code amount, so the selection performed by the encoding selection unit 40 effectively gives priority to lossless encoding. Since the parameter input unit 20 can be implemented in the input interface portion, this adjustment can be implemented, even when this embodiment is realized in hardware, in the part that controls the hardware, such as a device driver or an application that invokes the hardware.
[0069]
In the above description, for convenience of explanation, the description has been made as if the lossy encoding and the lossless encoding to be selected are one each, but there may be two or more each. Since the extension of the present embodiment in such a case is clear from the above description, the description is omitted.
[0070]
As described above, according to the first embodiment, the code amount of lossy encoding is assumed to be constant regardless of the image, so the code amount needs to be predicted only for lossless encoding, and a simple encoding selection process can be performed.
[0071]
An encoding apparatus using the encoding selection apparatus of the first embodiment is configured as shown in FIG. 4. In this figure, the first encoding unit 61 and the second encoding unit 62 of the encoding unit 60 are selectively used based on the selection result from the selection result output unit 50 of the encoding selection apparatus (FIG. 1). The first encoding unit 61 corresponds to the first code amount estimation unit 30, and the second encoding unit 62 corresponds to the second code amount estimation unit 31. Of course, the first encoding unit 61 and the second encoding unit 62 may correspond to the whole of the encoding unit 60 or to only some of its stages.
[0072]
[Example 2]
As a second embodiment of the present invention, an example will be described in which the present invention is applied to the selection between JPEG, which is a lossy encoding, and the predictive encoding disclosed in JP 09-224253 A, which is a lossless encoding.
[0073]
Hereinafter, the second embodiment will be described in detail. FIG. 5 is a block diagram illustrating an encoding selection apparatus according to the second embodiment. In the figure, parts similar to those in FIGS. 1 and 11 are denoted by the same reference numerals, and description thereof is omitted. In the figure, 3030 is a prediction unit, 3040 is a prediction match counting unit, 3050 is an interpolation unit, 3051 is a code amount holding unit, 3060 is a code amount holding unit, 3070 is an interpolation unit, 134 is prediction data, 135 is prediction match data, and 136 and 137 are estimated code amount data.
[0074]
Each part of FIG. 5 will be described. The prediction unit 3030 performs one or more predetermined prediction processes on the input data 110 and sends the result as prediction data 134 to the prediction match counting unit 3040. The prediction match counting unit 3040 counts the number of times that the prediction data 134 and the input data 110 match, and sends the result as prediction match data 135 to the interpolation unit 3050 and the code amount holding unit 3051. The code amount holding unit 3051 holds an estimated code amount for each prediction match count and, based on the prediction match data 135, sends the appropriate estimated code amount to the interpolation unit 3050 as estimated code amount data 136. Based on the prediction match data 135, the interpolation unit 3050 performs a predetermined interpolation process if necessary and sends the result as estimated code amount data 130 to the encoding selection unit 40. The code amount holding unit 3060 holds an estimated code amount for each parameter and, based on the parameter data 120, sends the appropriate estimated code amount to the interpolation unit 3070 as estimated code amount data 137. Based on the parameter data 120, the interpolation unit 3070 performs a predetermined interpolation process if necessary and sends the result as estimated code amount data 131 to the encoding selection unit 40.
[0075]
The detailed operation is clear from the description of the first embodiment and the like, and its description is therefore omitted.
[0076]
In the above configuration, the details of the first code amount estimation unit 30 will be described first. The prediction unit 3030 performs part or all of the predictions used in the predictive encoding disclosed in Japanese Patent Laid-Open No. 09-224253. When only some of the predictions are performed, which ones to adopt may be decided by examining in advance the relationship between the match rate of each prediction and the code amount and preferentially adopting the predictions with higher correlation. In addition, when there are a plurality of predictions, the number of matches may be counted for each prediction, or the number of times any one of the predictions matches may be counted. In making these choices, priority is given to the values that correlate more strongly with the code amount.
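Counting the number of times any one of several predictions matches can be sketched as below. The two predictors used here (the pixel to the left and the pixel above) are assumptions chosen for illustration; the actual predictors of Japanese Patent Laid-Open No. 09-224253 are not reproduced in this description, and the function name is hypothetical.

```python
from typing import List, Optional


def count_prediction_matches(image: List[List[int]]) -> int:
    """Count pixels matched by at least one of two assumed predictors."""
    matches = 0
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            # Assumed predictor 1: the pixel immediately to the left.
            left: Optional[int] = row[x - 1] if x > 0 else None
            # Assumed predictor 2: the pixel immediately above.
            above: Optional[int] = image[y - 1][x] if y > 0 else None
            if pixel == left or pixel == above:
                matches += 1
    return matches


# Hypothetical 3x3 image.
sample_image = [
    [1, 1, 2],
    [1, 3, 2],
    [4, 4, 4],
]
match_count = count_prediction_matches(sample_image)
```

Only comparisons are involved, with no transform or entropy coding, which is what keeps this estimate lightweight.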
[0077]
These correlations can be obtained in advance by statistical processing. Of course, if they can be derived theoretically, that may be done instead. For example, the technique disclosed in Japanese Patent Laid-Open No. 09-224253 encodes by selecting a matching prediction from among a plurality of predictions, so it can be determined theoretically that counting only the number of times any prediction matches is sufficient.
[0078]
In the present embodiment, the processing performed by the first code amount estimation unit 30 is a simplification of the encoding processing disclosed in Japanese Patent Laid-Open No. 09-224253. In other words, this is an example of estimating the code amount by simplified source coding, as mentioned in the description of the first embodiment. In general, such an estimate is less accurate than one obtained by actually performing source coding. However, if the correlation between the prediction match data 135 and the actual code amount is high, accuracy sufficient for the purpose of this embodiment can be ensured. FIG. 6 shows experimental results confirming this. The horizontal axis indicates the prediction match rate, and the vertical axis indicates the code amount obtained with the technique disclosed in Japanese Patent Application Laid-Open No. 09-224253. FIG. 6 shows that the correlation between the two is clearly high.
[0079]
To simplify this process further, the image may be sampled before processing. For example, a reasonable correlation with the code amount can be obtained simply by sampling N lines from the image and calculating the prediction match rate on them. The value of N depends on the required accuracy, but may be, for example, about 1/10 of all lines. If higher speed is required and the resolution of the input image is high, a relatively high correlation can be maintained even at about 1/1000. It is also desirable that the samples be scattered across the whole image as widely as possible in order to avoid the effects of image locality.
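One way to pick N sample lines spread evenly over the image is sketched below. The even-spacing rule (the centre of each of N equal bands) and the function name are assumptions made for this example; the description only asks that the samples be scattered throughout the image.

```python
from typing import List


def sample_line_indices(height: int, n_lines: int) -> List[int]:
    """Return up to n_lines row indices, one from the centre of each equal band."""
    if height <= 0 or n_lines <= 0:
        return []
    n = min(n_lines, height)
    return [(2 * i + 1) * height // (2 * n) for i in range(n)]


# About 1/10 of the lines of a hypothetical 100-line image.
tenth = sample_line_indices(100, 10)
# Asking for more lines than exist simply returns every line.
small = sample_line_indices(3, 5)
```

The prediction match rate would then be computed only on the returned rows.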
[0080]
In the above description, the technique disclosed in Japanese Patent Application Laid-Open No. 09-224253 was taken up for convenience, but the present embodiment can easily be applied to other lossless encodings. For example, for differential encoding, the code amount can be estimated from the correlation between the pixel value being processed and a previously processed pixel value. Similarly, the code amount can be estimated from measurements of the appearance probability of each Markov model in Markov model coding, from measurements of conditional probability in block sorting coding, and from the correlation with surrounding pixels in LZ coding. Since these details deviate from the essence of the present embodiment, they are omitted.
[0081]
In the present embodiment, simplified processing in the first code amount estimation unit 30 has been described. However, if a heavier processing load is acceptable, the source coding described above may be used as it is. Since the configuration in this case can be easily inferred from the above description, its description is omitted.
[0082]
The details of the interpolation unit 3050 and the code amount holding unit 3051 are not described above, but they can be easily inferred from the details of the interpolation unit 3070 and the code amount holding unit 3060 described below, and their description is therefore omitted.
[0083]
Next, the second code amount estimation unit 31 will be described. The code amount holding unit 3060 selects an estimated code amount that is close to the input parameter data 120. FIG. 3 corresponds to a table of estimated code amounts held by the code amount holding unit 3060 in the present embodiment, but for the sake of simplicity, the case where there is one coding parameter will be described in detail in the present embodiment. In the case of JPEG assumed in this embodiment, a parameter called a scaling factor can be made to correspond to this. FIG. 7 is a conceptual diagram of such a table of estimated code amounts. It is assumed that they are arranged from the left in ascending order of scaling factors. For example, when the input parameter data 120 is larger than Sn and smaller than Sn + 1, Sn, Sn + 1, Cn, and Cn + 1 are transmitted to the interpolation unit 3070 as estimated code amount data 137. FIG. 8 shows a format example of the estimated code amount data 137 at this time.
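The lookup that returns Sn, Sn+1, Cn, and Cn+1 for an input scaling factor can be sketched with a binary search over the sorted table. The table values are invented, the function name is hypothetical, and the handling of inputs outside the table (falling back to the first or last interval) is an assumption not stated in the description.

```python
from bisect import bisect_right
from typing import List, Tuple


def bracket_entries(
    scaling_factors: List[float], code_amounts: List[float], s: float
) -> Tuple[float, float, float, float]:
    """Return (Sn, Sn+1, Cn, Cn+1) for the table interval that contains s."""
    if len(scaling_factors) < 2 or len(scaling_factors) != len(code_amounts):
        raise ValueError("table needs at least two entries of matching length")
    i = bisect_right(scaling_factors, s) - 1
    # Assumed policy: clamp out-of-range inputs to the nearest end interval.
    i = max(0, min(i, len(scaling_factors) - 2))
    return (scaling_factors[i], scaling_factors[i + 1],
            code_amounts[i], code_amounts[i + 1])


# Hypothetical table in ascending order of scaling factor (uneven spacing is allowed).
S = [10.0, 20.0, 40.0, 80.0]
C = [900.0, 600.0, 350.0, 200.0]
inside = bracket_entries(S, C, 25.0)
below = bracket_entries(S, C, 5.0)
```

The returned four values correspond to the content of the estimated code amount data 137 passed to the interpolation unit 3070.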
[0084]
The scaling factor interval in FIG. 7 is determined in consideration of the desired estimation accuracy of the code amount and the size of the table that can be held. The interval between scaling factors need not be constant, and it is generally preferable to increase the number of samples in portions where the relationship between the scaling factor and the estimated code amount is strongly nonlinear. FIG. 9 is an example of such a table. It is an example for the code amount holding unit 3051 and shows the relationship between the prediction match rate and the code amount illustrated in FIG. 6.
[0085]
As noted above, the above description also applies to the code amount holding unit 3051, but care is needed because the input data is different. That is, the input to the code amount holding unit 3060 is an encoding parameter, whereas the input to the code amount holding unit 3051 is the number of prediction matches. Since this value varies with the size of the image, correction is required. For example, if the scaling factor in FIG. 7 is replaced with the prediction match rate, and the number of prediction matches is normalized into a prediction match rate by dividing it by the image size inside the code amount holding unit 3051, the table can be referenced regardless of the image size. Of course, such normalization may instead be performed on the prediction match counting unit 3040 side.
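The normalization described here amounts to a single division, sketched below. Taking the image size to be the pixel count (width times height) is an assumption of this example, as is the function name.

```python
def prediction_match_rate(match_count: int, width: int, height: int) -> float:
    """Normalize a prediction match count by the number of pixels in the image."""
    pixels = width * height
    if pixels <= 0:
        raise ValueError("image must contain at least one pixel")
    return match_count / pixels


# Two hypothetical images of different sizes with the same proportion of matches.
small_rate = prediction_match_rate(5_000, 100, 100)
large_rate = prediction_match_rate(20_000, 200, 200)
```

Both images map to the same table key, so one table serves all image sizes.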
[0086]
Next, interpolation in the interpolation unit 3070 will be described. The interpolation performed here may be simple interpolation such as linear interpolation as long as the estimated code amount held by the code amount holding unit 3060 is taken in a sufficiently fine unit with respect to the encoding parameter. In this case, the estimated code amount C is obtained for the parameter S as follows.
[0087]
[Expression 1]
$$C = C_n + \frac{(C_{n+1} - C_n)\,(S - S_n)}{S_{n+1} - S_n}, \qquad S_n \le S \le S_{n+1}$$
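Linear interpolation between the two bracketing table entries (Sn, Cn) and (Sn+1, Cn+1) can be written as a short function. The function name and the sample numbers are hypothetical; the arithmetic is ordinary linear interpolation.

```python
def interpolate_code_amount(
    s: float, s_n: float, s_n1: float, c_n: float, c_n1: float
) -> float:
    """Linearly interpolate the estimated code amount C for parameter s."""
    if s_n1 == s_n:
        raise ValueError("bracketing parameters must differ")
    return c_n + (c_n1 - c_n) * (s - s_n) / (s_n1 - s_n)


# Hypothetical table entries (20, 600) and (40, 350), queried at s = 25.
estimate = interpolate_code_amount(25.0, 20.0, 40.0, 600.0, 350.0)
```

At the endpoints the function returns the stored table values unchanged, so no special case is needed when the input parameter coincides with a table entry.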

[0088]
When the interval between encoding parameters in the estimated code amount table is wide and the relationship between the scaling factor and the estimated code amount is nonlinear, a more elaborate higher-order interpolation formula is preferable. Many known techniques exist for this purpose, and their description is omitted because it deviates from the essence of the present embodiment.
[0089]
In order to confirm the effect of the present embodiment, a simulation of the present embodiment was performed on a computer. FIG. 10 shows the experimental results. Further, the processing time of this example at this time was about 1/40 compared with JPEG, and about 1/10 compared with the predictive coding disclosed in Japanese Patent Application Laid-Open No. 09-224253. From this result, the effect of the present embodiment is clear.
[0090]
As described above, according to the second embodiment, selection of lossy encoding and lossless encoding can be realized with a light processing load.
[0091]
[Effects of the invention]
As is clear from the above description, according to the present invention, an encoding selection apparatus that selects, from among a plurality of lossy and lossless encodings, the one that is optimal in terms of code amount can perform the encoding selection process with sufficient accuracy and a light processing load.
[Brief description of the drawings]
FIG. 1 is a configuration diagram illustrating a first embodiment of a coding selection apparatus according to the present invention.
FIG. 2 is a flowchart showing an example of operation in the first embodiment of the coding selection apparatus of the present invention.
FIG. 3 is a conceptual diagram of a table of estimated code amounts held in a code amount estimation process according to the first embodiment of the coding selection apparatus of the present invention.
FIG. 4 is a configuration diagram illustrating an encoding apparatus employing the encoding selection apparatus according to the first embodiment.
FIG. 5 is a block diagram showing Embodiment 2 of the coding selection apparatus of the present invention.
FIG. 6 is an explanatory diagram illustrating an example of a relationship between a prediction matching rate and a code amount of lossless encoding in the second embodiment of the encoding selection device of the present invention.
FIG. 7 is a conceptual diagram of a table of estimated code amounts held in a code amount estimation process according to the second embodiment of the coding selection apparatus of the present invention.
FIG. 8 is a conceptual diagram of estimated code amount data 137 used in the code amount estimation process of the second embodiment of the encoding selection apparatus of the present invention.
FIG. 9 is an example of a table of estimated code amounts held in the code amount estimation process of the second embodiment of the coding selection apparatus of the present invention.
FIG. 10 is an explanatory diagram showing an example of an experimental result according to the second embodiment of the coding selection apparatus of the present invention.
FIG. 11 is a block diagram showing a conventional coding selection apparatus.
FIG. 12 is a flowchart showing an example of the operation of a conventional coding selection apparatus.
FIG. 13 is an explanatory diagram of experimental results for explaining a difference in properties between lossy encoding and lossless encoding.
[Explanation of symbols]
10 Data input unit
20 Parameter input unit
30 First code amount estimation unit
31 Second code amount estimation unit
40 Encoding selection unit
50 Selection result output unit
60 Encoding unit
61 First encoding unit
62 Second encoding unit
110 Input data
120 Parameter data
130 Estimated code amount data
131 Estimated code amount data
132 Block data
133 DCT data
134 Prediction data
135 Prediction match data
136 Estimated code amount data
137 Estimated code amount data
140 Selection result data
1010 A/D converter
1020 Frame memory
3010 Raster block conversion unit
3020 DCT conversion unit
3030 Prediction unit
3040 Prediction match counting unit
3050 Interpolation unit
3051 Code amount holding unit
3060 Code amount holding unit
3070 Interpolation unit

Claims

An encoding selection apparatus for an apparatus that performs encoding by switching between lossless encoding means, which source-codes input data using the prediction results of prediction means and entropy-codes the source coding result, and lossy encoding means, which source-codes input data using frequency transformation and quantization and entropy-codes the source coding result, the encoding selection apparatus comprising:
data input means for inputting input data;
first code amount estimation means for estimating the code amount of encoding by the lossless encoding means, based on the result of executing, on the input data from the data input means, the source coding process of the lossless encoding means or a simplified version of that source coding process;
second code amount estimation means for estimating the code amount of encoding by the lossy encoding means, by a function of the size of the data input by the data input means, or by a function of the size of the data and the encoding parameter for the quantization of the lossy encoding means;
encoding selection means for determining an encoding method based on a comparison of the code amounts estimated by the first code amount estimation means and the second code amount estimation means; and
selection result output means for outputting the result of the encoding selection means to the outside.

The encoding selection apparatus according to claim 1, wherein the second code amount estimation means estimates a code amount based on a function of the size of input data, and the function is a linear function that monotonously increases.

Said code amount estimation processing is performed in the first code amount estimating means, the results summarized statistically the relationship between the results and the code amount of partial processing or a simplified process of source coding in tables or expressions of the form The encoding selection apparatus according to claim 1 or 2, wherein the estimated code amount is calculated by referring to and adding an interpolation to the reference if necessary.

Code amount estimation processing performed by the second code amount estimating means refers to the results summarized statistically the relationship between the coding parameter and the code amount input, and if necessary, adding interpolated to The encoding selection apparatus according to claim 1, wherein an estimated code amount is calculated by using the encoding selection apparatus.

The encoding according to any one of claims 1 to 4, wherein one of the encodings is preferentially selected by performing a predetermined correction including four arithmetic operations on the encoding parameter. Selection device.

The encoding selection apparatus according to claim 1, wherein one of the encodings is preferentially selected by applying a predetermined correction involving the four arithmetic operations to at least one of the estimated code amounts calculated by the first code amount estimation means or the second code amount estimation means.

The encoding selection apparatus according to any one of claims 1 to 6, wherein one of the encodings is preferentially selected by applying a predetermined correction involving the four arithmetic operations to at least one of the estimated code amounts input to the encoding selection means.
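One way to read the correction in these claims is as a scale and offset applied to an estimate before the comparison, so that one encoding wins more often. The sketch below applies the correction to the lossless estimate only; the parameter names `lossless_scale` and `lossless_offset` and their values are assumptions chosen for illustration.

```python
def select_with_bias(lossless_estimate: float,
                     lossy_estimate: float,
                     lossless_scale: float = 1.0,
                     lossless_offset: float = 0.0) -> str:
    """Compare estimates after correcting the lossless one arithmetically.

    Assumption: a scale below 1.0 (or a negative offset) favours lossless,
    a scale above 1.0 (or a positive offset) favours lossy.
    """
    corrected = lossless_estimate * lossless_scale + lossless_offset
    return "lossless" if corrected <= lossy_estimate else "lossy"


print(select_with_bias(1600.0, 1500.0))                      # no correction
print(select_with_bias(1600.0, 1500.0, lossless_scale=0.9))  # lossless favoured
```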

The encoding selection apparatus according to any one of claims 1 to 7, wherein the first code amount estimation means comprises:
prediction means for predicting the data input by the data input means;
prediction match counting means for counting the number of times the prediction matches the input data;
code amount holding means for holding the relationship between the number of prediction matches and the code amount; and
interpolation means for interpolating the estimated code amount given by the code amount holding means according to the number of prediction matches.
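The four means in this claim can be sketched together as follows. The previous-sample predictor, the table `MATCH_TABLE`, and every numeric value in it are hypothetical; the claim fixes neither the predictor nor the table contents, and linear interpolation is only one possible choice.

```python
from bisect import bisect_left

# Hypothetical (match count, code amount) pairs, sorted by match count.
MATCH_TABLE = [(0, 800.0), (32, 500.0), (64, 100.0)]


def count_prediction_matches(samples):
    """Count samples that equal the previous sample (assumed predictor)."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if prev == cur)


def interpolate_code_amount(table, matches):
    """Linearly interpolate the code amount for a given match count."""
    xs = [x for x, _ in table]
    if matches <= xs[0]:
        return table[0][1]
    if matches >= xs[-1]:
        return table[-1][1]
    i = bisect_left(xs, matches)
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    return y0 + (y1 - y0) * (matches - x0) / (x1 - x0)


samples = [7] * 17  # 16 consecutive matches under the assumed predictor
matches = count_prediction_matches(samples)
print(matches, interpolate_code_amount(MATCH_TABLE, matches))
```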

The encoding selection apparatus according to claim 4, wherein the second code amount estimation means comprises:
code amount holding means for holding the relationship between the encoding parameter and the code amount; and
interpolation means for interpolating the estimated code amount given by the code amount holding means according to the input encoding parameter.

The encoding selection apparatus according to claim 8, wherein the prediction means performs prediction using at least two prediction methods, and the prediction match counting means counts a prediction match when at least one of the prediction methods matches.
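Counting a match when any one of several predictors is correct can be sketched as below. The choice of a left-neighbour and an above-neighbour predictor on a two-dimensional image is an assumption made for illustration; the claim only requires at least two prediction methods.

```python
def count_matches_any(image):
    """Count pixels matched by the left predictor or the above predictor.

    Assumption: `image` is a list of equal-length rows of integer pixels.
    A pixel with no left or above neighbour cannot match that predictor.
    """
    matches = 0
    for y, row in enumerate(image):
        for x, pixel in enumerate(row):
            left = row[x - 1] if x > 0 else None
            above = image[y - 1][x] if y > 0 else None
            if pixel == left or pixel == above:
                matches += 1  # counted once even if both predictors match
    return matches


image = [
    [1, 1, 2],
    [1, 3, 2],
]
print(count_matches_any(image))
```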

The encoding selection apparatus according to any one of claims 1 to 10, wherein the data input means selects a part of the input data and sends the selected part to the first code amount estimation means.

The encoding selection apparatus according to claim 9, wherein the code amount holding means in the second code amount estimation means holds the result of statistically calculating in advance, using the corresponding encoding, the relationship of the input data and the encoding parameter to the code amount.

The encoding selection apparatus according to claim 9, wherein the data held in the code amount holding means in the second code amount estimation means is held at finer intervals in portions where the relationship of the input data and the encoding parameter to the code amount is nonlinear.

The encoding selection apparatus according to claim 9, wherein the code amount holding means in the second code amount estimation means selects, with respect to the input data, the encoding parameter, or both, either the nearest held data or the nearest held data on each of the lower side and the upper side, and sends the selected data to the interpolation means.

The encoding selection apparatus according to claim 8, wherein the code amount holding means in the first code amount estimation means holds the result of statistically calculating in advance, using the corresponding encoding, the relationship between the number of prediction matches and the code amount.

The encoding selection apparatus described, wherein the data held in the code amount holding means in the first code amount estimation means is held at finer intervals in portions where the relationship between the number of prediction matches and the code amount is nonlinear.

The encoding selection apparatus according to claim 15 or 16, wherein the code amount holding means in the first code amount estimation means selects, with respect to the number of prediction matches, either the nearest held data or the nearest held data on each of the lower side and the upper side, and sends the selected data to the interpolation means.
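The two selection modes in this claim, a single nearest entry or one neighbouring entry on each side of the query, might look like the sketch below. The table contents and the function names are hypothetical, and returning `None` for a missing side is an assumption about how the table edges would be handled.

```python
# Hypothetical (match count, code amount) pairs held by the holding means.
HELD_DATA = [(0, 800.0), (32, 500.0), (64, 100.0)]


def nearest_entry(table, query):
    """Return the single entry whose key is closest to the query."""
    return min(table, key=lambda entry: abs(entry[0] - query))


def bracketing_entries(table, query):
    """Return the closest entry at or below and at or above the query.

    Assumption: a side with no entry is reported as None.
    """
    lower = max((e for e in table if e[0] <= query),
                key=lambda e: e[0], default=None)
    upper = min((e for e in table if e[0] >= query),
                key=lambda e: e[0], default=None)
    return lower, upper


print(nearest_entry(HELD_DATA, 40))
print(bracketing_entries(HELD_DATA, 40))
```

The bracketing pair is what an interpolation step would need; the single nearest entry suits a lookup without interpolation.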

An encoding device that encodes by switching between lossless encoding means, which source-codes input data using the prediction result of prediction means and entropy-encodes the source-coded result, and lossy encoding means, which source-codes input data using frequency transformation and quantization and entropy-encodes the source-coded result, the encoding device comprising:
data input means for inputting input data;
first code amount estimation means for performing partial or simplified processing of the source coding of the lossless encoding means on the data input by the data input means, and for estimating the code amount of encoding by the lossless encoding means based on the result of that processing;
second code amount estimation means for estimating the code amount of encoding by the lossy encoding means, either as a function of the size of the data input by the data input means or as a function of that data size and the encoding parameter for the portion of the lossy encoding means that performs quantization;
encoding selection means for determining an encoding method based on a comparison of the code amounts estimated by the first code amount estimation means and the second code amount estimation means;
selection result output means for outputting the result of the encoding selection means; and
means for encoding the input data using the encoding method determined by the encoding selection means.