JP2005150845A

JP2005150845A - Image data coding apparatus and method thereof, computer program and computer-readable storage medium

Info

Publication number: JP2005150845A
Application number: JP2003381641A
Authority: JP
Inventors: Masaki Suzuki; 正樹鈴木
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2003-11-11
Filing date: 2003-11-11
Publication date: 2005-06-09

Abstract

PROBLEM TO BE SOLVED: To obtain the rate/distortion gradient efficiently and with simple operations in image coding, and to produce compression-coded image data that meets a target code amount. SOLUTION: Image data input from an image data input unit 200 is wavelet-transformed and quantized by well-known procedures. Code data for every pass of every code block is generated and stored in a code string storage unit 206, and information relating the increase in code amount to the distortion at each pass position of each code block is stored in a code block information storage unit 207. A bit search processing unit 209 determines the target rate/distortion gradient information one bit at a time, from the most significant bit to the least significant bit, on the basis of the information stored in unit 207. A code string forming unit 205 reads the code data from unit 206 up to the pass position corresponding to the rate/distortion gradient information obtained by unit 209, and generates the final coded data. COPYRIGHT: (C)2005, JPO&NCIPI

Description

The present invention relates to an image encoding technique for encoding still images or the frame data of moving images.

One technique for improving the efficiency of image coding is rate/distortion optimization. For each of the sections that make up the encoded data, the method obtains the generated code amount and an index value of image distortion, and then minimizes the total distortion index value under the condition that the total code amount does not exceed a target value.

JPEG2000 (ISO/IEC 15444), the international standard for still image coding established through standardization work in ISO/IEC JTC1/SC29/WG1, divides the coefficients of each subband obtained by the wavelet transform into rectangular regions called code blocks and encodes each code block independently. Each code block is encoded in several passes, and the standard anticipates applying rate/distortion optimization by obtaining the generated code amount and the image distortion index value pass by pass. One way of applying rate/distortion optimization when implementing the JPEG2000 standard is described in Non-Patent Document 1.

Let ni be the code truncation point of code block Bi, Ri(ni) the code amount of Bi when its code is truncated at ni, and Di(ni) the corresponding distortion index value. The distortion index value D of the entire image and the total code amount R are then
D = ΣDi(ni)
R = ΣRi(ni)
where Σ denotes the sum over i = 1, 2, ..., up to the total number of code blocks. Various values that express the degree of image quality degradation, such as the mean square error or a visually weighted mean square error, can be used as the distortion index value.

The goal of rate/distortion optimization is to find the set of truncation points ni that minimizes D under the condition that the total code amount does not exceed the target Rmax, that is, R ≤ Rmax. This optimization problem can be solved by the generalized Lagrange multiplier method (Non-Patent Document 2) and reduces, for a given value of λ, to the problem of minimizing
Σ(Di(ni) + λ × Ri(ni))
The value of λ is adjusted so that the total code amount R does not exceed Rmax.

Minimizing this expression decomposes into a separate minimization problem for each code block. A simple algorithm for finding the code truncation point ni that minimizes Di(ni) + λ × Ri(ni) for code block Bi is described below.

FIG. 4 shows the flow of processing that determines the code truncation point ni for a code block Bi whose number of effective coding passes is ki_max. First, the code truncation point ni is initialized to 0 (step S401). Next, the variable k, which indicates the candidate truncation point under consideration, is set to 1 (step S402). Then, for the candidate point k, the code amount increase ΔRi(k) and the distortion index value decrease ΔDi(k) that result from moving the truncation point of Bi from ni to k are obtained (step S403). ΔDi(k)/ΔRi(k) is calculated as the distortion reduction per unit of code amount for the code lying between ni and k, and is compared with λ (step S404); if it is larger than λ, ni is updated to k (step S405). Next, k is incremented by 1 to move the point under consideration down by one (step S406), and the updated k is compared with the number of coding passes ki_max of this code block (step S407). If k ≤ ki_max, the processing is repeated from step S403 for the updated k. If k > ki_max, the processing ends, and the value of ni at that time is the code truncation point of Bi for the given λ.
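The flow of FIG. 4 can be sketched in Python as follows. This is a minimal illustration, not code from the patent: the function name `truncation_point` and the list-based inputs (`rates[k]` and `dists[k]` holding the cumulative Ri(k) and Di(k), with index 0 meaning that nothing is coded) are assumptions made for the example. The comparison is written as ΔD > λ × ΔR, which is equivalent to ΔD/ΔR > λ whenever ΔR is positive and avoids a division by zero for a pass that adds no code.

```python
def truncation_point(rates, dists, lam):
    """Return the truncation point ni of one code block for a given lambda.

    rates[k], dists[k]: cumulative code amount Ri(k) and distortion index
    Di(k) when the block is truncated after pass k (k = 0: nothing coded).
    """
    n = 0                                    # S401: ni = 0
    for k in range(1, len(rates)):           # S402, S406, S407: k = 1..ki_max
        d_rate = rates[k] - rates[n]         # S403: code amount increase
        d_dist = dists[n] - dists[k]         # S403: distortion decrease
        if d_dist > lam * d_rate:            # S404: slope larger than lambda?
            n = k                            # S405: move the truncation point
    return n
```

With the hypothetical data `rates = [0, 10, 20, 30, 40]` and `dists = [100, 60, 50, 20, 15]`, a large λ keeps few passes and a small λ keeps many, in line with the role of λ as a quality parameter.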

Since this algorithm is run for many values of λ, it is more efficient to determine the candidate truncation points of each code block in advance.

Because the code of a code block is truncated in units of coding passes, truncation is in principle possible at every coding pass boundary. When the algorithm above for determining ni is applied, however, the candidate truncation points are chosen so that the rate/distortion gradient between candidate points, Si(k) = ΔDi(k)/ΔRi(k), decreases monotonically with k, and coding pass boundaries that do not satisfy this condition are excluded from the candidates.

For example, consider a code block encoded in four coding passes, as in FIG. 10. The code can be truncated at any of the pass boundaries (truncatable points 0 to 4 in the figure). At truncatable point 2, however, the rate/distortion gradient is not monotonically decreasing, so truncating the code there is inefficient, and this point is prevented from being selected as a truncation point by the algorithm above.

Note that a decreasing rate/distortion gradient here means that the gradient becomes closer to horizontal. In other words, the arithmetic sign of the gradient is ignored.

An algorithm for determining the candidate truncation points from the boundaries of all coding passes is described below.

FIG. 5 shows the flow of processing for selecting candidate truncation points. Here, the set of candidate truncation points is denoted Ni. First, Ni is initialized to the set of all coding pass boundaries of the code block of interest (step S501). That is, if the number of coding passes of code block Bi is ki_max, Ni = {1, 2, 3, ..., ki_max}.

Next, the truncation point p that is subject to candidate determination is set to 0 (step S502), and k, the truncation point following it, is set to 1 (step S503). Whether k belongs to the set Ni is then determined (step S504). If k belongs to Ni, the code amount increase ΔRi(k) and the distortion index value decrease ΔDi(k) that result from moving the truncation point from p to k are obtained, and the rate/distortion gradient Si(k) of this interval is calculated (step S505). If k does not belong to Ni, the processing moves to step S508, described later. When p ≠ 0, Si(k) is compared with Si(p) (step S506); if Si(k) > Si(p), p is removed from Ni (step S510) and the processing returns to step S502. Otherwise, p is set to k (step S507), k is incremented by 1 (step S508), and k is compared with ki_max. If k ≤ ki_max, the processing is repeated from step S504 for the updated k. If k > ki_max, the processing ends, and the numbers remaining in Ni at that point form the set of candidate truncation points (passes).

For a code block such as the one in FIG. 10, this processing finally leaves the candidate set Ni = {1, 3, 4} for the code block Bi of interest, and for these candidate points the rate/distortion gradient is guaranteed to decrease monotonically with k, as shown in FIG. 11.
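The candidate selection of FIG. 5 can be sketched as below. The function name `candidate_points` and the input layout (cumulative `rates` and `dists` lists with index 0 meaning that nothing is coded) are assumptions for illustration, and the sketch further assumes that every pass adds a positive amount of code so that the gradient is well defined. The sample data is invented to mimic the situation described for FIG. 10; it is not taken from the figure.

```python
def candidate_points(rates, dists):
    """Return (sorted candidate list Ni, gradients Si(k) for those candidates)."""
    k_max = len(rates) - 1
    cand = set(range(1, k_max + 1))              # S501: all pass boundaries
    while True:
        p, slope = 0, {}                         # S502: restart from p = 0
        removed = False
        for k in range(1, k_max + 1):            # S503, S508: k = 1..ki_max
            if k not in cand:                    # S504: skip excluded points
                continue
            # S505: gradient of the interval from p to k
            slope[k] = (dists[p] - dists[k]) / (rates[k] - rates[p])
            if p != 0 and slope[k] > slope[p]:   # S506: not monotone at p
                cand.discard(p)                  # S510: drop p, then restart
                removed = True
                break
            p = k                                # S507
        if not removed:
            return sorted(cand), slope


# Hypothetical four-pass block in which point 2 breaks monotonicity.
points, slopes = candidate_points([0, 10, 20, 30, 40], [100, 60, 50, 20, 15])
```

For this data the remaining candidates are 1, 3, and 4, with gradients 4.0, 2.0, and 0.5, which decrease with k as required.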

The values of Si(k) and Ri(k) are held for every k belonging to the candidate set Ni obtained above, and the largest k that satisfies Si(k) > λ is selected. When λ is small, the truncation point moves down and less code is discarded; conversely, when λ is large, the truncation point moves up and more code is discarded. The multiplier λ can therefore be regarded as an image quality parameter. Rate/distortion optimization is achieved by varying λ from a large value toward smaller values, searching for the λ at which the total code amount satisfies R = Rmax or R ≈ Rmax, and determining the truncation point of each code block from that λ.

One form of applying rate/distortion optimization to JPEG2000 is described below. Because the specific JPEG2000 encoding method is described in detail in the Recommendation, only the rough flow of processing is explained here, for a simple example.

To simplify the explanation, the image to be encoded is a 512 × 512 pixel monochrome image in which each pixel is expressed with 8 bits (256 gradations, 0 to 255). The horizontal pixel position (coordinate) is denoted x, the vertical pixel position y, and the pixel value at position (x, y) is written P(x, y). The JPEG2000 encoding conditions are: no tile division (512 × 512 pixels = 1 tile), the discrete wavelet transform applied twice, the 9-7 irreversible filter, a code block size of 64 × 64, and a code string formed with a single layer. Various other settings, such as entropy coding options, are also required but are not discussed here.

FIG. 2 shows the flow of JPEG2000 encoding processing. In the figure, 200 is an image data input unit, 201 a discrete wavelet transform unit, 202 a coefficient quantization unit, 203 a code block division unit, 204 a code block encoding unit, 205 a code string forming unit, 206 a code string storage unit, 207 a code block information storage unit, and 208 a code output unit.

First, the pixel values P(x, y) that make up the image data to be encoded are input in order from the image data input unit 200. The image data input unit 200 applies a DC level shift to the input data, which ranges from 0 to 255, by subtracting the midpoint value 128 from each pixel value P(x, y). The resulting data P'(x, y), ranging from -128 to 127, is output to the discrete wavelet transform unit 201. The discrete wavelet transform unit 201 stores the level-shifted data P'(x, y) in an internal buffer (not shown) as needed and performs a two-dimensional discrete wavelet transform, which consists of applying a one-dimensional discrete wavelet transform in the horizontal and vertical directions. As stated above, the discrete wavelet transform unit 201 uses the 9-7 irreversible filter for the one-dimensional transform. FIG. 3 illustrates the subbands of the image to be encoded as produced by the two-dimensional discrete wavelet transform.
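As a small illustration of the DC level shift only (the wavelet transform itself is not sketched here), the following assumes the image is held as a list of rows of integers. The function name `dc_level_shift` and the `bit_depth` parameter are hypothetical.

```python
def dc_level_shift(pixels, bit_depth=8):
    """Subtract the midpoint value (128 for 8-bit data) from every pixel."""
    offset = 1 << (bit_depth - 1)
    return [[p - offset for p in row] for row in pixels]


shifted = dc_level_shift([[0, 128, 255]])   # one row of sample pixel values
```

The 8-bit extremes 0 and 255 map to -128 and 127, matching the range stated above.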

Specifically, a one-dimensional discrete wavelet transform is first applied in the vertical direction to the image to be encoded shown in FIG. 3(a), decomposing it into a low-frequency subband L and a high-frequency subband H as shown in FIG. 3(b). A horizontal one-dimensional discrete wavelet transform is then applied to each of these subbands, decomposing them into the four subbands LL, HL, LH, and HH shown in FIG. 3(c). This completes the first wavelet transform.

The discrete wavelet transform unit 201 then applies the two-dimensional discrete wavelet transform once more (the second wavelet transform) to the subband LL obtained above. The image to be encoded is thereby decomposed into seven subbands: LL, HL1, LH1, HH1, HL2, LH2, and HH2.

FIG. 6 shows the seven subbands obtained by the second two-dimensional discrete wavelet transform. On the decoding side, decoding the coefficients of the LL subband reproduces an image 1/4 the size of the original in both the horizontal and vertical directions (1/4 the number of pixels in each direction). Additionally decoding the coefficients of HL1, LH1, and HH1 reproduces an image 1/2 the size in each direction, and decoding up to HL2, LH2, and HH2 reproduces an image the same size as the original. In other words, the image resolution increases in this order: the image generated from the LL subband has resolution level 0, the image generated from the resolution level 0 image and the LH1, HL1, and HH1 subbands has resolution level 1, and the image generated from the resolution level 1 image and the LH2, HL2, and HH2 subbands has resolution level 2.

The coefficients in each subband are written C(Sb, x, y). Here Sb denotes the subband and is one of LL, LH1, HL1, HH1, LH2, HL2, and HH2, and (x, y) denotes the horizontal and vertical coefficient position (coordinates), with the coefficient at the upper left corner of each subband at (0, 0).

The coefficient quantization unit 202 quantizes the coefficients C(S, x, y) of each subband generated by the discrete wavelet transform unit 201, using a quantization step delta(S) defined for each subband. Writing the quantized coefficient value as Q(S, x, y), the quantization performed by the coefficient quantization unit 202 is expressed by
Q(S, x, y) = sign{C(S, x, y)} × floor{|C(S, x, y)| / delta(S)}
Here, sign{I} is a function that gives the sign of the integer I, returning 1 when I is positive and -1 when I is negative, and floor{R} is the largest integer that does not exceed the real number R.
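The quantization formula can be checked with a few numbers. The sketch below is illustrative only; the function name `quantize` is hypothetical, and a coefficient of exactly zero is given sign +1, a case the text does not specify (the result is 0 either way).

```python
import math


def quantize(c, delta):
    """Q = sign(C) * floor(|C| / delta): the magnitude is floored, then signed."""
    sign = -1 if c < 0 else 1
    return sign * math.floor(abs(c) / delta)
```

Because the floor is applied to the magnitude, negative coefficients round toward zero: with a step of 2.0, both 7.9 and -7.9 quantize to magnitude 3, whereas flooring the signed ratio -3.95 directly would give -4.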

The code block division unit 203 stores the quantized subband coefficients Q(S, x, y) produced by the coefficient quantization unit 202 in an internal buffer (not shown) as needed, and divides them into rectangles of a predetermined size called code blocks. The division is performed relative to the upper left corner of each subband, for example into blocks of 64 × 64.

Accordingly, with two wavelet transforms, each of the subbands LL, HL1, LH1, and HH1 is 128 × 128 in size and contains 2 × 2 = 4 code blocks, and each of the subbands HL2, LH2, and HH2 contains 4 × 4 = 16 code blocks. Each code block is assigned a unique identification number i (= 0 to 63) in order, and code blocks are referred to as Bi, as in B0, B1, B2, ..., B63. The identification numbers are assigned in order of the resolution levels described above; within a resolution level they follow the order of the HL, LH, and HH subbands, and within a subband they follow raster scan order.
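The block counts above can be reproduced with a short calculation. The function name `code_block_counts` and its parameters are assumptions for illustration, and subband sizes are obtained by simple halving, which is exact for the 512 × 512 example but simplifies the general case.

```python
def code_block_counts(width=512, height=512, levels=2, cb_size=64):
    """Number of code blocks per subband, listed in the numbering order."""
    def blocks(w, h):
        # Ceiling division so that partial blocks at the edges are counted.
        return -(-w // cb_size) * -(-h // cb_size)

    counts = {"LL": blocks(width >> levels, height >> levels)}
    for r in range(1, levels + 1):               # resolution levels 1..levels
        shift = levels - r + 1
        for band in ("HL", "LH", "HH"):          # subband order within a level
            counts[f"{band}{r}"] = blocks(width >> shift, height >> shift)
    return counts


counts = code_block_counts()
total = sum(counts.values())
```

For the example conditions this gives 4 blocks for each of LL, HL1, LH1, and HH1, 16 for each of HL2, LH2, and HH2, and 64 blocks in total, consistent with the identification numbers 0 to 63.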

FIG. 7 shows the code block division and numbering performed by the code block division unit 203. In the figure, solid lines are subband boundaries and dotted lines are code block boundaries; each rectangle bounded by dotted or solid lines is a code block.

The code block encoding unit 204 expresses the absolute values of the quantized coefficient values Q(S, x, y) (hereinafter simply "coefficient values") in the code block Bi cut out by the code block division unit 203 as natural binary numbers, applies binary arithmetic coding bit plane by bit plane from the most significant digit to the least significant digit, and stores the encoded data of the code block in the code string storage unit 206.

Here, every bit plane except the most significant one is encoded in three separate passes. The most significant bit plane is not divided into multiple passes because the bits it contains dominate their coefficient values and are therefore of high importance.

FIG. 12 illustrates this. Although a code block is 64 × 64 in the description above, the figure shows a 4 × 5 block for simplicity. Because the most significant bit plane is, as noted above, the topmost bit plane of the code block, all of its bits are treated as a single encoding target. Each bit plane below the most significant one is divided into three groups as shown, and the bit positions encoded in each group are mutually exclusive (together the three groups make up one bit plane). If the pass at which encoding is truncated, as described so far, is the n-th pass, this means that the encoded data from the first pass through the n-th pass is used and the encoded data of all later passes is discarded.
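The pass structure described above (one pass for the most significant bit plane, three for each lower one) fixes the number of truncatable points per code block. The helper below is a hypothetical illustration of that count, where `bit_planes` is the number of bit planes that are actually coded.

```python
def num_passes(bit_planes):
    """One pass for the top bit plane plus three for each lower bit plane."""
    if bit_planes <= 0:
        return 0
    return 1 + 3 * (bit_planes - 1)
```

For instance, a code block with 8 coded bit planes has 22 passes, so ki_max = 22 and truncating at pass n keeps passes 1 to n.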

The division of each bit plane other than the most significant one into passes, and the specific encoding method for each pass, follow the Recommendation. Each time a pass is encoded, the code amount increase ΔRi(k) and the distortion index value decrease ΔDi(k) of that pass are obtained, and the table shown in FIG. 8 is built and stored in an internal buffer (not shown). The mean square error, or a weighted mean square error derived by weighting each subband, is used as the distortion index value. One method of deriving the decrease in distortion index value of each coefficient for each of the three types of pass is given in Annex J of the Recommendation, among other places. When all passes of the code block Bi of interest have been encoded and a table like that of FIG. 8 is complete, the candidate selection algorithm described earlier (FIG. 5) is executed, and the candidate set Ni, for which Si(k) decreases monotonically, is obtained from the set of all coding pass boundaries. As shown in FIG. 9, the number of elements NP of the candidate set Ni and, for each candidate point, the pass number k, the rate/distortion gradient Si(k), and the code amount Ri(k) are stored in the code block information storage unit 207.

When the code block encoding unit 204 has finished encoding all code blocks, the code string forming unit 205 refers to the candidate truncation point information of each code block stored in the code block information storage unit 207, searches for the λ at which the total code amount satisfies R = Rmax or R ≈ Rmax, gathers the code of the portions for which Si(k) > λ into the final code string, and outputs it.

FIG. 13 shows the flow of λ determination in the code string forming unit 205, which is described below. In this description, a variable S is introduced to represent the threshold; note that the value of S when the processing ends is λ.

First, the minimum value Smin and the maximum value Smax of Si(k) are obtained by referring to the candidate truncation point information of all code blocks stored in the code block information storage unit 207 (step S1401). Next, the threshold variable S is set to Smax (step S1402). A predetermined threshold change width ΔS is then subtracted from S, lowering its value slightly (step S1403). The variable i, which represents the code block number, is set to 0, and the accumulated code amount R is initialized to 0 (step S1404). The candidate truncation point information of code block Bi stored in the code block information storage unit 207 is then consulted to find the largest k that satisfies Si(k) > S, which becomes the truncation point ni of Bi (step S1405). Because Si(k) decreases monotonically in candidate point order within Bi, ni can be found by comparing Si(k) with S in candidate order. The code amount Ri(ni) of Bi at the truncation point ni found in step S1405 is added to the accumulated code amount R (step S1406), and i is incremented by 1 (step S1407). The variable i is then compared with 64 (step S1408): if i = 64, the processing moves to step S1409; otherwise it returns to step S1405 and the code amount of the next code block is added.

When i reaches 64, that is, when the accumulated code amount over all code blocks has been calculated for the threshold S, the accumulated code amount R is compared with the target code amount Rmax (step S1409). If R < Rmax, the processing moves to step S1410; otherwise it moves to step S1411. In step S1411, because the target code amount is exceeded at the current threshold S, ΔS is added to S to restore the previous threshold, and the processing ends. In step S1410, the threshold S is compared with the minimum value Smin obtained in step S1401. If S > Smin, the processing returns to step S1403, S is lowered slightly again, and the processing up to step S1409 is repeated; if S ≤ Smin, the processing ends. The value of S when the processing ends is selected as the threshold λ.
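The step-by-step threshold search of steps S1401 to S1411 can be sketched as follows. The function name `search_lambda` and the data layout (one list of `(Si(k), Ri(k))` pairs per code block, in candidate order) are assumptions for illustration. Rather than adding ΔS back, the sketch remembers the previous threshold, which has the same effect while avoiding floating-point round-off.

```python
def search_lambda(blocks, r_max, step):
    """Lower S from Smax in steps of `step` until the code amount reaches r_max.

    blocks: per code block, a list of (slope, cumulative rate) pairs for its
    candidate truncation points, with slopes strictly decreasing.
    """
    slopes = [s for blk in blocks for s, _ in blk]
    s_min, s_max = min(slopes), max(slopes)      # S1401
    s = s_max                                    # S1402
    while True:
        prev = s
        s -= step                                # S1403
        total = 0                                # S1404
        for blk in blocks:                       # S1405 to S1408
            rate = 0
            for slope, r in blk:                 # largest k with Si(k) > S
                if slope <= s:
                    break
                rate = r
            total += rate                        # S1406
        if total >= r_max:                       # S1409: not (R < Rmax)
            return prev                          # S1411: previous threshold
        if s <= s_min:                           # S1410
            return s


# Hypothetical two-block example.
blocks = [[(4.0, 10), (2.0, 30), (0.5, 40)], [(3.0, 20), (1.0, 50)]]
lam = search_lambda(blocks, r_max=65, step=0.5)
```

Every trial value of S requires a full scan over all code blocks, so the number of scans grows as ΔS shrinks; in this example the search settles on λ = 1.0, at which the total code amount is 50.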

For the threshold λ obtained by the processing above, the code of the portions of each code block for which Si(k) > λ is read from the code string storage unit 206, the information required by the JPEG2000 code string format (main header, tile header, packet headers, and so on) is added to form a JPEG2000 code string, and the result is output to the code output unit 208.

The code output unit 208 outputs the JPEG2000 encoded data generated by the code string forming unit 205 to the outside of the apparatus. The code output unit 208 is, for example, a storage medium such as a hard disk, a magneto-optical disk, or a memory, or an interface to a network.
Non-Patent Document 1: Annex J of the standard Recommendation (ISO/IEC 15444-1)
Non-Patent Document 2: "Generalized Lagrange Multiplier Method for Solving Problems of Optimum Allocation of Resources", Operations Research, vol. 11, pp. 399-417, 1963

However, finding the largest λ for which R ≈ Rmax by the method above requires repeatedly calculating the total code amount R for various values of λ and comparing it with the target code amount Rmax. The many accesses to the memory that holds the candidate point information Si(k) and Ri(k), the comparisons between Si(k) and λ, and the other operations needed to obtain R make the processing time long.

Furthermore, the smaller the threshold step ΔS, the closer R can be brought to Rmax, but the amount of computation grows in inverse proportion to the step. Moreover, increasing the amount of computation still does not guarantee that the R closest to Rmax is obtained.

The present invention has been made in view of these circumstances, and aims to provide a technique that determines, with simpler processing, the optimum rate-distortion gradient that yields the target code amount and generates compression-encoded image data.

To solve this problem, an image data encoding apparatus of the present invention has, for example, the following configuration. That is,
an image data encoding apparatus that frequency-transforms image data, divides the transformed data into a plurality of blocks of a predetermined size, generates encoded data for the bit information of each coefficient value in each block in pass order from upper to lower, and determines, on the basis of a target code amount Rmax and a rate-distortion gradient λ expressed by the increase in code data amount and the image distortion at each pass position, up to which pass position from upper to lower the encoded data of each block is used as the compression-encoded data, the apparatus comprising:
first storage means for storing, for each block of the frequency-transformed data, the code data at each pass stage;
second storage means for storing, for each pass stage of each block, the increase in code amount, the distortion amount, and information on the distortion amount relative to the increase in code amount;
setting means for setting provisional rate-distortion gradient information that has a significant bit length n and whose most significant bit i, where i = n − 1, is set to "1";
calculating means for calculating, by referring to the second storage means, the total code amount R of the code data whose rate-distortion gradient information is equal to or greater than the provisional rate-distortion gradient information set by the setting means;
comparing means for comparing the total code amount R calculated by the calculating means with the target code amount Rmax;
updating means for, according to the comparison result of the comparing means, when R > Rmax, decrementing i by 1 and updating bit i of the provisional rate-distortion gradient information to "1", and, when R < Rmax, correcting bit i to "0", decrementing i by 1, and updating bit i of the provisional rate-distortion gradient information to "1"; and
bit search means for repeating the processing of the calculating means and the determination means, so as to search for the bit information of the provisional rate-distortion gradient information, until R = Rmax in the comparing means, or until all bits of the provisional rate-distortion gradient information are determined, or until the difference between Rmax and the code amount gathered up to the provisional rate-distortion gradient information falls within a predetermined value,
wherein the final provisional rate-distortion gradient information obtained by the bit search means is taken as the target rate-distortion gradient information λ, and, on the basis of that rate-distortion gradient λ, the encoded data at the corresponding pass stage stored in the first storage means is read out and output as compression-encoded data.

According to the present invention, when the target rate-distortion gradient information is represented by n bits, all of its bits can be determined with at most n code amount calculations and comparisons. The rate-distortion gradient can therefore be obtained efficiently, and encoded data of the target code amount can be generated efficiently.

Embodiments of the present invention will now be described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram of the image encoding apparatus according to the embodiment.

The difference from FIG. 2 is the addition of a bit search processing unit 209 that accesses the code block information storage unit 207 and determines, in order, each bit of the target rate-distortion gradient information. Based on the result obtained by the bit search processing unit 209 (the rate-distortion gradient information), the code string forming unit 205 reads from the code string storage unit 206 the code string at the corresponding pass position of each code block and generates the output encoded data. The rest is the same as described earlier; the following describes the features specific to this embodiment.

As described in detail later, the code block encoding unit 204 of the embodiment includes means (a register or the like) that, while generating the code data string for each pass stage of each code block when encoding one image, computes the rate-distortion gradient, compares it with the previously held rate-distortion gradient, and retains the larger of the two. This is so that, once the code data strings for the whole image have been generated, the maximum rate-distortion gradient information over the whole image can be used in the initial processing.

The overall processing of the apparatus shown in FIG. 1 follows the procedure shown in FIG. 14.

In FIG. 14, image data is first input in step S100. The image data is input from, for example, an image scanner, but it may equally be read from a storage medium holding the image data or received through a network interface.

Next, in step S101, a discrete wavelet transform is applied to the input image data, and in step S102 each coefficient of the transform result is quantized. The quantized result is then divided into code blocks (step S103), and each code block is encoded. Here, each code block is divided into bit planes, each bit plane other than the most significant one is further divided into three passes, and code data strings are generated in order from the highest pass (step S104). Next, in step S105, the encoded data of each pass is stored in predetermined storage means, and in step S106 the code increase and the distortion amount of each pass are associated with each other for each code block and stored in predetermined storage means as code block information (see FIG. 8).

Next, the bit search processing that characterizes this embodiment is performed to obtain the target rate-distortion gradient λ (step S107). This yields the pass truncation position for each code block, so in step S109 the encoded data string up to the corresponding pass position in each previously stored code block is extracted to generate the final encoded data string for the whole image, and in step S110 the generated encoded data string is output.

Of the above processing, the part characteristic of this embodiment is step S107, that is, the bit search processing unit 209 in FIG. 1. The following description therefore focuses on this point.

To simplify the description, the image data to be encoded by the image encoding apparatus of this embodiment is, as in the conventional example above, 512×512 monochrome image data with 8 bits per pixel. There is no tile division (1 tile = 512×512), the code block size is 64×64, and the wavelet transform is applied twice using the 9×7-tap filter. That is, the conditions are the same as in the conventional example described earlier.

It is also assumed that the code string storage unit 206 already holds the code data string of each code block at each pass stage, and that the code block information storage unit 207 already holds the table shown in FIG. 8 for each code block. The rate-distortion gradient information is assumed to be represented by 8 bits.

Based on the information stored in the code block information storage unit 207, the bit search processing unit 209 of the embodiment determines the target rate-distortion gradient information λth, which yields the target code amount or a code amount close to it, one bit at a time from the upper bits toward the lower bits. The principle is described below.

First, in the initial stage, the bit search processing unit 209 receives the maximum rate-distortion gradient information λmax held by the code block encoding unit 204. λmax could also be obtained by searching the code block information storage unit 207, but obtaining it as described conveniently avoids that search.

The target rate-distortion gradient information λth is guaranteed to be at least 0 and at most the maximum rate-distortion gradient information λmax. In other words, if the m bits of λmax from the most significant bit downward are all "0", then at least the m bits of λth from the most significant bit downward are also guaranteed to be "0". Obtaining λmax is therefore equivalent to finding how many bits of λth, counted from the most significant bit, can be set to 0.

Consequently, only the state of the lower n (= 8 − m) bits of λth needs to be determined (recall that the rate-distortion gradient information is 8 bits in the embodiment).

Suppose that the maximum rate-distortion gradient information λmax over all code blocks, received from the code block encoding unit 204, is 11 ("00001011" as an 8-bit binary number). In this case, at least the upper 4 bits of λth are 0 and the lower 4 bits are undetermined. That is, in binary notation λth is "0000xxxx" (where each x is either 0 or 1).

In other words, λth lies between binary "00001111" (hereinafter λvmax) and "00000000" (hereinafter λvmin). In this case λvmax > λmax, so strictly speaking λvmax should be set to λmax; however, because this embodiment determines λth bit by bit, λvmax is determined from the position of the highest of the significant bits of λmax.

Once λvmax and λvmin are determined, a provisional rate-distortion gradient threshold λm = "00001000", roughly midway between them, is obtained. That is, λm is formed by assuming that the highest bit of the undetermined bit group of λth (the lower 4 bits above) is "1" and setting the bits below it to 0.

Reference numerals 1501 to 1504 in FIG. 15 show code amount and distortion. Note that the horizontal axis is the code amount, but it can also be read as the pass stage.

Reference numeral 1501 in FIG. 15 shows the code amount and the distortion D at each of λvmax, λm, and λvmin. When the accumulated code amount R for gradients at or above the threshold λm (obtained by looking up, in the code block information storage unit 207, the code data amounts for a rate-distortion gradient of λm and accumulating them) is compared with the target code amount Rmax, the relationship is one of the following:
R > Rmax,
R < Rmax, or
R = Rmax.

If R > Rmax, the desired λth lies to the left of the current λm, as indicated by reference numeral 1502. In this case, λvmin is updated to the current λm, and the midpoint between λvmax and the updated λvmin, namely "00001100", becomes the new λm. In other words, the highest of the 4 undetermined bits of λth is fixed at "1", and the next lower bit is provisionally assumed to be "1".

If, on the other hand, R < Rmax, the desired λth lies to the right of the current λm, as indicated by reference numeral 1504. In this case, λvmax is updated to the current λm, and the midpoint between λvmin and the updated λvmax, namely "00000100", becomes the new λm. In other words, the highest of the 4 undetermined bits of λth is fixed at "0", and the next lower bit is provisionally assumed to be "1".

If R = Rmax, the current λm is judged to be the desired λth, and the processing ends.

When either R > Rmax or R < Rmax holds, the above processing is repeated, so that the 4 initially undetermined bits of λth are fixed in order from the highest bit position.

FIG. 17 shows the case where the maximum rate-distortion gradient λmax over all code blocks is 11 in decimal ("00001011" in binary). Scanning λmax from the top, the first "1" appears at bit 3, so the upper 4 bits of the provisional λm are set to "0" and the lower 4 bits are undetermined. Bit 3, the highest position in the undetermined bit group, is therefore the starting bit position, and the bits are then fixed in the order bit 2, bit 1, bit 0.

In short, when the initial λth has n undetermined bits, the final λth satisfying R ≤ Rmax can be obtained with at most n rounds of computing the total code amount R and comparing it with Rmax. Since the embodiment assumes that the rate-distortion gradient information is represented by 8 bits, at most 8 rounds are needed.
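As a rough comparison of evaluation counts, the following estimate may help. It assumes that the conventional scan visits every threshold from a starting value down to Smin in steps of ΔS; the symbol S_max for that starting value is introduced here for illustration only.

```latex
N_{\text{linear}} \le \left\lceil \frac{S_{\max} - S_{\min}}{\Delta S} \right\rceil ,
\qquad
N_{\text{bit}} \le n = \lfloor \log_2 \lambda_{\max} \rfloor + 1 .
```

For λmax = 11 this gives n = 4, and for any 8-bit gradient value n ≤ 8, whereas a unit-step scan over an 8-bit range may need up to 255 evaluations.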

With the above in mind, FIG. 16 shows a specific configuration of the bit search processing unit 209 of the embodiment, which is described below in the order of its operation.

After the code block encoding unit 204 has finished encoding one image, the maximum significant bit position detection unit 100 receives the maximum rate-distortion gradient information λmax over all code blocks held by the code block encoding unit 204, examines its bits from the most significant bit (MSB) downward, and outputs information S indicating the position of the first bit that is "1" to the threshold generation unit 101 and the determination unit 106. For example, when λmax is 11 in decimal as above, it is "00001011" in binary; bits 7 to 4 are "0" and the first "1" appears at bit 3, so "3" is output.

On the first iteration, the threshold generation unit 101 receives the information S from the maximum significant bit position detection unit 100, generates a provisional threshold λm (the initial λm) in which that significant bit position is set to "1", and outputs it to the code amount R determination unit 103. For example, if the maximum significant bit position is 3, λm = "00001000".
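The MSB scan and the initial threshold can be illustrated with a short sketch. The function name and the return value of -1 for an all-zero input are conventions chosen for this example, not part of the described apparatus.

```python
def highest_set_bit(value, width=8):
    """Scan from the MSB downward and return the position of the first 1 bit.

    Returns -1 when no bit is set (a convention chosen for this sketch).
    """
    for pos in range(width - 1, -1, -1):
        if (value >> pos) & 1:
            return pos
    return -1


lambda_max = 11                       # "00001011"
s = highest_set_bit(lambda_max)       # position S passed on to the threshold generator
lambda_m = 1 << s                     # initial provisional threshold
print(s, format(lambda_m, "08b"))     # prints: 3 00001000
```

For positive integers, Python's built-in `int.bit_length() - 1` yields the same position without an explicit loop.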

The code amount R determination unit 103 obtains the total code data amount R for rate-distortion gradient information equal to or greater than the input λm by searching the code block information storage unit 207 and accumulating the corresponding code amounts, and outputs the result to the comparison unit 104.
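A minimal sketch of this accumulation is shown below. The table layout is an assumption: each code block is modeled as a list of (gradient, code increase) pairs in pass order, with gradients that do not increase along the passes, so that the selected passes form a prefix of each block. The actual format of the FIG. 8 table may differ.

```python
def total_code_amount(blocks, lam):
    """Sum the code increase of every pass whose gradient is at least lam.

    blocks : list of code blocks, each a list of (gradient, code_increase)
             pairs in pass order (hypothetical layout for this sketch).
    """
    total = 0
    for block in blocks:
        for gradient, code_increase in block:
            if gradient < lam:
                break                 # later passes in this block are not included
            total += code_increase
    return total


# Illustrative tables only (not taken from the document).
blocks = [
    [(11, 100), (6, 200), (1, 300)],
    [(9, 150), (3, 250)],
]
print(total_code_amount(blocks, 8))
```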

The comparison unit 104 compares the preset target code amount Rmax with the code amount R from the code amount R determination unit 103, and outputs to the determination unit 106 information indicating which of R < Rmax, R > Rmax, or R = Rmax holds.

The determination unit 106 performs one of the three determination processes described below. It has a counter whose initial value is the value S output from the maximum significant bit position detection unit 100 and which is decremented by 1 whenever any of the three processes is performed. Denoting the counter value by CT, CT = S on the first iteration.

<Determination process 1>
When the comparison result from the comparison unit 104 indicates R = Rmax, the current λm is output to the code string forming unit 205 as the target rate-distortion gradient information λth. The processing ends at this point even if the counter value CT is not 0.

<Determination process 2>
When the comparison result from the comparison unit 104 indicates R > Rmax, the value "1" of bit CT, the bit indicated by the counter value CT, is confirmed as correct (when CT = S = 3, bit 3 of λm is confirmed as "1"). CT is then decremented by 1, and the threshold generation unit 101 is requested to update λm so that the bit position indicated by the decremented CT (bit CT) is provisionally set to "1".

For example, when S = CT = 3, λm is "00001000"; after setting CT = CT − 1 = 2, the unit requests generation of a new λm = "00001100", in which bit 2 of the current λm is set to "1".

If the updated CT is less than 0, the state of every bit judged to be significant has been fixed, so the λm for which the last code amount R was obtained is output to the code string forming unit 205 as the target rate-distortion gradient information λth, and the processing ends.

<Determination process 3>
When the comparison result from the comparison unit 104 indicates R < Rmax, the value of bit CT, the bit indicated by the counter value CT, is wrong and is fixed at "0" (when CT = S = 3, bit 3 of λm is set to "0"). CT is then decremented by 1, and the threshold generation unit 101 is requested to update λm so that the bit position indicated by the decremented CT is provisionally set to "1".

For example, when S = CT = 3, λm is "00001000", but bit 3 has turned out to be "0"; after setting CT = CT − 1 = 2, the unit requests generation of a new λm = "00000100", in which bit 2 is set to "1". If the updated CT is less than 0, the state of every bit judged to be significant has been fixed; in this case the least significant bit of the λm for which the code amount R was obtained is set to 0, the result is output to the code string forming unit 205 as the target rate-distortion gradient information λth, and the processing ends.
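The three determination processes can be combined into one loop, sketched below. This is an illustration under assumptions rather than the apparatus itself: the function names, the callback `code_amount_at`, and the sample data are hypothetical, and the code amount is taken over passes whose gradient is at least the provisional threshold.

```python
def bit_search(code_amount_at, lambda_max, r_max):
    """Fix the bits of the threshold from the highest significant bit downward.

    code_amount_at : function returning the total code amount R for a threshold
    Returns (final threshold, number of code amount evaluations).
    """
    if lambda_max <= 0:
        return 0, 0
    ct = lambda_max.bit_length() - 1   # counter CT starts at position S
    lam = 1 << ct                      # initial provisional threshold
    evaluations = 0
    while True:
        r = code_amount_at(lam)
        evaluations += 1
        if r == r_max:                 # determination process 1
            return lam, evaluations
        if r < r_max:                  # determination process 3: bit CT is "0"
            lam &= ~(1 << ct)
        # determination process 2 (r > r_max): bit CT stays "1"
        ct -= 1
        if ct < 0:                     # every significant bit has been fixed
            return lam, evaluations
        lam |= 1 << ct                 # provisionally set the next lower bit


# Illustrative data only (not taken from the document).
passes = [(11, 100), (9, 150), (6, 200), (3, 250), (1, 300)]


def code_amount_at(lam):
    """Total code amount of the passes whose gradient is at least lam."""
    return sum(inc for grad, inc in passes if grad >= lam)


print(bit_search(code_amount_at, 11, 450))
print(bit_search(code_amount_at, 11, 500))
```

With λmax = 11 the loop never evaluates the code amount more than 4 times. The returned value is simply the last provisional threshold, as in the text; how passes whose gradient equals that value are treated when the final code stream is assembled is left to the code string forming step.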

The target rate-distortion gradient information λth is obtained as described above. The code string forming unit 205 treats the given λth as the maximum rate-distortion gradient information, determines the corresponding pass position in the code block information storage unit 207 for each code block, reads the code data string up to that pass position from the code string storage unit 206, arranges the data in a predetermined format, and outputs it to the code output unit 208.

As described above, according to this embodiment, when the rate-distortion gradient information is represented by n bits, the target rate-distortion gradient information can be obtained with at most n rounds of computing the code amount R and comparing it with the target code amount Rmax. Compared with the conventional approach of repeatedly adding Δλ to an initial value λ, the amount of computation and the number of accesses to the code block information storage unit are greatly reduced, and the processing speed is significantly improved.

In addition, the number of significant bits of the maximum rate-distortion gradient information over all code blocks obtained during encoding is determined, and each bit of the target rate-distortion gradient information λth is fixed from the highest of those significant bits downward. Because no determination is needed for bits outside the significant bits, the processing is simplified accordingly. Furthermore, having the code block encoding unit 204 hold the maximum rate-distortion gradient information over all code blocks saves one search of the code block information storage unit 207, reducing the processing load further.

FIG. 19 shows a configuration in which the processing for obtaining the target rate-distortion gradient information λth in the embodiment is implemented on a general-purpose information processing apparatus such as a personal computer.

In the figure, reference numeral 1 denotes a CPU that controls the entire apparatus, 2 a ROM that stores a boot program, BIOS, and the like, and 3 a RAM used as the work area of the CPU 1. Reference numeral 4 denotes an external storage device such as a hard disk drive, which stores an OS and an application program for image compression (a JPEG2000 application), as shown in FIG. 20. It also stores, among other things, a communication program for distributing over a network the final code, whose amount equals the target code amount Rmax or is below and close to Rmax. Reference numeral 5 denotes an image input unit, for example an image scanner; when the image data is stored on a storage medium it can be regarded as means for accessing that medium, and when the image data resides on a network server it can be regarded as a network interface. Reference numeral 6 denotes a keyboard (KB) and a pointing device (PD), and 7 a display control unit that contains a video RAM, draws into the video RAM, and reads data from the video RAM to output it as a video signal. Reference numeral 8 denotes a display device that displays images based on the video signal from the display control unit 7.

In the above configuration, when the power is turned on, the CPU 1 loads the OS from the external storage device into the RAM 3 in accordance with the boot program stored in the ROM 2, and then loads the image compression program. The image data input from the image input unit 5 is thereby converted into compression-encoded data whose code amount is the preset target code amount Rmax or close to it without exceeding it, and the result is stored in the external storage device. The finally generated compression-encoded image data need not be written to the external storage device; it may be output over a network, and the output destination is not restricted.

As described earlier, the compression-encoding program for the image data follows the procedure of FIG. 14. The data generated up to step S106, corresponding to the code block information storage unit and the code string storage unit, is temporarily stored in the external storage device 4, as shown in FIG. 19. If sufficient space can be reserved in the RAM, the data may be held there instead.

The bit search processing of step S107 in this embodiment can be performed by executing the processing shown in FIG. 18. An example of a software implementation is described below with reference to FIG. 18.

First, in step S1801, the code block information storage unit held in the external storage device 4 is consulted to obtain the maximum rate/distortion gradient information λmax over all code blocks of the input image. Because the code block information storage unit holds data in the format shown in FIG. 8 for each code block, it suffices to compute the ratio between the increase in code amount and the distortion index value for each entry and take the maximum over all code blocks. Alternatively, if the maximum value is recorded while the encoded data string of each code block is being generated, λmax can be obtained even more easily from that record.
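As an illustration of step S1801, the following Python sketch scans per-pass records and takes the largest gradient. The `PassInfo` record, its field names, the choice of distortion per unit of added code as the ratio, the integer (floor) division, and the sample numbers are all assumptions made for this example; the text only states that a ratio between the increase in code amount and the distortion index value is computed from data in the format of FIG. 8.

```python
from typing import List, NamedTuple


class PassInfo(NamedTuple):
    """Hypothetical per-pass record for one code block."""
    delta_rate: int        # increase in code amount contributed by this pass (> 0)
    delta_distortion: int  # distortion index value associated with this pass


def gradient(p: PassInfo) -> int:
    # Integer gradient so that it can later be searched bit by bit.
    return p.delta_distortion // p.delta_rate


def max_gradient(code_blocks: List[List[PassInfo]]) -> int:
    # S1801: maximum gradient over every pass of every code block.
    return max(gradient(p) for block in code_blocks for p in block)


# Two made-up code blocks with three and two passes.
blocks = [
    [PassInfo(10, 900), PassInfo(20, 600), PassInfo(40, 200)],
    [PassInfo(5, 700), PassInfo(25, 500)],
]
lambda_max = max_gradient(blocks)
```

Keeping a running maximum while the passes are being encoded, as the text suggests, would avoid this extra scan.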

Next, in step S1802, the position of the most significant bit among the significant bits of λmax is obtained and stored in a variable n. In step S1803, a variable λm used to find the target rate/distortion gradient (and having at least n bits) is cleared to zero, and in step S1804 its bit n is set to “1”.
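Steps S1802 to S1804 can be sketched in Python as follows. The function name `init_search` is hypothetical, and the sketch assumes that λmax is a positive integer and that bit positions are counted from 0 at the LSB.

```python
from typing import Tuple


def init_search(lambda_max: int) -> Tuple[int, int]:
    """Return (n, lambda_m) as prepared by steps S1802 to S1804."""
    if lambda_max < 1:
        raise ValueError("lambda_max must be a positive integer")
    n = lambda_max.bit_length() - 1  # S1802: position of the highest set bit
    lambda_m = 0                     # S1803: clear the working variable
    lambda_m |= 1 << n               # S1804: set bit n to 1
    return n, lambda_m


n, lambda_m = init_search(140)  # 140 is 0b10001100, so n = 7 and lambda_m = 128
```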

In step S1805, the code block information storage unit is consulted, the code amounts of the code blocks having a rate/distortion gradient of λm or more are cumulatively added, and the total code amount R for gradients of λm or more is obtained. Then, in step S1806, the total code amount R is compared with the target code amount Rmax (which may be specified by the user with a keyboard or the like, or determined from the size of the original image).
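The accumulation in step S1805 might look like the sketch below. Representing each pass as a `(delta_rate, gradient)` pair and the sample values are assumptions for illustration; the sketch also assumes that summing every pass whose gradient is at least λm is the intended reading of "cumulatively added".

```python
from typing import List, Tuple

Pass = Tuple[int, int]  # (increase in code amount, integer rate/distortion gradient)


def total_code_amount(code_blocks: List[List[Pass]], lambda_m: int) -> int:
    # S1805: add up the code amount of every pass whose gradient is >= lambda_m.
    return sum(
        rate
        for block in code_blocks
        for rate, grad in block
        if grad >= lambda_m
    )


# Made-up data: two code blocks.
blocks = [
    [(10, 90), (20, 30), (40, 5)],
    [(5, 140), (25, 20)],
]
r_high = total_code_amount(blocks, 128)  # only the gradient-140 pass qualifies
r_mid = total_code_amount(blocks, 30)    # gradients 90, 30 and 140 qualify
```

Raising λm can only shrink R, and this monotonic behavior is what lets the search decide one bit of λm per comparison.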

If it is determined that R < Rmax, bit n of λm should be “0” rather than “1”, so bit n of λm is corrected to “0” in step S1807.

If it is determined in step S1806 that R > Rmax, or once the processing of step S1807 has been performed, the process proceeds to step S1808, where it is determined whether the variable n is “0”, that is, whether the determination has been completed down to the LSB. If not (n > 0), the variable n is decremented by 1 in step S1809, bit n of λm is set to “1” in step S1810, and the process returns to step S1805 to repeat the above processing.

If it is determined that R = Rmax in step S1806, or that n = 0 in step S1808, the value of λm at that point is adopted as the target rate/distortion gradient λth (step S1811), and control returns to the process that called this processing. Thereafter, processing steps S109 and S110 shown in FIG. 14 may be performed.
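Putting steps S1801 to S1811 together, the whole bit search can be sketched as a single Python function. This is a minimal sketch under stated assumptions, not the patented implementation: the name `search_lambda`, the `(delta_rate, gradient)` pass representation, the use of positive integer gradients, and the sample data are all hypothetical.

```python
from typing import List, Tuple

Pass = Tuple[int, int]  # (increase in code amount, integer rate/distortion gradient)


def search_lambda(code_blocks: List[List[Pass]], r_max: int) -> int:
    """Decide the bits of the threshold gradient from MSB to LSB."""
    lambda_max = max(g for block in code_blocks for _, g in block)  # S1801
    n = lambda_max.bit_length() - 1                                 # S1802
    lambda_m = 1 << n                                               # S1803, S1804
    while True:
        # S1805: total code amount of passes with gradient >= lambda_m.
        r = sum(rate for block in code_blocks for rate, g in block if g >= lambda_m)
        if r == r_max:               # S1806: exact match ends the search
            return lambda_m          # S1811
        if r < r_max:                # S1807: threshold too high, clear bit n
            lambda_m &= ~(1 << n)
        if n == 0:                   # S1808: LSB has been decided
            return lambda_m          # S1811
        n -= 1                       # S1809
        lambda_m |= 1 << n           # S1810: tentatively set the next lower bit


blocks = [
    [(10, 90), (20, 30), (40, 5)],
    [(5, 140), (25, 20)],
]
lambda_exact = search_lambda(blocks, 35)
lambda_lsb = search_lambda(blocks, 40)
```

With this toy data and Rmax = 35, the search stops early at λm = 24 because R equals Rmax exactly. With Rmax = 40 it runs down to the LSB and returns 20. The number of evaluations of R is bounded by the bit length of λmax, which is the practical appeal of deciding one bit per iteration.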

An embodiment of the present invention has been described above. Although the present invention is well suited to implementation in JPEG2000, it can also be applied to other encoding schemes in which the encoded data is divided into small sections for which a code amount and a distortion index value can be obtained.

In the above embodiment, monochrome image data of 512 × 512 pixels with 8 bits per pixel was described as the image to be encoded. However, the invention may also be applied to other image data, such as images of other sizes and bit depths or color images in which each pixel is represented by a plurality of color components. It may further be applied to each frame or field of a moving image.

Furthermore, as described above, the processing in the embodiment can also be implemented as a computer program executed by a general-purpose information processing apparatus such as a personal computer. Since a computer program normally becomes executable by loading a computer-readable storage medium such as a CD-ROM into the computer and copying or installing the program to the system, such a computer-readable storage medium clearly also falls within the scope of the present invention.

FIG. 1 is a block diagram of the image encoding apparatus in the embodiment.
FIG. 2 is a block diagram showing the configuration of a conventional image encoding apparatus.
FIG. 3 is a diagram for explaining the subbands of an encoding target image processed by the two-dimensional discrete wavelet transform.
FIG. 4 is a flowchart showing the flow of processing for determining the code truncation point ni of code block Bi.
FIG. 5 is a flowchart showing the flow of monotonic-decrease processing by merging the passes of code block Bi.
FIG. 6 is a diagram for explaining the seven subbands obtained by applying the two-dimensional discrete wavelet transform twice.
FIG. 7 is a diagram showing how code blocks are divided in the code block dividing unit 203.
FIG. 8 is a diagram showing an example of the information on code block Bi constructed inside the code block encoding unit 204.
FIG. 9 is a diagram showing the information on code truncation candidate points stored in the code block information storage unit 207.
FIG. 10 is a diagram showing an example of the relationship between rate and distortion for each pass of a code block.
FIG. 11 is a diagram showing how passes are merged by the monotonic-decrease processing.
FIG. 12 is a diagram showing the relationship between each bit plane of a code block and the encoding passes.
FIG. 13 is a flowchart showing the flow of λ determination processing in the conventional code string forming unit 205.
FIG. 14 is a flowchart showing the processing procedure of the image data compression processing in the embodiment.
FIG. 15 is a diagram for explaining the principle of the bit search processing unit in FIG. 1.
FIG. 16 is a diagram showing an example of the configuration of the bit search processing unit in FIG. 1.
FIG. 17 is a diagram for explaining the processing performed by the bit search processing unit in FIG. 1.
FIG. 18 is a flowchart showing the processing procedure for realizing the bit search processing in software.
FIG. 19 is a diagram showing a configuration example of the apparatus when the image compression processing in the embodiment is realized by a computer program.
FIG. 20 is a diagram showing the programs and work areas stored and secured in the external storage device.

Claims

An image data encoding apparatus that frequency-transforms image data, divides the transformed data into a plurality of blocks of a predetermined size, generates code data from the bit information of each coefficient value in each divided block in pass order from the upper bits to the lower bits, and determines, for each block, down to which lower pass position from the uppermost pass the encoded data is to be used as compression-encoded data, on the basis of a target code amount Rmax and a rate/distortion gradient λ expressed by the increase in code data amount at each pass position and the distortion of the image, the apparatus comprising:
First storage means for storing the code data at each pass stage for each block of the frequency-transformed data;
Second storage means for storing, for each pass stage in each block, information on the increase in code amount and the distortion amount, and on the distortion amount relative to the increase in code amount;
Setting means for setting provisional rate/distortion gradient information having a significant bit length n, with its most significant bit i (i = n−1) set to “1”;
Calculating means for calculating, by referring to the second storage means, a total code amount R of the code data whose rate/distortion gradient information is equal to or greater than the provisional rate/distortion gradient information set by the setting means;
Comparison means for comparing the total code amount R calculated by the calculating means with the target code amount Rmax;
Update means for, according to the comparison result of the comparison means, when R > Rmax, decrementing i by “1” and updating bit i of the provisional rate/distortion gradient information to “1”, and when R < Rmax, correcting bit i to “0”, decrementing i by “1”, and updating bit i of the provisional rate/distortion gradient information to “1”; and
Bit search means for repeating the processing of the calculating means and the determination means so as to search the bit information of the provisional rate/distortion gradient information until R = Rmax in the comparison means, until all bit information of the provisional rate/distortion gradient information has been determined, or until the difference between the target code amount Rmax and the code amount obtained by collecting the codes up to the provisional rate/distortion gradient information falls within a preset error,
Wherein the final provisional rate/distortion gradient information obtained by the bit search means is taken as the target rate/distortion gradient information λ, and the encoded data at the corresponding pass stage stored in the first storage means is read out on the basis of the rate/distortion gradient λ and output as compression-encoded data.

The image data encoding apparatus according to claim 1, wherein the encoding is encoding according to JPEG2000.

The image data encoding apparatus according to claim 1, further comprising: storage holding means for storing and holding, when the encoded data is generated, the maximum rate/distortion gradient information among the rate/distortion gradient information at each pass stage of each code block; and
Significant bit number calculating means for obtaining the number of significant bits of the maximum rate/distortion gradient information stored and held by the storage holding means,
Wherein the number of bits n of the provisional rate/distortion gradient information is the number of bits calculated by the significant bit number calculating means.

The image data encoding apparatus according to claim 1, further comprising: search means for obtaining the maximum rate/distortion gradient information by searching the second storage means; and
Significant bit number calculating means for obtaining the number of significant bits of the maximum rate/distortion gradient information found by the search means,
Wherein the number of bits n of the provisional rate/distortion gradient information is the number of bits calculated by the significant bit number calculating means.

An image data encoding method that frequency-transforms image data, divides the transformed data into a plurality of blocks of a predetermined size, generates code data from the bit information of each coefficient value in each divided block in pass order from the upper bits to the lower bits, and determines, for each block, down to which lower pass position from the uppermost pass the encoded data is to be used as compression-encoded data, on the basis of a target code amount Rmax and a rate/distortion gradient λ expressed by the increase in code data amount at each pass position and the distortion of the image, the method comprising:
A first storage step of storing the code data at each pass stage for each block of the frequency-transformed data;
A second storage step of storing, for each pass stage in each block, information on the increase in code amount and the distortion amount, and on the distortion amount relative to the increase in code amount;
A setting step of setting provisional rate/distortion gradient information having a significant bit length n, with its most significant bit i (i = n−1) set to “1”;
A calculation step of calculating, by referring to the information stored in the second storage step, a total code amount R of the code data whose rate/distortion gradient information is equal to or greater than the provisional rate/distortion gradient information set in the setting step;
A comparison step of comparing the total code amount R calculated in the calculation step with the target code amount Rmax;
An update step of, when the comparison result of the comparison step is R > Rmax, decrementing i by “1” and updating bit i of the provisional rate/distortion gradient information to “1”, and when R < Rmax, correcting bit i to “0”, decrementing i by “1”, and updating bit i of the provisional rate/distortion gradient information to “1”; and
A bit search step of repeating the processing of the calculation step and the determination step so as to search the bit information of the provisional rate/distortion gradient information until R = Rmax in the comparison step, until all bit information of the provisional rate/distortion gradient information has been determined, or until the difference between Rmax and the code amount obtained by collecting the codes up to the provisional rate/distortion gradient information falls within a predetermined value,
Wherein the final provisional rate/distortion gradient information obtained in the bit search step is taken as the target rate/distortion gradient information λ, and the encoded data at the corresponding pass stage stored in the first storage step is read out on the basis of the rate/distortion gradient λ and output as compression-encoded data.

A computer program that causes a computer to function as an image data encoding apparatus that frequency-transforms image data, divides the transformed data into a plurality of blocks of a predetermined size, generates code data from the bit information of each coefficient value in each divided block in pass order from the upper bits to the lower bits, and determines, for each block, down to which lower pass position from the uppermost pass the encoded data is to be used as compression-encoded data, on the basis of a target code amount Rmax and a rate/distortion gradient λ expressed by the increase in code data amount at each pass position and the distortion of the image, the program causing the computer to function as:
First storage means for storing the code data at each pass stage for each block of the frequency-transformed data;
Second storage means for storing, for each pass stage in each block, information on the increase in code amount and the distortion amount, and on the distortion amount relative to the increase in code amount;
Setting means for setting provisional rate/distortion gradient information having a significant bit length n, with its most significant bit i (i = n−1) set to “1”;
Calculating means for calculating, by referring to the second storage means, a total code amount R of the code data whose rate/distortion gradient information is equal to or greater than the provisional rate/distortion gradient information set by the setting means;
Comparison means for comparing the total code amount R calculated by the calculating means with the target code amount Rmax;
Update means for, according to the comparison result of the comparison means, when R > Rmax, decrementing i by “1” and updating bit i of the provisional rate/distortion gradient information to “1”, and when R < Rmax, correcting bit i to “0”, decrementing i by “1”, and updating bit i of the provisional rate/distortion gradient information to “1”; and
Bit search means for repeating the processing of the calculating means and the determination means so as to search the bit information of the provisional rate/distortion gradient information until R = Rmax in the comparison means, until all bit information of the provisional rate/distortion gradient information has been determined, or until the difference between Rmax and the code amount obtained by collecting the codes up to the provisional rate/distortion gradient information falls within a predetermined value,
Wherein the final provisional rate/distortion gradient information obtained by the bit search means is taken as the target rate/distortion gradient information λ, and the encoded data at the corresponding pass stage stored in the first storage means is read out on the basis of the rate/distortion gradient λ and output as compression-encoded data.

A computer-readable storage medium storing the computer program according to claim 6.