JP2008211810A

JP2008211810A - Moving image reversible encoding method and decoding method thereof, and program therefor

Info

Publication number: JP2008211810A
Application number: JP2008059225A
Authority: JP
Inventors: Takayuki Nakachi; 孝之仲地; Tatsuya Fujii; 竜也藤井; Tomoko Sawabe; 知子澤邉
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2008-03-10
Filing date: 2008-03-10
Publication date: 2008-09-11
Anticipated expiration: 2021-10-04
Also published as: JP4511607B2

Abstract

PROBLEM TO BE SOLVED: To provide lossless moving image encoding with high coding efficiency, and the corresponding decoding, with spatial resolution scalability. SOLUTION: A band dividing section 10 divides a source signal into a plurality of spatial resolution bands. A space-time adaptive prediction encoding section 11 is provided for each band and estimates the motion of a target object in units of small blocks to determine a motion vector. The motion vectors take the motion vector of the lowest frequency band as a reference. An inter-frame correlation coefficient R is calculated from the current frame and the reference frame shifted by the motion vector, and R is compared with a threshold value. When the inter-frame correlation is strong, three-dimensional prediction is performed; when it is weak, two-dimensional prediction is performed. In either case a prediction residual signal is produced. An entropy encoding section 12 entropy-encodes the prediction residual signal produced by the space-time adaptive prediction encoding section 11 together with the motion vector used by that section, and outputs an encoded bit stream. COPYRIGHT: (C)2008,JPO&INPIT

Description

The present invention relates to encoding and decoding techniques for efficiently transmitting and storing moving images.

JPEG-LS and the lossless mode of JPEG2000 are known lossless image encoding methods. JPEG-LS uses an intra-frame predictor and encodes the difference between the prediction signal output by the predictor and the original signal. Using an intra-frame predictor improves coding efficiency by exploiting the correlation of signals within a frame. JPEG2000 performs a lossless wavelet transform and encodes the wavelet coefficients. Because the wavelet transform divides the image signal into spatial frequency bands and each band is encoded separately, JPEG2000 has spatial resolution scalability.

However, both of these conventional methods are intended for still images, so when they are applied to moving images they cannot exploit the signal correlation between frames. For lossy encoding of moving images, there is a method that has spatial resolution scalability and uses inter-frame correlation by performing motion compensation in the subband domain, but because motion compensation predicts in units of blocks, its coding efficiency is not necessarily good. Having spatial resolution scalability means that images of different spatial resolutions can be decoded directly from a single encoded bit stream.

An object of the present invention is to propose a lossless moving image encoding method that has spatial resolution scalability and excellent coding efficiency, a corresponding decoding method, and programs for them.

In order to solve the above problem, the invention according to claim 1 is a lossless encoding method for moving images, in which:
the original image is divided into bands;
for the lowest frequency band among the divided bands, a SAD calculation is performed in block units to obtain a motion vector; pixels in a block that gives a SAD value larger than a predetermined threshold T are subjected to two-dimensional prediction; pixels in a block that gives a SAD value equal to or less than the threshold T are subjected to space-time adaptive prediction to obtain a prediction residual signal; and the prediction residual signal and the motion vector used in the space-time adaptive prediction are encoded;
in encoding the second lowest frequency band, pixels of a block at the spatial position corresponding to a block for which two-dimensional prediction was performed in the lowest frequency band are directly encoded; for the other blocks, a SAD calculation is performed to obtain a motion vector; pixels in a block that gives a SAD value larger than a predetermined threshold T2 are directly encoded; pixels in a block that gives a SAD value equal to or less than the threshold T2 are subjected to space-time adaptive prediction to obtain a prediction residual signal; and the prediction residual signal and the motion vector used in the space-time adaptive prediction are encoded;
in encoding the subsequent higher frequency bands, pixels of a block at the spatial position corresponding to a block that was directly encoded in the frequency band encoded immediately before are directly encoded; for the other blocks, a SAD calculation is performed to obtain a motion vector; pixels in a block that gives a SAD value larger than a threshold Tx determined for each frequency band are directly encoded; pixels in a block that gives a SAD value equal to or less than the threshold Tx are subjected to space-time adaptive prediction to obtain a prediction residual signal; and the prediction residual signal and the motion vector used in the space-time adaptive prediction are encoded;
in the space-time adaptive prediction, using the motion vector obtained by the SAD calculation, a correlation coefficient is calculated between the signal values in the neighborhood of the pixel to be encoded in the current frame and in the reference frame shifted by shift means; when the correlation coefficient is large, prediction is switched to three-dimensional prediction, which uses the neighboring signal values of both the current frame and the reference frame; when the correlation coefficient is small, prediction is switched to two-dimensional prediction, which uses only the neighboring signal values of the current frame;
when prediction is switched to the two-dimensional prediction, the prediction residual signal is calculated from the neighboring signal values of the current frame by switching among a plurality of two-dimensional predictors for each pixel; and
when prediction is switched to the three-dimensional prediction, the prediction residual signal is calculated from the neighboring signal values of the current frame and the reference frame by switching among a plurality of three-dimensional predictors for each pixel.

The invention according to claim 2 is the lossless moving image encoding method of claim 1, characterized in that a code indicating that space-time adaptive prediction was not performed is added to the encoded signal of any block for which a SAD value larger than the threshold was obtained.

The invention according to claim 3 is a lossless decoding method that outputs a moving image, in which:
the directly encoded blocks, prediction residual signals, and motion vectors for each band, encoded by the lossless moving image encoding method according to claim 1, are received;
in order from the lowest frequency band to the highest frequency band, the directly encoded blocks, prediction residual signals, and motion vectors are decoded for each band, and space-time adaptive prediction decoding is further performed from the prediction residual signals and the motion vectors;
the decoded bands are combined by band synthesis to decode the moving image;
in the space-time adaptive prediction decoding, a correlation coefficient is calculated between the signal values in the neighborhood of the pixel to be decoded in the reference frame shifted using the motion vector and the signal values in the neighborhood of the pixel to be decoded that consist of already decoded signals of the current frame; when the correlation coefficient is large, prediction is switched to three-dimensional prediction, and when the correlation coefficient is small, prediction is switched to two-dimensional prediction;
when prediction is switched to the two-dimensional prediction, a prediction signal is calculated from the neighboring signal values consisting of already decoded signals of the current frame by switching among a plurality of two-dimensional predictors for each pixel;
when prediction is switched to the three-dimensional prediction, a prediction signal is calculated from the neighboring signal values consisting of already decoded signals of the current frame and the neighboring signal values of the reference frame shifted using the motion vector by switching among a plurality of three-dimensional predictors for each pixel; and
the target signal is decoded by adding the prediction residual signal to the prediction signal.

The invention according to claim 4 is a program for causing a lossless encoding apparatus for moving images to execute:
a procedure for dividing the original image into bands;
a procedure in which, for the lowest frequency band among the divided bands, a SAD calculation is performed in block units to obtain a motion vector, pixels in a block that gives a SAD value larger than a predetermined threshold T are subjected to two-dimensional prediction, pixels in a block that gives a SAD value equal to or less than the threshold T are subjected to space-time adaptive prediction to obtain a prediction residual signal, and the prediction residual signal and the motion vector used in the space-time adaptive prediction are encoded;
a procedure in which, in encoding the second lowest frequency band, pixels of a block at the spatial position corresponding to a block for which two-dimensional prediction was performed in the lowest frequency band are directly encoded, a SAD calculation is performed for the other blocks to obtain a motion vector, pixels in a block that gives a SAD value larger than a predetermined threshold T2 are directly encoded, pixels in a block that gives a SAD value equal to or less than the threshold T2 are subjected to space-time adaptive prediction to obtain a prediction residual signal, and the prediction residual signal and the motion vector used in the space-time adaptive prediction are encoded; and
a procedure in which, in encoding the subsequent higher frequency bands, pixels of a block at the spatial position corresponding to a block that was directly encoded in the frequency band encoded immediately before are directly encoded, a SAD calculation is performed for the other blocks to obtain a motion vector, pixels in a block that gives a SAD value larger than a threshold Tx determined for each frequency band are directly encoded, pixels in a block that gives a SAD value equal to or less than the threshold Tx are subjected to space-time adaptive prediction to obtain a prediction residual signal, and the prediction residual signal and the motion vector used in the space-time adaptive prediction are encoded;
wherein, in the space-time adaptive prediction, using the motion vector obtained by the SAD calculation, a correlation coefficient is calculated between the signal values in the neighborhood of the pixel to be encoded in the current frame and in the reference frame shifted by shift means; when the correlation coefficient is large, prediction is switched to three-dimensional prediction, which uses the neighboring signal values of both the current frame and the reference frame; when the correlation coefficient is small, prediction is switched to two-dimensional prediction, which uses only the neighboring signal values of the current frame;
when prediction is switched to the two-dimensional prediction, the prediction residual signal is calculated from the neighboring signal values of the current frame by switching among a plurality of two-dimensional predictors for each pixel; and
when prediction is switched to the three-dimensional prediction, the prediction residual signal is calculated from the neighboring signal values of the current frame and the reference frame by switching among a plurality of three-dimensional predictors for each pixel.

The invention according to claim 5 is a program for causing a lossless decoding apparatus that outputs a moving image to execute:
a procedure for receiving the directly encoded blocks, prediction residual signals, and motion vectors for each band, encoded by the lossless moving image encoding method according to claim 1;
a procedure for decoding, in order from the lowest frequency band to the highest frequency band, the directly encoded blocks, prediction residual signals, and motion vectors for each band, and for further performing space-time adaptive prediction decoding from the prediction residual signals and the motion vectors; and
a procedure for decoding the moving image by band synthesis of the decoded bands;
wherein, in the space-time adaptive prediction decoding, a correlation coefficient is calculated between the signal values in the neighborhood of the pixel to be decoded in the reference frame shifted using the motion vector and the signal values in the neighborhood of the pixel to be decoded that consist of already decoded signals of the current frame; when the correlation coefficient is large, prediction is switched to three-dimensional prediction, and when the correlation coefficient is small, prediction is switched to two-dimensional prediction;
when prediction is switched to the two-dimensional prediction, a prediction signal is calculated from the neighboring signal values consisting of already decoded signals of the current frame by switching among a plurality of two-dimensional predictors for each pixel;
when prediction is switched to the three-dimensional prediction, a prediction signal is calculated from the neighboring signal values consisting of already decoded signals of the current frame and the neighboring signal values of the reference frame shifted using the motion vector by switching among a plurality of three-dimensional predictors for each pixel; and
the target signal is decoded by adding the prediction residual signal to the prediction signal.

In the present invention, to realize spatial resolution scalability, an image is divided into spatial resolution bands and each band is encoded separately. Encoding uses space-time adaptive predictive coding, which switches between intra-frame prediction (two-dimensional prediction) and inter-frame prediction (three-dimensional prediction) for each pixel, to increase coding efficiency.

According to the present invention, moving images can be losslessly encoded efficiently and stored with little disk capacity. Furthermore, because the method has spatial resolution scalability, an image can be decoded at a spatial resolution suited to the performance and application of the image display device. Decoding from the lowest band up to an arbitrary band reproduces an image with a lower spatial resolution than the original, and decoding all the data reproduces an image with the same resolution as the original. To reproduce an image at a lower resolution than the original, only the bands up to the required one need to be decoded. The image is then obtained directly from the encoded data, which takes less processing time than reproducing the full-resolution image and then converting its resolution. Likewise, when the encoded bit stream is transmitted, only the necessary data is sent, so the transmission rate is also lower.

Hereinafter, embodiments of the present invention will be described in detail with reference to the drawings.

[First Embodiment]
FIG. 1 shows a basic configuration for realizing a moving image encoding method having spatial resolution scalability. In FIG. 1, 10 is a band dividing unit, 11 is a space-time adaptive prediction encoding unit provided for each divided band, and 12 is an entropy encoding unit.

A moving image encoding method is described below as an example of how the configuration of FIG. 1 operates. First, the band dividing unit 10 divides the input image signal into a plurality of spatial resolution bands. Next, for each divided band, the entropy encoding unit 12 encodes the residual signal generated by the space-time adaptive prediction encoding unit 11.

The band dividing unit 10 applies the octave division shown in FIG. 2 to the image in both the horizontal and vertical directions. In octave division, a two-band splitting filter is applied repeatedly to divide the input signal into a plurality of bands, so that the bands are finally divided as shown in FIG. 3.

In FIG. 3, L denotes a low frequency component and H denotes a high frequency component. The two-band splitting filter used here and the band synthesis filter used on the decoding side are perfect reconstruction filters, in order to preserve reversibility.
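
The octave division with a perfect reconstruction filter pair can be sketched as follows. The text does not name the filter, so this sketch assumes the integer Haar (S-transform) lifting pair, which is one simple reversible choice; all function names are hypothetical, and image dimensions are assumed to be divisible by 2 at every level.

```python
def split_rows(rows):
    """Reversible two-band split of every row (assumed filter: integer Haar lifting)."""
    low, high = [], []
    for r in rows:
        h = [r[i + 1] - r[i] for i in range(0, len(r), 2)]              # H: detail
        l = [r[i] + (d >> 1) for i, d in zip(range(0, len(r), 2), h)]   # L: floor average
        low.append(l)
        high.append(h)
    return low, high

def merge_rows(low, high):
    """Exact inverse of split_rows (band synthesis)."""
    out = []
    for l, h in zip(low, high):
        r = []
        for s, d in zip(l, h):
            even = s - (d >> 1)
            r += [even, even + d]
        out.append(r)
    return out

def transpose(m):
    return [list(col) for col in zip(*m)]

def split_2d(img):
    """One level: horizontal split, then vertical split of both halves."""
    lo, hi = split_rows(img)
    ll, lh = (transpose(b) for b in split_rows(transpose(lo)))
    hl, hh = (transpose(b) for b in split_rows(transpose(hi)))
    return ll, lh, hl, hh

def merge_2d(ll, lh, hl, hh):
    lo = transpose(merge_rows(transpose(ll), transpose(lh)))
    hi = transpose(merge_rows(transpose(hl), transpose(hh)))
    return merge_rows(lo, hi)

def octave_split(img, levels):
    """Repeatedly split the lowest band, as in octave division."""
    bands, ll = [], img
    for _ in range(levels):
        ll, lh, hl, hh = split_2d(ll)
        bands.append((lh, hl, hh))
    return ll, bands

def octave_merge(ll, bands):
    for lh, hl, hh in reversed(bands):
        ll = merge_2d(ll, lh, hl, hh)
    return ll
```

Stopping `octave_merge` before all levels are merged yields a lower-resolution image, which is the scalability property the text describes.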

The space-time adaptive prediction encoding unit 11 performs prediction using the following function.

f(a0, a1, a2, a3, ..., b0, b1, b2, b3, ...)
where
f: prediction function
a0, a1, a2, a3, ...: pixel values of the frame to be encoded
b0, b1, b2, b3, ...: pixel values of the reference frame
As the reference frame, a temporally preceding frame, a temporally following frame, or both are used; on the decoder side, the reference frame must be decoded first. The positions of the pixels used for prediction are determined under the following conditions. Let the position of the pixel to be encoded be (x, y), where x is the horizontal position and y is the vertical position. In the frame to be encoded, the pixels used are those that are nearby and are decoded earlier on the decoding side. For example, when decoding proceeds from the leftmost pixel of the first row to the right, then the second row from left to right, and so on row by row, the position (xa, ya) of a pixel used for prediction must satisfy the following conditions.

y − y0 < ya < y and x − x0 < xa < x + x1
or
ya = y and x − x2 < xa < x
where
y0: an appropriate integer that defines the vertical extent of the neighborhood
x0, x1, x2: appropriate integers that define the horizontal extent of the neighborhood
The position (xb, yb) of a pixel of the reference frame used for prediction must satisfy the following condition.

y + vy − y3 < yb < y + vy + y4 and x + vx − x3 < xb < x + vx + x4
where
y3, y4: appropriate integers that define the vertical extent of the neighborhood
x3, x4: appropriate integers that define the horizontal extent of the neighborhood
V(vx, vy): motion vector to the reference frame
The motion vector V(vx, vy) is obtained by performing motion estimation with respect to the reference frame.
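
The two neighborhood conditions above can be written as simple predicates. This is a minimal sketch; the function names are hypothetical, and the reference-frame condition is read as a constraint on (xb, yb), which is an assumption where the printed inequality uses xa and ya.

```python
def in_current_neighbourhood(xa, ya, x, y, x0, x1, x2, y0):
    """True if (xa, ya) is a causal neighbour of (x, y) in the frame being coded."""
    # Rows above the current row, within a horizontal window around x.
    above = (y - y0 < ya < y) and (x - x0 < xa < x + x1)
    # Same row, strictly to the left of x (already decoded in raster order).
    left = (ya == y) and (x - x2 < xa < x)
    return above or left

def in_reference_neighbourhood(xb, yb, x, y, vx, vy, x3, x4, y3, y4):
    """True if (xb, yb) lies in the window around (x + vx, y + vy) in the reference frame."""
    return (y + vy - y3 < yb < y + vy + y4) and (x + vx - x3 < xb < x + vx + x4)
```

Because the reference frame is fully decoded before the current frame, its window may extend on all sides of the motion-shifted position, whereas the current-frame window is restricted to pixels already decoded in raster order.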

The entropy encoding unit 12 entropy-encodes the residual signal of each band and the motion vectors used by the space-time adaptive prediction encoding unit 11 to create an encoded bit stream.

FIG. 5 shows a basic configuration of the space-time adaptive prediction encoding unit that realizes the space-time adaptive predictive coding method. In FIG. 5, 31 is block-unit motion estimation means, 32 is shift means, 33 is a two-dimensional predictor, 34 is a motion estimation three-dimensional predictor, 35 is correlation coefficient R calculation means, 36 is first decision branching means, and 37 and 38 are addition/subtraction means.
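
Block-unit motion estimation (means 31) with the SAD criterion named in the claims might look like the following. This is a sketch under assumptions: an exhaustive search over a square window is used, and the function names and the `search` parameter are hypothetical rather than taken from the text.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between a block of cur and the shifted block of ref."""
    total = 0
    for j in range(size):
        for i in range(size):
            total += abs(cur[by + j][bx + i] - ref[by + j + dy][bx + i + dx])
    return total

def best_motion_vector(cur, ref, bx, by, size, search):
    """Full search in [-search, search]^2; returns (min_sad, vx, vy)."""
    h, w = len(ref), len(ref[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates whose block would fall outside the reference frame.
            if bx + dx < 0 or by + dy < 0 or bx + dx + size > w or by + dy + size > h:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best
```

The returned SAD value is what the claims compare with the per-band threshold (T, T2, Tx) to decide whether a block uses space-time adaptive prediction.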

Three types of two-dimensional predictor and seven types of three-dimensional predictor are provided, and the predictor is switched according to the inter-frame correlation.

A three-dimensional predictor is effective when the inter-frame correlation is strong, but when the inter-frame correlation is weak it may instead increase the residual signal. For that reason, the method switches to a two-dimensional predictor when the inter-frame correlation is weak. To switch between the two-dimensional and three-dimensional predictors, the correlation coefficient between the decoded signals in the neighborhood of the pixel to be encoded in the current frame and in the reference frame is calculated. When the correlation coefficient is large, that is, when the waveforms of the current-frame signal and the reference-frame signal are similar, prediction accuracy is expected to improve, so three-dimensional prediction is performed. Otherwise, two-dimensional prediction is performed.

The prediction method of each predictor and the specific method of switching between predictors are described below. The prediction methods are as follows.

Two-dimensional predictor 0: prediction signal y = min(a, b) (1)
Two-dimensional predictor 1: prediction signal y = max(a, b) (2)
Two-dimensional predictor 2: prediction signal y = a + b − c (3)
Three-dimensional predictor 3: prediction signal y = min(a, x') (4)
Three-dimensional predictor 4: prediction signal y = max(a, x') (5)
Three-dimensional predictor 5: prediction signal y = a + x' − a' (6)
Three-dimensional predictor 6: prediction signal y = min(b, x') (7)
Three-dimensional predictor 7: prediction signal y = max(b, x') (8)
Three-dimensional predictor 8: prediction signal y = b + x' − b' (9)
Three-dimensional predictor 9: prediction signal y = (a + b + x') / 3 (10)
As shown in FIG. 7(a), a, b, and c are the decoded values of the pixels above, to the left of, and to the upper right of the pixel x to be encoded. a', b', and x' are decoded values of pixels of the reference frame, whose positions are determined from the motion vector (k, l) between the frame to be encoded and the reference frame (k: horizontal direction, l: vertical direction). The motion vector is calculated in advance for each small block of L × L pixels, for example by block matching, and is transmitted as side information. If the position of the pixel x to be encoded is (i, j), the position of x' is (i + k, j + l). As shown in FIG. 7(b), a' and b' are the pixels adjacent to x' above and to the left. As the reference frame, a temporally preceding frame, a temporally following frame, or both are used; on the decoder side, the reference frame must be decoded first.
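
The ten predictors can be collected into one function as a sketch. Two points are assumptions rather than statements from the text: predictors 5 and 8 are taken in the gradient form a + x' − a' and b + x' − b', by analogy with predictor 2, and the division in predictor 9 is rounded down so that the prediction stays an integer. The function name and argument names (`a_ref` for a', and so on) are hypothetical.

```python
def predict(k, a, b, c, a_ref, b_ref, x_ref):
    """Prediction signal y of predictor k (0..9) from decoded neighbour values."""
    if k == 0:
        return min(a, b)
    if k == 1:
        return max(a, b)
    if k == 2:
        return a + b - c
    if k == 3:
        return min(a, x_ref)
    if k == 4:
        return max(a, x_ref)
    if k == 5:
        return a + x_ref - a_ref        # assumed form, see lead-in
    if k == 6:
        return min(b, x_ref)
    if k == 7:
        return max(b, x_ref)
    if k == 8:
        return b + x_ref - b_ref        # assumed form, see lead-in
    if k == 9:
        return (a + b + x_ref) // 3     # floor rounding is an assumption
    raise ValueError("predictor index must be in 0..9")
```

Predictors 0 to 2 read only the current frame, so they remain usable when no reference frame is available or when the inter-frame correlation is weak.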

If R ≤ T0 and c ≥ max(a, b), two-dimensional predictor 0 is selected.
If R ≤ T0 and c ≤ min(a, b), two-dimensional predictor 1 is selected.
If R ≤ T0 and min(a, b) < c < max(a, b), two-dimensional predictor 2 is selected.
If R > T0 and S > T1 and a' ≥ max(a, x'), three-dimensional predictor 3 is selected.
If R > T0 and S > T1 and a' ≤ min(a, x'), three-dimensional predictor 4 is selected.
If R > T0 and S > T1 and min(a, x') < a' < max(a, x'), three-dimensional predictor 5 is selected.
If R > T0 and S < −T1 and b' ≥ max(b, x'), three-dimensional predictor 6 is selected.
If R > T0 and S < −T1 and b' ≤ min(b, x'), three-dimensional predictor 7 is selected.
If R > T0 and S < −T1 and min(b, x') < b' < max(b, x'), three-dimensional predictor 8 is selected.
If R > T0 and −T1 < S < T1, three-dimensional predictor 9 is selected.
where
R = aa' + bb' + cc' + dd' − (a + b + c + d)(a' + b' + c' + d') (11)
S = |x' − b'| − |x' − a'| (12)
T0, T1: thresholds
a, b, c, d, a', b', c', d', x': decoded values of the pixels shown in FIG. 7
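
The selection rules can be sketched as a two-stage decision, first on R and then on S and the neighbor values. R and S are computed exactly as printed in equations (11) and (12). The rules do not say what happens when S equals T1 or −T1; this sketch assigns those cases to predictor 9, which is an assumption. The function names are hypothetical.

```python
def correlation_r(cur, ref):
    """R of equation (11): sum of products minus product of sums, as printed."""
    return sum(p * q for p, q in zip(cur, ref)) - sum(cur) * sum(ref)

def select_predictor(a, b, c, d, a_ref, b_ref, c_ref, d_ref, x_ref, t0, t1):
    """Return the predictor index 0..9 chosen by the rules in the text."""
    r = correlation_r((a, b, c, d), (a_ref, b_ref, c_ref, d_ref))
    if r <= t0:                                  # weak correlation: 2D prediction
        if c >= max(a, b):
            return 0
        if c <= min(a, b):
            return 1
        return 2
    s = abs(x_ref - b_ref) - abs(x_ref - a_ref)  # equation (12)
    if s > t1:
        if a_ref >= max(a, x_ref):
            return 3
        if a_ref <= min(a, x_ref):
            return 4
        return 5
    if s < -t1:
        if b_ref >= max(b, x_ref):
            return 6
        if b_ref <= min(b, x_ref):
            return 7
        return 8
    return 9                                     # no dominant edge direction
```

Every input to the decision is an already decoded value, so the decoder can repeat the same selection without any predictor index being transmitted.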

Here, R is the correlation coefficient. When R is larger than the threshold T0, a three-dimensional predictor is selected; otherwise, a two-dimensional predictor is selected. This selection is actually made in two stages. The first decision branching means 36 in FIG. 5 makes the first decision based on the value of R, switching between the two-dimensional predictor and the three-dimensional predictor. The remaining second decision is then made within the two-dimensional predictor 33 or the three-dimensional predictor 34 to select one of predictors 0 to 9.

The method of switching among the three two-dimensional predictors has already been proposed and is the same as the method adopted in JPEG-LS, the international standard for lossless still image compression. When an edge is judged to exist in the vertical or horizontal direction, prediction uses the one adjacent pixel in that edge direction; otherwise, prediction uses the three adjacent pixels.

As with the two-dimensional predictor, three-dimensional prediction is adaptive: the predictor is switched according to the state of the signal values in the neighborhood of the pixel to be encoded. When an edge is judged to exist in the vertical or horizontal direction, prediction uses the signals adjacent in that edge direction in both the current frame and the reference frame. The edge direction is determined by comparing the vertical absolute difference |x′ − a′| of the reference frame with the horizontal absolute difference |x′ − b′|; the direction whose difference exceeds the other by more than the threshold T1 is judged to be an edge. S in equation (12) is the difference between these two values and serves as the parameter for determining the edge direction. When a vertical edge is determined, a prediction signal is selected from the vertical signals of the current frame and the reference frame in the same way as in the two-dimensional predictor. The predictor is selected in the same way for a horizontal edge. When no edge is determined, the average of three neighboring pixels is used as the predicted value.

The addition/subtraction means 37 or 38 outputs the difference between the prediction signal output by the predictor and the original frame signal, that is, the prediction residual signal. The prediction residual signal is quantized by a quantizer (not shown). The quantized residual signal and the motion vector used by the three-dimensional predictor are input to the entropy encoding unit 12 in FIG. 1, which outputs an encoded bit stream. When the quantization step of the quantizer is set to 1, this encoding method is a lossless encoding method.
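The effect of the quantization step can be illustrated with a small Python sketch. The uniform quantizer with rounding to the nearest level is an assumption (the text does not specify the quantizer), and all function names are hypothetical.

```python
def quantize(residual, step):
    """Uniform quantizer, rounding to the nearest level (assumed design)."""
    sign = -1 if residual < 0 else 1
    return sign * ((abs(residual) + step // 2) // step)


def dequantize(index, step):
    """Map a quantizer index back to a residual value."""
    return index * step


def encode_sample(x, prediction, step=1):
    """Prediction residual of one sample, quantized with the given step."""
    return quantize(x - prediction, step)


def decode_sample(index, prediction, step=1):
    """Reconstruct one sample from its quantized residual."""
    return prediction + dequantize(index, step)
```

With `step=1` the quantizer is the identity, so `decode_sample(encode_sample(x, p), p)` returns `x` for every integer `x`. With a larger step the reconstruction may differ from the input, which is why only the step-1 case is lossless.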

[Second Embodiment]
FIG. 4 shows a basic configuration for realizing a decoding method that decodes the data encoded in the first embodiment. In FIG. 4, 20 is an entropy decoding unit, 21 is a space-time adaptive predictive decoding unit provided for each band, and 22 is a band synthesizing unit.

A decoding method is described below as an example of the operation of the configuration of FIG. 4. First, the entropy decoding unit 20 obtains the motion vector used for prediction and the prediction residual signal from the encoded bit stream. Next, each space-time adaptive predictive decoding unit 21 decodes the encoded signal using the already-decoded image signal and the residual signal. Finally, the band synthesizing unit 22 synthesizes the outputs of the space-time adaptive predictive decoding units 21 to decode the image.

FIG. 6 shows the basic configuration of the space-time adaptive predictive decoding unit. In FIG. 6, 41 is a shift means, 42 is a two-dimensional predictor, 43 is an addition means, 44 is a motion estimation three-dimensional predictor, 45 is an addition means, 46 is a correlation coefficient R calculation means, and 47 is a first decision branch means.

The space-time adaptive predictive decoding method, as an example of the operation of FIG. 6, is as follows. First, to decide whether to use the two-dimensional predictor 42 or the motion estimation three-dimensional predictor 44, the correlation coefficient R calculation means 46 calculates the correlation coefficient R from the decoded signals of the current frame and the signals of the reference frame shifted by the shift means 41 using the motion vector. When the correlation coefficient R is larger than the threshold T0, the first decision branch means 47 switches to the motion estimation three-dimensional predictor 44 and three-dimensional prediction is performed; otherwise, it switches to the two-dimensional predictor 42 and two-dimensional prediction is performed. The configurations of the two-dimensional predictor 42 and the motion estimation three-dimensional predictor 44, and the switching among the multiple predictors provided inside them, are the same as in the first embodiment. The motion estimation three-dimensional predictor 44 generates a prediction signal from the decoded signals of the current frame and the signals of the reference frame shifted using the motion vector, and the addition means 45 adds the residual signal to this prediction signal to restore the target signal of the current frame. The two-dimensional predictor 42 generates a prediction signal from the decoded signals of the current frame, and the addition means 43 adds the residual signal to this prediction signal to restore the target signal of the current frame.
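The reason the decoder can reproduce the encoder's prediction is that both sides predict only from pixels that are already decoded. The Python sketch below illustrates this for the two-dimensional path only (predictor 42 and addition means 43); the three-dimensional path would additionally read the motion-shifted reference frame. The raster scan order, the choice of left, upper, and upper-left neighbors, the zero padding outside the frame, and all function names are assumptions of this sketch.

```python
def med(a, b, c):
    """JPEG-LS style median edge detector."""
    if c >= max(a, b):
        return min(a, b)
    if c <= min(a, b):
        return max(a, b)
    return a + b - c


def neighbors(img, i, j):
    """Left, upper, and upper-left neighbors; out-of-frame pixels count as 0."""
    a = img[i][j - 1] if j > 0 else 0
    b = img[i - 1][j] if i > 0 else 0
    c = img[i - 1][j - 1] if i > 0 and j > 0 else 0
    return a, b, c


def encode_2d(img):
    """Residual of every pixel against its causal prediction."""
    return [[img[i][j] - med(*neighbors(img, i, j))
             for j in range(len(img[0]))] for i in range(len(img))]


def decode_2d(residual):
    """Rebuild the image in raster order: prediction plus residual."""
    height, width = len(residual), len(residual[0])
    out = [[0] * width for _ in range(height)]
    for i in range(height):
        for j in range(width):
            # Only already-restored pixels of `out` are read here.
            out[i][j] = med(*neighbors(out, i, j)) + residual[i][j]
    return out
```

For any integer image `img`, `decode_2d(encode_2d(img))` returns `img` unchanged, because no quantization is applied to the residual.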

[Third Embodiment]
In this embodiment, the space-time adaptive predictive encoding unit 11 of the first embodiment estimates the motion of the target object in each band by the block matching method. In the block matching method, the following SAD (Sum of Absolute Differences) value is calculated to obtain a motion vector.

SAD(k, l) = Σ_{i=1}^{L} Σ_{j=1}^{L} |x(i, j) − y(i + k, j + l)|    (13)
Here, SAD(k, l) is calculated for (k, l) in the range −w < k, l < w (w is the window size), and the vector (k, l) that gives the minimum SAD value is selected as the motion vector. To improve the prediction accuracy of the three-dimensional predictor, SAD is normally calculated for each band. However, the SAD calculation takes considerable time. Taking this into account, the following three simple encoding methods (1), (2), and (3) are proposed to reduce the calculation time while keeping the loss of coding efficiency as small as possible.
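A minimal Python sketch of this full-search block matching is given below. It assumes that x is the current frame, y is the reference frame, and L is the block size, and it skips candidate vectors that would read outside the reference frame; the text does not state how frame borders are handled, so that rule and the function names are assumptions.

```python
def sad(cur, ref, top, left, size, k, l):
    """Equation (13) for the block whose top-left corner is (top, left)."""
    return sum(
        abs(cur[top + i][left + j] - ref[top + i + k][left + j + l])
        for i in range(size)
        for j in range(size)
    )


def block_match(cur, ref, top, left, size, w):
    """Search -w < k, l < w and return ((k, l), sad) with the smallest SAD."""
    height, width = len(ref), len(ref[0])
    best = None
    for k in range(-w + 1, w):
        for l in range(-w + 1, w):
            # Assumption: candidates leaving the reference frame are skipped.
            if not (0 <= top + k and top + k + size <= height):
                continue
            if not (0 <= left + l and left + l + size <= width):
                continue
            value = sad(cur, ref, top, left, size, k, l)
            if best is None or value < best[1]:
                best = ((k, l), value)
    return best
```

The search evaluates up to (2w − 1)² candidates per block, each costing L² absolute differences, which is the cost the simple encoding methods aim to reduce.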

(1) The inter-frame and intra-frame correlations of the band-divided signals are stronger in lower bands and weaker in higher bands. Because signals outside the lowest frequency band have weak inter-frame and intra-frame correlation, space-time adaptive predictive coding brings little benefit there. Exploiting this property, space-time adaptive predictive coding is applied only to the lowest frequency band. The amount of SAD calculation drops sharply, and the two-dimensional and three-dimensional prediction processing times also decrease. The motion vector and prediction residual signal of the lowest frequency band are entropy coded, and the signals of the other bands are entropy coded directly.

FIG. 8 shows the basic configuration for realizing this method. In FIG. 8, 50 is a band dividing unit, 51 is a space-time adaptive predictive encoding unit for the lowest frequency band, and 52 is an entropy encoding unit. First, the band dividing unit 50 divides the input image signal into bands of a plurality of spatial resolutions. Next, the space-time adaptive predictive encoding unit 51 generates a residual signal only for the lowest frequency band among the divided bands, and the entropy encoding unit 52 entropy-codes it. The signals of the other bands are coded directly by the entropy encoding unit 52.

(2) Motion estimation is performed only in the lowest frequency band to obtain a motion vector. In the other bands, space-time adaptive predictive coding is performed using motion vectors derived from the motion vector of the lowest frequency band. Because the motion of the target object does not vary from band to band, the motion of the object in a high band should equal its motion in a low band. That is, the motion vectors of small blocks lying in the same direction from the low band to the high band are assumed to be the same. For example, the motions of the small blocks indicated by the arrows in FIG. 3 are almost equal. However, each time the band moves up one level, the number of pixels doubles in both the vertical and horizontal directions, so the motion vector in the band one level higher is
(2k, 2l)
and the motion vector in the band N levels higher is
(N × k, N × l).

The basic configuration for realizing this method is the same as in FIG. 1, except that the motion vector used in the space-time adaptive predictive encoding unit 11 of each band is derived from the motion vector of the lowest frequency band. The motion vector of the lowest frequency band and the prediction residual signals of all bands are entropy coded by the entropy encoding unit 12.
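Deriving a higher-band vector from the lowest-band vector amounts to a component-wise multiplication, as the following Python sketch shows. The text gives (2k, 2l) for the band one level higher and (N × k, N × l) in general; the sketch simply takes the multiplier as a parameter and does not interpret how it relates to the number of levels. The function name is hypothetical.

```python
def scale_motion_vector(mv, factor):
    """Scale a lowest-band motion vector (k, l) by an integer factor."""
    k, l = mv
    return (factor * k, factor * l)
```

For example, `scale_motion_vector((3, -1), 2)` gives `(6, -2)`. Because the higher-band vectors are computed rather than searched, only the lowest-band vector needs to be transmitted.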

(3) SAD is calculated in order from the low band to the high band. In the example of FIG. 3, the order is LL3 → HL3 → LH3 → HH3 → HL2 → LH2 → HH2 → HL1 → LH1 → HH1. First, SAD is calculated for each small block in the lowest frequency band. For the signals in a small block whose SAD satisfies
SAD(k, l) > T (threshold)    (14)
the inter-frame correlation is judged to be weak, and two-dimensional prediction is performed instead of space-time adaptive prediction. The signals of the corresponding small blocks in the higher bands are entropy coded directly, without space-time adaptive predictive coding.

For the signals in a block that does not satisfy expression (14), space-time adaptive predictive coding is performed as usual. In the subsequent higher-band blocks, when SAD satisfies
SAD(k, l) > Tx (threshold for the high band; the threshold differs from band to band)    (15)
the inter-frame correlation is judged to be weak, and direct entropy coding is performed without space-time adaptive predictive coding. When the SAD of the corresponding lower-band block already satisfies expression (14), direct entropy coding has already been decided, so neither the SAD calculation nor the test of expression (15) is performed.

A small block that satisfies the condition of expression (14) or (15) transmits a unique code LIMIT as side information in place of a motion vector, and no side information is transmitted for the corresponding higher-band small blocks. The motion vectors, the unique code LIMIT, and the prediction residual signals are entropy coded.
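The per-block decisions of method (3) can be sketched in Python as follows. The sketch follows one block position through the bands in coding order. The mode labels, the `LIMIT` placeholder string, and the function name `plan_block` are hypothetical, and the SAD values are supplied up front for illustration only; an actual encoder would not compute SAD for blocks that are already marked for direct coding.

```python
LIMIT = "LIMIT"  # stand-in for the unique code; its bit pattern is not given


def plan_block(sads, t_low, t_high):
    """Decide the coding mode of one block position in every band.

    sads:   SAD values ordered from the lowest band to the highest band
    t_low:  threshold T of expression (14) for the lowest band
    t_high: thresholds Tx of expression (15), one per higher band
    Returns a list of (mode, side_info) pairs in the same order.
    """
    plan = []
    skipped = False
    for level, value in enumerate(sads):
        if skipped:
            # A lower band already exceeded its threshold:
            # no SAD test, no side information, direct entropy coding.
            plan.append(("direct", None))
        elif level == 0 and value > t_low:
            plan.append(("2d", LIMIT))        # expression (14)
            skipped = True
        elif level > 0 and value > t_high[level - 1]:
            plan.append(("direct", LIMIT))    # expression (15)
            skipped = True
        else:
            plan.append(("st-adaptive", "mv"))
    return plan
```

Once a block exceeds its threshold in some band, every higher band inherits the decision without further side information, which is where the reduction in SAD calculations comes from.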

The basic configuration for realizing this method is the same as in FIG. 1, except that each space-time adaptive predictive encoder 11 has, in block-unit motion estimation, a function of calculating SAD from the low band toward the high band, a function of deciding whether to perform space-time adaptive predictive coding, a function of transmitting the unique code LIMIT, and so on. The entropy encoding unit 12 entropy-codes, from each space-time adaptive predictive encoder 11, the motion vector or the unique code LIMIT, and the signal that was not space-time adaptively predicted or the prediction residual signal.

[Fourth Embodiment]
This embodiment describes simple decoding methods for signals encoded by the simple encoding methods of the third embodiment.

(1) FIG. 9 shows the basic configuration for decoding a signal encoded by simple encoding (1). In FIG. 9, 60 is an entropy decoding unit, 61 is a space-time adaptive predictive decoding unit for the lowest frequency band, and 62 is a band synthesizing unit.

In this method, the lowest frequency band is decoded by the space-time adaptive predictive decoding unit 61 using the motion vector and prediction residual signal entropy-decoded by the entropy decoding unit 60. The other bands are decoded directly by entropy decoding in the entropy decoding unit 60. The band synthesizing unit 62 synthesizes the outputs of the bands to decode the image.

(2) The basic configuration for realizing the decoding method for a signal encoded by simple encoding (2) is the same as in FIG. 4, except that the motion vector used in the space-time adaptive predictive decoding unit 21 of each band is derived from the motion vector of the lowest frequency band. That is, from the lowest-frequency-band motion vector (k, l) entropy-decoded by the entropy decoding unit 20, the motion vector of the band N levels higher is calculated as
(N × k, N × l).
Based on the result, the space-time adaptive predictive decoding unit 21 performs decoding for each band, and the band synthesizing unit 22 synthesizes the outputs of the bands to decode the image.

(3) The basic configuration for realizing the decoding method for a signal encoded by simple encoding (3) is also the same as in FIG. 4. However, the space-time adaptive predictive decoding unit 21 of each band uses the lowest-frequency-band motion vector (k, l) and the prediction residual signal entropy-decoded by the entropy decoding unit 20 and processes the bands in order from the lowest to the highest frequency band (in the example of FIG. 3, LL3 → HL3 → LH3 → HH3 → HL2 → LH2 → HH2 → HL1 → LH1 → HH1). For each block it either performs space-time adaptive predictive decoding and outputs the result or, when the unique code LIMIT indicates it, directly outputs the entropy-decoded high-frequency-band signal. The band synthesizing unit 22 synthesizes the outputs of the bands to decode the image.

FIG. 1 is a diagram showing the basic configuration for realizing the moving image encoding method according to the first embodiment of the present invention.
FIG. 2 is a diagram explaining octave division.
FIG. 3 is a diagram explaining band division of an image.
FIG. 4 is a diagram showing the basic configuration for realizing the moving image decoding method according to the second embodiment of the present invention.
FIG. 5 is a diagram showing the basic configuration of the space-time adaptive predictive encoding unit in the present invention.
FIG. 6 is a diagram showing the basic configuration of the space-time adaptive predictive decoding unit in the present invention.
FIG. 7 (a) and (b) are diagrams explaining the signals used for prediction.
FIG. 8 is a diagram showing the basic configuration for realizing simple encoding (1) according to the third embodiment of the present invention.
FIG. 9 is a diagram showing the basic configuration for realizing simple decoding (1) according to the fourth embodiment of the present invention.

Explanation of symbols

10 ... Band dividing unit
11 ... Space-time adaptive predictive encoding unit
12 ... Entropy encoding unit
20 ... Entropy decoding unit
21 ... Space-time adaptive predictive decoding unit
22 ... Band synthesizing unit
31 ... Block-unit motion estimation means
32 ... Shift means
33 ... Two-dimensional predictor
34 ... Motion estimation three-dimensional predictor
35 ... Correlation coefficient R calculation means
36 ... First decision branch means
37 ... Addition/subtraction means
38 ... Addition/subtraction means
41 ... Shift means
42 ... Two-dimensional predictor
43 ... Addition means
44 ... Motion estimation three-dimensional predictor
45 ... Addition means
46 ... Correlation coefficient R calculation means
47 ... First decision branch means
50 ... Band dividing unit
51 ... Space-time adaptive predictive encoding unit
52 ... Entropy encoding unit
60 ... Entropy decoding unit
61 ... Space-time adaptive predictive decoding unit
62 ... Band synthesizing unit

Claims

In a lossless encoding method for moving images,
The original image is divided into bands,
SAD calculation is performed on the lowest frequency band among the divided bands to obtain a motion vector; the pixels in a block giving an SAD value larger than a predetermined threshold T are subjected to two-dimensional prediction, and the pixels in a block giving an SAD value equal to or smaller than the predetermined threshold T are subjected to space-time adaptive prediction to obtain a prediction residual signal; the prediction residual signal and the motion vector used in the space-time adaptive prediction are encoded,
In the second lowest frequency band coding, the pixels of the spatial block corresponding to the block for which the two-dimensional prediction is performed in the lowest frequency band are directly coded, and the other blocks are subjected to SAD calculation to obtain a motion vector. The pixels in the block that give a SAD value greater than a predetermined threshold T2 are directly encoded, and the pixels in the block that give a SAD value less than or equal to the threshold T2 are subjected to spatiotemporal adaptive prediction to obtain a prediction residual signal. Encode the residual signal and the motion vector used in the space-time adaptive prediction,
In the subsequent high frequency band encoding, the pixels of the spatial block corresponding to the block directly encoded in the frequency band encoded immediately before are directly encoded, and the other blocks perform the SAD calculation to obtain the motion vector. The pixels in the block that give a SAD value larger than the threshold value Tx determined for each frequency band are directly encoded, and the pixels in the block that give a SAD value less than or equal to the predetermined threshold value Tx are predicted by performing space-time adaptive prediction. When obtaining a residual signal and encoding the prediction residual signal and the motion vector used in the space-time adaptive prediction,
In the spatio-temporal adaptive prediction, using the motion vector obtained by the SAD calculation, a correlation coefficient is calculated between the signal values near the pixel to be encoded in the current frame and those in the reference frame shifted by the shift means. If the correlation coefficient is large, switch to three-dimensional prediction, which performs prediction using the signal values near the pixel to be encoded in both the current frame and the reference frame. If the correlation coefficient is small, switch to two-dimensional prediction, which performs prediction using only the signal values near the pixel to be encoded in the current frame,
In the case of switching to the two-dimensional prediction, a prediction residual signal is calculated by switching a plurality of two-dimensional predictors for each pixel from the encoding target pixel vicinity signal value of the current frame,
When switching to the three-dimensional prediction, the prediction residual signal is calculated by switching a plurality of three-dimensional predictors for each pixel from the encoding target pixel neighboring signal values of the current frame and the reference frame. A moving image lossless encoding method.

The video lossless encoding method according to claim 1, wherein a code indicating that the spatio-temporal adaptive prediction processing has not been performed is added to an encoded signal of a block in which an SAD value larger than a threshold value is obtained.

In a lossless decoding method for outputting a moving image,
Receiving a direct coding block for each band, a prediction residual signal and a motion vector encoded by the video lossless encoding method according to claim 1;
In the order of the lowest frequency band to the highest frequency band, for each band, the direct coding block, the prediction residual signal and the motion vector are decoded, and the space-time adaptive prediction decoding is further performed from the prediction residual signal and the motion vector,
When decoding a moving image by combining the decoded bands,
In the spatio-temporal adaptive prediction decoding, a correlation coefficient between a decoding target pixel neighboring signal value of a reference frame shifted using the motion vector and a decoding target pixel neighboring signal value composed of a decoded signal of a signal in the current frame When the correlation coefficient is large, switch to 3D prediction, and when the correlation coefficient is small, switch to 2D prediction.
In the case of switching to the two-dimensional prediction, a prediction signal is calculated by switching a plurality of two-dimensional predictors for each pixel from the decoding target pixel neighboring signal value including the decoded signal of the signal in the current frame,
In the case of switching to the three-dimensional prediction, from the decoding target pixel neighboring signal value composed of the decoded signal of the signal in the current frame and the decoding target pixel neighboring signal value of the reference frame shifted using the motion vector, Calculate prediction signals by switching multiple 3D predictors for each pixel,
A moving picture lossless decoding method comprising decoding the target signal by adding the prediction residual signal to the prediction signal.

In a lossless encoding device for moving images,
The procedure for dividing the original image into bands,
A procedure for performing SAD calculation on the lowest frequency band among the divided bands to obtain a motion vector, subjecting the pixels in a block giving an SAD value larger than a predetermined threshold T to two-dimensional prediction, subjecting the pixels in a block giving an SAD value equal to or smaller than the predetermined threshold T to spatiotemporal adaptive prediction to obtain a prediction residual signal, and encoding the prediction residual signal and the motion vector used in the spatiotemporal adaptive prediction,
In the second lowest frequency band coding, the pixels of the spatial block corresponding to the block for which the two-dimensional prediction is performed in the lowest frequency band are directly coded, and the other blocks are subjected to SAD calculation to obtain a motion vector. The pixels in the block that give a SAD value greater than a predetermined threshold T2 are directly encoded, and the pixels in the block that give a SAD value less than or equal to the threshold T2 are subjected to spatiotemporal adaptive prediction to obtain a prediction residual signal. A procedure for encoding a residual signal and a motion vector used in the space-time adaptive prediction;
In the subsequent high frequency band encoding, the pixels of the spatial block corresponding to the block directly encoded in the frequency band encoded immediately before are directly encoded, and the other blocks perform the SAD calculation to obtain the motion vector. The pixels in the block that give a SAD value larger than the threshold value Tx determined for each frequency band are directly encoded, and the pixels in the block that give a SAD value less than or equal to the predetermined threshold value Tx are predicted by performing space-time adaptive prediction. Obtaining a residual signal, and executing a procedure for encoding the prediction residual signal and the motion vector used in the space-time adaptive prediction,
In the spatio-temporal adaptive prediction, using the motion vector obtained by the SAD calculation, a correlation coefficient is calculated between the signal values near the pixel to be encoded in the current frame and those in the reference frame shifted by the shift means. If the correlation coefficient is large, switch to three-dimensional prediction, which performs prediction using the signal values near the pixel to be encoded in both the current frame and the reference frame. If the correlation coefficient is small, switch to two-dimensional prediction, which performs prediction using only the signal values near the pixel to be encoded in the current frame,
In the case of switching to the two-dimensional prediction, a prediction residual signal is calculated by switching a plurality of two-dimensional predictors for each pixel from the encoding target pixel vicinity signal value of the current frame,
When switching to the three-dimensional prediction, a prediction residual signal is calculated by switching among a plurality of three-dimensional predictors for each pixel, based on the signal values near the pixel to be encoded in the current frame and the reference frame. A program for causing the device to execute the above procedures.

In a lossless decoding device that outputs moving images,
A procedure for receiving a direct coding block for each band, a prediction residual signal, and a motion vector encoded by the moving image lossless encoding method according to claim 1;
A procedure for decoding a direct coding block, a prediction residual signal, and a motion vector for each band in order from the lowest frequency band to the highest frequency band, and further performing space-time adaptive prediction decoding from the prediction residual signal and the motion vector,
A procedure for decoding the moving image by performing band synthesis on the decoded band,
In the spatio-temporal adaptive prediction decoding, a correlation coefficient between a decoding target pixel neighboring signal value of a reference frame shifted using the motion vector and a decoding target pixel neighboring signal value composed of a decoded signal of a signal in the current frame When the correlation coefficient is large, switch to 3D prediction, and when the correlation coefficient is small, switch to 2D prediction.
In the case of switching to the two-dimensional prediction, a prediction signal is calculated by switching a plurality of two-dimensional predictors for each pixel from the decoding target pixel neighboring signal value including the decoded signal of the signal in the current frame,
In the case of switching to the three-dimensional prediction, from the decoding target pixel neighboring signal value composed of the decoded signal of the signal in the current frame and the decoding target pixel neighboring signal value of the reference frame shifted using the motion vector, Calculate prediction signals by switching multiple 3D predictors for each pixel,
A program for executing a procedure for decoding a target signal by adding the prediction residual signal to the prediction signal.