KR20070013995A

KR20070013995A - Method for encoding a video signal using inter-layer prediction and decoding the encoded signal

Info

Publication number: KR20070013995A
Application number: KR1020060014303A
Authority: KR
Inventors: 박승욱; 박지호; 김동석; 엄성현; 전병문
Original assignee: 엘지전자 주식회사
Priority date: 2005-07-26
Filing date: 2006-02-14
Publication date: 2007-01-31

Abstract

A method for decoding a video signal is provided in which, when a multi-layer video signal is encoded and decoded, residual data is obtained from the reconstructed image of a lower layer so that residual prediction becomes possible. Information indicating whether residual prediction applies to a target block within a picture of a first layer of the encoded bitstreams is checked. If the information indicates residual prediction, the block of a second layer that corresponds to the target block and holds decoded data is coded into residual data using coding information of the target block or of blocks including the target block. The residual data of the target block is obtained by adding the coded residual data to the data of the target block.

Description

Method for encoding a video signal using inter-layer prediction and decoding the encoded signal

FIG. 1 shows, in terms of its operation, the configuration of a decoder that performs residual prediction according to an embodiment of the present invention;

FIG. 2 shows an example in which the blocks of the enhanced layer and their block information are shrunk according to the present invention;

FIG. 3 shows an example in which reconstructed image data is re-coded into residual data using the shrunk block information according to the present invention; and

FIG. 4 shows, in terms of its operation, the configuration of a decoder that performs residual prediction according to another embodiment of the present invention.

<Description of reference numerals for the main parts of the drawings>

101, 111: inverse quantization and inverse discrete cosine transform unit

103, 112: inverse temporal decomposition unit

200: inverse residual prediction unit

The present invention relates to a method of encoding a video signal using inter-layer prediction and of decoding the video signal so encoded.

A scalable video codec (SVC) encodes a video signal at the highest quality in such a way that a lower-quality version of the video can still be presented by decoding only a partial sequence of the resulting picture sequence (a sequence of frames selected intermittently from the whole sequence). MCTF (Motion Compensated Temporal Filter) is one of the encoding schemes proposed for use in such a scalable video codec.

Although a picture sequence encoded by the scalable MCTF scheme can present low-quality video when only a partial sequence is received and processed, image quality degrades considerably as the bitrate decreases. To address this, a separate auxiliary picture sequence for low transmission rates may be provided, for example a picture sequence with a smaller picture size and/or a lower frame rate.

The auxiliary sequence is called the base layer, and the main picture sequence is called the enhanced (or enhancement) layer. Because the base layer and the enhanced layer encode the same video source, the video signals of the two layers contain redundant information (redundancy). Therefore, to improve the coding efficiency of the enhanced layer, a macroblock in an enhanced-layer video frame is sometimes coded as an image difference (also called a 'residual') relative to the temporally coincident video frame of the base layer; a block coded in this way is said to be in intra BL mode (intra_BL). Furthermore, for a block that has residual data coded in inter mode by motion estimation, when the corresponding block of the base layer is also coded in inter mode, the block is sometimes coded as difference data relative to that base-layer residual data (a residual difference).

The latter coding method is called residual prediction. For an enhanced-layer block coded in this way, residual_prediction_flag is set to 1 and transmitted to the decoder.
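
The relationship between a residual and a residual difference can be illustrated with a minimal sketch. The function names and the list-of-rows block representation are assumptions made for illustration and are not taken from the patent.

```python
def to_residual_difference(enh_residual, base_residual):
    """Encoder side: subtract the base-layer residual, sample by sample."""
    return [[e - b for e, b in zip(e_row, b_row)]
            for e_row, b_row in zip(enh_residual, base_residual)]


def from_residual_difference(residual_diff, base_residual):
    """Decoder side: add the base-layer residual back to the difference."""
    return [[d + b for d, b in zip(d_row, b_row)]
            for d_row, b_row in zip(residual_diff, base_residual)]


# Hypothetical 2x2 residual blocks of the two layers.
enh = [[5, -3], [2, 0]]
base = [[4, -2], [1, 1]]
diff = to_residual_difference(enh, base)
restored = from_residual_difference(diff, base)
```

When the two residuals are similar, the difference holds smaller values than the enhanced-layer residual itself, which is where the coding gain comes from.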

For the decoder to restore a block coded by this residual prediction scheme to its pre-coding data, it must be able to obtain residual data from the received base-layer stream. However, in some cases only reconstructed image data can be obtained from the base layer, and the residual data that existed before reconstruction is not available. In single-loop decoding, for example, the residual data prior to reconstruction into a video signal cannot be provided from the base layer.

In such a case, the decoder cannot restore the residual difference data of a residual-predicted block to the original residual data.

Accordingly, an object of the present invention is to provide a method that, in encoding and decoding a video signal carried in multiple layers, obtains residual data from the reconstructed image of a lower layer so that residual prediction becomes possible.

To achieve the above object, a decoding method according to the present invention, in receiving the encoded bitstreams of a first layer and a second layer and decoding them into a video signal, is characterized by performing: a first step of checking first information indicating whether residual prediction applies to a target block in a picture of the first layer; a second step of, when the first information indicates residual prediction, coding the block of the second layer that corresponds to the target block and holds decoded data into residual data, using coding information of the target block or of blocks including the target block; and a third step of obtaining the residual data of the target block by adding the coded residual data of the corresponding block to the data of the target block.

An encoding method according to the present invention, in encoding input video pictures into data streams of a first layer and a second layer, is characterized by: performing motion estimation/prediction on first and second image blocks in a first picture and a second picture among the video pictures to code them into residual data; for the block of the second layer that corresponds to the first image block and has been decoded after encoding, coding that corresponding block into residual data using coding information of the first image block or of blocks including it; coding the first image block into residual difference data by subtracting the coded residual data of the corresponding block from the data of the first image block; coding the second image block into residual difference data by subtracting, from the data of the second image block, the encoded residual data of the second-layer block corresponding to the second image block; and setting, for the first image block, first information to a value indicating residual prediction and, for the second image block, second information different from the first information to a value indicating residual prediction.

In one embodiment of the present invention, the first information is distinct from the flag (residual_prediction_flag) that indicates whether a block was predicted directly from the residual data obtained during encoding of the second layer and coded as a residual difference.

In one embodiment of the present invention, if the flag (residual_prediction_flag) indicates that the block was predicted directly from the residual data obtained during encoding of the second layer and coded as a residual difference, the first to third steps are not performed; otherwise, the first step is performed, and the second and third steps are performed according to the result of that check.

In another embodiment of the present invention, the type of the first-layer picture is checked; if it is of a specific type, for example a key picture, the first to third steps are performed, and if it is not of the specific type, the first to third steps are not performed.

In one embodiment of the present invention, the coding information is applied to a data area within the corresponding block that is smaller than the data area to which the coding information was originally applied, and that data area is coded into residual data.

In one embodiment of the present invention, when the coding information is applied to the reduced data area within the corresponding block, the motion vectors in the coding information are also scaled down.

In one embodiment of the present invention, the corresponding block coded into residual data, or a part of it, is enlarged and added to the data of the target block to restore the residual data.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 shows, in terms of its operation, a decoder that performs residual prediction according to the present invention. The decoder of FIG. 1 receives and decodes the encoded streams (texture information, macroblock information, and so on) of the enhanced layer and the base layer. For convenience, the description covers two layers, but in a digital stream carrying three or more layers the invention applies equally between two adjacent layers or between two layers separated by one or more intervening layers.

The decoder of FIG. 1 performs inverse quantization (IQ) and inverse discrete cosine transform (IDCT) on the texture information in the received base-layer stream (111). For the data thus obtained, it uses the macroblock information in the received stream to restore each macroblock to its pre-coding image and performs inverse temporal decomposition on each picture (112) to reconstruct the video signal. The decoder of FIG. 1 assumes that the base-layer video stream is coded by MCTF, but this is only an example; a scheme other than MCTF (for example, AVC) may be used. In that case, the decoder of FIG. 1 has a corresponding decoding block and decodes the received signal into a video signal accordingly.

The decoder likewise performs inverse quantization and inverse discrete cosine transform on the texture information in the received enhanced-layer stream (101). From the received macroblock information it first identifies macroblocks in 'intra BL mode' and, for each such macroblock, restores the residual data of the block to pre-coding image data using the reconstructed image of the corresponding base-layer macroblock (102). Here, a 'corresponding block' is a block that is temporally coincident and spatially co-located within the frame (or slice). When the enhanced-layer and base-layer frames differ in resolution, the 'corresponding block' is the base-layer block that would contain the enhanced-layer macroblock if the base-layer frame were enlarged to the resolution of the enhanced layer. In that case, the reconstructed image of the corresponding base-layer block is enlarged (105), and the enlarged image is used as the reference for restoring the residual data of the intra BL mode macroblock to image data.
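
The notion of a 'corresponding block' under a resolution difference can be sketched as follows. This is an illustration only: it assumes 16x16 macroblocks addressed by column and row index and an integer resolution ratio that divides 16, and the function name is hypothetical.

```python
MB_SIZE = 16  # macroblock width and height in samples (assumed)


def corresponding_base_block(mb_x, mb_y, ratio=2):
    """Map an enhanced-layer macroblock index to its base-layer counterpart.

    Returns the base-layer macroblock index and the (x, y, width, height)
    of the co-located area inside that macroblock, in samples.
    """
    base_mb = (mb_x // ratio, mb_y // ratio)
    size = MB_SIZE // ratio
    area = ((mb_x % ratio) * size, (mb_y % ratio) * size, size, size)
    return base_mb, area


# Enhanced-layer macroblock in column 3, row 2, with a 2x resolution ratio.
base_mb, area = corresponding_base_block(3, 2)
```

With a ratio of 2, each base-layer macroblock covers four enhanced-layer macroblocks, and each of those four maps onto one 8x8 quadrant of it.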

The decoder also identifies macroblocks that were residual-predicted from the base layer and restores, for each such block, the data that existed before coding by residual prediction (200). To do so, it checks the value of specific information in the received macroblock information, for example 'pseudo_residual_prediction_flag'. If this value is, for example, 1, the decoder derives the residual data of the block corresponding to the current macroblock from the previously reconstructed base-layer image and, based on it, converts the residual difference data of the residual-predicted current block into residual data. This is described in more detail below. The following description assumes that the resolutions of the enhanced layer and the base layer differ by a factor of 2. When there is no difference (in which case the shrinking of macroblock information and the up-sampling of data described below are unnecessary) or when the difference is larger, the same principle applies with only the ratio changed.

The decoder identifies the base-layer block corresponding to a macroblock whose specific information, for example 'pseudo_residual_prediction_flag', has the value 1. It then obtains the block information (block mode, motion vectors, sub-block information, and so on) of the four enhanced-layer blocks that are spatially co-located with the identified block when that block is enlarged by a factor of 2 (the number differs if the resolution ratio is greater than 2; at least one of these blocks has pseudo_residual_prediction_flag equal to 1), and shrinks the block information of those four blocks (201).

FIG. 2 shows an example of shrinking block information according to the present invention. In this shrinking, the information of four enhanced-layer blocks is basically converted into information for one base-layer block. Each enhanced-layer macroblock corresponds to an 8x8 sub-block within the base-layer block, and the information of each macroblock is interpreted as the information of its corresponding 8x8 sub-block. The sub-blocks within each enhanced-layer block map to sub-blocks within the base-layer block at the same ratio. However, enhanced-layer sub-blocks that would map to a size at or below the minimum sub-block size in a macroblock, for example 4x4 (that is, 4x8, 8x4, and 4x4 sub-blocks), are all mapped to a 4x4 sub-block (A1 and A2 in FIG. 2). In shrinking the information, the motion vectors of the enhanced layer are scaled down according to the resolution ratio, that is, by a factor of 2.

To illustrate the correspondence using the example of FIG. 2: the upper-left block (21) is an intra-mode block and therefore becomes an 8x8 sub-block marked as not-predicted; the upper-right inter-mode macroblock (22), which has two 8x16 sub-blocks, becomes the two 4x8 sub-blocks at the upper right of the corresponding base-layer block; the lower-left inter-mode macroblock (23) becomes the 8x8 sub-block at the lower left of the corresponding block; and the lower-right macroblock (24), which has two 8x8, two 4x8 (A2), and two 8x4 (A1) sub-blocks, corresponds to the four 4x4 sub-blocks at the lower right, as shown in the figure.
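
The size mapping and motion-vector scaling used in the shrinking step can be sketched as below. The function names are hypothetical, and the rounding of odd motion-vector components (truncation toward zero) is an assumption, since the text only states that vectors are reduced by the resolution ratio.

```python
MIN_SUB_BLOCK = 4  # smallest sub-block dimension inside a macroblock


def shrink_partition(width, height, ratio=2):
    """Scale a partition size down by the ratio, clamped to the 4x4 minimum."""
    return (max(width // ratio, MIN_SUB_BLOCK),
            max(height // ratio, MIN_SUB_BLOCK))


def shrink_motion_vector(mv, ratio=2):
    """Scale a motion vector (mvx, mvy) down by the resolution ratio.

    Truncation toward zero is an assumption made for this sketch.
    """
    return tuple(int(component / ratio) for component in mv)


# Partition sizes that appear in the FIG. 2 example.
sizes = {(16, 16): shrink_partition(16, 16),
         (8, 16): shrink_partition(8, 16),
         (8, 8): shrink_partition(8, 8),
         (4, 8): shrink_partition(4, 8),
         (8, 4): shrink_partition(8, 4)}
scaled_mv = shrink_motion_vector((6, -4))
```

Several small enhanced-layer partitions can land on the same 4x4 base-layer sub-block, as with the 4x8 and 8x4 partitions of macroblock 24. The sketch does not decide whose block information is kept in that case.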

Then, for each sub-block of the decoded base layer, which holds the original video signal (excluding sub-blocks designated as not-predicted), a prediction operation is performed using the block information of the corresponding enhanced-layer macroblock or sub-block, the scaled-down motion vector information, and so on (202). That is, each sub-block of the corresponding base-layer block is coded into residual data using the block information and the scaled-down motion vector information of the enhanced-layer block mapped to that sub-block.

FIG. 3 shows an example of this process. Continuing the example of FIG. 2, it shows the base-layer sub-block (B), which corresponds to the upper-right macroblock (22) having two 8x16 sub-blocks, being coded into residual data, on the assumption that the sub-block (B) corresponds to the enhanced-layer macroblock whose pseudo_residual_prediction_flag is currently 1.

In the example of FIG. 3, the left 8x16 sub-block (22a) of the macroblock (22) is in forward mode, so the decoder codes the left 4x8 part of the corresponding base-layer sub-block (B) into residual data using the scaled-down motion vector of the left sub-block (22a) and its forward reference picture index (ref_idx_l0) (301). The right 8x16 sub-block (22b) of the macroblock (22) is in bi-directional mode, so the decoder codes the right 4x8 part of the corresponding base-layer sub-block (B) into residual data using the two scaled-down motion vectors of the right sub-block (22b) and its forward and backward reference picture indices (ref_idx_l0, ref_idx_l1), respectively (302).
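
The re-coding of a reconstructed base-layer area into residual data (operations 301 and 302) can be sketched as a motion-compensated subtraction. The sketch makes several assumptions that the text does not state: motion vectors are whole-sample offsets (no sub-sample interpolation), a bi-directional prediction is the rounded average of the two references, and all displaced areas lie inside the pictures. The function names are hypothetical.

```python
def fetch(picture, x, y, w, h):
    """Copy the w x h area whose top-left sample is (x, y)."""
    return [row[x:x + w] for row in picture[y:y + h]]


def recode_to_residual(cur, x, y, w, h, refs):
    """Turn a reconstructed area into residual data.

    refs holds one (reference_picture, (mvx, mvy)) pair for forward mode
    or two pairs for bi-directional mode; the vectors are already scaled.
    """
    preds = [fetch(ref, x + mvx, y + mvy, w, h) for ref, (mvx, mvy) in refs]
    if len(preds) == 1:
        pred = preds[0]
    else:
        # Rounded average of both predictions (an assumption of this sketch).
        pred = [[(a + b + 1) >> 1 for a, b in zip(r0, r1)]
                for r0, r1 in zip(*preds)]
    block = fetch(cur, x, y, w, h)
    return [[c - p for c, p in zip(c_row, p_row)]
            for c_row, p_row in zip(block, pred)]


# Hypothetical flat 8x8 pictures: current, forward and backward references.
cur = [[20] * 8 for _ in range(8)]
ref0 = [[10] * 8 for _ in range(8)]
ref1 = [[13] * 8 for _ in range(8)]
left = recode_to_residual(cur, 0, 0, 4, 8, [(ref0, (1, 0))])
right = recode_to_residual(cur, 4, 0, 4, 8, [(ref0, (-1, 0)), (ref1, (0, 0))])
```

Here `left` plays the role of the forward-mode 4x8 part and `right` the bi-directional 4x8 part of sub-block B.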

If the sub-block corresponding to the enhanced-layer macroblock whose pseudo_residual_prediction_flag is currently 1 is a different sub-block in the example of FIG. 2 (other than one designated as not-predicted), that sub-block is residual-coded on the same principle as shown in the example of FIG. 3.

Instead of coding into residual data only the sub-block that corresponds to the enhanced-layer macroblock whose pseudo_residual_prediction_flag is currently 1, the entire macroblock containing that sub-block (excluding sub-blocks marked as not-predicted) may be coded into residual data and used.

Once part or all of the base-layer macroblock corresponding to the enhanced-layer macroblock whose pseudo_residual_prediction_flag is currently 1 has been coded into residual data in this way, the decoder up-samples all or part of that macroblock to enlarge it (203). Then, from the enlarged macroblock, the residual data of the area corresponding to the current macroblock is added to the data of the current macroblock (the residual difference data) to restore the original residual data (204).
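
Operations 203 and 204 can be sketched as an enlargement followed by a sample-wise addition. The up-sampling filter is not specified in the text, so sample repetition (nearest neighbour) is used here purely as an assumption, and the function names are hypothetical.

```python
def upsample_nearest(block, ratio=2):
    """Enlarge a block by repeating each sample ratio times in both directions."""
    out = []
    for row in block:
        wide = [v for v in row for _ in range(ratio)]
        out.extend(list(wide) for _ in range(ratio))
    return out


def restore_residual(residual_diff, base_residual, ratio=2):
    """Add the enlarged base-layer residual to the residual difference data."""
    enlarged = upsample_nearest(base_residual, ratio)
    return [[d + b for d, b in zip(d_row, b_row)]
            for d_row, b_row in zip(residual_diff, enlarged)]


# Hypothetical 2x2 base-layer residual and 4x4 enhanced-layer difference.
base = [[1, 2], [3, 4]]
diff = [[1, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, -1]]
restored = restore_residual(diff, base)
```

A real decoder would apply this to the 8x8 area that maps onto the current 16x16 macroblock; the small blocks only keep the example readable.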

Through the process described above, intra-mode blocks are restored to pre-coding image data (102), and inter-layer residual-predicted blocks are restored from residual difference data to residual data (204) and then pass through inverse temporal decomposition (103) to be restored to the original video signal. Inverse temporal decomposition is not directly related to the present invention, so its description is omitted.

Meanwhile, to support the data restoration described above, which relies on shrinking information and on residual prediction using the shrunk information, the encoder also performs coding by residual prediction, in the same manner as described for the decoder, for macroblocks where inter-layer residual prediction yields a coding gain.

That is, the encoder applies inverse quantization and inverse discrete cosine transform to the encoded base-layer video stream and then inverse temporal decomposition (in the case of MCTF encoding; with another codec, the corresponding decoding process) to recover the pre-encoding video signal. For an enhanced-layer macroblock coded in inter mode, it then performs the same process as operation 200 in FIG. 1 to obtain the residual data of the corresponding base-layer macroblock, or of the corresponding area within that block. It compares the gain, for example the reduction in the amount of coded data, between coding the current macroblock as residual data and coding it as difference data relative to the obtained base-layer residual data. If coding as residual difference data is more advantageous, the macroblock is coded as difference data of the residual data, and pseudo_residual_prediction_flag is set to 1 in the coding information for that block and transmitted, as mentioned above.
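
The encoder-side comparison can be sketched as follows. The text does not say how the coding gain is measured, so the sum of absolute sample values is used here as a stand-in cost; that measure and the function names are assumptions of this sketch.

```python
def coding_cost(block):
    """Stand-in for the amount of coded data: sum of absolute sample values."""
    return sum(abs(v) for row in block for v in row)


def choose_residual_coding(enh_residual, base_residual):
    """Return (pseudo_residual_prediction_flag, data to be coded)."""
    diff = [[e - b for e, b in zip(e_row, b_row)]
            for e_row, b_row in zip(enh_residual, base_residual)]
    if coding_cost(diff) < coding_cost(enh_residual):
        return 1, diff          # residual difference is cheaper
    return 0, enh_residual      # keep the plain residual


enh = [[5, -3], [2, 0]]
flag_a, data_a = choose_residual_coding(enh, [[4, -2], [1, 1]])
flag_b, data_b = choose_residual_coding(enh, [[0, 0], [0, 9]])
```

In the first call the base-layer residual resembles the enhanced-layer residual, so the difference is selected; in the second it does not, and the flag stays 0.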

Meanwhile, the decoder may also receive a video stream from an encoder (or from a server providing a stream encoded in that manner) that, instead of performing inter-layer residual prediction in the manner of operation 200 in FIG. 1, uses the residual data generated by the base-layer encoding process directly for residual prediction of enhanced-layer macroblocks. For a macroblock that is residual-predicted by directly using the residual data obtained in the base-layer encoding process, a flag different from the specific information mentioned above (for example, pseudo_residual_prediction_flag) is set, for example residual_prediction_flag.

Another encoder may generate a single video stream by applying two methods differently according to picture type: coding by residual prediction through the same process as operation 200 described above, and residual prediction coding that directly uses the residual data obtained when encoding the base layer. For example, key pictures or L pictures (low-pass subband pictures) may be coded by residual prediction through the same process as operation 200, with pseudo_residual_prediction_flag transmitted, while non-key pictures or H pictures (high-pass subband pictures) may be coded by residual prediction that directly uses the base-layer residual data, with residual_prediction_flag transmitted.

One example of the 'key picture' mentioned above is a picture coded into residual data by a prediction operation that uses only quality-base pictures. In contrast, a picture coded into residual data using SNR enhancement picture data in addition to quality-base pictures is called a non-key picture. This definition of a key picture is only one example, and the present invention is not limited to it.

Because multiple types of residual prediction information may thus be used, a decoder that performs residual prediction according to the present invention has, as shown in FIG. 4, a configuration capable of restoring residual prediction coding produced by either of the two methods. The configuration of FIG. 4 is likewise shown in terms of its operation.

In addition to the data restoration operation (200) of the decoder of FIG. 1, which performs residual prediction using the reconstructed base-layer image, the decoder of FIG. 4 can also perform a restoration operation (401) that directly uses the base-layer residual data available before decoding of the base-layer video signal is complete.

Accordingly, for a macroblock for which residual_prediction_flag is set in the enhanced-layer data stream that has undergone inverse quantization and inverse discrete cosine transform, the decoder of FIG. 4 restores the original residual data based on the residual data of the block corresponding to that macroblock in the base-layer data stream that has likewise undergone inverse quantization and inverse discrete cosine transform (401). Of course, if the resolutions of the layers differ, the base-layer block is enlarged by upsampling according to the resolution ratio (402) before it is used.
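
Operations 401 and 402 can be illustrated with a small sketch. Several points here are assumptions rather than statements of the description: restoration is modelled as adding the base-layer residual to the enhanced-layer residual difference (the inverse of the subtraction recited for the encoder), the upsampling filter is plain nearest-neighbour repetition with an integer ratio, and the function names are hypothetical.

```python
def upsample_nearest(block, ratio):
    """Enlarge a 2-D block by an integer ratio (stand-in for operation 402).

    Nearest-neighbour repetition is an assumption; the description does not
    specify the upsampling filter.
    """
    return [[value for value in row for _ in range(ratio)]
            for row in block for _ in range(ratio)]


def restore_residual(enh_residual_diff, base_residual, ratio=1):
    """Restore the enhanced-layer residual (stand-in for operation 401).

    Adds the base-layer residual, upsampled when the layer resolutions
    differ, to the residual difference carried in the enhanced layer.
    """
    if ratio != 1:
        base_residual = upsample_nearest(base_residual, ratio)
    return [[diff + base for diff, base in zip(diff_row, base_row)]
            for diff_row, base_row in zip(enh_residual_diff, base_residual)]
```

With a 2x2 base-layer block and a resolution ratio of 2, the base block is first enlarged to 4x4 and then added sample by sample to the 4x4 residual difference.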

For a block for which residual_prediction_flag is set, the decoder of FIG. 4 does not check pseudo_residual_prediction_flag, which determines whether the subsequent data reconstruction operation 200 by residual prediction is to be performed. Likewise, if the picture currently being decoded is not of a specific type, for example if it is not a key picture, the decoder of FIG. 4 neither checks pseudo_residual_prediction_flag nor performs the reconstruction operation 200 described above.

Therefore, an encoder that performs residual prediction coding by directly using the residual data does not need to transmit pseudo_residual_prediction_flag.
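
The order in which the decoder of FIG. 4 consults the two flags can be summarised in a short sketch. The function name `choose_reconstruction` and the returned labels are hypothetical, and the key picture is used as the example of the "specific type"; only the flag names and the operation numbers 200 and 401 come from the description.

```python
def choose_reconstruction(is_key_picture, residual_prediction_flag,
                          pseudo_residual_prediction_flag=None):
    """Pick the reconstruction path for one block.

    pseudo_residual_prediction_flag may be None because the encoder need not
    transmit it when residual_prediction_flag is used.
    """
    if residual_prediction_flag:
        # Direct use of the base-layer residual; the pseudo flag is not checked.
        return "operation_401"
    if not is_key_picture:
        # Not the specific picture type: pseudo flag not checked, 200 not run.
        return "no_residual_prediction"
    if pseudo_residual_prediction_flag:
        # Residual prediction from the reconstructed base-layer image.
        return "operation_200"
    return "no_residual_prediction"
```

Because the pseudo flag is read only on the last branch, a stream coded entirely with direct residual prediction can omit it.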

A decoder that performs the inter-layer residual prediction operation described so far may be mounted in a mobile communication terminal or the like, or in an apparatus that plays back a recording medium.

Those of ordinary skill in the art will readily understand that the present invention is not limited to the exemplary preferred embodiments described above and may be practiced with various improvements, modifications, substitutions, or additions without departing from the gist of the invention. If an implementation made by such improvement, modification, substitution, or addition falls within the scope of the appended claims, its technical idea should also be regarded as belonging to the present invention.

As described above through a limited set of embodiments, the present invention makes coding by inter-layer residual prediction possible, and thus improves coding efficiency, even when, in decoding and outputting the video signal of a particular layer, the residual data that exists before reconstruction into an image cannot be obtained from the lower layer.

Claims

A method of receiving encoded bit streams of a first layer and a second layer and decoding them into a video signal, the method comprising:

a first step of checking first information indicating whether residual prediction is applied to a target block in a picture of the first layer;

a second step of, if the first information indicates residual prediction, coding a corresponding block of the second layer, which corresponds to the target block and has decoded data, into residual data using coding information of the target block or of blocks including the target block; and

a third step of adding the coded residual data of the corresponding block to the data of the target block to obtain residual data of the target block.

The method of claim 1,

wherein the first information is distinct from second information indicating whether a block has been coded into a residual difference by direct prediction from residual data obtained when the second layer was encoded.

The method of claim 2,

further comprising checking a value of the second information,

wherein the first to third steps are performed or not performed according to the checked value.

The method of claim 3,

wherein, if the value of the second information indicates that the block has been coded into a residual difference by direct prediction from the residual data obtained when the second layer was encoded, the first to third steps are not performed.

The method of claim 1,

further comprising checking a type of the picture of the first layer,

wherein, if the checked type does not belong to a predetermined type, the first to third steps are not performed.

The method of claim 5,

wherein the predetermined type is a key picture.

The method of claim 1,

wherein, in using the coding information, the second step applies it to a data area within the corresponding block that is smaller than the data area to which the coding information originally applies, and codes that smaller data area into residual data.

The method of claim 7,

wherein, in using the coding information, the second step scales down motion vector information in the coding information according to the reduction ratio of the area to which the coding information is applied.

The method of claim 7,

wherein, in the third step, the corresponding block coded into residual data, or a part of the corresponding block, is enlarged and then added to the data of the target block.

A method of encoding input video pictures into data streams of a first layer and a second layer, the method comprising:

a first step of performing a motion estimation/prediction operation on a first image block in a first picture and a second image block in a second picture among the video pictures, and coding the blocks into residual data;

a second step of coding a corresponding block of the second layer, which corresponds to the first image block and has been decoded after encoding, into residual data using coding information of the first image block or of blocks including the first image block; and

a third step of coding the first image block into residual difference data by subtracting the coded residual data of the corresponding block from the data of the first image block, and coding the second image block into residual difference data by subtracting, from the data of the second image block, residual data obtained when encoding the block of the second layer corresponding to the second image block,

wherein first information is set to a value indicating residual prediction for the first image block, and second information, different from the first information, is set to a value indicating residual prediction for the second image block.

The method of claim 10,

wherein the first picture is a picture to be coded into a picture having low-pass component data, and the second picture is a picture to be coded into a picture having high-pass component data.

The method of claim 10,

wherein the first picture is a key picture and the second picture is a non-key picture.