KR20060085150A

KR20060085150A - Method and apparatus for encoding/decoding video signal using prediction information of intra-mode macro blocks of base layer

Info

Publication number: KR20060085150A
Application number: KR1020050024987A
Authority: KR
Inventors: 전병문; 윤도현; 박지호; 박승욱
Original assignee: 엘지전자 주식회사
Priority date: 2005-01-21
Filing date: 2005-03-25
Publication date: 2006-07-26
Also published as: KR100883591B1

Abstract

The present invention relates to a method and apparatus for encoding a video signal using prediction information of an intra-mode block of an auxiliary layer, and for decoding the video data so encoded. The video signal is encoded with a scalable MCTF scheme to output an enhanced-layer bitstream and, at the same time, is encoded with a predetermined scheme to output a base-layer bitstream. During MCTF encoding, an image block contained in an arbitrary frame of the video signal is coded in intra mode using the pixels adjacent to that image block, based on the prediction information of the corresponding intra-mode-coded block contained in the base-layer bitstream.

MCTF, encoding, layer, intra mode, prediction mode, prediction direction, partitioning, inter-layer

Description

Method and apparatus for encoding/decoding video signal using prediction information of intra-mode macro blocks of base layer

FIG. 1 shows an example of a conventional process that uses the block at the same position in an enlarged, temporally coincident base-layer frame to form a prediction image for the enhanced layer.

FIG. 2 is a block diagram of a video signal encoding apparatus to which the video signal coding method according to the present invention is applied.

FIG. 3 shows the configuration within the MCTF encoder of FIG. 2 that performs the estimation/prediction and update operations.

FIGS. 4A to 4C each show an example of coding a macroblock of the enhanced layer in intra mode using the prediction information of an intra-mode block of the base layer, according to one embodiment of the present invention.

FIGS. 5A to 5C each show an example of coding a macroblock of the enhanced layer in intra mode using the prediction information of an intra-mode block of the base layer, according to another embodiment of the present invention.

FIG. 6 is a block diagram of an apparatus that decodes the data stream encoded by the apparatus of FIG. 2.

FIG. 7 shows the configuration within the MCTF decoder of FIG. 6 that performs the inverse prediction and inverse update operations.

Description of the reference numerals for the main parts of the drawings

100: MCTF encoder    102: estimator/predictor

103: updater    105, 240: base layer decoder

110: texture encoder    120: motion coding unit

130: muxer    150: base layer encoder

200: demuxer    210: texture decoder

220: motion decoding unit    230: MCTF decoder

231: inverse updater    232: inverse predictor

234: arranger    235: motion vector decoder

The present invention relates to scalable encoding and decoding of video signals and, more particularly, to a method and apparatus that, during scalable coding based on the MCTF (Motion Compensated Temporal Filter) scheme, encode a video signal using prediction information of intra-mode blocks of the base layer and decode the video data so encoded.

It is difficult to allocate a band as wide as that used for TV signals to the digital video signals transmitted and received wirelessly by mobile phones and notebook computers, which are widely used today, and by mobile TVs and handheld PCs, which are expected to become widely used. A standard used for video compression on such mobile devices must therefore offer higher compression efficiency.

Moreover, such mobile devices inevitably differ in their processing and presentation capabilities. Compressed video must therefore be prepared in correspondingly many variants: the same source must be available for each combination of parameters such as frames per second, resolution, and bits per pixel, which places a heavy burden on content providers.

For this reason, a content provider keeps compressed video data at a high bit rate for each source and, when a mobile device makes a request, decodes the stored video and re-encodes it into video data suited to the capability of the requesting device. Because this approach necessarily involves transcoding (decoding plus encoding), some delay occurs in delivering the requested video. Transcoding also requires complex hardware and algorithms, since the target encodings vary widely.

The scalable video codec (SVC) was proposed to overcome these disadvantages. In this scheme the video signal is encoded at the highest quality, yet a partial sequence of the resulting picture sequence (a sequence of frames selected intermittently from the whole sequence) can be decoded to present the video at lower quality. The MCTF (Motion Compensated Temporal Filter) scheme is an encoding scheme proposed for use in such a scalable video codec.

As noted above, a picture sequence encoded with the scalable MCTF scheme can be presented at lower quality by receiving and processing only a partial sequence, but image quality degrades sharply when the bit rate drops. To address this, a separate auxiliary picture sequence for low bit rates can be provided, for example one with a small picture size and a low frame rate. The auxiliary sequence is called the base layer, and the main picture sequence is called the enhanced (or enhancement) layer. Because the base layer and the enhanced layer encode the same video source, the video signals of the two layers contain redundant information (redundancy).

Therefore, to raise the coding rate of the enhanced layer encoded by the MCTF scheme, a video frame of the enhanced layer is turned into a prediction image with reference to the temporally coincident video frame of the base layer. FIG. 1 shows this process schematically.

In the process illustrated in FIG. 1, a predetermined number of base-layer macroblocks are assembled into one picture, and that picture is upsampled to the size of the enhanced-layer video frame (S10). The enhanced-layer frame E100 for which a prediction image is to be made is temporally coincident with the enlarged base-layer picture B100. If the macroblock BM10 of the enlarged picture B100 that lies at the same position as the macroblock EM10 in frame E100 is coded in intra mode, a prediction operation is performed on the enhanced-layer macroblock EM10 with reference to macroblock BM10 (S11).

That is, the intra-mode macroblock BM10 of the base layer is restored to its original block image using the pixel values of the neighboring adjacent lines, and the difference (or error) from the restored pixel values, that is, the residual, is encoded into the macroblock EM10 of the enhanced layer.

However, with this method of using the original image of a base-layer intra-mode block, whenever an arbitrary frame of the enhanced layer is encoded, the base-layer intra-mode block to be used for an image block in that frame must first be restored to its original image according to its prediction information. This requires considerable hardware complexity.

The present invention was devised to solve the above problem. Its object is to provide a method and apparatus that, in encoding video in a scalable manner, code a video signal into a prediction image using the prediction information of a base-layer intra-mode block without restoring the image of that block.

Another object of the present invention is to provide a method and apparatus for decoding a data stream containing blocks encoded using the prediction information of base-layer intra-mode blocks.

To achieve the above objects, the present invention is characterized in that a video signal is encoded with a scalable MCTF scheme to output a bitstream of a first layer and, at the same time, is encoded with a predetermined scheme to output a bitstream of a second layer, and in that, during MCTF encoding, an image block contained in an arbitrary frame of the video signal is coded in intra mode using the pixels adjacent to that image block, based on the prediction information of a first block that is coded in intra mode and contained in the bitstream of the second layer.

In one embodiment according to the present invention, the prediction information comprises information on a prediction mode and information on a direction of prediction (DoP).

In one embodiment according to the present invention, the frames of the second layer are encoded as small-picture frames whose picture size is smaller than that of the frames of the first layer.

In one embodiment according to the present invention, the image block is divided into a plurality of cells based on the prediction mode of the first block, and, for each group of a predetermined number of the divided cells, the prediction direction of the partial area of the first block corresponding to that group is applied identically to every cell in the group to code the pixel-value differences of each cell.

In another embodiment according to the present invention, the image block is divided into a plurality of cells whose size is the size specified by the prediction mode of the first block multiplied by the ratio of the picture size of the first layer to that of the second layer, and, for each divided cell, the prediction direction of the partial area of the first block corresponding to that cell is applied to code the pixel-value differences of the cell.

In yet another embodiment according to the present invention, the difference between the error data obtained for the image block based on the prediction information of the first block and the error data of the first block, or of a partial area of it, is coded into the image block.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings.

FIG. 2 is a block diagram of a video signal encoding apparatus to which the scalable video signal coding method according to the present invention is applied.

The video signal encoding apparatus of FIG. 2 comprises: an MCTF encoder 100, to which the present invention is applied, that encodes the input video signal macroblock by macroblock using the MCTF scheme and generates appropriate management information; a texture coding unit 110 that converts the information of each encoded macroblock into a compressed bit string; a motion coding unit 120 that codes the motion vectors of the image blocks obtained by the MCTF encoder 100 into a compressed bit string using a specified scheme; a base layer (BL) encoder 150 that encodes the input video signal with a specified scheme, for example MPEG-1, MPEG-2, MPEG-4, H.261, H.263, or H.264, to generate a sequence of small pictures, for example pictures at 25% of the original size; and a muxer 130 that encapsulates the output data of the texture coding unit 110, the small-picture sequence of the BL encoder 150, and the output vector data of the motion coding unit 120 in a predetermined format, multiplexes them into a predetermined transmission format, and outputs the result.

The MCTF encoder 100 performs motion estimation and prediction operations on the macroblocks in an arbitrary video frame, and also performs an update operation that adds the image difference from a macroblock in an adjacent frame to that macroblock.

The MCTF encoder 100 separates the input video frame sequence into, for example, odd and even frames and then performs the estimation/prediction and update operations several times, for example until the number of L frames (frames produced by the update operation) in one GOP (Group of Pictures) becomes one. FIG. 3 shows the configuration associated with the estimation/prediction and update operations of one of these stages (also called an 'MCTF level').

The configuration of FIG. 3 includes: a base layer (BL) decoder 105 that extracts encoding information, such as the frame rate and the macroblock modes, from the base-layer stream of the small-picture sequence encoded by the base layer encoder 150; an estimator/predictor 102 that, through motion estimation, finds in the preceding or following adjacent frame a reference block for each macroblock of a frame to be coded into residual data, codes the image difference from the actual macroblock (the difference of each pair of corresponding pixels), and either calculates the motion vector to the reference block directly or codes the macroblock using the per-macroblock information extracted by the BL decoder 105; and an updater 103 that, for each macroblock whose reference block was found by the motion estimation, normalizes the obtained image difference and then adds it to the reference block (the update operation). The operation performed by the updater 103 is called the 'U' operation, and a frame generated by the 'U' operation is called an 'L' frame.

The estimator/predictor 102 and the updater 103 of FIG. 3 may operate not on whole video frames but simultaneously and in parallel on a plurality of slices into which one frame is divided. A frame (or slice) containing the image differences produced by the estimator/predictor 102 is called an 'H' frame (slice). The difference data in an 'H' frame (slice) reflects the high-frequency components of the video signal. In the embodiments below, the term 'frame' naturally includes the meaning of 'slice' wherever substituting slice preserves technical equivalence.
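The repeated prediction and update steps can be sketched as a lifting scheme. The following is a minimal illustration only, under loud assumptions that are not in the text: motion compensation is ignored (each frame is predicted from the frame at the same pixel positions), the second frame of each pair becomes the H frame, the normalization in the update step is a factor of 1/2, and the GOP length is a power of two. The function names `mctf_level` and `mctf_decompose` are hypothetical.

```python
def mctf_level(frames):
    """Run one MCTF level on a list of frames (each a flat list of pixels).

    Returns (l_frames, h_frames). Motion compensation is omitted here
    (assumption), so the reference of each frame is simply its pair partner.
    """
    l_frames, h_frames = [], []
    for ref, cur in zip(frames[0::2], frames[1::2]):
        # Prediction step: the H frame holds the per-pixel difference
        # between the current frame and its reference.
        high = [c - r for c, r in zip(cur, ref)]
        # Update step: the normalized difference (factor 1/2 is an
        # assumption) is added to the reference, giving the L frame.
        low = [r + h / 2 for r, h in zip(ref, high)]
        h_frames.append(high)
        l_frames.append(low)
    return l_frames, h_frames


def mctf_decompose(gop):
    """Repeat MCTF levels until a single L frame remains in the GOP."""
    n = len(gop)
    if n == 0 or n & (n - 1):
        raise ValueError("this sketch assumes a power-of-two GOP length")
    h_per_level = []
    l_frames = gop
    while len(l_frames) > 1:
        l_frames, h_frames = mctf_level(l_frames)
        h_per_level.append(h_frames)
    return l_frames[0], h_per_level
```

With a GOP of four frames the sketch runs two levels, leaving one L frame and H frames for each level, which mirrors the stopping condition described for one GOP.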

The estimator/predictor 102 divides each input video frame (or each L frame obtained in the previous stage) into macroblocks of a predetermined size, searches the temporally adjacent preceding and following frames for the block most highly correlated with the image of each macroblock, builds a prediction image of the macroblock from it, and obtains the motion vector. If no block with a correlation at or above an appropriate threshold is found, and the encoding information provided by the BL decoder 105 either contains no information about a temporally coincident frame or shows that the corresponding block in the temporally coincident frame (the block at the same relative position in the frame) is not in intra mode, the current macroblock is coded in intra mode using adjacent pixel values. This operation is called the 'P' operation, and a frame generated by the 'P' operation is an 'H' frame. This process is well known and is not directly related to the present invention, so its detailed description is omitted. The following describes in detail, with reference to the exemplary processes of FIGS. 4A to 4C and FIGS. 5A to 5C, how, according to the present invention, a macroblock for which motion estimation failed is made into a prediction image, that is, residual data, using the prediction information of an intra-mode block of the temporally coincident base-layer frame.
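The choice among the three coding paths for a macroblock can be summarized in a few lines. This is a sketch under assumptions: the function name `choose_coding`, the string labels, and the representation of "no temporally coincident base-layer frame" as `None` are all hypothetical and serve only to restate the conditions above.

```python
def choose_coding(best_correlation, threshold, base_block_mode):
    """Pick how the current enhanced-layer macroblock is coded.

    best_correlation: correlation of the best block found by motion estimation.
    threshold:        minimum correlation accepted for inter prediction.
    base_block_mode:  mode of the corresponding base-layer block, or None when
                      no temporally coincident base-layer frame is available
                      (hypothetical encoding of that case).
    """
    if best_correlation >= threshold:
        # Motion estimation succeeded: ordinary prediction from a
        # temporally adjacent frame.
        return "inter"
    if base_block_mode == "intra":
        # Motion estimation failed, but the base layer offers an intra-mode
        # block whose prediction information can be reused.
        return "intra_base_dop"
    # Neither option is available: conventional intra coding from the
    # adjacent pixels of the enhanced layer alone.
    return "intra"
```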

First, an embodiment of the present invention illustrated in FIGS. 4A to 4C is described.

If the encoding information provided by the BL decoder 105 shows that the corresponding block in the temporally coincident frame is in intra mode, the estimator/predictor 102 checks the prediction mode and the prediction direction (DoP) of that corresponding block. Here, the corresponding block means the following. When the enhanced layer and the base layer have the same picture size, it is the block whose relative position in the frame is the same as that of the current enhanced-layer macroblock. When the enhanced layer has a larger picture size than the base layer, it is the block whose image area covers the current enhanced-layer macroblock once the base-layer frame is scaled up to the enhanced-layer frame size.
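The definition of the corresponding block reduces to integer arithmetic on macroblock coordinates. The sketch below assumes that both layers use macroblocks of the same pixel size and that the picture is scaled by the same integer factor `scale` in each dimension; the function name and the coordinate convention (column, row) are hypothetical.

```python
def corresponding_block(mb_x, mb_y, scale=1):
    """Locate the base-layer block corresponding to an enhanced-layer macroblock.

    mb_x, mb_y: macroblock column and row in the enhanced-layer frame.
    scale:      per-dimension enlargement of the enhanced layer relative to
                the base layer (1 means equal picture sizes).

    Returns ((base_x, base_y), (sub_x, sub_y)): the base-layer macroblock
    whose enlarged image covers the current macroblock, and which part of
    that macroblock (in units of 1/scale of its side) the current one covers.
    """
    if scale < 1:
        raise ValueError("scale must be a positive integer")
    base = (mb_x // scale, mb_y // scale)
    part = (mb_x % scale, mb_y % scale)
    return base, part
```

For equal picture sizes (`scale=1`) the function returns the block at the same relative position; for `scale=2` four neighbouring enhanced-layer macroblocks map to one base-layer macroblock, each covering one quarter of it.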

In one embodiment of the present invention, the picture size of the frames encoded by the MCTF encoder 100 is four times that of the frames encoded by the BL encoder 150, so the two picture sizes differ.

In one embodiment according to the present invention, the base layer encoder 150 uses, for intra mode, the intra-mode types (that is, prediction modes) intra 4x4, intra 8x8, and intra 16x16 shown in FIGS. 4A to 4C respectively, with nine DoPs (for example, the arrow directions in the drawings) for the intra 4x4 and intra 8x8 types and four DoPs for the intra 16x16 type.

After checking the prediction mode, the estimator/predictor 102 divides the current macroblock 401 into cells according to that base-layer prediction mode. That is, if the base-layer prediction mode is intra 4x4 as in FIG. 4A, the current macroblock 401 is divided into cells of size 4x4; if it is intra 8x8 as in FIG. 4B, into cells of size 8x8; and if it is intra 16x16 as in FIG. 4C, into cells of size 16x16. Each cell is then coded using the DoP information of the corresponding base-layer macroblock.

In this embodiment, however, the enhanced-layer macroblock 401 contains the pixels corresponding to the image of one quarter of the corresponding base-layer macroblock, while the current macroblock 401 is divided into cells in the same way as the prediction mode of the corresponding base-layer macroblock. The number of DoPs needed is therefore greater, namely four times greater, than the number used in the corresponding block. Specifically, the quarter partial block 402 of the base-layer macroblock that corresponds to the current macroblock 401 contains four pieces of DoP information for the intra 4x4 type of FIG. 4A and one for the intra 8x8 type of FIG. 4B, whereas the current macroblock is divided into 16 and 4 cells respectively. The number of cells thus exceeds the number of available DoPs by the inter-layer picture-size ratio, for example a factor of four.

The estimator/predictor 102 therefore groups the cells of the current macroblock 401 four at a time and intra-codes every cell in a group with the same DoP, namely that of the area of the quarter partial block 402 corresponding to the group. In the example of FIG. 4A, each of the four cells of the upper-left cell group 401a is intra-coded using the DoP information of the upper-left cell 402a of the quarter partial block 402 in the corresponding block. In the example of FIG. 4B, each of the four cells in the cell group 401a, which here has the same size as the macroblock 401, is intra-coded using the DoP information of the base-layer quarter partial block 402 corresponding to that cell group. The same applies to the other cell groups of the macroblock and to other macroblocks.
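The grouping rule can be illustrated by building the per-cell DoP map of one enhanced-layer macroblock. The sketch assumes a 16x16 macroblock and a per-dimension scale of 2 (a fourfold picture size); the function name, the grid layout (rows of columns), and the DoP labels used in the usage example are hypothetical. The intra 16x16 case is excluded because there a single DoP is shared by four whole macroblocks rather than by a cell group.

```python
MB_SIDE = 16  # macroblock side in pixels (assumed)


def assign_dops_grouped(cell_side, partial_dops, scale=2):
    """Map the DoPs of the base-layer partial block onto grouped cells.

    cell_side:    side of the cells, equal to the base-layer prediction mode
                  size (4 for intra 4x4, 8 for intra 8x8).
    partial_dops: 2-D grid of the DoPs found in the part of the base-layer
                  macroblock that corresponds to the current macroblock.
    scale:        per-dimension picture-size ratio between the layers.

    Returns a 2-D grid with one DoP per cell of the current macroblock; every
    scale x scale group of cells receives the same base-layer DoP.
    """
    cells_per_side = MB_SIDE // cell_side
    if cells_per_side % scale != 0:
        raise ValueError("cells cannot be grouped for this mode and scale")
    expected = cells_per_side // scale
    if len(partial_dops) != expected or any(len(r) != expected for r in partial_dops):
        raise ValueError("partial_dops must be %dx%d" % (expected, expected))
    return [
        [partial_dops[cy // scale][cx // scale] for cx in range(cells_per_side)]
        for cy in range(cells_per_side)
    ]
```

For intra 4x4 the partial block supplies a 2x2 grid of DoPs and the result is a 4x4 grid in which each 2x2 group repeats one DoP; for intra 8x8 the single DoP is repeated over all four cells.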

Intra-mode coding, according to the selected DoP, either appropriately selects pixel values from the adjacent left and/or upper pixel lines and codes the difference (residual) of each pixel from their average, or codes the difference (residual) from values obtained by appropriately interpolating the pixel values of the two adjacent lines according to the DoP.
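A reduced example of residual computation for one cell follows. It is a sketch, not the coding defined by any particular standard: only three simplified predictors are shown (copying the upper line, copying the left line, and the rounded average of both lines), the direction names are hypothetical labels, and the interpolating directions are omitted.

```python
def intra_residual(cell, top, left, dop):
    """Compute the intra-mode residual of a square cell.

    cell: n x n pixel values of the cell (list of rows).
    top:  the n pixels of the adjacent line above the cell.
    left: the n pixels of the adjacent line to the left of the cell.
    dop:  "vertical", "horizontal" or "dc" (hypothetical labels).
    """
    n = len(cell)
    if dop == "vertical":
        # Each column is predicted from the pixel directly above it.
        pred = [list(top) for _ in range(n)]
    elif dop == "horizontal":
        # Each row is predicted from the pixel directly to its left.
        pred = [[left[y]] * n for y in range(n)]
    elif dop == "dc":
        # Every pixel is predicted from the rounded average of both lines.
        mean = (sum(top) + sum(left) + n) // (2 * n)
        pred = [[mean] * n for _ in range(n)]
    else:
        raise ValueError("direction not covered by this sketch: %r" % (dop,))
    # The residual is the per-pixel difference from the prediction.
    return [[cell[y][x] - pred[y][x] for x in range(n)] for y in range(n)]
```

Only the adjacent pixels of the enhanced layer and the DoP taken from the base layer enter the computation, so the base-layer block itself never has to be restored.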

If the corresponding base-layer block is coded as intra 16x16, as in the example of FIG. 4C, that is, if an entire 16x16 macroblock is coded according to a single DoP 41, the estimator/predictor 102 intra-codes not only the current macroblock 401 but also the three macroblocks adjacent to it that share the same corresponding block 410, each using the DoP 41 of the corresponding block 410.

In another embodiment according to the present invention, the estimator/predictor 102 divides the current macroblock 401 into cells based on the prediction mode of the block corresponding to the current macroblock and on the ratio of the picture size of the frames it codes to that of the base-layer frames.

Assume that the picture size of the enhanced-layer frame is four times that of the base-layer frame. If the base-layer intra-mode block is in intra 4x4 mode as in FIG. 5A, the current macroblock 501 is divided into cells of size 8x8, which is four times the size of that mode. If it is in intra 8x8 mode as in FIG. 5B, the current macroblock is not divided, because four times the size of that mode is exactly the size of the current macroblock 501. The same holds for FIG. 5C, in which the intra-mode type of the block corresponding to the current macroblock is intra 16x16.

When the macroblock is divided according to this embodiment, as illustrated in FIG. 5A, each divided cell corresponds one-to-one with the DoP information of a 4x4 partial area of the corresponding block, so the estimator/predictor 102 intra-codes each cell of the divided macroblock 501 using the DoP information of the 4x4 area at the corresponding position.
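The cell size of this embodiment and the one-to-one DoP mapping can be sketched as follows. The sketch assumes a 16x16 macroblock and reads the fourfold picture size as a per-dimension scale of 2; the function names and the DoP labels in the usage example are hypothetical.

```python
def scaled_cell_side(mode_side, scale=2, mb_side=16):
    """Side of the cells: the base-layer mode size enlarged by the
    per-dimension scale, but never larger than the macroblock itself."""
    return min(mode_side * scale, mb_side)


def assign_dops_scaled(mode_side, partial_dops, scale=2, mb_side=16):
    """Return (cell_side, per-cell DoP grid) for the current macroblock.

    partial_dops is the 2-D grid of DoPs found in the part of the base-layer
    block that corresponds to the current macroblock; each cell takes the DoP
    at the same position, so no grouping is needed.
    """
    side = scaled_cell_side(mode_side, scale, mb_side)
    cells = mb_side // side
    if len(partial_dops) != cells or any(len(r) != cells for r in partial_dops):
        raise ValueError("partial_dops must be %dx%d" % (cells, cells))
    return side, [list(row) for row in partial_dops]
```

For intra 4x4 the macroblock is split into four 8x8 cells, each with its own DoP; for intra 8x8 and intra 16x16 the cell side equals the macroblock side and the macroblock stays undivided.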

However, when a mode of larger size than the mode in which a base-layer macroblock was intra-coded is used in this way, for example when the base layer used intra 8x8 and the enhanced layer uses the next larger mode, intra 16x16, the same DoP may not be available. For example, as illustrated in FIG. 5B, suppose the corresponding base-layer block is coded in intra 8x8 mode and the quarter partial block 502 of the corresponding block that corresponds to the current macroblock 501 has a diagonal DoP 52. The current macroblock 501 would then have to use a diagonal DoP in intra 16x16 mode, but no diagonal DoP is defined among the four DoPs of intra 16x16 mode, so it cannot be used.

Therefore, when the DoP information of the corresponding block cannot be used, the estimator/predictor 102 codes the current macroblock 501, as in FIG. 5B, with a direction-independent method: either DC coding, which is based on the average of the pixel values on the two adjacent lines and/or a fixed value such as 128, or plane coding.
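The fallback described above can be sketched as follows. This is an illustration under stated assumptions, not the specification's implementation: the DoP labels are hypothetical strings, the set of four intra 16x16 directions follows H.264 (vertical, horizontal, DC, plane), the rounding rule is an assumption, and the sketch always falls back to DC although the text equally allows plane coding.

```python
# The four prediction directions defined for intra 16x16 in H.264.
INTRA16_DOPS = {"vertical", "horizontal", "dc", "plane"}

def choose_dop_16x16(base_dop):
    """Reuse the base-layer DoP when intra 16x16 defines it, else fall back to DC."""
    return base_dop if base_dop in INTRA16_DOPS else "dc"

def dc_reference(top, left, fixed=128):
    """Rounded mean of the available neighbouring lines, or the fixed value.

    top / left are sequences of pixel values from the adjacent row and
    column, or None when that neighbour is unavailable.
    """
    samples = list(top or []) + list(left or [])
    if not samples:
        return fixed  # no neighbours available: use the fixed value
    return (sum(samples) + len(samples) // 2) // len(samples)

mode = choose_dop_16x16("diagonal_down_left")  # not defined for 16x16
ref = dc_reference([100] * 16, [120] * 16)
```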

When the corresponding block is coded in intra 16x16 mode as in FIG. 5C, the current macroblock 501 cannot be divided, and the current macroblock 501 and its three adjacent macroblocks share the same corresponding block 510. The single DoP 53 of the corresponding block 510 is therefore used in common to intra-code all four macroblocks. This is the same as in the case of FIG. 4C.

After the intra-mode coding described above, the estimator/predictor 102 records mode information indicating that the block was coded using the DoP of the corresponding base layer block in the header information of the macroblock, for example in the block mode. This mode information is distinct from the information that indicates an intra mode coded using adjacent pixels of the enhanced layer without the prediction information of a base layer intra-mode block.

In another embodiment of the present invention, the residual block obtained by intra-coding the current macroblock in the manner of FIGS. 4A to 4C or 5A to 5C is stored temporarily, and the per-pixel difference between that temporary block and the corresponding block of the base layer, or a partial region of the corresponding block, may then be coded into the macroblock. That is, the difference between the two sets of intra-mode coded error data is coded. For this purpose, the BL decoder 105 provides the encoded video frames in addition to extracting the encoding information from the base layer stream. If the picture sizes of the enhanced layer and the base layer differ, the decoded base layer frame is enlarged by upsampling according to that ratio and provided to the estimator/predictor 102.
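A minimal sketch of this residual-difference step is given below. Nearest-neighbour upsampling is an assumption chosen for brevity; the specification does not state which upsampling filter is used, and the function names are hypothetical.

```python
def upsample_nearest(block, factor=2):
    """Enlarge a 2-D block by repeating each sample factor x factor times."""
    rows, cols = len(block), len(block[0])
    return [[block[r // factor][c // factor] for c in range(cols * factor)]
            for r in range(rows * factor)]

def residual_difference(enh_residual, base_residual, factor=2):
    """Per-pixel difference between the enhanced-layer residual and the
    upsampled base-layer residual of the co-located region."""
    up = upsample_nearest(base_residual, factor)
    return [[e - b for e, b in zip(enh_row, base_row)]
            for enh_row, base_row in zip(enh_residual, up)]

enh = [[5, 5, 5, 5] for _ in range(4)]   # temporary enhanced-layer residual
base = [[1, 2], [3, 4]]                  # co-located base-layer residual
coded = residual_difference(enh, base)
```

The decoder can undo this step by adding the same upsampled base-layer data back before applying intra prediction.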

The data stream encoded by the method described so far is transmitted to a decoding apparatus by wire or wirelessly, in whole or in part (depending on the channel capacity), or is delivered via a recording medium. The decoding apparatus then restores the original video signal of the enhanced layer and/or the base layer according to the method described below.

FIG. 6 is a block diagram of an apparatus that decodes the data stream encoded by the apparatus of FIG. 2. The decoding apparatus of FIG. 6 comprises a demuxer 200 that separates the received data stream into a compressed motion vector stream and a compressed macroblock information stream; a texture decoding unit 210 that restores the compressed macroblock information stream to its original uncompressed state; a motion decoding unit 220 that restores the compressed motion vector stream to its original uncompressed state; an MCTF decoder 230 that inversely transforms the decompressed macroblock information stream and motion vector stream into the original video signal according to the MCTF scheme; and a base layer decoder 240 that decodes the base layer stream by a specified scheme, for example MPEG4 or H.264. While decoding the input base layer stream, the BL decoder 240 also provides the header information in the stream to the MCTF decoder 230 so that the required base layer encoding information, for example the prediction information of intra-mode blocks, can be used.

The MCTF decoder 230 includes the configuration of FIG. 7 for restoring the original frame sequence from the input stream.

The MCTF decoder 230 of FIG. 7 is configured to restore the H and L frame sequences of MCTF level N into the L frame sequence of level N-1. FIG. 7 includes an inverse updater 231 that subtracts the difference value of each pixel of an input H frame from an input L frame; an inverse predictor 232 that restores an L frame having the original image from that H frame and the L frame from which the image difference of the H frame has been subtracted; a motion vector decoder 235 that decodes the input motion vector stream and provides the motion vector information of each macroblock in the H frame to the inverse predictor of each stage (232 and so on); and an arranger 234 that interleaves the L frames completed by the inverse predictor 232 between the output L frames of the inverse updater 231 to produce an L frame sequence in normal order.

The L frames output by the arranger 234 form the L frame sequence 701 of level N-1. Together with the input H frame sequence 702 of level N-1, this sequence is again restored into an L frame sequence by the inverse updater and inverse predictor of the next stage. Repeating this process as many times as the number of MCTF levels used in encoding restores the original video frame sequence.
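The level-by-level restoration can be illustrated with a heavily simplified sketch. It assumes Haar-like integer lifting with an update weight of 1/2 and no motion compensation, and it treats each frame as a flat list of pixel values; none of these choices is stated in the specification, and all function names are hypothetical. `forward_level` is included only so that the round trip can be checked.

```python
def forward_level(frames):
    """One encoding level: pairs of frames become one L and one H frame."""
    l_frames, h_frames = [], []
    for even, odd in zip(frames[0::2], frames[1::2]):
        h = [o - e for o, e in zip(odd, even)]        # prediction residual
        l = [e + hv // 2 for e, hv in zip(even, h)]   # update step
        l_frames.append(l)
        h_frames.append(h)
    return l_frames, h_frames

def inverse_level(l_frames, h_frames):
    """One decoding level: level-N L and H frames become level N-1 L frames."""
    out = []
    for l, h in zip(l_frames, h_frames):
        even = [lv - hv // 2 for lv, hv in zip(l, h)]  # inverse updater
        odd = [hv + ev for hv, ev in zip(h, even)]     # inverse predictor
        out += [even, odd]                             # arranger: interleave
    return out

def decode_mctf(l_frames, h_levels):
    """h_levels lists the H-frame sequences from the highest level down."""
    for h_frames in h_levels:
        l_frames = inverse_level(l_frames, h_frames)
    return l_frames

frames = [[10, 20], [12, 18], [30, 40], [31, 45]]
l1, h1 = forward_level(frames)
l2, h2 = forward_level(l1)
restored = decode_mctf(l2, [h2, h1])
```

Because the same integer division is used in both directions, the two-level round trip reproduces the input frames exactly.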

The process of restoring an H frame at level N into an L frame is now described in more detail in relation to the present invention. First, for an arbitrary L frame, the inverse updater 231 subtracts from each block of that L frame the error values of the macroblocks, in all H frames, whose image difference was obtained using that block as the reference block.

Within one H frame, except for macroblocks whose header indicates that they were intra-coded using the prediction information of the corresponding base layer block, the inverse predictor 232 restores each macroblock to its original pixel values by a known method, based on the motion vector information provided by the motion vector decoder 235.

For a macroblock whose header information indicates that it was intra-coded using the prediction information of the corresponding base layer block, the inverse predictor 232 restores the original image by first checking the prediction mode and DoP information of the corresponding base layer block, provided by the BL decoder 240, and then restoring the original pixel values of the intra-coded current macroblock accordingly.

First, consider the embodiment of FIGS. 4A to 4C, in which cells are divided with the same size as the prediction mode of the intra-mode corresponding block of the base layer and the DoP is applied to them. The inverse predictor 232 likewise divides the current macroblock into cells in the same mode as the prediction mode of the corresponding block (intra 4x4 or intra 8x8). It then applies the DoP information of the corresponding base layer block repeatedly, for example four times if the picture size ratio is 4, to each of four mutually adjacent cells to obtain the original pixel values of those cells. If the prediction mode does not allow division, that is, intra 16x16 (the case of FIG. 4C), it uses the DoP of the corresponding block as is to restore the original pixel values of the current macroblock.
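The repeated application of one base-layer DoP to four adjacent cells can be sketched as a simple grid expansion. The sketch assumes a picture size ratio of 4 (a factor of 2 per axis); the function name and the string DoP labels are hypothetical.

```python
def replicate_dops(base_dops, axis_ratio=2):
    """Expand a grid of base-layer DoPs so that each DoP covers an
    axis_ratio x axis_ratio group of enhanced-layer cells."""
    rows, cols = len(base_dops), len(base_dops[0])
    return [[base_dops[r // axis_ratio][c // axis_ratio]
             for c in range(cols * axis_ratio)]
            for r in range(rows * axis_ratio)]

# Four intra 4x4 DoPs under the macroblock become a 4x4 grid of cell DoPs.
cell_dops = replicate_dops([["vertical", "horizontal"], ["dc", "diagonal"]])
```

Each label appears four times in `cell_dops`, once for every cell of the 2x2 group that shares the same base-layer sub-region.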

To restore the original pixel values using the DoP of the corresponding block, a reference value for each pixel is obtained, according to that DoP, from the original pixel values of previously restored adjacent macroblocks or cells, and the reference value is added to the current difference value of the pixel. In some cases, even when a macroblock adjacent to the current macroblock is in inter-frame mode, the restored (that is, decoded) pixel values of its adjacent line may be used to obtain the reference value instead of being replaced by 128. Because the three preceding blocks that border the current macroblock (the left, top, and top-left blocks) are restored before the current block in decoding order, their original pixel values can be used without difficulty even when they are in inter-frame mode.
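A minimal sketch of this reconstruction for three common directions follows. The DoP labels, the DC rounding rule, and the clipping to the 8-bit range 0 to 255 are assumptions made for illustration; diagonal and plane directions are omitted.

```python
def predict(dop, top, left, size):
    """Reference values for a size x size cell from its restored neighbours."""
    if dop == "vertical":        # copy the row above down every column
        return [list(top) for _ in range(size)]
    if dop == "horizontal":      # copy the left column across every row
        return [[left[r]] * size for r in range(size)]
    if dop == "dc":              # rounded mean of both neighbouring lines
        mean = (sum(top) + sum(left) + size) // (2 * size)
        return [[mean] * size for _ in range(size)]
    raise ValueError("unsupported DoP in this sketch: " + dop)

def reconstruct(residual, dop, top, left):
    """Add each difference value to its reference value and clip to 8 bits."""
    size = len(residual)
    ref = predict(dop, top, left, size)
    return [[max(0, min(255, ref[r][c] + residual[r][c]))
             for c in range(size)] for r in range(size)]

cell = reconstruct([[1, -1], [0, 2]], "vertical", top=[100, 50], left=[0, 0])
```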

Next, consider the embodiment of FIGS. 5A to 5C, in which cells are divided in a mode obtained by multiplying the prediction mode size of the corresponding base layer block by the picture size ratio, for example 4, and the DoP is applied to them. If the prediction mode of the corresponding block is intra 4x4, the inverse predictor 232 divides the macroblock into 8x8 cells and applies the DoP information of the corresponding 4x4 regions within the corresponding block one-to-one to obtain the original pixel values of each cell (the case of FIG. 5A). If the prediction mode does not allow division, that is, intra 8x8 or intra 16x16 (the cases of FIGS. 5B and 5C), it uses the DoP of the corresponding block as is to restore the original pixel values of the current macroblock. However, because a mode higher than the prediction mode applied to the base layer block is applied to the enhanced layer block, the DoP applied to the base layer block may not be applicable to the current macroblock, as in FIG. 5B. In that case, the inverse predictor 232 restores the original pixel values by performing the inverse of a predetermined method, for example DC or plane prediction.

In the embodiment in which the difference between error data is coded for an enhanced layer block that uses the prediction information of a base layer intra-mode block, the corresponding pixel value of the corresponding base layer block, or of a partial region of it, is first added to each pixel value of the current macroblock. The operation described above, which restores the original pixel values using the prediction information of the corresponding block, is then performed. For this purpose, the BL decoder 240 also provides the base layer frame before decoding to the MCTF decoder 230. If the picture sizes of the enhanced layer and the base layer differ, the base layer frame is enlarged according to that ratio before it is provided.

For one H frame, these operations are performed in parallel in predetermined units, for example slices, so that all macroblocks in the frame come to have their original image. The macroblocks are then combined to form one complete video frame.

By this method, a data stream encoded in the MCTF scheme using the prediction information of base layer intra-mode blocks is restored into a complete video frame sequence. The decoding apparatus described above may be mounted in a mobile communication terminal or the like, or in an apparatus that reproduces a recording medium.

The present invention is not limited to the exemplary preferred embodiments described above. Those of ordinary skill in the art will readily understand that it may be practiced with various improvements, modifications, substitutions, or additions without departing from the gist of the invention. Where an implementation based on such improvement, modification, substitution, or addition falls within the scope of the appended claims, its technical idea should also be regarded as belonging to the present invention.

As described above, in MCTF encoding, creating intra-mode blocks by using the encoding information of the base layer, which is provided for low-performance decoders in addition to the frames of the enhanced layer, can reduce the hardware complexity of the encoding apparatus.

Claims

An apparatus for encoding an input video signal, comprising:

a first encoder that encodes the video signal in a scalable first scheme and outputs a bit stream of a first layer; and

a second encoder that encodes the video signal in a designated second scheme and outputs a bit stream of a second layer,

wherein the first encoder comprises

means for coding a video block included in an arbitrary frame of the video signal in intra mode, using pixels adjacent to the video block, based on prediction information of a first block coded in intra mode that is included in encoding information extracted from the bit stream of the second layer.

The apparatus of claim 1, wherein the first block is a block located at the same position as the video block, or a block that includes the region at that position, in a frame of the second layer that is temporally coincident with the arbitrary frame.

The apparatus of claim 1, wherein the prediction information includes information on a prediction mode and a prediction direction.

The apparatus of claim 3, wherein the prediction mode is one selected from intra 4x4, intra 8x8, and intra 16x16.

The apparatus of claim 3, wherein the means divides the video block into a plurality of cells based on the prediction mode and, for each group of a predetermined number of the divided cells, applies the prediction direction of the partial region of the first block corresponding to that group identically to every cell in the group to code the pixel value differences of each cell.

The apparatus of claim 3, wherein the means divides the video block into a plurality of cells whose size is the size specified by the prediction mode multiplied by the picture size ratio of the first layer to the second layer and, for each divided cell, applies the prediction direction of the partial region of the first block corresponding to that cell to code the pixel value differences of the cell.

The apparatus of claim 3, wherein, when the video block cannot be divided based on the prediction mode, the means applies the prediction direction of the first block or of a partial region of the first block to the entire video block to code the pixel value differences.

The apparatus of claim 7, wherein, when the prediction direction of the partial region of the first block cannot be applied to the entire video block, the means intra-codes the video block by applying DC prediction or plane prediction.

The apparatus of claim 1, wherein the means further records, in header information of the video block, information indicating that the video block has been intra-coded using the prediction information of the corresponding block of the second layer.

The apparatus of claim 1, wherein the decoder further provides an encoded video frame of the second layer to the first encoder, and

the means codes, into the video block, the difference between the error data obtained based on the prediction information of the first block and the error data of the first block or a partial region thereof.

A method of encoding an input video signal, comprising:

encoding the video signal in a scalable first scheme to output a bit stream of a first layer; and

encoding the video signal in a designated second scheme to output a bit stream of a second layer,

wherein the encoding in the first scheme includes

a process of coding a video block included in an arbitrary frame of the video signal in intra mode, using pixels adjacent to the video block, based on prediction information of a first block coded in intra mode that is included in the bit stream of the second layer.

The method of claim 11, wherein the first block is a block located at the same position as the video block, or a block that includes the region at that position, in a frame of the second layer that is temporally coincident with the arbitrary frame.

The method of claim 11, wherein the prediction information includes information on a prediction mode and a prediction direction.

The method of claim 13, wherein the prediction mode is one selected from intra 4x4, intra 8x8, and intra 16x16.

The method of claim 13, wherein the process divides the video block into a plurality of cells based on the prediction mode and, for each group of a predetermined number of the divided cells, applies the prediction direction of the partial region of the first block corresponding to that group identically to every cell in the group to code the pixel value differences of each cell.

The method of claim 13, wherein the process divides the video block into a plurality of cells whose size is the size specified by the prediction mode multiplied by the picture size ratio of the first layer to the second layer and, for each divided cell, applies the prediction direction of the partial region of the first block corresponding to that cell to code the pixel value differences of the cell.

The method of claim 13, wherein, when the video block cannot be divided based on the prediction mode, the process applies the prediction direction of the first block or of a partial region of the first block to the entire video block to code the pixel value differences.

The method of claim 17, wherein, when the prediction direction of the partial region of the first block cannot be applied to the entire video block, the process intra-codes the video block by applying DC prediction or plane prediction.

The method of claim 11, wherein the encoding in the first scheme further includes recording, in header information of the video block, information indicating that the video block has been intra-coded using the prediction information of the corresponding block of the second layer.

The method of claim 11, wherein the process codes, into the video block, the difference between the error data obtained based on the prediction information of the first block and the error data of the first block or a partial region thereof.

An apparatus for receiving a bit stream of a first layer, which includes frames having pixels of difference values, and decoding it into a video signal, comprising:

a first decoder that decodes the bit stream of the first layer in a scalable first scheme and restores it into video frames having original images; and

a second decoder that receives a bit stream of a second layer, extracts encoding information from that bit stream, and provides the encoding information to the first decoder,

wherein the first decoder comprises

means for restoring the difference-value pixels of a target block included in an arbitrary frame in the bit stream of the first layer to original pixel values, using pixels adjacent to the target block, based on prediction information of a first block coded in intra mode that is included in the encoding information.

The apparatus of claim 21, wherein the prediction information includes information on a prediction mode and a prediction direction.

The apparatus of claim 23, wherein the means divides the target block into a plurality of cells based on the prediction mode and, for each group of a predetermined number of the divided cells, applies the prediction direction of the partial region of the first block corresponding to that group identically to every cell in the group to restore the difference-value pixels of each cell to original pixel values.

The apparatus of claim 23, wherein the means divides the target block into a plurality of cells whose size is the size specified by the prediction mode multiplied by the picture size ratio of the first layer to the second layer and, for each divided cell, applies the prediction direction of the partial region of the first block corresponding to that cell to restore the difference-value pixels of the cell to original pixel values.

The apparatus of claim 23, wherein, when the target block cannot be divided based on the prediction mode, the means applies the prediction direction of the first block or of a partial region of the first block to the entire target block to restore the difference-value pixels of the target block to original pixel values.

The apparatus of claim 27, wherein, when the prediction direction of the partial region of the first block cannot be applied to the entire target block, the means restores the difference-value pixels of the target block to original pixel values by performing the inverse of DC prediction or plane prediction.

The apparatus of claim 21, wherein the means restores the original pixel values using the prediction information of the first block when header information of the target block indicates that the target block has been intra-coded using the prediction information of the corresponding block of the second layer.

The apparatus of claim 21, wherein the means adds data of the first block or a partial region thereof to the data in the target block, and then restores the resulting target block to original pixel values using pixels adjacent to the target block, based on the prediction information of the first block.

A method of receiving a bit stream of a first layer, which includes frames having pixels of difference values, and decoding it into a video signal, the method comprising:

decoding the bit stream of the first layer in a scalable first scheme, using encoding information extracted from an input bit stream of a second layer, to restore and output video frames having original images,

wherein the restoring and outputting step includes

a process of restoring the difference-value pixels of a target block included in an arbitrary frame in the bit stream of the first layer to original pixel values, using pixels adjacent to the target block, based on prediction information of a first block coded in intra mode that is included in the encoding information.

The method of claim 31, wherein

The method of claim 33, wherein the prediction mode is one selected from intra 4x4, intra 8x8, and intra 16x16.

The method of claim 33, wherein the process divides the target block into a plurality of cells based on the prediction mode and, for each group of a predetermined number of the divided cells, applies the prediction direction of the partial region of the first block corresponding to that group identically to every cell in the group to restore the difference-value pixels of each cell to original pixel values.

The method of claim 33, wherein the process divides the target block into a plurality of cells whose size is the size specified by the prediction mode multiplied by the picture size ratio of the first layer to the second layer and, for each divided cell, applies the prediction direction of the partial region of the first block corresponding to that cell to restore the difference-value pixels of the cell to original pixel values.

The method of claim 33, wherein, when the target block cannot be divided based on the prediction mode, the process applies the prediction direction of the first block or of a partial region of the first block to the entire target block to restore the difference-value pixels of the target block to original pixel values.

The method of claim 37, wherein, when the prediction direction of the partial region of the first block cannot be applied to the entire target block, the process restores the difference-value pixels of the target block to original pixel values by performing the inverse of DC prediction or plane prediction.

The method of claim 31, wherein the process is performed when header information of the target block indicates that the target block has been intra-coded using the prediction information of the corresponding block of the second layer.

The method of claim 31, wherein the process adds data of the first block or a partial region thereof to the data in the target block, and then restores the resulting target block to original pixel values using pixels adjacent to the target block, based on the prediction information of the first block.