KR20090059707A

KR20090059707A - Apparatus of scalable video encoding using closed-loop filtering and the method thereof

Info

Publication number: KR20090059707A
Application number: KR1020070126706A
Authority: KR
Inventors: 김명환; 구본태; 엄낙웅
Original assignee: 한국전자통신연구원
Priority date: 2007-12-07
Filing date: 2007-12-07
Publication date: 2009-06-11

Abstract

A scalable video encoding apparatus using closed-loop filtering, and a method thereof, are provided to perform video coding efficiently by guaranteeing the quality of images at each of several different resolutions. The apparatus comprises a spatio-temporal transform unit (13), a quantization unit, and a closed-loop filtering unit (22). The spatio-temporal transform unit removes temporal and spatial redundancy by using a reference image to generate a first transformed image. The quantization unit produces coded image information by quantizing the first transformed image. The closed-loop filtering unit sequentially performs inverse quantization and inverse spatio-temporal transformation on the coded image information to generate a decoded image, and then provides the decoded image as the reference image.

Description

Apparatus and method for scalable video encoding using closed loop filtering {APPARATUS OF SCALABLE VIDEO ENCODING USING CLOSED-LOOP FILTERING AND THE METHOD THEREOF}

The present invention relates to video coding and, more particularly, to a scalable video coding method and an apparatus therefor.

The present invention is derived from research conducted as part of the IT Growth Engine Technology Development Project of the Ministry of Information and Communication and the Institute for Information Technology Advancement. [Project management number: 2006-S-017-02, Project name: Technology for Advanced Terrestrial DMB Transmission]

As information and communication technology, including the Internet, develops, video communication is increasing alongside text and voice communication. Accordingly, there is demand for multimedia services that can accommodate various types of information such as text, video, and music.

Multimedia data is voluminous: it requires large-capacity storage media and wide bandwidth for transmission. Compression coding is therefore used to transmit multimedia data including text, video, and audio.

For example, a 24-bit true-color image with a resolution of 640*480 requires 640*480*24 bits per frame (that is, about 7.37 Mbit of data). Transmitting such images at 30 frames per second requires a bandwidth of about 221 Mbit/sec, and storing a 90-minute movie requires about 1200 Gbit of storage.
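The figures quoted above can be checked with a few lines of arithmetic. The following is only a sketch of that calculation; the variable names are illustrative and do not come from the patent.

```python
# Raw (uncompressed) size of 640x480, 24-bit video, following the figures in the text.
WIDTH, HEIGHT, BITS_PER_PIXEL = 640, 480, 24
FPS = 30
MOVIE_SECONDS = 90 * 60

bits_per_frame = WIDTH * HEIGHT * BITS_PER_PIXEL   # bits in one frame
bits_per_second = bits_per_frame * FPS             # bandwidth at 30 frames/s
movie_bits = bits_per_second * MOVIE_SECONDS       # storage for a 90-minute movie

print(f"{bits_per_frame / 1e6:.2f} Mbit per frame")
print(f"{bits_per_second / 1e6:.1f} Mbit/s")
print(f"{movie_bits / 1e9:.0f} Gbit for 90 minutes")
```

The exact storage figure is slightly below 1200 Gbit, which the text rounds up.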

The basic principle of data compression is to eliminate redundancy in the data. Spatial redundancy is the repetition of the same color or object within an image. Temporal redundancy arises when adjacent frames of a video barely change, or when the same sound is repeated in audio.

In video coding, motion-compensated predictive coding removes this redundancy: temporal redundancy is removed by temporal filtering based on motion compensation, and spatial redundancy is removed by a spatial transform.

The multimedia data generated after redundancy removal is delivered over a transmission medium. The transmission media in current use vary widely, from high-speed networks capable of carrying tens of megabits of data per second to mobile networks with a transmission rate of 384 kbit per second.

In general, an encoder divides the input video into groups of pictures (GOPs), the basic unit of encoding, and encodes each GOP in turn.

FIG. 1 is a block diagram illustrating a typical open-loop scalable video encoder.

Referring to FIG. 1, the scalable video encoder (10) includes a spatial transform unit (1), a temporal transform unit (2), a buffer (5), an embedded quantization unit (6), and an entropy encoding unit (7).

The spatial transform unit (1) uses a wavelet transform to remove spatial redundancy from the input video sequence. The wavelet transform divides one frame into four quadrants. It places a reduced image that closely resembles the whole image and has one quarter of its area (hereinafter the "L subband") in one quadrant of the frame, and fills the remaining three quadrants with the information needed to restore the whole image from the L subband (hereinafter the "H subband").
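The quadrant layout described above can be illustrated with a one-level two-dimensional wavelet decomposition. The sketch below uses an unnormalised Haar filter purely because it is the shortest wavelet to write down; this is an assumption made for illustration, and the function names `haar_1d` and `haar_2d` are hypothetical.

```python
def haar_1d(values):
    """One level of an unnormalised Haar transform: pair averages, then pair differences."""
    pairs = [(values[i], values[i + 1]) for i in range(0, len(values), 2)]
    low = [(a + b) / 2 for a, b in pairs]
    high = [(a - b) / 2 for a, b in pairs]
    return low + high

def haar_2d(frame):
    """Apply haar_1d to every row, then to every column of the result."""
    rows = [haar_1d(row) for row in frame]
    cols = [haar_1d(list(col)) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]

frame = [
    [10, 10, 20, 20],
    [10, 10, 20, 20],
    [30, 30, 40, 40],
    [30, 30, 40, 40],
]
coeffs = haar_2d(frame)
half = len(frame) // 2
# Top-left quadrant: the reduced image with one quarter of the area (the L subband).
l_subband = [row[:half] for row in coeffs[:half]]
print(l_subband)
```

For this piecewise-constant test frame the three remaining quadrants are all zero, which shows how the transform concentrates the image content in the L subband when neighbouring pixels are similar.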

The spatial transform unit (1) wavelet-transforms the input image to generate a residual image. The generated residual image is represented by wavelet coefficients, which describe the scale and the shift along the time axis of the residual image from which the spatial redundancy of the input image has been removed.

The temporal transform unit (2) includes a motion estimation unit (3) and a temporal filtering unit (4). The motion estimation unit (3) performs motion estimation on the [n]-th frame of the GOP, using the [n-1]-th frame of the GOP stored in the buffer (5) as the reference frame, and generates a motion vector. The [n]-th frame is stored in the buffer (5) for motion estimation of the next frame. The temporal filtering unit (4) uses the generated motion vector to remove temporal redundancy between frames, thereby generating a temporally transformed image.

The embedded quantization unit (6) quantizes the transformed image (that is, the wavelet coefficients) generated by the temporal filtering unit (4). The entropy encoding unit (7) encodes the quantized wavelet coefficients and the motion vectors generated by the motion estimation unit (3) to generate a bitstream. That is, the bitstream contains the coded image information (the quantized wavelet coefficients), the motion vectors obtained from the motion estimation unit (3), and other necessary information.

A motion-compensated temporal filtering (MCTF) scheme is used as the embedded quantization scheme.

By quantizing the transform coefficients with an embedded quantization scheme, the scalable video encoder reduces the amount of information required, and the embedded quantization also provides signal-to-noise ratio (SNR) scalability.

In practice, the motion vectors used to construct the reference image in each layer are similar but not identical, so an encoder that uses the motion vectors of the highest resolution cannot use motion estimates optimized for the lower-resolution images. As a result, the quality of the lowest-resolution residual image is severely degraded. Moreover, if many bits are allocated during encoding to improve image quality, compression efficiency decreases.

The video encoder (10) of FIG. 1, which supports temporal scalability, has an open-loop structure in order to implement SNR scalability. In general, the image of the current frame is used as the reference frame for the next frame in the video encoding process.

That is, the open-loop encoder (10) of FIG. 1 uses the previous original frame [n-1] as the reference frame for the current frame, whereas a typical decoder uses the reconstructed image (that is, the previous frame [n-1] with quantization error reflected in it) as the reference frame for the current frame. Consequently, error accumulates with each successive frame within a GOP, and an open-loop encoder causes drift in the reconstructed image.
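The accumulation of error can be illustrated with a deliberately simplified model in which each "frame" is a single number and the temporal prediction is a plain frame difference. This is a sketch under those assumptions, not the encoder of FIG. 1; `quantize` and `encode_decode` are hypothetical helper names.

```python
def quantize(value, step):
    """Uniform quantiser: round to the nearest multiple of step."""
    return step * round(value / step)

def encode_decode(frames, step, closed_loop):
    """Return what the decoder reconstructs for a 1-D 'video' (one sample per frame)."""
    decoded = [frames[0]]  # assume the first frame is transmitted without loss
    for n in range(1, len(frames)):
        # Open loop predicts from the original previous frame;
        # closed loop predicts from the previous reconstructed frame.
        reference = decoded[n - 1] if closed_loop else frames[n - 1]
        residual = quantize(frames[n] - reference, step)
        # The decoder only ever has its own reconstruction of frame n-1.
        decoded.append(decoded[n - 1] + residual)
    return decoded

frames = [0, 3, 6, 9, 12, 15, 18, 21, 24]
open_loop = encode_decode(frames, step=8, closed_loop=False)
closed_loop = encode_decode(frames, step=8, closed_loop=True)

open_err = max(abs(a - b) for a, b in zip(frames, open_loop))
closed_err = max(abs(a - b) for a, b in zip(frames, closed_loop))
print("open-loop max error:", open_err)
print("closed-loop max error:", closed_err)
```

In the open-loop case every small frame difference is quantized to zero and the decoder never moves, so the error grows with the frame index. In the closed-loop case the encoder sees the same reference as the decoder, so the error stays within half a quantization step.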

An object of the present invention is to provide a closed-loop filtering method for scalable video coding that reduces the image quality degradation caused by the accumulation of quantization error between the encoder and the decoder.

Another object of the present invention is to provide a video coding method, and an apparatus therefor, that achieves good image quality at each resolution while reducing the redundancy of the images coded for each resolution.

A scalable video encoding apparatus according to an embodiment of the present invention includes: a spatio-temporal transform unit that receives a reference image corresponding to an original image and uses the reference image to remove spatial and temporal redundancy from the original image, thereby generating a transformed image; a quantization unit that quantizes the transformed image to generate coded image information; and a closed-loop filtering unit that generates a decoded image from the coded image information by sequentially applying inverse quantization, an inverse spatial transform, and an inverse temporal transform, and provides the decoded image as the reference image.

In this embodiment, the closed-loop filtering unit includes: an inverse quantization unit that inversely quantizes the coded image information to generate the transformed image; an inverse spatial transform unit that applies an inverse spatial transform to the transformed image to generate a residual image; and an inverse temporal filtering unit that applies an inverse temporal transform to the residual image to generate the decoded image.

In this embodiment, the spatio-temporal transform unit includes: a spatial transform unit that receives the reference image corresponding to the original image and uses the reference image to remove spatial redundancy from the original image, thereby generating a residual image; and a temporal transform unit that performs motion estimation on the residual image, generates a motion vector from the motion estimation, and uses the motion vector to remove temporal redundancy from the original image, thereby generating the transformed image.

In this embodiment, the apparatus further includes an entropy encoding unit that generates a bitstream including the motion vector and the coded image information.

In this embodiment, the apparatus further includes a buffer that stores the decoded image and provides it to the spatial transform unit as the reference image.

In this embodiment, the apparatus further includes a down-sampling unit that low-pass filters the original-resolution image of the original image to generate a low-resolution image.

In this embodiment, the down-sampling unit includes: a first down-sampling unit that low-pass filters the original-resolution image to generate a first low-resolution image; a second down-sampling unit that low-pass filters the first low-resolution image to generate a second low-resolution image; and, in the same manner, an n-th down-sampling unit that low-pass filters the (n-1)-th low-resolution image to generate an n-th low-resolution image.

In this embodiment, the low-pass filtering is down-sampling by a wavelet filter.

In this embodiment, the spatio-temporal transform is carried out separately for each resolution.

In this embodiment, the transformed image is obtained by removing spatial and temporal redundancy from the original-resolution image and the first and second low-resolution images to generate an original-resolution transformed image and first and second low-resolution transformed images, merging the first and second low-resolution transformed images into a merged low-resolution transformed image, and merging the merged low-resolution transformed image with the original-resolution transformed image.

A scalable video encoding method according to an embodiment of the present invention includes: inversely quantizing coded image information to generate a transformed image; applying an inverse spatial transform to the transformed image to generate a residual image; and applying an inverse temporal transform to the residual image to generate a decoded image, and providing the decoded image as a reference image.

In this embodiment, the method further includes: (a) receiving the reference image corresponding to an original image and using the reference image to remove spatial and temporal redundancy from the original image, thereby generating the transformed image; and (b) quantizing the transformed image to generate the coded image information.

In this embodiment, step (a) includes: using the reference image to remove spatial redundancy from the original image, thereby generating a residual image; and performing motion estimation on the residual image, generating a motion vector from the motion estimation, and using the motion vector to remove temporal redundancy from the original image, thereby generating the transformed image.

In this embodiment, step (b) further includes generating a bitstream including the motion vector and the coded image information.

In this embodiment, the method further includes storing the decoded image and providing the decoded image as the reference image.

In this embodiment, the method further includes low-pass filtering the original-resolution image of the original image to generate a low-resolution image.

In this embodiment, generating the low-resolution image includes: low-pass filtering the original-resolution image to generate a first low-resolution image; low-pass filtering the first low-resolution image to generate a second low-resolution image; and, in the same manner, low-pass filtering the (n-1)-th low-resolution image to generate an n-th low-resolution image.

In this embodiment, the transformed image is obtained by removing spatial and temporal redundancy from the original-resolution image and the first and second low-resolution images to generate an original-resolution transformed image and first and second low-resolution transformed images, merging the first and second low-resolution transformed images into a merged low-resolution transformed image, and merging the merged low-resolution transformed image with the original-resolution transformed image.

As described above, the present invention merges the images of each resolution so that the image quality at each resolution is preserved as far as possible, which enables efficient video coding.

In addition, the present invention uses closed-loop optimization to reduce the accumulated error caused by quantization, thereby reducing drift in the image.

A scalable video encoding apparatus according to an embodiment of the present invention includes: a spatio-temporal transform unit that receives a reference image corresponding to an original image and uses the reference image to remove spatial and temporal redundancy from the original image, thereby generating a transformed image; a quantization unit that quantizes the transformed image to generate coded image information; and a closed-loop filtering unit that generates a decoded image from the coded image information by sequentially applying inverse quantization, an inverse spatial transform, and an inverse temporal transform, and provides the decoded image as the reference image. The closed-loop filtering unit includes: an inverse quantization unit that inversely quantizes the coded image information to generate the transformed image; an inverse spatial transform unit that applies an inverse spatial transform to the transformed image to generate a residual image; and an inverse temporal filtering unit that applies an inverse temporal transform to the residual image to generate the decoded image.

Accordingly, the present invention merges the images of each resolution so that the image quality at each resolution is preserved as far as possible, which enables efficient video coding. In addition, the present invention uses closed-loop optimization to reduce the accumulated error caused by quantization, thereby reducing drift in the image.

Hereinafter, embodiments of the present invention are described with reference to the accompanying drawings so that those of ordinary skill in the art to which the invention pertains can readily practice its technical idea.

The embodiment of the present invention assumes video coding that provides three resolutions from a single bitstream. That is, the first layer is the original-resolution image with the highest resolution, the second layer is an image with an intermediate resolution, and the third layer is an image with the lowest resolution. The embodiment describes the coding and decoding of images with respect to a single frame being coded.

FIG. 2 is a block diagram of a scalable video encoder according to the present invention, FIG. 3 is a block diagram of the motion compensation unit shown in FIG. 2, and FIG. 4 is a block diagram of the embedded inverse quantization unit shown in FIG. 2.

Referring to FIG. 2, the scalable video encoder (100) according to the present invention includes first and second down-sampling units (11, 12), a spatial transform unit (13), a temporal transform unit (17), an embedded quantization unit (21), a buffer (26), an entropy encoding unit (27), and a closed-loop filtering unit (22).

First, the scalable video encoder (100) according to the present invention divides the input video into groups of pictures (GOPs), the basic unit of encoding, and encodes each GOP.

The first down-sampling unit (11) extracts the second layer (that is, the first low-resolution image) from the first layer (that is, the original-resolution image). The second down-sampling unit (12) extracts the third layer (that is, the second low-resolution image) from the second layer. The first and second down-sampling units (11, 12) according to the present invention use a wavelet 9-7 filter.
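The cascade of down-sampling units can be sketched as a simple resolution pyramid. The example below replaces the wavelet 9-7 filter with a 2x2 block average, which is an assumption made only to keep the sketch short; `downsample` and `build_layers` are hypothetical names.

```python
def downsample(image):
    """Halve width and height by averaging each 2x2 block (a crude low-pass filter)."""
    return [
        [
            (image[y][x] + image[y][x + 1] + image[y + 1][x] + image[y + 1][x + 1]) / 4
            for x in range(0, len(image[0]), 2)
        ]
        for y in range(0, len(image), 2)
    ]

def build_layers(original, levels=2):
    """Return [layer 1 (original resolution), layer 2, layer 3, ...]."""
    layers = [original]
    for _ in range(levels):
        # Each layer is derived from the previous, lower-numbered layer.
        layers.append(downsample(layers[-1]))
    return layers

original = [[float(x + y) for x in range(8)] for y in range(8)]
layer1, layer2, layer3 = build_layers(original)
print(len(layer1), len(layer2), len(layer3))
```

With two levels, an 8x8 input yields 8x8, 4x4, and 2x2 layers, mirroring the first to third layers of the embodiment.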

The spatial transform unit (13) includes first to third wavelet transform units (14-16), which remove spatial redundancy at each resolution and all have the same structure. The original-resolution image and the first and second low-resolution images are transformed into first to third residual images by the first to third wavelet transform units (14-16), respectively. That is, the first wavelet transform unit (14) transforms the original-resolution image into the first residual image, the second wavelet transform unit (15) transforms the first low-resolution image into the second residual image, and the third wavelet transform unit (16) transforms the second low-resolution image into the third residual image.

The temporal transform unit (17) includes an image reference unit (20), a motion estimation unit (19), and a motion compensation unit (18). The image reference unit (20) stores the first to third residual images and provides them to the motion estimation unit (19).

The motion estimation unit (19) performs motion estimation on frame [n] of the current GOP, using frame [n-1] of the current GOP as reconstructed by the closed-loop filtering unit (22) as the first to third reference images, and generates motion vectors. The motion estimation unit (19) performs motion estimation with reference to the input image and the images stored in the image reference unit (20). The motion vectors obtained through motion estimation are provided to the motion compensation unit (18). The motion compensation unit (18) compares the reference frame obtained through motion compensation with the input image to generate first to third transformed images.
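The text does not specify how the motion vectors are searched for. One common approach, shown here only as an assumed illustration, is full-search block matching with a sum-of-absolute-differences (SAD) cost; in the encoder described above, the `reference` frame would be the reconstructed frame [n-1]. The names `sad` and `estimate_motion` are hypothetical.

```python
def sad(current, reference, bx, by, dx, dy, size):
    """Sum of absolute differences between a block and its displaced candidate."""
    return sum(
        abs(current[by + y][bx + x] - reference[by + dy + y][bx + dx + x])
        for y in range(size)
        for x in range(size)
    )

def estimate_motion(current, reference, bx, by, size=2, search=2):
    """Full search: return the (dx, dy) within +/-search that minimises SAD."""
    height, width = len(reference), len(reference[0])
    best = None
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            # Skip candidates that fall outside the reference frame.
            if by + dy < 0 or by + dy + size > height:
                continue
            if bx + dx < 0 or bx + dx + size > width:
                continue
            cost = sad(current, reference, bx, by, dx, dy, size)
            if best is None or cost < best[0]:
                best = (cost, dx, dy)
    return best[1], best[2]

# A 2x2 patch sits at (x=1, y=1) in the reference and at (x=3, y=2) in the current frame.
reference = [[0] * 6 for _ in range(6)]
current = [[0] * 6 for _ in range(6)]
patch = [[9, 8], [7, 6]]
for y in range(2):
    for x in range(2):
        reference[1 + y][1 + x] = patch[y][x]
        current[2 + y][3 + x] = patch[y][x]

motion_vector = estimate_motion(current, reference, bx=3, by=2)
print(motion_vector)
```

The returned vector points from the block in the current frame to its best match in the reference frame, so the residual after motion compensation is zero for this block.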

The motion compensation unit according to the present invention is described in detail with reference to FIG. 3.

Referring to FIGS. 1 and 3, the motion compensation unit (18) includes first to third temporal filtering units (181-183). The first temporal filtering unit (181) removes temporal redundancy from the first residual image to generate the first transformed image. The second temporal filtering unit (182) removes temporal redundancy from the second residual image to generate the second transformed image. The third temporal filtering unit (183) removes temporal redundancy from the third residual image to generate the third transformed image. The motion compensation unit (18) provides the first to third transformed images to the entropy encoding unit (27).

Referring again to FIG. 2, the embedded quantization unit (21) quantizes the residual images transformed by the temporal transform unit (17) (that is, the wavelet coefficients) and rearranges the wavelet coefficients according to their magnitude.

Among wavelet coefficients, larger coefficients are more important than smaller ones. The embedded quantization unit (21) rearranges the wavelet coefficients by magnitude and sends the largest coefficients first. That is, to implement scalability, the embedded quantization unit (21) encodes only the pixels whose values exceed a threshold and, once all pixels have been processed, lowers the threshold and repeats the process.
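The threshold-lowering loop described above can be sketched as a series of significance passes. The text does not say how the threshold is chosen or lowered; the sketch assumes a power-of-two starting threshold that is halved after each pass, which is a common choice in embedded wavelet coders. `embedded_passes` is a hypothetical name.

```python
def embedded_passes(coefficients, num_passes):
    """Return [(threshold, newly significant indices), ...] for halved thresholds."""
    largest = max(abs(c) for c in coefficients)
    threshold = 1
    while threshold * 2 <= largest:
        threshold *= 2  # largest power of two not exceeding the peak magnitude
    significant = set()
    passes = []
    while threshold >= 1 and len(passes) < num_passes:
        # Encode only coefficients that exceed the current threshold
        # and were not already sent in an earlier pass.
        new = [
            i for i, c in enumerate(coefficients)
            if abs(c) > threshold and i not in significant
        ]
        significant.update(new)
        passes.append((threshold, new))
        threshold //= 2  # lower the threshold and repeat
    return passes

coefficients = [34, -20, 9, 3, -1]
passes = embedded_passes(coefficients, num_passes=4)
for threshold, indices in passes:
    print(threshold, indices)
```

Because the largest coefficients are emitted in the earliest passes, truncating the output after any pass still leaves the most important information, which is what gives the scheme its SNR scalability.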

The closed-loop filtering unit (22) includes an embedded inverse quantization unit (23) that inversely quantizes the coded image information to generate transformed images, an inverse spatial transform unit (24) that applies an inverse spatial transform to the transformed images to generate residual images, and an inverse temporal filtering unit (25) that applies an inverse temporal transform to the residual images to generate decoded images. That is, the closed-loop filtering unit (22) generates reconstructed frames by decoding the quantized wavelet coefficients and provides them as reference frames (that is, reference images) for subsequent motion estimation.

The merged coded image information generated by the embedded quantization unit (21) is also input to the closed-loop filtering unit (22).

The embedded inverse quantization unit (23) decodes the merged coded image information (that is, the quantized wavelet coefficients) received from the embedded quantization unit (21) according to the magnitude information. That is, it reverses the method used in the embedded quantization unit (21) and arranges the wavelet coefficients in their spatial order. The coded image information is thus separated and inversely quantized by the embedded inverse quantization unit (23) into first to third transformed images, one per resolution.

The structure of the embedded inverse quantization unit according to the present invention is described in detail with reference to FIG. 4.

Referring to FIGS. 1 and 4, the embedded inverse quantization unit (23) includes first to third embedded inverse quantization units (231-233). First and second distributors (234, 235) provide the merged coded image information to the first to third embedded inverse quantization units (231-233).

The first embedded inverse quantization unit (231) inversely quantizes the merged coded image information to generate the first transformed image. The second embedded inverse quantization unit (232) inversely quantizes the merged coded image information to generate the second transformed image. The third embedded inverse quantization unit (233) inversely quantizes the merged coded image information to generate the third transformed image. The embedded inverse quantization unit (23) provides the first to third transformed images to the inverse spatial transform unit (24).

Referring again to FIG. 2, the inverse spatial transform unit 24 performs the operations of the spatial transform unit 13 in reverse order. It inversely transforms the transform coefficients received from the embedded inverse quantization unit 23 and restores them to frames in the spatial domain. When the transform coefficients are wavelet coefficients, the inverse spatial transform unit 24 applies the inverse wavelet transform to them and generates temporal difference frames. Accordingly, the first to third transformed images become first to third residual images after passing through the inverse spatial transform unit 24.
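
As a minimal sketch of the forward/inverse relationship between the spatial transform unit 13 and the inverse spatial transform unit 24, the following Python example applies a one-level, one-dimensional Haar wavelet transform and then inverts it. The Haar filter, the function names, and the sample values are assumptions made for illustration only; the specification does not state which wavelet filter is used.

```python
def haar_forward(signal):
    """One-level 1-D Haar transform: returns (low-pass, high-pass) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    low = [(a + b) / 2 for a, b in pairs]   # local averages
    high = [(a - b) / 2 for a, b in pairs]  # local differences
    return low, high


def haar_inverse(low, high):
    """Inverse of haar_forward: rebuilds the samples from averages and differences."""
    restored = []
    for avg, diff in zip(low, high):
        restored.extend([avg + diff, avg - diff])
    return restored


# Hypothetical row of pixel values (illustrative only).
row = [10, 12, 8, 6, 7, 7, 20, 4]
low, high = haar_forward(row)
restored = haar_inverse(low, high)
```

Without quantization the inverse transform recovers the input exactly; in the encoder described here, the coefficients pass through quantization and inverse quantization first, so the restored frame is an approximation.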

The inverse temporal filtering unit 25 uses the first to third reference images generated by the spatial filtering unit 25 to convert the first to third residual images received from the inverse spatial transform unit 24 into first to third decoded images. That is, the first to third residual images become first to third decoded images, one for each resolution, after passing through the inverse temporal filtering unit 25. The images reconstructed by the inverse temporal filtering unit 25 are the final reconstructed images.

The buffer 26 stores the first to third decoded images generated by the inverse temporal filtering unit 25 and provides them as reference images when motion estimation is performed on an image.

The entropy encoder 27 converts the wavelet coefficients quantized by the embedded quantization unit 21, the motion vector information generated by the motion estimation unit 19, and other header information into a compressed bitstream suitable for transmission or storage.

That is, the temporal redundancy of the first to third residual images is removed by the temporal transform unit 17, and the results are merged into one integrated transformed image. The integrated transformed image is quantized by the embedded quantization unit 21 to become a coded image. The coded image obtained by coding each input image is turned into a bitstream by the entropy encoder 27, together with the motion vectors for each resolution obtained while removing temporal redundancy. The bitstream contains the information on the coded image and the motion vectors, as well as any other necessary header information.

Entropy encoding methods include predictive coding, variable-length coding (Huffman coding being a typical example), and arithmetic coding.
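
To illustrate the variable-length coding mentioned above, the following Python sketch builds a Huffman table for a short list of quantized coefficients and round-trips them through a bit string. It is a generic textbook construction, not the entropy encoder 27 of this specification; the function names and the coefficient values are hypothetical.

```python
import heapq
from collections import Counter


def huffman_table(symbols):
    """Build a prefix-free code table {symbol: bit string} from symbol frequencies."""
    freq = Counter(symbols)
    if len(freq) == 1:  # degenerate case: a single distinct symbol
        return {next(iter(freq)): "0"}
    # Heap entries: (count, tie-breaker, partial table). The unique tie-breaker
    # ensures the dictionaries are never compared.
    heap = [(count, idx, {sym: ""}) for idx, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        c0, _, t0 = heapq.heappop(heap)  # two least frequent subtrees
        c1, _, t1 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in t0.items()}
        merged.update({s: "1" + code for s, code in t1.items()})
        heapq.heappush(heap, (c0 + c1, next_id, merged))
        next_id += 1
    return heap[0][2]


def huffman_encode(symbols, table):
    return "".join(table[s] for s in symbols)


def huffman_decode(bits, table):
    inverse = {code: sym for sym, code in table.items()}
    decoded, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:  # prefix-free: first match is a full code word
            decoded.append(inverse[current])
            current = ""
    return decoded


# Hypothetical quantized coefficients; zeros dominate, as is common after quantization.
coefficients = [0, 0, 0, 0, 0, 1, 1, -1, 3]
table = huffman_table(coefficients)
bits = huffman_encode(coefficients, table)
decoded = huffman_decode(bits, table)
```

Because the most frequent value receives the shortest code word, the bit string is shorter than a fixed two-bit-per-symbol encoding of the same four distinct values.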

In the open-loop method, the encoder refers to the original image when removing temporal redundancy, whereas the decoder refers to the decoded image when performing the inverse temporal operation, so a so-called drift error occurs. In the closed-loop method, by contrast, both the encoder and the decoder refer to the decoded image when handling temporal redundancy, so no drift error occurs.

To remedy this problem of the open-loop coding method, the encoder according to the present invention entropy-codes the quantized transform coefficients, inversely decodes the transform coefficients, and provides the reconstructed frame as the reference frame. The encoder according to the present invention thus creates the same environment as the decoding process and eliminates the accumulated error.
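
The difference between the two schemes can be sketched with a toy predictive coder in Python. Each "frame" is reduced to a single number, prediction is simply the previous reference value, and quantization is uniform rounding; all of these simplifications, the function names, and the sample values are assumptions made for illustration and are not taken from the specification.

```python
def quantize(value, step):
    return round(value / step)


def dequantize(level, step):
    return level * step


def encode_frames(frames, step, closed_loop):
    """Code each frame as a quantized residual against a reference value."""
    levels = []
    reference = 0.0
    for frame in frames:
        level = quantize(frame - reference, step)
        levels.append(level)
        if closed_loop:
            # Closed loop: the reference is what the decoder will reconstruct.
            reference = reference + dequantize(level, step)
        else:
            # Open loop: the reference is the original frame, unknown to the decoder.
            reference = frame
    return levels


def decode_frames(levels, step):
    """The decoder can only ever use its own reconstructions as references."""
    reconstructed = []
    reference = 0.0
    for level in levels:
        reference = reference + dequantize(level, step)
        reconstructed.append(reference)
    return reconstructed


def max_abs_error(frames, reconstructed):
    return max(abs(f - r) for f, r in zip(frames, reconstructed))


# Hypothetical slowly rising signal and a coarse quantization step.
frames = [3 * i for i in range(1, 11)]
step = 8

open_error = max_abs_error(frames, decode_frames(encode_frames(frames, step, False), step))
closed_error = max_abs_error(frames, decode_frames(encode_frames(frames, step, True), step))
```

In this toy run the open-loop error keeps growing because every small residual is quantized to zero and the decoder never catches up, while the closed-loop error stays within half a quantization step because each residual is measured against the decoder's own reconstruction.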

FIG. 5 is a block diagram illustrating a scalable video decoder according to the present invention. Referring to FIGS. 2 and 5, the scalable video decoder 50 includes an inverse entropy encoder 51, an embedded inverse quantization unit 52, an inverse spatial transform unit 53, an inverse temporal transform unit 54, and a buffer 55.

The inverse entropy encoder 51 performs the reverse of the process carried out by the entropy encoder 27 of the encoder 100. That is, the inverse entropy encoder 51 obtains the quantized transform coefficients from the input bitstream.

Like the embedded inverse quantization unit 23 of the encoder 100, the embedded inverse quantization unit 52 decodes the quantized wavelet coefficients (that is, the coded image information) received from the entropy encoder 27 according to magnitude information.

The inverse spatial transform unit 53 operates in the same manner as the inverse spatial transform unit 24 of the encoder 100.

The inverse temporal filtering unit 54 takes the previously reconstructed image [n-1] as the reference image and, using the motion vector received from the inverse entropy encoder 51, converts the temporal residual image into the final reconstructed image [n], which it outputs. The inverse temporal filtering unit 54 stores the final reconstructed image [n] in the buffer 55 so that it can be used as the reference image for subsequent images.
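
The recursion described above, in which reconstructed image [n] is formed from reconstructed image [n-1], a motion vector, and a residual, can be sketched in Python as follows. Frames are one-dimensional lists, each frame has a single integer motion vector, and edge samples are repeated when shifting; these simplifications, the function names, and the sample data are assumptions for illustration, not the operation of the inverse temporal filtering unit 54 as specified.

```python
def motion_compensate(reference, motion_vector):
    """Shift a 1-D reference frame by an integer motion vector, repeating edge samples."""
    last = len(reference) - 1
    return [reference[min(max(i - motion_vector, 0), last)]
            for i in range(len(reference))]


def inverse_temporal_filter(residuals, motion_vectors, first_reference):
    """Rebuild frames one by one; each restored frame becomes the next reference."""
    buffer = first_reference  # plays the role of the reference buffer
    restored = []
    for residual, mv in zip(residuals, motion_vectors):
        prediction = motion_compensate(buffer, mv)
        frame = [p + r for p, r in zip(prediction, residual)]
        restored.append(frame)
        buffer = frame  # reconstructed image [n] is stored for image [n+1]
    return restored


# Hypothetical data: a small bump moving to the right.
first_reference = [0, 0, 5, 9, 5, 0, 0, 0]
motion_vectors = [1, 2]
residuals = [
    [0, 0, 0, 0, 1, 0, 0, 0],  # the peak grows by 1 in the first frame
    [0, 0, 0, 0, 0, 0, 0, 0],  # pure translation in the second frame
]
restored = inverse_temporal_filter(residuals, motion_vectors, first_reference)
```

Because each output frame is written back to the buffer before the next frame is processed, the decoder's references are always its own reconstructions, which is the condition the closed-loop encoder is designed to match.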

As described above, the best embodiments have been disclosed in the drawings and the specification. Although specific terms have been used herein, they are used only to describe the present invention and are not intended to limit its meaning or the scope of the invention set forth in the claims. Those of ordinary skill in the art will therefore understand that various modifications and other equivalent embodiments are possible. Accordingly, the true scope of technical protection of the present invention should be determined by the technical spirit of the appended claims.

FIG. 1 is a block diagram illustrating a typical open-loop scalable video encoder.

FIG. 2 is a block diagram illustrating a scalable video encoder according to the present invention.

FIG. 3 is a block diagram illustrating the motion estimation unit shown in FIG. 2.

FIG. 4 is a block diagram illustrating the embedded inverse quantization unit shown in FIG. 2.

FIG. 5 is a block diagram illustrating a scalable video decoder according to the present invention.

* Explanation of reference numerals for the main parts of the drawings *

11: first down-sampling unit 12: second down-sampling unit

13: spatial transform unit 14: first wavelet transform unit

15: second wavelet transform unit 16: third wavelet transform unit

17: temporal transform unit 18: motion compensation unit

19: motion estimation unit 20: image reference unit

21: embedded quantization unit 22: closed-loop filtering unit

23: embedded inverse quantization unit 24: inverse spatial transform unit

25: inverse temporal filtering unit 26: buffer

27: entropy encoder 100: scalable video encoder

Claims

A scalable video encoding apparatus comprising:

a spatio-temporal transform unit configured to generate a first transformed image by removing spatial and temporal redundancy of an original image using a reference image;

a quantization unit configured to quantize the first transformed image to generate coded image information; and

a closed-loop filtering unit configured to sequentially perform inverse quantization, an inverse spatial transform, and an inverse temporal transform on the coded image information to generate a decoded image, and to provide the decoded image as the reference image.

The apparatus of claim 1,

wherein the closed-loop filtering unit comprises:

an inverse quantization unit configured to inversely quantize the coded image information to generate a second transformed image;

an inverse spatial transform unit configured to perform an inverse spatial transform on the second transformed image to generate a residual image; and

an inverse temporal filtering unit configured to generate the decoded image by performing an inverse temporal transform on the residual image.

The apparatus of claim 1,

wherein the spatio-temporal transform unit comprises:

a spatial transform unit configured to generate a residual image by removing spatial redundancy of the original image using the reference image; and

a temporal transform unit configured to perform motion estimation on the residual image, generate a motion vector from the motion estimation, and remove temporal redundancy of the original image using the motion vector to generate the first transformed image.

The apparatus of claim 3,

further comprising an entropy encoder configured to generate a bitstream including the motion vector and the coded image information.

The apparatus of claim 3,

further comprising a buffer configured to store the decoded image and provide the decoded image to the spatial transform unit as the reference image.

The apparatus of claim 1,

further comprising a down-sampling unit configured to low-pass filter an original-resolution image of the original image to generate a low-resolution image.

The apparatus of claim 6,

wherein the down-sampling unit comprises:

a first down-sampling unit configured to low-pass filter the original-resolution image to generate a first low-resolution image;

a second down-sampling unit configured to low-pass filter the first low-resolution image to generate a second low-resolution image; and

in the same manner, an n-th down-sampling unit configured to low-pass filter an (n-1)-th low-resolution image to generate an n-th low-resolution image.

The apparatus of claim 6,

wherein the low-pass filtering is down-sampling by a wavelet filter.

The apparatus of claim 7,

wherein the spatio-temporal transform unit operates separately for each of the different resolutions.

The apparatus of claim 9,

wherein the first transformed image is generated by:

generating an original-resolution transformed image and first and second low-resolution transformed images by removing spatial and temporal redundancy of the original-resolution image and the first and second low-resolution images;

integrating the first and second low-resolution transformed images to generate an integrated low-resolution transformed image; and

integrating the integrated low-resolution transformed image and the original-resolution transformed image.

The apparatus of claim 2,

wherein the first transformed image and the second transformed image are the same.

A scalable video encoding method comprising:

inversely quantizing coded image information to generate a first transformed image;

generating a first residual image by performing an inverse spatial transform on the first transformed image;

generating a decoded image by performing an inverse temporal transform on the first residual image; and providing the decoded image as a reference image.

The method of claim 12, further comprising:

(a) generating a second transformed image by removing spatial and temporal redundancy of an original image using the reference image; and

(b) quantizing the second transformed image to generate the coded image information.

The method of claim 13,

wherein step (a) comprises:

generating a second residual image by removing spatial redundancy of the original image using the reference image; and

performing motion estimation on the second residual image, generating a motion vector from the motion estimation, and removing temporal redundancy of the original image using the motion vector to generate the second transformed image.

The method of claim 13,

wherein step (b) comprises

generating a bitstream including the motion vector and the coded image information.

The method of claim 14,

further comprising storing the decoded image and providing the decoded image as the reference image.

The method of claim 13,

further comprising low-pass filtering an original-resolution image of the original image to generate a low-resolution image.

The method of claim 17,

wherein generating the low-resolution image comprises:

low-pass filtering the original-resolution image to generate a first low-resolution image;

low-pass filtering the first low-resolution image to generate a second low-resolution image; and

in the same manner, low-pass filtering an (n-1)-th low-resolution image to generate an n-th low-resolution image.

The method of claim 18,

wherein the transformed image is generated by

generating an original-resolution transformed image and first and second low-resolution transformed images by removing spatial and temporal redundancy of the original-resolution image and the first and second low-resolution images.

The method of claim 12,

wherein the first transformed image and the second transformed image are the same.

The method of claim 14,

wherein the first residual image and the second residual image are the same.