KR101407719B1

KR101407719B1 - Multi-view image coding method and apparatus using variable GOP prediction structure, multi-view image decoding apparatus and recording medium storing program for performing the method thereof

Info

Publication number: KR101407719B1
Application number: KR1020080003913A
Authority: KR
Inventors: 호요성; 오관정
Original assignee: 광주과학기술원
Priority date: 2008-01-14
Filing date: 2008-01-14
Publication date: 2014-06-16
Also published as: KR20090078114A

Abstract

The present invention discloses a multi-view image encoding method and apparatus, and an image decoding apparatus, that use a variable picture group prediction structure. The multi-view image encoding method includes: receiving a picture group containing a plurality of pictures; encoding the pictures of the picture group according to a plurality of predetermined group prediction modes and calculating bit-distortion cost values; determining an optimal group prediction mode among the group prediction modes in consideration of the bit-distortion cost values; and generating multi-view image information encoded according to the determined group prediction mode. The invention improves the coding efficiency of multi-view images and enables image coding with an optimized spatio-temporal prediction structure.

Multi-view image, GOP, bit-distortion cost value, prediction structure

Description

Multi-view image coding method and apparatus using a variable GOP prediction structure, multi-view image decoding apparatus, and recording medium storing a program for performing the method

The present invention relates to a multi-view image encoding method and apparatus and an image decoding apparatus, and more particularly to a method and apparatus for improving the efficiency of multi-view image encoding by variably adjusting the picture prediction structure in the time direction and the view direction.

A multi-view image is a set of images of the same three-dimensional scene captured with two or more cameras. In particular, by spatially combining several geometrically calibrated images, a multi-view image can offer the user images from a variety of viewpoints.

Panoramic images, one example of multi-view images, are widely studied in aerospace photography, computer vision, and computer graphics. They are applied in a very wide range of fields, from the interpretation of aerial photographs, image change detection, video compression, video indexing, and extension of camera resolution and field of view (FOV) to simple image editing. In computer vision, depth and disparity information of objects in an image is extracted from several images acquired at different viewpoints. In computer graphics, under the name image-based rendering, acquired multi-view images are used to generate realistic images at virtual viewpoints.

Such multi-view image processing techniques are used in surveillance systems with omnidirectional cameras, in three-dimensional virtual viewpoints for games, and in view switching, which lets a user freely select among the images input from multiple cameras. Combined with network technology, multi-view video can also be extended to various multimedia services that use interactive or immersive content.

Conventionally, a fixed prediction structure has been used for encoding multi-view images. Unlike single-view encoding, however, the efficiency of multi-view encoding depends on the prediction structure, and because the spatio-temporal characteristics of multi-view images differ from one another, the optimal prediction structure differs between the view direction and the time direction. A fixed prediction structure therefore limits how far coding efficiency can be improved.

An object of the present invention is to provide a multi-view image encoding method and apparatus, and a multi-view image decoding apparatus, in which the prediction structure of pictures that are adjacent in space and time is variably adjusted according to their spatio-temporal correlation, so that a multi-view image made up of a plurality of spatially adjacent pictures is encoded efficiently.

To achieve this object, a multi-view image encoding method according to the present invention includes: receiving a picture group containing a plurality of pictures; encoding the pictures of the picture group according to a plurality of predetermined group prediction modes and calculating a bit-distortion cost value for each encoded picture; determining, in consideration of the bit-distortion cost values, one of the group prediction modes as the optimal group prediction mode; and generating multi-view image information encoded according to the determined group prediction mode.

In another aspect, a multi-view image encoding apparatus according to the present invention includes: a buffer that receives and stores a picture group containing a plurality of pictures; a per-mode encoding unit that encodes the pictures of the picture group according to a plurality of predetermined group prediction modes; a bit-distortion value calculation unit that calculates a bit-distortion cost value for each encoded picture; a prediction mode determination unit that determines, in consideration of the bit-distortion cost values, one of the group prediction modes as the optimal group prediction mode; and an encoding unit that generates multi-view image information encoded according to the determined group prediction mode.

In another aspect, a multi-view image encoding apparatus according to the present invention includes: an input unit that receives a picture group containing a plurality of pictures; a prediction mode adjusting unit that adjusts the group prediction mode used to encode the pictures in the picture group; a bit-distortion value calculation unit that calculates a bit-distortion cost value for each picture encoded according to the group prediction mode set by the prediction mode adjusting unit; and a prediction mode determination unit that determines, in consideration of the bit-distortion cost values, one of the group prediction modes as the optimal group prediction mode. The apparatus generates multi-view image information encoded according to the determined group prediction mode.

In another aspect, a multi-view image decoding apparatus according to the present invention includes: a prediction mode decoding unit that recovers prediction mode information from the bitstream information of an encoded picture group; an entropy decoding unit that entropy-decodes the bitstream information; an inverse quantization unit that inverse-quantizes the residual component information recovered by the entropy decoding unit; a motion compensation unit that generates motion-compensated pictures using the motion information recovered by the entropy decoding unit; and a picture rearrangement unit that generates reconstructed pictures from the inverse-quantized residual component information and the motion-compensated pictures, and rearranges the reconstructed pictures.

The present invention also provides a computer-readable recording medium storing a program for executing the above multi-view image encoding method on a computer.

According to the present invention, the view-direction picture group (VGOP) and the time-direction picture group (TGOP) are introduced into multi-view image encoding as concepts at a level between the sequence and the picture, and the prediction structure of each picture group is variably adjusted in consideration of bit-distortion cost values. This improves the coding efficiency of multi-view images and enables image coding with an optimized spatio-temporal prediction structure.

Hereinafter, the multi-view image encoding/decoding apparatus and method using the variable picture group prediction structure of the present invention, and a recording medium storing a program for performing the method, are described in detail with reference to the drawings.

FIG. 1 is a schematic diagram of a multi-view image transmission system according to the present invention. The system shown in FIG. 1 includes a plurality of multi-view cameras (12, 14, 16, 18), an encoding apparatus (20), the Internet (30), a decoding apparatus (40), and a user terminal (50).

The multi-view cameras capture the same subject and transmit the captured images to the multi-view image encoding apparatus (20) over digital or analog transmission lines. The encoding apparatus (20) of this embodiment encodes the images using a GOP prediction structure that varies per picture group rather than a fixed prediction structure. Here, a picture group means a view-direction picture group (VGOP) or a time-direction picture group (TGOP), and a variable GOP prediction structure means that the arrangement of I, P, and B pictures in the VGOP and TGOP differs from one picture group to another. The encoding apparatus is described in detail below. The data compressed by the encoding apparatus (20) is delivered to the multi-view image decoding apparatus (40) over the Internet or another data communication network. The decoding apparatus (40) decodes the delivered data and outputs the reconstructed images through the output means of the user terminal (50).

FIG. 2 is a block diagram of a multi-view image encoding apparatus according to an embodiment of the present invention. The multi-view image encoding apparatus (20) shown in FIG. 2 includes a buffer (102), a downsampling unit (104), a GOP prediction-mode encoding unit (106), a bit-distortion cost value calculation unit (108), a GOP prediction mode determination unit (110), and an encoding unit (112).

The buffer (102) receives the picture group acquired by the multi-view image acquisition apparatus. In the present invention, the picture group is used in the sense of a GGOP (Group of GOPs) that includes a view-direction picture group (VGOP) and a time-direction picture group (TGOP). The GGOP is a level between the sequence and the picture: one GGOP consists of several pictures, and several GGOPs together can make up one sequence.

The downsampling unit (104) downsamples the pictures of the input picture group stored in the buffer. Because the GOP prediction-mode encoding unit of the encoding apparatus (20) in FIG. 2 encodes downsampled image information, the amount of computation and the time needed for per-mode encoding are reduced. If the result of that per-mode encoding of downsampled images were used as the final encoding result, however, the quality of the reconstructed image would suffer, so this embodiment also includes a separate encoding unit (112) for the actual image encoding.
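The text does not say how the downsampling unit (104) reduces the pictures. As a minimal sketch, assuming a 2:1 reduction in each dimension by averaging 2x2 blocks of luma samples (the function name `downsample_2x` and the averaging filter are illustrative assumptions, not taken from the patent):

```python
from typing import List


def downsample_2x(plane: List[List[int]]) -> List[List[int]]:
    """Halve width and height by averaging each 2x2 block (rounded).

    Assumed filter, not specified by the source. An odd trailing row
    or column is dropped.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4
            for x in range(0, width - 1, 2)
        ]
        for y in range(0, height - 1, 2)
    ]
```

Under this assumption each trial encoding works on a quarter of the original samples, which is the kind of saving the paragraph above relies on; the final encoding by the encoding unit (112) still uses the full-resolution pictures.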

The GOP prediction-mode encoding unit (106) encodes the downsampled picture group according to each of a plurality of predetermined group prediction modes. It includes a first per-mode encoding unit and a second per-mode encoding unit. The first encodes the anchor pictures of the picture group according to view-direction group prediction modes, and the second encodes the non-anchor pictures according to time-direction group prediction modes.

The bit-distortion cost value calculation unit (108) calculates a bit-distortion cost value for each encoded picture from the image information encoded by the GOP prediction-mode encoding unit and the original image information. It includes a first and a second bit-distortion cost value calculation unit. The first calculates the bit-distortion cost value of each anchor picture encoded by the first per-mode encoding unit, and the second calculates the bit-distortion cost value of each non-anchor picture encoded by the second per-mode encoding unit.

FIG. 3 shows an example of a prediction structure for encoding a multi-view image with eight views. The prediction structure of FIG. 3 is laid out along the view direction (y-axis: S0, S1, ..., S7) and the time direction (x-axis: T0, T1, ...). An arrow in FIG. 3 denotes a reference relationship between two frames: A->B means that B is encoded with reference to A. As shown in FIG. 3, pictures at time instants that have reference relationships only in the view direction (T0, T8, ...) are called anchor pictures, and all other pictures are called non-anchor pictures. The prediction structure of multi-view image coding can be divided into a prediction structure for the view direction, which exploits spatial correlation, and a prediction structure for the time direction, which exploits temporal correlation.

For the view direction, a fixed GOP prediction structure such as I-B-P-B-B-B-P-P has conventionally been used, and the same structure has been kept regardless of time. The multi-view image encoding method of the present invention differs in that it variably adjusts the prediction structures in both the view direction and the time direction. FIG. 3 shows a VGOP, a picture group in the view direction, and a TGOP, a picture group in the time direction. Considering first the encoding of the VGOP, the I view is preferably placed close to the central view using Equation 1 below, where Num_view is the number of views.

[Equation 1]

I_view = Num_view / 2

When I_view is placed at the center according to Equation 1, the basic prediction structure takes the form …PPIPP…, and extended prediction structures depend on how many B pictures are inserted between the P pictures. For example, PBPBIBPBP… inserts one B picture, and …PBBPBBIBBP… inserts two. Various prediction structures centered on the I picture can exist in this way. The present invention defines each such prediction structure as a group prediction mode. Preferably, a variety of group prediction modes for the view-direction picture group (VGOP) are defined in advance, the optimal group prediction mode is selected in terms of the bit-distortion cost value given by Equation 2 below, and multi-view image encoding is performed according to the selected mode. Information on the selected optimal group prediction mode is encoded and transmitted in the header of the VGOP, and the decoder uses this information to decode with the corresponding structure.
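The way such mode strings are built can be sketched in code. The sketch below assumes integer division in Equation 1 and one extra rule that the text does not state but that matches every example string it gives: a picture at either end of the VGOP is always a P picture, since a B picture there would have no outer reference. The names `i_view_index` and `view_mode` are hypothetical.

```python
def i_view_index(num_view: int) -> int:
    """Equation 1, assuming integer division: I view near the center."""
    return num_view // 2


def view_mode(num_view: int, i_view: int, num_b: int) -> str:
    """Build a view-direction picture-type string such as 'PBPBIBPBP'.

    Walking outward from the I view, insert num_b B pictures and then
    a P picture, repeatedly. Edge pictures are forced to P (assumed rule).
    """
    types = ["?"] * num_view
    types[i_view] = "I"
    for step in (-1, 1):  # walk left, then right, away from the I view
        run = 0  # B pictures placed since the last I or P picture
        pos = i_view + step
        while 0 <= pos < num_view:
            at_edge = pos == 0 or pos == num_view - 1
            if run < num_b and not at_edge:
                types[pos] = "B"
                run += 1
            else:
                types[pos] = "P"
                run = 0
            pos += step
    return "".join(types)


if __name__ == "__main__":
    center = i_view_index(9)
    print(view_mode(9, center, 1))  # nine views, one B between P pictures
```

With nine views and one inserted B picture this yields PBPBIBPBP, the one-B example above. Passing an off-center `i_view` gives further candidate modes for the same VGOP length.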

[Equation 2]

C = D + λ·R

Here, D is the distortion when encoding according to a given group prediction mode, R is the number of bits (rate) when encoding according to that mode, C is the bit-distortion cost value, and λ is a weight between distortion and bits, for which the value defined in H.264/AVC can be used. Equation 2 yields a bit-distortion cost value for each picture encoded according to a group prediction mode. Summing the cost values of the pictures that belong to a VGOP for each mode, and comparing the sums, identifies the group prediction mode that is optimal for that VGOP.
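The selection rule can be written out as a short sketch. It assumes that the per-picture distortion and bit counts from each trial encoding are already available; the function names, the data layout, and the numbers in the usage example are hypothetical.

```python
from typing import Dict, List, Tuple


def cost(distortion: float, bits: float, lam: float) -> float:
    """Equation 2: C = D + lambda * R for one encoded picture."""
    return distortion + lam * bits


def best_group_mode(trials: Dict[str, List[Tuple[float, float]]],
                    lam: float) -> str:
    """Return the mode whose summed per-picture cost is smallest.

    trials maps a group prediction mode to the (D, R) pair of every
    picture in the picture group when encoded with that mode.
    """
    totals = {
        mode: sum(cost(d, r, lam) for d, r in pictures)
        for mode, pictures in trials.items()
    }
    return min(totals, key=totals.get)


if __name__ == "__main__":
    # Made-up measurements for two pictures under two candidate modes.
    trials = {
        "IBPBPBPP": [(10.0, 100.0), (4.0, 20.0)],
        "PPBIBPBP": [(9.0, 80.0), (5.0, 30.0)],
    }
    print(best_group_mode(trials, lam=0.5))
```

A larger λ penalizes bits more heavily, so the same measurements can lead to a different mode when λ changes.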

In the present invention, encoding according to the group prediction mode of the time-direction picture group (TGOP) is performed after the group prediction mode of the view-direction group has been determined. The prediction structure of a TGOP is generally a hierarchical B picture structure. The structure of the TGOP is determined by the GOP length, which is defined so that random access into the sequence is easy; in FIG. 3 the GOP length is 8. Even for a given GOP length, however, various hierarchical B picture structures are possible: for example, two pictures can be assigned to the B1 level, or the B1 picture can be shifted toward one side. Because coding efficiency varies with the hierarchical B picture structure, various time-direction group prediction modes can be defined in advance for the TGOP, as for the VGOP, and the group prediction mode with the best coding efficiency can be selected for each TGOP. Coding efficiency here is judged in terms of bit-distortion.
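To make the idea of alternative hierarchical B structures concrete, the sketch below assigns a hierarchy level to each picture between two anchor pictures. It assumes the common dyadic scheme (each interval is split at its midpoint) and adds a hypothetical `first_mid` parameter that moves the B1 picture to one side; neither the function name nor the parameter comes from the patent.

```python
from typing import Dict, List, Optional


def hierarchy_levels(gop_length: int,
                     first_mid: Optional[int] = None) -> List[int]:
    """Level of each picture T0..T(gop_length); anchors are level 0.

    Level 1 corresponds to B1, level 2 to B2, and so on. By default
    every interval is split at its midpoint (dyadic hierarchy).
    first_mid overrides the position of the B1 picture only.
    """
    levels: Dict[int, int] = {0: 0, gop_length: 0}

    def split(lo: int, hi: int, level: int, mid: Optional[int]) -> None:
        if hi - lo < 2:
            return  # no picture left between lo and hi
        if mid is None:
            mid = (lo + hi) // 2
        levels[mid] = level
        split(lo, mid, level + 1, None)
        split(mid, hi, level + 1, None)

    split(0, gop_length, 1, first_mid)
    return [levels[t] for t in range(gop_length + 1)]


if __name__ == "__main__":
    print(hierarchy_levels(8))               # balanced structure
    print(hierarchy_levels(8, first_mid=2))  # B1 shifted toward T0
```

For a GOP length of 8 the default call places B1 at T4, B2 at T2 and T6, and B3 at the odd positions. Each distinct level assignment can then be treated as one time-direction group prediction mode and compared by its bit-distortion cost.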

The GOP prediction mode determination unit (110) determines one of the group prediction modes in consideration of the bit-distortion cost values. It includes a first and a second prediction mode determination unit. The first determines, among the time-direction group prediction modes, the mode that minimizes the sum of the bit-distortion cost values calculated by the second bit-distortion cost value calculation unit. The second determines, among the view-direction group prediction modes, the mode that minimizes the sum of the bit-distortion cost values.

The encoding unit (112) encodes the images of the picture group according to the group prediction mode selected by the GOP prediction mode determination unit (110). Because per-mode encoding in this embodiment was performed on downsampled image information, the image encoding is preferably performed by a separate encoding unit. If, unlike this embodiment, no downsampling unit is provided, the apparatus can be implemented so that the encoding result for the view-direction group prediction mode selected by the GOP prediction mode determination unit is used directly. Here, the encoding result means the encoded image information and the identification information of the group prediction mode.

FIG. 4 is a block diagram of a multi-view image encoding apparatus according to another embodiment of the present invention. The multi-view image encoding apparatus (20') shown in FIG. 4 includes a buffer (152), a picture rearrangement unit (154), a discrete cosine transform unit (DCT, 156), a quantization unit (Q, 158), an inverse quantization unit (Q^-1, 160), an inverse discrete cosine transform unit (IDCT, 162), an intra prediction unit (164), a motion compensation unit (168), a motion estimation unit (170), a GOP prediction mode adjusting unit (172), a bit-distortion cost value calculation unit (174), a GOP prediction mode determination unit (176), an entropy encoding unit (178), and a bitstream generation unit (180).

The multi-view image encoding apparatus of this embodiment adds a GOP prediction mode adjusting unit (172), a bit-distortion cost value calculation unit (174), and a GOP prediction mode determination unit (176) to a conventional image encoding apparatus. Its main feature is that it encodes the pictures to be encoded according to a plurality of predetermined group prediction modes and outputs the encoding result of the group prediction mode that is optimal in terms of bit-distortion.

For example, when the length of the view-direction picture group (VGOP) is 8 and four group prediction modes exist for the VGOP (IBPBPBPP, PPBIBPBP, PBBIBBPP, and PPBBIBBP), the image encoding apparatus of this embodiment encodes one view-direction group four times and outputs, as the final encoding result, the result of the optimal mode among the four. The components of the image encoding apparatus are described below.

The buffer (152) receives and temporarily stores multi-view image information acquired by image acquisition devices such as camcorders and digital cameras. The picture rearrangement unit (154) accesses the buffer according to the rearrangement order described later and supplies the data of the picture to be encoded to the motion estimation unit and the subtractor.

First, the forward path is described in detail. The target picture to be encoded is passed from the picture rearrangement unit (154) to the subtractor. The subtractor generates a difference matrix between the target picture and the reference picture reconstructed by the motion compensation unit (168) and passes it to the discrete cosine transform unit (DCT, 156). The DCT unit (156) computes DCT coefficients by applying a discrete cosine transform to the difference matrix. The quantization unit (Q, 158) quantizes the DCT coefficients. The quantized coefficients are passed to the entropy encoding unit (178) and entropy-coded with a method such as CAVLC or CABAC. The entropy-coded data is transmitted to an external network through the bitstream generation unit (180).
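The subtract, transform, and quantize steps of the forward path can be illustrated with a small sketch. It uses a textbook floating-point DCT-II and a uniform quantizer with a single step size, both of which are simplifying assumptions; an H.264/AVC encoder uses an integer transform instead, and the function names here are hypothetical.

```python
import math
from typing import List

Block = List[List[float]]


def residual(target: Block, prediction: Block) -> Block:
    """Subtractor: sample-wise difference between target and prediction."""
    return [[t - p for t, p in zip(trow, prow)]
            for trow, prow in zip(target, prediction)]


def dct_2d(block: Block) -> Block:
    """Orthonormal 2-D DCT-II of a square block (direct, unoptimized)."""
    n = len(block)

    def scale(k: int) -> float:
        return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)

    def basis(i: int, k: int) -> float:
        return math.cos((2 * i + 1) * k * math.pi / (2 * n))

    return [
        [
            scale(u) * scale(v) * sum(
                block[y][x] * basis(y, u) * basis(x, v)
                for y in range(n) for x in range(n)
            )
            for v in range(n)
        ]
        for u in range(n)
    ]


def quantize(coeffs: Block, step: float) -> List[List[int]]:
    """Uniform quantizer: divide by the step size and round."""
    return [[round(c / step) for c in row] for row in coeffs]


if __name__ == "__main__":
    target = [[18.0] * 4 for _ in range(4)]
    prediction = [[10.0] * 4 for _ in range(4)]
    print(quantize(dct_2d(residual(target, prediction)), step=4.0))
```

When the prediction is good the residual is nearly flat, so almost all of its energy lands in a few low-frequency coefficients and the rest quantize to zero, which is what the entropy coder exploits.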

Next, the reconstruction path is described in detail. In this embodiment, the data quantized by the quantization unit (158) is input to the inverse quantization unit (Q^-1, 160), the inverse discrete cosine transform unit (IDCT, 162), and the adder. The intra prediction unit (164) generates an I picture using an intra-picture prediction algorithm and passes it to the GOP prediction mode adjusting unit (172). In inter mode, the reconstructed picture is stored in the picture storage unit (166), and the stored picture is then delivered to the motion compensation unit (168).

The motion estimation unit (170) estimates the motion of the target picture input from the picture rearrangement unit (154) and sends the motion vectors of the blocks of the target picture to the entropy encoding unit (178). The motion vectors generated by the motion estimation unit (170) are also passed to the motion compensation unit (168), which generates a motion-compensated prediction picture from the motion vectors and the reconstructed picture supplied by the picture storage unit. The difference between this prediction picture and the target picture input from the picture rearrangement unit is computed by the subtractor and passed to the discrete cosine transform unit (156), as described above. The prediction picture from the motion compensation unit is also input to the adder, where it is added to the difference matrix reconstructed through the IDCT, and the sum is stored in the picture storage unit (166).
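The text does not specify how the motion estimation unit (170) finds a motion vector. As a minimal sketch, assuming exhaustive block matching with the sum of absolute differences (SAD) as the matching criterion (the function name `full_search` and its parameters are hypothetical):

```python
from typing import List, Tuple

Frame = List[List[int]]


def full_search(cur: Frame, ref: Frame, bx: int, by: int,
                size: int, radius: int) -> Tuple[int, int]:
    """Return the (dx, dy) that minimizes SAD for one block.

    The block has its top-left corner at (bx, by) in cur and is
    compared with every candidate within +/- radius samples in ref.
    """
    height, width = len(ref), len(ref[0])
    best_cost, best_mv = None, (0, 0)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            rx, ry = bx + dx, by + dy
            if rx < 0 or ry < 0 or rx + size > width or ry + size > height:
                continue  # candidate block falls outside the reference
            sad = sum(
                abs(cur[by + j][bx + i] - ref[ry + j][rx + i])
                for j in range(size) for i in range(size)
            )
            if best_cost is None or sad < best_cost:
                best_cost, best_mv = sad, (dx, dy)
    return best_mv


if __name__ == "__main__":
    ref = [[10 * y + x for x in range(6)] for y in range(6)]
    # Current frame: reference content shifted one sample to the right.
    cur = [[ref[y][max(x - 1, 0)] for x in range(6)] for y in range(6)]
    print(full_search(cur, ref, bx=2, by=2, size=2, radius=1))
```

The motion compensation unit would then copy the block that the returned vector points to in the stored reference picture to form the prediction.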

The GOP prediction mode adjustment unit 172 controls, according to a plurality of predetermined group prediction modes, whether the target picture is coded as an I picture or as a P or B picture, and passes the picture coded via the intra prediction unit or the motion compensation unit to the subtractor.

The bit-distortion cost value calculation unit 174 calculates a bit-distortion cost value for each of the pictures coded according to the plurality of group prediction modes. It calculates the bit-distortion cost value for the current group prediction mode from the reconstructed image information generated through the reconstruction path, the original image information, and the bit rate.
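The text names the three inputs of the cost (original picture, reconstructed picture, bit rate) but not the formula. A minimal sketch follows, assuming the common Lagrangian form J = D + λ·R with the sum of squared differences as the distortion D; the formula, the multiplier `lam`, and the function names are assumptions, not details taken from the patent.

```python
# Hypothetical sketch of a bit-distortion (rate-distortion) cost.
# Assumed form: J = D + lam * R, with D = sum of squared differences.

def ssd(original, reconstructed):
    """Distortion D: sum of squared sample differences between two pictures."""
    return sum((o - r) ** 2
               for row_o, row_r in zip(original, reconstructed)
               for o, r in zip(row_o, row_r))

def bit_distortion_cost(original, reconstructed, bits, lam):
    """Cost J for one coded picture; `bits` is the number of bits it consumed."""
    return ssd(original, reconstructed) + lam * bits
```

A larger `lam` penalises bit consumption more heavily, so the comparison between modes shifts toward cheaper but more distorted encodings.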

For each of the four group prediction modes, the GOP prediction mode determination unit 176 sums the bit-distortion cost values of the eight pictures and selects the group prediction mode with the smallest sum as the optimal group prediction mode. It also passes identification information for the determined group prediction mode to the entropy coding unit 178.
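The decision rule (sum the per-picture costs for each mode, then take the mode with the smallest total) can be sketched as follows. The mode identifiers and the dictionary layout are hypothetical; only the sum-then-minimise rule comes from the text.

```python
# Hypothetical sketch of the mode decision: one list of per-picture costs
# per candidate group prediction mode, smallest total wins.

def choose_group_prediction_mode(costs_by_mode):
    """costs_by_mode maps a mode identifier to its list of per-picture costs.

    Returns (best_mode, total_cost). Ties go to the first mode in dict order.
    """
    totals = {mode: sum(costs) for mode, costs in costs_by_mode.items()}
    best_mode = min(totals, key=totals.get)
    return best_mode, totals[best_mode]
```

With four candidate modes and eight pictures each, `costs_by_mode` would hold four lists of eight values, matching the configuration described for unit 176.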

FIG. 5 is a flowchart illustrating a multi-view image encoding method according to an embodiment of the present invention. The method shown in FIG. 5 includes the following steps, which are performed in the image encoding apparatus of FIG. 2.

In step 210, the buffer 102 receives image information in units of picture groups (GOPs). Here, the picture group is a picture group of a multi-view image and includes a view-direction picture group and a time-direction picture group.

In step 220, the down-sampling unit 104 down-samples each of the pictures belonging to the input picture group.
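The source does not state the down-sampling factor or filter. As an illustration only, the sketch below assumes a 2:1 reduction in each dimension by averaging 2x2 blocks with rounding; the function name `downsample_2x` is hypothetical.

```python
# Hypothetical sketch of step 220: 2:1 down-sampling in each dimension by
# averaging 2x2 blocks (the factor and the filter are assumptions).

def downsample_2x(picture):
    """Return a picture with half the width and height (odd edges are dropped)."""
    height = len(picture) // 2 * 2
    width = len(picture[0]) // 2 * 2
    return [[(picture[y][x] + picture[y][x + 1] +
              picture[y + 1][x] + picture[y + 1][x + 1] + 2) // 4
             for x in range(0, width, 2)]
            for y in range(0, height, 2)]
```

Running the mode search on reduced pictures lowers the cost of trying every candidate mode, which is presumably why the final encoding in step 290 is performed on the pictures held in the buffer.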

In step 230, the GOP per-prediction-mode encoding unit 106 encodes the anchor pictures of the view-direction picture group according to the view-direction group prediction modes.

In step 240, the bit-distortion cost value calculation unit 108 calculates the bit-distortion cost value of each anchor picture encoded in step 230.

In step 250, the GOP prediction mode determination unit 110 selects, from among the view-direction group prediction modes, the one that minimizes the sum of the bit-distortion cost values as the optimal group prediction mode.

In step 260, the GOP per-prediction-mode encoding unit 106 encodes the non-anchor pictures according to those time-direction group prediction modes that are related to the group prediction mode determined in step 250. For example, if the prediction mode for the anchor picture group is determined to be PPBIBPBP, per-mode encoding at view S0 is performed only for the time-direction group prediction modes that begin with P.
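The pruning rule in the example can be sketched as a simple filter: the picture type that the selected view-direction mode assigns to a view determines which time-direction modes remain candidates for that view. The time-direction mode strings used in the usage note are placeholders, and the function name is hypothetical.

```python
# Hypothetical sketch of the pruning in step 260. A view-direction mode is a
# string with one picture type per view (e.g. "PPBIBPBP" for views S0..S7).

def temporal_candidates(view_mode, view_index, temporal_modes):
    """Keep only time-direction modes whose first picture type matches the
    anchor picture type assigned to this view by the view-direction mode."""
    anchor_type = view_mode[view_index]
    return [mode for mode in temporal_modes if mode.startswith(anchor_type)]
```

For instance, with the placeholder list `["PBBBBBBB", "IBBBBBBB", "BBBBBBBB", "PBPBPBPB"]`, view S0 of PPBIBPBP keeps only the two modes beginning with P, while view S3 keeps only the mode beginning with I.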

In step 270, the bit-distortion cost value calculation unit 108 calculates the bit-distortion cost value of each non-anchor picture encoded in step 260.

In step 280, the GOP prediction mode determination unit 110 selects, from among the time-direction group prediction modes, the one that minimizes the sum of the bit-distortion cost values as the optimal group prediction mode.
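Steps 230 to 280 form a two-stage search: first the view-direction mode is fixed using the anchor pictures, then the time-direction mode is chosen from the candidates compatible with it. A compact sketch follows. It assumes that the time-direction mode is selected separately for each view and that the summed costs are supplied by caller-provided functions; both are assumptions, and all names are hypothetical.

```python
# Hypothetical sketch of the two-stage search (steps 230-280).
# anchor_cost(view_mode)            -> summed cost of the anchor pictures
# non_anchor_cost(view_index, mode) -> summed cost of that view's non-anchor pictures

def select_modes(view_modes, temporal_modes, anchor_cost, non_anchor_cost):
    # Stage 1 (steps 230-250): cheapest view-direction mode over the anchors.
    best_view_mode = min(view_modes, key=anchor_cost)

    # Stage 2 (steps 260-280): per view, search only the time-direction modes
    # that start with the anchor picture type chosen in stage 1.
    best_temporal_modes = []
    for view_index, anchor_type in enumerate(best_view_mode):
        candidates = [m for m in temporal_modes if m.startswith(anchor_type)]
        # min() raises ValueError if no candidate matches the anchor type.
        best = min(candidates, key=lambda m: non_anchor_cost(view_index, m))
        best_temporal_modes.append(best)
    return best_view_mode, best_temporal_modes
```

Fixing the view-direction mode first means the second stage evaluates only a subset of the time-direction modes for each view rather than every combination.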

In step 290, the encoding unit 112 encodes the multi-view images stored in the buffer 102 according to the optimal group prediction mode determined by the GOP prediction mode determination unit 110, and outputs the encoded image information. Unlike the method of this embodiment, the encoding method may also be implemented without the down-sampling step. In that case, the encoding results already produced by the GOP per-prediction-mode encoding unit, that is, the encoded image information, can be reused, so the encoding unit 112 performs encoding using the information identifying the determined group prediction mode together with the already generated encoded image information.

FIG. 6 is a block diagram illustrating a multi-view image decoding apparatus according to an embodiment of the present invention. The image decoding apparatus shown in FIG. 6 includes a GOP prediction mode decoding unit 302, an entropy decoding unit 304, an inverse quantization unit (Q^-1, 306), an inverse discrete cosine transform unit (IDCT, 308), a motion compensation unit 310, and a picture rearrangement unit 312.

The GOP prediction mode decoding unit 302 recovers the identification information for the group prediction mode from the bitstream of the coded picture group. The entropy decoding unit 304 performs entropy decoding on the input bitstream according to the recovered identification information of the group prediction mode. The inverse quantization unit (Q^-1, 306) dequantizes the entropy-decoded residual component information, and the inverse discrete cosine transform unit (IDCT, 308) performs the inverse of the discrete cosine transform to convert frequency components into pixel components. The motion compensation unit 310 performs motion compensation using the entropy-decoded motion information to generate motion-compensated reconstructed image information. This reconstructed image information is added to the residual component information from the IDCT unit 308, and the finally reconstructed image information is passed to the picture rearrangement unit 312. The picture rearrangement unit 312 receives the reconstructed image information from the adder and rearranges the pictures into playback-time order.
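The last two decoder stages, the adder and the picture rearrangement, can be sketched briefly. Clipping to an 8-bit sample range and tagging each decoded picture with a display index are assumptions made for illustration; the function names are hypothetical.

```python
# Hypothetical sketch of the decoder's adder and picture rearrangement.

def reconstruct(prediction, residual, max_value=255):
    """Adder: motion-compensated prediction plus decoded residual, clipped
    to the sample range (8-bit range assumed)."""
    return [[min(max(p + r, 0), max_value) for p, r in zip(p_row, r_row)]
            for p_row, r_row in zip(prediction, residual)]

def reorder_for_display(decoded):
    """decoded: list of (display_index, picture) pairs in decoding order.
    Returns the pictures sorted into playback order."""
    return [picture for _, picture in sorted(decoded, key=lambda item: item[0])]
```

Reordering is needed because B pictures are decoded after the later pictures they reference, so decoding order generally differs from playback order.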

Meanwhile, the image encoding and decoding methods of the present invention can be implemented as computer-readable code on a computer-readable recording medium. The computer-readable recording medium includes every kind of recording device in which data readable by a computer system is stored.

Examples of the computer-readable recording medium include ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical data storage devices, and also include implementation in the form of a carrier wave (for example, transmission over the Internet). The computer-readable recording medium may also be distributed over computer systems connected by a network, so that the computer-readable code is stored and executed in a distributed manner. Functional programs, code, and code segments for implementing the present invention can be readily inferred by programmers in the technical field to which the present invention belongs.

The present invention has been described above with a focus on its preferred embodiments. Those of ordinary skill in the art to which the present invention belongs will understand that it may be implemented in modified forms without departing from its essential characteristics. The disclosed embodiments should therefore be considered in an illustrative rather than a restrictive sense. The scope of the present invention is set out in the claims rather than in the foregoing description, and all differences within the scope of their equivalents should be construed as being included in the present invention.

The present invention relates to a multi-view image encoding method and apparatus, and an image decoding apparatus, in which the prediction structure used to code the view-direction picture group (VGOP) and the time-direction picture group (TGOP) can be adjusted variably. It is useful not only in multi-view image coding systems but also in various multimedia service systems that use interactive content, immersive content, and the like.

FIG. 1 is a schematic diagram showing a multi-view image transmission system according to the present invention.

FIG. 2 is a block diagram illustrating a multi-view image encoding apparatus according to an embodiment of the present invention.

FIG. 3 shows an example of a prediction structure for multi-view image coding with eight views.

FIG. 4 is a block diagram illustrating a multi-view image encoding apparatus according to another embodiment of the present invention.

FIG. 5 is a flowchart illustrating a multi-view image encoding method according to an embodiment of the present invention.

FIG. 6 is a block diagram illustrating a multi-view image decoding apparatus according to an embodiment of the present invention.

Claims

1. A multi-view image encoding method comprising:

a) receiving a picture group including a plurality of pictures;

b) encoding the pictures of the picture group according to a plurality of predetermined group prediction modes, and calculating a bit-distortion cost value of each of the encoded pictures;

c) determining one group prediction mode among the group prediction modes in consideration of the bit-distortion cost values; and

d) generating multi-view image information encoded according to the determined group prediction mode.

2. The method according to claim 1,

wherein in step a) the picture group includes a view-direction picture group (VGOP) and a time-direction picture group (TGOP), and in step b) the group prediction modes include a plurality of group prediction modes for coding the view-direction picture group and a plurality of group prediction modes for coding the time-direction picture group.

3. The method of claim 2, wherein step b) comprises:

b1) encoding the anchor pictures of the view-direction picture group according to the view-direction group prediction modes; and

b2) calculating a bit-distortion cost value of each of the anchor pictures coded in step b1).

4. The method of claim 3,

wherein determining one group prediction mode in step c) comprises

determining, among the view-direction group prediction modes, the group prediction mode that minimizes the sum of the bit-distortion cost values calculated in step b2), and

generating the multi-view image information in step d)

uses, from the coding information generated as a result of step b1), the image information encoded according to the view-direction group prediction mode selected in step c), together with identification information on the determined group prediction mode.

5. The method of claim 2, wherein step b) comprises:

b1) down-sampling the anchor pictures of the view-direction picture group;

b2) encoding the down-sampled anchor pictures according to the view-direction group prediction modes; and

b3) calculating a bit-distortion cost value of each anchor picture coded in step b2).

6. The method of claim 2, wherein step b) comprises:

b1) encoding the anchor pictures of the view-direction picture group according to the view-direction group prediction modes;

b2) calculating a bit-distortion cost value of each of the anchor pictures coded in step b1), and determining the view-direction group prediction mode that minimizes the sum of the bit-distortion cost values;

b3) selecting, among the time-direction group prediction modes, those related to the view-direction group prediction mode determined in step b2), and encoding the non-anchor pictures according to the selected group prediction modes; and

b4) calculating a bit-distortion cost value of each encoded non-anchor picture.

7. The method according to claim 6,

wherein step c) comprises determining, among the time-direction group prediction modes related to the group prediction mode determined in step b2), the time-direction group prediction mode that minimizes the sum of the bit-distortion cost values of the non-anchor pictures calculated in step b4), and

generating the multi-view image information in step d) uses the encoded image information generated as a result of step b3) and identification information on the group prediction mode determined in step b2) or step c).

8. The method of claim 2,

wherein the view-direction group prediction modes include at least one group prediction mode selected from "...PPIPP...", "...PBIBP...", and "...PBBIBBP...".

9. A computer-readable recording medium on which is recorded a program for executing, on a computer, the multi-view image encoding method of any one of claims 1 to 8.

10. A multi-view image encoding apparatus comprising:

a buffer that receives a picture group including a plurality of pictures and stores data for the input picture group;

a per-prediction-mode encoding unit that encodes the pictures of the picture group according to a plurality of predetermined group prediction modes;

a bit-distortion cost value calculation unit that calculates a bit-distortion cost value of each of the encoded pictures;

a prediction mode determination unit that determines one group prediction mode among the group prediction modes in consideration of the bit-distortion cost values; and

an encoding unit that generates multi-view image information encoded according to the determined group prediction mode.

11. The apparatus of claim 10,

wherein the picture group includes view-direction pictures and time-direction pictures, and

the group prediction modes include a plurality of group prediction modes for coding the view-direction picture group and a plurality of group prediction modes for coding the time-direction picture group.

12. The apparatus of claim 11,

wherein the per-prediction-mode encoding unit encodes the anchor pictures included in the picture group according to the view-direction group prediction modes, the bit-distortion cost value calculation unit calculates a bit-distortion cost value of each of the encoded anchor pictures, and

the prediction mode determination unit determines the group prediction mode that minimizes the sum of the bit-distortion cost values among the view-direction group prediction modes.

13. The apparatus of claim 11,

wherein the per-prediction-mode encoding unit includes a first per-prediction-mode encoding unit that encodes the anchor pictures included in the picture group according to the view-direction group prediction modes and a second per-prediction-mode encoding unit that encodes the non-anchor pictures according to the time-direction group prediction modes,

the bit-distortion cost value calculation unit includes a first bit-distortion cost value calculation unit that calculates a bit-distortion cost value of each of the anchor pictures coded by the first per-prediction-mode encoding unit and a second bit-distortion cost value calculation unit that calculates a bit-distortion cost value of each non-anchor picture, and

the prediction mode determination unit includes a first prediction mode determination unit that determines the group prediction mode that minimizes the sum of the bit-distortion cost values among the view-direction group prediction modes and a second prediction mode determination unit that determines a group prediction mode that minimizes the sum of the bit-distortion cost values.

14. The apparatus of claim 11,

further comprising a down-sampling unit that down-samples the pictures of the input picture group,

wherein the per-prediction-mode encoding unit encodes the down-sampled pictures according to the group prediction modes.

15. The apparatus of claim 14,

wherein the prediction mode determination unit determines the group prediction mode that minimizes the sum of the bit-distortion cost values among the group prediction modes, and

the encoding unit generates multi-view image information encoded according to the determined group prediction mode.

16. A multi-view image encoding apparatus comprising:

a prediction mode adjustment unit that adjusts the group prediction mode used to code the pictures included in a picture group;

a bit-distortion cost value calculation unit that calculates a bit-distortion cost value of each of the pictures coded according to the group prediction mode adjusted by the prediction mode adjustment unit; and

a prediction mode determination unit that determines one group prediction mode among the group prediction modes in consideration of the bit-distortion cost values, wherein multi-view image information encoded according to the determined group prediction mode is generated.

17. (Deleted)