KR101386651B1

KR101386651B1 - Multi-View video encoding and decoding method and apparatus thereof

Info

Publication number: KR101386651B1
Application number: KR1020120101402A
Authority: KR
Inventors: 최병호; 김용환; 박지호
Original assignee: 전자부품연구원
Priority date: 2012-09-13
Filing date: 2012-09-13
Publication date: 2014-04-17
Also published as: KR20140035065A

Abstract

Disclosed is a technique for synthesizing an intermediate-view image using a plurality of multi-view images and depth image information for each view image. A multi-view video encoding method according to one aspect of the present invention includes: acquiring multi-view image information and depth information for each view image; generating virtual information for an intermediate view between a first view and a second view, using the image information and depth information of an arbitrary first view among the acquired multi-view images and the image information and depth information of the second view, which is the view closest to the first view; and encoding the virtual information.

Description

Multi-view video encoding and decoding method and apparatus thereof

The present invention relates to a method and apparatus for providing free-viewpoint video content in a multi-view video codec and, more particularly, to a method of synthesizing an intermediate-view image using a plurality of multi-view images and depth image information for each view image.

With the recent rapid development of multimedia processing and hardware technology, high-quality broadcasting services at HD (High-Definition) level and above have become widespread. Interest is also growing in 3D TV broadcasting that uses three-dimensional stereoscopic content to give consumers a better sense of realism and presence. 3D TV broadcasting is expected to attract the most attention in the next-generation broadcasting service market, following HDTV. In related work, the JVT (Joint Video Team of ISO/IEC JTC1/SC29/WG11 MPEG and ITU-T SG16 Q.6 VCEG) has completed the multi-view video coding (Multi-view Video Coding, H.264/AVC) standard for efficiently encoding images input from two or more cameras, and MPEG, the video standardization group of ISO/IEC, is working on a 3D video (3D-Video) standard to provide users with a variety of viewpoints and a strong sense of presence. The 3D video standard is intended to support generating the various virtual-view images a user may want, in addition to the input view images. A virtual (intermediate) view image can be generated in unlimited number through view interpolation using a depth map, which represents the distance between the camera and actual objects in the real world. Because the 3D video standard therefore needs to transmit only a small number of view images and their depth maps, it saves bandwidth and storage space compared with the existing multi-view video coding method.

In a multi-view video encoder of this kind, the depth map used for intermediate-view synthesis is encoded together with the input multi-view images, and the decoder synthesizes an intermediate-view image at the desired viewpoint using the decoded images and depth maps. In this case, a difference exists between the decoded image and the image at the viewpoint to be synthesized. Because the decoder has no information about this difference, intermediate-view synthesis has been difficult, and the quality of the synthesized intermediate-view image has also been a problem.

To solve the above problem, an object of the present invention is to provide a method in which the encoder generates and transmits in advance information about the occlusion regions and disocclusion regions that arise during intermediate-view synthesis, thereby reducing the computational load of the decoder and enabling reliable, high-quality intermediate-view images to be synthesized.

Another object of the present invention is to improve coding efficiency by reducing the bitrate needed to encode the occlusion-region and disocclusion-region information generated by the encoder.

The objects of the present invention are not limited to those mentioned above, and other objects not mentioned will be clearly understood by those skilled in the art from the following description.

To achieve the above objects, a multi-view video encoding method according to one aspect of the present invention includes: acquiring multi-view image information and depth information for each view image; generating virtual information for an intermediate view between a first view and a second view, using the image information and depth information of an arbitrary first view among the acquired multi-view images and the image information and depth information of the second view, which is the view closest to the first view; and encoding the virtual information.

In one embodiment of the present invention, generating the virtual information includes: projecting the image of the first view onto the image of the second view based on the depth information of the first view, to generate information about occlusion regions and disocclusion regions for the image of the first view; and projecting the image of the second view onto the image of the first view based on the depth information of the second view, to generate information about occlusion regions and disocclusion regions for the image of the second view.

In one embodiment of the present invention, generating the information about the occlusion and disocclusion regions includes searching the image of the first or second view in arbitrary coding units and generating, as a binary map, information on whether occlusion regions and disocclusion regions occur.

In another embodiment of the present invention, generating the information about the occlusion and disocclusion regions includes searching the image of the first or second view in arbitrary coding units and generating, as a map using 2 bits, information on whether an occlusion region or a disocclusion region occurs.

In one embodiment of the present invention, encoding the virtual information includes successively encoding, in raster scan order, the number of times the same digit occurs consecutively in the binary map.

In another embodiment of the present invention, encoding the virtual information includes: successively encoding the number of times the same digit occurs consecutively in the 2-bit map; and encoding an identifier for state transition.

In one embodiment of the present invention, the encoded virtual information is transmitted in either the supplementary information area or the mandatory information area of the transport bitstream, selected according to the type of application that plays back the multi-view image information in an H.264 or HEVC codec.

To achieve the above objects, a multi-view video decoding method according to another aspect of the present invention includes: decoding received multi-view image information; decoding virtual information to extract information about occlusion regions and disocclusion regions for each view image; and generating image information of an intermediate view for each view by referring to the decoded image information and the extracted information about the occlusion and disocclusion regions.

To achieve the above objects, a multi-view video encoding apparatus according to another aspect of the present invention includes: a receiver that receives multi-view image information and depth information for each view image from a plurality of cameras; a virtual information generator that generates virtual information for an intermediate view between a first view and a second view, using the image information and depth information of an arbitrary first view among the received multi-view images and the image information and depth information of the second view, which is the view closest to the first view; and a virtual information encoder that encodes the virtual information.

In one embodiment of the present invention, the virtual information generator projects the image of the first view onto the image of the second view based on the depth information of the first view to generate information about occlusion regions and disocclusion regions for the image of the first view, and projects the image of the second view onto the image of the first view based on the depth information of the second view to generate information about occlusion regions and disocclusion regions for the image of the second view.

In another embodiment of the present invention, the virtual information generator searches the image of the first or second view in arbitrary coding units and generates, as a binary map, information on whether occlusion regions and disocclusion regions occur.

In another embodiment of the present invention, the virtual information generator searches the image of the first or second view in arbitrary coding units and generates, as a map using 2 bits, information on whether an occlusion region or a disocclusion region occurs.

In one embodiment of the present invention, the virtual information encoder successively encodes, in raster scan order, the number of times the same digit occurs consecutively in the binary map.

In another embodiment of the present invention, the virtual information encoder successively encodes the number of times the same digit occurs consecutively in the 2-bit map and encodes an identifier for state transition.

To achieve the above objects, a multi-view video decoding apparatus according to another aspect of the present invention includes: an image information decoder that decodes received multi-view image information; a virtual information decoder that decodes virtual information to extract information about occlusion regions and disocclusion regions for each view image; and an intermediate-view image information generator that generates image information of an intermediate view for each view by referring to the decoded image information and the extracted information about the occlusion and disocclusion regions.

The multi-view video encoding method and decoding method according to an embodiment of the present invention may be recorded as a program for execution by a computer and stored on a computer-readable recording medium.

As described above, according to the present invention, the decoder receives the occlusion-region and disocclusion-region information generated by the encoder. When synthesizing an intermediate-view image, the decoder can therefore resolve the image overlap and hole problems caused by occlusion and disocclusion regions, which improves the quality and reliability of the output intermediate-view image.

FIG. 1 is a block diagram illustrating the configuration of a multi-view video encoder according to an embodiment of the present invention.
FIG. 2 is a flowchart illustrating a multi-view video encoding method according to an embodiment of the present invention.
FIG. 3 is an exemplary diagram illustrating one example of a method of encoding identifier information for state transition when encoding virtual information in the multi-view video encoding method according to an embodiment of the present invention.
FIG. 4 is a block diagram illustrating the configuration of a multi-view video decoder according to an embodiment of the present invention.
FIG. 5 is a flowchart illustrating a multi-view video decoding method according to an embodiment of the present invention.

The advantages and features of the present invention, and the methods of achieving them, will become apparent from the embodiments described in detail below together with the accompanying drawings. The present invention is not, however, limited to the embodiments disclosed below and may be implemented in various different forms. These embodiments are provided only so that the disclosure of the present invention is complete and fully conveys the scope of the invention to those of ordinary skill in the art to which the invention pertains; the invention is defined only by the scope of the claims. The terminology used in this specification is for describing the embodiments and is not intended to limit the invention. In this specification, the singular form includes the plural unless the text specifically states otherwise.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. In assigning reference numerals to the elements of each drawing, the same elements are given the same numerals wherever possible, even when they appear in different drawings. In describing the present invention, detailed descriptions of related well-known configurations or functions are omitted where they could obscure the gist of the invention.

The configuration of the present invention has been described in detail above through preferred embodiments, but those of ordinary skill in the art to which the invention pertains will understand that the invention may be practiced in specific forms other than those disclosed in this specification without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. The scope of protection of the present invention is indicated by the claims set out below rather than by the detailed description above, and all changes or modifications derived from the scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

Hereinafter, a multi-view video encoder according to an embodiment of the present invention is described with reference to FIGS. 1 and 2. FIG. 1 is a block diagram illustrating the configuration of a multi-view video encoder according to an embodiment of the present invention, and FIG. 2 is a flowchart illustrating a multi-view video encoding method according to an embodiment of the present invention.

The present invention is a method of providing video content that supports multi-view 3D TV or free-viewpoint TV services. The service can be provided using image information of one view and the associated depth information of that view, or using multi-view image information and multi-view depth information of two or more views.

That is, using the image information and depth information, the present invention supports the information provided in various video content services such as stereoscopic 3D TV, multi-view 3D TV, and free-viewpoint TV. The present invention generates virtual image information (occlusion-region information and disocclusion-region information) for each view image, using multi-view images acquired from a multi-view camera or the like and the depth information for each view image, and includes it in the bitstream transmitted to the decoder. This reduces the computation required when the decoder synthesizes an intermediate-view image from the decoded multi-view images, making it possible to synthesize reliable, high-quality images. The specific configuration of a multi-view encoder according to an embodiment of the present invention for providing such a method is described below.

Referring to FIG. 1, a multi-view video encoder according to an embodiment of the present invention includes a receiver 110, a depth information encoder 120, an image information encoder 130, a virtual information generator 140, a virtual information encoder 145, a multiplexer 150, a bitstream generator 160, and a transmitter 170.

The receiver 110 receives image information captured by at least one multi-view video camera, stereo camera, depth camera, or the like, that is, multi-view images and depth information for each view image (S210).

Multi-view images can be captured with a single camera that is moved between shots, or with several cameras placed at various positions. Images in several directions can also be acquired at once with a special camera for multi-view image acquisition, rather than an ordinary camera with a limited viewpoint.

Depth information expresses the distance between the camera and actual objects in the real world, at the same resolution as the captured image, using a fixed number of bits per pixel. It is generally represented as a map (depth map) of the same size as the corresponding captured image. A depth map is generally generated for every multi-view image and is expressed per pixel, so the depth resolution is the same as the image resolution.

The virtual information generator 140 generates virtual information for an intermediate view between a first view and a second view, using the image information and depth information of an arbitrary first view among the acquired multi-view images and the image information and depth information of the second view, which is the view closest to the first view (S220).

For example, suppose the receiver 110 has acquired images of views V0, V2, and V4, captured at the same time from different viewpoints, together with depth information for each view image, and the virtual information generator 140 uses them to generate virtual information for the intermediate views V1 and V3. To generate the virtual information for view V1, the virtual information generator 140 uses the V0 view image and its depth information together with the V2 view image and its depth information. Likewise, for view V3 it uses the V2 view image and its depth information together with the V4 view image and its depth information.
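The pairing in this example can be sketched in a few lines of Python. This is only an illustration: it assumes views are identified by integer indices along the camera baseline, and the function name `reference_pairs` is hypothetical rather than taken from the patent.

```python
def reference_pairs(transmitted_views):
    """Map each intermediate view index to its two nearest transmitted views.

    Assumption: views are numbered along the camera baseline, so the nearest
    transmitted neighbours of a missing index are the two consecutive
    transmitted views that enclose it.
    """
    ordered = sorted(transmitted_views)
    pairs = {}
    for left, right in zip(ordered, ordered[1:]):
        for middle in range(left + 1, right):
            pairs[middle] = (left, right)
    return pairs


# V0, V2, V4 are transmitted; V1 and V3 are the intermediate views.
print(reference_pairs([0, 2, 4]))  # {1: (0, 2), 3: (2, 4)}
```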

Specifically, when generating virtual information for view V1, the virtual information generator 140 projects the V0 view image onto the V2 view image based on the depth information of view V0. The projection can be performed in basic coding units. The displacement by which each coding unit is projected depends on the depth information, so the displacements are not all the same. Because the coding units are therefore not shifted uniformly by a single displacement but move by different displacements, regions can arise where different coding units are moved to the same position (hereinafter, occlusion regions), as can regions to which no coding unit is moved (disocclusion regions).
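To make the two region types concrete, the following sketch projects one scanline using per-pixel integer shifts and counts how many source pixels land on each target position. It is an illustration only: the text does not give the depth-to-displacement formula, so the `shifts` list is a hypothetical input assumed to have been derived from the depth map, and the label values 0, 1, and 2 are chosen here for readability.

```python
def classify_scanline(shifts):
    """Label each target position of a projected scanline.

    shifts[x] is the (hypothetical) integer displacement of source pixel x.
    Returns one label per target position:
      0 = exactly one source pixel landed here (no problem)
      1 = two or more source pixels landed here (occlusion)
      2 = no source pixel landed here (disocclusion)
    """
    width = len(shifts)
    hits = [0] * width
    for x, shift in enumerate(shifts):
        target = x + shift
        if 0 <= target < width:  # pixels projected outside the row are dropped
            hits[target] += 1

    labels = []
    for count in hits:
        if count >= 2:
            labels.append(1)
        elif count == 0:
            labels.append(2)
        else:
            labels.append(0)
    return labels


# Pixels 2 and 3 belong to a nearer object and shift by 2; the rest stay put.
print(classify_scanline([0, 0, 2, 2, 0, 0]))  # [0, 0, 2, 2, 1, 1]
```

In this toy row, positions 4 and 5 each receive two pixels (occlusion), while positions 2 and 3 receive none (disocclusion).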

Here, the coding unit that serves as the basic unit of the occlusion and disocclusion search may be a pixel, or more accurate virtual information may be collected through a finer search at half-pixel or quarter-pixel precision.

For example, the virtual information generator 140 searches pixel by pixel and determines occlusion and disocclusion regions based on the distances moved by an arbitrary pixel A and its nearest neighbor, pixel B. If pixels A and B move to the same position, or to positions within one integer pixel of each other, the virtual information generator 140 can determine that the region belongs to an occlusion region. Conversely, if the positions to which pixels A and B have moved are an integer pixel or more apart, it can determine that the region belongs to a disocclusion region.

The present invention is characterized in that, to improve the efficiency of intermediate-view synthesis at the decoder, the encoder generates and transmits reference information in advance. The reference information includes information about the occlusion and disocclusion regions that arise when adjacent views are projected for virtual intermediate-view synthesis, as described above.

The virtual information generated by the virtual information generator 140 is transmitted to the decoder. The position of the virtual information is defined so that the encoder and decoder can recognize it through a mutual convention or an identifier. The virtual information may be generated as a binary map indicating whether occlusion and disocclusion regions occur.

In one embodiment, the virtual information generator 140 does not distinguish between occlusion and disocclusion regions in the search result. It marks with '1' any region judged likely to cause problems during intermediate-view synthesis (an occlusion or disocclusion region) and marks with '0' any region with no problem, thereby generating a binary map.

The virtual information may also be generated as a 2-bit map indicating whether an occlusion region or a disocclusion region occurs. For example, to distinguish occlusion from disocclusion regions, a 2-bit map with the values '0', '1', and '2' is constructed, where '0' denotes no problem, '1' an occlusion region, and '2' a disocclusion region.
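The relationship between the two map formats can be shown with a short sketch. Assuming a per-unit classification is already available as the three values described above, the binary map of the other embodiment follows by collapsing both problem states to 1. The constant and function names are hypothetical.

```python
# State values as described in the text: 0 = no problem,
# 1 = occlusion region, 2 = disocclusion region.
NORMAL, OCCLUSION, DISOCCLUSION = 0, 1, 2


def to_binary_map(two_bit_map):
    """Collapse a three-state map into the binary 'problem / no problem' map."""
    return [0 if value == NORMAL else 1 for value in two_bit_map]


print(to_binary_map([NORMAL, OCCLUSION, DISOCCLUSION, NORMAL]))  # [0, 1, 1, 0]
```

The binary map is cheaper to signal but leaves the decoder to work out which kind of problem each flagged unit has; the 2-bit map carries that distinction explicitly.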

The virtual information encoder 145 encodes the generated virtual information (S230). Various methods can be used to minimize the number of bits transmitted. In one embodiment, when the virtual information is generated as a binary map indicating whether occlusion and disocclusion regions occur, only the number of consecutive '0's and the number of consecutive '1's are encoded, successively, in raster scan order.

Example) Binary map: 0000000000000000011111111111111110000000000011000000000

Encoding: 17 - 16 - 11 - 2 - ....
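The run-length scheme in this example can be sketched as follows. The sketch assumes the map is already flattened in raster scan order and that the first count always refers to '0' (a zero-length run is emitted if the map starts with '1'); that starting convention is an assumption, since the text does not say how the decoder learns the first symbol. The test string is built directly from the four run lengths quoted above rather than from the full example map.

```python
def rle_encode_binary(bits):
    """Run-length encode a string of '0'/'1' characters.

    Assumed convention: the first count is the number of leading '0's,
    and counts then alternate between '1' runs and '0' runs.
    """
    runs = []
    current = "0"
    count = 0
    for bit in bits:
        if bit == current:
            count += 1
        else:
            runs.append(count)
            current = bit
            count = 1
    runs.append(count)
    return runs


def rle_decode_binary(runs):
    """Inverse of rle_encode_binary under the same starting convention."""
    pieces = []
    symbol = "0"
    for length in runs:
        pieces.append(symbol * length)
        symbol = "1" if symbol == "0" else "0"
    return "".join(pieces)


prefix = "0" * 17 + "1" * 16 + "0" * 11 + "1" * 2
print(rle_encode_binary(prefix))  # [17, 16, 11, 2]
```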

In another embodiment, when the virtual information is generated as a 2-bit map indicating whether an occlusion region or a disocclusion region occurs, the runs of consecutive digits are encoded together with transition information (an identifier) among the three states 0, 1, and 2.

FIG. 3 is an exemplary diagram illustrating one example of a method of encoding identifier information for state transition when encoding virtual information in the multi-view video encoding method according to an embodiment of the present invention.

A method by which the virtual information encoder 145 encodes virtual information in an embodiment of the present invention will be described with reference to FIG. 3.

First, the virtual information encoder 145 encodes the run lengths. If, as in the preceding example, 17 consecutive 0s occur, 1 bit is used to encode 17, and the remaining 1 bit is used to encode identification information indicating whether the value following the 0s is 1 or 2.

The identification information is a rule agreed upon in advance between the encoder and the decoder. As shown in FIG. 3, the states 0, 1, and 2 are arranged in a circle, and the identification information may indicate whether the state changes in the clockwise or the counterclockwise direction. For example, 0 may indicate a counterclockwise state transition and 1 a clockwise state transition.
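The three-state scheme can be sketched in the same way. The code below is an illustrative assumption rather than the method of FIG. 3 itself: it assumes the clockwise order is 0, 1, 2, 0, that decoding starts in state 0, and that the last run carries no direction bit. The names `encode_ternary_map` and `decode_ternary_map` are hypothetical. Only the meaning of the bit (0 for counterclockwise, 1 for clockwise) comes from the text.

```python
from itertools import groupby

COUNTERCLOCKWISE = 0  # bit value given in the description
CLOCKWISE = 1         # bit value given in the description


def encode_ternary_map(values):
    """Encode a sequence of states 0/1/2 as (run_length, direction_bit) pairs.

    Assumptions: clockwise means 0 -> 1 -> 2 -> 0, decoding starts in
    state 0, and the final run has direction_bit None.
    """
    runs = [(state, len(list(group))) for state, group in groupby(values)]
    if runs and runs[0][0] != 0:
        runs.insert(0, (0, 0))  # zero-length run of the start state
    encoded = []
    for index, (state, length) in enumerate(runs):
        if index + 1 < len(runs):
            next_state = runs[index + 1][0]
            clockwise = next_state == (state + 1) % 3
            bit = CLOCKWISE if clockwise else COUNTERCLOCKWISE
        else:
            bit = None
        encoded.append((length, bit))
    return encoded


def decode_ternary_map(encoded):
    """Invert encode_ternary_map by walking around the state circle."""
    values = []
    state = 0
    for length, bit in encoded:
        values.extend([state] * length)
        if bit == CLOCKWISE:
            state = (state + 1) % 3
        elif bit == COUNTERCLOCKWISE:
            state = (state - 1) % 3
    return values
```

Because only two other states are reachable from any state, a single direction bit is enough to identify the next state, which is why the identifier costs one bit per run.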

The depth information encoder 120 encodes the depth information of each view image acquired by the receiver 110 (S230).

The depth information may be encoded using H.264 (MPEG-4 Part 10 Advanced Video Coding), a DCT (Discrete Cosine Transform)-based video encoding method used for encoding natural images. A DCT-based video encoding method is a block-based transform method that is efficient for images with low spatial frequency (images with high correlation between neighboring pixels) but inefficient for images with high spatial frequency (images with large changes between neighboring pixels), such as object boundaries. In particular, when encoding is performed with a DCT-based video encoding method in a low-bit-rate environment, the loss of high-frequency components caused by quantization prevents object boundaries from being represented accurately and blurs the image. To address this, in an embodiment of the present invention, the depth information may be encoded hierarchically according to its characteristics, that is, according to its importance. In this case, the depth information may be encoded using a spatial scalable or SNR (Signal-to-Noise Ratio) scalable algorithm.
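The loss of boundary detail can be illustrated with a small pure-Python sketch. It applies an 8-point orthonormal DCT-II to a flat block and to a step edge, discards all but the two lowest-frequency coefficients as a crude stand-in for coarse quantization (an assumption made for illustration; H.264 actually uses an integer transform with scalar quantization), and compares the reconstruction error. The function names are hypothetical.

```python
import math


def _scale(k, n):
    """Orthonormal DCT-II scale factor for coefficient k of an n-point block."""
    return math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)


def dct(block):
    """Orthonormal 1-D DCT-II."""
    n = len(block)
    return [
        _scale(k, n) * sum(
            x * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for i, x in enumerate(block)
        )
        for k in range(n)
    ]


def idct(coeffs):
    """Inverse of dct (orthonormal DCT-III)."""
    n = len(coeffs)
    return [
        sum(
            _scale(k, n) * c * math.cos(math.pi * (2 * i + 1) * k / (2 * n))
            for k, c in enumerate(coeffs)
        )
        for i in range(n)
    ]


def truncation_error(block, keep):
    """Largest sample error after zeroing all but the first `keep` coefficients."""
    coeffs = dct(block)
    truncated = coeffs[:keep] + [0.0] * (len(coeffs) - keep)
    restored = idct(truncated)
    return max(abs(a - b) for a, b in zip(block, restored))


flat_error = truncation_error([100.0] * 8, keep=2)               # smooth region
edge_error = truncation_error([0.0] * 4 + [255.0] * 4, keep=2)   # object boundary
```

The flat block is reconstructed essentially exactly, while the step edge shows a large error, which is the behavior the paragraph above attributes to depth maps with sharp object boundaries.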

The image information encoder 130 encodes the multi-view images acquired by the receiver 110 (S230).

In one embodiment, the multi-view images may be encoded on the basis of correlation information along the time axis and the spatial axis (view).

That is, motion information along the time axis and disparity information along the spatial axis are predicted, and encoding is performed with motion and disparity compensation. For compatibility with existing mono image encoding technology, one view is designated the base view and is encoded using only motion information, without using information from other views along the spatial axis. Only the image information of the remaining views is encoded using information along the spatial axis.

For example, H.264 encoding, a DCT-based encoding method, may be applied to the base view. H.264 encoding provides an intra mode, which performs only intra prediction, and an inter mode, which performs both intra prediction and inter prediction. H.264 encoding is generally performed in units of macroblocks, the basic unit of encoding: a prediction block is first generated for the input macroblock, the difference between the input macroblock and the prediction block is then computed, and that difference information is encoded.
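The predict-then-encode-the-difference step can be sketched as follows. This is a simplified illustration with hypothetical function names; it omits the transform, quantization, and entropy coding that a real H.264 encoder applies to the residual, and uses a tiny 2x2 block instead of a 16x16 macroblock.

```python
def residual_block(current, prediction):
    """Difference between the input block and its prediction, sample by sample."""
    return [
        [c - p for c, p in zip(current_row, prediction_row)]
        for current_row, prediction_row in zip(current, prediction)
    ]


def reconstruct_block(prediction, residual):
    """Decoder side: add the residual back onto the same prediction."""
    return [
        [p + r for p, r in zip(prediction_row, residual_row)]
        for prediction_row, residual_row in zip(prediction, residual)
    ]


current = [[10, 12], [11, 13]]
prediction = [[10, 10], [10, 10]]   # e.g. from intra or inter prediction
residual = residual_block(current, prediction)
```

When the prediction is good, the residual values are small and cheaper to encode than the original samples, which is the motivation for encoding the difference instead of the block itself.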

The encoded depth information, the encoded image information, and the encoded virtual information are multiplexed by the multiplexer 150 (S240). The multiplexed encoded depth information, encoded image information, and encoded virtual information are then generated as respective bitstreams by the bitstream generator 160 (S250) and transmitted to the decoder side through the transmitter 170 (S260).
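As a rough sketch of what multiplexing three encoded streams can look like, the following Python code interleaves tagged, length-prefixed packets and splits them apart again. The packet layout (a 1-byte stream identifier followed by a 4-byte big-endian length), the stream identifier values, and the function names are all assumptions for illustration; the patent does not define a container format.

```python
import struct

# Hypothetical stream identifiers, not taken from the patent.
STREAM_DEPTH = 0
STREAM_IMAGE = 1
STREAM_VIRTUAL = 2

_HEADER = ">BI"  # 1-byte stream id, 4-byte big-endian payload length
_HEADER_SIZE = struct.calcsize(_HEADER)


def multiplex(packets):
    """Concatenate (stream_id, payload) pairs into one byte string."""
    out = bytearray()
    for stream_id, payload in packets:
        out += struct.pack(_HEADER, stream_id, len(payload))
        out += payload
    return bytes(out)


def demultiplex(buffer):
    """Split a multiplexed byte string back into per-stream payloads."""
    streams = {}
    offset = 0
    while offset < len(buffer):
        stream_id, length = struct.unpack_from(_HEADER, buffer, offset)
        offset += _HEADER_SIZE
        chunk = buffer[offset:offset + length]
        streams.setdefault(stream_id, bytearray()).extend(chunk)
        offset += length
    return {stream_id: bytes(data) for stream_id, data in streams.items()}
```

A decoder-side demultiplexer would call `demultiplex` on the received bytes and hand each payload to the matching depth, image, or virtual information decoder.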

Here, the virtual information on the occlusion and non-occlusion areas is included in an additional information area such as Supplemental Enhancement Information (SEI) so as to remain compatible with codecs such as H.264 or HEVC. Depending on the nature of the application that plays back the multi-view images, the virtual information may instead be included and transmitted in an essential information area such as the Sequence Parameter, Picture Parameter, or Slice Header.

Hereinafter, a multi-view video decoding method and apparatus according to another embodiment of the present invention will be described with reference to FIGS. 4 and 5. FIG. 4 is a block diagram illustrating the configuration of a multi-view image decoder according to an embodiment of the present invention, and FIG. 5 is a flowchart illustrating a multi-view image decoding method according to an embodiment of the present invention.

As shown in FIG. 4, the multi-view image decoder according to another embodiment of the present invention includes a demultiplexer 410, an image information decoder 430, a depth information decoder 420, a virtual information decoder 440, and an intermediate view image generator 450.

First, the demultiplexer 410 receives the encoded depth information, image information, and virtual information and demultiplexes them (S510).

The depth information decoder 420 decodes the demultiplexed depth information.

The image information decoder 430 decodes the demultiplexed image information. Specifically, the image information decoder 430 decodes the bitstream that was encoded using correlation information along the temporal and spatial axes, and in doing so may use the decoded depth information of the corresponding view and the already decoded image information of other views.

The virtual information decoder 440 decodes the demultiplexed virtual information. Specifically, the virtual information decoder 440 decodes the virtual information and extracts information on the occlusion area and the non-occlusion (disocclusion) area of each view image (S520).

The intermediate view image generator 450 generates image information of the intermediate view for each of the views by referring to the decoded image information and the extracted information on the occlusion and non-occlusion areas (S530).
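The text does not spell out how the intermediate view image generator 450 combines the views, so the following is only one plausible sketch under stated assumptions: both neighboring views have already been warped to the intermediate viewpoint, each comes with a per-pixel flag marking samples that are unreliable there (derived from the decoded occlusion information), and valid samples are averaged. The function name `synthesize_row` and the averaging rule are hypothetical.

```python
def synthesize_row(left, right, left_invalid, right_invalid):
    """Merge one row of two warped views into an intermediate-view row.

    Assumptions: inputs are already warped to the intermediate viewpoint,
    and a flag of 1 marks a sample that is unusable in that view.
    Returns None for samples that neither view can supply (remaining holes).
    """
    merged = []
    for l, r, l_bad, r_bad in zip(left, right, left_invalid, right_invalid):
        if not l_bad and not r_bad:
            merged.append((l + r) / 2)   # both views see the point: blend
        elif not l_bad:
            merged.append(l)             # only the left view sees it
        elif not r_bad:
            merged.append(r)             # only the right view sees it
        else:
            merged.append(None)          # hole, to be filled by other means
    return merged
```

The point of transmitting the occlusion maps is visible here: the decoder can choose the correct source view per pixel without re-deriving visibility from the depth maps.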

Meanwhile, the encoding and decoding methods according to the present invention may be implemented in the form of program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and constructed for the present invention or may be known and available to those skilled in the art of computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules in order to perform the operations of the present invention, and vice versa.

Although the configuration of the present invention has been described in detail above through preferred embodiments, those of ordinary skill in the art to which the present invention pertains will understand that the invention may be embodied in specific forms other than those disclosed herein without changing its technical idea or essential features. The embodiments described above should therefore be understood as illustrative in all respects and not restrictive. The scope of protection of the present invention is defined by the claims set forth below rather than by the detailed description above, and all changes or modifications derived from the scope of the claims and their equivalents should be construed as falling within the scope of the present invention.

Claims

delete

Obtaining multi-view image information and depth information on each view image;
Generating virtual information on an intermediate viewpoint between a first viewpoint and a second viewpoint by using image information and depth information of an arbitrary first viewpoint among the acquired multiview images, and image information and depth information of the second viewpoint, which is the viewpoint closest to the first viewpoint; And
Encoding the virtual information;
Wherein the generating of the virtual information comprises:
Projecting the image of the first view onto the image of the second view based on the depth information of the first view to generate information about the occluded and non-occluded areas of the image of the first view; And
Projecting the image of the second view onto the image of the first view based on the depth information of the second view to generate information about the occluded and non-occluded areas of the image of the second view.
Multiview video encoding method.

The method of claim 2, wherein the generating of the information about the occluded area and the non-occluded area comprises:
Searching the image of the first or second view in an arbitrary coding unit and generating, as a binary map, information indicating whether the occluded area and the non-occluded area occur.
Multiview video encoding method.

The method of claim 2, wherein the generating of the information about the occluded area and the non-occluded area comprises:
Searching the image of the first or second view in an arbitrary coding unit and generating, as a map using 2 bits, information indicating whether the occluded area or the non-occluded area occurs.
Multiview video encoding method.

The method of claim 3, wherein the encoding of the virtual information comprises:
Successively encoding, in raster scan order, the number of consecutive occurrences of the same value in the binary map.
Multiview video encoding method.

The method of claim 4, wherein the encoding of the virtual information comprises:
Successively encoding the number of consecutive occurrences of the same value in the map using the 2 bits; And
Encoding an identifier for state transition.
Multiview video encoding method.

The method of claim 2, wherein the encoded virtual information,
Is selectively included in the additional information area or the essential information area of the transmitted bitstream, depending on the type of application that reproduces the multi-view video information in an H.264 or HEVC codec, and is then transmitted.
Multiview video encoding method.

Decoding the received multiview image information;
Extracting information about an occlusion area and a non-occlusion area for each viewpoint image by decoding the virtual information of claim 2;
Generating image information of an intermediate view for each of the viewpoints by referring to the decoded image information and the extracted information about the occlusion area and the non-occlusion area.
Multi-view video decoding method.

delete

A receiver configured to receive multi-view image information and depth information of each view image from a plurality of cameras;
A virtual information generator for generating virtual information on an intermediate viewpoint between a first viewpoint and a second viewpoint by using image information and depth information of an arbitrary first viewpoint among the received multiview images, and image information and depth information of the second viewpoint, which is the viewpoint closest to the first viewpoint; And
A virtual information encoder for encoding the virtual information,
Wherein the virtual information generator,
Projects the image of the first view onto the image of the second view based on the depth information of the first view to generate information about the occluded and non-occluded areas of the image of the first view, and
Projecting the image of the second view onto the image of the first view based on the depth information of the second view to generate information about the occluded and non-occluded areas of the image of the second view
Multiview video encoding device.

The apparatus of claim 10, wherein the virtual information generator,
Searches the image of the first or second view in an arbitrary coding unit and generates, as a binary map, information indicating whether the occluded area and the non-occluded area occur.
Multiview video encoding device.

The apparatus of claim 10, wherein the virtual information generator,
Searches the image of the first or second view in an arbitrary coding unit and generates, as a map using 2 bits, information indicating whether the occluded area or the non-occluded area occurs.
Multiview video encoding device.

The apparatus of claim 11, wherein the virtual information encoder,
Successively encodes, in raster scan order, the number of consecutive occurrences of the same value in the binary map.
Multiview video encoding device.

The apparatus of claim 12, wherein the virtual information encoder,
Successively encodes the number of consecutive occurrences of the same value in the map using the 2 bits, and encodes an identifier for state transition.
Multiview video encoding device.

A computer-readable recording medium having recorded thereon a program for executing the method of claim 2.