KR102091860B1

KR102091860B1 - Method and apparatus for image encoding

Info

Publication number: KR102091860B1
Application number: KR1020130013149A
Authority: KR
Inventors: 최장원; 김용구; 최윤식
Original assignee: 연세대학교 산학협력단
Priority date: 2012-10-19
Filing date: 2013-02-06
Publication date: 2020-03-20
Also published as: KR20140051035A

Abstract

The present invention relates to an image encoding method and, more specifically, provides an encoding method comprising: obtaining a depth map and a color image using depth image-based rendering; classifying the depth map into a background depth texture and a foreground depth texture; and predicting the depth of a hole region after excluding the foreground depth texture, whereby image quality in the hole-filling process can be improved.

Description

Method and apparatus for image encoding

The present invention relates to an image encoding method and apparatus.

Three-dimensional (3D) video processing is a core technology for next-generation information and communication services, and a cutting-edge field in which demand and development competition have intensified as society moves toward an information industry. It is essential for providing high-quality video services in multimedia applications, and today its applications extend beyond information and communication to broadcasting, medicine, education (or training), the military, games, animation, and virtual reality. 3D video processing has also become a core base technology for the next-generation realistic 3D stereoscopic multimedia communication that many fields require, and it is being actively researched, mainly in developed countries.

In general, 3D video can be defined from two viewpoints. First, it can be defined as video to which depth information is applied so that the user perceives part of the image as protruding from the screen. Second, it can be defined as video that offers the user multiple viewpoints, from which the user perceives realism (that is, a stereoscopic effect). Depending on the acquisition method, depth impression, and display method, 3D video can be classified as binocular, multi-ocular, integral photography (IP), multi-view (omni, panorama), hologram, and so on. Methods of representing 3D video fall broadly into image-based representation (Image-Based Reconstruction) and mesh-based representation (Mesh-Based Representation).

Recently, depth image-based rendering (DIBR) has attracted attention as a way of representing such 3D video. DIBR creates scenes at other viewpoints from reference images that carry per-pixel information such as depth or difference angle. DIBR readily renders shapes that are complex and difficult to express with a 3D model, allows signal processing methods such as ordinary image filtering to be applied, and can produce high-quality 3D video. To this end, DIBR uses a depth image (or depth map) and a texture image (or color image) acquired through a depth camera and a multi-view camera. The depth image in particular is used to express the 3D model more realistically, that is, to generate 3D video with a stronger stereoscopic effect.

In short, DIBR generates an image of a different viewpoint from a color image of a given viewpoint and its corresponding depth map.

However, the image generated by DIBR contains empty areas, that is, holes.

FIG. 1 shows a hole region generated during DIBR. Because holes strongly affect the subjective quality of a stereoscopic image, a technique for filling them at high quality (hole filling) is required. However, the color information for the hole region does not exist in the original image before DIBR, so filling holes at high quality is not easy.
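As a purely illustrative sketch, and not the rendering procedure of this document, the following Python code forward-warps a single image row to a virtual right view to show why holes appear. The function name `warp_row`, the per-pixel integer `disparities`, and the convention that nearer pixels have larger disparity and shift further left are all assumptions made for the example.

```python
def warp_row(colors, disparities, hole=None):
    """Forward-warp one row to a virtual right view (illustrative only).

    Assumption: each pixel moves left by its integer disparity, and a
    larger disparity means the pixel is nearer to the camera.
    """
    width = len(colors)
    out = [hole] * width          # positions nothing maps to stay as holes
    out_disp = [-1] * width       # disparity of the pixel currently stored
    for x in range(width):
        d = disparities[x]
        tx = x - d                # target column in the virtual view
        # Keep the nearest pixel when several land on the same column.
        if 0 <= tx < width and d > out_disp[tx]:
            out[tx] = colors[x]
            out_disp[tx] = d
    return out


# "F" marks a foreground object, "B" the background.
row = ["B", "B", "B", "F", "F", "B", "B", "B"]
disp = [0, 0, 0, 2, 2, 0, 0, 0]
warped = warp_row(row, disp)
print(warped)
```

In this toy row the foreground shifts two columns to the left, and the two columns it vacated receive no pixel at all; those positions correspond to the hole region that the rest of the description sets out to fill.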

Among prior studies that consider the depth of the hole, there is a method that predicts the hole depth by applying an existing image inpainting technique.

FIG. 2 shows the color image and depth map generated after conventional DIBR, and the depth prediction for the hole region.

Referring to FIG. 2, during DIBR the color image and its depth map are rendered together, so that a depth map (see FIG. 2(b)) is obtained in addition to the color image of the generated viewpoint (see FIG. 2(a)).

The depth is then predicted to fill the holes of the generated depth map (see FIG. 2(c)). An inpainting algorithm based on the Navier-Stokes and fluid dynamics equations can be applied for this depth prediction.

However, although the hole region is a background region, the conventional depth prediction method applies an existing inpainting algorithm directly to the holes of the depth map. As a result, foreground and background are mixed in the prediction, as in the DIBR hole-region depth prediction result shown in FIG. 3.

In addition, the inpainting algorithm based on the Navier-Stokes and fluid dynamics equations applies inpainting pixel by pixel, so it cannot make use of the surrounding texture information.

Such inaccurate depth prediction for the hole region does not improve the quality of the generated image by assisting the hole filling of the color image; rather, it lowers the efficiency of the algorithm.

The present invention has been devised to solve the above problems. In particular, its object is to provide an image encoding method and apparatus that improve image quality by performing hole filling so that foreground and background are not mixed when the depth of the DIBR hole region is predicted.

A further object is to provide an image encoding method and apparatus that improve the efficiency of the algorithm by predicting the depth of the hole region using the surrounding texture information.

One aspect of the present invention, devised to achieve the above objects, provides an image encoding method comprising: obtaining a depth map and a color image using depth image-based rendering; classifying the depth map into a background depth texture and a foreground depth texture; and predicting the depth of a hole region after excluding the foreground depth texture.

Here, the depth prediction preferably uses an inpainting algorithm, and in particular the Navier-Stokes and fluid dynamics equations.

The classifying step preferably calculates difference value I between a first pixel located at the boundary within a first region and a second pixel located at the boundary within a second region abutting the first region, and difference value II between a third pixel located at the boundary within a third region abutting one side of the hole region and a fourth pixel located at the boundary within a fourth region abutting the other side of the hole region, and makes the determination based on difference value III between difference value I and difference value II.

The classifying step may also be configured so that the first to fourth pixels each comprise a plurality of mutually corresponding pixels, with the determination based on cumulative sum difference values I, II, and III, each being the sum of the corresponding difference values.

Here, the first region is preferably the background depth texture.

Preferably, the classifying step determines the fourth region to be the foreground depth texture when cumulative sum difference value III is large, and to be the background depth texture when cumulative sum difference value III is small.

Meanwhile, the plurality of first to fourth pixels are preferably selected from the most closely adjacent areas of the first to fourth regions.

According to the present invention, the depth of the DIBR hole region is predicted using only the background depth texture, with the foreground depth texture excluded, which provides more accurate auxiliary depth information when the holes of the color image are filled.

In addition, according to the present invention, using the surrounding texture information makes it possible to provide a more accurate, higher-quality color image.

FIG. 1 shows a hole region generated during DIBR.
FIG. 2 shows the color image and depth map generated after DIBR, and the depth prediction for the hole region.
FIG. 3 shows a typical DIBR hole-region depth prediction result.
FIG. 4 is a flowchart illustrating an image encoding method according to an embodiment of the present invention.
FIG. 5 illustrates a depth texture separation method used in an image encoding method according to an embodiment of the present invention.
FIG. 6 illustrates a hole depth prediction method used in an image encoding method according to an embodiment of the present invention.
FIG. 7 shows a result image of the hole depth prediction method used in an image encoding method according to an embodiment of the present invention.
FIG. 8 is a block diagram illustrating an image encoding apparatus according to an embodiment of the present invention.

The present invention may be modified in various ways and may have various embodiments; specific embodiments are illustrated in the drawings and described in detail below. This is not intended to limit the present invention to specific embodiments, and the invention should be understood to include all modifications, equivalents, and substitutes that fall within its spirit and technical scope. In describing the drawings, similar reference numerals are used for similar components.

Terms such as first and second may be used to describe various components, but the components should not be limited by these terms. The terms are used only to distinguish one component from another. For example, without departing from the scope of the present invention, a first component may be referred to as a second component, and similarly a second component may be referred to as a first component. The term "and/or" includes any combination of a plurality of related listed items, or any one of them.

When a component is said to be "connected" or "coupled" to another component, it may be directly connected or coupled to that other component, but it should be understood that other components may exist in between. In contrast, when a component is said to be "directly connected" or "directly coupled" to another component, it should be understood that no other component exists in between.

The terms used in this application are used only to describe specific embodiments and are not intended to limit the present invention. Singular expressions include plural expressions unless the context clearly indicates otherwise. In this application, terms such as "include" or "have" specify the presence of the features, numbers, steps, operations, components, parts, or combinations thereof described in the specification, and should be understood not to exclude in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Hereinafter, preferred embodiments of the present invention are described in more detail with reference to the accompanying drawings. The same reference numerals are used for the same components in the drawings, and duplicate descriptions of the same components are omitted.

FIG. 4 is a flowchart illustrating an image encoding method according to an embodiment of the present invention.

Referring to FIG. 4, an image encoding method according to an embodiment of the present invention includes obtaining a depth map and a color image, dividing the depth map into background and foreground, and predicting the depth of the hole region.

First, a depth map and a color image are obtained using depth image-based rendering (S10). The color image is the color image at a given viewpoint (for example, the left-view image). The depth map carries the information needed to generate an image of a different viewpoint, and includes a background depth texture located toward the back of the screen and a foreground depth texture located toward the front of the screen.

Next, the generated depth map is classified into a background depth texture and a foreground depth texture (S20). The classification method is described in detail later.

Then, the depth of the hole region is predicted after the foreground depth texture is excluded (S30). Predicting the depth of the hole region from the background depth texture alone, with the foreground depth texture excluded, allows the hole to be filled at high quality.

FIG. 5 illustrates a depth texture separation method used in an image encoding method according to an embodiment of the present invention. The image encoding method described below is performed by an image encoding apparatus, in particular by a processor within the apparatus.

Referring to FIG. 5, an image encoding method according to an embodiment of the present invention includes obtaining a depth map and a color image using depth image-based rendering, and classifying the depth map into a background depth texture and a foreground depth texture.

First, as shown in FIG. 2, when the processor of the image encoding apparatus generates a color image of another viewpoint through DIBR, it also generates the depth map by rendering. (This description assumes that a stereoscopic image of the right viewpoint is generated from a left color image and its depth map.)

Before predicting the holes of the generated depth map, the processor of this embodiment classifies the depth texture around each hole into a foreground texture and a background texture. The classifying step is described in detail below with reference to FIG. 5.

FIG. 5(a) shows the case in which the depth texture is background, and FIG. 5(b) the case in which it is foreground. The present invention uses a depth layer separation technique based on pixel differences computed at the edges of the hole region (4). The processor calculates difference value I between a first pixel (p), located at the boundary within a first region of the original image (the object to the left of the boundary; 1 in the left picture of FIG. 5(a)), and a second pixel (q), located in a second region adjacent to or abutting the first region (the object to be classified; 3 in the right picture of FIG. 5(a)). The processor also calculates difference value II between a third pixel (p'), located at the boundary within a third region abutting one side of the hole region in the predicted image (the object to the left of the hole; 1 in the right picture of FIG. 5(a)), and a fourth pixel (q'), located within a fourth region abutting the other side (the object to be classified, at the right boundary of the hole; 3 in the right picture of FIG. 5(a)). Using difference values I and II, the processor then determines whether the object to be classified (3 in FIG. 5(a)) is foreground or background.

That is, the processor calculates the differences x = (p - q) and y = (p' - q') between the pixels p and p', located in the first and third regions of the object to the left of the hole region (4), and the pixels q and q', located in the second and fourth regions of the object at the right boundary, and uses x and y to determine whether the depth texture of the object to the right of the hole is foreground or background. In particular, when x - y is larger than a predetermined reference value, the right depth texture is determined to be foreground; when the difference is smaller, it is determined to be background. Here, the first region is a background depth texture.
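The single-pixel decision rule described above can be sketched in Python as follows. This is only a minimal illustration: the function name `classify_right_texture`, the sample depth values, and the threshold of 30 are hypothetical, and the example assumes that nearer pixels have larger depth values, which the description does not state.

```python
def classify_right_texture(p, q, p_prime, q_prime, threshold):
    """Classify the texture to the right of a hole (illustrative sketch).

    p, q             : depth values on either side of the boundary in the
                       original image
    p_prime, q_prime : depth values on either side of the hole in the
                       rendered image
    threshold        : hypothetical reference value chosen by the caller
    """
    x = p - q                   # difference value I
    y = p_prime - q_prime       # difference value II
    return "foreground" if (x - y) > threshold else "background"


# Same object on the right in both images: x and y agree, so x - y is small.
same_object = classify_right_texture(50, 52, 50, 52, threshold=30)

# A different, nearer object sits to the right of the hole: x - y is large.
other_object = classify_right_texture(50, 52, 50, 200, threshold=30)

print(same_object, other_object)
```

With the sample values, the first call gives x - y = 0 and the second gives x - y = 148, so only the second exceeds the hypothetical threshold.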

In this embodiment, the first region in the left picture of FIG. 5(a) and the third region in the right picture of FIG. 5(a) are corresponding regions of what is actually the same object. They serve as the reference for predicting the depth value of the right object, and their pixel values may differ depending on the viewpoint.

A pixel located at the boundary is a pixel within a predetermined distance of the boundary (which may be specified, for example, as a number of pixels). Because operating on every boundary pixel of the object would be computationally burdensome, such pixels may be defined as pixels selected at a predetermined regular interval, that is, as a subset of the object's pixels in the boundary area.

The pixels p, q, p', and q' are preferably selected from the most closely adjacent band (area) of each region, and in equal, mutually corresponding numbers. The pixels p, q, p', and q' may each be provided in plurality so as to correspond to one another, and the cumulative sum of the individual difference values may be used. Because the texture is then determined from more pixels, the result is more reliable.

In the classifying step, when the cumulative sum of (x - y) is greater than a preset threshold, the fourth region is determined to be the foreground depth texture; when the cumulative sum of (x - y) is smaller than the threshold, the fourth region is determined to be the background depth texture.
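The cumulative variant can be sketched in the same way. The following Python code is an illustration under stated assumptions rather than the claimed implementation: the function name `classify_by_cumulative_sum`, the sample values, and the threshold of 90 are hypothetical, and larger depth values are again assumed to mean nearer pixels.

```python
def classify_by_cumulative_sum(ps, qs, p_primes, q_primes, threshold):
    """Classify the region right of the hole from several pixel quadruples.

    Each index i supplies one corresponding set (p, q, p', q').
    The threshold is a hypothetical value chosen by the caller.
    """
    if not (len(ps) == len(qs) == len(p_primes) == len(q_primes)):
        raise ValueError("pixel lists must have equal, corresponding lengths")
    # Accumulate x - y = (p - q) - (p' - q') over all corresponding pixels.
    total = sum(
        (p - q) - (pp - qp)
        for p, q, pp, qp in zip(ps, qs, p_primes, q_primes)
    )
    return "foreground" if total > threshold else "background"


ps, qs = [50, 51, 49], [52, 52, 53]
p_primes = [50, 50, 49]

background_case = classify_by_cumulative_sum(ps, qs, p_primes, [52, 53, 53], 90)
foreground_case = classify_by_cumulative_sum(ps, qs, p_primes, [200, 198, 201], 90)
print(background_case, foreground_case)
```

Summing over several pixels means a single noisy depth sample shifts the total only slightly, which matches the reliability argument given above.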

Regarding the selection of p, q, p', and q', the processor in this embodiment may select q' according to a predetermined criterion in the image in which holes have occurred as a result of DIBR. The processor may select q' in a predetermined number or at a predetermined interval along the boundary between the right side of the hole region created by DIBR and the depth texture. Next, the processor determines as p' the pixels of the object on the opposite side of the hole from the object containing q' that are symmetric to, or face, the positions of q'. After determining p' and q', the processor may determine p and q through an inverse DIBR transformation. The way of selecting p, q, p', and q' from among the many pixels in an object is not limited to the method described above, and a person skilled in the art may select them in other ways.

In the case of FIG. 5(b) as well, the processor determines whether the fourth region (2 in FIG. 5(b)) is background or foreground in the same manner as described for FIG. 5(a). Taking the region where the hole occurred as the reference, the processor first reads the values of the pixels (q') in the right boundary area of the hole, the pixels (p') to the left of the hole, and the corresponding pixels in the original image to the left of the boundary (p) and to the right of the boundary (q), and calculates x = (p - q) and y = (p' - q').

In FIG. 5(b), p and p' in the first and third regions (1) of the left object, which serves as the reference region, actually belong to the same object, so their values stay at a roughly constant level; q and q', however, belong to different objects, so their values do not stay constant. The difference between x and y, (p - p') - (q - q'), is therefore relatively large compared with FIG. 5(a), and when it exceeds the predetermined reference value, 2 in FIG. 5(b) may be classified as foreground.

FIG. 6 illustrates a hole depth prediction method used in an image encoding method according to an embodiment of the present invention.

The depth prediction is performed with an inpainting algorithm after the background depth texture and the foreground depth texture have been separated in the step of FIG. 5 described above.

Once all the depth textures around the hole have been classified, a patch-based inpainting algorithm is applied to the hole region, with the foreground depth texture excluded from the inpainting (FIG. 6(b)). FIG. 6(a) shows the depth map before prediction, and FIG. 6(c) shows the result of applying the depth prediction of the present invention.
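The effect of excluding foreground pixels can be illustrated on a single depth row. The sketch below deliberately replaces the patch-based inpainting with a much simpler nearest-background fill, so it is not the algorithm of this embodiment; the function name `fill_holes_from_background`, the `is_foreground` mask, and the sample values are assumptions made for the example.

```python
HOLE = None  # marker for a missing depth value


def fill_holes_from_background(depth_row, is_foreground):
    """Fill each hole with the nearest known background depth value.

    Simplified stand-in for patch-based inpainting: foreground pixels
    (is_foreground[j] is True) are never used as a source.
    """
    n = len(depth_row)
    filled = list(depth_row)
    for i, value in enumerate(depth_row):
        if value is not HOLE:
            continue
        for dist in range(1, n):
            # Look left first, then right, at the current distance.
            candidates = [j for j in (i - dist, i + dist) if 0 <= j < n]
            usable = [
                j for j in candidates
                if depth_row[j] is not HOLE and not is_foreground[j]
            ]
            if usable:
                filled[i] = depth_row[usable[0]]
                break
    return filled


depth_row = [200, 200, HOLE, HOLE, 40, 40]        # 200: near object, 40: far
is_foreground = [True, True, False, False, False, False]
result = fill_holes_from_background(depth_row, is_foreground)
print(result)
```

In this toy row the hole takes the far value 40 from the background side even though the foreground pixels are closer to it, so the two layers are not mixed.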

The inpainting algorithm may broadly comprise band inpainting, which fills the inpainting region, and seamless cloning. Band inpainting defines a target band of a certain thickness along the boundary of the inpainting region; computes the difference between the target band and source bands, which have the same shape and size as the target band and are centered on each pixel outside the inpainting region; and copies the values of the source band region with the smallest difference into the inpainting region. Seamless cloning removes the visible boundary between the inpainting region and the input image.
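As a rough one-dimensional analogue of band inpainting, and not the two-dimensional procedure itself, the following Python sketch uses the known pixels immediately left of a hole as the target band, searches the row for the best-matching source band by sum of squared differences, and copies the values that follow that source band into the hole. The function name `fill_by_band_matching`, its parameters, and the sample row are hypothetical.

```python
def fill_by_band_matching(row, hole_start, hole_len, band=2):
    """Fill row[hole_start:hole_start + hole_len] by 1-D band matching.

    row holds numbers, with None inside the hole. The target band is the
    `band` known values just left of the hole (a simplifying assumption).
    """
    if hole_start < band:
        raise ValueError("need at least `band` known pixels left of the hole")
    target = row[hole_start - band:hole_start]
    span = band + hole_len
    best_cost, best_pos = None, None
    for s in range(len(row) - span + 1):
        window = row[s:s + span]
        if any(v is None for v in window):
            continue  # source windows must not overlap the hole
        # Sum of squared differences between source band and target band.
        cost = sum((a - b) ** 2 for a, b in zip(window[:band], target))
        if best_cost is None or cost < best_cost:
            best_cost, best_pos = cost, s
    filled = list(row)
    if best_pos is not None:
        # Copy the values that follow the best-matching source band.
        source = row[best_pos + band:best_pos + span]
        filled[hole_start:hole_start + hole_len] = source
    return filled


row = [10, 20, 30, 10, 20, None, 40]
filled_row = fill_by_band_matching(row, hole_start=5, hole_len=1, band=2)
print(filled_row)
```

Here the target band is [10, 20]; the window starting at index 0 matches it exactly, so the value 30 that follows that window is copied into the hole.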

To predict the depth values of the hole region, this embodiment may use the Navier-Stokes and fluid dynamics equations; this way of predicting hole depth is based on image inpainting (depth hole-filling). As another method, it is preferable to classify the depth texture around the hole and then predict (inpaint) the depth of the hole with a patch-based image inpainting algorithm that uses only the background depth texture information around the hole.

FIG. 7 shows a result image of the hole depth prediction method used in an image encoding method according to an embodiment of the present invention.

Referring to FIG. 7, a more precise image is generated, without distorted image portions such as those in FIG. 3.

As described above, the present invention precisely predicts the depth map of the hole using only the background depth texture, so that more precise depth side information can be used when filling holes in the color image.
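The background/foreground decision that precedes this prediction, as recited in the claims (difference value I, difference value II, and cumulative sum difference value III), might be sketched as follows for the case in which the first region is the background depth texture. The use of absolute differences, the function name `classify_fourth_region`, and the example threshold are assumptions; the source does not specify them.

```python
def classify_fourth_region(p1, p2, p3, p4, threshold):
    """Classify the fourth region, assuming the first region is background.

    p1..p4: lists of depth values of the first to fourth boundary pixels.
    Absolute differences are an assumption; the source only says "difference".
    """
    # Cumulative sum difference value I: first vs. second pixels (original depth image).
    diff_i = sum(abs(a - b) for a, b in zip(p1, p2))
    # Cumulative sum difference value II: third vs. fourth pixels (predicted depth image).
    diff_ii = sum(abs(a - b) for a, b in zip(p3, p4))
    # Cumulative sum difference value III: difference between I and II.
    diff_iii = abs(diff_i - diff_ii)
    return "foreground" if diff_iii > threshold else "background"


label_a = classify_fourth_region([50, 52], [51, 50], [50, 52], [200, 198], threshold=30)
label_b = classify_fourth_region([50, 52], [51, 50], [50, 52], [51, 53], threshold=30)
```

In the first call the depth jumps sharply across the hole, so value III (293) exceeds the threshold and the fourth region is labeled foreground; in the second call value III is 1 and the region is labeled background.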

FIG. 8 is a block diagram illustrating an image encoding apparatus according to an embodiment of the present invention.

Referring to FIG. 8, an image encoding apparatus according to an embodiment of the present invention includes an image processing unit 810, a depth map generation unit 820, and a 3D image generation unit 830.

The image processing unit 810 classifies a 2D image received through a multi-view camera into regions and generates a color image. Using the generated color image as a reference image, scenes at other viewpoints can be generated from per-pixel information such as depth or difference angle.

The depth map generation unit 820 generates a depth map from a depth image acquired through a depth camera. To fill hole regions that may arise in the depth map generation unit 820, the depth map prediction method described above can be used to generate a more precise depth map. The depth map generation unit 820 may also store depth information in advance in table form.

The 3D image generation unit 830 generates a right view using the depth map generated by the depth map generation unit 820, and generates a 3D image using the original image as the left view. In this case, an existing conventional method is used for generating the 3D image from the depth map.
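One conventional way to obtain a right view from a left view and its depth map is a horizontal pixel shift proportional to depth. The following single-row sketch is an illustration only; the function name `render_right_view`, the linear depth-to-disparity mapping, the `max_disparity` parameter, and the convention that larger depth values are nearer to the camera are all assumptions.

```python
def render_right_view(color_row, depth_row, max_disparity=4):
    """Warp one row of the left view into the right view.

    Assumes 8-bit depth where 255 is nearest. Positions that receive no
    pixel stay None; these are the holes that must be filled afterwards.
    """
    n = len(color_row)
    out = [None] * n
    out_depth = [-1] * n  # Depth buffer so nearer pixels win on overlap.
    for x in range(n):
        d = depth_row[x]
        disparity = round(max_disparity * d / 255)
        tx = x - disparity  # Nearer pixels shift further to the left.
        if 0 <= tx < n and d > out_depth[tx]:
            out[tx] = color_row[x]
            out_depth[tx] = d
    return out


left = ["a", "b", "c", "F", "F", "d", "e", "f"]
depth = [0, 0, 0, 255, 255, 0, 0, 0]
right = render_right_view(left, depth, max_disparity=2)
```

The foreground pixels `F` move two positions to the left and cover `b` and `c`, leaving two `None` entries where the background behind the object was never observed.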

The hole depth map predicted by the method proposed in the present invention can be used for various purposes during hole filling, such as searching for textures similar to the hole region and determining patch priority (patch priority selection) in an inpainting algorithm.

Furthermore, applying the hole-region depth map prediction algorithm to the view-synthesis software of 3DVC (3D video coding), for which international standardization is under way for next-generation high-definition 3D broadcasting, is expected to yield higher-quality hole-filling results.

Claims

An image encoding method comprising:
obtaining a depth map and a color image using depth image-based rendering (DIBR), and generating an original depth image and a predicted depth image;
classifying the depth texture of the depth map into a background depth texture and a foreground depth texture; and
excluding the foreground depth texture and predicting the depth of the hole region using an inpainting algorithm,
wherein the classifying selects a first region and a second region from the original depth image and a third region and a fourth region from the predicted depth image, (i) selects a fourth pixel located in the fourth region, at the boundary between the hole region resulting from the depth image-based rendering and the depth texture, (ii) after selecting the fourth pixel, selects as a third pixel a pixel that is located at the periphery of the hole region, in the third region lying opposite the fourth region that contains the fourth pixel, and that is symmetric to or faces the position of the fourth pixel, and (iii) after selecting the third pixel, determines, through an inverse depth image-based rendering (inverse DIBR) transformation, a second pixel located at a boundary in the second region and a first pixel located at a boundary in the first region, and
wherein the classifying calculates a difference value I between the first pixel located at the boundary in the first region and the second pixel located at the boundary in the second region adjoining the first region, and a difference value II between the third pixel located at the boundary in the third region adjoining one side of the hole region and the fourth pixel located at the boundary in the fourth region adjoining the other side of the hole region, and performs the classification based on the difference value I and the difference value II.

delete

The method according to claim 1,
wherein the inpainting algorithm uses Navier-Stokes equations.

delete

The method according to claim 1,
wherein, in the classifying, the first to fourth pixels each comprise a plurality of pixels, and the classification is determined based on a cumulative sum difference value I and a cumulative sum difference value II, which are the sums of the respective difference values I and difference values II, and a cumulative sum difference value III, which is the cumulative difference between the difference value I and the difference value II.

The method according to claim 5,
wherein the first region is the background depth texture.

The method according to claim 6,
wherein, in the classifying, if the cumulative sum difference value III is greater than a predetermined reference value, the fourth region is determined to be the foreground depth texture.

The method according to claim 6,
wherein, in the classifying, if the cumulative sum difference value III is smaller than a predetermined reference value, the fourth region is determined to be the background depth texture.

delete

The method according to claim 5,
wherein the first region is the foreground depth texture.

The method according to claim 10,
wherein, in the classifying, when the cumulative sum difference value III is large, the third region is determined to be the background depth texture, and when the cumulative sum difference value III is small, the third region is determined to be the foreground depth texture.

An image encoding apparatus comprising:
a processor; and
a memory connected to the processor and storing information for driving the processor,
wherein the processor obtains a depth map and a color image using depth image-based rendering (DIBR), generates an original depth image and a predicted depth image, classifies the depth texture of the depth map into a background depth texture and a foreground depth texture, and, after excluding the foreground depth texture, predicts the depth map of the hole region using an inpainting algorithm,
wherein the processor selects a first region and a second region from the original depth image and a third region and a fourth region from the predicted depth image, (i) selects a fourth pixel located in the fourth region, at the boundary between the hole region resulting from the depth image-based rendering and the depth texture, (ii) after selecting the fourth pixel, selects as a third pixel a pixel that is located at the periphery of the hole region, in the third region lying opposite the fourth region that contains the fourth pixel, and that is symmetric to or faces the position of the fourth pixel, and (iii) after selecting the third pixel, determines, through an inverse depth image-based rendering (inverse DIBR) transformation, a second pixel located at a boundary in the second region and a first pixel located at a boundary in the first region, and
wherein the processor calculates a difference value I between the first pixel located at the boundary in the first region and the second pixel located at the boundary in the second region adjoining the first region, and a difference value II between the third pixel located at the boundary in the third region adjoining one side of the hole region and the fourth pixel located at the boundary in the fourth region adjoining the other side of the hole region, and performs the classification based on the difference value I and the difference value II.

delete

The apparatus according to claim 12,
wherein, in classifying the background depth texture and the foreground depth texture,
the first to fourth pixels each comprise a plurality of pixels, and the classification is determined based on a cumulative sum difference value I and a cumulative sum difference value II, which are the summed values of the respective difference values, and a cumulative sum difference value III, which is the cumulative difference between the difference value I and the difference value II.