KR101710444B1

KR101710444B1 - Depth Map Generating Apparatus and Method thereof

Info

Publication number: KR101710444B1
Application number: KR1020100048625A
Authority: KR
Inventors: 껑위 마; 김지원; 하이타오 왕; 씨잉 왕; 김지연; 정용주
Original assignee: Samsung Electronics Co., Ltd. (삼성전자주식회사)
Priority date: 2009-07-06
Filing date: 2010-05-25
Publication date: 2017-02-28
Also published as: KR20110004267A; CN101945295B; CN101945295A

Abstract

Disclosed are a depth map generation apparatus and method that automatically generate, from ordinary video, a depth map corresponding to each two-dimensional (2D) image of the video. The apparatus includes: an image acquisition unit that acquires a plurality of temporally consecutive 2D images from input video; a saliency map generation unit that generates at least one saliency map corresponding to the current 2D image according to an HVP model; a saliency-based depth map generation unit; a three-dimensional (3D) structure matching unit that calculates a matching score between the current 2D image and each of a plurality of pre-stored typical 3D structures and determines the typical 3D structure with the highest matching score to be the 3D structure of the current 2D image; a matching-based depth map generation unit; a combined depth map generation unit that combines the saliency-based depth map and the matching-based depth map into a combined depth map; and a spatio-temporal smoothing unit that smooths the combined depth map.

Description

[0001] The present invention relates to a depth map generating apparatus and method.

The present invention relates to an apparatus and method for generating a depth map, and more particularly to an apparatus and method for automatically generating a depth map corresponding to the 2D image of each frame of an ordinary video.

Three-dimensional (3D) television has recently become a hot topic in both research and the commercial market. A 3D television differs from a conventional 2D television in that it presents stereoscopic video, so that viewers perceive depth as they would in a real 3D scene. This effect is based on the theory of human binocular vision. People view the real world with two eyes, and when they look at a 3D scene the images seen by the two eyes differ. From the two different images projected independently onto the left and right eyes, the brain forms a 3D scene.

However, most current media (films, video) and image acquisition devices (digital cameras, film cameras, and so on) still rely on monocular systems that use a single camera. Such media cannot produce a 3D effect when displayed directly on a 3D television. One way to convert such media to 3D video is to hire many workers to annotate the depth map of each region by hand. The conversion results are satisfactory, but the obvious drawback is the large amount of labor required.

Solutions already exist, but all are of limited use for ordinary video sequences. For example, one method provides a depth annotation system that requires computer interaction; it does not achieve fully unsupervised operation for 3D television applications, and because it needs user input it cannot run in real time. Another method assumes that objects in the image move horizontally while the background is stationary, and uses motion parallax to simulate stereoscopic disparity; because that assumption does not generally hold in ordinary video, this method is also limited for video processing.

Embodiments of the present invention provide a method and apparatus that process various types of video (including still image sequences) fully automatically, without user input.

According to one embodiment of the present invention, a depth map generation apparatus is provided that includes: an image acquisition unit that acquires a plurality of temporally consecutive 2D images from input video; a 3D structure matching unit that calculates a matching score between the current 2D image among the plurality of 2D images and each of a plurality of pre-stored typical 3D structures, and determines the typical 3D structure with the highest matching score to be the 3D structure of the current 2D image; and a matching-based depth map generation unit that stores depth maps of the typical 3D structures in advance and uses the depth map of the typical 3D structure determined to be the 3D structure of the current 2D image as the matching-based depth map corresponding to the current 2D image, each pixel of the matching-based depth map representing the matching-based depth value of the corresponding pixel of the current 2D image.

The matching-based depth value lies in the range [0, 1], where 0 indicates that the corresponding pixel has the maximum depth and 1 indicates that it has the minimum depth.
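The selection step described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `matching_based_depth_map`, the two example structures, and the score values are hypothetical, and the matching scores are assumed to have been computed elsewhere.

```python
import numpy as np

def matching_based_depth_map(scores, stored_depth_maps):
    """Return the index and stored depth map of the best-scoring typical structure."""
    best = int(np.argmax(scores))  # highest matching score wins
    depth = np.asarray(stored_depth_maps[best], dtype=float)
    # Convention from the text: 0 = maximum depth (farthest), 1 = minimum depth (nearest).
    if depth.min() < 0.0 or depth.max() > 1.0:
        raise ValueError("depth values must lie in [0, 1]")
    return best, depth

# Two hypothetical typical structures for a 3x2 image.
ground_plane = np.array([[0.0, 0.0], [0.5, 0.5], [1.0, 1.0]])  # far at top, near at bottom
flat_wall = np.full((3, 2), 0.5)                               # constant depth
best, depth = matching_based_depth_map([0.9, 0.4], [ground_plane, flat_wall])
```

Because the depth maps are stored in advance, the per-frame cost of this step is only the score comparison and a lookup.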

The 3D structure matching unit may further include: a plane division module that divides the current 2D image into at least one region corresponding to a plane of the typical 3D structure being matched; a matching score calculation module that calculates the density of each region from the distribution of a feature within the region, calculates the mean of the feature in each region and the similarity between two regions from the norm of the difference between their means, and calculates the matching score as the sum of the region densities and the similarity between the two regions; and a 3D structure determination module that, according to the matching scores, determines the typical 3D structure with the highest matching score to be the 3D structure of the current 2D image.

The matching score calculation module calculates the density of each region r according to an equation that is not reproduced in this text. In that equation, p is a pixel of the region, I(p) is the feature value of pixel p, the mean term is the average of the feature values of the pixels in the region, and area(r) is the number of pixels in the region.
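Since the density equation itself is missing from this text, the following is only a sketch under a stated assumption: that density falls as the spread of feature values within the region grows, here taken as 1 / (1 + std), with std the population standard deviation built from I(p), the region mean, and area(r). The exact form and the function name `region_density` are assumptions, not taken from the source.

```python
import numpy as np

def region_density(feature, mask):
    """Assumed density of a region: 1 / (1 + standard deviation of its feature values)."""
    values = np.asarray(feature, dtype=float)[np.asarray(mask, dtype=bool)]  # I(p) for p in r
    area = values.size                                                       # area(r)
    mean = values.sum() / area                                               # region mean
    std = np.sqrt(((values - mean) ** 2).sum() / area)
    return 1.0 / (1.0 + std)  # assumed form: uniform regions score 1, noisy ones less

feature = np.array([[0.0, 2.0],
                    [5.0, 5.0]])
top = np.array([[True, True], [False, False]])
bottom = ~top
```

Under this assumption a perfectly uniform region (such as `bottom` here) has density 1, and the value decreases toward 0 as the region becomes less homogeneous.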

The matching score calculation module calculates the similarity between region ri and region rj according to an equation that is not reproduced in this text, in which the mean term is the average of the feature in each region and |.| denotes a norm.

The feature is color, gradient, or edge.

The norm is the 1-norm, the 2-norm, or the ∞-norm.
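As a sketch of the similarity term, the code below takes the norm of the difference between the mean feature vectors of two regions, following the verbal description of the matching score calculation module. The function name `region_similarity` and the `order` parameter are hypothetical; the feature is assumed to be an H x W x C array (for example a color image) or an H x W array for a scalar feature.

```python
import numpy as np

def region_similarity(feature, mask_i, mask_j, order=2):
    """Norm of the difference between the mean feature values of two regions."""
    feature = np.asarray(feature, dtype=float)
    mean_i = np.atleast_1d(feature[np.asarray(mask_i, dtype=bool)].mean(axis=0))
    mean_j = np.atleast_1d(feature[np.asarray(mask_j, dtype=bool)].mean(axis=0))
    # order may be 1, 2, or np.inf, matching the three norms named in the text.
    return float(np.linalg.norm(mean_i - mean_j, ord=order))

# One-row, two-pixel color image: region i is black, region j has color (3, 4, 0).
image = np.zeros((1, 2, 3))
image[0, 1] = [3.0, 4.0, 0.0]
mask_i = np.array([[True, False]])
mask_j = np.array([[False, True]])
```

For this example the 1-norm, 2-norm, and ∞-norm of the mean difference (3, 4, 0) are 7, 5, and 4 respectively, which shows how the choice of norm changes the scale of the similarity term.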

According to another embodiment of the present invention, a depth map generation apparatus is provided that includes: an image acquisition unit that acquires a plurality of temporally consecutive 2D images from input video; a saliency map generation unit that generates, according to an HVP model, at least one saliency map corresponding to the current 2D image among the plurality of 2D images, each pixel of the saliency map representing the saliency of the corresponding pixel of the current 2D image; a saliency-based depth map generation unit that uses the at least one saliency map to generate a saliency-based depth map corresponding to the current 2D image, each pixel of the saliency-based depth map representing the saliency-based depth value of the corresponding pixel of the current 2D image; a 3D structure matching unit that calculates a matching score between the current 2D image and each of a plurality of pre-stored typical 3D structures, and determines the typical 3D structure with the highest matching score to be the 3D structure of the current 2D image; a matching-based depth map generation unit that stores depth maps of the typical 3D structures in advance and uses the depth map of the typical 3D structure determined to be the 3D structure of the current 2D image as the matching-based depth map corresponding to the current 2D image, each pixel of the matching-based depth map representing the matching-based depth value of the corresponding pixel of the current 2D image; and a combined depth map generation unit that combines the saliency-based depth map and the matching-based depth map to generate a combined depth map, each pixel of the combined depth map representing the combined depth value of the corresponding pixel of the current 2D image.

The saliency map generation unit may include: a feature saliency map generation module that generates a feature saliency map by identifying features of the current 2D image; a motion saliency map generation module that generates a motion saliency map by identifying motion between the current 2D image and a temporally adjacent 2D image; an object saliency map generation module that generates an object saliency map by identifying objects in the current 2D image; and a saliency map control module that uses any one, any two, or all of the feature, motion, and object saliency map generation modules to generate one, two, or all of the saliency maps.

The saliency-based depth map generation unit generates the saliency-based depth map as follows. If the saliency map generation unit generates only the object saliency map, the unit assigns a constant in the range (0, 1) to each pixel of the saliency-based depth map that corresponds to a pixel identified as an object in the 2D image, and assigns 0 to the other pixels. If the saliency map generation unit generates only one of the feature saliency map and the motion saliency map, the unit assigns each pixel of the saliency-based depth map a value in the range [0, 1] according to the saliency of the corresponding pixel in that map, where 0 indicates minimum saliency and 1 indicates maximum saliency. If the saliency map generation unit generates two saliency maps that do not include the object saliency map, the unit assigns each pixel either the normalized sum of the corresponding pixels of the two maps or the larger of the two values. If the saliency map generation unit generates two saliency maps that include the object saliency map, the unit assigns a constant in the range (0, 1) to each pixel that corresponds to a pixel identified as an object in the object saliency map, and assigns the corresponding pixel value of the other saliency map to the remaining pixels. If the saliency map generation unit generates all three saliency maps, the unit assigns a constant in the range (0, 1) to each pixel that corresponds to a pixel identified as an object in the object saliency map, and assigns the normalized sum, or the larger value, of the corresponding pixels of the other two saliency maps to the corresponding pixels of the saliency-based depth map.
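The five cases above reduce to one small routine, sketched here under several assumptions: the function name `saliency_based_depth`, the default constant 0.8, and the `use_max` switch are hypothetical, the object saliency map is taken to be a boolean mask, and "normalized" is assumed to mean division by the maximum of the summed map, which the text does not specify.

```python
import numpy as np

def saliency_based_depth(feature=None, motion=None, object_mask=None,
                         object_depth=0.8, use_max=False):
    """Build a saliency-based depth map from whichever saliency maps are available."""
    maps = [np.asarray(m, dtype=float) for m in (feature, motion) if m is not None]
    if not maps and object_mask is None:
        raise ValueError("at least one saliency map is required")
    if len(maps) == 2:
        if use_max:
            depth = np.maximum(maps[0], maps[1])          # larger of the two values
        else:
            total = maps[0] + maps[1]
            peak = total.max()
            depth = total / peak if peak > 0 else total   # assumed normalization
    elif len(maps) == 1:
        depth = maps[0].copy()                            # saliency used directly
    else:
        depth = np.zeros(np.shape(object_mask), dtype=float)  # object map only
    if object_mask is not None:
        # Object pixels receive a constant in (0, 1); other pixels keep their value.
        depth[np.asarray(object_mask, dtype=bool)] = object_depth
    return depth

only_object = saliency_based_depth(object_mask=[[True, False]])
feature_and_object = saliency_based_depth(feature=[[0.2, 0.4]],
                                          object_mask=[[False, True]])
```

Handling the object mask last keeps the five cases from multiplying into separate branches: the feature and motion maps define a base map, and the object constant overrides it where an object was identified.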

The pixel values of the saliency-based depth map and the matching-based depth map lie in the range [0, 1], where 0 indicates that the corresponding pixel has the maximum depth and 1 indicates that it has the minimum depth.

The combined depth map generation unit generates the combined depth map either by summing and normalizing the corresponding pixels of the saliency-based depth map and the matching-based depth map, or by selecting the larger of the two corresponding pixel values.
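Both combination rules can be sketched in a few lines. The function name `combine_depth_maps` and the `use_max` flag are hypothetical, and the normalization is assumed to divide the summed map by its maximum so that the result stays in [0, 1]; the text does not state which normalization is intended.

```python
import numpy as np

def combine_depth_maps(saliency_depth, matching_depth, use_max=False):
    """Combine two depth maps by per-pixel maximum or by normalized sum."""
    s = np.asarray(saliency_depth, dtype=float)
    m = np.asarray(matching_depth, dtype=float)
    if use_max:
        return np.maximum(s, m)                  # keep the nearer estimate per pixel
    total = s + m
    peak = total.max()
    return total / peak if peak > 0 else total   # assumed normalization to [0, 1]

saliency_depth = np.array([[0.0, 1.0]])
matching_depth = np.array([[0.5, 0.5]])
by_sum = combine_depth_maps(saliency_depth, matching_depth)
by_max = combine_depth_maps(saliency_depth, matching_depth, use_max=True)
```

Since 1 denotes minimum depth, the maximum rule favors whichever cue places a pixel nearer to the viewer, while the normalized sum blends the two cues.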

The object in the current 2D image may include a person, a face, or text.

According to another embodiment of the present invention, a depth map smoothing apparatus is provided that includes: an image acquisition unit that acquires a plurality of temporally consecutive 2D images from input video; an initial depth map acquisition unit that acquires an initial depth map corresponding to each of the input 2D images, each pixel value of the initial depth map being the depth value of the corresponding pixel of the corresponding 2D image; and a spatio-temporal smoothing unit that smooths the initial depth map in the spatial domain and the temporal domain.

The spatio-temporal smoothing unit may further include: a smoothing amount calculation module that, based on the HVP model, calculates a smoothing amount S(P1, P2) from the similarity, distance, and depth difference between each pixel P1(x, y, t) of the current 2D image at time t and a pixel P2(x+Δx, y+Δy, t+Δt) of the 2D image at time t+Δt, with Δx, Δy, and Δt chosen according to the desired smoothing effect; and a smoothing module that calculates the smoothed depth value D'(P1) = D(P1) - S(P1) of pixel P1 of the current 2D image according to the smoothing amount S(P1, P2), where S(P1, P2) makes the absolute difference between the smoothed depth value of P1 and the smoothed depth value D'(P2) = D(P2) + S(P1, P2) of P2 smaller than the absolute difference between the depth values D(P1) and D(P2) before smoothing.

The smoothing amount calculation module calculates the smoothing amount S(P1, P2) according to (D(P1)-D(P2))*N(P1,P2)*C(P1,P2), where D(.) is the depth value of a pixel. N(P1,P2) and C(P1,P2) are defined by equations that are not reproduced in this text. I(.) is the feature (color or texture) value of a pixel, and |.| is the absolute value.
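The smoothing amount formula can be illustrated with a short sketch. Because the definitions of N and C are missing from this text, the code assumes that N is 1 when the space-time distance between the pixels is below a radius and 0 otherwise, and that C is a constant weight when the feature difference is below a threshold and 0 otherwise. The parameter values (`radius`, `weight`, `threshold`) and function names are hypothetical. The update uses S(P1, P2) for both pixels.

```python
import math

def smoothing_amount(d1, d2, i1, i2, dx, dy, dt,
                     radius=1.5, weight=0.2, threshold=32):
    """S(P1, P2) = (D(P1) - D(P2)) * N(P1, P2) * C(P1, P2) with assumed N and C."""
    distance = math.sqrt(dx * dx + dy * dy + dt * dt)
    n = 1.0 if distance < radius else 0.0           # assumed neighborhood indicator
    c = weight if abs(i1 - i2) < threshold else 0.0  # assumed feature-similarity weight
    return (d1 - d2) * n * c

def smooth_pair(d1, d2, s):
    """Apply D'(P1) = D(P1) - S and D'(P2) = D(P2) + S."""
    return d1 - s, d2 + s

# Adjacent pixels with similar color but different depth estimates.
s = smoothing_amount(d1=0.8, d2=0.4, i1=120, i2=125, dx=1, dy=0, dt=0)
new_d1, new_d2 = smooth_pair(0.8, 0.4, s)
```

With these assumptions the depth difference after smoothing equals (D(P1) - D(P2)) * (1 - 2 * N * C), so any weight between 0 and 1 shrinks the absolute difference, which is the condition the smoothing module is required to meet. Pixels that are far apart or that differ strongly in color are left unchanged, so depth edges at object boundaries survive.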

According to another embodiment of the present invention, a depth map generation apparatus is provided that includes: an image acquisition unit that acquires a plurality of temporally consecutive 2D images from input video; a saliency map generation unit that generates, according to an HVP model, at least one saliency map corresponding to the current 2D image among the plurality of 2D images, each pixel of the saliency map representing the saliency of the corresponding pixel of the current 2D image; a saliency-based depth map generation unit that uses the at least one saliency map to generate a saliency-based depth map corresponding to the current 2D image, each pixel of the saliency-based depth map representing the saliency-based depth value of the corresponding pixel of the current 2D image; a 3D structure matching unit that calculates a matching score between the current 2D image and each of a plurality of typical 3D structures pre-stored in the 3D structure matching unit, and determines the typical 3D structure with the highest matching score to be the 3D structure of the current 2D image; a matching-based depth map generation unit that stores depth maps of the typical 3D structures in advance and uses the depth map of the typical 3D structure determined to be the 3D structure of the current 2D image as the matching-based depth map corresponding to the current 2D image, each pixel of the matching-based depth map representing the matching-based depth value of the corresponding pixel of the current 2D image; a combined depth map generation unit that combines the saliency-based depth map and the matching-based depth map to generate a combined depth map, each pixel of the combined depth map representing the combined depth value of the corresponding pixel of the current 2D image; and a spatio-temporal smoothing unit that smooths the combined depth map in the spatial domain and the temporal domain.

According to another embodiment of the present invention, a depth map generation method is provided that includes: acquiring a plurality of consecutive 2D images from input video; calculating a matching score between the current 2D image among the plurality of 2D images and each of a plurality of pre-stored typical 3D structures, and determining the typical 3D structure with the highest matching score to be the 3D structure of the current 2D image; and storing depth maps of the typical 3D structures in advance and using the depth map of the typical 3D structure determined to be the 3D structure of the current 2D image as the matching-based depth map corresponding to the current 2D image, each pixel of the matching-based depth map representing the matching-based depth value of the corresponding pixel of the current 2D image.

Determining the 3D structure of the current 2D image may include: dividing the current 2D image into at least one region corresponding to a plane of the typical 3D structure being matched; calculating the density of each region according to the distribution of a feature within the region; calculating the mean of the feature in each region and calculating the similarity between two regions according to the norm of the difference between their means; calculating the matching score according to the sum of the region densities and the similarity between the two regions; and, according to the matching scores, determining the typical 3D structure with the highest matching score to be the 3D structure of the current 2D image.

The density of each region r is calculated according to an equation that is not reproduced in this text, in which p is a pixel of the region, I(p) is the feature value of pixel p, the mean term is the average of the feature values of the pixels in the region, and area(r) is the number of pixels in the region. The similarity between region ri and region rj is calculated according to a further equation that is not reproduced in this text, in which the mean term is the average of the feature in the region and |.| denotes a norm.
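The whole matching procedure (divide into regions, sum the region densities and the pairwise similarities, keep the best structure) can be sketched as below. This is an illustration under assumptions rather than the patented computation: the density form 1 / (1 + std) is assumed because the equation is missing here, the feature is a scalar per pixel, each candidate structure is represented by a hypothetical integer label map assigning pixels to planes, and all function names are invented for the example.

```python
import numpy as np
from itertools import combinations

def density(values):
    """Assumed density: high when feature values within the region are uniform."""
    return 1.0 / (1.0 + values.std())

def matching_score(feature, labels):
    """Sum of region densities plus pairwise differences of region means."""
    regions = [feature[labels == k].astype(float) for k in np.unique(labels)]
    score = sum(density(v) for v in regions)
    score += sum(abs(a.mean() - b.mean()) for a, b in combinations(regions, 2))
    return score

def best_structure(feature, candidate_labelings):
    """Index of the candidate structure with the highest matching score."""
    scores = [matching_score(feature, lab) for lab in candidate_labelings]
    return int(np.argmax(scores)), scores

# 4x4 image: dark upper half, bright lower half (scalar feature scaled to [0, 1]).
image = np.array([[0, 0, 0, 0], [0, 0, 0, 0], [1, 1, 1, 1], [1, 1, 1, 1]], dtype=float)
horizontal_split = np.repeat([[0], [0], [1], [1]], 4, axis=1)  # planes: top / bottom
vertical_split = np.repeat([[0, 0, 1, 1]], 4, axis=0)          # planes: left / right
best, scores = best_structure(image, [horizontal_split, vertical_split])
```

In this example the horizontal split wins: each of its regions is uniform (density 1) and the two region means differ, whereas the vertical split mixes dark and bright pixels in both regions. A structure therefore scores well when its planes align with homogeneous, mutually distinct image regions.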

According to another embodiment of the present invention, a depth map generation method is provided that includes: acquiring a plurality of consecutive 2D images from input video; generating, according to an HVP model, at least one saliency map corresponding to the current 2D image among the plurality of 2D images, each pixel of the saliency map representing the saliency of the corresponding pixel of the current 2D image; generating, using the at least one saliency map, a saliency-based depth map corresponding to the current 2D image, each pixel of the saliency-based depth map representing the saliency-based depth value of the corresponding pixel of the current 2D image; calculating a matching score between the current 2D image and each of a plurality of pre-stored typical 3D structures, and determining the typical 3D structure with the highest matching score to be the 3D structure of the current 2D image; storing depth maps of the typical 3D structures in advance and using the depth map of the typical 3D structure determined to be the 3D structure of the current 2D image as the matching-based depth map corresponding to the current 2D image, each pixel of the matching-based depth map representing the matching-based depth value of the corresponding pixel of the current 2D image; and combining the saliency-based depth map and the matching-based depth map to generate a combined depth map, each pixel of the combined depth map representing the combined depth value of the corresponding pixel of the current 2D image.

Generating the saliency map may include generating any one, any two, or all of a feature saliency map, a motion saliency map, and an object saliency map, where the feature saliency map is generated by identifying features of the current 2D image, the motion saliency map is generated by identifying motion between the current 2D image and a temporally adjacent 2D image, and the object saliency map is generated by identifying objects in the current 2D image.

Generating the saliency-based depth map may include the following. If only the object saliency map is generated, a constant in the range (0, 1) is assigned to each pixel of the saliency-based depth map that corresponds to a pixel identified as an object in the 2D image, and 0 is assigned to the other pixels. If only one of the feature saliency map and the motion saliency map is generated, each pixel of the saliency-based depth map is assigned a value in the range [0, 1] according to the saliency of the corresponding pixel in that map, where 0 indicates minimum saliency and 1 indicates maximum saliency. If two saliency maps that do not include the object saliency map are generated, each pixel is assigned either the normalized sum of the corresponding pixels of the two maps or the larger of the two values. If two saliency maps that include the object saliency map are generated, a constant in the range (0, 1) is assigned to each pixel that corresponds to a pixel identified as an object in the object saliency map, and the corresponding pixel value of the other saliency map is assigned to the remaining pixels. If all three saliency maps are generated, a constant in the range (0, 1) is assigned to each pixel that corresponds to a pixel identified as an object in the object saliency map, and the normalized sum, or the larger value, of the corresponding pixels of the other two saliency maps is assigned to the corresponding pixels of the saliency-based depth map.

Determining the 3D structure of the current 2D image may further include: dividing the current 2D image into at least one region corresponding to a plane of the typical 3D structure being matched; calculating the density of each region according to the distribution of a feature within the region; calculating the mean of the feature in each region and calculating the similarity between two regions according to the norm of the difference between their means; calculating the matching score according to the sum of the region densities and the similarity between the two regions; and, according to the matching scores, determining the typical 3D structure with the highest matching score to be the 3D structure of the current 2D image.

The comprehensive depth map may be generated either by adding the corresponding pixel values of the saliency-based depth map and the matching-based depth map and normalizing the sum, or by selecting the larger of the two corresponding pixel values.

According to another embodiment of the present invention, there is provided a depth map smoothing method comprising: acquiring a plurality of temporally consecutive two-dimensional images from an input video; acquiring an initial depth map corresponding to each of the input two-dimensional images, each pixel value of the depth map being the depth value of the corresponding pixel in the corresponding two-dimensional image; and smoothing the initial depth map in the spatial domain and the time domain.

Smoothing the initial depth map in the spatial domain and the time domain may comprise: calculating, based on the HVP model, a smoothing amount S(P1, P2) from the similarity, the distance, and the difference in depth value between each pixel P1(x, y, t) of the current two-dimensional image at time t and a pixel P2(x+△x, y+△y, t+△t) of the two-dimensional image at time t+△t, the values of △x, △y, and △t being chosen according to the desired smoothing effect; and calculating, from the smoothing amount S(P1, P2), the smoothed depth value D'(P1) = D(P1) - S(P1) of the pixel P1 of the current two-dimensional image, the smoothing amount S(P1, P2) making the absolute value of the difference between the smoothed depth value of P1 and the smoothed depth value D'(P2) = D(P2) + S(P1, P2) of P2 smaller than the absolute value of the difference between the depth values D(P1) and D(P2) before smoothing.

The smoothing amount S(P1, P2) is calculated as S(P1, P2) = (D(P1) - D(P2)) * N(P1, P2) * C(P1, P2), where D(.) is the depth value of a pixel, N(P1, P2) is determined by the distance between P1 and P2, and C(P1, P2) is determined by the similarity between their feature values.

According to another embodiment of the present invention, there is provided a depth map generation method comprising: acquiring a plurality of consecutive two-dimensional images from an input video; generating, according to the HVP model, at least one saliency map corresponding to the current two-dimensional image among the plurality of two-dimensional images, each pixel of the saliency map indicating the saliency of the corresponding pixel of the current two-dimensional image; generating, from the at least one saliency map, a saliency-based depth map corresponding to the current two-dimensional image, each pixel of the saliency-based depth map indicating the saliency-based depth value of the corresponding pixel of the current two-dimensional image; calculating the matching degree between the current two-dimensional image and each of a plurality of pre-stored typical three-dimensional structures, and determining the typical three-dimensional structure with the highest matching degree as the three-dimensional structure of the current two-dimensional image; storing depth maps of the typical three-dimensional structures in advance and taking the depth map of the typical three-dimensional structure determined as the three-dimensional structure of the current two-dimensional image as the matching-based depth map corresponding to the current two-dimensional image, each pixel of the matching-based depth map indicating the matching-based depth value of the corresponding pixel of the current two-dimensional image; combining the saliency-based depth map and the matching-based depth map to generate a comprehensive depth map, each pixel of the comprehensive depth map indicating the comprehensive depth value of the corresponding pixel of the current two-dimensional image; and smoothing the comprehensive depth map in the spatial domain and the time domain.

According to an embodiment of the present invention, various types of video can be processed fully automatically, without user input.

FIG. 1 is a block diagram of an apparatus for generating a depth map according to a first embodiment of the present invention;
FIG. 2 is a schematic view showing examples of typical three-dimensional structures;
FIG. 3 is a block diagram of a three-dimensional structure matching module according to the present invention;
FIG. 4 is a block diagram of an apparatus for generating a depth map according to a second embodiment of the present invention;
FIG. 5 is a block diagram of a saliency map generation module according to the present invention;
FIG. 6 is a view showing an example of a depth map generated using the apparatus according to the second embodiment of the present invention;
FIG. 7 is a block diagram of an apparatus for smoothing a depth map according to a third embodiment of the present invention;
FIG. 8 is a block diagram of a spatial-domain and time-domain smoothing module of the present invention;
FIG. 9 is a view showing an example of spatial-domain and time-domain smoothing according to the present invention;
FIG. 10 is a block diagram of a depth map generation apparatus according to a fourth embodiment of the present invention;
FIG. 11 is a view showing an example of a depth map generated by the apparatus of the fourth embodiment of the present invention;
FIG. 12 is a flowchart of a method of generating a depth map according to a fifth embodiment of the present invention;
FIG. 13 is a flowchart of determining the three-dimensional structure of the current two-dimensional image according to the present invention;
FIG. 14 is a flowchart of a method of generating a depth map according to a sixth embodiment of the present invention;
FIG. 15 is a flowchart of a method of smoothing a depth map according to a seventh embodiment of the present invention;
FIG. 16 is a flowchart of a method of generating a depth map according to an eighth embodiment of the present invention.

Embodiments of the present invention are described in detail below. Examples are shown in the drawings, and the same reference numerals are used for the same components across the drawings. Where necessary, redundant description of the same reference numerals is omitted.

FIG. 1 shows an apparatus for generating a depth map according to the first embodiment of the present invention.

Referring to FIG. 1, the depth map generation apparatus 100 includes an image acquisition unit 110, a three-dimensional structure matching unit 120, and a matching-based depth map generation unit 130.

The input to the depth map generation apparatus 100 is a video sequence composed of a plurality of images. The image acquisition unit 110 acquires a plurality of temporally consecutive two-dimensional images from the input video. For each of these two-dimensional images, the three-dimensional structure matching unit 120 finds, among a plurality of pre-stored typical three-dimensional structures, the one that best matches the current image.

Specifically, an approximate three-dimensional structure of the current image can be obtained by applying prior knowledge of typical three-dimensional structures. A set of example typical three-dimensional structures is shown in FIG. 2.

Referring to FIG. 2, the second row shows examples of typical three-dimensional structures, and the first row shows images of the corresponding real scenes. The actual structure of a real scene is more complex than the pre-stored typical three-dimensional structures. However, because of the limits of the human visual system, a depth map generated from a simple three-dimensional structure alone can still give viewers a stronger three-dimensional impression on a three-dimensional television than conventional two-dimensional video.

To find the pre-stored typical three-dimensional structure that best matches the current image, the matching degree between the current image and each of the pre-stored typical three-dimensional structures is calculated, and the typical three-dimensional structure with the highest matching degree is determined as the three-dimensional structure of the current two-dimensional image.

FIG. 3 shows the three-dimensional structure matching unit 120 according to the present invention.

Referring to FIG. 3, the three-dimensional structure matching unit 120 includes a plane division module 121, a matching degree calculation module 122, and a three-dimensional structure determination module 123.

The plane division module 121 divides the current image into at least one region according to the planes of one of the pre-stored typical three-dimensional structures. For example, when the current image is matched against the first typical three-dimensional structure in the second row of FIG. 2, that structure has only one plane, so the whole of the current image forms a single region. When the current image is matched against the fourth typical three-dimensional structure in the second row of FIG. 2, however, the current image must be divided into four regions corresponding to the four planes of that structure.

The matching degree calculation module 122 then calculates the density of each region and the similarity between pairs of regions from the features (color, gradient, boundary) within each region of the current image, and from these calculates the matching degree S (Equation 1). In Equation 1, n is the number of regions into which the image is divided, ri and rj are divided regions, Dense(ri) is the density of a region calculated from the features within it, and diff(ri, rj) is the similarity between two regions calculated from their features.

The calculation of Dense(ri) is given by Equation 2 below.

In Equation 2, std(ri) is the standard deviation of the features in region ri, given by Equation 3 below.

In Equation 3, p is a pixel in region ri, I(p) is the feature value of pixel p, $\bar{I}$ is the mean feature value of the pixels in region ri, and area(r) is the number of pixels in region ri.

The calculation of diff(ri, rj) is given by Equation 4 below.

In Equation 4, $\bar{I}$ is the mean feature value of the pixels in a region and |.| is a norm: for a vector $X = (x_1, \dots, x_k)$, the 1-norm is $|X|_1 = \sum_i |x_i|$, the 2-norm is $|X|_2 = \sqrt{\sum_i x_i^2}$, and the ∞-norm is $|X|_\infty = \max_i |x_i|$.

Dense(ri) is the density of the features in a region: the more compact the features in the region, the larger the value of Dense(ri). diff(ri, rj) expresses the similarity between two regions as a difference: the more the two regions differ, the larger the value of diff(ri, rj). A relatively high matching degree S therefore means that the features within each divided image region are consistent and that the regions are distinct from one another.

In general, each three-dimensional plane has consistent image features, and two different three-dimensional planes have different features. Therefore, when the current image is divided according to different typical three-dimensional structures and a matching degree is calculated for each, a higher matching degree means that the current image matches the typical three-dimensional structure used for that division more closely. Accordingly, the three-dimensional structure determination module 123 determines the typical three-dimensional structure with the highest matching degree as the three-dimensional structure of the current two-dimensional image.
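The selection step described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the patented formulas: Equations 1 and 2 are not reproduced in this text, so the density `1 / (1 + std)` and the plain sum of the two terms are assumptions, as are the function names and the scalar (grey-level) feature. Only the standard deviation and the difference of region means follow the definitions given for Equations 3 and 4.

```python
from math import sqrt

def region_stats(values):
    """Return (mean, population standard deviation) of a list of feature values."""
    mean = sum(values) / len(values)
    std = sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return mean, std

def matching_degree(image, labels):
    """Score one candidate structure.

    image  : 2-D list of scalar feature values (assumed grey levels in [0, 1])
    labels : 2-D list of region indices, one region per plane of the structure
    """
    regions = {}
    for image_row, label_row in zip(image, labels):
        for value, label in zip(image_row, label_row):
            regions.setdefault(label, []).append(value)
    stats = [region_stats(values) for values in regions.values()]
    # ASSUMPTION: Dense(r) = 1 / (1 + std(r)), averaged over the regions.
    dense = sum(1.0 / (1.0 + std) for _, std in stats) / len(stats)
    # diff(ri, rj): absolute difference of region means (scalar case of the norm).
    diff = sum(abs(stats[i][0] - stats[j][0])
               for i in range(len(stats))
               for j in range(i + 1, len(stats)))
    # ASSUMPTION: the density term and the difference term are simply added.
    return dense + diff

def best_structure(image, candidates):
    """Return the name of the candidate layout with the highest matching degree."""
    return max(candidates, key=lambda name: matching_degree(image, candidates[name]))

# Hypothetical 4x4 image: dark upper half, bright lower half.
image = [[0.0] * 4, [0.0] * 4, [1.0] * 4, [1.0] * 4]
candidates = {
    "single_plane": [[0] * 4 for _ in range(4)],
    "horizontal_split": [[0] * 4, [0] * 4, [1] * 4, [1] * 4],
    "vertical_split": [[0, 0, 1, 1] for _ in range(4)],
}
print(best_structure(image, candidates))  # horizontal_split
```

The horizontal split wins because its two regions are internally uniform (high density) and differ strongly from each other, which is the behaviour the text attributes to a high matching degree.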

The matching-based depth map generation unit 130 then generates a matching-based depth map of the current image according to the determined three-dimensional structure. Each pixel of the matching-based depth map lies in the range [0, 1] and indicates the depth value of the corresponding pixel in the current two-dimensional image.

This embodiment works comparatively well for images in which the scene occupies the main portion.

FIG. 4 shows an apparatus for generating a depth map according to a second embodiment of the present invention.

Referring to FIG. 4, the depth map generation apparatus 400 includes an image acquisition unit 410, a saliency map generation unit 420, a saliency-based depth map generation unit 430, a three-dimensional structure matching unit 120, a matching-based depth map generation unit 130, and a comprehensive depth map generation unit 440. The three-dimensional structure matching unit 120 and the matching-based depth map generation unit 130 are the same as the modules with the same reference numerals in FIG. 1.

The input to the depth map generation apparatus 400 is a video sequence composed of a plurality of images. The image acquisition unit 410 acquires a plurality of temporally consecutive two-dimensional images from the input video.

According to the human visual perception (HVP) model, viewers are more interested in the salient parts of a video, and those salient parts generally feel closer to the viewer (that is, they have a relatively small depth). Salient features, motion, or objects in the two-dimensional image can therefore be identified, and a depth value in [0, 1] can be given to each pixel according to the identified features, motion, or objects to obtain a saliency-based depth map.

To identify salient features, motion, or objects in the image, the saliency map generation unit 420 includes a feature saliency map generation module 421, a motion saliency map generation module 422, an object saliency map generation module 423, and a saliency map control module 424, as shown in FIG. 5.

The feature saliency map generation module 421 identifies features of the two-dimensional image such as color, gradient, or boundary features. For example, the Sobel operator or the Prewitt operator is used to identify gradient features, and a Laplacian edge detection algorithm is used to identify boundary features in the image.

The motion saliency map generation module 422 generates a motion saliency map by identifying motion between two temporally adjacent images.

The object saliency map generation module 423 generates an object saliency map by identifying objects (people, faces, or text) in the image, for example by using an object identification model based on a boosting algorithm to identify people and faces in the image.

The saliency map control module 424 uses one, any two, or all of the feature saliency map generation module 421, the motion saliency map generation module 422, and the object saliency map generation module 423 to generate one, any two, or all of the saliency maps.

For example, if the two-dimensional images in the video sequence contain many people, the modules used by the saliency map control module 424 include the object saliency map generation module 423. If the two-dimensional images in the video sequence contain no people, faces, or text but contain relatively much motion, the modules used by the saliency map control module 424 may exclude the object saliency map generation module 423 and include the motion saliency map generation module 422, and so on. That is, the three kinds of modules are selected according to the two-dimensional image sequence to be processed.

If only the object saliency map is generated, the saliency-based depth map generation unit 430 gives a constant in the range (0, 1) (for example, 0.8) to the pixels of the saliency-based depth map that correspond to pixels identified as objects in the two-dimensional image, and gives 0 to the other pixels of the saliency-based depth map.

If only one of the feature saliency map and the motion saliency map is generated, the saliency-based depth map generation unit 430 gives each pixel of the saliency-based depth map a value in the range [0, 1] according to the saliency of the corresponding pixel in the feature saliency map or the motion saliency map. A value of 0 indicates that the corresponding pixel has minimum saliency, and a value of 1 indicates that it has maximum saliency.

For example, a value in the range [0, 1] is given to each pixel of the saliency-based depth map according to the difference, calculated at each position in the image and at different scales, between the feature value of the center pixel or center block and the mean feature value of the adjacent pixels or blocks above, below, to the left, and to the right.

For example, suppose that the color feature is used to generate the feature saliency map, so that the feature is an (R, G, B) vector. First, the difference between the (R, G, B) vector of a single pixel and the mean of the (R, G, B) vectors of the adjacent pixels above, below, to the left, and to the right is calculated and recorded. The scale is then enlarged to 4*4 blocks: the (R, G, B) vectors of the 16 pixels of each 4*4 block are averaged (the center mean), the mean over the adjacent blocks is calculated (the adjacent mean), and the difference between the center mean and the adjacent mean is calculated. Next, the difference between the center mean and the adjacent mean of 8*8 blocks is calculated, and the operation is repeated at increasing scales up to the whole image. Finally, the differences at all scales are added for each pixel and normalized to the range [0, 1] to obtain the saliency-based depth map.
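The multi-scale procedure above can be sketched as follows. This is a simplified illustration, not the patent's implementation: it assumes a scalar feature (for example grey level) in place of the (R, G, B) vector, uses the block sizes 1, 4, and 8 only, and normalizes by min-max scaling. All function names are hypothetical.

```python
def block_means(feature, b):
    """Average the feature over b*b blocks (edge blocks may be smaller)."""
    h, w = len(feature), len(feature[0])
    out = []
    for by in range(0, h, b):
        row = []
        for bx in range(0, w, b):
            vals = [feature[y][x]
                    for y in range(by, min(by + b, h))
                    for x in range(bx, min(bx + b, w))]
            row.append(sum(vals) / len(vals))
        out.append(row)
    return out

def center_surround(grid):
    """|center - mean of the up/down/left/right neighbours| for every cell."""
    h, w = len(grid), len(grid[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            nbrs = [grid[ny][nx]
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                    if 0 <= ny < h and 0 <= nx < w]
            if nbrs:  # a 1x1 grid has no neighbours and contributes 0
                out[y][x] = abs(grid[y][x] - sum(nbrs) / len(nbrs))
    return out

def feature_saliency(feature, block_sizes=(1, 4, 8)):
    """Sum the center-surround differences over all scales, then scale to [0, 1]."""
    h, w = len(feature), len(feature[0])
    total = [[0.0] * w for _ in range(h)]
    for b in block_sizes:
        contrast = center_surround(block_means(feature, b))
        for y in range(h):
            for x in range(w):
                total[y][x] += contrast[y // b][x // b]
    lo = min(min(row) for row in total)
    hi = max(max(row) for row in total)
    if hi == lo:
        return [[0.0] * w for _ in range(h)]
    return [[(v - lo) / (hi - lo) for v in row] for row in total]

# Hypothetical 8x8 image with a single bright pixel.
image = [[0.0] * 8 for _ in range(8)]
image[3][3] = 1.0
saliency = feature_saliency(image)
```

In this example the bright pixel receives the largest summed contrast and therefore the value 1.0 after normalization, while pixels far from it stay at 0.0.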

If two saliency maps not including the object saliency map are generated, the saliency-based depth map generation unit 430 adds the corresponding pixels of the two saliency maps and gives the normalized sum, or the larger of the two values, to the corresponding pixel of the saliency-based depth map.

If two saliency maps including the object saliency map are generated, the saliency-based depth map generation unit 430 gives a constant in the range (0, 1) (for example, 0.8) to each pixel of the saliency-based depth map that corresponds to a pixel identified as an object in the object saliency map, and gives the corresponding pixel value of the saliency map other than the object saliency map to the other pixels of the saliency-based depth map.

If all of the saliency maps are generated, the saliency-based depth map generation unit 430 gives a constant in the range (0, 1) (for example, 0.8) to each pixel of the saliency-based depth map that corresponds to a pixel identified as an object in the object saliency map, and adds the corresponding pixels of the two saliency maps other than the object saliency map and gives the normalized sum, or the larger of the two values, to the corresponding pixel of the saliency-based depth map.
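The cases above (object map only, one or two non-object maps, or a mix) can be folded into one small function. The sketch below is illustrative only: the function and parameter names are hypothetical, the object saliency map is assumed to be a boolean mask, and "normalized sum" is assumed to mean the sum divided by the number of maps, which the text does not specify.

```python
OBJECT_DEPTH = 0.8  # constant in (0, 1); 0.8 is the example value given in the text

def saliency_depth(feature=None, motion=None, obj=None, use_max=False):
    """Build the saliency-based depth map from whichever maps were generated.

    feature, motion : 2-D lists of saliency values in [0, 1], or None
    obj             : 2-D list of booleans (True = pixel identified as an object), or None
    use_max         : take the larger value instead of the normalized sum
    """
    others = [m for m in (feature, motion) if m is not None]
    if obj is None and not others:
        raise ValueError("at least one saliency map is required")
    ref = obj if obj is not None else others[0]
    depth = []
    for y in range(len(ref)):
        row = []
        for x in range(len(ref[0])):
            if obj is not None and obj[y][x]:
                row.append(OBJECT_DEPTH)          # object pixels get the constant
            elif others:
                values = [m[y][x] for m in others]
                # ASSUMPTION: normalized sum = sum / number of maps
                row.append(max(values) if use_max else sum(values) / len(values))
            else:
                row.append(0.0)                   # object map only: other pixels get 0
        depth.append(row)
    return depth

# Hypothetical 1x2 maps.
feature = [[0.2, 0.4]]
motion = [[0.6, 0.0]]
obj = [[False, True]]
print(saliency_depth(obj=obj))  # [[0.0, 0.8]]
```

With a single non-object map, both the normalized sum and the maximum reduce to that map's own value, which matches the single-map case described in the text.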

The comprehensive depth map generation unit 440 generates the comprehensive depth map either by adding the corresponding pixel values of the saliency-based depth map and the matching-based depth map and normalizing the sum, or by selecting the larger of the two corresponding pixel values.
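A minimal sketch of this combination step follows. The function name is hypothetical, and dividing the sum by 2 is an assumed normalization that keeps the result in [0, 1]; the text only says that the sum is normalized.

```python
def combine_depth(saliency_map, matching_map, use_max=False):
    """Combine two depth maps pixel by pixel.

    use_max=False : normalized sum (ASSUMPTION: sum divided by 2)
    use_max=True  : the larger of the two corresponding values
    """
    return [[max(s, m) if use_max else (s + m) / 2.0
             for s, m in zip(s_row, m_row)]
            for s_row, m_row in zip(saliency_map, matching_map)]

# Hypothetical 1x2 depth maps.
saliency_map = [[0.8, 0.0]]
matching_map = [[0.2, 0.6]]
averaged = combine_depth(saliency_map, matching_map)
largest = combine_depth(saliency_map, matching_map, use_max=True)
```

Taking the maximum preserves a salient object's depth value even where the matched structure assigns a small one, whereas the normalized sum blends the two cues.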

FIG. 6 shows the result of combining the saliency-based depth map and the matching-based depth map of the present invention to generate the comprehensive depth map.

FIG. 7 shows an apparatus 700 for smoothing a depth map according to a third embodiment of the present invention.

Referring to FIG. 7, the apparatus 700 for smoothing a depth map includes an image acquisition unit 710, a depth map acquisition unit 720, and a spatio-temporal smoothing unit 730.

The image acquisition unit 710 acquires a plurality of temporally consecutive two-dimensional images from an input video sequence composed of a plurality of images, and the depth map acquisition unit 720 acquires the initial depth map corresponding to each of the input two-dimensional images. Each pixel value of an initial depth map is the depth value of the corresponding pixel in the corresponding two-dimensional image.

According to the HVP model, the human eye is sensitive to large depth changes at boundary positions in an image, and rapid depth changes between adjacent frames make the viewer dizzy. The initial depth map is therefore smoothed in the spatial domain and the time domain to generate a depth map that is comfortable for the viewer.

FIG. 8 shows the spatio-temporal smoothing unit 730 according to the present invention.

Referring to FIG. 8, the spatio-temporal smoothing unit 730 includes a smoothing amount calculation module 731 and a smoothing module 732.

FIG. 9 shows an example of smoothing in the spatial domain and the time domain according to the present invention.

Referring to FIGS. 8 and 9, the smoothing amount calculation module 731 compares the feature value (for example, color or texture) of a pixel P1(x, y, t) in frame t with that of a pixel P2(x+△x, y+△y, t+△t) adjacent to P1 in the spatial and time domains. If the values of △x, △y, and △t are set too large, the smoothing is excessive; conversely, if they are set too small, no smoothing effect appears. The values of △x, △y, and △t are therefore chosen according to the desired smoothing effect; for example, △x=5, △y=5, and △t=5 give a reasonably appropriate smoothing effect.

According to the HVP model, when color is used as the feature, for example, if P1 and P2 have similar colors, their depths are also similar. The depths of P1 and P2 are therefore adjusted according to their colors so that the absolute value of the difference between the smoothed depth values D'(P1) and D'(P2) is smaller than the absolute value of the difference between the depth values D(P1) and D(P2) before smoothing.

The smoothing amount calculation module 731 calculates the smoothing amount S according to Equation 5: S(P1, P2) = (D(P1) - D(P2)) * N(P1, P2) * C(P1, P2).

In Equation 5, D(.) is the depth value of a pixel, C(P1, P2) is the difference (that is, the similarity) between the feature values of pixels P1 and P2, and N(P1, P2) is the distance between P1 and P2 calculated from (△x, △y, △t). C(P1, P2) and N(P1, P2) can be calculated according to Equations 6 and 7 below.

In Equation 6, I(.) is the feature value of a pixel and |.| is the absolute value.

In Equation 7, the distance between P1 and P2 is calculated from (△x, △y, △t).

The smoothing module 732 calculates the smoothed depth value D'(P1) = D(P1) - S(P1) of the pixel P1 of the current two-dimensional image according to the smoothing amount S(P1, P2).

Applying the functions of the smoothing amount calculation module 731 and the smoothing module 732 to every pixel of the current two-dimensional image yields the smoothed depth map of the current two-dimensional image.
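As a rough illustration of the pairwise update described above, the following Python sketch applies a smoothing amount of the form (D(P1) - D(P2)) * N(P1, P2) * C(P1, P2) to two depth values. Equations 6 and 7 are not reproduced in this text, so the `similarity` and `proximity` functions are hypothetical stand-ins, chosen only so that nearby pixels with similar features are pulled together; they are not the patent's formulas. The function names and the `max_dist` parameter are likewise assumptions made for the example.

```python
import math


def similarity(i1, i2):
    """Hypothetical stand-in for C(P1, P2): 1.0 for identical features,
    decaying toward 0 as the feature values move apart."""
    return math.exp(-abs(i1 - i2))


def proximity(offset, max_dist=10.0):
    """Hypothetical stand-in for N(P1, P2): 0.5 at zero distance,
    falling linearly to 0 at max_dist and beyond."""
    dx, dy, dt = offset
    dist = math.sqrt(dx * dx + dy * dy + dt * dt)
    return 0.5 * max(0.0, 1.0 - dist / max_dist)


def smooth_pair(d1, d2, i1, i2, offset, max_dist=10.0):
    """Return the smoothed depths (D'(P1), D'(P2)) for one pixel pair.

    d1, d2 : depth values D(P1), D(P2) before smoothing
    i1, i2 : feature values I(P1), I(P2), e.g. gray level
    offset : (dx, dy, dt) displacement from P1 to P2
    """
    s = (d1 - d2) * proximity(offset, max_dist) * similarity(i1, i2)
    return d1 - s, d2 + s


# Two neighboring pixels with the same color but different depths.
new_d1, new_d2 = smooth_pair(0.9, 0.1, 0.5, 0.5, (1, 0, 0))
```

Because the assumed weight N * C stays within [0, 0.5], the depth difference is scaled by (1 - 2 * N * C), so it shrinks without changing sign. Pixels that differ strongly in color, or that lie outside the window, are left almost unchanged.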

FIG. 10 shows a depth map generating apparatus 1000 according to a fourth embodiment of the present invention.

The depth map generating apparatus 1000 includes an image acquiring unit 1010, a saliency map generation unit 420, a saliency-based depth map generation unit 430, a three-dimensional structure matching unit 120, a matching-based depth map generation unit 130, a combined depth map generation unit 440, and a spatiotemporal smoothing unit 730.

The saliency map generation unit 420 and the saliency-based depth map generation unit 430 are identical to the modules with the same reference numerals shown in FIG. 4. The three-dimensional structure matching unit 120 and the matching-based depth map generation unit 130 are identical to the modules with the same reference numerals shown in FIG. 1. The spatiotemporal smoothing unit 730 is identical to the module with the same reference numeral shown in FIG. 7.

The image acquiring unit 1010 acquires a plurality of temporally consecutive two-dimensional images from the input video.

FIG. 11 shows an example of a depth map generated by the depth map generating apparatus 1000 according to the present invention. The depth map according to the present invention gives relatively good results.

FIG. 12 is a flowchart of a matching-based depth map generation method according to a fifth embodiment of the present invention.

Referring to FIG. 12, in step S1210, a plurality of temporally consecutive two-dimensional images are acquired from the input video.

In step S1220, the matching degree between the current two-dimensional image and each pre-stored three-dimensional typical structure is calculated, and the three-dimensional typical structure with the highest matching degree is determined as the three-dimensional structure of the current image. Although the pre-stored three-dimensional typical structures are generally simpler than the actual structure of a real scene, the characteristics of the human visual system mean that a depth map generated from a simple three-dimensional typical structure alone can give viewers a much better three-dimensional impression than conventional two-dimensional video.

To find, among the plurality of pre-stored three-dimensional typical structures, the one that best matches the current image, the matching degree between the current two-dimensional image and each of the pre-stored three-dimensional typical structures must be calculated.

FIG. 13 is a flowchart of calculating the matching degree according to the present invention.

Referring to FIG. 13, in step S1221, the current image is divided into at least one plane according to one of the plurality of pre-stored three-dimensional typical structures. For example, if the three-dimensional typical structure of FIG. 2 against which the current image is matched has only one plane, the entire current image is treated as a single region; if that structure has multiple planes, the current image must be divided into multiple regions corresponding to the respective planes of the three-dimensional typical structure.

Next, in step S1221, the density of each region and the similarity between two regions are calculated from a feature (color, gradient, or boundary) of each region of the current image using Equations 1, 2, 3, and 4, and the matching degree S is calculated. The higher the matching degree S calculated by Equation 1, the better the current image matches the three-dimensional typical structure on which the division was based.

Accordingly, in step S1225, the three-dimensional typical structure with the highest matching degree is determined as the three-dimensional structure of the current image.
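The selection of the best-matching structure can be sketched in Python as follows. Equations 1 to 4 are not part of this passage, so the sketch makes explicit assumptions: the density of a region is taken as 1 / (1 + standard deviation) of its feature values, the similarity between two regions is the absolute difference of their mean feature values, and the matching degree is their unweighted sum. Each candidate structure is represented by a hypothetical label map that assigns every pixel to a plane; the names `single_plane` and `ground_and_sky` are invented for the example.

```python
from itertools import combinations
from statistics import mean, pstdev


def split_by_planes(image, plane_labels):
    """Group the feature values of `image` by the plane label of each pixel."""
    regions = {}
    for img_row, label_row in zip(image, plane_labels):
        for value, label in zip(img_row, label_row):
            regions.setdefault(label, []).append(value)
    return list(regions.values())


def density(region):
    """Assumed density: close to 1 when the feature is uniform in the region."""
    return 1.0 / (1.0 + pstdev(region))


def region_difference(region_a, region_b):
    """Assumed similarity term: absolute difference of the mean feature values."""
    return abs(mean(region_a) - mean(region_b))


def matching_degree(regions):
    """Sum of region densities plus pairwise differences between regions."""
    score = sum(density(r) for r in regions)
    score += sum(region_difference(a, b) for a, b in combinations(regions, 2))
    return score


def best_structure(image, structures):
    """Return the name of the candidate structure with the highest matching degree."""
    return max(
        structures,
        key=lambda name: matching_degree(split_by_planes(image, structures[name])),
    )


# A 2x2 gray image: bright top row, dark bottom row.
image = [[1.0, 1.0], [0.0, 0.0]]
structures = {
    "single_plane": [[0, 0], [0, 0]],
    "ground_and_sky": [[0, 0], [1, 1]],
}
chosen = best_structure(image, structures)
```

With these assumptions, a division whose regions are internally uniform and clearly different from one another scores highest, which is the behavior the description attributes to the matching degree S.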

Then, referring again to FIG. 12, in step S1230, a matching-based depth map of the current image is generated according to the determined three-dimensional structure. Each pixel of the matching-based depth map has a value in the range [0, 1] and represents the depth value of the corresponding pixel of the current two-dimensional image: 0 indicates that the corresponding pixel has the maximum depth, and 1 indicates that it has the minimum depth. This embodiment produces relatively good results for images in which the scene occupies the main part.

FIG. 14 is a flowchart of a saliency-based depth map generation method according to a sixth embodiment of the present invention.

Referring to FIG. 14, in step S1410, a plurality of temporally consecutive two-dimensional images are acquired from the input video.

In step S1420, one, any two, or all of a feature saliency map, a motion saliency map, and an object saliency map are generated. The feature saliency map is generated by identifying features (color, gradient, or boundary features) of the two-dimensional image, the motion saliency map by identifying motion between two temporally adjacent two-dimensional images, and the object saliency map by identifying objects (people, faces, or text) in the two-dimensional image.

If only the object saliency map is generated in step S1420, then in step S1430 a constant value within the range (0, 1) (e.g., 0.8) is assigned to the pixels of the saliency-based depth map that correspond to pixels identified as objects in the two-dimensional image, and 0 is assigned to the other pixels of the saliency-based depth map.

If either the feature saliency map or the motion saliency map is generated in step S1420, then in step S1430 a value within the range [0, 1] is assigned to each pixel of the saliency-based depth map according to the saliency of each pixel in the feature saliency map or the motion saliency map. 0 indicates that the corresponding pixel has the minimum saliency, and 1 indicates that it has the maximum saliency. For example, a value within [0, 1] is assigned to each pixel of the saliency-based depth map according to the difference, computed at each image position and at different scales, between the feature value of a center pixel or center block and the average feature value of the adjacent pixels or blocks above, below, left, and right.
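The center-surround comparison mentioned above can be illustrated with a small Python sketch. It works at a single scale on individual pixels, whereas the description also allows blocks and several scales. Dividing by the largest response to bring the values into [0, 1] is an assumption, as is the function name.

```python
def center_surround_saliency(image):
    """Single-scale feature saliency for a 2-D grid of feature values.

    Each pixel's response is the absolute difference between its own value
    and the mean of its upper, lower, left, and right neighbors (only the
    neighbors that exist are used at the image border).
    """
    height, width = len(image), len(image[0])
    raw = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            neighbors = [
                image[ny][nx]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1))
                if 0 <= ny < height and 0 <= nx < width
            ]
            if neighbors:
                raw[y][x] = abs(image[y][x] - sum(neighbors) / len(neighbors))
    peak = max(max(row) for row in raw)
    if peak == 0:
        return raw  # flat image: nothing stands out
    # Assumed normalization: scale so the strongest response becomes 1.
    return [[value / peak for value in row] for row in raw]


# A single bright pixel on a dark background is the most salient position.
saliency = center_surround_saliency([[0, 0, 0], [0, 1, 0], [0, 0, 0]])
```

In this example the center pixel receives saliency 1, its four direct neighbors receive 1/3, and the corners receive 0.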

If two saliency maps not including the object saliency map are generated in step S1420, then in step S1430 the normalized sum of the corresponding pixels of the two saliency maps, or the larger of the two values, is assigned to the corresponding pixel of the saliency-based depth map.

If two saliency maps including the object saliency map are generated in step S1420, then in step S1430 a constant within the range (0, 1) (e.g., 0.8) is assigned to the pixels of the saliency-based depth map that correspond to pixels identified as objects in the object saliency map, and the corresponding pixel values of the other saliency map are assigned to the other corresponding pixels of the saliency-based depth map.

If all of the saliency maps are generated in step S1420, then in step S1430 a constant within the range (0, 1) (e.g., 0.8) is assigned to the pixels of the saliency-based depth map that correspond to pixels identified as objects in the object saliency map, and the normalized sum of the corresponding pixels of the two saliency maps other than the object saliency map, or the larger of the two values, is assigned to the corresponding pixels of the saliency-based depth map.
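The five cases of step S1430 can be summarized in one Python sketch. The function and parameter names are hypothetical, the object saliency map is assumed to be a binary mask, and "normalized sum" is interpreted here as half the sum of two values in [0, 1]; the description does not specify the normalization.

```python
def merge_maps(map_a, map_b, mode="max"):
    """Merge two saliency maps pixel by pixel."""
    if mode == "max":
        return [[max(a, b) for a, b in zip(ra, rb)] for ra, rb in zip(map_a, map_b)]
    if mode == "sum":
        # Assumed normalization: halve the sum so the result stays in [0, 1].
        return [[(a + b) / 2.0 for a, b in zip(ra, rb)] for ra, rb in zip(map_a, map_b)]
    raise ValueError(f"unknown mode: {mode}")


def saliency_depth_map(feature=None, motion=None, objects=None,
                       constant=0.8, mode="max"):
    """Build a saliency-based depth map from whichever maps are available.

    feature, motion : saliency maps with values in [0, 1], or None
    objects         : binary mask (truthy where an object was identified), or None
    constant        : depth assigned to object pixels, within (0, 1)
    """
    others = [m for m in (feature, motion) if m is not None]
    if not others and objects is None:
        raise ValueError("at least one saliency map is required")
    if len(others) == 2:
        base = merge_maps(others[0], others[1], mode)
    elif len(others) == 1:
        base = [row[:] for row in others[0]]
    else:
        base = [[0.0] * len(row) for row in objects]  # object map only
    if objects is None:
        return base
    # Object pixels get the constant; all other pixels keep the base value.
    return [
        [constant if is_object else value for is_object, value in zip(obj_row, base_row)]
        for obj_row, base_row in zip(objects, base)
    ]


depth = saliency_depth_map(
    feature=[[0.2, 0.6]], motion=[[0.4, 0.1]], objects=[[1, 0]]
)
```

Passing only `objects` reproduces the first case, passing a single feature or motion map the second, and so on; the `mode` argument switches between the larger value and the normalized sum.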

Step S1440 is identical to step S1220 of FIG. 12, and step S1450 is identical to step S1230 of FIG. 12.

In step S1460, a combined depth map is generated either by adding and normalizing the corresponding pixel values of the saliency-based depth map generated in step S1430 and the matching-based depth map generated in step S1450, or by selecting the larger of the corresponding pixel values of the two depth maps.

FIG. 15 is a flowchart of a method of smoothing a depth map according to a seventh embodiment of the present invention.

Referring to FIG. 15, in step S1510, a plurality of temporally consecutive two-dimensional images are acquired from an input video sequence composed of multiple images.

In step S1520, an initial depth map corresponding to each input two-dimensional image is acquired; each pixel value of the initial depth map represents the depth value of the corresponding pixel of the corresponding two-dimensional image.

Referring to FIG. 9, in step S1530, the feature value (e.g., color or texture) of a pixel P1(x, y, t) in frame t is compared with that of a pixel P2(x+Δx, y+Δy, t+Δt) that is adjacent to P1 in the spatial and temporal domains. If Δx, Δy, and Δt are set too large, the depth map is over-smoothed; if they are set too small, no smoothing effect appears. The values of Δx, Δy, and Δt are therefore chosen according to the desired smoothing effect; for example, Δx=5, Δy=5, and Δt=5 give a reasonably suitable smoothing effect. According to the HVP model, when color is used as the feature, for example, pixels P1 and P2 with similar colors tend to have similar depths. The depths of P1 and P2 are therefore adjusted according to their colors so that the absolute value of the difference between the smoothed depth values D'(P1) and D'(P2) is smaller than the difference between the depth values D(P1) and D(P2) before smoothing.

The smoothing amount S is calculated according to Equations 5, 6, and 7.

Then, the depth value D'(P1) = D(P1) - S(P1) of pixel P1 of the current two-dimensional image after smoothing is calculated according to the smoothing amount S(P1, P2).

The smoothing amount S is calculated and smoothing is performed for each pixel of the current two-dimensional image, yielding the smoothed depth map of the current two-dimensional image.

FIG. 16 is a flowchart of a depth map generation method according to an eighth embodiment of the present invention.

Referring to FIG. 16, in step S1610, a plurality of temporally consecutive two-dimensional images are acquired from an input video sequence composed of multiple images. Steps S1620, S1630, S1640, S1650, and S1660 are identical to steps S1420, S1430, S1440, S1450, and S1460 of FIG. 14, respectively. Step S1670 is identical to step S1523 of FIG. 15.

The depth map generation method according to an embodiment of the present invention may also be implemented as program instructions executable by various computer means and recorded on a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the present invention, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tape; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the present invention, and vice versa.

Although the present invention has been described above with reference to limited embodiments and drawings, it is not limited to those embodiments, and a person of ordinary skill in the art to which the invention pertains can make various modifications and variations from this description. Therefore, the scope of the present invention should not be limited to the described embodiments, but should be determined by the claims below and their equivalents.

100, 400, 1000: depth map generating apparatus
110, 410, 710, 1010: image acquiring unit
120: three-dimensional structure matching unit
121: plane division module
122: matching degree calculation module
123: three-dimensional structure determination module
130: matching-based depth map generation unit
420: saliency map generation unit
421: feature saliency map generation module
422: motion saliency map generation module
423: object saliency map generation module
424: saliency map control module
430: saliency-based depth map generation unit
440: combined depth map generation unit
700: smoothing apparatus
720: depth map acquiring unit
730: spatiotemporal smoothing unit
731: smoothing amount calculation module
732: smoothing module

Claims

A depth map generating apparatus comprising:
an image acquiring unit that acquires a plurality of temporally consecutive two-dimensional images from input video;
a three-dimensional structure matching unit that calculates the matching degree between a current two-dimensional image among the plurality of two-dimensional images and each of a plurality of pre-stored three-dimensional typical structures, and determines the three-dimensional typical structure with the highest matching degree as the three-dimensional structure of the current two-dimensional image; and
a matching-based depth map generation unit that stores depth maps of the three-dimensional typical structures in advance and takes the depth map of the three-dimensional typical structure determined as the three-dimensional structure of the current two-dimensional image as the matching-based depth map corresponding to the current two-dimensional image, each pixel of the matching-based depth map representing the matching-based depth value of the corresponding pixel of the current two-dimensional image,
wherein the three-dimensional structure matching unit comprises:
a plane division module that divides the current two-dimensional image into at least one region corresponding to a plane of the three-dimensional typical structure being matched;
a matching degree calculation module that calculates the density of each region according to the feature distribution of the region, calculates the average feature value of each region, calculates the similarity between two regions from a norm of the difference between the average values, and calculates the matching degree from the sum of the density of each region and the similarity between the two regions; and
a three-dimensional structure determination module that determines, according to the matching degree, the three-dimensional typical structure with the highest matching degree as the three-dimensional structure of the current two-dimensional image.

The depth map generating apparatus of claim 1, wherein the matching-based depth value is in the range [0, 1], 0 indicating that the corresponding pixel has the maximum depth and 1 indicating that the corresponding pixel has the minimum depth.

(Deleted)

The depth map generating apparatus of claim 1, wherein the matching degree calculation module computes
[equation not reproduced in the source text],
where r is each of the regions, [expression not reproduced in the source text], p is a pixel of the region, I(p) is the feature value of pixel p, [symbol not reproduced in the source text] is the average of the feature values of the pixels in the region, and area(r) is the number of pixels in the region.

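The depth map generating apparatus of claim 1, wherein the matching degree calculation module computes
[equation not reproduced in the source text],
where [symbol not reproduced in the source text] is the average value of the feature in the region and |.| is a norm.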

The depth map generating apparatus of claim 4 or claim 5, wherein the feature is color, gradient, or boundary.

The depth map generating apparatus of claim 5, wherein the norm is the 1-norm, the 2-norm, or the ∞-norm.

A depth map generating apparatus comprising:
an image acquiring unit that acquires a plurality of temporally consecutive two-dimensional images from input video;
a saliency map generation unit that generates, according to the HVP model, at least one saliency map corresponding to a current two-dimensional image among the plurality of two-dimensional images, each pixel of the saliency map representing the saliency of the corresponding pixel of the current two-dimensional image;
a saliency-based depth map generation unit that generates, using the at least one saliency map, a saliency-based depth map corresponding to the current two-dimensional image, each pixel of the saliency-based depth map representing the saliency-based depth value of the corresponding pixel of the current two-dimensional image;
a three-dimensional structure matching unit that calculates the matching degree between the current two-dimensional image among the plurality of two-dimensional images and each of a plurality of pre-stored three-dimensional typical structures, and determines the three-dimensional typical structure with the highest matching degree as the three-dimensional structure of the current two-dimensional image;
a matching-based depth map generation unit that stores depth maps of the three-dimensional typical structures in advance and takes the depth map of the three-dimensional typical structure determined as the three-dimensional structure of the current two-dimensional image as the matching-based depth map corresponding to the current two-dimensional image, each pixel of the matching-based depth map representing the matching-based depth value of the corresponding pixel of the current two-dimensional image; and
a combined depth map generation unit that generates a combined depth map by combining the saliency-based depth map and the matching-based depth map, each pixel of the combined depth map representing the combined depth value of the corresponding pixel of the current two-dimensional image,
wherein the three-dimensional structure matching unit comprises:
a plane division module that divides the current two-dimensional image into at least one region corresponding to a plane of the three-dimensional typical structure being matched;
a matching degree calculation module that calculates the density of each region according to the feature distribution of the region, calculates the average feature value of each region and the similarity between two regions according to the difference between the average values, and calculates the matching degree according to the density of each region and the similarity between the two regions; and
a three-dimensional structure determination module that determines, according to the matching degree, the three-dimensional typical structure with the highest matching degree as the three-dimensional structure of the current two-dimensional image.

The depth map generating apparatus of claim 8, wherein the saliency map generation unit comprises:
a feature saliency map generation module that generates a feature saliency map by identifying features of the current two-dimensional image;
a motion saliency map generation module that generates a motion saliency map by identifying motion between the current two-dimensional image and a two-dimensional image temporally adjacent to it;
an object saliency map generation module that generates an object saliency map by identifying objects in the current two-dimensional image; and
a saliency map control module that generates one, any two, or all of the saliency maps using one, any two, or all of the feature saliency map generation module, the motion saliency map generation module, and the object saliency map generation module.

The depth map generating apparatus of claim 9, wherein the saliency-based depth map generation unit generates the saliency-based depth map through the following processing:
when the saliency map generation unit generates only the object saliency map, the saliency-based depth map generation unit assigns a constant value within the range (0, 1) to the pixels of the saliency-based depth map that correspond to pixels identified as objects in the two-dimensional image, and assigns 0 to the other pixels of the saliency-based depth map;
when the saliency map generation unit generates one of the feature saliency map and the motion saliency map, the saliency-based depth map generation unit assigns a value within the range [0, 1] to each pixel of the saliency-based depth map according to the saliency of each pixel in the feature saliency map or the motion saliency map, 0 indicating that the corresponding pixel has the minimum saliency and 1 indicating that the corresponding pixel has the maximum saliency;
when the saliency map generation unit generates two saliency maps not including the object saliency map, the saliency-based depth map generation unit assigns the normalized sum of the corresponding pixels of the two saliency maps, or the larger of the two values, to the corresponding pixel of the saliency-based depth map;
when the saliency map generation unit generates two saliency maps including the object saliency map, the saliency-based depth map generation unit assigns a constant within the range (0, 1) to the pixels of the saliency-based depth map that correspond to pixels identified as objects in the object saliency map, and assigns the corresponding pixel values of the saliency map other than the object saliency map to the other corresponding pixels of the saliency-based depth map; and
when the saliency map generation unit generates all of the saliency maps, the saliency-based depth map generation unit assigns a constant within the range (0, 1) to the pixels of the saliency-based depth map that correspond to pixels identified as objects in the object saliency map, and assigns the normalized sum of the corresponding pixels of the two saliency maps other than the object saliency map, or the larger of the two values, to the corresponding pixels of the saliency-based depth map.

(Deleted)

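The depth map generating apparatus of claim 8, wherein the matching degree calculation module computes
[equation not reproduced in the source text],
where r is each of the regions, [expression not reproduced in the source text], p is a pixel of the region, and I(p) is the feature value of pixel p.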

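The depth map generating apparatus of claim 8, wherein the matching degree calculation module computes
[equation not reproduced in the source text],
where [symbol not reproduced in the source text] is the average value of the feature of the region and |.| is a norm.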

The depth map generating apparatus of claim 12 or claim 13, wherein the feature is color, gradient, or boundary.

The depth map generating apparatus of claim 13, wherein the norm is the 1-norm, the 2-norm, or the ∞-norm.

The depth map generating apparatus of claim 8, wherein the pixel values of the saliency-based depth map and the matching-based depth map are in the range [0, 1], 0 indicating that the corresponding pixel has the maximum depth and 1 indicating that the corresponding pixel has the minimum depth.

The depth map generating apparatus of claim 8, wherein the combined depth map generation unit generates the combined depth map either by adding and normalizing the corresponding pixel values of the saliency-based depth map and the matching-based depth map, or by selecting the larger of the corresponding pixel values of the two depth maps.

The depth map generating apparatus of claim 9, wherein the objects in the current two-dimensional image include a person, a face, or text.

A depth map smoothing apparatus comprising:
an image acquiring unit that acquires a plurality of temporally consecutive two-dimensional images from input video;
a three-dimensional structure matching unit that calculates the matching degree between a current two-dimensional image among the plurality of two-dimensional images and each of a plurality of pre-stored three-dimensional typical structures, and determines the three-dimensional typical structure with the highest matching degree as the three-dimensional structure of the current two-dimensional image;
a matching-based depth map generation unit that stores depth maps of the three-dimensional typical structures in advance and takes the depth map of the three-dimensional typical structure determined as the three-dimensional structure of the current two-dimensional image as the matching-based depth map corresponding to the current two-dimensional image, each pixel of the matching-based depth map representing the matching-based depth value of the corresponding pixel of the current two-dimensional image; and
a spatiotemporal smoothing unit that smooths the matching-based depth map in the spatial and temporal domains,
wherein the three-dimensional structure matching unit comprises:
a plane division module that divides the current two-dimensional image into at least one region corresponding to a plane of the three-dimensional typical structure being matched;
a matching degree calculation module that calculates the density of each region according to the feature distribution of the region, calculates the average feature value of each region and the similarity between two regions according to the difference between the average values, and calculates the matching degree according to the density of each region and the similarity between the two regions; and
a three-dimensional structure determination module that determines, according to the matching degree, the three-dimensional typical structure with the highest matching degree as the three-dimensional structure of the current two-dimensional image.

The depth map smoothing apparatus of claim 19, wherein the spatiotemporal smoothing unit comprises:
a smoothing amount calculation module that calculates a smoothing amount S(P1, P2) from the similarity, distance, and depth difference between each pixel P1(x, y, t) of the current two-dimensional image at time t and a pixel P2(x+Δx, y+Δy, t+Δt) of the two-dimensional image at time t+Δt, and determines the values of Δx, Δy, and Δt according to the desired smoothing effect; and
a smoothing module that calculates, according to the smoothing amount S(P1, P2), the depth value D'(P1) = D(P1) - S(P1) of pixel P1 of the current two-dimensional image after smoothing, the smoothing amount S(P1, P2) making the absolute value of the difference between the smoothed depth value of pixel P1 and the smoothed depth value D'(P2) = D(P2) + S(P1, P2) of pixel P2 smaller than the absolute value of the difference between the depth value D(P1) of pixel P1 and the depth value D(P2) of pixel P2 before smoothing.

21. The method of claim 20,
The smoothing amount calculation module calculates the smoothing amount S(P1, P2) according to (D(P1) - D(P2)) * N(P1, P2) * C(P1, P2), where D(.) is the depth value of a pixel;

here, N(P1, P2) and C(P1, P2) are given by

[equation not reproduced in this text];

I(.) is the characteristic (color or pattern) value of the pixel, and |.| is the absolute value.
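
The pairwise smoothing described above can be illustrated with a minimal Python sketch. The definitions of N(P1, P2) and C(P1, P2) are not reproduced in this text, so `neighbor_weight` and `color_weight` below, together with their thresholds and the 0.2 strength, are hypothetical stand-ins chosen only so that nearby pixels with similar characteristic values are pulled toward each other. The sketch applies the same amount S(P1, P2) to both pixels.

```python
def neighbor_weight(dx, dy, dt, radius=1.0):
    """Hypothetical N(P1, P2): 1 if the pixels are within `radius`, else 0."""
    dist = (dx * dx + dy * dy + dt * dt) ** 0.5
    return 1.0 if dist <= radius else 0.0


def color_weight(i1, i2, threshold=32, strength=0.2):
    """Hypothetical C(P1, P2): a constant for similar pixels, else 0."""
    return strength if abs(i1 - i2) < threshold else 0.0


def smoothing_amount(d1, d2, i1, i2, dx, dy, dt):
    """S(P1, P2) = (D(P1) - D(P2)) * N(P1, P2) * C(P1, P2)."""
    return (d1 - d2) * neighbor_weight(dx, dy, dt) * color_weight(i1, i2)


def smooth_pair(d1, d2, i1, i2, dx, dy, dt):
    """Return the smoothed depths (D'(P1), D'(P2)) for one pixel pair."""
    s = smoothing_amount(d1, d2, i1, i2, dx, dy, dt)
    return d1 - s, d2 + s
```

With these assumed weights the product N * C stays below 0.5, so the depth gap between the two pixels shrinks without changing sign, which is the property the smoothing module requires.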

22. The method of claim 21,
Wherein the characteristic is a color or a pattern.

An image acquiring unit acquiring a plurality of two-dimensional images successive in time among the input video;
A saliency map generation unit for generating at least one saliency map corresponding to a current two-dimensional image of the plurality of two-dimensional images in accordance with the HVP model, wherein each pixel of the saliency map indicates the saliency of the corresponding pixel of the current two-dimensional image;
A saliency-based depth map generation unit for generating, using the at least one saliency map, a saliency-based depth map corresponding to the current two-dimensional image, wherein each pixel of the saliency-based depth map indicates a saliency-based depth value of the corresponding pixel of the current two-dimensional image;
A three-dimensional structure matching unit for calculating a matching degree between the current two-dimensional image of the plurality of two-dimensional images and each of a plurality of pre-stored three-dimensional representative structures, and determining the three-dimensional representative structure having the highest matching degree as the three-dimensional structure of the current two-dimensional image;
A matching-based depth map generation unit for setting the pre-stored depth map of the three-dimensional representative structure determined as the three-dimensional structure of the current two-dimensional image as a matching-based depth map corresponding to the current two-dimensional image, wherein each pixel of the matching-based depth map indicates a matching-based depth value of the corresponding pixel of the current two-dimensional image;
An overall depth map generation unit for generating an overall depth map by combining the saliency-based depth map and the matching-based depth map, wherein each pixel of the overall depth map indicates an overall depth value of the corresponding pixel in the current two-dimensional image; and
A space-time area smoothing unit for smoothing the space area and the time area with respect to the overall depth map;
Wherein the three-dimensional structure matching unit comprises:
A plane splitting module that divides the current two-dimensional image into at least one region corresponding to a plane of the matched three-dimensional representative structure;
A matching degree calculation module that calculates a density of each region based on the distribution of a characteristic within the region, calculates a mean value of the characteristic for each region and a similarity between two regions based on the difference between their mean values, and calculates a matching degree based on the sum of the density of each region and the similarity between two regions; and
A three-dimensional structure determination module for determining, based on the matching degree, the three-dimensional representative structure having the highest matching degree as the three-dimensional structure of the current two-dimensional image;
And a depth map generation unit for generating depth map information.

24. The method of claim 23,
Wherein the saliency map generator comprises:
A characteristic saliency map generation module that generates a characteristic saliency map by identifying a characteristic of a current two-dimensional image;
A motion saliency map generation module that generates a motion saliency map from the motion between the current two-dimensional image and a two-dimensional image temporally adjacent to it;
An object saliency map generation module that generates an object saliency map by identifying an object in the current two-dimensional image;
A saliency map control module that generates one, any two, or all of the saliency maps using one, any two, or all of the characteristic saliency map generation module, the motion saliency map generation module, and the object saliency map generation module;
And a depth map generation unit for generating depth map information.

25. The method of claim 24,
The depth map generation unit based on the conspicuousness generates a depth map based on the conspicuousness through the following processing:
When the saliency map generation unit generates only the object saliency map, the saliency-based depth map generation unit assigns a constant value in the range (0, 1) to the pixels of the saliency-based depth map that correspond to pixels identified as an object in the two-dimensional image, and assigns 0 to the other pixels of the saliency-based depth map;
When the saliency map generation unit generates only one of the characteristic saliency map and the motion saliency map, the saliency-based depth map generation unit assigns to each pixel of the saliency-based depth map a value in the range [0, 1] according to the saliency of the corresponding pixel of that saliency map, where 0 indicates that the corresponding pixel has minimum saliency and 1 indicates that the corresponding pixel has maximum saliency;
When the saliency map generation unit generates two saliency maps that do not include the object saliency map, the saliency-based depth map generation unit assigns to each pixel of the saliency-based depth map either the normalized sum of the corresponding pixels of the two saliency maps or the larger of the two corresponding pixel values;
When the saliency map generation unit generates two saliency maps that include the object saliency map, the saliency-based depth map generation unit assigns a constant in the range (0, 1) to the pixels of the saliency-based depth map that correspond to pixels identified as an object in the object saliency map, and assigns the corresponding pixel values of the other saliency map to the other pixels of the saliency-based depth map;
When the saliency map generation unit generates all of the saliency maps, the saliency-based depth map generation unit assigns a constant in the range (0, 1) to the pixels of the saliency-based depth map that correspond to pixels identified as an object in the object saliency map, and assigns to the corresponding pixels of the saliency-based depth map either the normalized sum of the corresponding pixels of the two saliency maps other than the object saliency map or the larger of the two corresponding pixel values.
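
The five cases above can be sketched as one per-pixel rule. The following Python sketch is illustrative only: the function name, the flat-list image layout, the object constant 0.8, and the use of a simple average as the "normalized sum" are all assumptions, and it assumes that in the last case the combined value goes to the non-object pixels.

```python
def saliency_depth(characteristic=None, motion=None, object_mask=None,
                   object_value=0.8, use_max=False):
    """Saliency-based depth for images stored as flat lists.

    characteristic, motion: saliency values in [0, 1], or None if not generated.
    object_mask: booleans marking object pixels, or None if not generated.
    object_value: hypothetical constant in (0, 1) for object pixels.
    use_max: take the larger value instead of the normalized sum.
    """
    maps = [m for m in (characteristic, motion) if m is not None]
    if object_mask is None and not maps:
        raise ValueError("at least one saliency map is required")
    n = len(object_mask) if object_mask is not None else len(maps[0])
    depth = []
    for k in range(n):
        if object_mask is not None and object_mask[k]:
            depth.append(object_value)        # pixel identified as an object
        elif not maps:
            depth.append(0.0)                 # object map only: background
        elif len(maps) == 1:
            depth.append(maps[0][k])          # single non-object map
        elif use_max:
            depth.append(max(maps[0][k], maps[1][k]))
        else:
            depth.append((maps[0][k] + maps[1][k]) / 2.0)  # assumed normalization
    return depth
```

For example, `saliency_depth(object_mask=[True, False])` covers the object-only case, while passing all three inputs covers the case in which every saliency map is generated.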

delete

24. The method of claim 23,
The matching degree calculation module calculates the density of each region r according to

[equation not reproduced in this text]

where r is each of the regions, p is a pixel of the region, and I(p) is the characteristic value of the pixel p.

28. The method of claim 27,
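The matching degree calculation module calculates the similarity between a region ri and a region rj according to

[equation not reproduced in this text]

where the equation uses the mean value of the characteristic in each region, and |.| is a norm.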

29. The method according to any one of claims 27 to 28,
Wherein the characteristic is a color, a gradient, or a boundary.

29. The method of claim 28,
Wherein the norm is the 1-norm, the 2-norm, or the ∞-norm.
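
To make the roles of density, similarity, and the choice of norm concrete, here is a hedged Python sketch. The equations themselves are not reproduced in this text, so the forms used below are assumptions: density is taken as 1 / (1 + standard deviation of I(p) over the region), and similarity as the chosen norm of the difference between the two regions' mean characteristic vectors. All function names are hypothetical.

```python
import math


def density(values):
    """Assumed density of a region: 1 / (1 + std of its characteristic values)."""
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return 1.0 / (1.0 + std)


def norm(vec, order=2):
    """1-norm, 2-norm, or infinity-norm of a vector."""
    if order == 1:
        return sum(abs(v) for v in vec)
    if order == 2:
        return math.sqrt(sum(v * v for v in vec))
    if order == math.inf:
        return max(abs(v) for v in vec)
    raise ValueError("order must be 1, 2, or math.inf")


def similarity(mean_i, mean_j, order=2):
    """Assumed similarity: norm of the difference of two regions' mean vectors."""
    return norm([a - b for a, b in zip(mean_i, mean_j)], order)
```

Under these assumptions a uniform region has density 1, and the similarity value grows as the mean characteristics of two regions move apart, so both terms reward a split into internally uniform, mutually distinct planes.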

24. The method of claim 23,
The pixel values of the saliency-based depth map and the matching-based depth map are in the range [0, 1], where 0 indicates that the corresponding pixel has the maximum depth and 1 indicates that the corresponding pixel has the minimum depth.

24. The method of claim 23,
The overall depth map generation unit generates the overall depth map by selecting, for each pixel, either the normalized sum of the corresponding pixels of the saliency-based depth map and the matching-based depth map or the larger of the two corresponding pixel values.
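
The two combination options can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and flat-list layout are hypothetical, and dividing the sum by two is an assumed way of normalizing it back into [0, 1].

```python
def combine_depth(saliency_depth_map, matching_depth_map, use_max=False):
    """Combine two depth maps (flat lists with values in [0, 1]) pixel by pixel."""
    combined = []
    for s, m in zip(saliency_depth_map, matching_depth_map):
        if use_max:
            combined.append(max(s, m))       # larger of the two values
        else:
            combined.append((s + m) / 2.0)   # assumed normalization of the sum
    return combined
```

Taking the larger value keeps a salient foreground pixel near the viewer even when the matched structure places it far away, whereas the normalized sum blends both cues.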

24. The method of claim 23,
Wherein the object in the current two-dimensional image includes a person, a face, or a character.

24. The method of claim 23,
The space-time area smoothing unit includes:
A smoothing amount calculation module that calculates a smoothing amount S(P1, P2) from the similarity, the distance, and the difference in depth values between each pixel P1(x, y, t) of the current two-dimensional image at time t and a pixel P2(x+Δx, y+Δy, t+Δt) of the two-dimensional image at time t+Δt, and determines Δx, Δy, and Δt according to the expected smoothing effect;
A smoothing module that calculates the smoothed depth value D'(P1) = D(P1) - S(P1) of the pixel P1 of the current two-dimensional image according to the smoothing amount S(P1, P2), wherein the smoothing amount S(P1, P2) makes the absolute value of the difference between the smoothed depth value D'(P1) of the pixel P1 and the smoothed depth value D'(P2) = D(P2) + S(P1, P2) of the pixel P2 smaller than the absolute value of the difference between the depth value D(P1) of the pixel P1 and the depth value D(P2) of the pixel P2 before smoothing;
The depth map generating apparatus further comprising:

35. The method of claim 34,
The smoothing amount calculation module calculates the smoothing amount S(P1, P2) according to (D(P1) - D(P2)) * N(P1, P2) * C(P1, P2), where D(.) is the depth value of a pixel;

here, N(P1, P2) and C(P1, P2) are given by

[equation not reproduced in this text];

36. The method of claim 35,
Wherein the feature is a color or a pattern.

Obtaining a plurality of consecutive two-dimensional images of the input video;
Calculating a matching degree between a current two-dimensional image of the plurality of two-dimensional images and each of a plurality of pre-stored three-dimensional representative structures, and determining the three-dimensional representative structure having the highest matching degree as the three-dimensional structure of the current two-dimensional image;
Storing in advance a depth map of each three-dimensional representative structure, and setting the depth map of the three-dimensional representative structure determined as the three-dimensional structure of the current two-dimensional image as a matching-based depth map corresponding to the current two-dimensional image, each pixel of the matching-based depth map indicating a matching-based depth value of the corresponding pixel of the current two-dimensional image;
Determining the three-dimensional structure of the current two-dimensional image comprises:
Dividing a current two-dimensional image into at least one region corresponding to a plane in the matched three-dimensional representative structure;
Calculating a density of each region according to the distribution of a characteristic within the region; calculating a mean value of the characteristic for each region and calculating a similarity between two regions according to the difference between their mean values; calculating a matching degree according to the sum of the density of each region and the similarity between two regions; and
Determining, according to the matching degree, the three-dimensional representative structure having the highest matching degree as the three-dimensional structure of the current two-dimensional image;

39. The method of claim 37,
Wherein the depth value based on the degree of matching is in the range [0, 1], 0 indicates that the corresponding pixel has the maximum depth, and 1 indicates that the corresponding pixel has the minimum depth.
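
The selection of the best-matching structure and the lookup of its stored depth map can be sketched as follows. This is a sketch under stated assumptions: the structure names, the stored depth maps, and the input numbers are hypothetical, and the matching degree is taken as the plain sum of the region densities and the pairwise similarities, as the claim wording suggests.

```python
def matching_degree(densities, similarities):
    """Sum of per-region densities and pairwise region similarities."""
    return sum(densities) + sum(similarities)


def select_structure(scores):
    """Return the name of the candidate with the highest matching degree.

    scores maps a structure name to a (densities, similarities) pair.
    """
    return max(scores, key=lambda name: matching_degree(*scores[name]))


# Hypothetical pre-stored depth maps (flattened); 0 = maximum depth, 1 = minimum depth.
STORED_DEPTH = {
    "ground_and_sky": [0.0, 0.0, 0.5, 1.0],
    "corridor": [0.5, 0.0, 0.0, 0.5],
}


def matching_based_depth(scores):
    """Depth map of the best-matching stored structure."""
    return STORED_DEPTH[select_structure(scores)]
```

Because the depth maps are stored in advance, the per-image cost is limited to scoring each candidate structure and one dictionary lookup.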

delete

39. The method of claim 37,

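The density of each region r is calculated according to

[equation not reproduced in this text]

where r is each of the regions, p is a pixel of the region, and I(p) is the characteristic value of the pixel p.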

39. The method of claim 37,

The similarity between a region ri and a region rj is calculated according to

[equation not reproduced in this text]

where the equation uses the mean value of the characteristic in each region, and |.| is a norm.

42. The method according to any one of claims 40 to 41,
Wherein the characteristic is a color, a gradient, or a boundary.

42. The method of claim 41,
Wherein the norm is the 1-norm, the 2-norm, or the ∞-norm.

Obtaining a plurality of consecutive two-dimensional images of the input video;
Generating at least one saliency map corresponding to a current two-dimensional image of the plurality of two-dimensional images according to an HVP model, each pixel of the saliency map indicating the saliency of the corresponding pixel of the current two-dimensional image;
Generating, using the at least one saliency map, a saliency-based depth map corresponding to the current two-dimensional image, each pixel of the saliency-based depth map indicating a saliency-based depth value of the corresponding pixel of the current two-dimensional image;
Calculating a matching degree between the current two-dimensional image of the plurality of two-dimensional images and each of a plurality of pre-stored three-dimensional representative structures, and determining the three-dimensional representative structure having the highest matching degree as the three-dimensional structure of the current two-dimensional image;
Storing in advance a depth map of each three-dimensional representative structure, and setting the depth map of the three-dimensional representative structure determined as the three-dimensional structure of the current two-dimensional image as a matching-based depth map corresponding to the current two-dimensional image, each pixel of the matching-based depth map indicating a matching-based depth value of the corresponding pixel of the current two-dimensional image; and
Combining the saliency-based depth map and the matching-based depth map to generate an overall depth map, each pixel of the overall depth map indicating an overall depth value of the corresponding pixel in the current two-dimensional image;
Determining the three-dimensional structure of the current two-dimensional image comprises:
Dividing a current two-dimensional image into at least one region corresponding to a plane of the matched three-dimensional representative structure;
Calculating a density of each of the regions according to a characteristic distribution among the regions; Calculating average values corresponding to the characteristics of each of the regions and calculating a similarity between the two regions according to a difference between the average values; Calculating a matching degree according to a sum of the density of each of the regions and a similarity between the two regions; And
Determining a three-dimensional structure having the highest matching degree as a three-dimensional structure of a current two-dimensional image according to the degree of matching;
And a depth map generating step of generating a depth map.

45. The method of claim 44,
Generating the saliency map includes:
Generating one, any two, or all of a characteristic saliency map, a motion saliency map, and an object saliency map, wherein the characteristic saliency map is generated by identifying characteristics of the current two-dimensional image, the motion saliency map is generated by identifying motion between the current two-dimensional image and a temporally adjacent two-dimensional image, and the object saliency map is generated by identifying an object in the current two-dimensional image.
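
The claim does not say how motion between adjacent images is identified. As one simple possibility, and not necessarily the method intended here, the Python sketch below uses absolute frame differencing and scales the result into [0, 1]. The function name and the flat-list grayscale layout are hypothetical.

```python
def motion_saliency(current, adjacent):
    """Motion saliency by frame differencing (an assumed stand-in technique).

    current, adjacent: grayscale frames as flat lists of equal length.
    Returns values in [0, 1]; 1 marks the pixel that changed the most.
    """
    diffs = [abs(c - a) for c, a in zip(current, adjacent)]
    peak = max(diffs, default=0)
    if peak == 0:
        return [0.0] * len(diffs)   # static scene: no motion saliency
    return [d / peak for d in diffs]
```

A practical system would more likely estimate optical flow, but frame differencing is enough to show how a motion cue can be turned into a per-pixel saliency value.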

46. The method of claim 45,
The depth map based on the conspicuousness may be generated by:
When only the object saliency map is generated, assigning a constant value in the range (0, 1) to the pixels of the saliency-based depth map that correspond to pixels identified as an object in the two-dimensional image, and assigning 0 to the other pixels of the saliency-based depth map;
When only one of the characteristic saliency map and the motion saliency map is generated, assigning to each pixel of the saliency-based depth map a value in the range [0, 1] according to the saliency of the corresponding pixel of that saliency map, where 0 indicates that the corresponding pixel has minimum saliency and 1 indicates that the corresponding pixel has maximum saliency;
When two saliency maps that do not include the object saliency map are generated, assigning to each pixel of the saliency-based depth map either the normalized sum of the corresponding pixels of the two saliency maps or the larger of the two corresponding pixel values;
When two saliency maps that include the object saliency map are generated, assigning a constant in the range (0, 1) to the pixels of the saliency-based depth map that correspond to pixels identified as an object in the object saliency map, and assigning the corresponding pixel values of the other saliency map to the other pixels of the saliency-based depth map;
When all of the saliency maps are generated, assigning a constant in the range (0, 1) to the pixels of the saliency-based depth map that correspond to pixels identified as an object in the object saliency map, and assigning to the corresponding pixels of the saliency-based depth map either the normalized sum of the corresponding pixels of the two saliency maps other than the object saliency map or the larger of the two corresponding pixel values.

delete

45. The method of claim 44,

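The density of each region r is calculated according to

[equation not reproduced in this text]

where r is each of the regions, p is a pixel of the region, and I(p) is the characteristic value of the pixel p.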

45. The method of claim 44,

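The similarity between a region ri and a region rj is calculated according to

[equation not reproduced in this text]

where the equation uses the mean value of the characteristic in each region, and |.| is a norm.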

A method according to any one of claims 48 to 49,
Wherein the characteristic is a color, a gradient, or a boundary.

50. The method of claim 49,
Wherein the norm is the 1-norm, the 2-norm, or the ∞-norm.

45. The method of claim 44,
The pixel values of the saliency-based depth map and the matching-based depth map are in the range [0, 1], where 0 indicates that the corresponding pixel has the maximum depth and 1 indicates that the corresponding pixel has the minimum depth.

50. The method of claim 49,
The overall depth map is generated by selecting, for each pixel, either the normalized sum of the corresponding pixel values of the saliency-based depth map and the matching-based depth map or the larger of the two corresponding pixel values.

46. The method of claim 45,
Wherein the object in the current two-dimensional image comprises a person, a face, or a character.

Obtaining a plurality of temporally continuous two-dimensional images of the input video;
Calculating a matching degree between a current two-dimensional image of the plurality of two-dimensional images and each of a plurality of pre-stored three-dimensional representative structures, and determining the three-dimensional representative structure having the highest matching degree as the three-dimensional structure of the current two-dimensional image;
Storing in advance a depth map of each three-dimensional representative structure, and setting the depth map of the three-dimensional representative structure determined as the three-dimensional structure of the current two-dimensional image as a matching-based depth map corresponding to the current two-dimensional image, each pixel of the matching-based depth map indicating a matching-based depth value of the corresponding pixel of the current two-dimensional image; and
Performing smoothing in the spatial domain and the time domain on the matching-based depth map;
Determining the three-dimensional structure of the current two-dimensional image comprises:
Dividing a current two-dimensional image into at least one region corresponding to a plane of the matched three-dimensional representative structure;
Calculating a density of each of the regions according to a characteristic distribution among the regions; Calculating average values corresponding to the characteristics of each of the regions and calculating a similarity between the two regions according to a difference between the average values; Calculating a matching degree according to a sum of the density of each of the regions and a similarity between the two regions; And
Determining a three-dimensional structure having the highest matching degree as a three-dimensional structure of a current two-dimensional image according to the degree of matching;
And a depth map generating step of generating a depth map.

56. The method of claim 55,
Wherein the step of smoothing in spatial and temporal regions with respect to the depth map based on the matching comprises:
Calculating a smoothing amount S(P1, P2) from the similarity, the distance, and the difference in depth values between each pixel P1(x, y, t) of the current two-dimensional image at time t and a pixel P2(x+Δx, y+Δy, t+Δt) of the two-dimensional image at time t+Δt, and determining Δx, Δy, and Δt according to the expected smoothing effect;
Calculating the smoothed depth value D'(P1) = D(P1) - S(P1) of the pixel P1 of the current two-dimensional image according to the smoothing amount S(P1, P2), wherein the smoothing amount S(P1, P2) makes the absolute value of the difference between the smoothed depth value D'(P1) of the pixel P1 and the smoothed depth value D'(P2) = D(P2) + S(P1, P2) of the pixel P2 smaller than the absolute value of the difference between the depth value D(P1) of the pixel P1 and the depth value D(P2) of the pixel P2 before smoothing.
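
One way to picture smoothing over both the spatial and the time domain is a single pass over a small depth volume indexed as `depth[t][y][x]`. The Python sketch below is only an illustration: the neighbor offsets, the similarity threshold of 32, and the strength 0.2 are hypothetical choices, and every amount is computed from the unsmoothed depths and accumulated into a copy.

```python
def smooth_volume(depth, intensity,
                  offsets=((1, 0, 0), (0, 1, 0), (0, 0, 1)),
                  strength=0.2, threshold=32):
    """One smoothing pass over depth[t][y][x]; returns a new volume.

    offsets: (dx, dy, dt) neighbor displacements, i.e. the chosen window.
    strength, threshold: hypothetical weighting for similar-looking pixels.
    """
    frames, height, width = len(depth), len(depth[0]), len(depth[0][0])
    out = [[row[:] for row in frame] for frame in depth]
    for t in range(frames):
        for y in range(height):
            for x in range(width):
                for dx, dy, dt in offsets:
                    x2, y2, t2 = x + dx, y + dy, t + dt
                    if x2 >= width or y2 >= height or t2 >= frames:
                        continue  # neighbor falls outside the volume
                    gap = abs(intensity[t][y][x] - intensity[t2][y2][x2])
                    weight = strength if gap < threshold else 0.0
                    s = (depth[t][y][x] - depth[t2][y2][x2]) * weight
                    out[t][y][x] -= s      # D'(P1) = D(P1) - S
                    out[t2][y2][x2] += s   # D'(P2) = D(P2) + S
    return out
```

Including the `(0, 0, 1)` offset is what extends the smoothing across adjacent frames, which reduces depth flicker between consecutive images.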

57. The method of claim 56,
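The smoothing amount S(P1, P2) is calculated according to (D(P1) - D(P2)) * N(P1, P2) * C(P1, P2), where D(.) is the depth value of a pixel;

here, N(P1, P2) and C(P1, P2) are given by

[equation not reproduced in this text];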

58. The method of claim 57,
Wherein the characteristic is a color or a pattern.

Obtaining a plurality of consecutive two-dimensional images of the input video;
Generating at least one saliency map corresponding to a current two-dimensional image of the plurality of two-dimensional images according to an HVP model, each pixel of the saliency map indicating the saliency of the corresponding pixel of the current two-dimensional image;
Generating, using the at least one saliency map, a saliency-based depth map corresponding to the current two-dimensional image, each pixel of the saliency-based depth map indicating a saliency-based depth value of the corresponding pixel of the current two-dimensional image;
Calculating a matching degree between the current two-dimensional image of the plurality of two-dimensional images and each of a plurality of pre-stored three-dimensional representative structures, and determining the three-dimensional representative structure having the highest matching degree as the three-dimensional structure of the current two-dimensional image;
Storing in advance a depth map of each three-dimensional representative structure, and setting the depth map of the three-dimensional representative structure determined as the three-dimensional structure of the current two-dimensional image as a matching-based depth map corresponding to the current two-dimensional image, each pixel of the matching-based depth map indicating a matching-based depth value of the corresponding pixel of the current two-dimensional image;
Combining the saliency-based depth map and the matching-based depth map to generate an overall depth map, each pixel of the overall depth map indicating an overall depth value of the corresponding pixel in the current two-dimensional image; and
Performing smoothing in the spatial domain and the time domain on the overall depth map;
Determining the three-dimensional structure of the current two-dimensional image comprises:
Dividing a current two-dimensional image into at least one region corresponding to a plane of the matched three-dimensional representative structure;
Calculating a density of each of the regions according to a characteristic distribution among the regions; Calculating average values corresponding to the characteristics of each of the regions and calculating a similarity between the two regions according to a difference between the average values; Calculating a matching degree according to a sum of the density of each of the regions and a similarity between the two regions; And
Determining a three-dimensional structure having the highest matching degree as a three-dimensional structure of a current two-dimensional image according to the degree of matching;
And a depth map generating step of generating a depth map.

60. The method of claim 59,
Generating the saliency map includes:
Generating one, any two, or all of a characteristic saliency map, a motion saliency map, and an object saliency map, wherein the characteristic saliency map is generated by identifying characteristics of the current two-dimensional image, the motion saliency map is generated by identifying motion between the current two-dimensional image and a temporally adjacent two-dimensional image, and the object saliency map is generated by identifying an object in the current two-dimensional image.

60. The method of claim 59,
To create a depth map based on conspicuity:
When only the object saliency map is generated, assigning a constant value in the range (0, 1) to the pixels of the saliency-based depth map that correspond to pixels identified as an object in the two-dimensional image, and assigning 0 to the other pixels of the saliency-based depth map;
When only one of the characteristic saliency map and the motion saliency map is generated, assigning to each pixel of the saliency-based depth map a value in the range [0, 1] according to the saliency of the corresponding pixel of that saliency map, where 0 indicates that the corresponding pixel has minimum saliency and 1 indicates that the corresponding pixel has maximum saliency;
When two saliency maps that do not include the object saliency map are generated, assigning to each pixel of the saliency-based depth map either the normalized sum of the corresponding pixels of the two saliency maps or the larger of the two corresponding pixel values;
When two saliency maps that include the object saliency map are generated, assigning a constant in the range (0, 1) to the pixels of the saliency-based depth map that correspond to pixels identified as an object in the object saliency map, and assigning the corresponding pixel values of the other saliency map to the other pixels of the saliency-based depth map;
When all of the saliency maps are generated, assigning a constant in the range (0, 1) to the pixels of the saliency-based depth map that correspond to pixels identified as an object in the object saliency map, and assigning to the corresponding pixels of the saliency-based depth map either the normalized sum of the corresponding pixels of the two saliency maps other than the object saliency map or the larger of the two corresponding pixel values.

delete

60. The method of claim 59,

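The density of each region r is calculated according to

[equation not reproduced in this text]

where r is each of the regions, p is a pixel of the region, and I(p) is the characteristic value of the pixel p.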

60. The method of claim 59,

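The similarity between a region ri and a region rj is calculated according to

[equation not reproduced in this text]

where the equation uses the mean value of the characteristic in each region, and |.| is a norm.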

65. The method according to any one of claims 63 to 64,
Wherein the feature is a color and a pattern.

65. The method of claim 64,
Wherein the norm is the 1-norm, the 2-norm, or the ∞-norm.

60. The method of claim 59,
The pixel values of the saliency-based depth map and the matching-based depth map are in the range [0, 1], where 0 indicates that the corresponding pixel has the maximum depth and 1 indicates that the corresponding pixel has the minimum depth.

60. The method of claim 59,
The overall depth map is generated by selecting, for each pixel, either the normalized sum of the corresponding pixel values of the saliency-based depth map and the matching-based depth map or the larger of the two corresponding pixel values.

60. The method of claim 59,
Wherein the object in the current two-dimensional image comprises a person, a face, or a character.

60. The method of claim 59,
Wherein the step of smoothing the spatial domain and the temporal domain with respect to the global depth map comprises:
Calculating a smoothing amount S(P1, P2) from the similarity, the distance, and the difference in depth values between each pixel P1(x, y, t) of the current two-dimensional image at time t and a pixel P2(x+Δx, y+Δy, t+Δt) of the two-dimensional image at time t+Δt, and determining Δx, Δy, and Δt according to the expected smoothing effect;
Calculating the smoothed depth value D'(P1) = D(P1) - S(P1) of the pixel P1 of the current two-dimensional image according to the smoothing amount S(P1, P2), wherein the smoothing amount S(P1, P2) makes the absolute value of the difference between the smoothed depth value D'(P1) of the pixel P1 and the smoothed depth value D'(P2) = D(P2) + S(P1, P2) of the pixel P2 smaller than the absolute value of the difference between the depth value D(P1) of the pixel P1 and the depth value D(P2) of the pixel P2 before smoothing.

71. The method of claim 70,
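The smoothing amount S(P1, P2) is calculated according to (D(P1) - D(P2)) * N(P1, P2) * C(P1, P2), where D(.) is the depth value of a pixel;

here, N(P1, P2) and C(P1, P2) are given by

[equation not reproduced in this text];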

72. The method of claim 71,
Wherein the characteristic is a color or a pattern.

A computer-readable recording medium recording a program for performing the method of any one of claims 37, 44, 55, and 59.