KR20080002409A

KR20080002409A - Device and method for transforming 2-d image into 3-d image

Info

Publication number: KR20080002409A
Application number: KR1020060061243A
Authority: KR
Inventors: 손광훈; 김동현
Original assignee: 연세대학교 산학협력단
Priority date: 2006-06-30
Filing date: 2006-06-30
Publication date: 2008-01-04
Also published as: KR100799990B1; WO2008001967A1

Abstract

An apparatus and a method for converting a two-dimensional image into a three-dimensional image are provided. The conversion takes into account whether the camera is moving, which enables adaptive three-dimensional conversion. A feature point extraction unit (100), which includes a color segmentation unit, a labeling unit, a contour extraction unit, and a feature point acquisition unit, segments the two-dimensional image into regions and extracts feature points from the segmented regions. A camera motion recognition unit (102) recognizes camera motion in the two-dimensional image. A motion parallax conversion unit (104) generates a disparity image for the two-dimensional image through motion parallax if the camera motion recognition unit recognizes no camera motion. A scene structure estimation conversion unit (106) generates a disparity image for the two-dimensional image through scene structure estimation if the camera motion recognition unit recognizes camera motion. An image synthesis unit (110) synthesizes a three-dimensional image using the disparity image generated by the motion parallax conversion unit or the scene structure estimation conversion unit.

Description

Device and Method for Transforming a 2-D Image into a 3-D Image

FIG. 1 is a block diagram showing the configuration of an apparatus for converting a 2D image into a 3D image according to a preferred embodiment of the present invention.

FIG. 2 is a block diagram showing the configuration of a feature point extraction unit according to a preferred embodiment of the present invention.

FIG. 3 shows an image on which color segmentation has been completed according to an embodiment of the present invention.

FIG. 4 shows an example of labeling a conversion target image according to a preferred embodiment of the present invention.

FIG. 5 shows the image of FIG. 3 after labeling according to an embodiment of the present invention.

FIG. 6 shows an example of finding a boundary for contour extraction according to an embodiment of the present invention.

FIG. 7 shows the labeled image of FIG. 5 after contour extraction.

FIG. 8 shows the image of FIG. 7 with feature points extracted.

FIG. 9 is a flowchart of the process by which the camera motion recognition unit recognizes whether camera panning is present according to a preferred embodiment of the present invention.

FIG. 10 is a flowchart of the process of determining camera zoom-in and zoom-out according to a preferred embodiment of the present invention.

FIG. 11 is a flowchart of the process of generating a disparity image using motion parallax according to a preferred embodiment of the present invention.

FIG. 12 is a flowchart of the process of generating a disparity image using scene structure estimation according to a preferred embodiment of the present invention.

FIG. 13 shows a camera model for scene structure estimation according to an embodiment of the present invention.

The present invention relates to an apparatus and method for converting a two-dimensional image into a three-dimensional image, and more particularly to an apparatus and method that perform the conversion using a disparity image of the two-dimensional image.

In the real world, the eyes automatically adjust their focus to the object being viewed, but images displayed on a computer monitor carry no depth cues, so the focus always remains constant. A three-dimensional display therefore artificially presents different images to the left and right eyes, but because the distance from the eyes to the object is always constant, accommodation and vergence become abnormal, which causes eye fatigue.

2D-to-3D conversion techniques detect monocular cues in the acquired image to obtain depth information and then convert them into binocular cues. With a typical 3D display located at a fixed depth, vergence cannot be used among the binocular cues, so conversion based on binocular disparity is the common approach.

Various algorithms based on binocular disparity have been proposed, such as conversion using a time delay and conversion using depth calculation.

These conventional algorithms perform the 3D conversion uniformly, without considering whether the camera is moving and, in particular, without considering operations such as zoom-in and zoom-out, so an appropriate 3D conversion is not achieved. In particular, when camera motion occurs, conventional algorithms fail to produce an accurate 3D image.

To solve the problems of the prior art described above, the present invention proposes an apparatus and method for converting a 2D image into a 3D image in consideration of whether the camera is moving.

Another object of the present invention is to propose an apparatus and method for converting a 2D image into a 3D image using scene structure estimation when there is camera motion.

To achieve the above objects, one aspect of the present invention provides an apparatus for converting a 2D image into a 3D image, comprising: a feature point extraction unit that segments a 2D image into regions and extracts feature points of the segmented regions; a camera motion recognition unit that recognizes camera motion in the 2D image; a motion parallax conversion unit that generates a disparity image for the 2D image through motion parallax when the camera motion recognition unit recognizes no camera motion; a scene structure estimation conversion unit that generates a disparity image for the 2D image through scene structure estimation when the camera motion recognition unit recognizes camera motion; and an image synthesis unit that synthesizes a 3D image using the disparity image generated by the motion parallax conversion unit or the scene structure estimation conversion unit.

The feature point extraction unit may include: a color segmentation unit that segments the 2D image into regions using color information; a labeling unit that assigns label numbers to connected pixels in the image segmented by the color segmentation unit; a contour extraction unit that extracts contours from the labeled image; and a feature point acquisition unit that selects feature points for each segmented region using the contour extraction information.

The color segmentation unit performs color segmentation using the mean shift algorithm.

The labeling unit performs the steps of: searching the color-segmented 2D image for a pixel to which no label number has been assigned; assigning a new label number to the found pixel; and assigning the same label number to the pixels connected to the found pixel.

The feature point acquisition unit selects feature points for each segmented region either randomly or using frequency information.

The number of feature points that the feature point acquisition unit selects for each region is based on the size of the region.

The camera motion recognition unit searches for motion vectors in the boundary regions of the image and determines that a camera panning operation is taking place when the motion vectors found have a consistent direction in at least three boundary regions.

The camera motion recognition unit searches for motion vectors in the corner portions of the image boundary region and determines that a camera zoom-in or zoom-out operation is taking place when the motion vectors found are directed inward or outward.

When searching for motion vectors in a boundary region, the camera motion recognition unit computes the motion magnitude at each pixel of the boundary region, sets the largest motion magnitude as the representative motion value of the boundary region, and determines that there is motion when the representative value is greater than or equal to a preset threshold.

The motion parallax 3D conversion unit tracks the motion of the feature points using a KLT (Kanade-Lucas-Tomasi) feature tracker.

The motion parallax conversion unit estimates motion by computing the inter-frame difference in feature point positions from the tracking results of the KLT feature tracker.

The motion parallax conversion unit converts the motion estimation information into disparity to generate the disparity image.

The scene structure estimation conversion unit sets up a camera model that uses camera intrinsic parameters, including the focal length of the camera, and computes the depth information of the 2D image by computing a state vector with an extended Kalman filter.

Another aspect of the present invention provides a method for converting a 2D image into a 3D image, comprising: (a) segmenting a 2D image into regions and extracting feature points of the segmented regions; (b) recognizing camera motion in the 2D image; (c) generating a disparity image for the 2D image through motion parallax when no camera motion is recognized in step (b); (d) generating a disparity image for the 2D image through scene structure estimation when camera motion is recognized in step (b); and synthesizing a 3D image using the disparity image generated in step (c) or step (d).

Hereinafter, preferred embodiments of the method and apparatus for converting a 2D image into a 3D image using motion parallax and scene structure estimation according to the present invention are described in detail with reference to the accompanying drawings.

FIG. 1 is a block diagram showing the configuration of an apparatus for converting a 2D image into a 3D image according to a preferred embodiment of the present invention.

Referring to FIG. 1, the 2D-to-3D conversion apparatus according to an embodiment of the present invention may include a feature point extraction unit (100), a camera motion recognition unit (102), a motion parallax 3D conversion unit (104), a scene structure estimation 3D conversion unit (106), a hole filling unit (108), and an image synthesis unit (110).

The feature point extraction unit (100) extracts feature points from the 2D image to be converted (hereinafter, the 'conversion target image'). The present invention proposes converting a 2D image into a 3D image using motion parallax or scene structure estimation, and feature point extraction is required as preprocessing for both. The feature point extraction unit (100) performs this function.

The feature point extraction unit (100) segments the conversion target image and extracts feature points per segmented region. According to a preferred embodiment of the present invention, the conversion target image is divided into multiple regions using color segmentation.

The feature point extraction unit (100) segments the conversion target image into regions by color and labels the segmented regions. Through labeling, the segmented regions of the conversion target image are sorted in order of size.

After labeling the segmented regions, the feature point extraction unit (100) obtains the contour of each region and extracts feature points for each region. According to one embodiment of the present invention, feature points may be extracted randomly for each region. According to another embodiment, feature points may be selected from among the points with high-frequency components in each region. One or more feature points are selected for each region, and the number may be determined according to the size of the region. The feature point extraction algorithm is described in more detail below with reference to separate drawings.

The camera motion recognition unit (102) distinguishes between a stationary and a moving camera by analyzing motion in the conversion target image. That is, it determines whether there is camera motion such as panning, zoom-in, or zoom-out. The method of recognizing camera motion is described in more detail below with reference to separate drawings.

When there is camera motion, the camera motion recognition unit (102) causes the image conversion to be performed by the scene structure estimation 3D conversion unit (106). When there is no camera motion, it causes the image conversion to be performed by the motion parallax 3D conversion unit (104).

The motion parallax 3D conversion unit (104) estimates the motion of the feature points in the conversion target image and assigns disparity to the estimated motion to generate a disparity image of the conversion target image. It estimates motion based on the regions segmented by the feature point extraction unit (100).

During motion estimation, the motion parallax 3D conversion unit (104) determines the magnitudes of the horizontal and vertical motion in the conversion target image. According to one embodiment of the present invention, because a vertical disparity component cannot produce a sense of depth, the horizontal and vertical motion magnitudes are normalized and converted into a horizontal disparity.
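The text does not state how the horizontal and vertical magnitudes are normalized. As a rough sketch only, the following assumes that each feature point's motion vector is reduced to its Euclidean magnitude and scaled so that the fastest point receives a chosen maximum disparity. The function name `motion_to_disparity` and the parameter `max_disparity` are hypothetical and do not come from the patent.

```python
import math

def motion_to_disparity(motions, max_disparity=16.0):
    """Map (dx, dy) motion vectors to non-negative horizontal disparities.

    Assumption: the combined magnitude sqrt(dx^2 + dy^2) is scaled linearly
    so that the largest motion maps to max_disparity.
    """
    magnitudes = [math.hypot(dx, dy) for dx, dy in motions]
    peak = max(magnitudes, default=0.0)
    if peak == 0.0:
        # No motion anywhere: every point gets zero disparity.
        return [0.0 for _ in magnitudes]
    return [m / peak * max_disparity for m in magnitudes]
```

With this choice a purely vertical motion still contributes to depth, because its magnitude is folded into the horizontal disparity rather than discarded.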

The scene structure estimation 3D conversion unit (106) obtains the scene structure from the conversion target image, derives the depth information of the image, and generates a disparity image. According to a preferred embodiment of the present invention, it obtains the depth of the image by estimating the scene structure and the camera motion with an extended Kalman filter. The method of obtaining image depth using scene structure estimation is described in more detail below with reference to separate drawings.

The hole filling unit (108) fills holes that occur in the disparity images generated by the motion parallax 3D conversion unit (104) and the scene structure estimation 3D conversion unit (106).

When a disparity image is generated using motion parallax or scene structure estimation, depth information may not be obtained for every region segmented by the feature point extraction unit (100), in which case holes can occur. Holes arise when motion estimation fails for the feature points in a segmented region, or because motion other than that of stationary structures is not used in scene structure estimation.

When a hole occurs, the hole filling unit (108) compares the colors of the hole region and its surrounding regions and fills the hole through color interpolation.
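The interpolation rule is not spelled out, so the following is only one plausible reading: a hole region receives a weighted average of its neighbors' disparities, with neighbors of similar color weighted more heavily. The function `fill_holes`, the per-region mean colors, the adjacency map, and the `1 / (1 + distance)` weighting are all assumptions made for illustration.

```python
def fill_holes(colors, disparities, neighbors):
    """Fill regions whose disparity is None from color-similar neighbors.

    colors:      {region: (r, g, b)} mean color per region (assumed input)
    disparities: {region: float or None}, None marks a hole
    neighbors:   {region: iterable of adjacent regions} (assumed input)
    """
    filled = dict(disparities)
    for region, disparity in disparities.items():
        if disparity is not None:
            continue
        weight_sum, weighted_total = 0.0, 0.0
        for other in neighbors.get(region, ()):
            if disparities.get(other) is None:
                continue  # a neighboring hole carries no depth information
            distance = sum((a - b) ** 2
                           for a, b in zip(colors[region], colors[other])) ** 0.5
            weight = 1.0 / (1.0 + distance)  # closer color, larger weight
            weight_sum += weight
            weighted_total += weight * disparities[other]
        if weight_sum > 0.0:
            filled[region] = weighted_total / weight_sum
    return filled
```

A hole with no valid neighbors is left as `None` in this sketch; a fuller implementation might iterate until every region is filled.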

The image synthesis unit (110) generates a stereo image by synthesizing the disparity images that are generated by the motion parallax 3D conversion unit (104) and the scene structure estimation 3D conversion unit (106) and interpolated by the hole filling unit (108).

FIG. 2 is a block diagram showing the configuration of a feature point extraction unit according to a preferred embodiment of the present invention.

Referring to FIG. 2, in the feature point extraction unit according to an embodiment of the present invention, the color segmentation unit (200) segments the conversion target image into regions using the color information of the conversion target image.

According to a preferred embodiment of the present invention, the color segmentation unit (200) performs color segmentation using the mean shift algorithm. The mean shift algorithm can measure changes in density and is relatively simple because it needs very few parameters.

The mean shift algorithm is generally used to find the center of a high-density region in a given feature space. To minimize the parameters of the search region, a spherical window defined by a single diameter is used. This contrasts with existing algorithms, which need the size and shape of a kernel or the number of neighboring pixels to include in the search. With the mean shift algorithm, such additional parameter inputs are minimized.

The principle of the mean shift algorithm is as follows. First, assume a probability density function $p(\mathbf{x})$ of the feature vector $\mathbf{x}$. When the sphere $S_{\mathbf{x}}$ with radius $r$ and center $\mathbf{x}$ contains a feature vector $\mathbf{y}$, this can be expressed as $\|\mathbf{y} - \mathbf{x}\| \le r$. Given $\mathbf{x}$ and $S_{\mathbf{x}}$, the expected value of $\mathbf{z} = \mathbf{y} - \mathbf{x}$ is given by Equation 1.

$$\boldsymbol{\mu} = E[\mathbf{z} \mid S_{\mathbf{x}}] = \int_{S_{\mathbf{x}}} (\mathbf{y} - \mathbf{x})\, \frac{p(\mathbf{y})}{p(\mathbf{y} \in S_{\mathbf{x}})}\, d\mathbf{y} \tag{1}$$

If $S_{\mathbf{x}}$ is sufficiently small, the denominator of Equation 1 can be approximated as in Equation 2.

$$p(\mathbf{y} \in S_{\mathbf{x}}) \approx p(\mathbf{x})\, V_{S_{\mathbf{x}}} \tag{2}$$

Here $V_{S_{\mathbf{x}}}$ is the volume of the sphere, and the approximation of $p(\mathbf{y})$ is given by Equation 3.

$$p(\mathbf{y}) \approx p(\mathbf{x}) + (\mathbf{y} - \mathbf{x})^{T}\, \nabla p(\mathbf{x}) \tag{3}$$

Since $\nabla p(\mathbf{x})$ is the gradient of the probability density function at $\mathbf{x}$, substituting Equations 2 and 3 into Equation 1 gives Equation 4. The term that is constant in $p(\mathbf{x})$ vanishes by the symmetry of the sphere.

$$\boldsymbol{\mu} \approx \int_{S_{\mathbf{x}}} \frac{(\mathbf{y} - \mathbf{x})(\mathbf{y} - \mathbf{x})^{T}}{V_{S_{\mathbf{x}}}}\, \frac{\nabla p(\mathbf{x})}{p(\mathbf{x})}\, d\mathbf{y} \tag{4}$$

Carrying out the integration in Equation 4 yields Equation 5, where $n$ is the dimension of the feature space.

$$\boldsymbol{\mu} \approx \frac{r^{2}}{n + 2}\, \frac{\nabla p(\mathbf{x})}{p(\mathbf{x})} \tag{5}$$

That is, the mean shift vector, which is the difference between the local mean and the center of the search region, is proportional to the gradient of the probability density at $\mathbf{x}$. In a high-density region, $p(\mathbf{x})$ is large and $\nabla p(\mathbf{x})$ is small, so the mean shift is small and the point of convergence can be found. To apply the mean shift algorithm, the radius $r$ and position of the search window are chosen first. The mean shift vector is then computed and the search region is moved by that amount, and this is repeated until the vector converges.

FIG. 3 shows an image on which color segmentation has been completed according to an embodiment of the present invention. In FIG. 3, (a) shows the conversion target image before color segmentation, and (b) shows it after color segmentation.
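The iterative procedure described above (choose a window radius and position, compute the mean shift vector, move the window, repeat until convergence) can be sketched as follows. This is a minimal flat-kernel illustration in pure Python, not the patent's implementation. The function name `mean_shift_mode` and the parameters `tol` and `max_iter` are assumptions.

```python
import math

def mean_shift_mode(points, start, radius, tol=1e-3, max_iter=100):
    """Move a spherical window until it settles on a local density mode.

    points: feature vectors as tuples (for example, pixel colors)
    start:  initial window center
    radius: window radius r
    """
    center = tuple(float(c) for c in start)
    for _ in range(max_iter):
        # Feature vectors that fall inside the current window.
        inside = [p for p in points if math.dist(p, center) <= radius]
        if not inside:
            break
        # Local mean of the window contents.
        mean = tuple(sum(axis) / len(inside) for axis in zip(*inside))
        # Length of the mean shift vector (local mean minus window center).
        shift = math.dist(mean, center)
        center = mean
        if shift < tol:
            break
    return center
```

For color segmentation, one would typically run this from every pixel's color and group pixels whose windows converge to the same mode. That grouping step is omitted here.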

Once the color segmentation unit (200) has performed color segmentation, the labeling unit (202) performs labeling, assigning the same number to all pixels that are connected after color segmentation.

FIG. 4 shows an example of labeling a conversion target image according to a preferred embodiment of the present invention. Labeling may be performed through the following process.

i) When a pixel P (see FIG. 4) that has not yet been labeled is found by scanning the image, a new label number is assigned to that pixel.

ii) As shown in FIG. 4, the same label number is assigned to the pixels connected to pixel P.

iii) Steps i) and ii) are repeated so that the same label number is assigned to all pixels connected to the labeled pixels.

iv) Steps ii) and iii) are repeated, and labeling continues until no unlabeled pixels remain.

v) The process returns to step i) to search for pixels that have not yet been labeled, and steps ii) through iv) are repeated.
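The labeling steps above amount to connected-component labeling. The sketch below is one possible reading, using a breadth-first flood fill. It assumes 4-connectivity and that the input is a grid of segment identifiers produced by color segmentation. The connectivity actually shown in FIG. 4 is not recoverable from the text, and the function name `label_regions` is hypothetical.

```python
from collections import deque

def label_regions(image):
    """Give each 4-connected group of equal-valued pixels its own label.

    image: list of rows of segment ids. Returns (labels, label_count),
    where labels has the same shape as image and labels start at 1.
    """
    height, width = len(image), len(image[0])
    labels = [[0] * width for _ in range(height)]  # 0 means "not yet labeled"
    next_label = 0
    for y in range(height):
        for x in range(width):
            if labels[y][x]:
                continue
            # Step i): an unlabeled pixel gets a new label number.
            next_label += 1
            labels[y][x] = next_label
            queue = deque([(y, x)])
            # Steps ii) to iv): spread the label to all connected pixels.
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and not labels[ny][nx]
                            and image[ny][nx] == image[cy][cx]):
                        labels[ny][nx] = next_label
                        queue.append((ny, nx))
    return labels, next_label
```

The outer scan corresponds to step v): after one region is exhausted, the search resumes for the next unlabeled pixel.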

FIG. 5 shows the image of FIG. 3 after labeling according to an embodiment of the present invention.

The contour extraction unit (204) obtains contours from the labeled image so that feature points can be selected. It counts the number of pixels in each labeled region, sorts the labeled objects by pixel count, and then extracts the contours.

FIG. 6 shows an example of finding a boundary for contour extraction according to an embodiment of the present invention. The contour extraction process is described below with reference to FIG. 6.

i) As shown in FIG. 6, the image is scanned to find a boundary point a₀ that has not been given the tracing-complete mark (255 in FIG. 6).

ii) If all the surroundings of a₀ are black (0), a₀ is an isolated point and tracing ends.

iii) Otherwise, the next boundary point is found and the boundary is traced in the manner described above.

iv) Tracing ends when the next boundary point is a₀.
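As a simplified stand-in for the point-by-point boundary tracing in steps i) to iv), the sketch below sorts labeled regions by pixel count and then collects, for one label, every pixel that touches a different label or the image border. This yields the same set of contour pixels for simple regions but not their traversal order, so it is a substitution rather than the tracing procedure itself. The function names are hypothetical.

```python
from collections import Counter

def regions_by_size(labels):
    """Return label numbers sorted from the largest region to the smallest."""
    counts = Counter(value for row in labels for value in row)
    return [label for label, _ in counts.most_common()]

def contour_pixels(labels, target):
    """Collect pixels of `target` that have a 4-neighbor outside the region."""
    height, width = len(labels), len(labels[0])
    contour = []
    for y in range(height):
        for x in range(width):
            if labels[y][x] != target:
                continue
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                outside = not (0 <= ny < height and 0 <= nx < width)
                if outside or labels[ny][nx] != target:
                    contour.append((y, x))
                    break  # one differing neighbor is enough
    return contour
```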

FIG. 7 shows the labeled image of FIG. 5 after contour extraction.

The feature point acquisition unit (206) selects feature points from the image from which contours have been extracted. As described above, feature points may be extracted by selecting them randomly for each segmented region or by selecting high-frequency components. FIG. 8 shows the image of FIG. 7 with feature points extracted.
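The random variant of feature point selection, with a point count that grows with region size, might look like the sketch below. The rule of one point per `pixels_per_point` pixels (with a minimum of one) is an assumption. The text says only that the count is based on region size. The function name `select_feature_points` is hypothetical.

```python
import random

def select_feature_points(labels, pixels_per_point=50, seed=None):
    """Randomly pick feature points in each labeled region.

    Returns {label: [(y, x), ...]}. Each region gets at least one point and
    roughly one point per `pixels_per_point` pixels (assumed rule).
    """
    rng = random.Random(seed)
    regions = {}
    for y, row in enumerate(labels):
        for x, label in enumerate(row):
            regions.setdefault(label, []).append((y, x))
    selected = {}
    for label, pixels in regions.items():
        count = max(1, len(pixels) // pixels_per_point)
        selected[label] = rng.sample(pixels, count)  # sampling without repeats
    return selected
```

The frequency-based variant mentioned in the text would replace `rng.sample` with a ranking of pixels by local high-frequency content. That is not shown here.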

Regarding camera motion recognition: if a 2D-to-3D conversion technique applies the same method to every image without considering camera motion, it is difficult to separate objects from the background in images with camera motion, such as those from a panning camera. This is because the motion of the object is small relative to the motion of the background. If the method for a fixed camera is applied unchanged in such cases, no stereoscopic effect is perceived in the object regions. A different 3D conversion technique must therefore be applied to images with camera motion, which requires a process for recognizing camera motion. The present invention proposes an algorithm that can recognize camera panning and zooming.

Motion vectors are used to recognize camera states such as stationary, panning, and zooming from the image. Motion analysis methods include a simple method based on the difference between two images and the optical flow method, which uses changes in motion over a period of time. Optical flow methods include estimating the parameters of a motion model and, without using a motion model, analyzing the optical flow pattern from the distribution of magnitudes or angles of the optical flow vectors. However, these methods are computationally expensive and require several thresholds. Another approach exploits the fact that much video is compressed with the MPEG standard: the motion vectors associated with P frames and B frames in the MPEG bitstream are used to recognize camera motion. However, this approach generally limits the size of moving objects.

FIG. 9 is a flowchart illustrating a process by which the camera motion recognition unit recognizes whether camera panning is present, according to a preferred embodiment of the present invention.

Referring to FIG. 9, the maximum motion vector in the boundary region of the image to be converted is searched for (step 900). To recognize camera motion, the present invention first obtains the magnitude of motion for each pixel in the boundary portion of the image. The most frequently occurring motion magnitude is then designated as the representative motion value of the boundary region. If the representative value is greater than or equal to a threshold, it is determined that camera motion has occurred.

Because this method obtains motion vector information only for the boundary portion of the image, it requires less computation than conventional methods.

Next, it is determined whether the searched motion vectors have a consistent direction (step 902). According to a preferred embodiment of the present invention, it is determined whether the motion vectors of the three boundary regions other than the lower boundary region of the image have a consistent direction.

If the motion vectors have a consistent direction, it is determined that camera panning is present (step 904). If they do not, it is determined that the camera is fixed rather than panning (step 906).
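The decision flow of FIG. 9 can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names, the threshold values, the rounding used to bin magnitudes, and the cosine test used to decide whether directions are "consistent" are all hypothetical choices made for the example.

```python
import math
from collections import Counter

def representative_magnitude(vectors):
    """Most frequent motion magnitude among the vectors.

    Assumption: magnitudes are rounded to the nearest integer to form bins.
    """
    bins = Counter(round(math.hypot(dx, dy)) for dx, dy in vectors)
    return bins.most_common(1)[0][0]

def mean_direction(vectors):
    """Unit vector of the summed motion, or None if the sum is zero."""
    sx = sum(dx for dx, _ in vectors)
    sy = sum(dy for _, dy in vectors)
    norm = math.hypot(sx, sy)
    return None if norm == 0 else (sx / norm, sy / norm)

def detect_panning(top, left, right, mag_threshold=2, cos_threshold=0.9):
    """Return 'panning' or 'fixed' from the motion vectors of three borders.

    top, left, right: lists of (dx, dy) vectors for the border pixels.
    The lower border is excluded, as in the described embodiment.
    """
    # Step 900: representative motion value of the boundary region.
    if representative_magnitude(top + left + right) < mag_threshold:
        return "fixed"
    # Step 902: check whether the three borders share one direction.
    dirs = [mean_direction(v) for v in (top, left, right)]
    if any(d is None for d in dirs):
        return "fixed"
    ref = dirs[0]
    consistent = all(ref[0] * d[0] + ref[1] * d[1] >= cos_threshold for d in dirs)
    # Steps 904 and 906.
    return "panning" if consistent else "fixed"
```

Restricting the computation to border pixels keeps the input small, which is the source of the reduced computation noted above.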

FIG. 10 is a flowchart illustrating a process of determining camera zoom-in and camera zoom-out according to a preferred embodiment of the present invention.

Referring to FIG. 10, motion vectors are searched for in the corner regions of the boundary region of the image (step 1000). Whereas panning detection searches the motion vectors of all boundary regions, for determining camera zoom-in and zoom-out it is preferable to search the motion vectors in the corner regions of the boundary region.

It is determined whether directionality is detected (that is, whether the motion vector is greater than or equal to a preset threshold) at three or more of the four corners (step 1002).

If directionality is not detected at three or more corners, it is determined that the camera is fixed and that neither zoom-in nor zoom-out has occurred (step 1004).

If directionality is detected at three or more corners, it is determined whether the direction points toward the inside of the image or toward the outside of the image (step 1006).

If the direction points toward the inside of the image, it is determined that camera zoom-out has occurred (step 1008).

If the direction points toward the outside of the image, it is determined that camera zoom-in has occurred (step 1010).
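The corner test of FIG. 10 can be illustrated with a short sketch. The names, the threshold, the use of a dot product with the center-to-corner direction to decide "inward" versus "outward", and the handling of mixed directions are assumptions made for this example only.

```python
import math

def classify_zoom(corner_vectors, width, height, mag_threshold=1.0):
    """Return 'zoom_in', 'zoom_out', or 'fixed' from four corner motion vectors.

    corner_vectors: dict mapping a corner position (x, y) to its motion (dx, dy).
    """
    cx, cy = width / 2.0, height / 2.0
    inward = outward = 0
    for (x, y), (dx, dy) in corner_vectors.items():
        if math.hypot(dx, dy) < mag_threshold:
            continue  # no directionality detected at this corner
        # Assumption: a positive projection onto the center-to-corner
        # direction counts as outward motion, a negative one as inward.
        radial = dx * (x - cx) + dy * (y - cy)
        if radial > 0:
            outward += 1
        elif radial < 0:
            inward += 1
    # Steps 1002 and 1004: fewer than three directional corners means fixed.
    if inward + outward < 3:
        return "fixed"
    # Steps 1006 to 1010: outward motion is zoom-in, inward motion is zoom-out.
    if outward >= 3:
        return "zoom_in"
    if inward >= 3:
        return "zoom_out"
    return "fixed"  # assumption: mixed directions are treated as no zoom
```

For a 100 x 100 image, four corner vectors pointing away from the center are classified as zoom-in, and the same vectors negated as zoom-out.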

FIG. 11 is a flowchart illustrating a process of generating a disparity image using motion parallax according to a preferred embodiment of the present invention.

Referring to FIG. 11, a feature point tracking procedure is performed first (step 1100). Feature-point-based tracking is the process of selecting a region to be tracked in an image and finding the corresponding point in each frame of the video.

According to a preferred embodiment of the present invention, feature points are tracked using a Kanade-Lucas-Tomasi (KLT) feature point tracker.

Motion information is thereby obtained, and the obtained motion vectors are converted into disparity vectors to generate a three-dimensional image.

The principle of feature point tracking using KLT is as follows. Let I and J be two images from adjacent frames. Given the feature point coordinates x, the motion vector d, and w(x) = 1, the motion measurement error is defined as in Equation 6 below.

By a first-order Taylor series expansion, and under the accompanying assumptions, the error is defined as in Equation 7 below.

To find the motion vector that minimizes the motion estimation error, differentiation is performed as in Equation 8 below.

Rearranging Equation 8 yields Equation 9 below, from which the d that minimizes the motion estimation error is found.
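The formulas for Equations 6 to 9 are not reproduced in this text. For orientation only, the standard Kanade-Lucas-Tomasi derivation, which follows the same four steps (error definition, first-order Taylor expansion, differentiation, and solution for d), is commonly written as shown here. The symbols for the error, the tracking window W, the image gradient g, and the quantities G and e are the conventional ones and are assumptions, not necessarily the notation of the original equations.

```latex
% Error over a window W, with weighting w(x) = 1 in the described embodiment
\epsilon = \iint_{W} \left[ J(\mathbf{x} + \mathbf{d}) - I(\mathbf{x}) \right]^{2} w(\mathbf{x}) \, d\mathbf{x}

% First-order Taylor expansion, with g the spatial gradient of J
J(\mathbf{x} + \mathbf{d}) \approx J(\mathbf{x}) + \mathbf{g}^{T} \mathbf{d},
\qquad \mathbf{g} = \nabla J(\mathbf{x})

% Setting the derivative with respect to d to zero
\frac{\partial \epsilon}{\partial \mathbf{d}}
  = 2 \iint_{W} \left[ J(\mathbf{x}) - I(\mathbf{x}) + \mathbf{g}^{T} \mathbf{d} \right]
    \mathbf{g} \, w(\mathbf{x}) \, d\mathbf{x} = 0

% Rearranged into a 2 x 2 linear system for d
G \mathbf{d} = \mathbf{e},
\qquad G = \iint_{W} \mathbf{g} \mathbf{g}^{T} w(\mathbf{x}) \, d\mathbf{x},
\qquad \mathbf{e} = \iint_{W} \left[ I(\mathbf{x}) - J(\mathbf{x}) \right] \mathbf{g} \, w(\mathbf{x}) \, d\mathbf{x}
```

In this form, d is recovered by solving the 2 x 2 system, which is well conditioned only where the window contains gradients in more than one direction.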

Once the motion vector information at the feature points has been obtained as described above, motion is estimated by calculating the positional difference of the feature points between frames (step 1102). During motion estimation, the vertical motion magnitude and the horizontal motion magnitude are obtained.

According to a preferred embodiment of the present invention, feature point tracking is performed over 5 to 20 frames, depending on the characteristics of the video, and bidirectional tracking is used to obtain accurate results. That is, a feature point selected in the first frame is tracked to obtain the corresponding feature point in the last frame, which is then tracked back through the frames in reverse order. The tracking result is used only if the point returned by the reverse tracking lies within an error range of the originally selected feature point.
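The bidirectional check described above can be sketched as follows. The two tracking callables are hypothetical stand-ins for a KLT tracker run forward and in reverse frame order, and `max_error` is an assumed tolerance; neither is specified in the text.

```python
import math

def bidirectional_filter(points, track_forward, track_backward, max_error=1.0):
    """Keep feature points whose forward-then-backward track returns near the start.

    points: list of (x, y) positions in the first frame.
    track_forward: hypothetical callable, first-frame point -> last-frame point.
    track_backward: hypothetical callable, last-frame point -> first-frame point.
    Returns a list of (start, end) pairs for the points that pass the check.
    """
    accepted = []
    for start in points:
        end = track_forward(start)    # first frame -> last frame
        back = track_backward(end)    # last frame -> first frame, reverse order
        error = math.hypot(back[0] - start[0], back[1] - start[1])
        if error <= max_error:
            accepted.append((start, end))
    return accepted
```

Points that drift during tracking fail the round trip and are discarded, so only consistent tracks contribute to the motion estimate.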

When motion estimation is complete, the motion estimation information is converted into disparity (step 1104). As described above, according to a preferred embodiment of the present invention, a sense of depth cannot be obtained from the vertical component of disparity, so the vertical and horizontal motion magnitudes are normalized and converted into horizontal disparity.

Once the motion information has been converted into disparity information, a disparity image is generated using the disparity (step 1106).
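The text does not give the normalization formula used to turn motion into horizontal disparity. One plausible reading, offered only as an assumption, is to combine both motion components into a single magnitude, normalize by the largest magnitude, and scale to a maximum disparity. The function name and `max_disparity` are hypothetical.

```python
import math

def motion_to_disparity(motions, max_disparity=16.0):
    """Map (dx, dy) motion vectors to non-negative horizontal disparities.

    Assumption: the combined magnitude of both components is normalized by
    the largest magnitude in the set and scaled to max_disparity.
    """
    magnitudes = [math.hypot(dx, dy) for dx, dy in motions]
    peak = max(magnitudes, default=0.0)
    if peak == 0.0:
        return [0.0] * len(motions)  # no motion, so no disparity
    return [max_disparity * m / peak for m in magnitudes]
```

Under this reading, feature points that move more between frames receive larger disparity, which is the motion parallax cue: nearer objects appear to move faster.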

FIG. 12 is a flowchart illustrating a process of generating a disparity image using scene structure estimation according to a preferred embodiment of the present invention.

Referring to FIG. 12, a camera model for scene structure estimation is set (step 1200).

FIG. 13 illustrates a camera model for scene structure estimation according to an embodiment of the present invention. When the position in the camera coordinate system is (Xc, Yc, Zc) and the coordinates of a feature point are (u, v), the relationship of Equation 10 below holds. The camera model according to an embodiment of the present invention uses the camera focal length as a parameter, expressed as the inverse of the focal length.

The scene structure model in the three-dimensional coordinate system (X, Y, Z) can be expressed as in Equation 11 below, which includes a scale parameter.

For scene structure estimation with F frames and N feature points, 6(F-1) motion parameters and 3N structure parameters must be determined. The feature points provide 2NF pieces of information plus one piece of scale information, so the condition 2NF + 1 > 6(F-1) + 3N must be satisfied to solve the problem. Accordingly, the motion and scene structure can be solved for only when F ≥ 2 and N ≥ 6. The extended Kalman filter, by contrast, obtains the solution recursively, so F is not considered and the condition is set for a single frame. The feature points provide 2N pieces of information plus one piece of scale information, while six extrinsic camera parameters, one focal length, and N structure parameters must be determined, so a solution can be obtained when N > 7.
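The counting argument above can be checked numerically. The sketch encodes the two inequalities exactly as written in the text; the function names are hypothetical.

```python
def batch_solvable(num_frames, num_points):
    """Batch condition from the text: 2NF + 1 > 6(F - 1) + 3N."""
    f, n = num_frames, num_points
    return 2 * n * f + 1 > 6 * (f - 1) + 3 * n

def recursive_solvable(num_points):
    """Single-frame condition for the recursive filter: 2N + 1 > 6 + 1 + N."""
    n = num_points
    return 2 * n + 1 > 6 + 1 + n

def min_points(predicate, limit=100):
    """Smallest N in 1..limit that satisfies the predicate, or None."""
    return next((n for n in range(1, limit + 1) if predicate(n)), None)

# Smallest N for two frames in the batch formulation.
batch_min = min_points(lambda n: batch_solvable(2, n))
# Smallest N for the single-frame recursive formulation.
recursive_min = min_points(recursive_solvable)
```

With two frames the batch inequality is first satisfied at N = 6, in agreement with the stated condition. The single-frame inequality as written is first satisfied at N = 7, so the stated requirement of N > 7 is slightly more conservative than the count alone demands.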

Once the camera model has been set, the relationship between the camera coordinate system and the three-dimensional coordinate system is established (step 1202).

The relationship between the camera coordinate system and the three-dimensional coordinate system may be set as in Equation 12 below.

The relationship between the three-dimensional coordinate system and the camera coordinate system is defined as in Equation 12 (1) and consists of a translation and a rotation. The rotation is defined as in Equation 12 (2) and is expressed using a quaternion as in Equation 12 (3).
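Equation 12 is not reproduced in this text. As an illustration of the kind of relationship described, the sketch below builds a rotation matrix from a unit quaternion using the standard formula and applies a rotation followed by a translation. The convention Xc = R X + t, the scalar-first quaternion ordering, and the function names are assumptions.

```python
import math

def quaternion_to_rotation(q0, q1, q2, q3):
    """3x3 rotation matrix (nested lists) from a quaternion, q0 being the scalar part.

    The quaternion is normalized first so that the result is a proper rotation.
    """
    norm = math.sqrt(q0 * q0 + q1 * q1 + q2 * q2 + q3 * q3)
    q0, q1, q2, q3 = (c / norm for c in (q0, q1, q2, q3))
    return [
        [q0*q0 + q1*q1 - q2*q2 - q3*q3, 2 * (q1*q2 - q0*q3), 2 * (q1*q3 + q0*q2)],
        [2 * (q1*q2 + q0*q3), q0*q0 - q1*q1 + q2*q2 - q3*q3, 2 * (q2*q3 - q0*q1)],
        [2 * (q1*q3 - q0*q2), 2 * (q2*q3 + q0*q1), q0*q0 - q1*q1 - q2*q2 + q3*q3],
    ]

def world_to_camera(point, rotation, translation):
    """Assumed convention: camera coordinates = R * world point + t."""
    return tuple(
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
```

A quaternion has four components constrained to unit length, so the rotation contributes three degrees of freedom; together with the three translation components this gives the six extrinsic camera parameters.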

The six extrinsic camera parameters, the one intrinsic camera parameter, and the N scene structure parameters are assembled into a state vector as in Equation 13 below.

Using the extended Kalman filter, the state vector X and the measurement vector can be obtained, and depth information at the feature points of the scene structure can also be acquired (step 1204). However, this depth information does not yet include the scaling parameter. A scaling parameter having a value between 0 and 255 is therefore obtained using the mean and standard deviation of the depth information (step 1206).

The scaling parameter, which has a value between 0 and 255, can be obtained from Equation 14 below.

In Equation 14 above, the two quantities used are the standard deviation of the scene structure values and the mean of the scene structure values.

A disparity image based on scene structure estimation is obtained using the scaling parameter.
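Equation 14 is not reproduced in this text, so its exact form is unknown. The sketch below shows one generic way to map depth values into the range 0 to 255 using their mean and standard deviation. It is an assumption for illustration and should not be read as the equation of the embodiment; `spread` is a hypothetical parameter that sets how many standard deviations span half of the output range.

```python
import statistics

def scale_depths(depths, spread=2.0):
    """Map depth values to integers in 0..255 using their mean and std deviation.

    Assumption: the mean maps to mid-range and values more than `spread`
    standard deviations from the mean are clipped to 0 or 255.
    """
    mean = statistics.mean(depths)
    std = statistics.pstdev(depths)
    if std == 0.0:
        return [128] * len(depths)  # constant depth maps to mid-range
    scaled = []
    for z in depths:
        value = 127.5 + 127.5 * (z - mean) / (spread * std)
        scaled.append(int(round(min(255.0, max(0.0, value)))))
    return scaled
```

Normalizing by the mean and standard deviation makes the output independent of the arbitrary scale of the recovered structure, which is why a separate scaling step is needed after filtering.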

As described above, according to a preferred embodiment of the present invention, a two-dimensional image is converted into a three-dimensional image in consideration of whether the camera is moving, which has the advantage of enabling adaptive three-dimensional image conversion.

In addition, according to a preferred embodiment of the present invention, scene structure estimation is used for the three-dimensional conversion when camera motion is present, which has the advantage of enabling more efficient three-dimensional image conversion.

Claims

A feature point extractor for dividing a region of the 2D image and extracting feature points of the divided region;

A camera motion recognition unit for recognizing camera motion in the 2D image;

A motion parallax converter configured to generate a disparity image of the 2D image through motion parallax if the camera motion is not recognized by the camera motion recognition unit;

A scene structure estimation converter configured to generate a disparity image of the 2D image through scene structure estimation when the camera motion is recognized by the camera motion recognition unit; and

An image synthesizer configured to synthesize a 3D image using the disparity image generated by the motion parallax converter or the scene structure estimation converter.

The apparatus of claim 1,

wherein the feature point extractor comprises:

A color dividing unit configured to divide regions of the 2D image using color information;

A labeling unit configured to assign label numbers to connected pixels in the image divided by the color dividing unit;

A contour extractor configured to extract contours from the labeled image; and

A feature point acquisition unit configured to select feature points for each of the divided regions using the contour extraction information.

The apparatus of claim 2,

wherein the color dividing unit performs color division using a mean shift algorithm.

The apparatus of claim 2,

wherein the labeling unit performs:

Searching the color-divided 2D image for a pixel to which no label number has been assigned;

Assigning a new label number to the found pixel; and

Assigning the same label number to the pixels connected to the found pixel.

The apparatus of claim 2,

wherein the feature point acquisition unit selects the feature points for each of the divided regions either randomly or by using frequency information.

The apparatus of claim 5,

wherein the number of feature points selected by the feature point acquisition unit for each region is based on the size of the region.

The apparatus of claim 1,

wherein the camera motion recognition unit searches for motion vectors in the boundary regions of the image and, if the searched motion vectors have a consistent direction in at least three boundary regions, determines that a camera panning operation is present.

The apparatus of claim 1,

wherein the camera motion recognition unit searches for motion vectors in the corner portions of the boundary region of the image and, if the searched motion vectors are directed inward or outward, determines that a camera zoom-in or camera zoom-out operation is present.

The apparatus according to claim 7 or 8,

wherein, when searching for motion vectors in the boundary region, the camera motion recognition unit obtains a motion magnitude for each pixel of the boundary region, sets the largest motion magnitude as the representative motion value of the boundary region, and determines that motion is present if the representative value is greater than or equal to a preset threshold.

The apparatus of claim 1,

wherein the motion parallax converter tracks the movement of the feature points using a Kanade-Lucas-Tomasi (KLT) feature point tracker.

The apparatus of claim 10,

wherein the motion parallax converter estimates motion by calculating the positional difference of the feature points between frames from the tracking result of the Kanade-Lucas-Tomasi (KLT) feature point tracker.

The apparatus of claim 11,

wherein the motion parallax converter converts the motion estimation information into disparity to generate a disparity image.

The apparatus of claim 1,

wherein the scene structure estimation converter calculates depth information of the 2D image by setting a camera model using camera intrinsic parameters, including the focal length of the camera, and calculating a state vector using an extended Kalman filter.

The apparatus of claim 13,

wherein the depth information is computed by an equation expressed in terms of the standard deviation of the scene structure values and the mean of the scene structure values.

(a) dividing regions of the 2D image and extracting feature points of the divided regions;

(b) recognizing camera motion in the 2D image;

(c) generating a disparity image of the 2D image through motion parallax if camera motion is not recognized in step (b);

(d) generating a disparity image of the 2D image through scene structure estimation if camera motion is recognized in step (b); and

(e) synthesizing a 3D image using the disparity image generated in step (c) or step (d).

The method of claim 15,

wherein step (a) comprises:

A color division step of dividing regions of the 2D image using color information;

A labeling step of assigning label numbers to connected pixels in the image divided by the color division;

A contour extraction step of extracting contours from the labeled image; and

A feature point obtaining step of selecting feature points for each of the divided regions using the contour extraction information.

The method of claim 16,

wherein the labeling step comprises:

Searching the color-divided 2D image for a pixel to which no label number has been assigned; assigning a new label number to the found pixel; and assigning the same label number to the pixels connected to the found pixel.

The method of claim 16,

wherein the feature point obtaining step selects the feature points for each of the divided regions either randomly or by using frequency information.

The method of claim 15,

wherein step (b) searches for motion vectors in the boundary regions of the image and, if the searched motion vectors have a consistent direction in at least three boundary regions, determines that a camera panning operation is present.

The method of claim 15,

wherein step (b) searches for motion vectors in the corner portions of the boundary region of the image and, if the searched motion vectors are directed inward or outward, determines that a camera zoom-in or camera zoom-out operation is present.

The method of claim 15,

wherein step (c) tracks the movement of the feature points using a Kanade-Lucas-Tomasi (KLT) feature point tracker.

The method of claim 22,

wherein step (c) estimates motion by calculating the positional difference of the feature points between frames from the tracking result of the Kanade-Lucas-Tomasi (KLT) feature point tracker.

The method of claim 22,

wherein step (c) generates a disparity image by converting the motion estimation information into disparity.

The method of claim 1,

wherein, in step (d), the depth information of the 2D image is calculated by setting a camera model using camera intrinsic parameters, including the focal length of the camera, and calculating a state vector using an extended Kalman filter.

The method of claim 25,

wherein the depth information is computed by an equation expressed in terms of the standard deviation of the scene structure values and the mean of the scene structure values.