KR20060036230A

KR20060036230A - 2d/3d converting method of compressed motion pictures using motion vectors therein, and device thereby

Info

Publication number: KR20060036230A
Application number: KR1020040085310A
Authority: KR
Inventors: 최병호; 송혁; 김제우; 김용환; 김태완
Original assignee: 전자부품연구원
Priority date: 2004-10-25
Filing date: 2004-10-25
Publication date: 2006-04-28
Also published as: KR100757259B1

Abstract

The present invention relates to a method of converting a two-dimensional image into a three-dimensional image using the motion vectors of a compressed video. The method comprises a first step of extracting the motion vector of each block from the compressed video; a second step of extracting objects by grouping similar motion vectors among the extracted motion vectors; a third step of determining the perspective (relative nearness) of each object from the motion of each extracted object; a fourth step of separating layers based on the determined perspective of each object; and a fifth step of composing a three-dimensional image by sequentially overlapping the layers.

2D, 3D, video, motion vector, layer, object, depth, parallax, MPEG, MOTION VECTOR, DISPARITY

Description

{2D / 3D CONVERTING METHOD OF COMPRESSED MOTION PICTURES USING MOTION VECTORS THEREIN, AND DEVICE THEREBY}

FIG. 1 is an illustration of a 2D/3D conversion apparatus according to prior art 1.

FIG. 2A is a block diagram of an MPEG decoder device that performs 2D/3D conversion according to a preferred embodiment of the present invention.

FIG. 2B is a step-by-step flowchart of a method of extracting objects from motion vectors according to a preferred embodiment of the present invention.

FIG. 2C is a diagram illustrating motion vectors in connection with FIG. 2B.

FIG. 3A is a step-by-step flowchart of a method of determining a layer for each object, following the method of FIG. 2B, according to a preferred embodiment of the present invention.

FIG. 3B is an exemplary diagram for explaining the case where objects overlap in the method of FIG. 3A.

FIG. 4A is a step-by-step flowchart of a method of synthesizing 3D image data layer by layer, following the method of FIG. 3A, according to a preferred embodiment of the present invention.

FIG. 4B is an exemplary diagram for explaining the layer composition order in the method of FIG. 4A.

FIG. 5 is an exemplary diagram for explaining a method of obtaining a motion vector using the similarity of neighboring motion vectors, according to a preferred embodiment of the present invention.

The present invention relates to a method of converting a two-dimensional (2D) image into a three-dimensional (3D) image, and more particularly to a method of converting a 2D image into a 3D image by constructing a layer for each object in the 2D image using the motion vector information contained in compressed video such as MPEG or H.264.

With the advent of the era of high-definition interactive multimedia, research on techniques for producing 3D stereoscopic images and on display devices for showing them is being actively pursued in order to meet viewers' demand for visual information. In addition, considering that most video content is still produced in 2D form, research on methods of converting 2D images into 3D images is also under way.

Representative techniques among the 2D-to-3D conversion methods now in wide use are Sanyo's Korean Patent No. 345630, "Method for determining the front-rear positional relationship of subjects and method for converting a 2D image into a 3D image" (registered 2002.11.30) (hereinafter "prior art 1"), and Korean Patent No. 230447, "Apparatus and method for converting 2D continuous images into 3D images" (1999.8.23) (hereinafter "prior art 2").

Prior art 1 is commonly called the MTD (modified time difference) method. It is based on the theory that a stereoscopic effect is perceived when the left-view and right-view images are produced from the current image and a delayed image, respectively. That is, the MTD method obtains delay information from the motion information of the compressed video, determines the delay range within the image, and then synthesizes a binocular image using the delayed image.

FIG. 1 shows the configuration of a 2D/3D conversion apparatus according to prior art 1.

Referring to FIG. 1, an ordinary 2D video signal for displaying a 2D image is input to an input terminal (1), and the 2D video signal input to the input terminal (1) is supplied to both a video switching circuit (2) and a field memory (5). The 2D video signal input to the field memory (5) is output after being delayed by a predetermined number of fields and is supplied to the video switching circuit (2). The delay amount of the field memory (5) is variably controlled in units of one field by a memory control circuit (6). The video switching circuit (2) is connected to an output terminal (3) that outputs the left video signal L and an output terminal (4) that outputs the right video signal R, and is controlled so that its output state is switched according to the direction of motion of the subject.

The 2D video signal input to the input terminal (1) is also supplied to a motion vector detection circuit (7). The motion vector detection circuit (7) detects a motion vector corresponding to the motion between video fields, that is, to the amount (speed) of movement of the subject. The detected motion vector is supplied to a CPU (8). The CPU (8) extracts the horizontal component of the detected motion vector and controls the memory control circuit (6) accordingly. That is, when the subject moves quickly and the motion vector is large, the delay amount of the field memory (5) is controlled to be small; when the subject moves slowly, or the motion vector is small as in slow-motion playback, the delay amount is controlled to be large.

When the horizontal component of the motion vector points from left to right, the CPU (8) controls the video switching circuit (2) according to what the motion vector was detected on. If the detected target is judged to be a subject located in front of the background, the video switching circuit (2) is controlled so that the delayed video signal becomes the right-eye video signal. If the detected target is judged to be the background located behind the subject, the video switching circuit (2) is controlled so that the delayed video signal becomes the left-eye video signal. Therefore, for scenes in a 2D video signal in which the subject or the background moves horizontally, the parallax arising from the speed of motion always places the subject in front of the background.

However, because the MTD method of prior art 1 selects one of the previous K frames as the delayed image based on motion information, a stereoscopic effect can be perceived for moving objects, for example, but not for objects with little motion, such as the background.

Prior art 2 improves on this drawback of prior art 1. Instead of generating the delayed image from any single frame preceding the current image, it generates the delayed image by synthesizing, for each block, block data from a plurality of previous frames.

In other words, prior art 1 and prior art 2 both fundamentally convert a 2D image into a 3D image using delay parallax. Because conventional conversion methods based on delay parallax determine a delay range and separate the left and right images accordingly, the delay range, or the similarity between the delayed image and the current image, varies as the motion changes. As a result, the observer experiences considerable fatigue and the stereoscopic effect is significantly degraded. In addition, the conventional MTD method must store a plurality of previous frames in a frame memory in order to determine the delayed image, which increases the memory size required for implementation.

Furthermore, according to the prior art, reproducing a 3D image requires an apparatus for acquiring 3D images, a compression algorithm, and a new storage scheme, so that new hardware must be additionally provided.

To solve the problems described above, an object of the present invention is to make it possible to convert a 2D image into a 3D image in an ordinary 2D video player without adding hardware.

To achieve this object, according to one aspect of the present invention, there is provided a method of converting a 2D image into a 3D image using the motion vectors of a compressed video, comprising: a first step of extracting the motion vector of each block from the compressed video; a second step of extracting objects by grouping similar motion vectors among the extracted motion vectors; a third step of determining the perspective of each object from the motion of each extracted object; a fourth step of separating layers based on the determined perspective of each object; and a fifth step of composing a 3D image by sequentially overlapping the layers.

Here, the first step may include setting the already extracted motion vectors of neighboring blocks as the motion vector of the current block.

The second step may include removing noise errors within a group of similar motion vectors using the displacements of the motion vectors of neighboring blocks, and setting the group of motion vectors from which the noise errors have been removed as the interior region of an object while setting the outline of the object using the boundary region data of the 2D image.

The fourth step may include setting a depth value for each object corresponding to the determined perspective of that object, and separating the extracted objects into layers corresponding to the set depth values. In the depth value setting step, it is preferable to assign a representative depth value to objects whose depths differ only slightly.

Finally, the fifth step may include composing a background picture using background data, and sequentially overlapping the layers on the background picture according to their depth values. The background composition step may include using the background data of a previous frame image to fill in blank regions of the background picture that arise from the motion of objects.

Hereinafter, preferred embodiments of the present invention are described with reference to the accompanying drawings.

FIG. 2A shows the configuration of an MPEG decoder device capable of 2D/3D conversion according to a preferred embodiment of the present invention. The device comprises: an entropy decoder (110) that extracts quantized transform coefficients, motion vectors (MV), and the like from the encoded bitstream of MPEG compressed data; an inverse quantization and inverse discrete cosine transform (IDCT) unit (120) that reconstructs an error image by inverse-quantizing and inverse-transforming the quantized transform coefficients; a motion compensation unit (130) that performs motion compensation using the motion vectors (MV) output from the entropy decoder (110) and a previous image frame stored in the frame buffer described below; an adder (140) that adds the error image to the output of the motion compensation unit (130) to reconstruct a frame for the left image; and a frame buffer (150) that stores the left image frame output from the adder (140) for reconstruction of subsequent image frames.

Compressed 2D video data generally consists of reference image data followed by a number of sequential frames, each encoded as motion vector information plus data obtained by processing the difference image between the original image and the image reconstructed using the motion vectors. The reference image data is usually called an I frame, and image data constructed from a reference image is called a P or B frame. For P and B frames, the MPEG decoder device configured as described above can reconstruct the image from the motion vector information and the corresponding difference image data.

Referring again to FIG. 2A, the MPEG decoder device includes a right image reconstruction unit (200) for reconstructing the right image. The right image reconstruction unit (200) comprises: an object extraction unit (210) that extracts objects from the motion vectors (MV) output from the entropy decoder (110) and the previous image frame stored in the frame buffer (150); a motion determination unit (220) that determines the motion of each extracted object; a per-object perspective determination unit (230) that determines the perspective of each object according to the per-object motion determined by the motion determination unit; a per-object layer determination unit (240) that separates the objects into layers according to the perspective information determined by the per-object perspective determination unit; and an image data synthesis unit (250) that sequentially synthesizes the layers separated by the per-object layer determination unit to compose 3D image data and outputs it as the right image.

Regarding the function of the object extraction unit (210), FIG. 2B shows, step by step, the method by which the object extraction unit (210) of FIG. 2A extracts objects from motion vectors.

Referring to FIG. 2B, motion vectors are first extracted from the information decoded by the entropy decoder (110) of FIG. 2A (S260). At this point, by exploiting the correlation with the motion vectors of neighboring blocks, the already extracted motion vectors of the neighboring blocks can be set as a virtual motion vector of the current block, so that the current motion vector is obtained with minimal computation.

Depending on the standard, compressed 2D video carries motion vector information for blocks of various sizes, from 16x16-pixel macroblocks down to 4x4 blocks, and the motion vectors recovered from MPEG compressed video may differ from the actual motion (optical flow). That is, during encoding, the algorithm that searches for the best-matching block does not look for the true motion; to reduce the bitrate of the compressed video, it may determine the motion vector by using a cost function to find the block with the smallest error. Some motion vectors therefore do not follow the actual motion and may point in directions unrelated to it, so using the motion vectors recovered from the compressed video as they are degrades image quality. To address this, steps S270 to S290 are performed after step S260.

In step S270, motion vectors with similar displacements among those extracted in step S260 are grouped by region. That is, by observing the displacements of the motion vectors extracted from the compressed video, vectors with similar displacements that belong to the same object (a physical object, the background, and so on) can be grouped together.
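The grouping of step S270 can be sketched as a region-growing pass over the block grid. The following Python is a minimal illustration only, not the patent's implementation: the function name `group_motion_vectors`, the 4-connected neighborhood, and the per-component tolerance `tol` are all assumptions introduced here.

```python
from collections import deque


def group_motion_vectors(mv_grid, tol=1.0):
    """Label 4-connected blocks whose motion vectors are similar.

    mv_grid: list of rows, each a list of (dx, dy) tuples, one per block.
    tol: assumed similarity threshold per vector component.
    Returns a grid of integer group labels with the same shape.
    """
    rows, cols = len(mv_grid), len(mv_grid[0])
    labels = [[-1] * cols for _ in range(rows)]
    next_label = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # block already belongs to a group
            labels[r][c] = next_label
            queue = deque([(r, c)])
            while queue:  # breadth-first region growing
                y, x = queue.popleft()
                cur = mv_grid[y][x]
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if 0 <= ny < rows and 0 <= nx < cols and labels[ny][nx] == -1:
                        nb = mv_grid[ny][nx]
                        if abs(nb[0] - cur[0]) <= tol and abs(nb[1] - cur[1]) <= tol:
                            labels[ny][nx] = next_label
                            queue.append((ny, nx))
            next_label += 1
    return labels
```

Each resulting label corresponds to one candidate object (or the background); in the terms of the text, it marks only the approximate interior of that object.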

Next, in step S280, a motion vector whose value is entirely different from the motion vectors of its neighboring blocks is defined as noise, and the noise error is removed by correcting that noise vector using the neighboring vectors.
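One plausible way to realize the correction of step S280 is a median test against the surrounding blocks. The sketch below assumes an 8-neighborhood, a component-wise median, and a deviation `threshold`; none of these choices is specified in the text, and `correct_noise_vectors` is a hypothetical name.

```python
from statistics import median


def correct_noise_vectors(mv_grid, threshold=3.0):
    """Replace outlier motion vectors with the median of their neighbors.

    mv_grid: list of rows of (dx, dy) tuples, one per block.
    threshold: assumed maximum allowed deviation from the neighbor median.
    Returns a corrected copy; the input grid is not modified.
    """
    rows, cols = len(mv_grid), len(mv_grid[0])
    out = [list(row) for row in mv_grid]
    for r in range(rows):
        for c in range(cols):
            # Collect the vectors of the up-to-8 surrounding blocks.
            neigh = [mv_grid[y][x]
                     for y in range(max(0, r - 1), min(rows, r + 2))
                     for x in range(max(0, c - 1), min(cols, c + 2))
                     if (y, x) != (r, c)]
            if not neigh:
                continue  # a 1x1 grid has nothing to compare against
            med = (median(v[0] for v in neigh), median(v[1] for v in neigh))
            dx, dy = mv_grid[r][c]
            if abs(dx - med[0]) > threshold or abs(dy - med[1]) > threshold:
                out[r][c] = med  # treat as noise and correct it
    return out
```

Reading from the original grid and writing to a copy keeps one corrected vector from influencing the test applied to its neighbors.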

The motion vectors of each group, classified in step S270 and corrected for noise errors in step S280, are assumed to cover the approximate interior of an object. In step S290, the boundary region data of the object (the object's outline) is determined from the previous image frame stored in the frame buffer (150), and an accurate object is thereby finally extracted (S300).

When objects are constructed from motion vectors according to the magnitude of motion in this way, the object data does not follow the true object region; instead, it forms an aggregate of rectangles whose size is set by the blocks to which the motion vectors belong. The incomplete object data built from motion vectors as described above is then given an accurate outline using the boundary region data of the reconstructed 2D image.

For example, in FIG. 2C the motion vectors (shown as arrows) inside object A and inside object B have similar values, so object A and object B can be identified by grouping vectors with similar displacements. More preferably, the objects can be extracted more accurately by also using the outline information of object A and object B.

FIG. 3A shows the procedure of a method of determining a layer for each object, following the method of FIG. 2B, and FIG. 3B is an exemplary diagram for explaining perspective determination when objects overlap in the method of FIG. 3A.

Referring to FIG. 3A, objects are extracted from the image in step S300 according to the method of FIG. 2B, the motion of each extracted object is determined (S310), and the perspective of each object is determined according to its motion parallax (S320). In general, an image consists of a background and a foreground, and for the background the parallax between the images seen by a person's left and right eyes is very small. For foreground objects, the closer the object, the larger the parallax, and the parallax gradually decreases between the nearest foreground and the background. Using this property, the perspective of each object can be determined from its motion parallax.

In step S330, because many of the objects extracted in the preceding steps exist in the image, they are classified according to the depth information corresponding to each object's perspective to form layers. When the depths of objects differ only slightly, they are hard to tell apart. Therefore, when an image contains a wide variety of objects, a separate depth need not be set for each object; instead, objects whose depths differ only slightly can be given a representative depth value so that the image is organized into a fixed number of layers.
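The assignment of representative depth values in step S330 amounts to quantizing per-object parallax into a small number of bins. The following Python sketch assumes uniform bins between the smallest and largest parallax and a hypothetical function name `assign_layers`; the text does not say how the representative values are chosen.

```python
def assign_layers(object_parallax, num_layers=3):
    """Map each object's motion parallax to a layer index (0 = farthest).

    object_parallax: dict of object name -> parallax magnitude, where a
    larger parallax is taken to mean a nearer object, as in step S320.
    num_layers: assumed fixed number of layers.
    """
    lo = min(object_parallax.values())
    hi = max(object_parallax.values())
    span = hi - lo
    layers = {}
    for name, parallax in object_parallax.items():
        if span == 0:
            layers[name] = 0  # all objects share one depth
        else:
            index = int((parallax - lo) / span * num_layers)
            layers[name] = min(index, num_layers - 1)  # clamp the maximum
    return layers
```

With this scheme, two objects whose parallax values are close fall into the same bin and so share one layer, which is the effect the paragraph above describes.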

As shown in FIG. 3B, object A and object B were at different positions in the previous frame and overlap in the current image frame, so their perspective is not easy to determine with conventional methods. With the method of FIG. 3A described above, however, the objects overlap only after each has moved, so the perspective of the objects can be determined from the motion of each object, using the information on whether it moves behind or in front of the other. In the case of FIG. 3B, object A is determined to be in front of object B.

FIG. 4A shows, step by step, a method of synthesizing 3D image data layer by layer, following the method of FIG. 3A, according to a preferred embodiment of the present invention, and FIG. 4B is an exemplary diagram for explaining the layer composition order in the method of FIG. 4A.

As shown in FIG. 4A, a background picture is first composed using background data and temporal data (S400). Because of object motion, no background data exists in the region that an object previously occupied, so a complete background picture is obtained by filling that part of the image using temporal data. Here, temporal data means a frame other than the current one in time. Whereas the prior art takes an entire frame from the past, the present invention does not fetch the entire background; for the empty region only, it uses the motion vector to fetch and fill in just the block from which the motion originated.
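The hole-filling idea of step S400 can be sketched as follows. This is an assumption-laden illustration: frames are plain 2D lists, `holes` is a boolean mask of uncovered positions, a single displacement `mv` is applied to every hole, and source coordinates are clamped to the frame. The function name `fill_background_holes` is hypothetical.

```python
def fill_background_holes(background, previous, holes, mv=(0, 0)):
    """Fill uncovered background positions from a previous frame.

    background: 2D list of pixel values (hole positions may hold anything).
    previous: 2D list of the same shape holding an earlier frame.
    holes: 2D list of booleans, True where background data is missing.
    mv: assumed (dx, dy) displacement pointing at the source position.
    Only hole positions are read from `previous`; the rest is kept.
    """
    rows, cols = len(background), len(background[0])
    dx, dy = mv
    filled = [list(row) for row in background]
    for y in range(rows):
        for x in range(cols):
            if holes[y][x]:
                # Clamp the source position so it stays inside the frame.
                sy = min(max(y + dy, 0), rows - 1)
                sx = min(max(x + dx, 0), cols - 1)
                filled[y][x] = previous[sy][sx]
    return filled
```

Copying only the masked positions, rather than the whole earlier frame, mirrors the contrast the text draws with the prior art.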

Next, according to the depth values described above, the layers are synthesized in order starting from the lowest, farthest layer (S410), producing a 3D image in which the layers are overlapped (S420).

Referring to FIG. 4B, the parallelogram background is composed first. The rectangular object B is then placed on the background image, followed by the nearest object, the circular object A. As a result, in the region where the objects overlap, the nearest object, which is synthesized last, is displayed.
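The back-to-front order of steps S410 and S420 is essentially a painter's algorithm. The sketch below assumes each layer is a 2D list with `None` for transparent pixels. The per-layer horizontal `shift` is an extra assumption added here to suggest how a displaced view might be produced; the text itself does not describe such a shift, and a shift of 0 gives plain overlapping.

```python
def composite_layers(background, layers):
    """Overlay layers on a background from farthest to nearest.

    background: 2D list of pixel values.
    layers: list of (depth_index, pixels, shift) tuples, where a larger
        depth_index means a nearer layer, pixels is a 2D list with None
        for transparent positions, and shift is an assumed horizontal
        offset in pixels (not taken from the text).
    """
    rows, cols = len(background), len(background[0])
    out = [list(row) for row in background]
    # Sort so the farthest layer is painted first and the nearest last.
    for _, pixels, shift in sorted(layers, key=lambda item: item[0]):
        for y in range(rows):
            for x in range(cols):
                value = pixels[y][x]
                target_x = x + shift
                if value is not None and 0 <= target_x < cols:
                    out[y][target_x] = value  # nearer layers overwrite
    return out
```

Because nearer layers are painted last, the overlapping region shows the nearest object, matching the description of FIG. 4B.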

FIG. 5 is an exemplary diagram for explaining a method of obtaining a motion vector using the similarity of neighboring motion vectors, according to a preferred embodiment of the present invention.

For example, blocks are organized in macroblock units in MPEG2, in 8×8 units in MPEG4, and in finer 4×4 units in H.264, and the encoder forms motion vectors in units of fixed-size blocks within an object. When motion vectors are extracted, blocks are reproduced in raster-scan order, that is, from left to right and from top to bottom, so the blocks in the line above and the block to the left of the block whose motion vector is currently being extracted have already been reproduced. Since, as described above, the motion vectors within an object have similar direction and magnitude, an accurate parallax within the object can be calculated by deriving the motion vector of the current block from the motion vectors of the already reproduced neighboring blocks.

Referring to FIG. 5, the motion vectors of the block above (T, for Top) the block (C) whose motion vector is currently being extracted, the block to the right of that upper block (TR, for Top Right), and the block to the left of block C (L, for Left) are denoted DV_T, DV_TR, and DV_L, respectively, and the disparity is calculated from them. These three blocks (T, TR, L) had their motion vectors obtained and were fully reproduced before the current block (C), and because the motion vectors of these neighboring blocks (T, TR, L) have similar direction and magnitude, they will be classified as the same object. Therefore, when the current block (C) is classified into the same object region as these neighboring blocks by the method described above, the disparity of the current block can be determined quickly using the motion vectors (DV_T, DV_TR, DV_L) of the neighboring blocks.
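The text does not state how DV_T, DV_TR, and DV_L are combined. One common choice in video coding is a component-wise median of the three neighbors, and the sketch below uses that as an assumption; `predict_block_vector` is a hypothetical name.

```python
from statistics import median


def predict_block_vector(dv_t, dv_tr, dv_l):
    """Estimate the current block's vector from three decoded neighbors.

    dv_t, dv_tr, dv_l: (dx, dy) vectors of the top, top-right, and left
    blocks. The component-wise median is an assumed combination rule.
    """
    return (median((dv_t[0], dv_tr[0], dv_l[0])),
            median((dv_t[1], dv_tr[1], dv_l[1])))


# Usage: three neighbors that belong to the same object.
estimate = predict_block_vector((4, 0), (5, 1), (3, 0))
```

A median is robust to a single deviating neighbor, which suits the premise that the neighbors mostly belong to the same object; a mean or a weighted combination would be an equally valid reading of the text.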

As described above, converting a 2D image into a 3D image uses motion vectors extracted by an existing compression decoder. Because an existing 2D video decoder can provide these vectors, the 2D/3D conversion function according to the present invention can be installed in an existing 2D video player as a software algorithm, without adding separate hardware. Accordingly, the 2D/3D conversion apparatus according to the present invention can be implemented as a device that is connected to a conventional 2D video player and converts its output into 3D images, or, where needed, by adding the 2D/3D conversion algorithm according to the present invention to a conventional 2D video player.

Although preferred embodiments of the present invention have been described above, they are merely exemplary, and those of ordinary skill in the art will understand that various modifications and other equivalent embodiments are possible. Therefore, the scope of protection of the present invention should be defined by the following claims.

As described above, according to the present invention, the conversion can be performed directly using an existing 2D video playback apparatus, so the invention can be used in all conventional 2D video players.

Claims

A method of converting a 2D image into a 3D image using motion vectors of a compressed video, the method comprising:

a first step of extracting a motion vector of each block from the compressed video;

a second step of extracting objects by grouping similar motion vectors among the extracted motion vectors;

a third step of determining the perspective of each object from the motion of each extracted object;

a fourth step of separating layers based on the determined perspective of each object; and

a fifth step of constructing a 3D image by sequentially overlapping the layers.

The method of claim 1, wherein the first step comprises setting the motion vectors of already-extracted neighboring blocks as the motion vector of the current block.

The method of claim 1, wherein the second step comprises:

removing noise errors within a group of similar motion vectors by using the displacement of the motion vectors of neighboring blocks; and

setting the group of motion vectors from which the noise errors have been removed as the internal region of an object, and setting the outline of the object by using boundary region data of the 2D image.

The method of claim 1, wherein the fourth step comprises:

setting a depth value for each object corresponding to the determined perspective of that object; and

dividing the extracted objects into respective layers corresponding to the set depth values.

The method of claim 4, wherein the setting of the depth value provides a representative depth value.

The method of claim 4, wherein the fifth step comprises:

constructing a background screen using background data; and

sequentially superimposing the layers on the background screen according to their depth values.

The method of claim 6, wherein the constructing of the background screen comprises compensating for blank areas of the background screen caused by the movement of an object by using background data of the previous frame image.

The method of any one of claims 1 to 3, wherein the fifth step comprises constructing a background screen using background data.

The method of claim 8, wherein the constructing of the background screen is performed.

An apparatus for converting a 2D image into a 3D image using motion vectors of a compressed video, the apparatus comprising:

a motion vector extractor that extracts a motion vector of each block from the compressed video;

an object extractor that extracts objects by grouping similar motion vectors among the extracted motion vectors;

a perspective determination unit that determines the perspective of each object from the motion of each extracted object;

a layer determination unit that separates layers based on the determined perspective of each object; and

an image synthesizer that sequentially overlaps the layers to construct a 3D image.

The apparatus of claim 10, wherein the motion vector extractor sets the motion vectors of already-extracted neighboring blocks as the motion vector of the current block.

The apparatus of claim 10, wherein the object extractor:

removes noise errors within a group of similar motion vectors by using the displacement of the motion vectors of neighboring blocks;

sets the group of motion vectors from which the noise errors have been removed as the internal region of an object; and

sets the outline of the object by using boundary region data of the 2D image.

The apparatus of claim 10, wherein the perspective determination unit:

sets a depth value for each object corresponding to the determined perspective of that object; and

divides the extracted objects into respective layers corresponding to the set depth values.

The apparatus of claim 13, wherein the set depth value is a representative depth value.

The apparatus of claim 13, wherein the layer determination unit:

constructs a background screen using background data; and

sequentially overlaps the layers on the background screen according to their depth values.

The apparatus of claim 15, wherein the layer determination unit compensates for blank areas of the background screen caused by the movement of an object by using background data of the previous frame image.

The apparatus of any one of claims 10 to 12, wherein the layer determination unit constructs a background screen using background data.

The apparatus of claim 17, wherein the layer determination unit,