KR20120024431A

KR20120024431A - Video processing method and apparatus based on multiple texture image using video excitation signal

Info

Publication number: KR20120024431A
Application number: KR1020110080664A
Authority: KR
Inventors: 홍성훈
Original assignee: 한국전자통신연구원
Priority date: 2010-09-01
Filing date: 2011-08-12
Publication date: 2012-03-14
Also published as: KR101469513B1

Abstract

PURPOSE: A multi-texture image-based video processing method and apparatus using a video excitation signal are provided, which approximate a plurality of predetermined texture images using the similarity of their spatiotemporal position transformation variables. CONSTITUTION: A video processing apparatus calculates spatiotemporal position transformation variables by tracking a plurality of feature points (S13). The apparatus defines a plurality of texture images (S14) and approximates the texture images (S15). The apparatus defines each texture image as the sum of a plurality of texture blocks (S16). The apparatus compresses the video excitation signal, the variables of the texture synthesis filter, and the plurality of spatiotemporal position transformation variables corresponding to the texture images (S17).

Description

Image processing method and apparatus based on multiple texture image using video excitation signal

The present invention relates to a multi-texture image-based image processing method and apparatus using an image excitation signal. More specifically, it relates to such a method and apparatus capable of processing video at optimal image quality at a low bit rate.

Conventional video processing is based on transform-domain processing, for example with the discrete cosine transform, of a signal that has undergone inter-frame motion estimation. However, because the characteristics of the video are estimated and modeled inaccurately, conventional methods have difficulty representing the varied characteristics of real video. As a result, the difference between the transformed video signal and the original video signal grows, and the bit rate increases when the video signal is compressed. To address this problem, video compression standards represented by MPEG-1/2/4 and H.261/263/264 have been proposed, but image quality still degrades severely in low-bit-rate compression, such as at a bit rate of 1/500 of the original size. In addition, 1/n-pixel motion estimation and compensation, adaptive-block-size transform-domain processing, multiple-reference-frame motion estimation and compensation, and generalized B-frame processing have been proposed and used, yet image quality degradation in low-bit-rate compression remains severe.

An object of the present invention is to represent diverse video characteristics through a plurality of texture images and the spatiotemporal position transformation variables of those texture images.

Another object of the present invention is to compress the original video using only a plurality of texture images and the corresponding spatiotemporal position transformation variables, thereby providing a compressed video that is markedly smaller than the original. Furthermore, the present invention defines each of the plurality of texture images as the sum of texture blocks, each being the output of a texture synthesis filter driven by an image excitation signal represented by a Gaussian function, with the aim of providing a compressed video that is smaller still relative to the original size.

A further object of the present invention is to reduce the size of the compressed video further by approximating the plurality of predetermined texture images using the similarity of their spatiotemporal position transformation variables.

A further object of the present invention is to process video at optimal image quality at a low bit rate.

To achieve the above objects, a multi-texture image-based image processing method according to the present invention includes: classifying an input video into shot-unit videos and selecting one of the plurality of frames of a shot-unit video as a seed image; detecting a plurality of feature points in the seed image; tracking the plurality of feature points across the plurality of frames of the shot-unit video and calculating a spatiotemporal position transformation variable for each feature point; defining a plurality of texture images using feature points whose spatiotemporal position transformation variables correspond to one another; and defining each of the plurality of texture images as the sum of a plurality of texture blocks, each texture block being the output of a texture synthesis filter that receives an image excitation signal as its input.

Here, the image excitation signal may be represented by a two-dimensional Gaussian function.

The method may further include compressing the image excitation signal of each of the plurality of texture blocks that define the plurality of texture images, the variables of the texture synthesis filter, and the spatiotemporal position transformation variable corresponding to each of the plurality of texture images.

Here, the compressing step compresses the image excitation signal, the variables of the texture synthesis filter, and the spatiotemporal position transformation variables by a bitstream compression method.

The method may further include approximating the plurality of texture images by merging, into a single texture image, those texture images whose spatiotemporal position transformation variables have a similarity within a preset threshold, the similarity being calculated by obtaining correlation characteristics of the texture image signals.

Here, the step of detecting the plurality of feature points detects, as feature points, points whose amount of change across the plurality of frames is equal to or greater than a preset value.

The method may further include: decompressing the compressed image excitation signal, the variables of the texture synthesis filter, and the spatiotemporal position transformation variable corresponding to each texture image; generating the plurality of texture blocks using the image excitation signal and the variables of the texture synthesis filter, and summing the texture blocks to generate the plurality of texture images; matching each texture image with the spatiotemporal position transformation variable corresponding to it; generating a visual texture using the texture image and the spatiotemporal position transformation variable; and combining the visual textures generated for the respective texture images.

The method may further include filtering and correcting artifacts at the boundaries where the visual textures are combined.

To achieve the above objects, a multi-texture image-based image processing apparatus according to the present invention includes: a seed image selection unit that classifies an input video into shot-unit videos and selects one of the plurality of frames of a shot-unit video as a seed image; a feature point detection unit that detects a plurality of feature points in the seed image; a variable calculation unit that tracks the plurality of feature points across the plurality of frames of the shot-unit video and calculates a spatiotemporal position transformation variable for each feature point; a texture image definition unit that defines a plurality of texture images using feature points whose spatiotemporal position transformation variables correspond to one another; and a texture block definition unit that defines each of the plurality of texture images as the sum of a plurality of texture blocks, each texture block being the output of a texture synthesis filter that receives an image excitation signal as its input.

The apparatus may further include a compression unit that compresses the image excitation signal of each of the plurality of texture blocks that define the plurality of texture images, the variables of the texture synthesis filter, and the spatiotemporal position transformation variable corresponding to each of the plurality of texture images.

Here, the compression unit compresses the image excitation signal, the variables of the texture synthesis filter, and the spatiotemporal position transformation variables separately, each by a bitstream compression method.

The apparatus may further include an approximation unit that approximates the plurality of texture images by merging, into a single texture image, those texture images whose spatiotemporal position transformation variables have a similarity within a preset threshold, the similarity being calculated by obtaining correlation characteristics of the texture image signals.

Here, the feature point detection unit detects, as feature points, points whose amount of change across the plurality of frames is equal to or greater than a preset value.

The apparatus may further include: a decompression unit that decompresses the compressed image excitation signal, the variables of the texture synthesis filter, and the spatiotemporal position transformation variable corresponding to each texture image; a texture image generation unit that generates the plurality of texture blocks using the image excitation signal and the variables of the texture synthesis filter, and sums the plurality of texture blocks to generate the texture image; a matching unit that matches each texture image with the spatiotemporal position transformation variable corresponding to it; a visual texture generation unit that generates a visual texture using the texture image and the spatiotemporal position transformation variable; and a visual texture combining unit that combines the visual textures generated for the respective texture images.

The apparatus may further include a correction unit that filters and corrects artifacts at the boundaries where the visual textures are combined.

According to the present invention, diverse video characteristics can be represented through a plurality of texture images and the spatiotemporal position transformation variables of those texture images.

In addition, the present invention can compress the original video using only a plurality of texture images and the corresponding spatiotemporal position transformation variables, and can therefore provide a compressed video that is markedly smaller than the original. Furthermore, the present invention defines each of the plurality of texture images as the sum of texture blocks, each being the output of a texture synthesis filter driven by an image excitation signal represented by a Gaussian function, thereby providing a compressed video that is smaller still relative to the original size.

In addition, the present invention can further reduce the size of the compressed video by approximating the plurality of predetermined texture images using the similarity of their spatiotemporal position transformation variables.

In addition, the present invention can process video at optimal image quality at a low bit rate. That is, it can minimize image quality degradation at a low bit rate such as 1/500 of the original size.

FIG. 1 is a flowchart illustrating the encoding method in the multi-texture image-based image processing method according to the present invention.
FIG. 2 is a diagram illustrating the encoding method in the multi-texture image-based image processing method according to the present invention.
FIG. 3 is a flowchart illustrating the decoding method in the multi-texture image-based image processing method according to the present invention.
FIG. 4 is a diagram illustrating the decoding method in the multi-texture image-based image processing method according to the present invention.
FIG. 5 is a block diagram showing the configuration of the multi-texture image-based image processing apparatus according to the present invention.

The present invention is described in detail below with reference to the accompanying drawings. Repeated descriptions, and detailed descriptions of well-known functions and configurations that could unnecessarily obscure the gist of the present invention, are omitted. The embodiments are provided so that a person of ordinary skill in the art can understand the present invention more completely. Accordingly, the shapes and sizes of elements in the drawings may be exaggerated for clarity.

The encoding method in the multi-texture image-based image processing method according to the present invention is described first.

FIG. 1 is a flowchart illustrating the encoding method in the multi-texture image-based image processing method according to the present invention. FIG. 2 is a diagram illustrating the same encoding method.

Referring to FIGS. 1 and 2, the encoding method first receives a video (200) composed of a plurality of frames (S10).

The input video (200) is then classified into shot-unit videos, and one of the plurality of frames of a shot-unit video is selected as the seed image (210) (S11). The frames of the shot-unit video other than the seed image (210) are defined as remaining frame images (220). That is, when a shot-unit video consists of k frames, one seed image is selected and the other k-1 frames are defined as remaining frame images (220). Here, a shot-unit video corresponds to footage captured continuously by a single camera.
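The split of a shot into one seed image and k-1 remaining frames can be sketched as follows. This is only an illustration: the function name `split_shot` and the `seed_index` parameter are hypothetical, and the text does not state which frame of the shot is chosen as the seed.

```python
def split_shot(shot_frames, seed_index=0):
    """Split one shot into (seed image, remaining frame images).

    Assumption: the seed is picked by index; the source only says that
    one of the k frames is selected.
    """
    seed = shot_frames[seed_index]
    remaining = shot_frames[:seed_index] + shot_frames[seed_index + 1:]
    return seed, remaining


# A shot of k = 4 frames yields 1 seed image and k - 1 = 3 remaining frames.
seed, remaining = split_shot(["f0", "f1", "f2", "f3"], seed_index=1)
```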

A plurality of feature points are detected in the seed image (210) selected in step S11 (S12). Here, points whose amount of change across the plurality of frames of the shot-unit video is equal to or greater than a preset value may be detected as feature points. That is, if a particular point changes by at least the preset value between the seed image (210) and the remaining frame images (220), that point may be detected as a feature point.

The plurality of feature points are then tracked across the plurality of frames of the shot-unit video, and a spatiotemporal position transformation variable is calculated for each feature point (S13). That is, a spatiotemporal position transformation variable is calculated that defines how each feature point changes between the seed image (210) and the remaining frame images (220). The variable may take the form of a function that represents, for example, the change in the position of the feature point over time.
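As a minimal sketch of step S13, the variable for one feature point can be modeled as its displacement from the seed-frame position in every frame. The representation as a list of (dx, dy) pairs and the name `motion_variable` are assumptions made for illustration; the text only says the variable may be a function of position change over time.

```python
def motion_variable(track):
    """Displacement of a tracked point relative to its seed-frame position.

    `track` is a list of (x, y) positions, one per frame, with the seed
    frame first (an assumed layout).
    """
    x0, y0 = track[0]
    return [(x - x0, y - y0) for x, y in track]


# Positions of one feature point in the seed frame and two later frames.
variable = motion_variable([(3, 4), (4, 4), (6, 5)])
```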

A plurality of texture images (211a, 212a, 213a, 214a, Na) are defined using feature points whose spatiotemporal position transformation variables (211b, 212b, 213b, 214b, Nb), calculated in step S13, correspond to one another (S14). Here, feature points whose spatiotemporal position transformation variables are identical may be linked to define one texture image.
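Step S14 can be pictured as grouping feature points by their variables. The sketch below assumes each variable is a list of per-frame displacements and groups points whose variables are exactly identical; `group_by_motion` is a hypothetical name, and how a pixel region is built around each group is not covered here.

```python
from collections import defaultdict


def group_by_motion(points, variables):
    """Group feature points that share an identical motion variable.

    Each returned group stands for the feature points of one texture image.
    """
    groups = defaultdict(list)
    for point, variable in zip(points, variables):
        groups[tuple(variable)].append(point)
    return list(groups.values())


points = [(1, 1), (2, 1), (5, 5)]
variables = [
    [(0, 0), (1, 0)],  # moves right by 1
    [(0, 0), (1, 0)],  # same motion, so same texture image
    [(0, 0), (0, 2)],  # moves down by 2
]
texture_groups = group_by_motion(points, variables)
```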

Among the plurality of texture images, those with similar spatiotemporal position transformation variables are merged into a single texture image as an approximation (S15). The similarity between spatiotemporal position transformation variables may be calculated by obtaining correlation characteristics of the texture image signals, and texture images whose similarity falls within a preset threshold may be merged into one texture image. In FIG. 2, the first texture image (211a) and the second texture image (212a), whose spatiotemporal position transformation variables are assumed to be highly similar, are merged, and the first spatiotemporal position transformation variable (211b) and the second spatiotemporal position transformation variable (212b) are merged correspondingly, producing the first approximated texture image (211a') and the first approximated spatiotemporal position transformation variable (211b'). Likewise, the third texture image (213a) and the fourth texture image (214a) are merged, and the third spatiotemporal position transformation variable (213b) and the fourth spatiotemporal position transformation variable (214b) are merged correspondingly, producing the second approximated texture image (213a') and the second approximated spatiotemporal position transformation variable (213b').
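The merging in step S15 might be sketched as below. Several choices here are assumptions rather than the patent's stated procedure: similarity is taken as the normalized correlation (cosine similarity) of flattened variables, a pair is merged when that value is at least `threshold`, and grouping is greedy against the first member of each group.

```python
import math


def similarity(a, b):
    """Normalized correlation of two flattened motion variables (assumed measure)."""
    dot = sum(p * q for p, q in zip(a, b))
    na = math.sqrt(sum(p * p for p in a))
    nb = math.sqrt(sum(q * q for q in b))
    if na == 0.0 or nb == 0.0:
        return 1.0 if na == nb else 0.0
    return dot / (na * nb)


def merge_similar(variables, threshold):
    """Greedily group indices of texture images whose variables are similar."""
    groups = []
    for i, var in enumerate(variables):
        for group in groups:
            if similarity(var, variables[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    return groups


variables = [
    [1.0, 0.0, 2.0, 0.0],  # texture image 0: moves right
    [1.0, 0.1, 2.0, 0.1],  # texture image 1: almost the same motion
    [0.0, 1.0, 0.0, 2.0],  # texture image 2: moves down
]
merged = merge_similar(variables, threshold=0.9)
```

Each resulting group would correspond to one approximated texture image with one approximated variable.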

Each of the plurality of texture images (211a, 212a, 213a, 214a, Na) is then defined as the sum of a plurality of texture blocks (S16). If step S15 has been performed, each of the plurality of approximated texture images (211a', 213a', Na') may instead be defined as the sum of a plurality of texture blocks. A texture block may be defined as the output of a texture synthesis filter that receives an image excitation signal as its input, and the image excitation signal may be represented by a two-dimensional Gaussian function. The image excitation signal, that is, the Gaussian function, has a magnitude variable G, a mean variable m, and a variance value a as its model variables. The texture synthesis filter has transform-domain filter coefficients, as in Equation 1, as its model variables.
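A two-dimensional Gaussian with the three model variables named above can be sketched as follows. The isotropic, unnormalized form G * exp(-r^2 / (2a)) is an assumption; the text gives only the variable names G, m, and a, and `gaussian_excitation` is a hypothetical helper.

```python
import math


def gaussian_excitation(size, G, m, a):
    """Sample a 2-D Gaussian excitation on a size x size grid.

    G: magnitude, m: (mx, my) mean position, a: variance (assumed isotropic).
    """
    mx, my = m
    return [
        [G * math.exp(-((x - mx) ** 2 + (y - my) ** 2) / (2.0 * a))
         for x in range(size)]
        for y in range(size)
    ]


excitation = gaussian_excitation(size=5, G=2.0, m=(2, 2), a=1.0)
```

Only three numbers per block (G, m, a) need to be stored for the excitation, which is what makes this representation compact.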

The variables of the image excitation signal, namely G, m, and a, and the variables h of the texture synthesis filter are determined so as to minimize the difference between the estimated texture value and the original texture signal value in the transform domain. In the transform domain, the texture estimation signal R is expressed as in Equation 2.

E and H denote the excitation signal vector and the texture synthesis filter coefficient vector in the transform domain, and '⊙' denotes the component-wise product of the vectors. The excitation signal vector E is approximated by a two-dimensional Gaussian function, and, depending on the transform-domain characteristics of the texture, most of the coefficients of the texture synthesis filter H are zero, with nonzero values in only some regions. The image processing method according to the present invention can therefore compress at a very low bit rate using a variable-length encoder or an arithmetic encoder.
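The definitions of E and H suggest that Equation 2 is a component-wise product, R = E·H per coefficient. The sketch below assumes that form and stores H sparsely as a mapping from coefficient index to value, which is one hypothetical way to exploit the mostly-zero filter; it is not taken from the patent.

```python
def texture_estimate(excitation, sparse_filter):
    """Component-wise product R[i] = E[i] * H[i] in the transform domain.

    `sparse_filter` maps index -> nonzero coefficient of H; every other
    coefficient of H is treated as zero (assumed storage format).
    """
    result = [0.0] * len(excitation)
    for index, coefficient in sparse_filter.items():
        result[index] = excitation[index] * coefficient
    return result


E = [1.0, 0.5, 0.25, 0.125]
H = {0: 2.0, 2: 4.0}  # only two of four coefficients are nonzero
R = texture_estimate(E, H)
```

Because the long runs of zeros in H are highly predictable, an entropy coder can represent them with very few bits.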

The image excitation signal of each of the plurality of texture blocks that define the plurality of texture images (211a, 212a, 213a, 214a, Na), the variables of the texture synthesis filter, and the plurality of spatiotemporal position transformation variables (211b, 212b, 213b, 214b, Nb) corresponding to the respective texture images are then compressed (S17). If the texture image approximation step (S15) has been performed, step S17 may instead compress the image excitation signal of each of the plurality of texture blocks that define the plurality of approximated texture images (211a', 213a', Na'), the variables of the texture synthesis filter, and the plurality of approximated spatiotemporal position transformation variables (211b', 213b', Nb') corresponding to the approximated texture images. The compression may be performed by a bitstream compression method.

The encoding method in the multi-texture image-based image processing method according to the present invention is described below using equations.

Feature points may be detected as follows. First, for an input video consisting of k frames, an autocorrelation matrix is calculated from the window signal surrounding each point {x, y} that satisfies the window condition. Here, x and y are the pixel positions in the x-axis and y-axis directions, respectively, and the autocorrelation is defined with a statistical expectation operator.

From the two eigenvalues of the autocorrelation matrix calculated at pixel point {x, y}, a texture point matrix can be obtained as in Equation 3.

Two preset thresholds are used, one for each eigenvalue. In Equation 3, if the two eigenvalues at a given pixel position are larger than their respective thresholds, that pixel is set to 1; if they are smaller than the thresholds, the pixel is set to 0. The texture point matrix is obtained in this way.
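The eigenvalue test can be sketched as follows. The equations are not reproduced in this text, so the sketch makes assumptions: the autocorrelation matrix is built from image gradients summed over a square window (as in common corner detectors), the expectation is replaced by a plain window sum, and `texture_point_matrix`, `t1`, `t2`, and `radius` are hypothetical names.

```python
import math


def texture_point_matrix(img, t1, t2, radius=1):
    """Return a 0/1 matrix: 1 where both eigenvalues exceed (t1, t2)."""
    h, w = len(img), len(img[0])
    # Central-difference gradients; border pixels keep a zero gradient.
    gx = [[0.0] * w for _ in range(h)]
    gy = [[0.0] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx[y][x] = (img[y][x + 1] - img[y][x - 1]) / 2.0
            gy[y][x] = (img[y + 1][x] - img[y - 1][x]) / 2.0
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sxx = sxy = syy = 0.0
            for v in range(max(0, y - radius), min(h, y + radius + 1)):
                for u in range(max(0, x - radius), min(w, x + radius + 1)):
                    sxx += gx[v][u] * gx[v][u]
                    sxy += gx[v][u] * gy[v][u]
                    syy += gy[v][u] * gy[v][u]
            # Eigenvalues of the symmetric matrix [[sxx, sxy], [sxy, syy]].
            half_trace = (sxx + syy) / 2.0
            root = math.sqrt(((sxx - syy) / 2.0) ** 2 + sxy * sxy)
            lam1, lam2 = half_trace + root, half_trace - root
            out[y][x] = 1 if (lam1 > t1 and lam2 > t2) else 0
    return out


# 9x9 test image: a bright square whose corner sits at pixel (x=4, y=4).
img = [[1.0 if (x >= 4 and y >= 4) else 0.0 for x in range(9)] for y in range(9)]
points = texture_point_matrix(img, t1=0.5, t2=0.5)
```

At the corner both eigenvalues are large, along a straight edge only one is, and in a flat region neither is, so only corner-like pixels are marked 1.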

The plurality of spatiotemporal position transformation variables that define each texture image, and the corresponding texture images, may be defined as in Equation 4.

The term appearing in Equation 4 may be defined as in Equation 5.

The input video consisting of k frames may be defined as the sum of N texture images, as in Equation 6.

The i-th segmented texture image in Equation 6 may in turn be approximated as in Equation 7.

The terms of Equation 7 are a transfer function, the i-th segmented texture image of the l-th frame of the input video, the displacement vector in the x-axis and y-axis directions, and the approximate estimation error signal. In Equation 7, the frame number k ranges from l+1 to l+M. By Taylor expansion, Equation 7 can be approximated as in Equation 8.

The two gradient terms in Equation 8 denote the sums of the gradient values in the x-axis direction and in the y-axis direction, respectively. The sum of squares of the estimation error signal can then be written as in Equation 9.

The unknown value is found by assuming that the sum of squares of the estimation error signal is minimized; that is, it is obtained by evaluating Equations 10 and 11.

Assuming an identity transform, Equation 12 is obtained from Equations 9 and 10.

Solving Equation 12 then yields Equation 13.
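The chain from the Taylor approximation to the solved displacement resembles a first-order least-squares motion estimate. Since the equations themselves are not reproduced in this text, the sketch below is a generic estimator of that kind and not necessarily the patent's formulation. Assumptions: a pure translation (identity transfer function), the model curr(x, y) ≈ prev(x - dx, y - dy), and gradients averaged over both frames; `estimate_displacement` is a hypothetical name.

```python
def estimate_displacement(prev, curr):
    """Least-squares (dx, dy) minimizing sum of (gt + gx*dx + gy*dy)^2."""
    h, w = len(prev), len(prev[0])
    sxx = sxy = syy = sxt = syt = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Central-difference gradients, averaged over the two frames.
            gx = (prev[y][x + 1] - prev[y][x - 1]
                  + curr[y][x + 1] - curr[y][x - 1]) / 4.0
            gy = (prev[y + 1][x] - prev[y - 1][x]
                  + curr[y + 1][x] - curr[y - 1][x]) / 4.0
            gt = curr[y][x] - prev[y][x]  # temporal difference
            sxx += gx * gx
            sxy += gx * gy
            syy += gy * gy
            sxt += gx * gt
            syt += gy * gt
    det = sxx * syy - sxy * sxy
    if abs(det) < 1e-12:
        raise ValueError("gradients do not constrain both directions")
    # Normal equations: [[sxx, sxy], [sxy, syy]] (dx, dy)^T = -(sxt, syt)^T
    dx = (-sxt * syy + syt * sxy) / det
    dy = (-syt * sxx + sxt * sxy) / det
    return dx, dy


def f(x, y):
    """Smooth synthetic texture used only for this demonstration."""
    return x * x + 2.0 * y * y + x * y


prev = [[f(x, y) for x in range(8)] for y in range(8)]
# Shift the texture by (dx, dy) = (0.5, -0.25).
curr = [[f(x - 0.5, y + 0.25) for x in range(8)] for y in range(8)]
dx, dy = estimate_displacement(prev, curr)
```

Setting the derivatives of the squared error with respect to dx and dy to zero gives the 2x2 linear system solved above, which mirrors the role the text assigns to Equations 10 through 13.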

Using the value obtained from Equation 13 together with Equation 11, the transformation function of Equation 14 is obtained.

Rearranging Equations 10 and 11, one term is obtained from the transformation function of another, and the next term is obtained from its transformation function in the same way. Through Equations 3 to 14, the signal can therefore be expressed by the seed image and the transformation functions. The texture images may then be approximated by calculating the similarity between the corresponding terms.

The decoding method in the multi-texture image-based image processing method according to the present invention is described next.

FIG. 3 is a flowchart illustrating the decoding method in the multi-texture image-based image processing method according to the present invention. FIG. 4 is a diagram illustrating the same decoding method.

Referring to FIGS. 3 and 4, the decoding method first receives a compressed video signal (S30). The compressed video signal may be a signal in which the image excitation signal of each of the plurality of texture blocks that define the plurality of texture images, the variables of the texture synthesis filter, and the plurality of spatiotemporal position transformation variables corresponding to the respective texture images have been compressed. It may equally be a signal in which the image excitation signal of each of the plurality of texture blocks that define the plurality of approximated texture images, the variables of the texture synthesis filter, and the plurality of approximated spatiotemporal position transformation variables have been compressed. In the compressed video signal, the image excitation signal, the variables of the texture synthesis filter, and the plurality of spatiotemporal position transformation variables may have been compressed by a bitstream compression method.

Next, the compressed image signal is decompressed (S31). That is, the compressed image excitation signal, the variables of the texture synthesis filter, and the plurality of spatiotemporal position transformation variables corresponding to the respective texture images are decompressed.

Next, a plurality of texture blocks are generated using the image excitation signal and the variables of the texture synthesis filter decompressed in step S31, and the plurality of texture blocks are summed to generate a texture image (S32). The texture image generated here may be a texture image that was approximated during encoding.

Among the generated texture images and the plurality of spatiotemporal position transformation variables, each texture image is matched one-to-one with the spatiotemporal position transformation variable that corresponds to it (S33). Likewise, each approximated texture image may be matched with its corresponding approximated spatiotemporal position transformation variable. In FIG. 4, the first approximated texture image (211a') is matched with the first approximated spatiotemporal position transformation variable (211b'), the second approximated texture image (213a') is matched with the second approximated spatiotemporal position transformation variable (213b'), and the N-th approximated texture image (Na') is matched with the N-th approximated spatiotemporal position transformation variable (Nb').

A visual texture is generated using the texture image and the spatiotemporal position transformation variable matched in step S33 (S34). Specifically, the spatiotemporal position transformation variable, which defines, for example, the movement of the feature points over time, is applied to the texture image to generate a visual texture composed of a plurality of frames for that texture image. A visual texture may likewise be generated from a matched approximated texture image and approximated spatiotemporal position transformation variable. In FIG. 4, a first visual texture (211) composed of a plurality of frames is generated using the first approximated texture image (211a') and the first approximated spatiotemporal position transformation variable (211b'). A second visual texture (213) composed of a plurality of frames is generated using the second approximated texture image (213a') and the second approximated spatiotemporal position transformation variable (213b'). An N-th visual texture (N) composed of a plurality of frames is generated using the N-th approximated texture image (Na') and the N-th approximated spatiotemporal position transformation variable (Nb').
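Step S34 can be illustrated with a minimal sketch. It assumes, purely for illustration, that the spatiotemporal position transformation variable is a per-frame integer translation and that a texture image is a sparse map from pixel position to intensity; the description does not fix either representation, and the name `render_visual_texture` is hypothetical.

```python
def render_visual_texture(texture, track, width, height):
    """Apply a per-frame translation to a texture image.

    texture maps (x, y) -> intensity; track[t] is the (dx, dy) offset for
    frame t. Both representations are assumptions made for this sketch.
    """
    frames = []
    for dx, dy in track:
        frame = [[0] * width for _ in range(height)]
        for (x, y), value in texture.items():
            nx, ny = x + dx, y + dy
            # Pixels moved outside the frame are simply dropped.
            if 0 <= nx < width and 0 <= ny < height:
                frame[ny][nx] = value
        frames.append(frame)
    return frames


texture = {(0, 0): 9, (1, 0): 7}
frames = render_visual_texture(texture, [(0, 0), (1, 1)], width=3, height=2)
```

Each entry of `frames` is one frame of the visual texture; a richer transformation model (rotation, scaling, deformation) would replace the translation inside the inner loop.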

The visual textures generated for the respective texture images in step S34 are combined (S35). Combining the visual textures reconstructs the plurality of frames of the shot-unit image as a whole. In FIG. 4, the first visual texture (211), the second visual texture (213), and the N-th visual texture (N) are combined.

Artifacts at the boundaries where the visual textures were joined in step S35 are filtered and corrected (S36). That is, because the visual textures combined in step S35 are reconstructed as a simple sum, artifacts may occur at the boundaries between visual textures. Applying a removal filter to these artifacts produces the final corrected reconstructed image.
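The simple-sum combination of step S35 and the boundary correction of step S36 can be sketched as follows. The description does not specify the artifact-removal filter, so a 3x3 mean filter applied only at seam pixels is used here as a stand-in. The `labels` map recording which visual texture owns each pixel, and the function names, are hypothetical.

```python
def combine(frames):
    """Pixel-wise simple sum of same-sized frames, one per visual texture."""
    h, w = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) for x in range(w)] for y in range(h)]


def smooth_seams(image, labels):
    """Replace each pixel that touches a different label with its 3x3 mean.

    The mean filter is an assumed stand-in for the unspecified
    artifact-removal filter.
    """
    h, w = len(image), len(image[0])
    out = [row[:] for row in image]
    for y in range(h):
        for x in range(w):
            neighbours = [(ny, nx)
                          for ny in range(max(0, y - 1), min(h, y + 2))
                          for nx in range(max(0, x - 1), min(w, x + 2))]
            if any(labels[ny][nx] != labels[y][x] for ny, nx in neighbours):
                total = sum(image[ny][nx] for ny, nx in neighbours)
                out[y][x] = total / len(neighbours)
    return out


left = [[10, 10, 0, 0]]    # frame of a first visual texture
right = [[0, 0, 20, 20]]   # frame of a second visual texture
labels = [[0, 0, 1, 1]]    # hypothetical ownership map
combined = combine([left, right])
smoothed = smooth_seams(combined, labels)
```

Only the two pixels adjacent to the seam are modified; pixels in the interior of each visual texture pass through unchanged.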

The reconstructed image of the shot unit obtained through step S36 may then be combined with the reconstructed images of other shot units to generate the final image (200').

Hereinafter, the configuration and operation of the multi-texture image-based image processing apparatus according to the present invention will be described.

FIG. 5 is a block diagram showing the configuration of the multi-texture image-based image processing apparatus according to the present invention.

Referring to FIG. 5, the multi-texture image-based image processing apparatus (500) according to the present invention may include an encoding unit (510) and a decoding unit (520).

The encoding unit (510) includes a seed image selection unit (511), a feature point detection unit (512), a variable calculation unit (513), a texture image definition unit (514), and a texture block definition unit (516). The encoding unit (510) may further include an approximation unit (515) and a compression unit (517).

The seed image selection unit (511) classifies the input image into shot-unit images and selects one of the plurality of frames of a shot-unit image as the seed image. It defines the frames of the shot-unit image other than the seed image as remaining frame images. That is, when a shot-unit image consists of k frames, the seed image selection unit (511) selects one seed image and defines the other k-1 frames as the remaining frame images (220). Here, a shot-unit image corresponds to the footage captured continuously by a single camera.
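The split of a k-frame shot into one seed image and k-1 remaining frame images can be sketched as below. How the seed frame is chosen is not stated in the description, so the `seed_index` parameter and the function name are assumptions for illustration only.

```python
def split_seed_and_remaining(shot_frames, seed_index=0):
    """Split one shot's frames into a seed image and the remaining frames."""
    if not 0 <= seed_index < len(shot_frames):
        raise IndexError("seed_index must refer to a frame of the shot")
    seed = shot_frames[seed_index]
    remaining = shot_frames[:seed_index] + shot_frames[seed_index + 1:]
    return seed, remaining


shot = ["f0", "f1", "f2", "f3"]  # k = 4 placeholder frames
seed, remaining = split_seed_and_remaining(shot, seed_index=1)
```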

The feature point detection unit (512) detects a plurality of feature points in the seed image selected by the seed image selection unit (511). In doing so, it may detect as a feature point any point whose amount of change across the plurality of frames of the shot-unit image is greater than or equal to a preset value. That is, if a particular point changes by at least the preset value between the seed image and the remaining frame images, the feature point detection unit (512) may detect that point as a feature point.
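A minimal sketch of this threshold test follows. The description does not define how the amount of change is measured; the sketch assumes it is the largest absolute intensity difference between the seed image and any remaining frame at the same pixel position, and the function name is hypothetical.

```python
def detect_feature_points(seed, remaining_frames, threshold):
    """Return (x, y) positions whose change reaches the preset threshold.

    "Change" is assumed to be the maximum absolute intensity difference
    between the seed image and any remaining frame at that position.
    """
    points = []
    for y, row in enumerate(seed):
        for x, value in enumerate(row):
            change = max(
                (abs(frame[y][x] - value) for frame in remaining_frames),
                default=0,
            )
            if change >= threshold:
                points.append((x, y))
    return points


seed_image = [[10, 10, 10],
              [10, 50, 10],
              [10, 10, 10]]
later_frame = [[10, 10, 10],
               [10, 10, 50],
               [10, 10, 10]]
points = detect_feature_points(seed_image, [later_frame], threshold=30)
```

In this example a bright pixel moves one position to the right, so both its old and its new location exceed the threshold.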

The variable calculation unit (513) tracks the plurality of feature points across the plurality of frames of the shot-unit image and calculates a spatiotemporal position transformation variable for each feature point. That is, the variable calculation unit (513) calculates spatiotemporal position transformation variables that define how the feature points change between the seed image and the remaining frame images. A spatiotemporal position transformation variable may take the form of a function expressing, for example, the change in a feature point's position over time.
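As one possible reading, the position change over time can be stored as a per-frame displacement relative to the seed image. The sketch below assumes the tracked positions are already available (the tracking method itself is not specified in the description) and uses the hypothetical name `displacement_track`.

```python
def displacement_track(positions):
    """Convert tracked positions into displacements relative to the seed.

    positions[t] is the (x, y) of one feature point in frame t, with
    frame 0 taken as the seed image (an assumption of this sketch).
    """
    x0, y0 = positions[0]
    return [(x - x0, y - y0) for x, y in positions]


track = displacement_track([(4, 2), (5, 2), (6, 3)])
```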

The texture image definition unit (514) defines a plurality of texture images using feature points whose spatiotemporal position transformation variables, as calculated by the variable calculation unit (513), correspond to each other. In doing so, the texture image definition unit (514) may link feature points having identical spatiotemporal position transformation variables to define a single texture image.
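Linking feature points that share the same transformation variable amounts to grouping by that variable. The sketch below assumes each variable is a displacement track (a list of per-frame offsets), which is an illustrative representation rather than one given in the description.

```python
from collections import defaultdict


def group_by_transform(tracks):
    """Group feature points whose displacement tracks are identical.

    tracks maps a feature point (x, y) to its list of (dx, dy) offsets.
    Each resulting group stands for one texture image.
    """
    groups = defaultdict(list)
    for point, track in tracks.items():
        groups[tuple(track)].append(point)
    return dict(groups)


tracks = {
    (1, 1): [(0, 0), (1, 0)],
    (2, 1): [(0, 0), (1, 0)],
    (7, 5): [(0, 0), (0, 2)],
}
groups = group_by_transform(tracks)
```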

The approximation unit (515) approximates the plurality of texture images by merging texture images that have similar spatiotemporal position transformation variables into a single texture image. That is, the approximation unit (515) may generate a plurality of approximated texture images and a plurality of approximated spatiotemporal position transformation variables from the plurality of texture images and the plurality of spatiotemporal position transformation variables. The approximation unit (515) may compute the similarity between spatiotemporal position transformation variables by obtaining the correlation characteristics of the texture image signals. It may then merge into a single texture image those texture images whose similarity falls within a preset threshold.
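The merging step can be sketched as a greedy pass over the texture images. Note the substitution: the description derives similarity from correlation characteristics of the texture image signals, whereas this sketch uses the mean Euclidean distance between displacement tracks as a simpler stand-in. The data layout and function names are likewise assumptions.

```python
import math


def track_distance(a, b):
    """Mean Euclidean distance between two equally long displacement tracks."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)


def approximate(textures, threshold):
    """Merge texture images whose tracks lie within the threshold.

    textures is a list of (points, track) pairs. The first track of each
    merged group is kept as the approximated transformation variable.
    """
    merged = []
    for points, track in textures:
        for entry in merged:
            if track_distance(entry["track"], track) <= threshold:
                entry["points"].extend(points)
                break
        else:
            merged.append({"points": list(points), "track": list(track)})
    return merged


textures = [
    ([(1, 1)], [(0, 0), (1.0, 0.0)]),
    ([(2, 1)], [(0, 0), (1.2, 0.0)]),
    ([(7, 5)], [(0, 0), (0.0, 2.0)]),
]
approximated = approximate(textures, threshold=0.5)
```

The first two texture images move almost identically and collapse into one approximated texture image, while the third keeps its own transformation variable.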

The texture block definition unit (516) defines each of the plurality of texture images as the sum of a plurality of texture blocks. Here, a texture block may be defined as the output of a texture synthesis filter that takes an image excitation signal as its input, and the image excitation signal may be represented by a two-dimensional Gaussian function. The texture block definition unit (516) may likewise define each of the plurality of approximated texture images as the sum of a plurality of texture blocks.
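The relation between excitation signal, synthesis filter, texture block, and texture image can be sketched as follows. The description does not give the form of the texture synthesis filter, so a small 2D FIR kernel applied by zero-padded convolution is assumed here. The grid size, kernel values, and function names are illustrative only.

```python
import math


def gaussian_excitation(size, cx, cy, sigma, amplitude=1.0):
    """Sample a 2D Gaussian centred at (cx, cy) on a size x size grid."""
    return [[amplitude * math.exp(-((x - cx) ** 2 + (y - cy) ** 2)
                                  / (2 * sigma ** 2))
             for x in range(size)] for y in range(size)]


def synthesize_block(excitation, kernel):
    """Convolve the excitation with a 2D FIR kernel (assumed filter form)."""
    h, w = len(excitation), len(excitation[0])
    kh, kw = len(kernel), len(kernel[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for j in range(kh):
                for i in range(kw):
                    sy, sx = y - j + kh // 2, x - i + kw // 2
                    if 0 <= sy < h and 0 <= sx < w:  # zero padding outside
                        acc += kernel[j][i] * excitation[sy][sx]
            out[y][x] = acc
    return out


def sum_blocks(blocks):
    """Texture image as the pixel-wise sum of its texture blocks."""
    h, w = len(blocks[0]), len(blocks[0][0])
    return [[sum(b[y][x] for b in blocks) for x in range(w)] for y in range(h)]


kernel = [[0.0, 0.125, 0.0],
          [0.125, 0.5, 0.125],
          [0.0, 0.125, 0.0]]
e1 = gaussian_excitation(5, 1, 1, sigma=1.0)
e2 = gaussian_excitation(5, 3, 3, sigma=1.0)
texture_image = sum_blocks([synthesize_block(e1, kernel),
                            synthesize_block(e2, kernel)])
```

Under this reading, only the Gaussian parameters and the kernel coefficients need to be stored per block, which is consistent with the description's choice to compress the excitation signal and filter variables rather than pixel data.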

The compression unit (517) compresses the image excitation signal, the variables of the texture synthesis filter, and the plurality of spatiotemporal position transformation variables corresponding to the respective texture images. The compression unit (517) may likewise compress the image excitation signals of the plurality of approximated texture images, the variables of the texture synthesis filter, and the plurality of approximated spatiotemporal position transformation variables.

The decoding unit (520) includes a decompression unit (521), a texture image generation unit (522), a matching unit (523), a visual texture generation unit (524), and a visual texture combining unit (525). The decoding unit (520) may further include a correction unit (526).

The decompression unit (521) receives the image signal compressed by the encoding unit (510) and decompresses it. Specifically, the decompression unit (521) decompresses the compressed image excitation signal and texture synthesis filter variables that define each of the plurality of texture images, together with the plurality of spatiotemporal position transformation variables corresponding to the respective texture images.

The texture image generation unit (522) generates a plurality of texture blocks using the image excitation signal and the variables of the texture synthesis filter, and sums the plurality of texture blocks to generate a texture image.

The matching unit (523) matches, one-to-one, each texture image generated by the texture image generation unit (522) with the spatiotemporal position transformation variable, among the plurality of spatiotemporal position transformation variables, that corresponds to that texture image. The matching unit (523) may likewise match each approximated texture image with its corresponding approximated spatiotemporal position transformation variable.

The visual texture generation unit (524) generates a visual texture using the matched texture image and spatiotemporal position transformation variable. Specifically, the visual texture generation unit (524) applies the spatiotemporal position transformation variable, which defines, for example, the movement of the feature points over time, to the texture image to generate a visual texture composed of a plurality of frames for that texture image. The visual texture generation unit (524) may likewise generate a visual texture from a matched approximated texture image and approximated spatiotemporal position transformation variable.

The visual texture combining unit (525) combines the visual textures that the visual texture generation unit (524) generated for the respective texture images. Combining the visual textures reconstructs the plurality of frames of the shot-unit image as a whole.

The correction unit (526) filters and corrects artifacts at the boundaries where the visual textures were joined. That is, because the visual textures combined by the visual texture combining unit (525) are reconstructed as a simple sum, artifacts may occur at the boundaries between visual textures. The correction unit (526) applies a removal filter to these artifacts to produce the final corrected reconstructed image.

As described above, the multi-texture image-based image processing method and apparatus according to the present invention are not limited to the configurations and methods of the embodiments described above. Rather, all or part of the embodiments may be selectively combined so that various modifications can be made.

500; Multi-texture image-based image processing apparatus
510; Encoding unit
511; Seed image selection unit 512; Feature point detection unit
513; Variable calculation unit 514; Texture image definition unit
515; Approximation unit 516; Texture block definition unit
517; Compression unit
520; Decoding unit
521; Decompression unit 522; Texture image generation unit
523; Matching unit 524; Visual texture generation unit
525; Visual texture combining unit 526; Correction unit

Claims

A multi-texture image-based image processing method comprising:
classifying an input image into shot-unit images and selecting one of a plurality of frames of a shot-unit image as a seed image;
detecting a plurality of feature points in the seed image;
calculating a spatiotemporal position transformation variable for each of the feature points by tracking the plurality of feature points across the plurality of frames of the shot-unit image;
defining a plurality of texture images using feature points whose spatiotemporal position transformation variables correspond to each other; and
defining each of the plurality of texture images as a sum of a plurality of texture blocks, each texture block being an output of a texture synthesis filter that takes an image excitation signal as an input.

The method according to claim 1,
wherein the image excitation signal is represented by a two-dimensional Gaussian function.

The method according to claim 1,
further comprising compressing the image excitation signal of each of the plurality of texture blocks defining the plurality of texture images, the variables of the texture synthesis filter, and the spatiotemporal position transformation variable corresponding to each of the plurality of texture images.

The method according to claim 3,
wherein the compressing comprises
compressing the image excitation signal, the variables of the texture synthesis filter, and the spatiotemporal position transformation variables using a bitstream compression method.

The method according to claim 1,
further comprising approximating the plurality of texture images by merging into a single texture image those texture images whose spatiotemporal position transformation variables have a similarity within a preset threshold, the similarity being calculated by obtaining correlation characteristics of the texture image signals.

The method according to claim 1,
wherein detecting the plurality of feature points comprises
detecting, as a feature point, a point whose amount of change across the plurality of frames is greater than or equal to a preset value.

The method according to claim 3, further comprising:
decompressing the compressed image excitation signal, the variables of the texture synthesis filter, and the spatiotemporal position transformation variables corresponding to the respective texture images;
generating the plurality of texture blocks using the image excitation signal and the variables of the texture synthesis filter, and generating the texture image by summing the plurality of texture blocks;
matching the texture image with the spatiotemporal position transformation variable corresponding to the texture image;
generating a visual texture using the texture image and the spatiotemporal position transformation variable; and
combining the visual textures generated for the respective texture images.

The method according to claim 7,
further comprising filtering and correcting artifacts at the combined boundaries of the visual textures.

A multi-texture image-based image processing apparatus comprising:
a seed image selection unit configured to classify an input image into shot-unit images and select one of a plurality of frames of a shot-unit image as a seed image;
a feature point detection unit configured to detect a plurality of feature points in the seed image;
a variable calculation unit configured to track the plurality of feature points across the plurality of frames of the shot-unit image and calculate a spatiotemporal position transformation variable for each of the feature points;
a texture image definition unit configured to define a plurality of texture images using feature points whose spatiotemporal position transformation variables correspond to each other; and
a texture block definition unit configured to define each of the plurality of texture images as a sum of a plurality of texture blocks, each texture block being an output of a texture synthesis filter that takes an image excitation signal as an input.

The apparatus according to claim 9,
wherein the image excitation signal is represented by a two-dimensional Gaussian function.

The apparatus according to claim 9,
further comprising a compression unit configured to compress the image excitation signal of each of the plurality of texture blocks defining the plurality of texture images, the variables of the texture synthesis filter, and the spatiotemporal position transformation variable corresponding to each of the plurality of texture images.

The apparatus according to claim 11,
wherein the compression unit
compresses the image excitation signal, the variables of the texture synthesis filter, and the spatiotemporal position transformation variables using a bitstream compression method.

The apparatus according to claim 9,
further comprising an approximation unit configured to approximate the plurality of texture images by merging into a single texture image those texture images whose spatiotemporal position transformation variables have a similarity within a preset threshold, the similarity being calculated by obtaining correlation characteristics of the texture image signals.

The apparatus according to claim 9,
wherein the feature point detection unit
detects, as a feature point, a point whose amount of change across the plurality of frames is greater than or equal to a preset value.

The apparatus according to claim 11, further comprising:
a decompression unit configured to decompress the compressed image excitation signal, the variables of the texture synthesis filter, and the spatiotemporal position transformation variables corresponding to the respective texture images;
a texture image generation unit configured to generate the plurality of texture blocks using the image excitation signal and the variables of the texture synthesis filter, and to generate the texture image by summing the plurality of texture blocks;
a matching unit configured to match the texture image with the spatiotemporal position transformation variable corresponding to the texture image;
a visual texture generation unit configured to generate a visual texture using the texture image and the spatiotemporal position transformation variable; and
a visual texture combining unit configured to combine the visual textures generated for the respective texture images.

The apparatus according to claim 15,
further comprising a correction unit configured to filter and correct artifacts at the combined boundaries of the visual textures.