KR100953738B1

KR100953738B1 - Apparatus for segmenting multi-view images

Info

Publication number: KR100953738B1
Application number: KR1020080050403A
Authority: KR
Inventors: 윤일동; 이수찬
Original assignee: 한국외국어대학교 연구산학협력단; 윤일동
Priority date: 2008-05-29
Filing date: 2008-05-29
Publication date: 2010-04-19
Also published as: KR20090124285A

Abstract

The present invention relates to image processing technology, and in particular to an image segmentation apparatus that segments a multi-view image set.

An image segmentation apparatus according to an embodiment of the present invention includes an approximation unit that approximates the foreground and background of a multi-view image set using the segmentation information for a single image, and a segmentation unit that segments the multi-view image set using the approximated foreground and background information.

Multi-view image set, segmentation

Description

Image segmentation device {APPARATUS FOR SEGMENTING MULTI-VIEW IMAGES}

The present invention relates to image processing technology and, more particularly, to an image segmentation apparatus that segments a set of images of a single object photographed from various viewpoints.

Image segmentation is a technique that partitions an image by grouping together parts with similar characteristics. It is used as a preprocessing step for recognition and 3D structure reconstruction in computer vision and graphics, and as an intermediate step in image compositing (including background replacement) and virtual reality.

Recently, demand has grown for editing, in a single batch, a set of images of one object photographed from various viewpoints (hereinafter, a multi-view image set). Demand has risen because the performance of 3D structure reconstruction through multi-view stereo, which takes a multi-view image set as input data, has improved.

Segmentation techniques for multi-view image sets include a technique that segments using viewpoint information for the entire multi-view image set supplied by the user, a technique that segments using boundary propagation information obtained through a distance transform, and a technique that segments by combining segmentation with 3D structure reconstruction of the foreground object.

However, the technique that takes viewpoint information for the entire multi-view image set from the user is inconvenient because the user must input all of that information. The technique that uses boundary propagation information obtained through a distance transform treats the image set in the same way as a video sequence, so it is difficult to apply to images whose viewpoints vary widely. The technique that combines segmentation with 3D structure reconstruction of the foreground object requires a 3D reconstruction step, which complicates the system.

Accordingly, the present invention has been devised to solve the above problems of the prior art, and its object is to provide an image segmentation apparatus that can segment a multi-view image set simply.

An image segmentation apparatus according to one aspect of the invention for achieving the above object includes: an approximation unit that approximates the foreground and background of a multi-view image set using the segmentation information of one image of an object; and a segmentation unit that segments the multi-view image set using the approximated foreground and background information.

The segmentation information of the one image may include at least one of: the color distribution histograms of the foreground and background of one image in the multi-view image set, the foreground region, and the outline of the foreground region.

The approximation unit may approximate the foreground and background across the multi-view image set using feature points of the images in the set.

The approximation unit may extract feature points from the background region of the current image and from the adjacent image that is to receive the approximation information, match them using nearest neighbor matching and RANSAC (RANdom SAmple Consensus), and then use the matched feature points to approximate the background change as a transformation matrix.

The approximation unit may approximate the foreground change across the multi-view image set through registration using image feature points.

The approximation unit may register an image to a target image by deforming it with a free-form deformation model on a regular mesh based on cubic B-splines.

The foreground and background approximation information may include at least one of: the regions estimated as foreground and background in the adjacent image, the pixel sets corresponding to those regions, and the pixel probability distributions of the foreground and background of the previous image.

The segmentation unit may re-approximate the predicted probability distribution of the foreground pixels and the predicted probability distribution of the background pixels of the adjacent image using the pixels estimated as foreground in the adjacent image.

The segmentation unit may apply damping to compute a weighted average of the re-approximated predicted probability distributions of the foreground and background pixels of the adjacent image and the probability distributions of the foreground and background pixels of the previous image. From the pixels used for re-approximation it may exclude pixels where the estimated foreground and background overlap, and pixels whose more likely label under the pixel probability distributions of the previous image differs from the label of the region they currently fall in.

The segmentation unit may segment an image by assigning to each pixel a label indicating whether that pixel belongs to the foreground or the background.

As described above, the image segmentation apparatus according to an embodiment of the present invention sequentially approximates the foreground and background over a multi-view image set from minimal user input and segments the images using the approximated information, so that segmentation of the multi-view image set can be performed simply.

Hereinafter, preferred embodiments of the present invention are described in detail with reference to the accompanying drawings. In describing the invention, detailed descriptions of related well-known functions or configurations are omitted where they would obscure the gist of the invention. The terms used below are defined in consideration of their functions in the invention and may vary according to the intention or practice of users and operators; their definitions should therefore be based on the content of this specification as a whole.

FIG. 1 is a diagram showing the configuration of an image segmentation apparatus according to an embodiment of the present invention.

As shown in FIG. 1, the image segmentation apparatus according to an embodiment of the present invention includes an approximation unit 10 and a segmentation unit 20.

The approximation unit 10 approximates the foreground and background of a multi-view image set using the segmentation information for one image input by the user. The segmentation unit 20 then segments the multi-view image set using the foreground and background information approximated by the approximation unit 10, and outputs the result.

This is described in detail below.

The approximation unit 10 uses the segmentation information for one image input by the user to approximate the background and foreground of the multi-view image set sequentially, and transmits the result to the segmentation unit 20. As an example, FIG. 2 shows a foreground and background designated by a user. FIG. 2(a) shows a case where the user has marked parts of the foreground 25 and the background 26, and FIG. 2(b) shows a case where the user has roughly outlined the region of the foreground 27.

The segmentation information input by the user includes at least one of: the color distribution histograms of the foreground and background of one image in the multi-view image set, the foreground region, and the outline of the foreground region. This segmentation information may be that of the first image the user captured in the multi-view image set.

That is, the approximation unit 10 uses the segmentation information of the first captured image to approximate the background and foreground of the second captured image and transmits the result to the segmentation unit 20, then uses the information approximated for the second image to approximate the third image. This approximation is carried out sequentially over the whole multi-view image set. For this to work, the images of the multi-view image set must be arranged in the order in which they were captured as the viewpoint moves.
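The sequential propagation described above can be sketched as a simple loop over adjacent image pairs. This is only an illustrative outline: the function name `propagate_segmentation` and the two callables `approximate` and `segment` (standing in for the roles of the approximation unit 10 and the segmentation unit 20) are hypothetical, and the sketch assumes that each step passes the previous image's result forward.

```python
from typing import Callable, List, Sequence, TypeVar

Image = TypeVar("Image")
Seg = TypeVar("Seg")


def propagate_segmentation(
    images: Sequence[Image],
    first_seg: Seg,
    approximate: Callable[[Image, Seg, Image], object],
    segment: Callable[[Image, object], Seg],
) -> List[Seg]:
    """Segment an ordered multi-view set one adjacent pair at a time.

    images      -- views arranged in capture order
    first_seg   -- user-supplied segmentation of images[0]
    approximate -- hypothetical stand-in for the approximation unit
    segment     -- hypothetical stand-in for the segmentation unit
    """
    segs = [first_seg]
    for prev_img, next_img in zip(images, images[1:]):
        # Predict foreground/background for the next view from the previous one.
        approx = approximate(prev_img, segs[-1], next_img)
        # Refine the prediction into a segmentation of the next view.
        segs.append(segment(next_img, approx))
    return segs
```

Because every step depends only on the immediately preceding view, the ordering requirement stated above matters: shuffling `images` would pair views whose foreground and background differ too much for the approximation to carry over.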

The approximation unit 10 approximates the foreground and background across the multi-view image set using feature points of the images in the set. The foreground and background are approximated separately in order to reduce the likelihood of outliers, that is, incorrect matches. Here, a feature point is a point in an image that has distinctive properties compared with its surrounding pixels. Depending on the embodiment, a feature point may be a point at which the degree of color change reaches a local maximum relative to its surroundings. Feature points can be extracted from the image shown in FIG. 3(a) and displayed as rectangles, as in FIG. 3(b).

The approximation unit 10 approximates the background change of the multi-view image set with a transformation matrix (homography), based on affine change. That is, the approximation unit 10 extracts feature points from the background region of the current image and from the adjacent image that is to receive the approximation information, matches them using nearest neighbor matching and RANSAC (RANdom SAmple Consensus), and then uses the matched feature points to approximate the background change as a transformation matrix, which it transmits to the segmentation unit 20. The background is assumed to have no independent motion.
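A minimal sketch of this matching step, assuming feature points and descriptors have already been extracted: descriptors are paired by nearest neighbor, and RANSAC then fits an affine map to the matched point coordinates while rejecting outliers. The function names, the descriptor format (plain tuples of numbers), the three-point minimal sample, and the parameters `iters`, `tol`, and `seed` are all assumptions made for illustration; the patent does not specify them.

```python
import random


def nearest_neighbor_matches(desc_a, desc_b):
    """Pair each descriptor in desc_a with the index of its closest one in desc_b."""
    def sqdist(u, v):
        return sum((ui - vi) ** 2 for ui, vi in zip(u, v))
    return [(i, min(range(len(desc_b)), key=lambda j: sqdist(d, desc_b[j])))
            for i, d in enumerate(desc_a)]


def affine_from_3(src, dst):
    """Exact affine map through three point pairs (Cramer's rule); None if collinear."""
    (x1, y1), (x2, y2), (x3, y3) = src
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if abs(det) < 1e-9:
        return None

    def solve(v1, v2, v3):
        a = (v1 * (y2 - y3) - y1 * (v2 - v3) + (v2 * y3 - v3 * y2)) / det
        b = (x1 * (v2 - v3) - v1 * (x2 - x3) + (x2 * v3 - x3 * v2)) / det
        c = (x1 * (y2 * v3 - y3 * v2) - y1 * (x2 * v3 - x3 * v2)
             + v1 * (x2 * y3 - x3 * y2)) / det
        return a, b, c

    a, b, c = solve(dst[0][0], dst[1][0], dst[2][0])
    d, e, f = solve(dst[0][1], dst[1][1], dst[2][1])
    return (a, b, c, d, e, f)  # x' = a*x + b*y + c, y' = d*x + e*y + f


def apply_affine(model, p):
    a, b, c, d, e, f = model
    return (a * p[0] + b * p[1] + c, d * p[0] + e * p[1] + f)


def ransac_affine(src, dst, iters=200, tol=1.0, seed=0):
    """Return (model, inlier indices) for matched points src[i] -> dst[i]."""
    rng = random.Random(seed)
    n = len(src)
    best_model, best_inliers = None, []
    for _ in range(iters):
        sample = rng.sample(range(n), 3)
        model = affine_from_3([src[i] for i in sample], [dst[i] for i in sample])
        if model is None:
            continue
        inliers = []
        for i in range(n):
            qx, qy = apply_affine(model, src[i])
            if (qx - dst[i][0]) ** 2 + (qy - dst[i][1]) ** 2 <= tol ** 2:
                inliers.append(i)
        if len(inliers) > len(best_inliers):
            best_model, best_inliers = model, inliers
    return best_model, best_inliers
```

The returned model could then be applied to the previous image's background region to predict where the background lies in the adjacent image. A production implementation would normally refit the model on all inliers by least squares; that step is omitted here for brevity.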

Meanwhile, the approximation unit 10 approximates the change in the foreground using a registration method based on image feature points, and transmits the result to the segmentation unit 20. Registration here means warping one image onto a reference image so that the two become as similar as possible. As an example, FIG. 4(a) shows the reference image, FIG. 4(b) shows the image to be deformed, and FIG. 4(c) shows the registered image. As shown in FIG. 4, the image of FIG. 4(b) is warped so that it overlaps the reference image of FIG. 4(a), yielding the image of FIG. 4(c).

The registration method using image feature points is described below.

The approximation unit 10 deforms an image using a free-form deformation model on a regular mesh based on cubic B-splines and registers it to the target image.

That is, for an image of height […] and width […], the region […] is divided into N intervals along each axis, and the energy for a change of the regular mesh […], whose points are evenly spaced at intervals of […] and […], is calculated using Equation 1 below.

Here, S is a vector representing the state of change of the mesh […], […] is the set of corresponding points, […] is the energy representing the deformation of the points of the mesh […], and […] is the energy arising from the correspondences, where […].

Here, […] denotes the indices of the points of […], and […] denotes the set of index triples of three consecutive points.

Furthermore, […] with […], where […] is a robust estimator and […] is the confidence radius supplied as input to the robust estimator. […] is a feature point extracted from the segmented foreground, […] is the corresponding point of […] extracted from the adjacent image, and […] is the point obtained by transforming […] using […] and the state of change S; it is computed by interpolation using B-splines. That is, the B-spline interpolation is calculated using Equation 2 below.

Here, […] is the floor function, […], […], and […] and […] are the […]-th and […]-th basis functions of the cubic B-spline, respectively.
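As an illustration of the kind of B-spline interpolation described above, the following Python sketch evaluates a free-form deformation on a regular mesh. Equation 2 itself is not reproduced in this text, so the indexing convention used here (a control grid padded by one ring of points, and the standard uniform cubic B-spline basis polynomials) is an assumption drawn from common free-form deformation formulations. The names `bspline_basis`, `make_rest_mesh`, and `ffd_transform` are hypothetical.

```python
import math


def bspline_basis(t):
    """Values of the four uniform cubic B-spline basis functions at t in [0, 1)."""
    return (
        (1 - t) ** 3 / 6.0,
        (3 * t ** 3 - 6 * t ** 2 + 4) / 6.0,
        (-3 * t ** 3 + 3 * t ** 2 + 3 * t + 1) / 6.0,
        t ** 3 / 6.0,
    )


def make_rest_mesh(width, height, n):
    """Undeformed control points with spacing width/n and height/n.

    The grid is padded by one ring, so stored index k holds logical index k - 1.
    Returns (mesh, dx, dy) where mesh[j][i] is an (x, y) control point.
    """
    dx, dy = width / n, height / n
    mesh = [[((i - 1) * dx, (j - 1) * dy) for i in range(n + 3)]
            for j in range(n + 3)]
    return mesh, dx, dy


def ffd_transform(x, y, mesh, dx, dy):
    """Map (x, y) through the mesh; valid for 0 <= x < width and 0 <= y < height."""
    i, j = math.floor(x / dx), math.floor(y / dy)  # cell containing the point
    u, v = x / dx - i, y / dy - j                  # position inside the cell
    bu, bv = bspline_basis(u), bspline_basis(v)
    tx = ty = 0.0
    for m in range(4):
        for l in range(4):
            px, py = mesh[j + m][i + l]  # 4 x 4 neighbourhood of control points
            w = bu[l] * bv[m]
            tx += w * px
            ty += w * py
    return tx, ty
```

With the undeformed mesh, `ffd_transform` returns its input unchanged, because uniform cubic B-splines reproduce linear functions. Moving control points away from their rest positions bends the image smoothly, and those control point coordinates are what the state S collects.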

Based on the above equations, the state S that minimizes the energy can be obtained. A nonlinear conjugate gradient method is used to find this energy-minimizing state S. The state S denotes the coordinates of the points of the regular mesh that control the transformation of the image.

When the segmentation unit 20 receives the foreground and background approximation information for an adjacent image from the approximation unit 10, it segments the adjacent image based on that information. The approximation information received from the approximation unit 10 includes the regions estimated as foreground and background in the adjacent image, the corresponding pixel sets, and the pixel probability distributions h_seg(O) and h_seg(B) of the foreground and background of the previous image.

The segmentation unit 20 re-approximates the predicted probability distribution h_next(O) of the foreground pixels of the adjacent image using the pixels estimated as foreground in the adjacent image, and likewise re-approximates the predicted probability distribution h_next(B) of the background using the pixels estimated as background. Here, h denotes the color distribution histogram of each region.

Meanwhile, the predicted foreground and background regions may not be accurate. However, the color distributions of the foreground and background pixels in the previous image and in the adjacent image can be expected to be similar. Accordingly, the segmentation unit 20 applies damping and takes a weighted average of the re-approximated probability distribution and the probability distribution of the previous image. The pixels used for re-approximation exclude pixels where the estimated foreground and background overlap, and pixels for which the label with the larger of the likelihoods […] and […] under the pixel probability distributions of the previous image differs from the label of the region they currently fall in. […] and […] are defined in Equation 8 below.

The weights given to the color probability distributions of the foreground and background pixels of the adjacent image predicted through re-approximation, and to the pixel probability distributions of the foreground and background of the previous image, are as follows. Let the number of pixels in the segmented background region be […] and the number of pixels used for re-approximation be […]; then the weights of the re-approximated foreground and of the segmented foreground of the previous image can be obtained using Equations 3 and 4 below.

That is, the weights of the re-approximated foreground and of the segmented foreground of the previous image are obtained as […] and […], respectively. The weights for the background can likewise be calculated using Equations 3 and 4.
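The damping step can be illustrated with a short sketch that builds color histograms from pixel sets and blends the histogram of the previous image with the re-approximated one. Equations 3 and 4 are not reproduced in this text, so the weights are simply passed in as parameters here rather than derived from pixel counts. The function names, the quantization into `bins` levels per channel, and the dictionary representation are assumptions made for illustration.

```python
from collections import Counter


def color_histogram(pixels, bins=8, max_value=256):
    """Normalized histogram over quantized RGB triples (pixels must be non-empty)."""
    step = max_value // bins
    counts = Counter(tuple(c // step for c in p) for p in pixels)
    total = sum(counts.values())
    return {key: count / total for key, count in counts.items()}


def damped_blend(h_seg, h_next, w_seg, w_next):
    """Weighted average of the previous histogram h_seg and the re-approximated h_next.

    w_seg and w_next are supplied by the caller; how the patent derives them
    from pixel counts is not shown in this text.
    """
    total = w_seg + w_next
    keys = set(h_seg) | set(h_next)
    return {k: (w_seg * h_seg.get(k, 0.0) + w_next * h_next.get(k, 0.0)) / total
            for k in keys}
```

For example, blending with `w_seg=3` and `w_next=1` keeps three quarters of the previous image's distribution, which limits the damage when the predicted region in the adjacent image contains mislabeled pixels.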

The segmentation unit 20 determines whether each pixel belongs to the foreground or the background, as follows. A pairwise graph is constructed in which each pixel of the image is a node and the relationship between adjacent pixels is an edge. Let P be the set of all nodes; the image is segmented by assigning to each node a label […] indicating which region the pixel belongs to. Labels are assigned so as to minimize the energy associated with the labeling. This energy can be obtained using Equation 5 below.

Here, […] and […] are the region term and the boundary term of the energy, respectively. The region term aggregates, over all pixels, the likelihood that each pixel belongs to the foreground or the background. The boundary term represents the energy of a boundary being located at a particular position. The region term and the boundary term can be obtained using Equations 6 and 7, respectively. In addition, […] is an empirically chosen weight between […] and […].

Here, N denotes the set of all pairs of adjacent pixels, and […].
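To make the structure of this labeling energy concrete, the sketch below evaluates it for a given labeling. Equations 5 to 7 are not reproduced in this text, so the form used here is an assumption modeled on common graph-cut formulations: the region term sums a per-pixel cost for the chosen label, the boundary term sums a pairwise weight over adjacent pixels with different labels, and the weight `lam` is applied to the boundary term. The function name and data layout are hypothetical.

```python
def labeling_energy(labels, region_cost, pair_weight, lam=1.0):
    """Energy of a labeling under an assumed region-plus-boundary form.

    labels      -- dict: pixel -> 0 (background) or 1 (foreground)
    region_cost -- dict: pixel -> (cost if background, cost if foreground)
    pair_weight -- dict: (p, q) -> weight for each adjacent pixel pair
    lam         -- relative weight; its placement on the boundary term is assumed
    """
    region = sum(region_cost[p][labels[p]] for p in labels)
    boundary = sum(w for (p, q), w in pair_weight.items()
                   if labels[p] != labels[q])
    return region + lam * boundary
```

A labeling that follows the per-pixel costs and places its only label change on a weak edge gets a low energy, while a labeling that ignores the costs or cuts a strong edge is penalized.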

[…] is the energy of assigning label […] to pixel […]. It is computed from the approximated foreground and background pixel probability distributions according to the color or brightness value […] of pixel […], and is obtained using Equation 8 below.
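One common way to turn a pixel probability distribution into a per-pixel label cost is the negative log-likelihood, sketched below. Equation 8 is not reproduced in this text, so this particular form is an assumption rather than the patent's definition. The function name `region_cost`, the dictionary histograms, and the smoothing constant `eps` are hypothetical.

```python
import math


def region_cost(value, hist_fg, hist_bg, eps=1e-6):
    """Return (cost of labeling background, cost of labeling foreground).

    value   -- quantized color or brightness of the pixel
    hist_fg -- dict: value -> probability under the foreground distribution
    hist_bg -- dict: value -> probability under the background distribution
    eps     -- small constant that avoids log(0) for unseen values
    """
    p_fg = hist_fg.get(value, 0.0) + eps
    p_bg = hist_bg.get(value, 0.0) + eps
    return -math.log(p_bg), -math.log(p_fg)
```

Under this form, a pixel whose color is common in the foreground histogram and absent from the background histogram is cheap to label foreground and expensive to label background.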

The boundary term consists of […], determined by the gradient value (the difference in color or brightness between adjacent pixels), and […], determined by the distance transform of the propagated expected boundary. […] can be obtained using Equation 9 below.

Here, […], […] denotes the distance between […] and […], […] denotes the […] norm of a vector, and […] denotes the average over the entire image. Under the assumption that boundaries between regions lie where the difference is large, the larger the difference between two pixel values, the smaller the value becomes.
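The behavior described above (a value that shrinks as the pixel difference grows, normalized by an image-wide average and divided by the distance between the two pixels) matches a widely used contrast-sensitive edge weight. Equation 9 is not reproduced in this text, so the exponential form below is an assumption borrowed from common graph-cut segmentation practice. The names `neighbor_pairs` and `gradient_weights` and the 8-connected neighborhood are hypothetical choices.

```python
import math


def neighbor_pairs(height, width):
    """Yield each 8-connected pair of pixel coordinates exactly once."""
    for y in range(height):
        for x in range(width):
            for dy, dx in ((0, 1), (1, 0), (1, 1), (1, -1)):
                yy, xx = y + dy, x + dx
                if 0 <= yy < height and 0 <= xx < width:
                    yield (y, x), (yy, xx)


def gradient_weights(img):
    """Edge weight per adjacent pair for a grayscale image given as nested lists.

    Assumed form: exp(-diff^2 / (2 * mean diff^2)) / distance(p, q).
    """
    pairs = list(neighbor_pairs(len(img), len(img[0])))
    sq = [(img[p[0]][p[1]] - img[q[0]][q[1]]) ** 2 for p, q in pairs]
    mean_sq = sum(sq) / len(sq) or 1.0  # fall back to 1.0 for a flat image
    return {(p, q): math.exp(-d / (2 * mean_sq)) / math.dist(p, q)
            for (p, q), d in zip(pairs, sq)}
```

Pairs that straddle a strong intensity edge receive small weights, so a segmentation boundary placed there adds little to the boundary term.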

Meanwhile, […] is obtained using Equation 10 below.

Here, […] denotes the distance from the nearest expected boundary point to […]; the farther a boundary lies from the expected boundary, the greater the energy.

Thereafter, the segmentation unit 20 computes the energy values and derives the label of each node using the GrabCut technique. GrabCut extends to color images a grayscale image segmentation technique based on graph cuts, a theory that models various phenomena using a graph composed of nodes and the edges connecting them.

As an example of the graph cut technique, when modeling energy or flows, each edge is assigned a value representing the maximum amount of flow (flow capacity) between its nodes, and two special nodes are defined: a source, where flow originates, and a sink, where flow drains. In this flow graph, the maximum flow (max-flow) passing through the graph from the source to the sink is found. If each edge value is interpreted as the strength of the connection between its nodes, the maximum flow can be obtained by finding the minimum-cost cut, that is, the way of severing edges between nodes that cuts the whole graph at the lowest cost.

In GrabCut-based segmentation, the pixels of the image are defined as the nodes of the graph, and the edges between adjacent nodes are defined so that the larger the color difference between adjacent pixels, the smaller the edge value. In addition to the edges between nodes, every node is connected by an edge to the source and to the sink. The source and sink represent the foreground and the background, respectively. For each node, the edge to the source reflects how similar the color of the corresponding pixel is to the foreground color distribution, and the edge to the sink reflects how similar it is to the background color distribution.

The value of an edge to the source or sink node thus indicates which region a pixel belongs to and is called the region term, while the value of an edge between nodes indicates the likelihood that a boundary between regions lies there and is called the boundary term. The minimum-cost cut is the partition that, for every node, cuts one of its two edges (to the source for the foreground or to the sink for the background) and cuts the edges between adjacent nodes where a boundary occurs, such that the sum of the values of the cut edges is as small as possible. By finding the minimum-cost cut for edge values defined in this way, the image can be segmented into two regions, foreground and background.
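The source and sink construction described above can be tried on a tiny graph with a plain max-flow routine. The sketch below uses the Edmonds-Karp algorithm (breadth-first augmenting paths), chosen here only because it is short; the patent does not name a max-flow algorithm. The function name, argument layout, and capacities are hypothetical.

```python
from collections import defaultdict, deque


def min_cut_labels(n_pixels, fg_cap, bg_cap, pair_cap):
    """Label pixels by a minimum cut; returns (labels, max_flow).

    fg_cap[p]  -- capacity of the source -> p edge (affinity to foreground)
    bg_cap[p]  -- capacity of the p -> sink edge (affinity to background)
    pair_cap   -- dict: (p, q) -> capacity between adjacent pixels
    labels[p] is 1 if p stays connected to the source (foreground), else 0.
    """
    source, sink = n_pixels, n_pixels + 1
    cap = defaultdict(lambda: defaultdict(float))
    for p in range(n_pixels):
        cap[source][p] += fg_cap[p]
        cap[p][source] += 0.0  # reverse edge for residual flow
        cap[p][sink] += bg_cap[p]
        cap[sink][p] += 0.0
    for (p, q), w in pair_cap.items():
        cap[p][q] += w
        cap[q][p] += w

    def bfs():
        """Return the parent map of nodes reachable from the source."""
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    if v == sink:
                        return parent
                    queue.append(v)
        return parent

    flow = 0.0
    while True:
        parent = bfs()
        if sink not in parent:
            break  # no augmenting path left: parent holds the source side
        bottleneck, v = float("inf"), sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            u = parent[v]
            cap[u][v] -= bottleneck
            cap[v][u] += bottleneck
            v = u
        flow += bottleneck
    return [1 if p in parent else 0 for p in range(n_pixels)], flow
```

For a row of four pixels where the first two resemble the foreground, the last two resemble the background, and the middle edge is weak, the cut falls on that weak edge. Practical implementations on full images use specialized max-flow solvers, since this simple version scales poorly.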

The present invention has been described above with reference to its preferred embodiments. Those of ordinary skill in the art to which the invention pertains will understand that it may be implemented in modified forms without departing from its essential characteristics. The disclosed embodiments should therefore be considered illustrative rather than limiting. The scope of the invention is set out in the claims rather than in the foregoing description, and all differences within the equivalent scope should be construed as being included in the invention.

FIG. 1 is a diagram showing the configuration of an image segmentation apparatus according to an embodiment of the present invention.

FIG. 2 is a diagram illustrating a foreground and a background designated by a user according to an embodiment of the present invention.

FIG. 3 is a diagram illustrating feature points according to an embodiment of the present invention.

FIG. 4 is a diagram illustrating image registration according to an embodiment of the present invention.

Claims

An image segmentation apparatus comprising: an approximation unit that approximates the foreground and background of a multi-view image set using the segmentation information for one image input by a user, wherein the approximation unit approximates the background change of the multi-view image set with a transformation matrix (homography) based on affine change, approximates the foreground change by a registration method that deforms an image onto a target image using a free-form deformation model on a regular mesh based on cubic B-splines, and transmits the approximation information of the foreground and background to a segmentation unit; and

the segmentation unit, which segments an adjacent image based on the foreground and background approximation information received from the approximation unit, by assigning to each pixel of the image a label indicating whether that pixel belongs to the foreground or the background.

delete

The apparatus of claim 1, wherein the approximation information of the foreground and background includes at least one of: the regions estimated as foreground and background in the adjacent image, the pixel sets corresponding to the regions estimated as foreground and background, and the pixel probability distributions of the foreground and background of the previous image.

delete

The apparatus of claim 7, wherein the segmentation unit re-approximates the predicted probability distribution of the foreground pixels and the predicted probability distribution of the background pixels of the adjacent image using the pixels estimated as foreground in the adjacent image, and applies damping to compute a weighted average of the re-approximated predicted probability distributions of the foreground and background pixels of the adjacent image and the probability distributions of the foreground and background pixels of the previous image.

delete

The apparatus of claim 9, wherein the segmentation unit obtains the energy […] associated with the assigned labels using the equation […], where […] and […] denote the region term and the boundary term of the energy, respectively; the region term aggregates, over all pixels, the likelihood that each pixel belongs to the foreground or the background; the boundary term represents the energy of a boundary being located at a particular position; and […] denotes the weight between […] and […].