KR102247288B1

KR102247288B1 - Warping residual based image stitching for large parallax

Info

Publication number: KR102247288B1
Application number: KR1020200028996A
Authority: KR
Inventors: 심재영; 이규열
Original assignee: 울산과학기술원
Priority date: 2019-11-29
Filing date: 2020-03-09
Publication date: 2021-05-03

Abstract

The present invention provides a technology related to an image synthesis method according to a stitching algorithm based on a warping residual. One embodiment includes the steps of: obtaining homography matrices which convert a reference image including a plurality of feature points into a target image; obtaining warping residual vectors for the feature points based on the plurality of homography matrices; obtaining weights of feature points for a reference region in a reference image based on the warping residual vectors; obtaining an estimated transformation matrix for the reference region based on the weights and the homography matrices; and synthesizing the reference image and a target image, by applying the estimated transformation matrix to the reference region.

Description

Method and apparatus for stitching images having a large parallax based on warping residuals {WARPING RESIDUAL BASED IMAGE STITCHING FOR LARGE PARALLAX}

The following provides a technique relating to a method and apparatus for stitching images having a large parallax based on warping residuals.

Image stitching is an important technique for various computer vision applications: it aligns multiple images captured from different viewpoints in a common coordinate domain to create an image with a wide field of view. Recently, many commercial products that use image stitching, such as 360° panoramic cameras and surround view monitoring systems, have been released, along with image stitching software for compositing multiple images, such as Adobe Photoshop Photomerge and Autostitch.

Most existing image stitching methods follow a similar procedure. First, feature points are detected in a pair of input images, and correspondence matches between the feature points are found across the images. Then, a parametric image warping model that warps one image into the domain of the other image is estimated from the detected feature matches. Finally, the output stitched image is composed by determining the pixel values in the overlapping area between the warped image and the other image.

One of the most important and challenging steps in image stitching is image warping. Homography is a simple and traditional image warping model that describes a parametric planar transformation based on the planar scene assumption. However, when the captured scene is not planar, for example because it includes foreground objects at different scene depths, and the camera baseline is large, a parallax phenomenon is observed in which the relative positions of objects differ between the two images. In such cases, stitching results obtained with a planar transformation model such as homography often exhibit parallax artifacts near object boundaries.

To mitigate parallax artifacts in image stitching, adaptive warping algorithms have been proposed that divide an image into regular grid cells or pixels and warp each partition with a different model. To prevent distortion of the warped image, energy minimization frameworks have been applied to optimize the adaptive warps. Local alignment techniques based on seam-cutting methods have also been proposed, which align only a specific image region while hiding artifacts in the other, misaligned regions. However, in images with large parallax, the pixels in one image that correspond to adjacent pixels in the other image may not be adjacent to each other, which can cause severe parallax artifacts in the stitched images produced by existing warping-based methods. A video stitching method that addresses the large parallax problem based on epipolar geometry has been proposed, but it cannot be applied directly to image stitching, which lacks the temporal motion information of a video sequence.

According to one aspect, a method of synthesizing a plurality of images, performed by a processor, includes: obtaining homography matrices that transform a reference image including a plurality of feature points into a target image; obtaining warping residual vectors for the feature points based on the plurality of homography matrices; obtaining weights of the feature points for a reference region in the reference image based on the warping residual vectors; obtaining an estimated transformation matrix for the reference region based on the weights and the homography matrices; and synthesizing the reference image and the target image by applying the estimated transformation matrix to the reference region.

The synthesizing of the reference image and the target image may include: obtaining a first estimated inverse transformation matrix for a target region in the target image; and synthesizing the reference image and the target image by applying the first estimated inverse transformation matrix to the target region.

The method may further include: calculating a warping loss for the target region based on the pixels included in the target region and the first estimated inverse transformation matrix; calculating a warping distance for the target pixel by applying the first estimated inverse transformation matrix and the estimated transformation matrix to a center pixel of the target region; determining whether to correct the first estimated inverse transformation matrix based on the warping loss and the warping distance; and, based on the determination, correcting the first estimated inverse transformation matrix with a second estimated inverse transformation matrix that has a small warping loss among the estimated inverse transformation matrices for other regions included in the target image.

The correcting of the first estimated inverse transformation matrix may include: selecting candidate regions, according to a predetermined criterion based on the target region, from among the regions included in the target image; calculating the warping loss for each of the candidate regions; correcting the first estimated inverse transformation matrix, based on the warping loss, with a second estimated inverse transformation matrix for one of the candidate regions; applying the second estimated inverse transformation matrix to the target region; and synthesizing the reference image and the target image based on the pixels in the reference image that correspond to the target region to which the second estimated inverse transformation matrix has been applied.

The obtaining of the homography matrices that transform the reference image including the plurality of feature points into the target image may include: extracting feature points from the reference image and the target image; matching the feature points of the target image that correspond to the feature points of the reference image; and obtaining the transformation matrices based on the matching of the feature points.

The method may further include: obtaining a candidate matching set including matched feature point pairs based on the matching of the feature points; and repeating a transformation matrix estimation process on the candidate matching set while checking a predetermined stopping condition.

The transformation matrix estimation process may include: estimating a first inlier matching set from the candidate matching set using an outlier removal algorithm; estimating a first transformation matrix based on the first inlier matching set; and removing the first inlier matching set from the candidate matching set.
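The repeated estimation process described above can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes RANSAC as the outlier removal algorithm and a stopping condition based on a minimum inlier count and a maximum number of models, neither of which the text specifies. All function names, thresholds, and the synthetic two-plane data are hypothetical.

```python
import numpy as np

def dlt(x, y):
    """Least-squares homography from (K, 2) matches x -> y via SVD (K >= 4)."""
    K = len(x)
    xh = np.hstack([x, np.ones((K, 1))])
    z = np.zeros((K, 3))
    G = np.vstack([np.hstack([z, -xh, y[:, 1:2] * xh]),
                   np.hstack([xh, z, -y[:, 0:1] * xh])])
    return np.linalg.svd(G)[2][-1].reshape(3, 3)

def project(H, x):
    """Map (K, 2) points through H and divide out the homogeneous scale."""
    p = np.hstack([x, np.ones((len(x), 1))]) @ H.T
    with np.errstate(divide="ignore", invalid="ignore"):
        return p[:, :2] / p[:, 2:3]

def transfer_error(H, x, y):
    """Distance between H-warped x and its match y; non-finite values become inf."""
    err = np.linalg.norm(project(H, x) - y, axis=1)
    return np.where(np.isfinite(err), err, np.inf)

def ransac_inliers(x, y, rng, thresh=1.0, iters=500):
    """Assumed outlier removal step: keep the largest consensus set found."""
    best = np.zeros(len(x), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(x), size=4, replace=False)
        inl = transfer_error(dlt(x[idx], y[idx]), x, y) < thresh
        if inl.sum() > best.sum():
            best = inl
    return best

def estimate_homographies(x, y, min_inliers=8, max_models=5, seed=0):
    """Estimate a model, remove its inliers from the candidate set, repeat."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    rng = np.random.default_rng(seed)
    remaining = np.arange(len(x))          # candidate matching set
    models = []
    # Hypothetical stopping condition: too few candidates or enough models.
    while len(remaining) >= min_inliers and len(models) < max_models:
        inl = ransac_inliers(x[remaining], y[remaining], rng)
        if inl.sum() < min_inliers:
            break
        models.append(dlt(x[remaining][inl], y[remaining][inl]))
        remaining = remaining[~inl]        # drop the inlier matching set
    return models, remaining

# Synthetic matches lying on two planes with different (made-up) homographies.
rng = np.random.default_rng(1)
xa, xb = rng.uniform(0, 200, (20, 2)), rng.uniform(0, 200, (20, 2))
Ha = np.array([[1.0, 0.0, 5.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])
Hb = np.array([[1.0, 0.0, 40.0], [0.0, 1.0, 10.0], [1e-4, 0.0, 1.0]])
x = np.vstack([xa, xb])
y = np.vstack([project(Ha, xa), project(Hb, xb)])
models, leftover = estimate_homographies(x, y)
```

Removing each inlier set before the next round is what lets later rounds lock onto planes that the dominant plane would otherwise mask.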

Each of the plurality of homography matrices may correspond to a matrix that transforms a different region included in the reference image.

The obtaining of the warping residual vectors for the feature points based on the plurality of homography matrices may include: for each of the feature points, applying each of the plurality of homography matrices and calculating a warping residual with respect to the corresponding feature point in the target image; and obtaining the warping residual vector composed of the warping residuals.

The method may further include dividing the reference image into a plurality of pixel regions, each including at least one pixel, based on boundaries between subjects included in the reference image, and the reference region may correspond to one of the plurality of pixel regions.

The pixel region may correspond to a superpixel.

The obtaining of the weights of the feature points for the reference region may include: obtaining, as a warping residual vector for the reference region, the warping residual vector of a feature point that is close to the center pixel of the reference region among the plurality of feature points; and obtaining the weights for the reference region based on the warping residual vector for the reference region and the warping residual vectors for the feature points.

According to one aspect, an apparatus for synthesizing a plurality of images includes at least one processor configured to: obtain homography matrices that transform a reference image including a plurality of feature points into a target image; obtain warping residual vectors for the feature points based on the plurality of homography matrices; obtain weights of the feature points for a reference region in the reference image based on the warping residual vectors; obtain an estimated transformation matrix for the reference region based on the weights and the homography matrices; and synthesize the reference image and the target image by applying the estimated transformation matrix to the reference region.

In synthesizing the reference image and the target image, the processor may: obtain a first estimated inverse transformation matrix for a target region in the target image; synthesize the reference image and the target image by applying the first estimated inverse transformation matrix to the target region; calculate a warping loss for the target region based on the pixels included in the target region and the first estimated inverse transformation matrix; calculate a warping distance for the target pixel by applying the first estimated inverse transformation matrix and the estimated transformation matrix to a center pixel of the target region; determine whether to correct the first estimated inverse transformation matrix based on the warping loss and the warping distance; and, based on the determination, correct the first estimated inverse transformation matrix with a second estimated inverse transformation matrix that has a small warping loss among the estimated inverse transformation matrices for other regions included in the target image.

In correcting the first estimated inverse transformation matrix, the processor may: select candidate regions, according to a predetermined criterion based on the target region, from among the regions included in the target image; calculate the warping loss for each of the candidate regions; correct the first estimated inverse transformation matrix, based on the warping loss, with a second estimated inverse transformation matrix for one of the candidate regions; apply the second estimated inverse transformation matrix to the target region; and synthesize the reference image and the target image based on the pixels in the reference image that correspond to the target region to which the second estimated inverse transformation matrix has been applied.

In obtaining the homography matrices that transform the reference image including the plurality of feature points into the target image, the processor may: extract feature points from the reference image and the target image; match the feature points of the target image that correspond to the feature points of the reference image; obtain a candidate matching set including matched feature point pairs based on the matching of the feature points; and repeat a transformation matrix estimation process on the candidate matching set while checking a predetermined stopping condition.

The transformation matrix estimation process may include: estimating a first inlier matching set from the candidate matching set using an outlier removal algorithm; estimating a first transformation matrix based on the first inlier matching set; and removing the first inlier matching set from the candidate matching set.

In obtaining the warping residual vectors for the feature points, the processor may: for each of the feature points, apply each of the plurality of homography matrices and calculate a warping residual with respect to the corresponding feature point in the target image; and, for each of the feature points, obtain the warping residual vector composed of the warping residuals.

The processor may divide the reference image into a plurality of superpixels, each including at least one pixel, based on boundaries between subjects included in the reference image.

In obtaining the weights of the feature points for the reference region, the processor may: obtain, as a warping residual vector for the reference region, the warping residual vector of a feature point that is close to the center pixel of the reference region among the plurality of feature points; and obtain the weights for the reference region based on the warping residual vector for the reference region and the warping residual vectors for the feature points.

FIGS. 1A and 1B are diagrams showing images I and J captured from different angles, together with feature points.
FIGS. 2A to 2D are diagrams for explaining the effect of the weight-based warping residual.
FIG. 3A is a diagram showing a pair of input images having a large parallax.
FIGS. 3B to 3D are diagrams showing image stitching results obtained by warping an input image with each of three estimated homographies.
FIG. 4A is a diagram showing image I, according to an embodiment, of a pair of input images having a large parallax.
FIG. 4B is a diagram showing image J, according to an embodiment, of a pair of input images having a large parallax.
FIG. 4C is a diagram showing a warped image of image I.
FIG. 4D is a diagram showing the occluded regions of the warped image of image I as holes.
FIG. 5 is a diagram for explaining a method of synthesizing a plurality of images using a stitching algorithm according to an embodiment.

The specific structural and functional descriptions of the embodiments are disclosed for illustrative purposes only, and the embodiments may be modified and implemented in various forms. Accordingly, the embodiments are not limited to the specific forms disclosed, and the scope of this specification includes changes, equivalents, and substitutes that fall within its technical idea.

Although terms such as "first" and "second" may be used to describe various components, these terms should be interpreted only as distinguishing one component from another. For example, a first component may be referred to as a second component, and similarly, a second component may be referred to as a first component.

When a component is described as being "connected" to another component, it may be directly connected or coupled to that other component, but it should be understood that intervening components may also be present.

Singular expressions include plural expressions unless the context clearly indicates otherwise. In this specification, terms such as "include" and "have" are intended to indicate that the described features, numbers, steps, operations, components, parts, or combinations thereof are present, and should not be understood as precluding in advance the presence or possible addition of one or more other features, numbers, steps, operations, components, parts, or combinations thereof.

Unless otherwise defined, all terms used herein, including technical and scientific terms, have the same meaning as commonly understood by a person of ordinary skill in the relevant technical field. Terms such as those defined in commonly used dictionaries should be interpreted as having meanings consistent with their meanings in the context of the related art, and should not be interpreted in an idealized or excessively formal sense unless explicitly so defined in this specification. Hereinafter, embodiments are described in detail with reference to the accompanying drawings. Like reference numerals in the drawings denote like elements.

One embodiment proposes a stitching algorithm based on warping residuals for two images having a large parallax. In other words, the embodiment discloses a method of synthesizing two images with large parallax. Because parallax usually occurs near the boundaries of subjects (or objects) in an image, in one embodiment the input image is divided into superpixels and the superpixels are adaptively warped. A superpixel is a set of pixels with similar characteristics; for example, the pixels in an image can be grouped into superpixels by the SLIC algorithm.

A stitching algorithm according to an embodiment detects feature points in the two images and finds correspondence matches between them. In other words, each feature point in the first image is matched with the corresponding feature point in the second image. The feature matches can be used to estimate multiple homographies and their associated inlier matches. A homography is the transformation relationship between an image obtained by projecting a 3D scene onto a 2D plane and an image obtained by projecting the same 3D scene from a different angle; for example, it may express the relationship between matching feature points of the two images in the form of a matrix or vector. Warping an image means moving the pixels in the image to other positions according to a predetermined position transformation rule; for example, it may mean transforming the pixels of the image according to a homography. That is, given a reference image and a target image, a homography may correspond to a matrix that transforms the reference image into the target image, based on the feature points of the reference image and the corresponding feature points of the target image. In this case, transforming the reference image by applying the homography to its pixels corresponds to warping the reference image into the target image. In the following, image I is the reference image and image J is the target image.
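As a small illustration of what applying a homography to pixels means, the sketch below maps pixel coordinates through a 3 x 3 matrix in homogeneous coordinates. The function name `warp_points` and the example matrix (a pure translation) are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import numpy as np

def warp_points(H, pts):
    """Apply a 3x3 homography H to an (N, 2) array of pixel coordinates."""
    pts = np.asarray(pts, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])  # lift to [x1, x2, 1]
    mapped = homog @ H.T                               # row-wise H @ x
    return mapped[:, :2] / mapped[:, 2:3]              # divide out the scale

# Hypothetical homography: a pure translation by (10, -5).
H = np.array([[1.0, 0.0, 10.0],
              [0.0, 1.0, -5.0],
              [0.0, 0.0, 1.0]])
warped = warp_points(H, [[0.0, 0.0], [3.0, 4.0]])
```

The final division is what makes the mapping projective rather than affine: for a general homography the third coordinate is not 1.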

According to an embodiment, the optimal homography for each superpixel can be found using the feature matches between the two images, and the contribution of each feature point to the optimal homography can be computed adaptively according to its warping residual. The warping residuals for a given superpixel emphasize the feature points located in regions whose scene depth is similar to that of the superpixel, which mitigates parallax artifacts when the superpixel is warped. Furthermore, for more accurate warping, the initially estimated homography at each superpixel can be corrected using the homographies of neighboring superpixels.

Warping residuals for large parallax

This section reviews the mathematical framework of moving direct linear transformation (MDLT), one of the adaptive image warping models, and introduces a new warping residual concept that assigns high weights to feature points at a scene depth similar to that of a given superpixel when computing the alignment error of the warped superpixel.

Moving Direct Linear Transformation (MDLT)

Let X be a point on a plane π in the real 3D space, and let x = [x_1, x_2, 1]^T and y = [y_1, y_2, 1]^T be the pixels onto which X is projected in the two images I and J, respectively. The relationship between the two pixels can be expressed by a 3 × 3 homography matrix H induced by the plane π, as in Equation 1.

Here, ~ denotes equality up to scale. Since Hx and y are the same position in J, Equation 2 follows.

Here, h is obtained by vectorizing the rows of H. If x and y are not projections of the same scene point on the plane π and the camera baseline between I and J is not sufficiently small, Equation 2 does not hold, and the norm ‖y × Hx‖ can be regarded as the alignment error between I and J.
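The equation images for Equations 1 and 2 are not present in this text. Under the definitions given above (x, y in homogeneous coordinates, h the row-wise vectorization of H) and the standard direct linear transformation formulation, they would plausibly read as follows; the exact typesetting of the original is an assumption.

```latex
\mathbf{y} \sim \mathbf{H}\mathbf{x} \tag{1}

\mathbf{0}_{3\times 1} = \mathbf{y} \times \mathbf{H}\mathbf{x}
= \begin{bmatrix}
\mathbf{0}_{1\times 3} & -\mathbf{x}^{T} & y_2\,\mathbf{x}^{T} \\
\mathbf{x}^{T} & \mathbf{0}_{1\times 3} & -y_1\,\mathbf{x}^{T} \\
-y_2\,\mathbf{x}^{T} & y_1\,\mathbf{x}^{T} & \mathbf{0}_{1\times 3}
\end{bmatrix}\mathbf{h} \tag{2}
```

Each row is one component of the cross product y × Hx written as a linear function of h. Only two of the three rows are linearly independent.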

Direct linear transformation (DLT) estimates the optimal homography ĥ from K matching pixel pairs between I and J by minimizing the algebraic error, as in Equations 3 and 4.

Here, g_k denotes the first two rows of the right-hand side (RHS) matrix of Equation 2 for the k-th matching pixel pair, and G ∈ R^(2K×9) is obtained by stacking g_k for all K matching pairs. The optimal homography matrix Ĥ is obtained from the solution ĥ, which is the least significant right singular vector of G.
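The DLT step can be sketched in a few lines of NumPy. The sketch assumes the usual unit-norm constraint on h, so that the minimizer of ‖Gh‖² is the right singular vector of G with the smallest singular value. It omits the coordinate normalization commonly applied in practice, and the function names and the test homography `H_true` are hypothetical.

```python
import numpy as np

def build_G(x, y):
    """Stack g_k (first two rows of the Equation 2 matrix) for all K matches."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    K = len(x)
    xh = np.hstack([x, np.ones((K, 1))])          # homogeneous x_k
    z = np.zeros((K, 3))
    G = np.empty((2 * K, 9))
    G[0::2] = np.hstack([z, -xh, y[:, 1:2] * xh])  # [0, -x^T, y2 x^T]
    G[1::2] = np.hstack([xh, z, -y[:, 0:1] * xh])  # [x^T, 0, -y1 x^T]
    return G

def dlt_homography(x, y):
    """Return the 3x3 homography whose row-wise vectorization minimizes ||G h||."""
    _, _, Vt = np.linalg.svd(build_G(x, y))
    return Vt[-1].reshape(3, 3)                    # least significant right singular vector

# Synthetic check with a made-up homography.
H_true = np.array([[1.2, 0.1, 5.0],
                   [-0.05, 0.9, -3.0],
                   [1e-3, 2e-3, 1.0]])
x = np.array([[0, 0], [100, 0], [100, 80], [0, 80], [50, 40], [20, 70]], float)
yh = np.hstack([x, np.ones((len(x), 1))]) @ H_true.T
y = yh[:, :2] / yh[:, 2:3]
H_est = dlt_homography(x, y)
H_est = H_est / H_est[2, 2]                        # fix the arbitrary scale
```

Because a homography is defined only up to scale, the estimate is compared with `H_true` after dividing by its bottom-right entry.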

This global homography can produce significant parallax artifacts when aligning images of non-planar 3D scenes captured with a large camera baseline. To mitigate these artifacts, MDLT can be used, which divides the image into the cells of a regular grid and adaptively estimates a homography for each cell. The optimal homography Ĥ_i for the i-th cell (equivalently, its vectorized form ĥ_i) is estimated by introducing a weight matrix W_i into Equation 4, as in Equation 5.

Here, W_i ∈ R^(2K×2K) is a diagonal matrix for the i-th cell, given by Equation 6.

The weight w_{i,k} is defined using the spatial distance between the center pixel c_i of the i-th cell and the k-th feature point x_k, as in Equation 7.
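A minimal sketch of the weighted estimation for one cell follows. The text states only that w_{i,k} depends on the spatial distance between c_i and x_k. The Gaussian form with a lower bound used here, the parameters `sigma` and `gamma`, and the repetition of each weight twice on the diagonal of W_i are assumptions borrowed from the common moving-DLT formulation, not details confirmed by this document.

```python
import numpy as np

def build_G(x, y):
    """Stack the two DLT rows g_k for each of the K matches (2K x 9)."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    K = len(x)
    xh = np.hstack([x, np.ones((K, 1))])
    z = np.zeros((K, 3))
    G = np.empty((2 * K, 9))
    G[0::2] = np.hstack([z, -xh, y[:, 1:2] * xh])
    G[1::2] = np.hstack([xh, z, -y[:, 0:1] * xh])
    return G

def distance_weights(center, x, sigma=8.5, gamma=0.1):
    """Assumed form of Equation 7: Gaussian of distance to c_i, floored at gamma."""
    d2 = np.sum((np.asarray(x, float) - np.asarray(center, float)) ** 2, axis=1)
    return np.maximum(np.exp(-d2 / sigma ** 2), gamma)

def weighted_dlt(x, y, w):
    """Minimize ||W_i G h|| with W_i = diag(w_1, w_1, ..., w_K, w_K)."""
    W = np.repeat(np.asarray(w, float), 2)   # each match contributes two rows
    _, _, Vt = np.linalg.svd(W[:, None] * build_G(x, y))
    return Vt[-1].reshape(3, 3)

# Weights for a cell centered at the origin (hypothetical feature positions).
w = distance_weights([0.0, 0.0], [[0, 0], [3, 4], [100, 100]], sigma=5.0, gamma=0.1)

# With exact matches from a single homography, any positive weights recover it.
H_true = np.array([[1.0, 0.02, 4.0], [0.01, 1.0, -2.0], [1e-4, 0.0, 1.0]])
x = np.array([[0, 0], [90, 5], [80, 70], [10, 60], [40, 30]], float)
yh = np.hstack([x, np.ones((len(x), 1))]) @ H_true.T
y = yh[:, :2] / yh[:, 2:3]
H_i = weighted_dlt(x, y, distance_weights([40.0, 30.0], x))
H_i = H_i / H_i[2, 2]
```

The lower bound `gamma` keeps far-away matches from vanishing entirely, so the weighted system does not become rank-deficient for cells with few nearby features.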

Warping residual-based transformation

The spatial distance-based weight w_{i,k} defined in Equation 7 mitigates the artifacts of image warping methods based on a global homography, but it still suffers from severe misalignment artifacts, especially in images with large parallax.

For example, referring to FIG. 1A, consider an image I with three feature points, where two points x_1 and x_2 lie on a plane π^1 and a point x_3 lies on a different plane π^2. The two planes π^1 and π^2 can be assumed to have different scene depths and to cause severe parallax. Accordingly, the relative positions of the planes π^1 and π^2 in the image J, projected from a different angle, differ greatly from their relative positions in I. Considering the three matching pairs {x_k ↔ y_k}, k = 1, 2, 3, between I and J, and referring to FIGS. 1A and 1B, the relative position between x_1 and x_2 in I is the same as that between the corresponding points y_1 and y_2 in J, whereas the relative position between x_2 and x_3 in I may differ from that between the corresponding points y_2 and y_3 in J. The existing weighting scheme of Equation 7 assigns higher weights to spatially closer feature points regardless of their differing scene depths, and therefore produces parallax artifacts near object boundaries.

To overcome this drawback, multiple candidate homography matrices H^m between I and J are first extracted, and a warping residual r^m_k of the k-th feature point x_k with respect to the m-th homography H^m is introduced. r^m_k can be expressed as in Equation 8.

For example, referring to FIGS. 1A and 1B, the three points {x_k}, k = 1, 2, 3, can be warped using the two homographies H¹ and H² associated with π¹ and π², respectively. Because x₁ and x₂ lie on π¹, the warping residuals r¹₁ and r¹₂ induced by H¹ are small, whereas the warping residuals r²₁ and r²₂ induced by H² are large. Conversely, because x₃ lies on π², it yields a small warping residual r²₃ and a large warping residual r¹₃.

For the k-th feature point x_k, the warping residual vector r_k can be defined as in Equation 9 using the M homographies {H^m}, m = 1, ..., M.
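The construction of a warping residual vector can be sketched as follows. This is a minimal illustration, not the patented implementation: Equations 8 and 9 are not reproduced in this text, so the sketch assumes that each residual r^m_k is the Euclidean distance between H^m x_k and the matched point y_k. The function names `apply_homography` and `warping_residual_vector` are hypothetical.

```python
import math

def apply_homography(H, p):
    """Map a 2D point p = (x, y) through a 3x3 homography H (nested lists)."""
    x, y = p
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

def warping_residual_vector(homographies, x_k, y_k):
    """Return [r^1_k, ..., r^M_k] for one matching pair x_k <-> y_k.

    Assumption: each residual is the Euclidean distance between H^m x_k and y_k.
    """
    return [math.dist(apply_homography(H, x_k), y_k) for H in homographies]

# Toy example: H1 is the identity, H2 translates by (10, 0).
H1 = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
H2 = [[1, 0, 10], [0, 1, 0], [0, 0, 1]]
r = warping_residual_vector([H1, H2], x_k=(5.0, 5.0), y_k=(5.0, 5.0))
print(r)  # [0.0, 10.0]
```

In this toy case the point agrees with H1 (small residual) and disagrees with H2 (large residual), which mirrors the behavior described for x₁ and x₂ on plane π¹.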

Points located at different scene depths produce different warping residual vectors, whereas feature points located in the same planar region tend to have similar warping residual vectors. The warping residual vectors can therefore implicitly represent the overall scene structure.

Warping residual vectors can be used to stitch images with large parallax. Because parallax generally occurs near object boundaries, in one embodiment the input image is divided into superpixels and warped adaptively for each superpixel. According to one embodiment, the optimal homography ĥ_i for the i-th superpixel S_i is estimated by solving Equation 5 with the existing spatial-distance-based weight w_i,k replaced by the proposed warping-residual-based weight w^prop_i,k. w^prop_i,k can be expressed as in Equation 10.

Here, R_i is the warping residual vector of the i-th superpixel S_i, defined as the warping residual vector of the feature point x_k closest to the center of S_i. σ and γ may be set to, for example, σ = 4 and γ = 0.01. Each superpixel can thus be warped adaptively by assigning high weights to feature points located in a planar region similar to that of the target superpixel.
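The weighting described above can be illustrated with a short sketch. Equation 10 is not reproduced in this text, so the form used here is an assumption: a Gaussian falloff in the distance between R_i and r_k, floored at γ, which is the form commonly used for spatial weights in APAP-style warping. The function names are hypothetical, and the defaults σ = 4 and γ = 0.01 are the example values given in the description.

```python
import math

def superpixel_residual(center, points, residual_vectors):
    """R_i: residual vector of the feature point closest to the superpixel center."""
    k = min(range(len(points)), key=lambda idx: math.dist(center, points[idx]))
    return residual_vectors[k]

def residual_weights(R_i, residual_vectors, sigma=4.0, gamma=0.01):
    """Weight of every feature point k for one superpixel i.

    Assumed form (not confirmed by the text):
    max(exp(-||R_i - r_k||^2 / sigma^2), gamma).
    """
    weights = []
    for r_k in residual_vectors:
        sq = sum((a - b) ** 2 for a, b in zip(R_i, r_k))
        weights.append(max(math.exp(-sq / sigma ** 2), gamma))
    return weights

# Two points on one plane and a spatially nearby point on another plane.
points = [(10.0, 10.0), (12.0, 11.0), (11.0, 30.0)]
residuals = [[0.5, 20.0], [0.7, 21.0], [19.0, 0.4]]
R_i = superpixel_residual((11.0, 10.0), points, residuals)
w = residual_weights(R_i, residuals)
print(w)
```

Under this assumed form, the third point receives only the floor weight γ even though it may be spatially close, because its residual vector differs strongly from R_i.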

FIGS. 2A to 2D are diagrams for explaining the effect of the warping-residual-based weighting scheme. In FIGS. 2A and 2B, the red object and the buildings have different relative positions in each view, so the two images exhibit large parallax. FIG. 2C visualizes the regular grid cells and the feature points obtained by applying a particular algorithm (for example, APAP). Referring to FIG. 2C, the weights of Equation 7, based on spatial distance, can be computed for the feature points with respect to the target grid cell marked by a white cross. In this case, higher weights are assigned to feature points that are spatially closer to the target grid cell. In particular, even though the target grid cell lies on the red object, high weights may be assigned not only to points on the red object but also to points on the white tower. Meanwhile, FIG. 2D shows the segmented superpixels and a given target superpixel marked by a white cross. Referring to FIG. 2D, the warping-residual-based weights of Equation 10 can be computed for the feature points with respect to the target superpixel marked by the white cross. In this case, high weights are assigned only to the feature points that lie on the red object, like the target superpixel, while the weights of spatially close feature points on the white tower, which lies at a different scene depth from the red object, are effectively suppressed. In addition, because object boundaries, where parallax artifacts mainly occur, generally coincide with superpixel boundaries, parallax artifacts can be effectively suppressed.

Image warping

First, multiple valid homographies between the two images are estimated; these are used to compute the warping residuals. The input image is then divided into superpixels, and the optimal homography for each superpixel is estimated based on the warping residuals. In addition, the initially estimated homographies are corrected to handle occlusion for more stable image alignment.

Multiple homography estimation

Feature points are detected using SIFT to obtain an initial matching set F_init between the two images I and J. Then, using F_init, M homographies {H^m}, m = 1, ..., M, and the associated inlier matching sets {F^m_inlier}, m = 1, ..., M, are estimated. The candidate matching set F_cand is initialized to F_init, and the first homography H¹ with its inlier set F¹_inlier is estimated from F_cand using the outlier removal method of MULTI-GS with a threshold η = 0.01. For stable homography estimation, normalized coordinates of the feature points are used, and correspondences of isolated feature points that have no neighboring point within a distance of 50 pixels are removed from F¹_inlier. F¹_inlier is then subtracted from F_cand, and the next homography H² and its inlier set F²_inlier are estimated from the updated F_cand. For example, this process may be repeated up to five times, until the stopping condition ∥F^m_inlier∥ < 8 or ∥F_cand∥/∥F_init∥ < 0.02 is satisfied. Furthermore, if only one homography is estimated to be valid during the iterations, the input images are regarded as exhibiting small parallax, and H¹ and F¹_inlier may be re-estimated using a more relaxed inlier threshold η = 0.1. Finally, the obtained sets {F^m_inlier}, m = 1, ..., M, are combined into the set F_inlier of all inlier matches. FIG. 3A shows a pair of input images with large parallax, and FIGS. 3B to 3D show the image stitching results obtained by warping the input image with each of three estimated homographies. Each of the multiple homographies can align only a particular region of the image. For example, referring to FIG. 3B, the first homography aligns only the region containing the white building; referring to FIG. 3C, the second homography aligns only the region containing the ground plane; and referring to FIG. 3D, the third homography aligns only the region containing the blue object.
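The iterative procedure above can be sketched as a simple loop over the candidate matching set. This is only an outline under stated assumptions: the robust estimator is passed in as a caller-supplied function `fit_with_inliers` (a hypothetical placeholder standing in for the MULTI-GS outlier removal named in the text), matches are assumed to be hashable tuples, and the isolated-point filtering and the relaxed-threshold re-estimation are omitted for brevity.

```python
def estimate_multiple_homographies(f_init, fit_with_inliers,
                                   max_iters=5, min_inliers=8, min_ratio=0.02):
    """Peel off homographies one at a time from the candidate matching set.

    fit_with_inliers(matches) -> (H, inliers) is a hypothetical callback that
    performs robust estimation on the current candidates.
    """
    f_cand = list(f_init)
    homographies, inlier_sets = [], []
    for _ in range(max_iters):
        # Stop when too few candidates remain relative to the initial set.
        if not f_cand or len(f_cand) / len(f_init) < min_ratio:
            break
        H, inliers = fit_with_inliers(f_cand)
        # Stop when the new model is supported by too few inliers.
        if len(inliers) < min_inliers:
            break
        homographies.append(H)
        inlier_sets.append(inliers)
        taken = set(inliers)
        f_cand = [m for m in f_cand if m not in taken]
    f_inlier = [m for s in inlier_sets for m in s]
    return homographies, inlier_sets, f_inlier

def toy_fit(matches):
    # Stand-in estimator: treats the most frequent plane label as the model.
    labels = [m[0] for m in matches]
    best = max(set(labels), key=labels.count)
    return best, [m for m in matches if m[0] == best]

matches = ([("A", i) for i in range(20)] + [("B", i) for i in range(10)]
           + [("C", i) for i in range(3)])
Hs, sets_, f_inlier = estimate_multiple_homographies(matches, toy_fit)
print(Hs, len(f_inlier))  # ['A', 'B'] 30
```

In the toy run, the third group has only three supporting matches, so the loop stops on the minimum-inlier condition and two models are kept.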

Optimal warping estimation

As described above, the input image I is segmented into superpixels using SLIC, and each superpixel of I is warped to J according to the optimal homography computed by solving Equation 5 with the warping-residual-based weights of Equation 10. If warping from I to J is called forward warping, such forward warping can create holes in the warped image region, so inverse warping from J to I is performed instead. Specifically, consider a large canvas in the J domain that contains both the image J and the warped image of I. The canvas is divided into superpixels such that the interior of J is segmented using SLIC and the exterior of J is divided into superpixels on a uniform 100 × 100 grid. For each j-th superpixel S^J_j of the canvas, an initial homography Ĥ^J→I_j is estimated based on Equation 5, in the same manner as the homography estimation for forward warping. Then, according to Ĥ^J→I_j, the pixels of I corresponding to S^J_j are placed in the canvas domain, producing a hole-free warped image of I.
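The reason inverse warping avoids holes can be seen in a small sketch: every canvas pixel looks up its own source pixel in I, so no canvas pixel inside the warped region is skipped. The sketch below is illustrative only; it uses a single homography for all canvas pixels and nearest-neighbor sampling, both of which are simplifying assumptions, and the function names are hypothetical.

```python
def apply_homography(H, p):
    """Map a 2D point p = (x, y) through a 3x3 homography H (nested lists)."""
    x, y = p
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

def inverse_warp(image_i, H_j_to_i, canvas_pixels):
    """Fill canvas pixels by sampling image I through a J->I homography.

    Returns {canvas_pixel: value}; pixels that map outside I are left out.
    """
    height, width = len(image_i), len(image_i[0])
    out = {}
    for (x, y) in canvas_pixels:
        u, v = apply_homography(H_j_to_i, (x, y))
        ui, vi = round(u), round(v)  # nearest-neighbor sampling (assumption)
        if 0 <= ui < width and 0 <= vi < height:
            out[(x, y)] = image_i[vi][ui]
    return out

# A 2x3 image I that appears shifted 2 pixels to the right on the canvas.
image_i = [[1, 2, 3],
           [4, 5, 6]]
H_j_to_i = [[1, 0, -2], [0, 1, 0], [0, 0, 1]]
canvas = [(x, y) for y in range(2) for x in range(6)]
warped = inverse_warp(image_i, H_j_to_i, canvas)
print(warped[(2, 0)], warped[(4, 1)], len(warped))  # 1 6 6
```

In the per-superpixel setting described in the text, `H_j_to_i` would be chosen according to the superpixel that contains each canvas pixel.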

FIGS. 4A and 4B show the two images I and J, respectively, and FIG. 4C shows the warped image of I. Referring to FIG. 4C, most pixels are warped correctly, but the tail of the fish statue appears twice because of occlusion, as indicated by the yellow box. To handle this occlusion artifact, each superpixel in the canvas domain whose corresponding pixels lie within I is checked to determine whether it is occluded in I. To this end, the warping loss for S^J_j is first computed as in Equation 11.

Here, a superpixel with a large warping loss is generally regarded as a superpixel occluded in I. In addition, the bidirectional warping distance for S^J_j is computed as in Equation 12.

Here, c_j is the center pixel of S^J_j, and Ĥ_i denotes the forward-warping homography of the i-th superpixel in I that contains Ĥ^J→I_j c_j. Referring to FIG. 4B, because the region is occluded in I as shown in FIG. 4C, the distance in the warped image between c_j and its bidirectionally warped point is large, as indicated by the red dots. For example, S^J_j may be determined to be occluded when L(S^J_j) > 20 and d(c_j) is greater than 2% of the diagonal length of J. In addition, connected component analysis is performed on the superpixels determined to be occluded, and the M-1 largest connected occluded regions are selected. FIG. 4D shows the warped image of I in the canvas domain of J, with the occluded regions shown as holes.
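The occlusion test can be sketched as follows. Equations 11 and 12 are not reproduced in this text, so the sketch makes two assumptions: the warping loss L(S^J_j) is supplied as a precomputed number, and the bidirectional warping distance is the Euclidean distance between c_j and the point obtained by warping c_j to I with the J→I homography and back with the forward homography. The thresholds 20 and 2% are the example values from the description; the function names are hypothetical.

```python
import math

def apply_homography(H, p):
    """Map a 2D point p = (x, y) through a 3x3 homography H (nested lists)."""
    x, y = p
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return (u / w, v / w)

def bidirectional_distance(c_j, H_j_to_i, H_i_forward):
    """Distance between c_j and its J->I->J round trip (assumed Equation 12)."""
    p_in_i = apply_homography(H_j_to_i, c_j)
    back_in_j = apply_homography(H_i_forward, p_in_i)
    return math.dist(c_j, back_in_j)

def is_occluded(warping_loss, distance, diag_j, loss_thresh=20.0, dist_ratio=0.02):
    """Both conditions must hold for the superpixel to be flagged as occluded."""
    return warping_loss > loss_thresh and distance > dist_ratio * diag_j

c_j = (300.0, 200.0)
H_j_to_i = [[1, 0, -100], [0, 1, 0], [0, 0, 1]]
H_consistent = [[1, 0, 100], [0, 1, 0], [0, 0, 1]]    # exact inverse
H_inconsistent = [[1, 0, 130], [0, 1, 0], [0, 0, 1]]  # lands 30 px away
diag_j = math.hypot(640, 480)  # 800.0 for a 640x480 image J

d_ok = bidirectional_distance(c_j, H_j_to_i, H_consistent)
d_bad = bidirectional_distance(c_j, H_j_to_i, H_inconsistent)
print(is_occluded(25.0, d_ok, diag_j), is_occluded(25.0, d_bad, diag_j))  # False True
```

A consistent forward and inverse pair returns the center to itself, so only superpixels whose two warps disagree and whose loss is high are flagged.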

The warping residual vector of a superpixel is defined by the warping residual vector of the feature point closest to the superpixel, which may cause alignment errors. Therefore, the estimated homography Ĥ^J→I_j for S^J_j is corrected by replacing it with the homography of the neighboring superpixel that has the minimum warping loss according to Equation 11 among the 100 superpixels closest to S^J_j.
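The correction step can be sketched as a nearest-neighbor search followed by a minimum-loss selection. Several details are assumptions rather than statements of the text: closeness is measured between superpixel centers, the superpixel itself is excluded from its own neighbor list, and the Equation 11 loss is provided by a caller-supplied function `candidate_loss(j, n)` (a hypothetical name) so that the sketch does not fix how the loss of a candidate is evaluated.

```python
import math

def refine_homography(j, centers, homographies, candidate_loss, n_neighbors=100):
    """Replace the homography of superpixel j with the best neighbor's.

    centers[n]      : center pixel of superpixel n
    homographies[n] : initially estimated J->I homography of superpixel n
    candidate_loss  : hypothetical callback returning the warping loss
                      associated with candidate neighbor n for superpixel j
    """
    others = [n for n in range(len(centers)) if n != j]
    others.sort(key=lambda n: math.dist(centers[j], centers[n]))
    neighbors = others[:n_neighbors]
    if not neighbors:
        return homographies[j]  # nothing to compare against
    best = min(neighbors, key=lambda n: candidate_loss(j, n))
    return homographies[best]

centers = [(0, 0), (1, 0), (2, 0), (50, 50)]
homographies = ["H0", "H1", "H2", "H3"]  # placeholders for 3x3 matrices
losses = {1: 12.0, 2: 3.0, 3: 0.5}
chosen = refine_homography(0, centers, homographies,
                           lambda j, n: losses[n], n_neighbors=2)
print(chosen)  # H2
```

With `n_neighbors=2` the distant superpixel 3 is never considered, even though its loss is lowest, which shows how the neighborhood size bounds the search.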

FIG. 5 is a diagram for explaining a method of synthesizing a plurality of images using a stitching algorithm according to an embodiment. Image synthesis according to an embodiment may include image stitching.

Referring to FIG. 5, a method of synthesizing a plurality of images according to an embodiment may include: obtaining homography matrices that transform a reference image including a plurality of feature points into a target image (510); obtaining warping residual vectors for the feature points based on the plurality of homography matrices (520); obtaining weights of the feature points for a reference region in the reference image based on the warping residual vectors (530); obtaining an estimated transformation matrix for the reference region based on the weights and the homography matrices (540); and synthesizing the reference image and the target image by applying the estimated transformation matrix to the reference region (550).

The reference image according to an embodiment may correspond to the image I described above, and the target image may correspond to the image J described above.

The homography matrices obtained in step 510 according to an embodiment may correspond to H^m described above. According to an embodiment, step 510 may include: extracting feature points from the reference image and the target image; matching the feature points of the reference image to the corresponding feature points of the target image; and obtaining the homography matrices based on the matching of the feature points.

Step 510 according to an embodiment may include obtaining a candidate matching set including matched feature point pairs based on the matching of the feature points, and repeating a transformation matrix estimation process on the candidate matching set while checking a predetermined stopping condition. Here, the candidate matching set may correspond to F_cand described above, and examples of the predetermined stopping condition may include ∥F^m_inlier∥ < 8, ∥F_cand∥/∥F_init∥ < 0.02, or completing five iterations, as described above.

The transformation matrix estimation process according to an embodiment corresponds to the multiple homography estimation method described above, and may include: estimating a first inlier matching set from the candidate matching set using an outlier removal algorithm; estimating a first transformation matrix based on the first inlier matching set; and removing the first inlier matching set from the candidate matching set. Here, the outlier removal algorithm may include the outlier removal method of MULTI-GS with the threshold η = 0.01 described above. The inlier matching sets may correspond to {F^m_inlier}, m = 1, ..., M, described above, and the estimated homographies may correspond to H^m described above.

Each of the plurality of transformation matrices according to an embodiment is a matrix that transforms a different region of the reference image. For example, referring to FIG. 3B, the first homography may correspond to a matrix that transforms the region containing the white building; referring to FIG. 3C, the second homography may correspond to a matrix that transforms the region containing the ground plane; and referring to FIG. 3D, the third homography may correspond to a matrix that transforms the region containing the blue object.

Step 520 according to an embodiment may correspond to obtaining a warping residual vector for each of the feature points. Here, the warping residual vector for a feature point may correspond to the warping residual vector r_k for the feature point x_k defined by Equation 9 above. More specifically, step 520 may include, for each of the feature points, computing warping residuals with respect to the corresponding feature point in the target image by applying each of the plurality of homography matrices, and obtaining a warping residual vector composed of those warping residuals. In other words, for each feature point x_k, the warping residual r^m_k between H^m x_k, obtained by applying each of the homography matrices, and the feature point y_k in the target image corresponding to x_k is computed, and the warping residual vector r_k = [r¹_k, r²_k, ..., r^M_k]^T, composed of the warping residuals computed by applying the M homographies {H^m}, m = 1, ..., M, to x_k, is obtained.

In step 530 according to an embodiment, the reference region is a region including at least one pixel of the reference image, and may correspond to one of a plurality of pixel regions, each including at least one pixel, generated by segmenting the reference image based on boundaries between subjects or objects in the reference image. According to an embodiment, the reference region may correspond to the superpixel described above.

Step 530 according to an embodiment may correspond to computing the weight of each of the plurality of feature points for the reference region. Step 530 may include obtaining, as the warping residual vector for the reference region, the warping residual vector of a feature point close to the center pixel of the reference region among the plurality of feature points, and obtaining the weight for the reference region based on the warping residual vector for the reference region and the warping residual vectors for the feature points. The weight for the reference region may correspond to the set of weights of the plurality of feature points for the reference region. Here, the reference region may correspond to the superpixel S_i described above; the weight of a feature point for the reference region may correspond to the weight w^prop_i,k for the feature point x_k, computed for the superpixel S_i as defined by Equation 10 above; and the weight for the reference region may correspond to W^prop_i, obtained by replacing w_i,k (k = 1, 2, ..., K) with w^prop_i,k in Equation 6 above.

The estimated transformation matrix obtained for the reference region in step 540 according to an embodiment may correspond to Ĥ_i obtained for the superpixel S_i by replacing W_i with W^prop_i in Equation 5 above.

Step 550 according to an embodiment may correspond to applying the estimated transformation matrix to the reference region and combining the transformed reference region with the corresponding region in the target image. That is, it may correspond to warping the reference image according to the estimated transformation matrix.

According to an embodiment, holes may occur in the synthesized image. To correct this, the method may include obtaining a first estimated inverse transformation matrix for a target region in the target image, and synthesizing the reference image and the target image by applying the first estimated inverse transformation matrix to the target region. Here, the target region may correspond to the superpixel S^J_j described above, and the first estimated inverse transformation matrix for the target region may correspond to Ĥ^J→I_j for the superpixel S^J_j described above. That is, the first estimated inverse transformation matrix may correspond to the estimated transformation matrix for the target region in the target image, obtained by performing the method of FIG. 5 with the reference image and the target image swapped.

According to an embodiment, synthesizing the reference image and the target image by applying the first estimated inverse transformation matrix to the target region may include synthesizing the reference image and the target image based on the pixels in the reference image that correspond to the target region to which the first estimated inverse transformation matrix is applied. That is, this step may correspond to inverse warping from the target image to the reference image.

According to an embodiment, when the reference image and the target image are synthesized by applying the first estimated inverse transformation matrix to the target region, occlusion may occur in some regions of the synthesized image. To remove the occlusion, the first estimated inverse transformation matrix according to an embodiment may be corrected. According to an embodiment, the correction may include: computing a warping loss for the target region based on the pixels included in the target region and the first estimated inverse transformation matrix; computing a warping distance for a target pixel by applying the first estimated inverse transformation matrix and the estimated transformation matrix to the center pixel of the target region; determining whether to correct the first estimated inverse transformation matrix based on the warping loss and the warping distance; and, based on that determination, correcting the first estimated inverse transformation matrix with a second estimated inverse transformation matrix that has a small warping loss among the estimated inverse transformation matrices for other regions included in the target image. Here, the warping loss for the target region may correspond to L(S^J_j) for the superpixel S^J_j in the image J according to Equation 11 above, and the warping distance for the target pixel may correspond to the bidirectional warping distance d(c_j) for the center pixel c_j of S^J_j according to Equation 12.

Correcting the first estimated inverse transformation matrix according to an embodiment may include: selecting candidate regions from among the regions included in the target image according to a predetermined criterion based on the target region; computing the warping loss for each of the candidate regions; and correcting the first estimated inverse transformation matrix with the second estimated inverse transformation matrix for one of the candidate regions based on the warping losses. For example, as described above, this may include correcting the estimated homography Ĥ^J→I_j for S^J_j by replacing it with the homography of the neighboring superpixel that has the minimum warping loss according to Equation 11 among the 100 superpixels closest to S^J_j.

According to an embodiment, the reference image and the target image may be synthesized by applying, to the target region, the second estimated inverse transformation matrix obtained by correcting the first estimated inverse transformation matrix. That is, synthesizing the images according to an embodiment may include synthesizing the reference image and the target image based on the pixels in the reference image that correspond to the target region to which the second estimated inverse transformation matrix is applied.

The embodiments described above may be implemented as hardware components, software components, and/or a combination of hardware and software components. For example, the devices, methods, and components described in the embodiments may be implemented using one or more general-purpose or special-purpose computers, such as a processor, a controller, an arithmetic logic unit (ALU), a digital signal processor, a microcomputer, a field programmable gate array (FPGA), a programmable logic unit (PLU), a microprocessor, or any other device capable of executing and responding to instructions. A processing device may run an operating system (OS) and one or more software applications executed on the operating system. The processing device may also access, store, manipulate, process, and generate data in response to the execution of software. For ease of understanding, the processing device is sometimes described as a single device, but a person of ordinary skill in the art will recognize that the processing device may include a plurality of processing elements and/or a plurality of types of processing elements. For example, the processing device may include a plurality of processors, or one processor and one controller. Other processing configurations, such as parallel processors, are also possible.

The software may include a computer program, code, instructions, or a combination of one or more of these, and may configure the processing device to operate as desired or may instruct the processing device independently or collectively. The software and/or data may be embodied permanently or temporarily in any type of machine, component, physical device, virtual equipment, computer storage medium or device, or transmitted signal wave, in order to be interpreted by the processing device or to provide instructions or data to the processing device. The software may also be distributed over networked computer systems and stored or executed in a distributed manner. The software and data may be stored on one or more computer-readable recording media.

The method according to the embodiment may be implemented in the form of program instructions executable through various computer means and recorded in a computer-readable medium. The computer-readable medium may include program instructions, data files, data structures, and the like, alone or in combination. The program instructions recorded on the medium may be specially designed and configured for the embodiment, or may be known and available to those skilled in computer software. Examples of computer-readable recording media include magnetic media such as hard disks, floppy disks, and magnetic tapes; optical media such as CD-ROMs and DVDs; magneto-optical media such as floptical disks; and hardware devices specially configured to store and execute program instructions, such as ROM, RAM, and flash memory. Examples of program instructions include not only machine code such as that produced by a compiler, but also high-level language code that can be executed by a computer using an interpreter or the like. The hardware devices described above may be configured to operate as one or more software modules to perform the operations of the embodiment, and vice versa.

Although the embodiments have been described above with reference to a limited set of drawings, a person of ordinary skill in the art can apply various technical modifications and variations based on the above. For example, appropriate results may be achieved even if the described techniques are performed in an order different from the described method, and/or if components of the described systems, structures, devices, circuits, and the like are coupled or combined in a form different from the described method, or are replaced or substituted by other components or equivalents.

Claims

A method of synthesizing a plurality of images, performed by a processor, the method comprising:
obtaining a plurality of homography matrices for converting a reference image including a plurality of feature points into a target image;
obtaining warping residual vectors for the feature points based on the plurality of homography matrices;
obtaining weights of the feature points for a reference region in the reference image based on the warping residual vectors;
obtaining an estimated transformation matrix for the reference region based on the weights and the homography matrices; and
synthesizing the reference image and the target image by applying the estimated transformation matrix to the reference region,
wherein obtaining the warping residual vectors for the feature points based on the plurality of homography matrices comprises,
for each of the feature points:
calculating a warping residual with respect to a corresponding feature point in the target image by applying each of the plurality of homography matrices; and
obtaining the warping residual vector composed of the warping residuals.
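
As an illustration of the residual-vector step recited in claim 1, the following is a minimal NumPy sketch. It assumes that the warping residual is the Euclidean distance between a warped reference feature point and its matched target feature point; the claim does not fix the distance measure, and the function names `apply_homography` and `warping_residual_vectors` are hypothetical.

```python
import numpy as np

def apply_homography(H, pts):
    """Map an (N, 2) array of points through a 3x3 homography."""
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # homogeneous coordinates
    mapped = pts_h @ H.T
    return mapped[:, :2] / mapped[:, 2:3]

def warping_residual_vectors(homographies, ref_pts, tgt_pts):
    """Return an (N, M) array for N feature points and M homographies.

    Row i is the warping residual vector of feature point i. Entry m is the
    distance (assumed Euclidean) between ref_pts[i] warped by the m-th
    homography and the matched point tgt_pts[i].
    """
    cols = [np.linalg.norm(apply_homography(H, ref_pts) - tgt_pts, axis=1)
            for H in homographies]
    return np.stack(cols, axis=1)

# Toy data: two homographies (identity and a pure shift) and two matches.
H_id = np.eye(3)
H_shift = np.array([[1.0, 0.0, 3.0],
                    [0.0, 1.0, 4.0],
                    [0.0, 0.0, 1.0]])
ref = np.array([[0.0, 0.0], [10.0, 5.0]])
tgt = np.array([[0.0, 0.0], [13.0, 9.0]])
R = warping_residual_vectors([H_id, H_shift], ref, tgt)
```

In this toy case the first point is explained by the identity and the second by the shift, so each row of `R` has a zero in a different column. Feature points lying on the same scene plane would be expected to show similar rows.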

The method of claim 1,
wherein synthesizing the reference image and the target image comprises:
obtaining a first estimated inverse transform matrix for a target region in the target image; and
synthesizing the reference image and the target image by applying the first estimated inverse transform matrix to the target region.

The method of claim 2, further comprising:
calculating a warping loss for the target region based on the pixels included in the target region and the first estimated inverse transform matrix;
calculating a warping distance for the target pixel by applying the first estimated inverse transform matrix and the estimated transformation matrix to a center pixel of the target region;
determining whether to correct the first estimated inverse transform matrix based on the warping loss and the warping distance; and
based on the determination, correcting the first estimated inverse transform matrix with a second estimated inverse transform matrix having a low warping loss among estimated inverse transform matrices for other regions included in the target image.
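
One possible reading of the warping distance in claim 3 is a round-trip error: the center pixel of the target region is mapped into the reference image by the inverse transform and then mapped back by the forward transform. The sketch below follows that reading. The round-trip interpretation, the rule requiring both quantities to exceed a threshold, the threshold values, and the names `warp_point`, `warping_distance`, and `needs_correction` are all assumptions for illustration and are not stated in the claim.

```python
import numpy as np

def warp_point(H, p):
    """Map a single 2D point through a 3x3 homography."""
    q = H @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]

def warping_distance(H_inv_target, H_fwd_ref, center):
    """Round-trip error of a center pixel (assumed interpretation).

    H_inv_target: estimated inverse transform (target -> reference).
    H_fwd_ref:    estimated transform (reference -> target).
    """
    ref_pos = warp_point(H_inv_target, center)  # target -> reference
    back = warp_point(H_fwd_ref, ref_pos)       # reference -> target
    return float(np.linalg.norm(back - np.asarray(center, dtype=float)))

def needs_correction(loss, dist, loss_thresh=20.0, dist_thresh=2.0):
    """Hypothetical rule: correct only when both quantities are large."""
    return loss > loss_thresh and dist > dist_thresh

# Toy check: a consistent pair of transforms gives zero round-trip error.
H_fwd = np.array([[1.0, 0.0, 3.0], [0.0, 1.0, 4.0], [0.0, 0.0, 1.0]])
H_inv_good = np.linalg.inv(H_fwd)
H_inv_bad = np.array([[1.0, 0.0, -3.0], [0.0, 1.0, -1.0], [0.0, 0.0, 1.0]])
d_ok = warping_distance(H_inv_good, H_fwd, (50.0, 60.0))
d_bad = warping_distance(H_inv_bad, H_fwd, (50.0, 60.0))
```

The warping loss itself is left as an input here because the claim only says it depends on the pixels of the target region and the inverse transform.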

The method of claim 3,
wherein correcting the first estimated inverse transform matrix comprises:
selecting candidate regions, according to a predetermined criterion, among regions included in the target image based on the target region;
calculating the warping loss for each of the candidate regions;
correcting the first estimated inverse transform matrix with a second estimated inverse transform matrix for any one of the candidate regions based on the warping loss;
applying the second estimated inverse transform matrix to the target region; and
synthesizing the reference image and the target image based on pixels in the reference image corresponding to the target region to which the second estimated inverse transform matrix is applied.

The method of claim 1,
wherein obtaining the homography matrices for converting the reference image including the plurality of feature points into the target image comprises:
extracting feature points from the reference image and the target image;
matching feature points of the target image corresponding to feature points of the reference image; and
obtaining the transformation matrices based on the matching of the feature points.

The method of claim 5, further comprising:
obtaining a candidate matching set including matched feature point pairs based on the matching of the feature points; and
repeating a transformation matrix estimation process while checking a predetermined stopping condition for the candidate matching set,
wherein the transformation matrix estimation process comprises:
estimating a first inlier matching set from the candidate matching set using an outlier removal algorithm;
estimating a first transformation matrix based on the first inlier matching set; and
removing the first inlier matching set from the candidate matching set.
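
The repeated estimation in claim 6 can be sketched as a sequential RANSAC loop: find the largest inlier set, fit a homography to it, remove those matches, and repeat. This is a minimal NumPy sketch under stated assumptions: RANSAC stands in for the unspecified outlier removal algorithm, the stopping condition (`min_inliers`, `max_models`) and the 1-pixel threshold are hypothetical, and the homography fit is a plain DLT without coordinate normalization.

```python
import numpy as np

def project(H, pts):
    """Map (N, 2) points through a 3x3 homography."""
    q = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    with np.errstate(divide="ignore", invalid="ignore"):
        return q[:, :2] / q[:, 2:3]

def fit_homography(src, dst):
    """Plain DLT fit from four or more point pairs."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    return vt[-1].reshape(3, 3)  # singular vector of the smallest singular value

def ransac_inliers(src, dst, rng, thresh=1.0, iters=200):
    """Return a boolean mask of the largest inlier set found."""
    best = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(src), 4, replace=False)
        H = fit_homography(src[idx], dst[idx])
        mask = np.linalg.norm(project(H, src) - dst, axis=1) < thresh
        if mask.sum() > best.sum():
            best = mask
    return best

def sequential_homographies(src, dst, min_inliers=8, max_models=5, seed=0):
    """Estimate one homography per inlier set, removing each set in turn."""
    rng = np.random.default_rng(seed)
    remaining = np.arange(len(src))  # indices still in the candidate set
    models = []
    while len(remaining) >= min_inliers and len(models) < max_models:
        mask = ransac_inliers(src[remaining], dst[remaining], rng)
        if mask.sum() < min_inliers:
            break  # assumed stopping condition: too few inliers left
        inl = remaining[mask]
        models.append(fit_homography(src[inl], dst[inl]))
        remaining = remaining[~mask]
    return models

# Toy data: two point groups that move with different translations.
data_rng = np.random.default_rng(1)
src_a = data_rng.uniform(0, 50, (20, 2))
src_b = data_rng.uniform(200, 250, (20, 2))
src = np.vstack([src_a, src_b])
dst = np.vstack([src_a + [30.0, 0.0], src_b + [0.0, 40.0]])
models = sequential_homographies(src, dst)
```

A production implementation would normally normalize coordinates before the DLT and might use a library estimator instead; the loop structure is the part that corresponds to the claim.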

The method of claim 1,
wherein each of the plurality of homography matrices transforms a different region included in the reference image.

(Deleted)

The method of claim 1, further comprising:
dividing the reference image into a plurality of pixel regions, each including at least one pixel, based on a boundary between subjects included in the reference image,
wherein the reference region is any one of the plurality of pixel regions.

The method of claim 9,
wherein the pixel region is a superpixel.

The method of claim 1,
wherein obtaining the weights of the feature points for the reference region comprises:
obtaining, as a warping residual vector for the reference region, the warping residual vector of a feature point close to a center pixel of the reference region among the plurality of feature points; and
obtaining a weight for the reference region based on the warping residual vector for the reference region and the warping residual vectors for the feature points.
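
The weighting step in claim 11 can be illustrated as follows. The sketch takes the residual vector of the feature point nearest to the region center as the region's residual vector, as the claim recites, and then weights every feature point by how similar its residual vector is. The Gaussian kernel and the bandwidth `sigma` are assumptions; the claim only says the weight is based on the two kinds of residual vectors. The function name `region_weights` is hypothetical.

```python
import numpy as np

def region_weights(center, feat_pts, residual_vecs, sigma=5.0):
    """Weights of all feature points for one reference region.

    center:        (2,) center pixel of the reference region.
    feat_pts:      (N, 2) feature point locations in the reference image.
    residual_vecs: (N, M) warping residual vectors of the feature points.
    """
    # Residual vector of the region: that of the nearest feature point.
    nearest = np.argmin(np.linalg.norm(feat_pts - center, axis=1))
    r_region = residual_vecs[nearest]
    # Assumed Gaussian kernel on the distance between residual vectors.
    d2 = np.sum((residual_vecs - r_region) ** 2, axis=1)
    return np.exp(-d2 / sigma ** 2)

# Toy data: two features on one plane, one on another.
feat_pts = np.array([[10.0, 10.0], [12.0, 11.0], [80.0, 75.0]])
residual_vecs = np.array([[0.5, 9.0], [0.7, 8.5], [9.2, 0.4]])
w = region_weights(np.array([11.0, 10.0]), feat_pts, residual_vecs)
```

With these numbers the two nearby features, whose residual vectors are similar, receive weights close to 1, while the third feature is strongly down-weighted even though spatial distance is not used directly in the kernel.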

A computer program stored in a medium for executing, in combination with hardware, the method of any one of claims 1 to 7 and 9 to 11.

A device for synthesizing a plurality of images, the device comprising:
at least one processor configured to obtain a plurality of homography matrices for converting a reference image including a plurality of feature points into a target image, obtain warping residual vectors for the feature points based on the plurality of homography matrices, obtain weights of the feature points for a reference region in the reference image based on the warping residual vectors, obtain an estimated transformation matrix for the reference region based on the weights and the homography matrices, and synthesize the reference image and the target image by applying the estimated transformation matrix to the reference region,
wherein the processor,
in obtaining the warping residual vectors for the feature points,
calculates, for each of the feature points, a warping residual with respect to a corresponding feature point in the target image by applying each of the plurality of homography matrices, and obtains, for each of the feature points, the warping residual vector composed of the warping residuals.

The device of claim 13,
wherein the processor,
in synthesizing the reference image and the target image, obtains a first estimated inverse transform matrix for a target region in the target image and synthesizes the reference image and the target image by applying the first estimated inverse transform matrix to the target region, calculates a warping loss for the target region based on pixels included in the target region and the first estimated inverse transform matrix, calculates a warping distance for the target pixel by applying the first estimated inverse transform matrix and the estimated transformation matrix to a center pixel of the target region, determines whether to correct the first estimated inverse transform matrix based on the warping loss and the warping distance, and, based on the determination, corrects the first estimated inverse transform matrix with a second estimated inverse transform matrix having a low warping loss among estimated inverse transform matrices for other regions included in the target image.

The device of claim 14,
wherein the processor,
in correcting the first estimated inverse transform matrix, selects candidate regions, according to a predetermined criterion, among regions included in the target image based on the target region, calculates the warping loss for each of the candidate regions, corrects the first estimated inverse transform matrix with a second estimated inverse transform matrix for any one of the candidate regions based on the warping loss, applies the second estimated inverse transform matrix to the target region, and synthesizes the reference image and the target image based on pixels in the reference image corresponding to the target region to which the second estimated inverse transform matrix is applied.

The device of claim 13,
wherein the processor,
in obtaining the homography matrices for converting the reference image including the plurality of feature points into the target image, extracts feature points from the reference image and the target image, matches feature points of the target image corresponding to feature points of the reference image, obtains a candidate matching set including matched feature point pairs based on the matching of the feature points, and repeats a transformation matrix estimation process while checking a predetermined stopping condition for the candidate matching set,
wherein the transformation matrix estimation process comprises:
estimating a first inlier matching set from the candidate matching set using an outlier removal algorithm;
estimating a first transformation matrix based on the first inlier matching set; and
removing the first inlier matching set from the candidate matching set.

(Deleted)

The device of claim 13,
wherein the processor
divides the reference image into a plurality of superpixels, each including at least one pixel, based on a boundary between subjects included in the reference image,
and the reference region is any one of the plurality of superpixels.

The device of claim 13,
wherein the processor,
in obtaining the weights of the feature points for the reference region, obtains, as a warping residual vector for the reference region, the warping residual vector of a feature point close to the center pixel of the reference region among the plurality of feature points, and obtains a weight for the reference region based on the warping residual vector for the reference region and the warping residual vectors for the feature points.