KR20210108283A

KR20210108283A - A method of improving the quality of 3D images acquired from RGB-depth camera

Info

Publication number: KR20210108283A
Application number: KR1020200025150A
Authority: KR
Inventors: 서영호; 김동욱; 김경진; 박병서
Original assignee: 광운대학교 산학협력단
Priority date: 2020-02-25
Filing date: 2020-02-28
Publication date: 2021-09-02
Also published as: KR102327304B1

Abstract

The present invention relates to a method for improving the quality of a three-dimensional image obtained from a depth image camera, which processes three-dimensional images acquired by texture-depth (RGB-depth) cameras in a multi-view camera system. The method of the present invention comprises the steps of: (a) obtaining multi-view depth images and texture images; (b) generating a binary image based on the brightness values of the texture image, and generating a mask image from the generated binary image; (c) filtering the texture image by applying the mask image to it; (d) performing a first correction by adding an alpha channel to the filtered texture image; (e) filtering the depth image with the mask image; (f) correcting the filtered depth image by applying time-average filtering; (g) performing a second correction on the first-corrected texture image using a histogram; (h) generating point cloud data from the corrected multi-view depth images, and correcting it by removing surface noise; and (i) constructing a curved surface from the corrected point cloud data. In accordance with the present invention, a more accurate three-dimensional image can be obtained by reducing noise in the depth image, illumination and boundary noise between adjacent color images, noise in the point cloud data, noise in the surface data, and so on.

Description

{ A method of improving the quality of 3D images acquired from RGB-depth camera }

The present invention relates to a method for improving the quality of a 3D image obtained from a depth image camera. The method processes 3D images acquired through texture-depth (RGB-Depth) cameras in a multi-view camera system and improves their quality through processes such as noise reduction in the depth images, illumination compensation and boundary interpolation between adjacent texture (color) images, noise reduction in the point cloud data converted from the depth images, and noise reduction in the mesh (surface) data reconstructed from the point cloud data.

In general, technologies for detecting and recognizing 3D spaces and 3D objects are becoming increasingly important in the fields of computer vision, robotics, and augmented reality. In particular, image sensors based on the Microsoft Kinect approach make it possible to acquire RGB images and depth images in real time, which has brought many changes to research on object detection, tracking, and recognition [Non-Patent Documents 1 and 2].

FIG. 1 shows the overall process of recognizing a 3D object from RGB-D data (an RGB image and a depth image). As shown in FIG. 1, a depth image (depth frame) is acquired from a depth camera, and a texture image is acquired from an RGB camera (or texture camera). The 2D images, namely the depth image and the texture image, are processed, and a point cloud is extracted from them. A mesh is then generated from the point cloud. Through this process, a 3D object composed of a 3D mesh is finally extracted.

However, the following problems must be solved in the process shown in FIG. 1.

First, there is the illumination noise caused by the structured light used to acquire the depth image, and the boundary noise generated when images are synthesized by DIBR.

Specifically, a structured-light depth camera consists of an ordinary camera and a structured-light projector placed at a fixed viewing-angle offset from it. To distinguish the structured light from ambient light, a red (660 nm) visible-light laser or an infrared (780 nm) laser is commonly used as the light source. A representative way of converting a point laser into a structured pattern is to project the infrared laser onto a pattern engraved on glass using a diffractive optical element. This infrared structured-light approach can introduce large errors into the acquired depth image because of reflection, blurring, and frequency overlap of the structured light projected onto the object, as well as ambient illumination noise.

In addition, the boundary noise that arises during image synthesis using DIBR (Depth Image Based Rendering) is produced when pixels that originally belonged to the foreground region are scattered into the background. It is mainly caused by boundary mismatch between the reference image and the depth map, or by blurring in the reference image. This boundary noise is a major cause of degraded 3D image quality.

Beyond these causes, the illumination noise and boundary noise fluctuate across all the depth frames that the RGB-Depth camera acquires along the time axis. This is the temporal noise problem.

Next is the illumination mismatch caused by the multi-view cameras. The biggest factor behind illumination mismatch is that the cameras of a multi-view camera system are at different positions. Because each camera has a different viewpoint, the lighting environment differs between the viewpoint images, and even for the same object the appearance and the amount of reflected light in the acquired image may differ by viewpoint. Illumination mismatches that occur on the same object between viewpoints are called local illumination mismatches. Illumination mismatch lowers the correlation between images of adjacent viewpoints and therefore degrades quality when those images are reconstructed into surface data.

There is also the problem of removing surface noise from point cloud data. Infrared time-of-flight measurement, structured-light imaging, and similar methods have the drawback that measurement precision varies with the distance between the camera and the object and with the shape of the object. All of these noise sources act in combination, so when a depth image is converted into point cloud data it is very difficult to identify the noise on surfaces and boundaries.

In addition, noise also arises in the surface data when a polygonal mesh or the like is constructed from the point cloud data of an object.

Therefore, a 3D image generation technique that solves the above problems is needed.

[Non-Patent Document 1] W. Lee, N. Park, W. Woo, "Depth-assisted 3D Object Detection for Augmented Reality," International Conference on Artificial reality and Telexistence, pp. 126-132, 2011.
[Non-Patent Document 2] Park, Y., Lepetit, V., Woo, W., "Texture-less object tracking with online training an RGB-D camera", Int. Symp. Mixed and Augmented Reality (ISMAR), pp. 121-126 (2011)

An object of the present invention is to solve the problems described above by providing a method for improving the quality of a 3D image obtained from a depth image camera. The method generates a 3D object from depth and texture images acquired through texture-depth (RGB-D) cameras in a multi-view camera system, and performs processes such as noise reduction in the depth images, illumination compensation and boundary interpolation between adjacent color (texture) images, noise reduction in the point cloud data converted from the depth images, and noise reduction in the mesh (surface) data reconstructed from the point cloud data.

To achieve the above object, the present invention relates to a method for improving the quality of a 3D image obtained from a depth image camera, comprising the steps of: (a) obtaining multi-view depth images and texture images; (b) generating a binary image based on the brightness values of the texture image, and generating a mask image from the generated binary image; (c) filtering the texture image by applying the mask image to it; (d) performing a first correction by adding an alpha channel to the filtered texture image; (e) filtering the depth image with the mask image; (f) correcting the filtered depth image by applying time-average filtering; (g) performing a second correction on the first-corrected texture image using a histogram; (h) generating point cloud data from the corrected multi-view depth images and correcting it by removing surface noise; and (i) constructing a curved surface from the corrected point cloud data.

Further, in the method for improving the quality of a 3D image obtained from a depth image camera, in step (b), noise in the binary image is removed by repeatedly applying the closing operation, one of the morphological operations, to the binary image.

Further, in the method for improving the quality of a 3D image obtained from a depth image camera, in step (c), the brightness values of the mask region of the texture image are kept, and the brightness values of the region outside the mask are set to the minimum value.

Further, in the method for improving the quality of a 3D image obtained from a depth image camera, in step (d), a transparency (alpha) channel is added to the filtered texture image, and the alpha value of regions of the texture image beyond a specific distance is set to 0, where a region beyond the specific distance is a region whose depth value in the corresponding depth image is greater than or equal to a preset threshold distance.

Further, in the method for improving the quality of a 3D image obtained from a depth image camera, in step (e), the depth values of the mask region of the depth image are kept, and the depth values of the region outside the mask are set to the maximum value.

Further, in the method for improving the quality of a 3D image obtained from a depth image camera, in step (f), time-average filtering is performed only on the filtered region of the depth image that corresponds to the mask region.

Further, in the method for improving the quality of a 3D image obtained from a depth image camera, in step (f), the pixel values at the same position in the current frame of the depth image and in at least one previous frame are averaged, and the pixel value of the depth image is corrected to the averaged value.
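Steps (e) and (f) can be sketched as follows. This is a minimal illustration in plain Python on nested lists, not the patented implementation: the function names, the `far_value` parameter, and the choice to pass the frame history as a list (oldest first, current frame last) are assumptions made for the example.

```python
def mask_depth(depth, mask, far_value):
    """Step (e): keep depth inside the mask, push everything else to far_value."""
    return [
        [d if m else far_value for d, m in zip(depth_row, mask_row)]
        for depth_row, mask_row in zip(depth, mask)
    ]


def temporal_average(frames, mask):
    """Step (f): average co-located pixels over the given frames.

    frames: list of depth frames, oldest first, current frame last.
    Only pixels inside the mask are averaged; the rest keep the
    value they have in the current frame.
    """
    current = frames[-1]
    count = len(frames)
    result = []
    for y, row in enumerate(current):
        out_row = []
        for x, value in enumerate(row):
            if mask[y][x]:
                out_row.append(sum(frame[y][x] for frame in frames) / count)
            else:
                out_row.append(value)
        result.append(out_row)
    return result


# Hypothetical 1x2 depth frames (values in arbitrary depth units).
mask = [[255, 0]]
history = [
    mask_depth([[100, 40]], mask, far_value=9999),
    mask_depth([[104, 42]], mask, far_value=9999),
    mask_depth([[102, 41]], mask, far_value=9999),
]
smoothed = temporal_average(history, mask)
```

Restricting the average to the mask region avoids spending work on background pixels that were already forced to the far value.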

Further, in the method for improving the quality of a 3D image obtained from a depth image camera, in step (g), the texture image is corrected by performing histogram specification.
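Histogram specification, in its textbook form, remaps the gray levels of one image so that its cumulative histogram follows that of a reference image. The sketch below shows that general technique for a single channel given as a flat list of integer levels; the source does not state which view serves as the reference or how channels are handled, so those choices and the function names are assumptions.

```python
def cumulative_histogram(values, levels=256):
    """Normalized cumulative histogram (CDF) of integer gray levels."""
    hist = [0] * levels
    for v in values:
        hist[v] += 1
    total = len(values)
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running / total)
    return cdf


def histogram_specification(source, reference, levels=256):
    """Map each source level to the first reference level whose CDF reaches it."""
    src_cdf = cumulative_histogram(source, levels)
    ref_cdf = cumulative_histogram(reference, levels)
    lookup = []
    j = 0
    # Both CDFs are non-decreasing, so j only ever moves forward.
    for s in src_cdf:
        while j < levels - 1 and ref_cdf[j] < s:
            j += 1
        lookup.append(j)
    return [lookup[v] for v in source]


# Hypothetical channel values from two adjacent views.
dark_view = [10, 10, 20, 30]
bright_view = [100, 100, 110, 120]
matched = histogram_specification(dark_view, bright_view)
```

Applied per channel to adjacent views, a mapping like this pulls their brightness distributions together, which is the kind of illumination compensation step (g) aims for.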

Further, in the method for improving the quality of a 3D image obtained from a depth image camera, in step (h), the point cloud data is clustered and a normal vector is computed for each point; points whose correlation with neighboring points, obtained from the dot product of the normal vectors of adjacent points, is higher than a predetermined reference value (hereinafter, boundary points) are detected; and, with respect to the boundary formed by the detected boundary points, points outside the boundary region are deleted.
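The normal-vector comparison in step (h) can be illustrated with a small sketch. It assumes unit normals and neighbor index lists have already been computed elsewhere (the source does not say how), scores each point by the mean dot product with its neighbors, and selects points whose score exceeds the reference value, following the wording of the text. All names here are hypothetical, and the clustering and the final deletion of points outside the boundary are not shown.

```python
def dot(a, b):
    """Dot product of two 3D vectors."""
    return sum(x * y for x, y in zip(a, b))


def neighbor_correlation(normals, neighbors):
    """Mean dot product between each point's normal and its neighbors' normals."""
    scores = []
    for i, neighbor_ids in enumerate(neighbors):
        total = sum(dot(normals[i], normals[j]) for j in neighbor_ids)
        scores.append(total / len(neighbor_ids))
    return scores


def select_boundary_points(normals, neighbors, reference):
    """Indices of points whose correlation score exceeds the reference value."""
    scores = neighbor_correlation(normals, neighbors)
    return [i for i, score in enumerate(scores) if score > reference]


# Hypothetical unit normals for three points and their neighbor lists.
normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (1.0, 0.0, 0.0)]
neighbors = [[1, 2], [0, 2], [0, 1]]
selected = select_boundary_points(normals, neighbors, reference=0.25)
```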

Further, in the method for improving the quality of a 3D image obtained from a depth image camera, in step (i), the curved surface is constructed with a correction based on the distance between vertices: for each face of the surface, a face is formed only when one of the at least two edges opposite the face is less than or equal to a threshold; and the minimum (min) and maximum (max) values of the x, y, and z coordinates of the vertices of each face are computed, and a threshold is applied to the volume of the resulting AABB (axis-aligned bounding box) to restrict the output.
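The AABB part of step (i) can be sketched as below: take the min and max of the x, y, and z coordinates of a face's vertices, multiply the three extents to get the box volume, and drop faces whose volume exceeds a threshold. The function names and the `max_volume` parameter are assumptions for the example, and the edge-length test from the same step is not included.

```python
def aabb_volume(vertices):
    """Volume of the axis-aligned bounding box around a set of 3D vertices."""
    xs, ys, zs = zip(*vertices)
    return (max(xs) - min(xs)) * (max(ys) - min(ys)) * (max(zs) - min(zs))


def filter_faces_by_aabb(points, faces, max_volume):
    """Keep only faces whose bounding-box volume is at most max_volume."""
    kept = []
    for face in faces:
        if aabb_volume([points[i] for i in face]) <= max_volume:
            kept.append(face)
    return kept


# Hypothetical vertices: three close together and one far outlier.
points = [(0, 0, 0), (1, 0, 0), (0, 1, 0.5), (10, 10, 10)]
faces = [(0, 1, 2), (0, 1, 3)]
clean_faces = filter_faces_by_aabb(points, faces, max_volume=1.0)
```

A face stretched toward a stray vertex produces a large box, so a volume threshold is a cheap way to reject it without inspecting every edge.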

The present invention also relates to a computer-readable recording medium on which a program for performing the method for improving the quality of a 3D image obtained from a depth image camera is recorded.

As described above, the method for improving the quality of a 3D image obtained from a depth image camera according to the present invention reduces noise in the depth image, illumination and boundary noise between adjacent color images, noise in the point cloud data, and noise in the surface data, and thereby makes it possible to obtain a more accurate 3D image.

FIG. 1 is a flowchart illustrating a conventional method of generating a 3D object from texture and depth images.
FIG. 2 is a diagram showing the configuration of the overall system for implementing the present invention.
FIG. 3 is a flowchart illustrating a method for improving the quality of a 3D image obtained from a depth image camera according to an embodiment of the present invention.
FIG. 4 is a diagram illustrating the process of generating a mask image and masking a texture image according to an embodiment of the present invention.
FIG. 5 is an example of an image from which illumination noise has been removed by applying a mask according to an embodiment of the present invention.
FIG. 6 is an example of an image from which boundary noise has been removed by applying a mask according to an embodiment of the present invention.
FIG. 7 is a graph showing the change over time in the value of one pixel before and after time-average filtering according to an embodiment of the present invention.
FIG. 8 is a graph showing the change in the values of the 160th column before and after filtering according to an embodiment of the present invention.
FIG. 9 is an example of a color image or texture image to which illumination compensation has been applied according to an embodiment of the present invention.
FIG. 10 is a detailed flowchart illustrating the point cloud data generation and correction step according to an embodiment of the present invention.
FIG. 11 is a diagram illustrating a method for identifying surface noise in point cloud data according to an embodiment of the present invention.
FIG. 12 is a detailed flowchart illustrating the surface data generation step according to an embodiment of the present invention.
FIG. 13 is an example showing the types of noise in surface data according to an embodiment of the present invention.
FIG. 14 is an example of an image of the vertices constituting a mesh according to an embodiment of the present invention.
FIG. 15 is an example of constructing an AABB box according to an embodiment of the present invention.
FIG. 16 is an example of an image in which noisy faces are identified by computing the volume of the AABB box according to an embodiment of the present invention.
FIG. 17 is an example in which the noise of the surface data of FIG. 13 has been reduced according to an embodiment of the present invention.

Hereinafter, specific details for carrying out the present invention are described with reference to the drawings.

In describing the present invention, the same parts are denoted by the same reference numerals, and repeated descriptions are omitted.

First, examples of the configuration of the overall system for implementing the present invention are described with reference to FIG. 2.

As shown in FIG. 2, the method for improving the quality of a 3D image obtained from a depth image camera according to the present invention may be implemented as a program system on a computer terminal (30) that receives multi-view depth and texture images (60) captured by a camera system (20) and generates a 3D object or 3D image. That is, the method may be configured as a program that is installed and executed on the computer terminal (30). The program installed on the computer terminal (30) may operate as a single program system (40).

As another embodiment, instead of being configured as a program that runs on a general-purpose computer, the method may be implemented as a single electronic circuit such as an ASIC (application-specific integrated circuit). Alternatively, it may be developed as a dedicated computer terminal (30) that exclusively handles the registration of point clouds from multi-view depth and color images. This is referred to as the 3D image quality improvement system (40). Other possible forms may also be implemented.

The camera system (20) consists of a plurality of RGB-D cameras, or texture-depth cameras, (21) that capture the object (10) from different viewpoints.

Each RGB-D camera (21) includes a depth camera and a texture camera (or RGB camera, color camera). The texture camera is an ordinary RGB camera and acquires a texture image (or color image) (62) of the object (10).

The depth camera measures the depth of the object (10) and outputs the measured depth information as a depth image (61). In particular, the depth camera is a structured-light depth camera whose projector is at a fixed viewing-angle offset from an ordinary camera. That is, a laser with a fixed pattern is projected onto the object, and the depth is measured by detecting the distorted pattern of the reflected light. Preferably, the structured light uses a red (660 nm) visible-light laser or an infrared (780 nm) laser. As an example, the structured light may be formed by projecting an infrared laser onto a pattern engraved on glass using a diffractive optical element.

The multi-view depth images (61) and texture images (62) captured by the camera system (20) are input directly to the computer terminal (30), stored, and processed by the 3D image quality improvement system (40). Alternatively, the multi-view depth images (61) and texture images (62) may be stored in advance on a storage medium of the computer terminal (30), and the 3D image quality improvement system (40) may read the stored depth-texture images (60) as its input.

A video consists of frames that are consecutive in time. For example, if the frame at the current time t is called the current frame, the frame at the immediately preceding time t-1 is called the previous frame, and the frame at t+1 is called the next frame. Each frame has a texture image (or color image) and a depth image (or depth information).

In particular, the object (10) is captured from as many different viewpoints as there are multi-view RGB-D cameras (21), so that at a given time t, as many multi-view depth and texture images (61, 62) are acquired as there are cameras.

The depth image (61) and the texture image (62) consist of frames that are consecutive in time. One frame has one image. The images (61, 62) may also have only a single frame (or image). That is, the method also applies when each of the images (61, 62) is a single still image.

Generating a 3D image or object from multi-view depth and texture video means performing the generation on each depth/texture frame (or image). Hereinafter, however, the terms "video" and "image" are used interchangeably unless a distinction is specifically needed.

Next, a method for improving the quality of a 3D image obtained from a depth image camera according to an embodiment of the present invention is described with reference to FIG. 3.

First, multi-view depth images and texture images (or color images) are acquired from the camera system (20) (S10). The multi-view depth and texture images are captured by a plurality of RGB-D cameras (21) installed at different viewpoints.

Preferably, the texture image (or color image) is an RGB image. The depth image and the texture image captured by each RGB-D camera (21) are captured from the same viewpoint. The depth image is acquired using structured light.

There are M viewpoints (M being at least 2), and a texture image and a depth image are input for each of the M viewpoints. Each image is input as a sequence of frames that are consecutive in time.

Next, a binary image of the texture image is generated and used to create a mask image (S20). That is, a binary image of the texture image is generated and corrected, and the mask image is generated from the corrected binary image. Specifically, a threshold is set on the brightness values of the texture image, and the texture image is binarized with this threshold to produce the binary image. As an example, the threshold is set to a brightness value of 126, or the threshold range to 126 to 255. That is, the regions whose brightness is at or above the threshold of 126, or within the threshold range of 126 to 255, are binarized.

The binary image is then corrected by repeatedly applying the closing operation, one of the morphological operations, so that noise in the binary image is removed. That is, areas lost during binarization because of light reflected from skin and the like (holes in the binary image, for example) are restored by repeating the closing operation.

A mask image is then generated from the corrected binary image. That is, after binarization, the binary image is inverted to produce the mask image. The mask image has the value 255 inside the outline of the object and 0 outside it. This binarization scheme and mask convention are only examples, and various other forms are possible.
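The three operations of step S20 (thresholding at 126, repeated closing, inversion) can be sketched in plain Python on a grayscale image stored as nested lists. This is an illustration under assumptions, not the patent's implementation: the 3x3 structuring element, the border handling (out-of-image pixels are ignored), and the function names are choices made for the example, and an image library would normally be used instead.

```python
def binarize(gray, threshold=126):
    """255 where brightness is at or above the threshold, 0 elsewhere."""
    return [[255 if v >= threshold else 0 for v in row] for row in gray]


def _morph(image, pick):
    """Apply pick (max = dilation, min = erosion) over each 3x3 neighborhood."""
    height, width = len(image), len(image[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            window = [
                image[j][i]
                for j in range(max(0, y - 1), min(height, y + 2))
                for i in range(max(0, x - 1), min(width, x + 2))
            ]
            row.append(pick(window))
        out.append(row)
    return out


def closing(image, iterations=1):
    """Morphological closing: dilation followed by erosion, repeated."""
    for _ in range(iterations):
        image = _morph(_morph(image, max), min)
    return image


def make_mask(gray, threshold=126, iterations=1):
    """Binarize, close small holes, then invert to obtain the mask."""
    binary = closing(binarize(gray, threshold), iterations)
    return [[255 - v for v in row] for row in binary]


# Hypothetical 5x7 image: dark object on the left, bright background on the
# right, and one dark speck inside the background.
gray = [[50, 50, 50, 200, 200, 200, 200] for _ in range(5)]
gray[2][5] = 100
mask = make_mask(gray)
```

In this toy image the object is darker than the background, so inverting the closed binary image yields 255 on the object, matching the mask convention in the text. For a different scene the polarity would have to be chosen accordingly.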

Generating the mask image in this step amounts to separating the background from the foreground in the texture image. An example of a mask image generated in this way is shown in the "Mask Generation" image of FIG. 4.

Next, the original texture image (or original color image) is filtered (masked) with the mask image so that everything outside the mask region is excluded from the texture image (S31).

That is, the texture image (or original texture image) is masked with the mask image: the mask region of the texture image keeps the original image, and the brightness of the region outside the mask is set to the minimum value. In other words, the region outside the mask is set to 0.
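The masking of step S31 reduces to a per-pixel selection. A minimal sketch, assuming the texture is a nested list of (R, G, B) tuples and the mask uses 255 for the object and 0 elsewhere (the function name and the `fill` parameter are hypothetical):

```python
def apply_mask_to_texture(texture, mask, fill=(0, 0, 0)):
    """Keep pixels where the mask is non-zero; set all others to fill."""
    return [
        [pixel if m else fill for pixel, m in zip(tex_row, mask_row)]
        for tex_row, mask_row in zip(texture, mask)
    ]


# Hypothetical 1x2 texture image and its mask.
texture = [[(10, 20, 30), (40, 50, 60)]]
mask = [[255, 0]]
masked = apply_mask_to_texture(texture, mask)
```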

Next, a transparency (alpha) channel is added to the masked texture image, and the alpha value of regions beyond a specific distance is set to 0 (S32).

First, a transparency (alpha) channel is added to the texture image. That is, an A channel for transparency is added to the texture image in addition to the RGB channels. The default value of the alpha channel is set to 1 (fully opaque). Transparency takes values from 0 to 1, where 0 is fully transparent and 1 is fully opaque.

The depth image and the texture image are then aligned, that is, registered so that their pixel positions correspond. The depth map and the texture image captured by an RGB-D camera differ in their degree of distortion depending on the resolution and on the lens aperture of the camera, so this step brings the two into agreement.

For a specific region or pixel of the depth image, if its depth value (Z value) is greater than or equal to a preset threshold distance, the alpha channel value of the corresponding region or pixel of the texture image is set to fully transparent, that is, 0. The object is nearer than the threshold distance and the background is farther, so the object can be extracted using the threshold distance. In other words, the alpha channel of the background region is made transparent.
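Step S32 can be sketched as follows, assuming the depth and texture images are already aligned pixel for pixel and that depth is given in metres. The function name `add_alpha_by_depth`, the sample values, and the 1 m threshold are illustrative assumptions.

```python
def add_alpha_by_depth(texture, depth, threshold):
    """Return RGBA rows: alpha is 1.0 by default, 0.0 where depth >= threshold."""
    return [
        [(r, g, b, 0.0 if z >= threshold else 1.0)
         for (r, g, b), z in zip(t_row, d_row)]
        for t_row, d_row in zip(texture, depth)
    ]


texture = [[(200, 10, 10), (10, 200, 10)]]
depth = [[0.6, 2.4]]  # metres; hypothetical values (object near, background far)
rgba = add_alpha_by_depth(texture, depth, threshold=1.0)
```

The near pixel stays opaque and the far pixel becomes fully transparent, so only the object remains visible.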

In another embodiment, a depth binary image (or depth mask image) is generated from the depth image using the threshold distance, and the alpha channel image is masked with this depth binary image or depth mask image.

The texture image correction process described above (S20, S31, S32) is illustrated in FIG. 4. Through the process of FIG. 4, illumination noise and boundary noise are removed from the texture image. FIGS. 5 and 6 show example images from which illumination noise and boundary noise, respectively, have been removed by applying the mask.

This is illustrated in the first and third images of FIG. 5. The small image at the lower right of the first image of FIG. 5 is the depth image. When the texture image is aligned with and mapped onto this depth image, the result is the first large image of FIG. 5. Because no distance threshold (threshold distance) has been applied to the first image, the partition in the background appears as well. When only objects within 1 m are output, the partition is removed, as in the second image of FIG. 5.

Further, when only the color image mask is applied to the second image of FIG. 5, the data still exists in the depth information, but the mask applied to the texture image sets the transparency value to 0 there, so the illumination noise disappears from the displayed image, as in the third image of FIG. 5. The mask can be applied to the depth map in the same way; in that case the noise is not merely made transparent, but the noise data is actually deleted from the depth information.

That is, the mask is applied both to the depth image and to the texture image. Applied to the color image, it makes the noise transparent; applied to the depth information, it deletes the noise data.

For multi-view images, if there are M views of depth and texture images, the above process is performed for each of the M images.

Next, preprocessing such as median blurring is performed on the depth image (S41).

Object padding is the process that previously performed the erosion (erode) operation. Three approaches are available: finding the outline and padding it to thicken the boundary before clipping, eroding inward with an erode operation, and masking. The present invention uses the masking approach throughout.

Median blurring makes the data more uniform. The median method as used here takes an average value over a 3×3 window.
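For reference, a conventional 3×3 median filter replaces each pixel with the median of its window rather than the mean; the sketch below follows that conventional definition, which is an assumption about the intended operation. Border pixels use only the neighbours that exist, and the function name `median_blur_3x3` is hypothetical.

```python
from statistics import median


def median_blur_3x3(depth):
    """3x3 median filter; border pixels use only the neighbours that exist."""
    h, w = len(depth), len(depth[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            window = [depth[j][i]
                      for j in range(max(0, y - 1), min(h, y + 2))
                      for i in range(max(0, x - 1), min(w, x + 2))]
            out[y][x] = median(window)
    return out


depth = [[10, 10, 10],
         [10, 99, 10],  # 99 is an isolated spike
         [10, 10, 10]]
blurred = median_blur_3x3(depth)
```

The isolated spike at the centre is replaced by the value of its neighbours, whereas a plain 3×3 mean would only spread it out. If averaging is what is intended, `median(window)` can be swapped for `sum(window) / len(window)`.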

Next, the depth image is filtered with the mask image obtained earlier in step S20 (S42).

That is, in the depth image the mask region (the region of the depth image corresponding to the mask region of the mask image) is kept and the region outside the mask is excluded. Preferably, the depth value of the region outside the mask is set to the largest value.
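A minimal sketch of this depth masking, assuming a 16-bit depth map so that the "largest value" is 65535; that constant and the function name `mask_depth` are assumptions made for illustration.

```python
MAX_DEPTH = 65535  # assumed: largest value of a 16-bit depth map


def mask_depth(depth, mask, far_value=MAX_DEPTH):
    """Keep depth where mask == 255; push everything else to the far value."""
    return [
        [z if m == 255 else far_value for z, m in zip(d_row, m_row)]
        for d_row, m_row in zip(depth, mask)
    ]


depth = [[800, 2500],
         [790, 810]]  # hypothetical raw depth values
mask = [[255, 0],
        [255, 255]]
filtered = mask_depth(depth, mask)
```

Setting excluded pixels to the far value means that a later distance threshold also rejects them.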

Next, temporal average filtering is applied to the depth image (S50). In particular, temporal average filtering is performed only on the filtered depth image; in other words, only within the mask region of the depth image.

Preferably, for the current frame of the depth image (the frame at time t), at least M (preferably four) frames from earlier times (the frames at times t-4, t-3, t-2, t-1, and so on) are referenced and the average is taken. Specifically, the pixel values at the same position are averaged.

The current frame is included as well. For example, filtering uses a total of five depth images: the current frame and the four preceding frames. Each pixel of the final depth image is the average of the corresponding pixels of the five depth images.
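The temporal averaging of step S50 can be sketched as below. The class name `TemporalAverager` is hypothetical, and one behaviour is an assumption not stated in the text: while fewer than five frames have arrived, the average is taken over the frames available so far.

```python
from collections import deque


class TemporalAverager:
    """Average each masked pixel over the current and previous frames."""

    def __init__(self, history=4):
        # Holds up to `history` previous frames plus the current frame.
        self.frames = deque(maxlen=history + 1)

    def update(self, depth, mask):
        self.frames.append(depth)
        n = len(self.frames)
        h, w = len(depth), len(depth[0])
        return [
            [sum(f[y][x] for f in self.frames) / n if mask[y][x] == 255
             else depth[y][x]  # pixels outside the mask pass through unchanged
             for x in range(w)]
            for y in range(h)
        ]


mask = [[255, 0]]
averager = TemporalAverager(history=4)
for z in (100, 102, 98, 101, 99):  # a flickering depth reading over 5 frames
    out = averager.update([[z, 65535]], mask)
```

After the fifth frame the masked pixel holds the mean of the five readings, and the pixel outside the mask is left untouched.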

Through this process, the temporal noise that fluctuates across all depth frames acquired over time by the RGB-D camera 21 can be removed.

FIG. 7 shows the value of a single pixel over time before and after temporal average filtering, and FIG. 8 shows the values of the 160th column before and after filtering.

Next, the texture image is corrected using a histogram (S60).

Because multi-view texture images are acquired from different camera positions, illumination mismatch between positions lowers the correlation of the object across the texture images of the individual views. Illumination compensation and boundary interpolation between adjacent images must therefore be performed on the multi-view texture images.

To solve this problem, a histogram matching method is applied. Histogram matching imposes a histogram of a specified shape on the histogram of a digital image and is used to improve the contrast of parts of the image. The color image (texture image) of each RGB-D camera in the multi-view camera system is acquired, and the cumulative histogram of the image is matched to the cumulative histogram of a designated reference image. This compensates for mismatches in the illumination component of the texture images and improves texture quality.

First, a histogram of the input color image (texture image) is generated (S61).

Next, the histogram of the texture image is equalized (S62). That is, to equalize the histogram of the color image, the normalized cumulative frequency function T is obtained, which gives the following transformation. Here P is a pixel value of the original image and q is the equalized value.

[Equation 1]

The approach described here is generally known as histogram specification or histogram matching. That is, given a reference image, pixels of the reference image that share the same value are simply accumulated by addition, so the histogram stores, for each pixel value, the number of pixels with that value. For example, the entry for brightness 1 is the number of pixels with brightness 1, and the entry for brightness 2 is the number of pixels with brightness 2.

Next, the histogram of the equalized color image (texture image), that is, a uniformly distributed histogram, is obtained (S63). The input color image (texture) is equalized using the transformation, which yields a uniformly distributed histogram of the input color image. In other words, the histogram of the equalized color image is the uniformly distributed histogram.

Next, the normalized cumulative frequency function G of the histogram of the adjacent-view color image is obtained, a transformation that has an inverse is derived, and the adjacent-view color image is equalized (S64). Here Z is an intensity value of the adjacent-view color image histogram and v is the equalized value.

[Equation 2]

Adjacent-view images are defined in pairs, such as front and left, front and right, right and rear, and left and rear, and each pair is processed separately.

Next, the adjacent-view color image is equalized to obtain a uniformly distributed histogram (S65).

Next, the equalized histogram of the adjacent-view color image is inverse-equalized to obtain the inverse transform function (S66). In practice, this inverse transform function is the lookup table.

[Equation 3]

Next, using the inverse transform function obtained in the preceding step (S66), the histogram of the equalized original color image is transformed into the histogram of the adjacent-view color image (S67).

[Equation 4]
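For orientation, the definitions given in steps S62 to S67 are consistent with the standard histogram specification relations written out below. This is a sketch based on that standard form, not a reproduction of the published equations; the symbols h, h', N, and N' (the histogram counts and pixel totals of the original and adjacent-view images) are assumptions introduced here.

```latex
\begin{align}
q &= T(P) = \frac{1}{N}\sum_{k=0}^{P} h(k) \tag{1}\\
v &= G(Z) = \frac{1}{N'}\sum_{k=0}^{Z} h'(k) \tag{2}\\
Z &= G^{-1}(v) \tag{3}\\
Z &= G^{-1}\bigl(T(P)\bigr) \tag{4}
\end{align}
```

Read this way, relation (4) is the lookup table: each original pixel value P is mapped to the adjacent-view intensity Z whose cumulative frequency matches that of P.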

The process above is histogram matching (specification): the generated lookup table is used to make the shape of the adjacent image's histogram agree with the reference histogram. Histogram specification is applied to each camera in turn so that the images are uniform overall.
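The lookup-table construction can be sketched for a single channel as follows. This is a minimal illustration, assuming integer intensity levels and flat pixel lists; the inverse of G is approximated by the smallest level whose cumulative frequency reaches T(p), which is one common discrete choice rather than a detail given in the text. The function names are hypothetical.

```python
def normalized_cdf(pixels, levels=256):
    """Normalized cumulative frequency of each intensity level."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1  # count pixels per intensity value
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running / len(pixels))
    return cdf


def matching_lut(source, reference, levels=256):
    """Lookup table z = G^-1(T(p)): smallest z with G(z) >= T(p)."""
    t = normalized_cdf(source, levels)
    g = normalized_cdf(reference, levels)
    lut, z = [], 0
    for p in range(levels):
        # t is non-decreasing, so z never needs to move backwards.
        while z < levels - 1 and g[z] < t[p]:
            z += 1
        lut.append(z)
    return lut


source = [0, 0, 1, 1, 2, 2, 3, 3]     # darker view (toy data, 8 levels)
reference = [4, 4, 5, 5, 6, 6, 7, 7]  # reference view
lut = matching_lut(source, reference, levels=8)
matched = [lut[p] for p in source]
```

In this toy case the darker view is mapped exactly onto the intensity distribution of the reference view. For color images the same table would typically be built per channel.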

Next, point cloud data is generated from the multi-view depth images (M depth images) and surface noise is removed (S70). The overall process of generating and correcting the point cloud data is shown in FIG. 10.

First, point cloud data is generated from the depth images. A conventional method is used to generate point cloud data from the multi-view images, so the details are omitted. The point cloud data is generated from the multi-view depth images (M depth images).

With infrared time-of-flight measurement, structured-light imaging, and similar methods, measurement precision varies with the distance between the camera and the subject and with the shape of the subject. This noise produces surface noise when the point cloud data is generated.

In the related art, image noise removal methods include, in addition to averaging and taking the median, data clustering methods based on correlation with neighboring data and on identifying normals.

A. To remove noise from the point cloud data, clustering is first performed to group similar data among nearby data, and then the normal vector of each point is found.

B. Points that correlate poorly with their surroundings are identified by taking the dot product of normal vectors between neighboring points. After this step, only the points that correlate strongly with the surrounding point cloud remain. These points become the boundary points.

C. In this process, some points belong to strongly curved parts of the original object, such as arms and legs, and correlate poorly with neighboring points even though they are valid data rather than noise. Exterior point cloud removal is the procedure used to identify such points. The points not removed by the method of step B are set as the boundary points, boundary(Pn).

D. That is, each point of the original point cloud data is checked for whether it lies outside the boundary region, and points outside the boundary are deleted. The boundary check is as follows.

For a target point in the original point cloud data, the unit vector between the target point and the nearest boundary point is computed, and its dot product with the normal vector of that boundary point is taken. If the value is positive, the point is judged to lie inside the boundary and is restored.
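The selection of boundary points in the first two steps of the list above can be sketched as follows. Several details are assumptions made for illustration: clustering and normal estimation are taken as already done, "correlation" is measured as the mean dot product between a point's unit normal and its neighbours' unit normals, and the 0.8 cutoff and the function names are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def boundary_points(normals, neighbors, min_corr=0.8):
    """Indices of points whose normals agree with their neighbours' normals.

    normals: one unit normal per point; neighbors: neighbour indices per point.
    """
    keep = []
    for i, n_i in enumerate(normals):
        if not neighbors[i]:
            continue  # an isolated point has nothing to correlate with
        corr = sum(dot(n_i, normals[j]) for j in neighbors[i]) / len(neighbors[i])
        if corr >= min_corr:
            keep.append(i)
    return keep


normals = [(0, 0, 1), (0, 0, 1), (0, 0, 1), (1, 0, 0)]  # the last one disagrees
neighbors = [[1, 2], [0, 2], [0, 1], [0, 1, 2]]
kept = boundary_points(normals, neighbors)
```

The three points with consistent normals are kept as boundary points, and the point whose normal is perpendicular to its neighbours' normals is left out.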

FIG. 11 illustrates the method for identifying surface noise in point cloud data. In FIG. 11, the point at P_ori1 is removed and the point at P_ori2 is retained.

Through the first two steps above (A and B), only the points that correlate strongly with neighboring points remain. These remaining points are judged unlikely to be noise and are set as the boundary.

Because the point cloud lies on the surface of the object, noise is also located near the surface of the object. Once the surface is defined, a point outside the defined surface is therefore likely to be noise.

Therefore, to determine whether the i-th point P_i is noise, the dot product is taken between the normal vector of a boundary point set through the first two steps (A and B) and the unit vector pointing toward that boundary point. If the result is positive, the point lies inside the boundary and is judged not to be noise; if it is negative, the point is judged to be noise. In the situation of FIG. 11, P_ori2 lies inside the boundary, so its dot product with P₃, the normal vector of the boundary point, is positive, whereas the dot product for P_ori1, which lies outside the boundary, is negative. The four steps A through D are carried out in order.
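The inside/outside test can be sketched as follows, assuming that boundary normals point outward from the object and that the nearest boundary point is found by brute force. The function name `is_inside` and the coordinates are illustrative; the two sample points only mimic the roles of P_ori1 and P_ori2 in FIG. 11.

```python
import math


def is_inside(point, boundary_pts, boundary_normals):
    """True if the dot product with the nearest boundary normal is positive."""
    # Nearest boundary point to the candidate (brute-force search).
    k = min(range(len(boundary_pts)),
            key=lambda i: math.dist(point, boundary_pts[i]))
    to_boundary = [b - p for p, b in zip(point, boundary_pts[k])]
    length = math.hypot(*to_boundary)
    if length == 0:
        return True  # the point is itself a boundary point
    unit = [c / length for c in to_boundary]
    return sum(u * n for u, n in zip(unit, boundary_normals[k])) > 0


boundary_pts = [(0.0, 0.0, 1.0)]
boundary_normals = [(0.0, 0.0, 1.0)]  # assumed outward-pointing unit normal
p_ori2 = (0.0, 0.0, 0.9)  # just beneath the surface: kept
p_ori1 = (0.0, 0.0, 1.2)  # floating above the surface: treated as noise
inside_2 = is_inside(p_ori2, boundary_pts, boundary_normals)
inside_1 = is_inside(p_ori1, boundary_pts, boundary_normals)
```

A point beneath the surface looks "up" toward the boundary point, in the same direction as the outward normal, so the dot product is positive; a point beyond the surface looks back against the normal and yields a negative value.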

Next, a curved surface is reconstructed from the point cloud data and the reconstructed surface is corrected (S80). The detailed steps of the reconstruction and correction are shown in FIG. 11.

First, a curved surface (or face) is reconstructed from the point cloud data (S81).

Surface reconstruction is the conversion of the point cloud data of the input object into an explicit representation such as a polygon mesh, a parametric surface, or the zero-set of a function in space. The forms of noise not corrected by the preceding processes can be categorized by type as shown in FIG. 11.

In FIG. 13, A shows faces that are joined during surface reconstruction because their vertices are adjacent in the x and y coordinates even though they differ greatly in distance along the z axis. B shows a case in which illumination noise from the depth image remains, and C shows a case in which boundary noise from the depth image remains.

Specifically, using the distances between vertices, only those reconstructed faces whose vertex distances are at or below a given distance are output (S82). Case A of FIG. 13 is resolved by computing the distances between the vertices that make up a face and outputting only faces within that distance. As shown in FIG. 14, a face is output only when the distance between its vertices P1 and P2 is at or below the threshold or the distance between P3 and P4 is at or below the threshold. That is, a face is formed only when one of at least two opposing sides of the face is at or below the threshold.

[Equation 5]

Here, pi.x, pi.y, and pi.z are the three-dimensional coordinates of vertex Pi.

In the expression above, the pairs P1, P2 and P3, P4 illustrate two vertex-to-vertex distances. A face is formed only when one of at least two opposing sides of the face is at or below the threshold.
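The edge-length test of step S82 can be sketched as below. It assumes that the distance in question is the ordinary Euclidean distance between the vertices' (x, y, z) coordinates; the function name `keep_face_by_edges`, the coordinates, and the 0.05 threshold are illustrative values only.

```python
import math


def keep_face_by_edges(p1, p2, p3, p4, threshold):
    """Keep a face if either opposing side, P1P2 or P3P4, is short enough."""
    return math.dist(p1, p2) <= threshold or math.dist(p3, p4) <= threshold


# A small, nearly flat face: neighbouring vertices are 1 cm apart.
flat = keep_face_by_edges((0, 0, 1.0), (0.01, 0, 1.0),
                          (0, 0.01, 1.0), (0.01, 0.01, 1.0), threshold=0.05)

# A face stretched across a 1 m jump in z (case A of FIG. 13).
stretched = keep_face_by_edges((0, 0, 1.0), (0.01, 0, 2.0),
                               (0, 0.01, 1.0), (0.01, 0.01, 2.0), threshold=0.05)
```

The flat face passes the test, while the face that bridges foreground and background is dropped because both of its tested sides exceed the threshold.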

Next, for cases B and C of FIG. 13, the minimum and maximum of the x, y, and z coordinates of the vertices of each face are computed, and a threshold is applied to the volume of the resulting axis-aligned bounding box (AABB) to restrict the output. Because noise tends to stretch out in a particular direction, it can be identified as shown in FIGS. 15 and 16.

In the case of FIG. 15, boundary noise and various other kinds of noise are identified by the volume of the face, but faces in regions that inherently have a large volume, such as the neck, cannot be distinguished. The normal vector of each face is therefore computed, and a face is treated as an exception when its normal points close to the negative direction of the Y axis.
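The bounding-box test and its exception can be sketched as follows. The reading of the exception is an assumption: a face is exempted when the Y component of its unit normal is at or below a negative cutoff. The cutoff `down_cos`, the volume threshold, the vertex winding order, and the function names are all hypothetical.

```python
def aabb_volume(vertices):
    """Volume of the axis-aligned bounding box around the face's vertices."""
    xs, ys, zs = zip(*vertices)
    return (max(xs) - min(xs)) * (max(ys) - min(ys)) * (max(zs) - min(zs))


def face_normal(a, b, c):
    """Unit normal of triangle (a, b, c); its sign depends on winding order."""
    u = [b[i] - a[i] for i in range(3)]
    v = [c[i] - a[i] for i in range(3)]
    n = [u[1] * v[2] - u[2] * v[1],
         u[2] * v[0] - u[0] * v[2],
         u[0] * v[1] - u[1] * v[0]]
    length = sum(x * x for x in n) ** 0.5
    return [x / length for x in n] if length else [0.0, 0.0, 0.0]


def keep_face_by_volume(vertices, max_volume, down_cos=0.9):
    """Keep small faces, plus large faces whose normal points near -Y."""
    if aabb_volume(vertices) <= max_volume:
        return True
    return face_normal(*vertices[:3])[1] <= -down_cos


small = keep_face_by_volume([(0, 0, 0), (0.01, 0, 0.01), (0, 0.01, 0.01)], 1e-5)
noise = keep_face_by_volume([(0, 0, 0), (0.1, 0, 0.5), (0, 0.1, 0.5)], 1e-5)
under = keep_face_by_volume([(0, 0, 0), (0.3, 0.01, 0), (0, 0.01, 0.3)], 1e-5)
```

The small face is kept, the elongated face is rejected as noise, and the large downward-facing face is kept through the exception.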

FIG. 17 shows the image obtained by reducing the noise in the original surface data of FIG. 13.

The invention made by the present inventors has been described in detail above with reference to the embodiments; however, the invention is not limited to those embodiments and can of course be modified in various ways without departing from its gist.

10: hologram data 20: camera system
21: RGB-D camera
30: computer terminal 40: quality improvement system
60: multi-view texture-depth image 61: depth image
62: texture image

Claims

A method for improving the quality of a 3D image obtained from a depth image camera, the method comprising:
(a) obtaining a multi-view depth image and a texture image;
(b) generating a binary image based on the brightness value of the texture image, and generating a mask image from the generated binary image;
(c) filtering by applying the mask image to the texture image;
(d) performing primary correction by adding an alpha channel to the filtered texture image;
(e) filtering the depth image with the mask image;
(f) correcting the filtered depth image by applying time-average filtering;
(g) secondarily correcting the primarily corrected texture image using a histogram;
(h) generating point cloud data from the corrected multi-view depth image and correcting it by removing surface noise; and
(i) constructing a curved surface from the corrected point cloud data.

The method of claim 1,
wherein in step (b), noise in the binary image is removed by repeatedly applying the closing operation, a morphological operation, to the binary image.

The method of claim 1,
wherein in step (c), the brightness value of the mask region in the texture image is maintained and the brightness value of the region outside the mask is set to the smallest value.

The method of claim 1,
wherein in step (d), an alpha channel for transparency is added to the filtered texture image and the alpha value of a region of the texture image at or beyond a specific distance is set to 0, the region at or beyond the specific distance being a region whose depth value in the corresponding depth image is greater than or equal to a preset threshold distance.

The method of claim 1,
wherein in step (e), the depth value of the mask region in the depth image is maintained and the depth value of the region outside the mask is set to the largest value.

The method of claim 1,
wherein in step (f), the time-average filtering is performed only on the filtered region of the depth image corresponding to the mask region.

The method of claim 1,
wherein in step (f), the pixel values of the current frame of the depth image and of at least one earlier frame are averaged, and the pixel values of the depth image are corrected with the average value.

The method of claim 1,
wherein in step (g), the texture image is corrected by performing histogram specification.

The method of claim 1,
wherein in step (h), clustering is performed on the point cloud data, a normal vector of each point is calculated, points whose correlation with neighboring points, determined through the dot product of normal vectors between adjacent points, is higher than a predetermined reference value (hereinafter, boundary points) are detected, and points lying outside the boundary formed by the detected boundary points are deleted.

The method of claim 1,
wherein in step (i), the curved surface is constructed with correction based on the distances between vertices, such that each face of the curved surface is formed only when one of at least two opposing sides of the face is less than or equal to a threshold value, and the output is restricted by calculating the minimum (min) and maximum (max) values of the x, y, and z coordinates of the vertices constituting each face and applying a threshold value to the volume of the resulting AABB box.

A computer-readable recording medium storing a program for performing the method for improving the quality of a 3D image obtained from a depth image camera according to any one of claims 1 to 10.