KR20190069750A

KR20190069750A - Enhancement of augmented reality using posit algorithm and 2d to 3d transform technique

Info

Publication number: KR20190069750A
Application number: KR1020170169976A
Authority: KR
Inventors: 왕한호; 조민호
Original assignee: 왕한호
Priority date: 2017-12-12
Filing date: 2017-12-12
Publication date: 2019-06-20

Abstract

Disclosed is a method for representing augmented reality (AR) using a 2D-to-3D conversion technique and the POSIT (Pose from Orthography and Scaling with ITerations) algorithm, in order to provide smooth communication between a hairdresser and a customer. According to one embodiment of the present invention, a 3D face reconstruction technique using 2D images, such as facial images, is provided. Prior face knowledge or a generic face is used to extract sparse 3D information from the images and to identify image pairs. Bundle adjustment is performed to determine more accurate 3D camera positions, the image pairs are rectified, and dense 3D face information is extracted without using the prior face knowledge. Outliers are removed, for instance by tensor voting. A 3D surface is extracted from the dense 3D information, and surface detail is extracted from the images. The coordinates of the body are determined through the POSIT algorithm, so that the 3D object obtained in this way can be rendered on them as AR.

Description

TECHNICAL FIELD [0001] The present invention relates to a technique for converting 2D to 3D and a method for representing augmented reality using the POSIT algorithm.

The present invention relates to a technique for converting 2D to 3D and a method of representing augmented reality using the POSIT algorithm. More particularly, a 3D shape file of a head is created using a technique that converts 2D photographs into 3D form. The POSIT (Pose from Orthography and Scaling with ITerations) algorithm is then used to determine the pose of a three-dimensional object, namely a person, and the head object created by the 3D conversion technique is overlaid at the coordinates where the head is located, so that the desired hairstyle is virtually placed on the person.

Once acquired, images are often processed based on prior knowledge or assumptions about what faces generally look like, and conventional augmented reality hairstyling techniques overlay an object on a previously captured image.

With existing techniques, a 3D object cannot be overlaid in real time through a camera, and manual optimization work may be required.

A detailed and realistically rendered 3D hairstyle, built from the features common to a set of 2D images, is virtually simulated on the user's own head, so that the 3D object is rendered in real time through a camera at the coordinates determined by the POSIT algorithm.

Many people who visit a hair salon regularly find it difficult to judge accurately whether the hairstyle of a desired model will suit them. Some have also been dissatisfied with their hairdresser because the result did not match the style they wanted. By virtually simulating a more detailed and realistically rendered 3D hairstyle on such a person's own head, the present invention reduces these inconveniences and provides smooth communication between hairdresser and customer.

The 2D-to-3D conversion technique and the augmented reality technique using the POSIT algorithm comprise: analyzing a plurality of facial images to find sparse three-dimensional facial features using prior knowledge of faces; using the sparse three-dimensional facial features to analyze the plurality of images so as to find dense three-dimensional features using a data-driven approach that uses no prior knowledge; and determining the pose of the body using the POSIT (Pose from Orthography and Scaling with ITerations) algorithm and mapping it onto the coordinate values of the corresponding position.

Many people who visit a hair salon regularly find it difficult to judge accurately whether the hairstyle of a desired model will suit them, and some have been dissatisfied with their hairdresser because the result did not match the style they wanted. According to an embodiment of the present invention, virtually simulating a more detailed and realistically rendered 3D hairstyle on such a person's own head can reduce these inconveniences and provide smooth communication between hairdresser and customer.

FIG. 1 shows a flowchart of the overall operation.
FIG. 2 shows a general-purpose computer capable of carrying out the flowchart.
FIG. 3 shows a flowchart of the 3D conversion operation.

The present invention describes techniques for obtaining three-dimensional facial information with the help of auxiliary techniques, and a method of implementing augmented reality using that information together with the POSIT algorithm. According to aspects of the present invention, prior knowledge of facial structure is used at some points during processing, while other parts of the processing are purely data driven.

Another operation uses a single camera to determine 3D animation from a set of 2D images.

The present invention analyzes a plurality of facial images to find sparse three-dimensional facial features using prior knowledge of faces. As shown in FIG. 1, it comprises a step (12) of using the sparse three-dimensional facial features to analyze the plurality of images so as to find dense three-dimensional features using a data-driven approach that uses no prior knowledge, and a step (11) of determining the pose of the body using the POSIT (Pose from Orthography and Scaling with ITerations) algorithm and mapping it onto the coordinate values of the corresponding position. The result is presented as augmented reality (20) using the 2D-to-3D conversion technique and the POSIT algorithm.
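The POSIT step can be illustrated with a minimal NumPy sketch. It assumes a pinhole camera with a known focal length, image coordinates measured from the principal point, and at least four non-coplanar model points whose 2D projections are known. The function name `posit`, its parameters, and the convergence tolerance are illustrative assumptions and are not taken from the disclosure.

```python
import numpy as np

def posit(model_pts, image_pts, focal, n_iter=50, tol=1e-10):
    """Estimate rotation R and translation t of a rigid object with POSIT.

    model_pts: (N, 3) object points; model_pts[0] is the reference point.
    image_pts: (N, 2) image coordinates relative to the principal point.
    focal:     focal length in pixels (assumed known).
    Returns R (rows are the camera axes in object coordinates) and t
    (position of the reference point in the camera frame).
    """
    model_pts = np.asarray(model_pts, dtype=float)
    image_pts = np.asarray(image_pts, dtype=float)
    A = model_pts[1:] - model_pts[0]   # object vectors from the reference point
    B = np.linalg.pinv(A)              # pseudo-inverse, computed once
    x0, y0 = image_pts[0]
    eps = np.zeros(len(A))             # eps = 0: scaled-orthographic starting guess
    for _ in range(n_iter):
        # Correct the perspective image towards a scaled-orthographic one.
        xs = image_pts[1:, 0] * (1.0 + eps) - x0
        ys = image_pts[1:, 1] * (1.0 + eps) - y0
        I, J = B @ xs, B @ ys
        s1, s2 = np.linalg.norm(I), np.linalg.norm(J)
        i, j = I / s1, J / s2
        k = np.cross(i, j)
        k /= np.linalg.norm(k)
        scale = 0.5 * (s1 + s2)
        z0 = focal / scale             # depth of the reference point
        new_eps = (A @ k) / z0
        converged = np.max(np.abs(new_eps - eps)) < tol
        eps = new_eps
        if converged:
            break
    R = np.vstack([i, j, k])
    t = np.array([x0, y0, focal]) / scale
    return R, t
```

In this sketch a point `M` of the model maps to camera coordinates `R @ (M - model_pts[0]) + t`, which is the position at which a 3D hair object would be anchored. Convergence is typical when the object is small relative to its distance from the camera, but it is not guaranteed in general.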

The present application refers to determining three-dimensional information about an object, for example a face. Although the embodiments are described with reference to 3D reconstruction and rendering of faces, it should be understood that the same techniques can be used to reconstruct and render multiple views of any object. When used for faces, the three-dimensional information produced by the techniques described herein can be used for any face-based application, such as animation, recognition, or rendering. The techniques described herein may be more realistic than other techniques that rely more heavily on prior knowledge of generic faces.

The inventors recognize that earlier systems, which used strong prior knowledge of facial appearance to reconstruct a face, in effect quantize the number of basis shapes used to form and render the face. A strong-prior or generic-face approach is effectively limited by the degrees of freedom offered by the imposed prior face knowledge or generic face. The information and the subsequent reconstruction therefore do not capture all the subtle details of the original face.

This "face space" quantization occurs because the prior knowledge and the associated transformations limit the space of all possible faces that the system can reconstruct. Methods based on a generic face or purely on prior face knowledge may not have enough degrees of freedom to cover the entire face space.

One embodiment captures subtle facial details by disregarding the prior face knowledge or generic face constraints at key points of the process and relying instead on the data, using a data-driven approach to find the details of the face, referred to herein as dense features. A data-driven approach requires large amounts of data to deal effectively with noise, measurement uncertainty, and outliers. The present system, however, does not use a purely data-driven approach; it is also assisted by methods that incorporate prior face knowledge or generic faces.

According to one aspect of the invention, the large amount of data can be obtained from a single camera operated to acquire multiple images. For example, this may use the frames of a video, which collectively form a moving sequence of images. The data may also be obtained from multiple different still images acquired from one or more cameras.

One embodiment is described with reference to the flowchart of FIG. 2. FIG. 1 also shows some exemplary thumbnail images illustrating the operation. The flowchart can be carried out on any general-purpose computer, such as the system shown in FIG. 2. This system includes a processor 200, a user interface 205 such as a mouse and keyboard, and a display screen 210. The computer may be, for example, an Intel-based processor or any other kind of processor. The computer receives raw, unprocessed images from one or more cameras 215, for example still cameras or video cameras. The processor 200 processes the raw image data according to the description provided herein. Alternatively, the camera information may be stored in a memory 220, for example a hard drive, and processed at a later time.

One embodiment extracts information from a sequence of images, for example a video sequence, a sequence of stop-motion style images taken from a video sequence, or simply a number of still images. Unless the subject stands perfectly still and the camera does not change position, the sequence of images will contain multiple different views of the subject's head.

In step 100 of FIG. 3, an initial pose estimate is determined. This may use a face tracking algorithm to derive an initial head pose estimate and also a mask that represents the shape of the face. It uses prior knowledge of facial structure to determine the approximate position and pose of the head, the locations of facial features such as the nose and mouth, and so on. The mask helps to estimate the pose in the images.

The pose estimation technique passes a set of views to a sparse feature tracking module at 110. The views passed to the module are those believed to be good candidates for image pairs from which three-dimensional information can be extracted. The sparse feature tracking module 110 produces a set of feature correspondences for each image pair. The two images of a pair are sufficiently close that these feature correspondences can be obtained.

Pose selection is carried out in step 120 to select images that properly form pairs usable for determining 3D information. These pairs should be close in pose and have similar lighting characteristics.

A global optimization is carried out in step 130 over the entire set of feature points. This is used to refine the camera position estimates and to compute the three-dimensional structure of the sparse two-dimensional features.

The refined camera positions are used in step 135 to rectify the image pairs, thereby restricting the search space for corresponding feature points to a horizontal scan line of the paired images.

In step 140, dense feature matching is performed across the pairs. This finds additional features beyond the sparse detection carried out in step 110. These correspondences can be triangulated using the optimized camera poses to form a dense 3D point cloud or disparity map.

The point clouds corresponding to the individual pairs are merged into a single cloud, and outliers are rejected in step 145. Dense feature detection is entirely data driven and uses no prior face knowledge or generic faces. Step 150 defines dense feature computation aids, which are used as simplifications for the dense feature matching. These include outlier rejection techniques (for example, tensor voting) and may include area search minimization.

In step 155, the final cleaned point cloud is used to form a connected surface. A face texture is obtained from the frontal image. The final result is information representing the surface: a 3D mesh formed of triangular patches. The final result may alternatively be a set of 3D points, or a surface defined, for example, by splines, subdivision surfaces, or other digital surface definitions.

Further details of the operation are now provided.

Conventional stereo reconstruction relies on the presence of multiple cameras that acquire one or more similar image pairs. Feature correspondences between these multiple image pairs are determined. The feature correspondences can then be triangulated to find the final three-dimensional group of points.

In one embodiment, a single camera is used to acquire multiple images, and the images are then recast as multi-view stereo images. In one embodiment, the process assumes that the head is static and that the camera is moving, or has moved, relative to the head. Although this may not be the likely situation, the assumption can be made without loss of generality: for example, the camera may be static while the head moves, or both the camera and the head may move.

As described above, the multiple images are first analyzed in step 100 to determine an initial estimate of the camera pose between the images. This initial estimate uses information representing a face, for example prior face knowledge or a generic face, to carry out the estimation. This can provide "sparse" information that gives the system enough information to find the pose and the correspondences between the images.

For example, the initial estimates made with prior face knowledge or a generic face may provide information indicating the face boundary, the positions of a mask defining parts of the face, or other information. This provides information for image selection and constrains the set of sparse features to be matched. The prior face knowledge or generic face is used to generate the sparse features, but the sparse features can be refined using data-driven optimization before the dense features are determined.

The tracker pose estimation module examines the images to find similar images that can be rectified against one another. Similar images include images that define similar poses. This allows a subset of the images to be selected for use in reconstruction. The images are selected using both baseline information and feature points that are reliably tracked across multiple images.

There is always measurement uncertainty between multiple different images. For example, as the angular baseline between a pair of images decreases, the error in the computed 3D points is magnified. A reduced angular baseline therefore increases the uncertainty of the 3D measurement, and less accurate 3D information is obtained from images with a smaller angular baseline between them.

As the angular baseline increases, more accurate 3D information can be extracted, but the surface area common to the two views is also smaller, and so there are fewer possible matches. Image pairs are therefore selected to balance measurement uncertainty against the number of errors. For example, images with an angular baseline of 8 to 15 degrees and six points matched across the image pair may be preferred.

The balancing can be carried out by tracking feature points across multiple selected images. Only images with high-confidence matches between features (for example, greater than 90%) are retained to establish feature chains. A frame pair is kept in the set of images if it meets the feature point criterion and also meets a set baseline criterion. For example, the baseline criterion may be set to require an angular baseline of at least 5 degrees. The feature point criterion also rejects frames with highly inaccurate tracker pose estimates.
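The pair selection criteria can be sketched as follows. The sketch assumes that the angular baseline is the angle subtended at the head by the two camera centers and that a feature tracker has already counted the high-confidence matches per frame pair. The names `angular_baseline_deg`, `select_pairs`, and `match_counts` are hypothetical; the default thresholds follow the 8 to 15 degree and six-point example given in the text.

```python
import numpy as np
from itertools import combinations

def angular_baseline_deg(cam_a, cam_b, head_center):
    """Angle in degrees subtended at the head by two camera centers."""
    va = np.asarray(cam_a, dtype=float) - head_center
    vb = np.asarray(cam_b, dtype=float) - head_center
    cos = va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb))
    return float(np.degrees(np.arccos(np.clip(cos, -1.0, 1.0))))

def select_pairs(cam_centers, match_counts, head_center,
                 min_deg=8.0, max_deg=15.0, min_matches=6):
    """Return frame index pairs (i, j) that satisfy both criteria.

    match_counts[(i, j)] is the number of high-confidence feature matches
    between frames i and j (hypothetical input from a feature tracker).
    """
    head_center = np.asarray(head_center, dtype=float)
    pairs = []
    for i, j in combinations(range(len(cam_centers)), 2):
        angle = angular_baseline_deg(cam_centers[i], cam_centers[j], head_center)
        enough_matches = match_counts.get((i, j), 0) >= min_matches
        if min_deg <= angle <= max_deg and enough_matches:
            pairs.append((i, j))
    return pairs
```

Setting `min_deg=5.0` and leaving `max_deg` large reproduces the looser "at least 5 degrees" baseline criterion mentioned above.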

This sparse matching phase produces a set of images and feature points that are matched across the sequence. The matches provided by this feature point matching are likely to be more accurate than the matches predicted by the pose tracker alone. Feature point matches may also cover a greater number of frames than the tracker-predicted matches, and thus place stronger constraints on the camera pose refinement process. These constraints can result in greater accuracy in the pose refinement at step 130.

Bundle adjustment starts with the sets of images and feature points matched across the image set. These are obtained by feature tracking, as described above. The bundle adjustment carried out in step 130 is an optimization technique that solves for the camera parameters and the 3D positions of the points based on two-dimensional correspondences between sets of images. The optimized parameters may include the position and orientation of the camera and the 3D structure of the 2D feature points. The optimization may be carried out by alternating a partial solution for the structure with a partial solution for the camera pose.

Alternatively, the computer may carry out both of these computations until a suitable solution converges.

Bundle adjustment thus estimates the position of the camera in each image by flip-flopping, in an iterative fashion, between estimating the positions of the points and estimating the pose of the camera, until it finally converges. The end result is a more accurate camera position as well as the structure of the points. Because these are sparse "high-confidence" points, they do not provide a full dense representation; that is done in later stages.
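The text does not state the objective being minimized. In the usual formulation of bundle adjustment, which is assumed here, it is the total reprojection error. All symbols are introduced for illustration only: $\mathbf{X}_i$ is the $i$-th 3D feature point, $(\mathbf{R}_j, \mathbf{t}_j)$ is the pose of the camera for image $j$, $\mathbf{x}_{ij}$ is the observed 2D position of point $i$ in image $j$, $\pi$ is the camera projection, and $v_{ij} \in \{0, 1\}$ indicates whether point $i$ was tracked in image $j$.

```latex
\min_{\{\mathbf{R}_j,\,\mathbf{t}_j\},\ \{\mathbf{X}_i\}}
\;\sum_{i}\sum_{j} v_{ij}\,
\bigl\lVert \mathbf{x}_{ij} - \pi\!\left(\mathbf{R}_j \mathbf{X}_i + \mathbf{t}_j\right) \bigr\rVert^{2}
```

Under this reading, the alternation corresponds to minimizing over the points $\mathbf{X}_i$ with the poses held fixed, then over the poses $(\mathbf{R}_j, \mathbf{t}_j)$ with the points held fixed, and repeating until the error stops decreasing.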

An alternative technique may simply change the values iteratively until good values are obtained.

The 3D positions of the matched feature points, as estimated and refined by bundle adjustment 130, are used in later stages to constrain the scope of the reconstruction. These form the optimized camera poses that are used in all subsequent processing stages.

Dense feature matching 140 finds more information about corresponding points between image pairs. Unconstrained dense matching, however, can be computationally prohibitive, since it may require a full image search for each match. An unconstrained search would compare each point in each image against each point in every other image.

Step 150 represents, in general, the techniques used to reduce the scope of the dense feature search.

According to one embodiment, an epipolar geometry technique is used. In epipolar geometry, each corresponding item must lie along a single line that extends between the paired or clustered images. The process can be further simplified by rectifying the images so that each epipolar line coincides with a horizontal scan line. This removes the need to re-sample the images for each potential match.

After rectification, corresponding points in each pair of images are found using a matching process. The prior face knowledge or generic face can be used to assist the matching process by restricting the matching to the area covered by the tracked face mask. This allows the search to be simplified, so that a template is extracted using a fixed window size for each pixel in one image. The template is matched along the corresponding epipolar line in the paired image.

A minimum correlation threshold and a restricted disparity range suitable for faces are used to reduce the number of spurious matches. Locations with a flat correlation plot are removed, while those with an obvious peak are kept. Multiple candidate matches may, however, be retained in order to find the best match.
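The scan-line template search can be sketched as follows, assuming rectified grayscale images stored as 2D NumPy arrays and normalized cross-correlation as the similarity measure (the text does not name a specific measure). The function names, the window half-size, and the default threshold are illustrative assumptions; the rejection of flat correlation plots is omitted for brevity.

```python
import numpy as np

def ncc(a, b):
    """Normalized cross-correlation of two equally sized patches."""
    a = a - a.mean()
    b = b - b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0

def match_along_scanline(left, right, x, y, half=3, d_min=0, d_max=16, min_corr=0.8):
    """Find the disparity d such that left[y, x] matches right[y, x + d].

    Only row y of the paired image is searched (rectified epipolar line),
    and only disparities in [d_min, d_max] are tried.
    Returns (d, score), or None when no candidate reaches min_corr.
    """
    h, w = left.shape
    if not (half <= y < h - half and half <= x < w - half):
        return None
    template = left[y - half:y + half + 1, x - half:x + half + 1]
    best = None
    for d in range(d_min, d_max + 1):
        xr = x + d
        if xr - half < 0 or xr + half >= w:
            continue  # window would leave the image
        window = right[y - half:y + half + 1, xr - half:xr + half + 1]
        score = ncc(template, window)
        if best is None or score > best[1]:
            best = (d, score)
    if best is None or best[1] < min_corr:
        return None
    return best
```

Restricting the loop to pixels inside the tracked face mask would give the mask-limited search described above.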

The result of the matching process is a disparity volume. Each (x, y, d) triplet maps a pixel (x, y) in one rectified image to a pixel (x + d, y) in the paired image.

With the poses known, the disparity values can be triangulated into three-dimensional points. Each disparity pixel is transformed back to its original image space using the inverse of the rectifying transformation. The three-dimensional location of the match is given by the intersection of the rays that pass through each camera's optical center and the corresponding feature match in the image plane. In practice, errors in the feature matching and in the camera estimates prevent these rays from intersecting exactly. The three-dimensional point that minimizes the orthogonal distance between the rays may be used.
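One common way to obtain the point that minimizes the orthogonal distance between two rays is the midpoint method, sketched below. It assumes that each ray is already expressed in a common world frame by a camera center and a direction; the function name and the returned gap value are illustrative.

```python
import numpy as np

def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2.

    Returns the 3D point and the length of that segment (0 if the rays meet).
    """
    c1, d1, c2, d2 = (np.asarray(v, dtype=float) for v in (c1, d1, c2, d2))
    w = c1 - c2
    a, b, c = d1 @ d1, d1 @ d2, d2 @ d2
    d, e = d1 @ w, d2 @ w
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel")
    s = (b * e - c * d) / denom   # parameter of the closest point on ray 1
    t = (a * e - b * d) / denom   # parameter of the closest point on ray 2
    p1 = c1 + s * d1
    p2 = c2 + t * d2
    return 0.5 * (p1 + p2), float(np.linalg.norm(p1 - p2))
```

The returned gap can serve as a simple quality measure: a large gap suggests a poor match or a poor camera estimate.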

A further constraint can be provided by rejecting outliers in the derived structure. The three-dimensional result of the bundle adjustment process provides a more accurate, though sparse, estimate of the three-dimensional face structure. This is not sufficient to capture the subtle geometry of the face. In one embodiment, it is used to constrain the allowable three-dimensional computations in the dense reconstruction.

Specifically, the computed structure should not deviate far from the structure derived by bundle adjustment. This structure is used to prefilter the data by first converting the interpolated bundle-adjusted structure to voxels and then rejecting data that lies more than a predetermined distance from the voxels. In effect, this becomes a data optimization technique.

The voxel test removes gross outliers that are farther than the predetermined distance from the bundle voxels. It also removes boundary artifacts caused by inaccurate placement of the face mask. Errors in feature matching, however, can result in reconstruction noise. If the noise is uncorrelated within and between views, it will appear as sparse, high-frequency variations in the three-dimensional structure. Correct matches, by contrast, will be correlated between views because of the continuity and smoothness of the face structure.
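A minimal sketch of the voxel prefilter follows. It assumes the bundle-adjusted structure is available as a set of 3D points (the interpolation step is not shown) and measures the distance from each dense point to the nearest occupied voxel center by brute force. The function name and parameters are hypothetical.

```python
import numpy as np

def voxel_prefilter(dense_pts, bundle_pts, voxel_size, max_dist):
    """Keep dense points that lie within max_dist of an occupied bundle voxel."""
    dense_pts = np.asarray(dense_pts, dtype=float)
    bundle_pts = np.asarray(bundle_pts, dtype=float)
    # Quantize the bundle-adjusted structure into a set of occupied voxels.
    occupied = np.unique(np.floor(bundle_pts / voxel_size).astype(int), axis=0)
    centers = (occupied + 0.5) * voxel_size
    # Distance from every dense point to its nearest occupied voxel center.
    # This is O(N * M) in memory; a k-d tree would scale better.
    diff = dense_pts[:, None, :] - centers[None, :, :]
    nearest = np.sqrt((diff ** 2).sum(axis=2)).min(axis=1)
    return dense_pts[nearest <= max_dist]
```

Points that survive this test would then be passed on to the tensor voting stage described next.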

Tensor voting can also be used to determine surface saliency and hence to preserve the correlated structure. A three-dimensional tensor voting scheme can be used to reinforce and determine the surface saliency. Tensor voting allows each 3D point to be encoded as either a ball tensor or a stick tensor. The information in the tensor is propagated to its neighbors through a voting operation. Neighbors with similar structure thereby reinforce one another through the tensor voting process. The amount of structural reinforcement is influenced by the initial structural saliency. This technique recovers a surface from the cloud of points.

A good initial estimate of the point normals may be preferred over blindly encoding the points as ball tensors. In one embodiment, the head is approximated by a cylinder, as shown in FIG. 4A, and the cylinder normals are obtained. The cylinder normals can be used as approximations of the point normals.
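As a rough illustration of the cylinder approximation, the sketch below assigns each 3D point the outward radial direction of a cylinder whose axis is given. The axis position and direction are assumed to be known (for example, from the estimated head pose). The name `cylinder_normals` and its parameters are hypothetical.

```python
import numpy as np

def cylinder_normals(points, axis_point, axis_dir):
    """Approximate point normals by the radial direction of a cylinder.

    points     : (N, 3) array of 3D points near the head surface
    axis_point : any point on the cylinder axis (assumed known)
    axis_dir   : direction of the cylinder axis (need not be unit length)
    """
    points = np.asarray(points, dtype=float)
    d = np.asarray(axis_dir, dtype=float)
    d = d / np.linalg.norm(d)

    rel = points - np.asarray(axis_point, dtype=float)
    # Remove the component along the axis; what remains points radially outward.
    radial = rel - np.outer(rel @ d, d)
    lengths = np.linalg.norm(radial, axis=1, keepdims=True)

    # Points lying exactly on the axis have no radial direction; leave them zero.
    return np.divide(radial, lengths, out=np.zeros_like(radial), where=lengths > 0)
```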

In one embodiment, the system may use a 3 x 3 eigensystem and fix the normal as the first eigenvector of that eigensystem. The remaining basis vectors may then be computed using singular value decomposition. The initial surface saliency, defined for example by the difference in magnitude between the first two eigenvectors, may be set uniformly for all points.
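One possible reading of this encoding is sketched below, using the usual tensor voting convention in which surface saliency is the gap between the two largest eigenvalues. The normal is fixed as the first basis vector, the two remaining basis vectors are taken from a singular value decomposition, and the saliency is supplied as a single uniform value. The function name `stick_tensor` is hypothetical.

```python
import numpy as np

def stick_tensor(normal, saliency=1.0):
    """Encode a point normal as a 3 x 3 stick tensor.

    Returns (tensor, basis), where the rows of basis are the eigenvectors:
    row 0 is the normal, rows 1 and 2 span the tangent plane.
    """
    n = np.asarray(normal, dtype=float)
    n = n / np.linalg.norm(n)

    # Full SVD of the 1 x 3 row vector: rows 1 and 2 of vt are an
    # orthonormal basis of the plane perpendicular to n.
    _, _, vt = np.linalg.svd(n.reshape(1, 3))
    basis = np.vstack([n, vt[1:]])

    # Eigenvalues (saliency, 0, 0): the gap between the first two
    # eigenvalues is the surface saliency, set uniformly by the caller.
    eigenvalues = np.array([saliency, 0.0, 0.0])
    tensor = basis.T @ np.diag(eigenvalues) @ basis
    return tensor, basis
```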

The 3D points obtained from bundle adjustment are very accurate but sparse estimates of the facial structure. These points are added to the tensor voting setup with boosted surface saliency. Radial basis functions may also be used to interpolate a smooth surface between the 3D points obtained from bundle adjustment. In this embodiment, the normals for the 3D bundle points are computed from the interpolated surface for use in tensor voting. The interpolated surface itself, however, is preferably not used for tensor voting.
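The radial basis function step might look like the following sketch. It makes two simplifying assumptions that are not stated in the disclosure: the face is treated as a height field z = f(x, y), and a Gaussian kernel of width `sigma` is used. The functions `fit_rbf_surface` and `surface_height_and_normal` are hypothetical names.

```python
import numpy as np

def fit_rbf_surface(xy, z, sigma=1.0):
    """Fit Gaussian RBF weights so the surface passes through the bundle points."""
    xy = np.asarray(xy, dtype=float)
    d2 = ((xy[:, None, :] - xy[None, :, :]) ** 2).sum(-1)
    kernel = np.exp(-d2 / (2.0 * sigma ** 2))
    weights = np.linalg.solve(kernel, np.asarray(z, dtype=float))
    return xy, weights, sigma

def surface_height_and_normal(model, query_xy):
    """Evaluate the interpolated height and the unit surface normal."""
    centers, weights, sigma = model
    q = np.asarray(query_xy, dtype=float)
    diff = q[:, None, :] - centers[None, :, :]               # (Q, N, 2)
    phi = np.exp(-(diff ** 2).sum(-1) / (2.0 * sigma ** 2))  # (Q, N)
    height = phi @ weights

    # Gradient of each Gaussian with respect to the query position.
    grad = np.einsum('qn,qnk,n->qk', phi, -diff / sigma ** 2, weights)

    # For a height field z = f(x, y) the normal is (-df/dx, -df/dy, 1).
    normals = np.column_stack([-grad, np.ones(len(q))])
    normals /= np.linalg.norm(normals, axis=1, keepdims=True)
    return height, normals
```

Only the normals evaluated at the bundle points would be passed on to tensor voting; the interpolated surface itself is discarded, in line with the paragraph above.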

After two passes of tensor voting, points with low surface saliency are removed, leaving a dense cloud of points distributed across the surface of the face.
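The two-pass filtering can be sketched in a heavily simplified form. The code below is not the full tensor voting field: it ignores the curvature-dependent decay and simply accumulates distance-weighted stick tensors from the other points, which is enough to show how isolated points end up with low saliency. The names `accumulate_votes` and `two_pass_filter`, and the parameters `sigma` and `threshold`, are hypothetical.

```python
import numpy as np

def accumulate_votes(points, normals, sigma):
    """Sum Gaussian-weighted stick tensors n n^T cast by all other points."""
    d2 = ((points[:, None, :] - points[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / sigma ** 2)
    np.fill_diagonal(w, 0.0)                              # no self-votes
    sticks = normals[:, :, None] * normals[:, None, :]    # (N, 3, 3)
    return np.einsum('ij,jkl->ikl', w, sticks)

def two_pass_filter(points, normals, sigma=1.0, threshold=0.5):
    """Two simplified voting passes, then drop points with low surface saliency."""
    points = np.asarray(points, dtype=float)
    normals = np.asarray(normals, dtype=float)

    # Pass 1: refine each normal to the dominant eigenvector of its votes.
    _, vecs = np.linalg.eigh(accumulate_votes(points, normals, sigma))
    refined = vecs[:, :, 2]

    # Pass 2: vote again with the refined normals and measure saliency
    # as the gap between the two largest eigenvalues.
    vals = np.linalg.eigvalsh(accumulate_votes(points, refined, sigma))
    saliency = vals[:, 2] - vals[:, 1]
    return points[saliency >= threshold], saliency
```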

Prior face knowledge or a generic face may be introduced into the dense reconstruction stage in such a way that the face space is not constrained. In particular, one embodiment may use the prior face knowledge or generic face in the dense process to identify and remove outliers, for example based on an approximation of an existing generic face representation, but does not use it to compute or modify the 3D positions of the reconstructed points.

The detailed facial structure is effectively captured in the three-dimensional point cloud. If the final goal is a mathematical description of the face, the three-dimensional point cloud will suffice.

The POSIT algorithm uses a method of inserting a virtual object into the camera view using only the relative coordinates between 3D objects, and with this method it can determine the coordinates of the human head in particular, among the parts of the body.

As long as the relative coordinates between the points of the 3D object are available, the camera matrix of each video frame can be computed, and the virtual object can be composited with its occlusion taken into account.
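The pose computation described above can be sketched with a compact version of the classic POSIT iteration. This is an illustrative sketch, not the implementation used in the disclosure. It assumes at least four non-coplanar model points, image coordinates measured relative to the principal point, and a known focal length in pixels. The function name `posit` and its parameters are hypothetical.

```python
import numpy as np

def posit(object_points, image_points, focal_length, iterations=20):
    """Estimate rotation R and translation t of an object from one image.

    object_points : (N, 3) model points; the first one is the reference point
    image_points  : (N, 2) projections, relative to the principal point
    Returns R (3 x 3) and t, the reference point's position in camera
    coordinates, so that camera_point = R @ (model_point - reference) + t.
    """
    M = np.asarray(object_points, dtype=float)
    m = np.asarray(image_points, dtype=float)

    A = M[1:] - M[0]                 # object vectors from the reference point
    B = np.linalg.pinv(A)            # depends only on the model geometry
    eps = np.zeros(len(A))           # perspective correction terms

    for _ in range(iterations):
        # Scaled orthographic image vectors, corrected by the current eps.
        xp = m[1:, 0] * (1.0 + eps) - m[0, 0]
        yp = m[1:, 1] * (1.0 + eps) - m[0, 1]
        I, J = B @ xp, B @ yp
        s1, s2 = np.linalg.norm(I), np.linalg.norm(J)
        i, j = I / s1, J / s2
        k = np.cross(i, j)
        z0 = focal_length / ((s1 + s2) / 2.0)   # depth of the reference point
        eps = A @ k / z0

    R = np.vstack([i, j, k])
    t = np.array([m[0, 0] * z0 / focal_length,
                  m[0, 1] * z0 / focal_length,
                  z0])
    return R, t
```

Stacking `R` and `t` as `[R | t]` gives the extrinsic part of the camera matrix for the frame. With noisy image points the rows of `R` are only approximately orthonormal, so a practical system would typically re-orthonormalise them before rendering the virtual object.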

11: Coordinate value extraction step using the POSIT algorithm
12: Step of converting a 2D image into 3D
20: Implementation step
100: Pose determination/estimation
110: Feature tracking
120: Pose selection
130: Pose refinement
135: Image rectification
140: Dense 2D feature matching
145: Dense 3D reconstruction
150: Dense 3D post-processing
155: Surface reconstruction and texturing
200: Process
205: User interface
210: Display
215: Camera
220: Memory

Claims

Analyzing a plurality of facial images to find sparse three-dimensional facial features using prior face knowledge;
Using the sparse three-dimensional facial features to analyze the plurality of images to find dense three-dimensional features using a data-driven approach without using any prior knowledge; and
Identifying a pose of the body using the POSIT (Pose from Orthography and Scaling with ITerations) algorithm and combining the pose with the coordinate values of the corresponding position;
A method for representing augmented reality using the POSIT algorithm, comprising the above steps.

The method according to claim 1,
Wherein analyzing the plurality of facial images further comprises:
Using the prior knowledge to identify features in the images; and
Rectifying between pairs of images to find pairs of similar images.

The method according to claim 1,
Wherein the prior knowledge is used to identify portions of the face.

The method according to claim 1,
Wherein the prior knowledge is used to identify a face mask representing a generic face.

The method according to claim 1,
Wherein using the prior knowledge comprises using the prior knowledge to constrain the facial features that form the set of sparse three-dimensional features.

3. The method of claim 2,
Wherein the pairs of similar images are image pairs having an angular baseline that is sufficient to identify the three-dimensional information but not so large as to undesirably increase measurement uncertainty beyond a certain amount.

3. The method of claim 2,
Further comprising testing the image pairs to require an angular baseline greater than a first specified amount and to require a correspondence between features of the pairs greater than a second specified amount.

The method according to claim 1,
Further comprising analyzing the plurality of images a second time to find image clusters having more than a specified number of feature point matches between images, and using the image clusters to refine the first analysis performed using the prior knowledge, thereby forming a set of tracked feature points.

9. The method of claim 8,
Wherein the image clusters comprise pairs of images.

9. The method of claim 8,
Further comprising using the set of tracked feature points to find the position and motion of the tracked feature points.

11. The method of claim 10,
Further comprising using the position of the tracked feature points to refine the sparse three-dimensional features.

The method according to claim 1,
Wherein finding the high density features includes limiting a search range for the high density features.

The method according to claim 1,
Wherein finding the high-density features includes rejecting outlier portions that are farther than a predetermined distance from other features.

14. The method of claim 13,
Wherein rejecting the outlier portions comprises converting data into voxels and rejecting data that is farther than a predetermined distance from the voxels.

The method according to claim 1,
Further comprising rejecting portions that deviate from the salient surface by more than a specified amount.

16. The method of claim 15,
Wherein the rejecting step comprises finding the portions using tensor voting.

In a facial reconstruction system,
A camera for acquiring a plurality of facial images; And

A processing unit that analyzes the plurality of images to find sparse three-dimensional facial features using prior face knowledge, wherein the processing unit uses the sparse three-dimensional facial features to analyze the plurality of images to find high-density features using a data-driven approach.

18. The method of claim 17,
Wherein the camera is a still camera.

18. The method of claim 17,
Wherein the camera is a video camera.

18. The method of claim 17,
Wherein the processing unit is operative to use the prior knowledge to identify features in the images and to rectify between pairs of images to find pairs of similar images.

21. The method of claim 20,
Wherein the processing unit uses the prior knowledge to identify a face mask representing a general face.

18. The method of claim 17,
Wherein the processing unit is operative to test the image pairs to require an angular baseline greater than a first specified amount and a correspondence between features of the pairs greater than a second specified amount.

23. The method of claim 22,
Wherein the image clusters comprise pairs of images.

18. The method of claim 17,
Wherein the processing unit finds dense features by rejecting outlier portions that are farther than a predetermined distance from other features.

25. The method of claim 24,
Wherein the processing unit performs the rejecting step using tensor voting.

In a facial reconstruction method,
Analyzing a plurality of images from a single camera to rectify the plurality of images and to find three-dimensional information representative of at least one face from the plurality of images, wherein analyzing the plurality of images comprises an initial analysis that uses prior face knowledge to determine initial features in the images, and a subsequent analysis that uses the initial features to find additional information without using any prior knowledge.

27. The method of claim 26,
Further comprising using the prior knowledge to identify features in the images, and rectifying between pairs of images to find pairs of similar images.

27. The method of claim 26,
Wherein the prior knowledge is used to identify portions of the face.

29. The method of claim 28,
Wherein the prior knowledge is used to identify a facial mask representing a generic face.

28. The method of claim 27,
Wherein the pair of similar images are image pairs sufficient to identify the three-dimensional information but comprise an angular base line that is not large enough to inadvertently increase measurement uncertainty beyond a certain amount.

27. The method of claim 26,
Wherein the subsequent analysis comprises limiting the search range of high density features.

27. The method of claim 26,
Rejecting outlier portions that are farther than a predetermined distance from the other features.

33. The method of claim 32,
Wherein rejecting the outlier portions comprises converting data into voxels, and rejecting data farther than a predetermined distance from the voxels.

27. The method of claim 26,
Rejecting portions that deviate from the salient surface by more than a specified amount.

35. The method of claim 34,
Wherein the rejecting step comprises finding the portions using tensor voting.

Analyzing a plurality of facial images to find sparse information from the face; And

Using the sparse information to find dense information using a data-driven approach, wherein using the sparse information comprises using a tensor voting technique and limiting the search range for the dense information.

In a facial processing method,
Analyzing a plurality of facial images to find matches between the images using prior knowledge of a generic face, the matches being used to generate sparse information;

Using the matches to form pairs of images; And

Analyzing the pairs to find a set of dense features using a data-driven approach without any prior knowledge of the generic face, wherein analyzing the pairs comprises removing outlier portions from the set of dense features.

A method for automatically reconstructing a 3D face from a plurality of 2D images of a human face, the method comprising: using prior face knowledge of a generic face to derive an initial camera position estimate;

Selecting pairs of images and detecting sparse shape points for each of the pairs of images;

Refining the initial camera position estimate and the sparse shape points;
Using a pure data driven approach in detecting dense 3D point clouds from the image pairs;

Merging the high density 3D point clouds into a single 3D cloud;

Removing outliers from the single 3D point cloud to form a cleaned 3D point cloud;

Fitting the connected surface to the cleaned 3D point cloud; And

Texture mapping the detail information and color information of the subject's face onto the connected surface.

A method for automatically reconstructing a 3D face from a plurality of 2D images of a human face, the method comprising: using prior face knowledge of a generic face to derive an initial camera position estimate;

Selecting pairs of images and detecting sparse shape points for each of the pairs of images;

Refining the initial camera position estimate and the sparse shape points;

Using a pure data-driven approach in detecting dense 3D point clouds from the image pairs;

Merging the high density 3D point clouds into a single 3D cloud;

Using prior face knowledge of a generic face to remove outliers from the single 3D point cloud to form a cleaned 3D point cloud;

Fitting the connected surface to the cleaned 3D point cloud; And

Texture mapping the detail information and color information of the subject's face onto the connected surface.

The method according to claim 1,
Wherein the POSIT algorithm determines the coordinate values of the human head, among the parts of the body, by using a method of inserting a virtual object into the camera view using only the relative coordinates between 3D objects.

41. The method of claim 40,
Wherein, once the relative coordinates between the points of the 3D object are available, the camera matrix of each video frame is computed and the virtual object is composited with its occlusion taken into account.