CN107292956A - A scene reconstruction method based on the Manhattan assumption - Google Patents
A scene reconstruction method based on the Manhattan assumption
- Publication number
- CN107292956A CN107292956A CN201710563682.0A CN201710563682A CN107292956A CN 107292956 A CN107292956 A CN 107292956A CN 201710563682 A CN201710563682 A CN 201710563682A CN 107292956 A CN107292956 A CN 107292956A
- Authority
- CN
- China
- Prior art keywords
- normal
- normal vector
- vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T17/00—Three dimensional [3D] modelling, e.g. data description of 3D objects
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/30—Determination of transform parameters for the alignment of images, i.e. image registration
- G06T7/38—Registration of image sequences
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/80—Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2200/00—Indexing scheme for image data processing or generation, in general
- G06T2200/08—Indexing scheme for image data processing or generation, in general involving all processing steps from image acquisition to 3D model generation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10028—Range image; Depth image; 3D point clouds
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computer Graphics (AREA)
- Geometry (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
- Image Processing (AREA)
Abstract
The present invention discloses a scene reconstruction method based on the Manhattan assumption. The method yields an accurate motion estimate under this assumption. First, the normal directions of all 3D points in a recorded frame are estimated; then the normal vector directions of the three main orthogonal planes are estimated by PCA. Because each plane normal is estimated from all depth points on the plane, random noise is largely filtered out, so the estimated normal vectors are highly accurate. The principal-plane coordinates are further determined from the position of each pixel in 3D coordinates. Finally, the camera pose is estimated from the transformation matrix obtained from the plane coordinates and normal vectors, and the camera poses of all frames are used to stitch the scene into a 3D model. Because the plane information of the present invention is computed from a large number of points, it is more robust than feature-point methods that rely on single points.
Description
Technical field
The invention belongs to the field of computer vision, in particular to three-dimensional scene reconstruction, and specifically relates to a scene reconstruction method based on the Manhattan assumption.
Background art
In recent years, with the development of depth-sensing technology, real-time 3D scanning of indoor scenes has become feasible. Industry has proposed several systems that generate promising results. Meanwhile, as augmented reality (AR) has become a hot topic in academia and industry, real-time 3D scanning is urgently needed, because recovering the 3D geometry of a real scene is the key to seamlessly aligning virtual objects with it. On Microsoft's head-mounted display HoloLens, many AR applications need to scan the 3D geometry of the current room.
With a depth camera, 3D information is recorded directly; the key to 3D scanning is estimating the camera motion between every two consecutive input frames. Iterative closest point (ICP) is first used to estimate the correspondence between the point clouds obtained from two depth frames; the two point clouds can then be merged using the estimated camera motion.
However, ICP-based methods need abundant geometric features in the scene to robustly estimate the correct camera motion. For two purely planar surfaces in 3D space, the closest points may not be the correct correspondences; in such cases ICP may produce a wrong camera motion. Moreover, ICP needs a large number of sampled points and must iterate to converge to the final correspondences, which implies a relatively heavy computational cost. Even though some ICP-based systems use a GPU to achieve real-time performance, they are still unsuitable for many practical applications, because the GPU may be occupied by other tasks, so that the ICP-based system cannot compute in time.
Summary of the invention
The purpose of the present invention is to provide, in view of the shortcomings of the prior art, a scene reconstruction method based on the Manhattan assumption.
When reconstructing an indoor scene in three dimensions, most scene structures satisfy the Manhattan assumption: the scene is composed of several mutually orthogonal planes, such as the ceiling, walls, and floor. When the image sequence captured by the depth camera contains enough orthogonal planes, there may be large planar regions; because planes usually have uniform color (walls, ceilings, etc.), very few features can be extracted. In this case, a scheme based on the Manhattan assumption yields an accurate motion estimate. First, the normal directions of all 3D points in a recorded frame are estimated; then the normal vector directions of the three main orthogonal planes (large planar regions in the scene such as walls, ceiling, and floor) are estimated by principal component analysis (PCA). Because each plane normal is estimated from all depth points on the plane, random noise is largely filtered out, so the estimated normal vectors are highly accurate. The principal-plane coordinates are further determined from the position of each pixel in 3D coordinates. Finally, the camera pose is estimated from the transformation matrix obtained from the plane coordinates and normal vectors, and the camera poses of all frames are used to stitch the scene into a 3D model.
The method of the invention comprises the following steps:
Step (1): capture an image sequence of the indoor scene with a depth camera and compute the normal vector of each pixel in the image, specifically:
First, all pixels of a recorded frame in the image sequence are converted to 3D coordinates through the camera model of the depth camera; then the normal vector is computed from the 3D coordinates of the four pixels adjacent to a given pixel.
D1(u, v) = D(u+k, v) − D(u−k, v)    (1)
D2(u, v) = D(u, v+k) − D(u, v−k)    (2)
where k denotes the distance between two pixels and is an adjustable parameter; D1 and D2 are vectors passing through pixel D(u, v).
Substituting D1 and D2 into formula (3) gives the normal vector n(u, v) of D(u, v):
n(u, v) = ψ(D1(u, v) × D2(u, v))    (3)
where × is the cross product and ψ is the function that normalizes a vector to unit length: ψ(normal) = normal · ||normal||^{-1}.
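The four-neighbor normal computation of formulas (1)–(3) can be sketched as follows. This is an illustrative NumPy sketch only; the function name, the H × W × 3 array layout, and the zero-filled border are our assumptions, not part of the patent.

```python
import numpy as np

def normals_from_points(D, k=2):
    """Estimate per-pixel normals from a back-projected 3D coordinate map.

    D: H x W x 3 array of camera-space 3D points, the D(u, v) of Eqs. (1)-(3).
    k: pixel step of the central differences (adjustable, as in the patent).
    Returns an H x W x 3 array of unit normals (zero where undefined).
    """
    H, W, _ = D.shape
    n = np.zeros_like(D, dtype=float)
    # D1(u,v) = D(u+k,v) - D(u-k,v);  D2(u,v) = D(u,v+k) - D(u,v-k)
    D1 = D[2 * k:, k:-k] - D[:-2 * k, k:-k]
    D2 = D[k:-k, 2 * k:] - D[k:-k, :-2 * k]
    cross = np.cross(D1, D2)                        # Eq. (3) before normalizing
    norm = np.linalg.norm(cross, axis=-1, keepdims=True)
    norm[norm == 0] = 1.0                           # avoid division by zero
    n[k:-k, k:-k] = cross / norm                    # psi(.): scale to unit length
    return n
```

For a flat surface lying in the XY plane, every interior pixel gets the normal (0, 0, 1), as expected.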
Step (2): establish spherical coordinates to collect normal-vector statistics and extract, from the normal vectors of all pixels, the sets of normal vectors approximating the principal-plane normals, specifically:
2.1 Establishing unit spherical coordinates
The normal n = (x, y, z) obtained in step (1) is converted into the spherical form n = (α, β, γ), where α is the angle between the normal n and axis X, β is the angle between n and axis Y, and γ is the angle between n and axis Z.
2.2 Normal-vector statistics
Each axis of the unit spherical coordinates is subdivided into 180 intervals, i.e., each interval spans 1 degree; then three intervals, one per axis, are combined into a container, and each normal vector n is stored in its corresponding container. An axis here refers to the coordinate α, β, or γ, each ranging from 0 to 180 degrees.
2.3 Extraction of the principal normal-vector sets
Because a frame contains a large number of pixels belonging to a given principal plane, the container corresponding to that plane's normal holds far more normal vectors than the others; therefore all normal vectors in the container with the largest count are taken as the normal-vector set L1 of the first principal plane.
According to the perpendicularity of principal-plane normals, the normal-vector set L2 of the second principal plane satisfies formula (4):
θ1 < Θ(α1, α2) + Θ(β1, β2) + Θ(γ1, γ2) < θ2    (4)
where Θ(a, b) = cos(a)·cos(b), θ1 = π·100/180, θ2 = π·80/180;
The normal-vector set L3 of the third principal plane consists of the normal vectors lying 80° to 100° from both L1 and L2.
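The container statistics of steps 2.1–2.3 can be sketched as follows. This is an illustrative sketch; the dictionary-of-bins data structure and function names are our assumptions — the patent specifies only the 180 one-degree intervals per axis and the selection of the fullest container.

```python
import numpy as np
from collections import defaultdict

def bin_normals(normals):
    """Step 2: convert each unit normal to its angles (alpha, beta, gamma)
    against the X, Y, Z axes and store it in a 1-degree-per-axis container."""
    # The components of a unit vector are the cosines of (alpha, beta, gamma),
    # so arccos of each component gives the three angles in degrees.
    angles = np.degrees(np.arccos(np.clip(normals, -1.0, 1.0)))
    bins = np.minimum(angles.astype(int), 179)      # 180 intervals of 1 degree
    containers = defaultdict(list)
    for i, key in enumerate(map(tuple, bins)):
        containers[key].append(i)                   # store indices of normals
    return containers

def largest_container(containers):
    """The fullest container yields the candidate set L1 of the first plane."""
    key = max(containers, key=lambda k: len(containers[k]))
    return key, containers[key]
```

For normals clustered around the Z axis, the dominant container is (90, 90, 0): 90° to X, 90° to Y, 0° to Z.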
Step (3): principal component analysis
All normal vectors in the three sets L1, L2, L3 are used as the input of PCA; PCA then extracts the main directions among these normal vectors, yielding three mutually orthogonal eigenvectors, which are taken as the principal-plane normals n1, n2, n3. The eigenvector corresponding to the smallest eigenvalue output by PCA is the normal of the first principal plane; conversely, the eigenvector corresponding to the largest eigenvalue is the normal of the third principal plane.
Step (4): first compute the projected position of each 3D point on the normals n1, n2, n3, then use the projected positions to extract the positions d1, d2, d3 of the principal planes along their normals; each principal plane can then be expressed in normal form (n, d), specifically:
4.1 Taking the optical center of the depth camera as the coordinate origin, compute for each pixel's 3D coordinate D(u, v) its projected positions on the three principal-plane normals: p_i^f = D(u, v)·n_i, i = 1, 2, 3.
4.2 Ideally, for a plane with normal n_i, the distance d of the plane from the camera optical center equals the projected position of any point of the plane on the plane normal, i.e., D(u, v)·n_i = d. However, owing to the depth camera's own noise and limited precision, the projected positions of some points of the plane on the plane normal do not equal the plane's distance from the optical center. Therefore a one-dimensional mean-shift algorithm is applied to the projected positions of all pixels to extract the peak; the projected position corresponding to the peak is exactly the distance d from the principal plane to the camera optical center. This step thus yields the distances d1, d2, d3 of the three principal planes from the camera optical center.
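The one-dimensional peak extraction of step 4.2 can be sketched as follows. This is an illustrative mean-shift sketch; the flat kernel, the bandwidth, and the histogram seeding are our assumptions — the patent specifies only a one-dimensional mean shift on the projections.

```python
import numpy as np

def plane_offset(points, n, bandwidth=0.05, iters=30):
    """Step 4: distance d of a principal plane from the camera optical center.

    Projects every 3D point onto the plane normal n and runs a simple
    one-dimensional mean shift, seeded at the densest histogram bin, to find
    the peak of the projection distribution.
    """
    p = points @ n                                   # projections D(u,v) . n_i
    # Seed at the fullest coarse bin, then refine with mean-shift updates.
    hist, edges = np.histogram(p, bins=100)
    j = int(np.argmax(hist))
    x = 0.5 * (edges[j] + edges[j + 1])
    for _ in range(iters):
        w = np.abs(p - x) < bandwidth                # flat (uniform) kernel
        if not w.any():
            break
        x_new = p[w].mean()
        if abs(x_new - x) < 1e-9:
            break
        x = x_new
    return x
```

Points scattered around a plane at distance 2 from the origin, projected onto its normal, give a peak close to 2.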
Step (5): compute the camera pose transformation matrix from the normals and distances of the three principal planes obtained above, and reconstruct the scene from each frame's camera pose transformation matrix and the 3D coordinates of all pixels, specifically:
The camera pose transformation matrix from frame f to frame g can be expressed in the form
T_{f,g} = [ R  t ; 0  1 ]    (5)
where R is a 3 × 3 rotation matrix and t is a 3 × 1 translation vector.
The camera coordinates of the first frame are taken as the world coordinates, i.e., the world coordinates of the scene, so that the 3D coordinates D_f(u, v) of each pixel in frame f are transformed into world coordinates through the camera pose transformation matrices:
D_f'(u, v) = T_{1,2} · T_{2,3} ⋯ T_{f-1,f} · D_f(u, v)    (8)
The pixel 3D coordinates D_f'(u, v) of all frames, transformed into world coordinates, are accumulated to obtain the 3D point cloud of the scene, which is rendered with OpenGL to give the final 3D model after scene reconstruction.
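The pose chaining and accumulation of formula (8) can be sketched as follows. This is an illustrative sketch; the function name and the homogeneous-coordinate layout are our assumptions.

```python
import numpy as np

def accumulate_cloud(frames, transforms):
    """Eq. (8): chain the per-frame pose matrices T_{1,2} T_{2,3} ... T_{f-1,f}
    and map every frame's points into the first frame's (world) coordinates,
    accumulating a single scene point cloud.

    frames:     list of (N_i x 3) point arrays, one per frame.
    transforms: list of 4 x 4 pose matrices, transforms[i] = T_{i+1, i+2};
                len(transforms) == len(frames) - 1.
    """
    cloud = [frames[0]]                                 # frame 1 defines world
    T = np.eye(4)
    for pts, T_step in zip(frames[1:], transforms):
        T = T @ T_step                                  # T_{1,2} ... T_{f-1,f}
        hom = np.hstack([pts, np.ones((len(pts), 1))])  # homogeneous coords
        cloud.append((hom @ T.T)[:, :3])
    return np.vstack(cloud)
```

A point at the origin of the second camera, linked to the world by a pure translation of (1, 0, 0), lands at (1, 0, 0) in the accumulated cloud.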
The beneficial effects of the invention are:
This planar method based on the Manhattan assumption can reconstruct indoor scenes even when geometric and texture features are scarce, with a small amount of computation; both points are important, and challenging, for image-feature-based methods and ICP-based methods. The main idea involves the Manhattan assumption: principal-plane normals are estimated efficiently by establishing unit spherical coordinates, the distances to the camera optical center are computed quickly and accurately with a one-dimensional mean-shift algorithm, and the camera motion is then computed from the information of the three principal planes. Because the plane information of the present invention is computed from a large number of points, it is more robust than feature-point methods that rely on single points.
Brief description of the drawings
Fig. 1 is the planar model for computing normals;
Fig. 2 is the unit sphere in the 3D Cartesian coordinate system;
Fig. 3 is the orthogonality diagram of the principal axes l1, l2, l3;
Fig. 4 is the flow chart of the method of the invention.
Embodiment
The present invention is analyzed further below with reference to a specific embodiment.
A scene reconstruction method based on the Manhattan assumption comprises the following steps, as shown in Fig. 4:
Step (1): capture an image sequence of the indoor scene with a depth camera and compute the normal vector of each pixel in the image.
First, all pixels of a recorded frame in the image sequence are converted to 3D coordinates through the camera model of the depth camera; then the normal vector is computed from the 3D coordinates of the four pixels adjacent to a given pixel. The concrete form is shown in Fig. 1, where O-UV is the pixel coordinate system and O-XYZ is the camera coordinate system.
D1(u, v) = D(u+k, v) − D(u−k, v)    (1)
D2(u, v) = D(u, v+k) − D(u, v−k)    (2)
where k denotes the distance between two pixels and is an adjustable parameter; D1 and D2 are vectors passing through pixel D(u, v), and D(u, v) is the 3D coordinate of the pixel at position (u, v) of the pixel coordinate system.
Substituting D1 and D2 into formula (3) gives the normal vector:
n(u, v) = ψ(D1(u, v) × D2(u, v))    (3)
where × is the cross product and ψ is the function that normalizes a vector to unit length: ψ(normal) = normal · ||normal||^{-1}.
Step (2): establish spherical coordinates to collect normal-vector statistics and extract, from the normal vectors of all pixels, the sets of normal vectors approximating the principal-plane normals; the concrete form of this step is shown in Fig. 2.
2.1 Establishing unit spherical coordinates
A large number of points in a captured image belong to a given principal plane (e.g., a wall); the normal directions computed for these points in step (1) are very close to each other and to that principal plane's normal. At the same time, owing to irregular objects and to the interference and noise produced by the sensor itself, the normals computed for all pixels of a frame are not all close to a principal-plane normal. Unit spherical coordinates are therefore established on the basis of the 3D coordinates.
The computed normal n = (x, y, z) is converted into the spherical form n = (α, β, γ), where α is the angle between the normal n and axis X, β is the angle between n and axis Y, and γ is the angle between n and axis Z.
2.2 Normal-vector statistics
Each axis of the unit spherical coordinates is first subdivided into 180 intervals, i.e., each interval spans 1 degree, where an axis refers to the coordinate α, β, or γ, each ranging from 0 to 180 degrees. Then three intervals, one per axis, are combined into a container; the total number of containers is the product of the interval counts of the three axes, i.e., 5,832,000. Each container stores the normal vectors belonging to its intervals; for example, one container stores the normals with α in 90°–91°, β in 90°–91°, and γ in 90°–91°. Finally each normal vector n is stored in its corresponding container.
2.3 Extraction of the principal normal-vector sets
Because a frame contains a large number of pixels belonging to a given principal plane, the container corresponding to that plane's normal holds far more normal vectors than the others. The present invention therefore takes all normal vectors in the container holding the most normals as the normal-vector set L1 of the first principal plane. According to the Manhattan assumption, the principal planes in the image are mutually orthogonal, i.e., the normals of any two principal planes are perpendicular. Each container is then regarded as a vector, namely the vector l = (α, β, γ) of the mid-values of its intervals; for example, the container storing the normals with α in 90°–91°, β in 90°–91°, and γ in 90°–91° is represented by the vector l = (90.5, 90.5, 90.5). From the perpendicularity of principal-plane normals we can identify the container L2 of the second principal plane's normals: the angle between its vector representation l2 and the l1 of the first principal plane is about 90°. Allowing for an error range, all normal vectors in the container holding the most normals among the containers within 80° to 100° of l1 are taken as the normal-vector set L2 of the second principal plane; the extraction formula for container l2 is formula (4):
θ1 < Θ(α1, α2) + Θ(β1, β2) + Θ(γ1, γ2) < θ2    (4)
where Θ(a, b) = cos(a)·cos(b), θ1 = π·100/180, θ2 = π·80/180.
Finally l3 is found among the containers within 80° to 100° of both l1 and l2, and all normal vectors in l3 are taken as the normal-vector set L3 of the third principal plane.
Step (3): principal component analysis
The normals of the three principal planes are mutually orthogonal, but merely averaging the normals within the three sets L1, L2, L3 does not yield the most accurate values for the principal-plane normals, and with high probability the vectors so extracted are not mutually orthogonal. The present invention therefore exploits the linear independence of the eigenvectors in principal component analysis (PCA) and extracts the principal-plane normals with the PCA method.
All normal vectors in the three sets L1, L2, L3 are used as the input of PCA, which extracts the main directions among them. Finally three mutually orthogonal eigenvectors are obtained and taken as the principal-plane normals n1, n2, n3, where the eigenvector corresponding to the smallest eigenvalue output by PCA is the normal of the first principal plane and, conversely, the eigenvector corresponding to the largest eigenvalue is the normal of the third principal plane.
Step (4): compute the coordinate of each 3D point on the normals n1, n2, n3 to extract the positions d1, d2, d3 of the principal planes along their normals; each principal plane can then be expressed in normal form (n, d):
Taking the optical center of the depth camera as the coordinate origin, compute for each pixel's 3D coordinate D(u, v) its projected positions on the three principal-plane normals, p_i^f = D(u, v)·n_i, i = 1, 2, 3. Ideally, for the plane with normal n_i, the distance d of the plane from the camera optical center equals the projected position of any point of the plane on the plane normal, i.e., D(u, v)·n_i = d. However, owing to the depth camera's own noise and limited precision, the projected positions of some points of the plane on the plane normal do not equal the plane's distance from the optical center; therefore a one-dimensional mean-shift algorithm is applied to the projected positions of all pixels to extract the peak, and the projected position corresponding to the peak is exactly the distance d from the principal plane to the camera optical center.
In Fig. 3, l1, l2, l3 are respectively the normals of the three orthogonal principal planes; the curve on the l1 axis represents the number of 3D points at each projected position on that axis, where the projected position corresponding to the peak is exactly the distance d from the principal plane to the camera optical center. Some smaller peaks can also be observed in Fig. 3; they represent clusters containing few samples and correspond to small facets parallel to the principal plane. However, the present invention needs only the principal-plane information to reconstruct the scene; this step therefore yields the distances d1, d2, d3 of the three principal planes from the camera optical center.
Step (5): compute the camera pose transformation matrix from the normals and distances of the three principal planes obtained above, and reconstruct the scene from each frame's camera pose transformation matrix and the 3D coordinates of all pixels.
The camera pose transformation matrix from frame f to frame g can be expressed in the form
T_{f,g} = [ R  t ; 0  1 ]    (5)
where R is a 3 × 3 rotation matrix and t is a 3 × 1 translation vector.
The present invention obtains the information of the three perpendicular principal planes through steps 1–4 above. In this step, the plane normals are used to compute the rotation matrix R from frame g to frame f:
R = (n_1^f, n_2^f, n_3^f) · (n_1^g, n_2^g, n_3^g)^{-1}    (6)
and the translation vector t is obtained by computing the plane offsets between consecutive frames and converting them into camera coordinates:
t = Σ_i (d_i^f − d_i^g) · n_i^g,  i = 1, 2, 3    (7)
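The pose recovery from the plane parameters, as in formulas (6) and (7) of the claims, can be sketched as follows. This is an illustrative sketch; arranging the three normals as matrix columns is our convention, not taken from the patent.

```python
import numpy as np

def relative_pose(Nf, Ng, df, dg):
    """Camera rotation and translation between frames f and g from the three
    principal-plane normals and plane offsets (Eqs. (6) and (7)).

    Nf, Ng: 3 x 3 matrices whose COLUMNS are the plane normals n1, n2, n3
            expressed in frames f and g.
    df, dg: length-3 arrays of the plane offsets d1, d2, d3 in each frame.
    """
    R = Nf @ np.linalg.inv(Ng)                             # Eq. (6)
    t = sum((df[i] - dg[i]) * Ng[:, i] for i in range(3))  # Eq. (7)
    return R, t
```

As a sanity check, rotating an identity normal frame by a known rotation with unchanged offsets recovers exactly that rotation and a zero translation.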
The present invention takes the camera coordinates of the first frame as the world coordinates, i.e., the world coordinates of the scene, so that the 3D coordinates D(u, v) of each pixel in frame f are transformed into world coordinates through the camera pose transformation matrices:
D_f'(u, v) = T_{1,2} · T_{2,3} ⋯ T_{f-1,f} · D_f(u, v)    (8)
Formula (8) gives the position in world coordinates of every 3D point in all image sequences of the scene captured by the depth camera; the pixel 3D coordinates D_f'(u, v) of all frames, transformed into world coordinates, are then accumulated to obtain the 3D point cloud of the scene, which is rendered with OpenGL to give the final 3D model after scene reconstruction.
The embodiment described above does not limit the present invention, and the present invention is not restricted to this embodiment; anything that satisfies the claims of the present application falls within the scope of protection of the present invention.
Claims (6)
1. A scene reconstruction method based on the Manhattan assumption, characterized in that the method comprises the following steps:
Step (1): capture an image sequence of the indoor scene with a depth camera and compute the normal vector of each pixel in the image;
Step (2): establish spherical coordinates to collect normal-vector statistics and extract, from the normal vectors of all pixels, the sets of normal vectors approximating the principal-plane normals;
Step (3): from all normal vectors in the sets approximating the principal-plane normals, extract their main directions using PCA to obtain mutually orthogonal principal-plane normal vectors;
Step (4): obtain the projected position of each pixel's 3D coordinate on the above principal-plane normals, then use the projected positions to extract the positions of the principal planes along their normals;
Step (5): from the principal-plane normals and the positions of the principal planes along their normals obtained in the above steps, compute the camera pose transformation matrix; and reconstruct the scene from each frame's camera pose transformation matrix and the 3D coordinates of all pixels.
2. The scene reconstruction method based on the Manhattan assumption according to claim 1, characterized in that step (1), capturing an image sequence of the indoor scene with a depth camera and computing the normal vector of each pixel in the image, is specifically:
First, all pixels of a recorded frame in the image sequence are converted to 3D coordinates through the camera model of the depth camera; then the normal vector is computed from the 3D coordinates of the four pixels adjacent to a given pixel;
D1(u, v) = D(u+k, v) − D(u−k, v)    (1)
D2(u, v) = D(u, v+k) − D(u, v−k)    (2)
where k denotes the distance between two pixels and is an adjustable parameter; D1 and D2 are vectors passing through pixel D(u, v);
Substituting D1 and D2 into formula (3) gives the normal vector n(u, v) of D(u, v);
n(u, v) = ψ(D1(u, v) × D2(u, v))    (3)
where × is the cross product and ψ is the function that normalizes a vector to unit length: ψ(normal) = normal · ||normal||^{-1}.
3. The scene reconstruction method based on the Manhattan assumption according to claim 2, characterized in that step (2), establishing spherical coordinates to collect normal-vector statistics and extracting from the normal vectors of all pixels the sets of normal vectors approximating the principal-plane normals, is specifically:
2.1 Establishing unit spherical coordinates
The normal n = (x, y, z) obtained in step (1) is converted into the spherical form n = (α, β, γ), where α is the angle between the normal n and axis X, β is the angle between n and axis Y, and γ is the angle between n and axis Z;
2.2 Normal-vector statistics
Each axis of the unit spherical coordinates is subdivided into 180 intervals, i.e., each interval spans 1 degree; then three intervals, one per axis, are combined into a container, and each normal vector n is stored in its corresponding container;
An axis refers to the coordinate α, β, or γ, each ranging from 0 to 180 degrees;
2.3 Extraction of the principal normal-vector sets
Because a frame contains a large number of pixels belonging to a given principal plane, the container corresponding to that plane's normal holds far more normal vectors than the others; therefore all normal vectors in the container with the largest count are taken as the normal-vector set L1 of the first principal plane;
According to the perpendicularity of principal-plane normals, the normal-vector set L2 of the second principal plane satisfies formula (4):
θ1 < Θ(α1, α2) + Θ(β1, β2) + Θ(γ1, γ2) < θ2    (4)
where Θ(a, b) = cos(a)·cos(b), θ1 = π·100/180, θ2 = π·80/180;
The normal-vector set L3 of the third principal plane consists of the normal vectors lying 80° to 100° from both L1 and L2.
4. The scene reconstruction method based on the Manhattan assumption according to claim 3, characterized in that step (3) is specifically: all normal vectors in the three sets L1, L2, L3 are used as the input of PCA; PCA then extracts the main directions among these normal vectors, yielding three mutually orthogonal eigenvectors, which are taken as the principal-plane normals n1, n2, n3; the eigenvector corresponding to the smallest eigenvalue output by PCA is the normal of the first principal plane and, conversely, the eigenvector corresponding to the largest eigenvalue is the normal of the third principal plane.
5. The scene reconstruction method based on the Manhattan assumption according to claim 4, characterized in that step (4) first computes the projected position of each pixel's 3D coordinate on the normals n1, n2, n3 and then uses the projected positions to extract the positions d1, d2, d3 of the principal planes along their normals, so that each principal plane can be expressed in normal form (n, d), specifically:
4.1 Taking the optical center of the depth camera as the coordinate origin, compute for each pixel's 3D coordinate D(u, v) its projected positions on the three principal-plane normals, p_i^f = D(u, v)·n_i, i = 1, 2, 3;
4.2 Ideally, for a plane with normal n_i, the distance d of the plane from the camera optical center equals the projected position of any point of the plane on the plane normal, i.e., D(u, v)·n_i = d; however, owing to the depth camera's own noise and limited precision, the projected positions of some points of the plane on the plane normal do not equal the plane's distance from the optical center; therefore a one-dimensional mean-shift algorithm is applied to the projected positions of all pixels to extract the peak, and the projected position corresponding to the peak is exactly the distance d from the principal plane to the camera optical center.
6. The scene reconstruction method based on the Manhattan assumption according to claim 5, characterized in that step (5) is specifically:
The camera pose transformation matrix from frame f to frame g can be expressed in the following form:
T_{f,g} = [ R  t ; 0  1 ]    (5)
where R is a 3 × 3 rotation matrix and t is a 3 × 1 translation vector;
R = (n_1^f, n_2^f, n_3^f) · (n_1^g, n_2^g, n_3^g)^{-1}    (6)
t = Σ_i (d_i^f − d_i^g) · n_i^g,  i = 1, 2, 3    (7)
The camera coordinate system of the first frame is taken as the world coordinate system, i.e. the world coordinate system of the scene, so that all 3D points D_f(u, v) in frame f are mapped into world coordinates P_global by the chained camera pose transformation matrices:

$$D_f'(u, v)=T_{1,2}\cdot T_{2,3}\cdots T_{f-1,f}\cdot D_f(u, v) \qquad (8)$$

The 3D points D_f'(u, v) of all frames are accumulated to obtain the 3D point cloud of the scene, which is rendered with OpenGL to yield the final three-dimensional model of the reconstructed scene.
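The chaining of pairwise pose matrices in equation (8) and the accumulation of per-frame points can be sketched as follows. This is an illustrative NumPy version (function names and array shapes are assumptions, and the OpenGL rendering step is omitted): points are carried in homogeneous coordinates so the 4 × 4 matrices of equation (5) apply directly.

```python
import numpy as np

def accumulate_point_cloud(transforms, frame_points) -> np.ndarray:
    """Map every frame's points into the first camera's (world) frame.

    transforms[k]: 4x4 matrix T_{k+1,k+2} from equation (5);
    frame_points[f]: (N, 3) array of points D_f(u, v) in camera f's frame.
    Returns the stacked world-coordinate point cloud.
    """
    cloud = [frame_points[0]]          # frame 1 already lies in world coords
    T = np.eye(4)
    for f in range(1, len(frame_points)):
        T = T @ transforms[f - 1]      # T_{1,2} . T_{2,3} ... T_{f-1,f}
        pts = frame_points[f]
        homo = np.hstack([pts, np.ones((pts.shape[0], 1))])
        cloud.append((homo @ T.T)[:, :3])
    return np.vstack(cloud)
```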
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710563682.0A CN107292956B (en) | 2017-07-12 | 2017-07-12 | Scene reconstruction method based on Manhattan hypothesis |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107292956A true CN107292956A (en) | 2017-10-24 |
CN107292956B CN107292956B (en) | 2020-09-22 |
Family
ID=60100633
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201710563682.0A Active CN107292956B (en) | 2017-07-12 | 2017-07-12 | Scene reconstruction method based on Manhattan hypothesis |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107292956B (en) |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
EP0633550B1 (en) * | 1993-06-29 | 2001-10-04 | Canon Kabushiki Kaisha | Image processing method and apparatus thereof |
CN103247045A (en) * | 2013-04-18 | 2013-08-14 | 上海交通大学 | Method of obtaining artificial scene main directions and image edges from multiple views |
CN103714420A (en) * | 2013-12-11 | 2014-04-09 | 深圳先进技术研究院 | Object three-dimensional reconstruction method and device |
CN104123746A (en) * | 2014-07-10 | 2014-10-29 | 上海大学 | Calculating method for three-dimension scanning point cloud real-time normal vectors |
CN105205858A (en) * | 2015-09-18 | 2015-12-30 | 天津理工大学 | Indoor scene three-dimensional reconstruction method based on single depth vision sensor |
CN106327576A (en) * | 2016-08-10 | 2017-01-11 | 周口师范学院 | Urban scene reconstruction method and system |
CN106530342A (en) * | 2016-10-31 | 2017-03-22 | 武汉大学 | Measurable panorama image generation method assisted by laser point cloud |
CN106570507A (en) * | 2016-10-26 | 2017-04-19 | 北京航空航天大学 | Multi-angle consistent plane detection and analysis method for monocular video scene three dimensional structure |
Non-Patent Citations (4)
Title |
---|
TAGUCHI Y., JIAN Y.-D., RAMALINGAM S., et al.: "Point-plane SLAM for hand-held 3D sensors", IEEE International Conference on Robotics and Automation * |
WANG Wei, GAO Wei, et al.: "Fast and Robust Piecewise-Planar Reconstruction of Urban Scenes", Acta Automatica Sinica * |
WANG Sijie, FANG Lina, et al.: "Single-Image 3D Reconstruction of Buildings Based on Structured Scenes", Journal of Geo-information Science * |
MIAO Jun, CHU Jun: "Dense Reconstruction of Multi-planar Scenes from Sparse Point Clouds", Acta Automatica Sinica * |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108805972A (en) * | 2018-04-11 | 2018-11-13 | 杭州电子科技大学 | A kind of three-dimensional rebuilding method based on ground normal direction and two-dimentional intersection |
CN108648224A (en) * | 2018-05-18 | 2018-10-12 | 杭州电子科技大学 | A method of the real-time scene layout identification based on artificial neural network and reconstruction |
CN108648224B (en) * | 2018-05-18 | 2021-07-13 | 杭州电子科技大学 | Real-time scene layout recognition and reconstruction method based on artificial neural network |
WO2020042970A1 (en) * | 2018-08-29 | 2020-03-05 | 广景视睿科技(深圳)有限公司 | Three-dimensional modeling method and device therefor |
CN110782524A (en) * | 2019-10-25 | 2020-02-11 | 重庆邮电大学 | Indoor three-dimensional reconstruction method based on panoramic image |
CN110782524B (en) * | 2019-10-25 | 2023-05-23 | 重庆邮电大学 | Indoor three-dimensional reconstruction method based on panoramic image |
CN113096185A (en) * | 2021-03-29 | 2021-07-09 | Oppo广东移动通信有限公司 | Visual positioning method, visual positioning device, storage medium and electronic equipment |
WO2022206255A1 (en) * | 2021-03-29 | 2022-10-06 | Oppo广东移动通信有限公司 | Visual positioning method, visual positioning apparatus, storage medium and electronic device |
CN114463406A (en) * | 2022-01-25 | 2022-05-10 | 北京工业大学 | Camera rotation estimation method based on indoor environment under Manhattan hypothesis |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107292956A (en) | A kind of scene reconstruction method assumed based on Manhattan | |
CN113012293B (en) | Stone carving model construction method, device, equipment and storage medium | |
CN104463108B (en) | A kind of monocular real time target recognitio and pose measuring method | |
CN105631861B (en) | Restore the method for 3 D human body posture from unmarked monocular image in conjunction with height map | |
CN104331924B (en) | Three-dimensional rebuilding method based on single camera SFS algorithms | |
CN102999942B (en) | Three-dimensional face reconstruction method | |
CN103810685B (en) | A kind of super-resolution processing method of depth map | |
CN104835144B (en) | The method for solving camera intrinsic parameter using the picture and orthogonality of the centre of sphere of a ball | |
CN112001926B (en) | RGBD multi-camera calibration method, system and application based on multi-dimensional semantic mapping | |
CN106023303B (en) | A method of Three-dimensional Gravity is improved based on profile validity and is laid foundations the dense degree of cloud | |
CN102129708A (en) | Fast multilevel imagination and reality occlusion method at actuality enhancement environment | |
CN103607584B (en) | Real-time registration method for depth maps shot by kinect and video shot by color camera | |
CN104992441A (en) | Real human body three-dimensional modeling method specific to personalized virtual fitting | |
CN108401461A (en) | Three-dimensional mapping method, device and system, cloud platform, electronic equipment and computer program product | |
CN110910431B (en) | Multi-view three-dimensional point set recovery method based on monocular camera | |
CN110148217A (en) | A kind of real-time three-dimensional method for reconstructing, device and equipment | |
CN103400409A (en) | 3D (three-dimensional) visualization method for coverage range based on quick estimation of attitude of camera | |
CN103489214A (en) | Virtual reality occlusion handling method, based on virtual model pretreatment, in augmented reality system | |
CN105513063B (en) | Veronese maps the method that Throwing thing catadioptric video cameras are determined with chessboard case marker | |
CN102902355A (en) | Space interaction method of mobile equipment | |
CN107977996A (en) | Space target positioning method based on target calibrating and positioning model | |
CN105913488B (en) | A kind of three-dimensional point cloud fast reconstructing method based on three-dimensional mapping table | |
CN109920000B (en) | Multi-camera cooperation-based dead-corner-free augmented reality method | |
CN109389634A (en) | Virtual shopping system based on three-dimensional reconstruction and augmented reality | |
CN110580720A (en) | camera pose estimation method based on panorama |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
CB03 | Change of inventor or designer information | Inventor after: Yan Chenggang; Zhu Zunjie; Xu Feng; Ning Ruixin. Inventor before: Zhu Zunjie; Yan Chenggang; Xu Feng; Ning Ruixin |
GR01 | Patent grant | |