CN102222348B - Method for calculating three-dimensional object motion vector - Google Patents

Method for calculating three-dimensional object motion vector

Info

Publication number
CN102222348B
Authority
CN
China
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201110176736.0A
Other languages
Chinese (zh)
Other versions
CN102222348A (en)
Inventor
袁杰
顾人舒
石磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University
Original Assignee
Nanjing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing University
Priority to CN201110176736.0A
Publication of CN102222348A
Application granted
Publication of CN102222348B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a method for calculating a three-dimensional object motion vector. The method comprises the following steps: corner detection and matching, in which objects in a video stream are identified, the corners of the objects are marked in a base frame and a key frame, and the corners are matched with each other; calculation of the three-dimensional motion of the key frame relative to the base frame from the matched corners of the base frame and the key frame; and calculation of a three-dimensional object model of the key frame from a base three-dimensional model and the three-dimensional motion, back-projection of the three-dimensional object model onto the two-dimensional plane, rejection of erroneous points, and calculation of the final three-dimensional object motion vector.

Description

Method for calculating a three-dimensional object motion vector
Technical field
The present invention relates to the field of multi-angle dynamic imaging, and in particular to a method for calculating a three-dimensional object motion vector in free-viewpoint stereoscopic video for the dynamic display of three-dimensional objects.
Background technology
With the development of the technology, display terminals capable of showing stereoscopic effects have appeared in the research field. Among them, free-viewpoint stereoscopic display is an active, truly dynamic display: it does not rely on parallax to form the stereoscopic effect, and the observer can freely choose the viewing angle and distance. Calculating the three-dimensional motion vector of the target is a difficult point in free-viewpoint stereoscopic display, and it is also the key to moving stereoscopic vision from static to dynamic. By calculating the three-dimensional motion vector of the target, the motion of a three-dimensional object can be tracked, simulated, and displayed in video. It is therefore of great importance to the field of video signal processing and is an indispensable, crucial step in stereoscopic video display.
Summary of the invention
Object of the invention: the technical problem to be solved by the invention is to provide, for a stereoscopic video display system, a linear method for calculating the three-dimensional motion vector of a target.
Technical solution: the invention discloses the calculation of the three-dimensional motion vector in stereoscopic video display, comprising the following steps:
Step (1), corner detection and matching: identify the target in the video stream, mark the corner points of the target in the base frame and the key frame, and match them;
Step (2), from the matched corner points of the base frame and the key frame, make an initial calculation of the three-dimensional motion of the key frame relative to the base frame;
Step (3), calculate the three-dimensional target model of the key frame from the motion and the base three-dimensional model, back-project it onto the two-dimensional plane to reject erroneous points, and calculate the final three-dimensional motion vector.
In the present invention, step (1) comprises the following steps:
Step (11), identify the target in the key frame and the base frame;
Step (12), mark the corner points of the target in the base frame and the key frame;
Step (13), match the key frame with the base frame: find the corresponding corner points of the key frame and the base frame and their two-dimensional coordinates.
In the present invention, step (2) comprises the following steps:
Based on camera 1 and camera 2:
Using the approximation method, the following system of linear equations is obtained:

$$\begin{bmatrix} x_1' - p_{14} - X_1 p_{11} - Y_1 p_{12} - Z_1 p_{13} \\ y_1' - p_{24} - X_1 p_{21} - Y_1 p_{22} - Z_1 p_{23} \\ \vdots \\ x_n' - pp_{14} - X_n pp_{11} - Y_n pp_{12} - Z_n pp_{13} \\ y_n' - pp_{24} - X_n pp_{21} - Y_n pp_{22} - Z_n pp_{23} \end{bmatrix} = \begin{bmatrix} Y_1 p_{13} - Z_1 p_{12} & Z_1 p_{11} - X_1 p_{13} & X_1 p_{12} - Y_1 p_{11} & p_{11} & p_{12} & p_{13} \\ Y_1 p_{23} - Z_1 p_{22} & Z_1 p_{21} - X_1 p_{23} & X_1 p_{22} - Y_1 p_{21} & p_{21} & p_{22} & p_{23} \\ \vdots & & & & & \\ Y_n pp_{13} - Z_n pp_{12} & Z_n pp_{11} - X_n pp_{13} & X_n pp_{12} - Y_n pp_{11} & pp_{11} & pp_{12} & pp_{13} \\ Y_n pp_{23} - Z_n pp_{22} & Z_n pp_{21} - X_n pp_{23} & X_n pp_{22} - Y_n pp_{21} & pp_{21} & pp_{22} & pp_{23} \end{bmatrix} \begin{bmatrix} \varphi_1 \\ \varphi_2 \\ \varphi_3 \\ t_1 \\ t_2 \\ t_3 \end{bmatrix}$$

where φ1, φ2, φ3 are the simplified three-dimensional motion parameters and t1, t2, t3 are the translation parameters; the three-dimensional motion vector M is formed as:

$$M = \begin{bmatrix} 1 & -\varphi_3 & \varphi_2 & t_1 \\ \varphi_3 & 1 & -\varphi_1 & t_2 \\ -\varphi_2 & \varphi_1 & 1 & t_3 \\ 0 & 0 & 0 & 1 \end{bmatrix};$$
Or, without the approximation, the following system of linear equations is obtained:

$$\begin{bmatrix} x_1' - p_{14} \\ y_1' - p_{24} \\ \vdots \\ x_n' - pp_{14} \\ y_n' - pp_{24} \end{bmatrix} = \begin{bmatrix} X_1 p_{11} & X_1 p_{12} & X_1 p_{13} & Y_1 p_{11} & Y_1 p_{12} & Y_1 p_{13} & Z_1 p_{11} & Z_1 p_{12} & Z_1 p_{13} & p_{11} & p_{12} & p_{13} \\ X_1 p_{21} & X_1 p_{22} & X_1 p_{23} & Y_1 p_{21} & Y_1 p_{22} & Y_1 p_{23} & Z_1 p_{21} & Z_1 p_{22} & Z_1 p_{23} & p_{21} & p_{22} & p_{23} \\ \vdots & & & & & & & & & & & \\ X_n pp_{11} & X_n pp_{12} & X_n pp_{13} & Y_n pp_{11} & Y_n pp_{12} & Y_n pp_{13} & Z_n pp_{11} & Z_n pp_{12} & Z_n pp_{13} & pp_{11} & pp_{12} & pp_{13} \\ X_n pp_{21} & X_n pp_{22} & X_n pp_{23} & Y_n pp_{21} & Y_n pp_{22} & Y_n pp_{23} & Z_n pp_{21} & Z_n pp_{22} & Z_n pp_{23} & pp_{21} & pp_{22} & pp_{23} \end{bmatrix} \begin{bmatrix} r_{11} \\ r_{21} \\ r_{31} \\ r_{12} \\ r_{22} \\ r_{32} \\ r_{13} \\ r_{23} \\ r_{33} \\ t_1 \\ t_2 \\ t_3 \end{bmatrix}$$

where r11, r21, r31, r12, r22, r32, r13, r23, r33 are the rotation matrix parameters; the three-dimensional motion vector M is formed as:

$$M = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \\ 0 & 0 & 0 & 1 \end{bmatrix};$$
where (Xi, Yi, Zi) denotes the three-dimensional coordinates of any point in the base frame of camera 1 or camera 2; (Xi', Yi', Zi') denotes the three-dimensional coordinates of that point in the key frame of camera 1 or camera 2; (xi, yi) denotes the two-dimensional coordinates of its projection onto the two-dimensional plane in the base frame of camera 1 or camera 2; and (xi', yi') denotes the two-dimensional coordinates of its projection onto the two-dimensional plane in the key frame of camera 1 or camera 2;
$$P = \begin{bmatrix} p_{11} & p_{12} & p_{13} & p_{14} \\ p_{21} & p_{22} & p_{23} & p_{24} \\ 0 & 0 & 0 & 1 \end{bmatrix}$$ is the projection matrix of camera 1;

$$PP = \begin{bmatrix} pp_{11} & pp_{12} & pp_{13} & pp_{14} \\ pp_{21} & pp_{22} & pp_{23} & pp_{24} \\ 0 & 0 & 0 & 1 \end{bmatrix}$$ is the projection matrix of camera 2;
Solve the system of linear equations in the least-mean-square-error sense to obtain the motion matrix parameters, i.e. the motion vector.
In the present invention, step (3) comprises the following steps:
For each of the two cameras, apply the motion matrix parameters to the base three-dimensional model to obtain the three-dimensional model of that camera's key frame;
For each of the two cameras, back-project the three-dimensional model of the key frame onto the two-dimensional plane through that camera's matrix, compare with the two-dimensional corner coordinates of that camera's key frame, reject the points whose error exceeds a threshold, and recompute the motion matrix parameters with step (2) to obtain the final three-dimensional motion vector.
Beneficial effects: the present invention quickly computes the relative motion of the key frame and obtains the key-frame three-dimensional model from the static three-dimensional model of the base frame, in order to obtain motion video of the same scene from a free viewpoint. The method requires no camera calibration, runs in real time, is conducive to smooth dynamic video display, and has a certain robustness. In a free-viewpoint display system, the time-consuming three-dimensional reconstruction can be performed only at long intervals, while the motion is computed in real time with the present method to adjust the pose of the three-dimensional object, giving better display real-time performance without sacrificing accuracy.
Description of drawings
The present invention is further described below with reference to the drawings and specific embodiments; the above and/or other advantages of the invention will become apparent.
Fig. 1 illustrates the base frame and key frames of the video sequences of the present invention.
Fig. 2 is a projection diagram of the coordinate systems involved in the present invention.
Fig. 3 shows the results of the comparison experiment between the present invention and existing research.
Embodiment:
On the basis of a static three-dimensional reconstruction, the method of the invention uses two cameras (extendable to multiple cameras) and calculates the three-dimensional motion vector of the target from two synchronously captured motion video views of the same scene, in order to realize dynamic stereoscopic video display.
When implementing the invention, the condition that must be satisfied is:
Synchronized video streams: the capture of the video streams is started so that the two-dimensional scenes shot by the cameras at different viewpoints are synchronized; in this embodiment, image frames captured by the two cameras at the same moment are used, as in Fig. 1.
When implementing the invention, the known information is:
Base frame and base three-dimensional model: the base three-dimensional model is built around the base frame, and the base frame comprises the base frames of the two cameras.
Projection matrices of the two cameras: the camera projection matrices are obtained during reconstruction of the base three-dimensional model.
In the present invention, on the basis of a completed static three-dimensional reconstruction, the three-dimensional motion vector of the target is calculated from the two-dimensional video pictures provided by the two cameras. As shown in Fig. 1, key frames may be chosen from the image sequence at equal intervals; this embodiment picks one such frame and calculates its motion vector relative to the base frame, and the other frames are handled in the same way.
The implementation process is as follows:
Step one, corner detection and matching: identify the target in each of the two cameras' video streams, and mark the corner points on the target in each camera's base frame and key frame.
Reference may be made to the applicant's invention patent "A method for displaying stereoscopic video with free visual angles", application number 200910234584.8, filed on November 23, 2009. In brief, the steps are:
1.1 Make a first estimate of the corner points in the base frame with the Harris algorithm;
1.2 Further screen the estimated corner points with the SUSAN algorithm to obtain the final corner points;
1.3 Match the key frame with the base frame: with a template-window matching method, find the two-dimensional coordinates (x1, y1) of a corner point in the base frame and the two-dimensional coordinates (x1', y1') of the corresponding corner point in the key frame.
Using the above method, the key frame is matched to the base frame separately for each of the two cameras.
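As an illustration of step one, the following is a minimal sketch of Harris corner detection followed by template-window matching, written with OpenCV. The window size, search radius and correlation threshold are assumed values chosen for illustration, the SUSAN screening of step 1.2 is omitted (OpenCV has no built-in SUSAN detector), and the function names are not taken from the patent.

```python
import cv2
import numpy as np

def detect_corners(gray, max_corners=200, quality=0.01, min_dist=10):
    """Harris-based first-pass corner estimate (step 1.1).
    The SUSAN screening of step 1.2 is not included here."""
    pts = cv2.goodFeaturesToTrack(gray, maxCorners=max_corners,
                                  qualityLevel=quality, minDistance=min_dist,
                                  useHarrisDetector=True, k=0.04)
    return pts.reshape(-1, 2) if pts is not None else np.empty((0, 2))

def match_corners(base_gray, key_gray, base_pts, win=15, search=40, min_score=0.8):
    """Template-window matching (step 1.3): for each base-frame corner,
    search a neighbourhood of the key frame for the best-matching patch."""
    h, w = base_gray.shape
    matches = []
    for (x, y) in base_pts.astype(int):
        if x < win or y < win or x + win >= w or y + win >= h:
            continue
        tmpl = base_gray[y - win:y + win + 1, x - win:x + win + 1]
        x0, x1 = max(0, x - search), min(w, x + search)
        y0, y1 = max(0, y - search), min(h, y + search)
        region = key_gray[y0:y1, x0:x1]
        if region.shape[0] <= 2 * win or region.shape[1] <= 2 * win:
            continue
        res = cv2.matchTemplate(region, tmpl, cv2.TM_CCOEFF_NORMED)
        _, score, _, loc = cv2.minMaxLoc(res)
        if score >= min_score:
            # centre of the best match, back in full-image coordinates
            matches.append(((x, y), (x0 + loc[0] + win, y0 + loc[1] + win)))
    return matches   # list of ((x, y) in base frame, (x', y') in key frame)
```

Run separately on the image pair of each camera, this yields the matched corner coordinates (x1, y1) and (x1', y1') used in step two.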
Step two, calculate the three-dimensional motion of the key frame relative to the base frame.
The present invention adopts the affine camera model under homogeneous coordinates, shown in Fig. 2, which is appropriate for a real camera when the depth variation of the target is negligible relative to its depth.
Consider camera 1 first. For any point, let its three-dimensional coordinates in the base frame and the key frame be (X1, Y1, Z1) and (X1', Y1', Z1') respectively, and its two-dimensional coordinates be (x1, y1) and (x1', y1') respectively.
The mapping from three-dimensional to two-dimensional coordinates is written as:

$$\begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} = P \begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \\ 1 \end{bmatrix} \quad (1)$$

where

$$P = \begin{bmatrix} p_{11} & p_{12} & p_{13} & p_{14} \\ p_{21} & p_{22} & p_{23} & p_{24} \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

is the camera projection matrix.
The motion of the key frame relative to the base frame can then be expressed as:

$$\begin{bmatrix} X_1' \\ Y_1' \\ Z_1' \\ 1 \end{bmatrix} = M \begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \\ 1 \end{bmatrix} \quad (2)$$

where the motion matrix M is:

$$M = \begin{bmatrix} R & T \\ O & 1 \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (3)$$
where R is the rotation matrix, T is the translation matrix, and all elements of the 1 × 3 matrix O are zero. Although the rotation matrix R has 9 elements, it is an orthogonal matrix and is subject to 6 independent constraints:

$$r_{i1}^2 + r_{i2}^2 + r_{i3}^2 = 1 \ (i = 1, 2, 3), \qquad r_{i1} r_{j1} + r_{i2} r_{j2} + r_{i3} r_{j3} = 0 \ (i \neq j)$$

so only 3 parameters are truly independent. The rotation matrix can be represented by the Roll-Pitch-Yaw method (rotations about the x, y and z axes of a right-handed Cartesian coordinate system), θ denoting the rotation angle:
$$R = \begin{bmatrix} \cos\theta_y \cos\theta_z & -\cos\theta_y \sin\theta_z & \sin\theta_y \\ \sin\theta_x \sin\theta_y \cos\theta_z + \cos\theta_x \sin\theta_z & -\sin\theta_x \sin\theta_y \sin\theta_z + \cos\theta_x \cos\theta_z & -\sin\theta_x \cos\theta_y \\ -\cos\theta_x \sin\theta_y \cos\theta_z + \sin\theta_x \sin\theta_z & \cos\theta_x \sin\theta_y \sin\theta_z + \sin\theta_x \cos\theta_z & \cos\theta_x \cos\theta_y \end{bmatrix} \quad (4)$$
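As a numerical aside, the sketch below reproduces equation (4) by composing the three elementary rotations; the composition order Rx·Ry·Rz is inferred from the entries of (4), and the comparison with the small-angle form of equation (7) further down is an illustration, not part of the patent.

```python
import numpy as np

def rotation_rpy(theta_x, theta_y, theta_z):
    """Rotation matrix of equation (4); the order Rx @ Ry @ Rz is
    inferred from the entries shown in the patent text."""
    cx, sx = np.cos(theta_x), np.sin(theta_x)
    cy, sy = np.cos(theta_y), np.sin(theta_y)
    cz, sz = np.cos(theta_z), np.sin(theta_z)
    Rx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
    Ry = np.array([[cy, 0, sy], [0, 1, 0], [-sy, 0, cy]])
    Rz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
    return Rx @ Ry @ Rz

R = rotation_rpy(0.02, -0.01, 0.03)
# R is orthogonal, i.e. it satisfies the 6 constraints mentioned above
assert np.allclose(R @ R.T, np.eye(3))
# for small angles R is close to the simplified form of equation (7)
R_small = np.array([[1.0, -0.03, -0.01],
                    [0.03,  1.0, -0.02],
                    [0.01,  0.02,  1.0]])
print(np.abs(R - R_small).max())   # on the order of the squared angles
```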
The motion is solved in 2 steps:
2.1 Step 1: reduce the solution of the motion to solving the system of equations b = Ax
From the three relations

$$\begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} = P \begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \\ 1 \end{bmatrix}, \qquad \begin{bmatrix} x_1' \\ y_1' \\ 1 \end{bmatrix} = P \begin{bmatrix} X_1' \\ Y_1' \\ Z_1' \\ 1 \end{bmatrix}, \qquad \begin{bmatrix} X_1' \\ Y_1' \\ Z_1' \\ 1 \end{bmatrix} = M \begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \\ 1 \end{bmatrix}$$

it follows that:

$$\begin{bmatrix} x_1' \\ y_1' \\ 1 \end{bmatrix} = PM \begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \\ 1 \end{bmatrix} \quad (5)$$
Each point provides two equations, as follows:

$$\begin{cases} x_1' = p_{14} + p_{11} t_1 + p_{12} t_2 + p_{13} t_3 + X_1(p_{11} r_{11} + p_{12} r_{21} + p_{13} r_{31}) + Y_1(p_{11} r_{12} + p_{12} r_{22} + p_{13} r_{32}) + Z_1(p_{11} r_{13} + p_{12} r_{23} + p_{13} r_{33}) \\ y_1' = p_{24} + p_{21} t_1 + p_{22} t_2 + p_{23} t_3 + X_1(p_{21} r_{11} + p_{22} r_{21} + p_{23} r_{31}) + Y_1(p_{21} r_{12} + p_{22} r_{22} + p_{23} r_{32}) + Z_1(p_{21} r_{13} + p_{22} r_{23} + p_{23} r_{33}) \end{cases} \quad (6)$$
Case 1: if the small-angle approximation is made (when the key frame is close to the base frame, the motion is small, so the small-angle approximation holds) and the Roll-Pitch-Yaw representation is adopted, the rotation matrix of equation (6) is reduced to the following, where φ1, φ2, φ3 are the simplified three-dimensional motion parameters:

$$R = \begin{bmatrix} 1 & -n_3\theta & n_2\theta \\ n_3\theta & 1 & -n_1\theta \\ -n_2\theta & n_1\theta & 1 \end{bmatrix} = \begin{bmatrix} 1 & -\varphi_3 & \varphi_2 \\ \varphi_3 & 1 & -\varphi_1 \\ -\varphi_2 & \varphi_1 & 1 \end{bmatrix} \quad (7)$$

where $(n_1\theta)^2 + (n_2\theta)^2 + (n_3\theta)^2 = \theta^2$, so $\varphi_1^2 + \varphi_2^2 + \varphi_3^2 = \theta^2$ (8).
System of equations (6) is rewritten as:
$$\begin{bmatrix} x_1' - p_{14} - X_1 p_{11} - Y_1 p_{12} - Z_1 p_{13} \\ y_1' - p_{24} - X_1 p_{21} - Y_1 p_{22} - Z_1 p_{23} \end{bmatrix} = \begin{bmatrix} Y_1 p_{13} - Z_1 p_{12} & Z_1 p_{11} - X_1 p_{13} & X_1 p_{12} - Y_1 p_{11} & p_{11} & p_{12} & p_{13} \\ Y_1 p_{23} - Z_1 p_{22} & Z_1 p_{21} - X_1 p_{23} & X_1 p_{22} - Y_1 p_{21} & p_{21} & p_{22} & p_{23} \end{bmatrix} \begin{bmatrix} \varphi_1 \\ \varphi_2 \\ \varphi_3 \\ t_1 \\ t_2 \\ t_3 \end{bmatrix} \quad (9)$$
If only multiple points from a single camera are used, the following system of equations can be set up:
$$\begin{bmatrix} x_1' - p_{14} - X_1 p_{11} - Y_1 p_{12} - Z_1 p_{13} \\ y_1' - p_{24} - X_1 p_{21} - Y_1 p_{22} - Z_1 p_{23} \\ \vdots \\ x_n' - p_{14} - X_n p_{11} - Y_n p_{12} - Z_n p_{13} \\ y_n' - p_{24} - X_n p_{21} - Y_n p_{22} - Z_n p_{23} \end{bmatrix} = \begin{bmatrix} Y_1 p_{13} - Z_1 p_{12} & Z_1 p_{11} - X_1 p_{13} & X_1 p_{12} - Y_1 p_{11} & p_{11} & p_{12} & p_{13} \\ Y_1 p_{23} - Z_1 p_{22} & Z_1 p_{21} - X_1 p_{23} & X_1 p_{22} - Y_1 p_{21} & p_{21} & p_{22} & p_{23} \\ \vdots & & & & & \\ Y_n p_{13} - Z_n p_{12} & Z_n p_{11} - X_n p_{13} & X_n p_{12} - Y_n p_{11} & p_{11} & p_{12} & p_{13} \\ Y_n p_{23} - Z_n p_{22} & Z_n p_{21} - X_n p_{23} & X_n p_{22} - Y_n p_{21} & p_{21} & p_{22} & p_{23} \end{bmatrix} \begin{bmatrix} \varphi_1 \\ \varphi_2 \\ \varphi_3 \\ t_1 \\ t_2 \\ t_3 \end{bmatrix} \quad (10)$$
In the above formula, the first matrix on the right-hand side of the equation is denoted A; its rank is at most 5. This is because a single camera cannot provide depth information along its own optical axis. The proof is as follows:
i. Observe that in matrix A the last three columns repeat every other row: they are identical in all odd rows and identical in all even rows. Perform row operations on this matrix, using the first row to cancel the last three columns of rows 2i-1 and the second row to cancel the last three columns of rows 2i (i = 2, 3, ..., n):
$$\begin{bmatrix} Y_1 p_{13} - Z_1 p_{12} & Z_1 p_{11} - X_1 p_{13} & X_1 p_{12} - Y_1 p_{11} & p_{11} & p_{12} & p_{13} \\ Y_1 p_{23} - Z_1 p_{22} & Z_1 p_{21} - X_1 p_{23} & X_1 p_{22} - Y_1 p_{21} & p_{21} & p_{22} & p_{23} \\ \vdots & & & 0 & 0 & 0 \\ Y_n p_{13} - Z_n p_{12} & Z_n p_{11} - X_n p_{13} & X_n p_{12} - Y_n p_{11} & 0 & 0 & 0 \\ Y_n p_{23} - Z_n p_{22} & Z_n p_{21} - X_n p_{23} & X_n p_{22} - Y_n p_{21} & 0 & 0 & 0 \end{bmatrix} \quad (11)$$
ii. In the resulting matrix, the last three columns have rank at most 2, because only two of their rows are non-zero, and the first three columns have rank at most 3, so the rank of this matrix is at most 5. Elementary row operations do not change the rank of a matrix, so the rank of A is at most 5.
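This rank bound is easy to check numerically. The sketch below builds the single-camera coefficient matrix of equation (10) from random 3-D points and a random affine projection matrix of the stated form; the point count and random data are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
p = rng.normal(size=(2, 4))         # rows (p11..p14) and (p21..p24) of one camera
pts = rng.normal(size=(10, 3))      # ten arbitrary 3-D points (X, Y, Z)

rows = []
for X, Y, Z in pts:
    for r in range(2):              # the two equations contributed by each point
        p1, p2, p3 = p[r, 0], p[r, 1], p[r, 2]
        rows.append([Y * p3 - Z * p2, Z * p1 - X * p3, X * p2 - Y * p1,
                     p1, p2, p3])
A = np.array(rows)                  # 20 x 6 single-camera matrix of equation (10)
print(np.linalg.matrix_rank(A))     # 5 for generic data, never the full rank 6
```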
From the above proof, solving for the rigid-body three-dimensional motion requires at least two cameras. The projection matrix of the second camera is:

$$PP = \begin{bmatrix} pp_{11} & pp_{12} & pp_{13} & pp_{14} \\ pp_{21} & pp_{22} & pp_{23} & pp_{24} \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
So to solve for the motion, the system of equations to be set up for the multiple points of the two cameras is:
$$\begin{bmatrix} x_1' - p_{14} - X_1 p_{11} - Y_1 p_{12} - Z_1 p_{13} \\ y_1' - p_{24} - X_1 p_{21} - Y_1 p_{22} - Z_1 p_{23} \\ \vdots \\ x_n' - pp_{14} - X_n pp_{11} - Y_n pp_{12} - Z_n pp_{13} \\ y_n' - pp_{24} - X_n pp_{21} - Y_n pp_{22} - Z_n pp_{23} \end{bmatrix} = \begin{bmatrix} Y_1 p_{13} - Z_1 p_{12} & Z_1 p_{11} - X_1 p_{13} & X_1 p_{12} - Y_1 p_{11} & p_{11} & p_{12} & p_{13} \\ Y_1 p_{23} - Z_1 p_{22} & Z_1 p_{21} - X_1 p_{23} & X_1 p_{22} - Y_1 p_{21} & p_{21} & p_{22} & p_{23} \\ \vdots & & & & & \\ Y_n pp_{13} - Z_n pp_{12} & Z_n pp_{11} - X_n pp_{13} & X_n pp_{12} - Y_n pp_{11} & pp_{11} & pp_{12} & pp_{13} \\ Y_n pp_{23} - Z_n pp_{22} & Z_n pp_{21} - X_n pp_{23} & X_n pp_{22} - Y_n pp_{21} & pp_{21} & pp_{22} & pp_{23} \end{bmatrix} \begin{bmatrix} \varphi_1 \\ \varphi_2 \\ \varphi_3 \\ t_1 \\ t_2 \\ t_3 \end{bmatrix} \quad (12)$$
Denote the matrix on the left-hand side of the equation b, and the matrices on the right-hand side A and x in turn. The rank of A is now at most 6, and solving for the motion is converted into solving the system of equations b = Ax.
The n points comprise m points from camera 1 and k points from camera 2, n = m + k (m, k ≥ 1). In theory, when n = 3, i.e. using three pairs of points from the two cameras, the system can be solved directly; when n > 3, an overdetermined system must be solved to obtain the result.
To extend to multiple cameras, one only needs to append new rows at the bottom of the matrices, adding the equations of the additional cameras; the form of the equations is unchanged.
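A minimal sketch of assembling b and A of equation (12); the input layout (a list with one (projection matrix, base-frame 3-D points, key-frame 2-D corners) tuple per camera) and the helper names are assumptions made for illustration, not part of the patent. Adding a third camera is just one more tuple in the list.

```python
import numpy as np

def point_rows(P, XYZ, xy_key):
    """The two rows of A and b contributed by one matched point (equation (9))."""
    X, Y, Z = XYZ
    xk, yk = xy_key
    A_rows, b_rows = [], []
    for r in range(2):                       # r = 0 -> x equation, r = 1 -> y equation
        p1, p2, p3, p4 = P[r]
        A_rows.append([Y * p3 - Z * p2, Z * p1 - X * p3, X * p2 - Y * p1,
                       p1, p2, p3])
        b_rows.append((xk if r == 0 else yk) - p4 - X * p1 - Y * p2 - Z * p3)
    return A_rows, b_rows

def build_system(cameras):
    """cameras: list of (P, base_points_3d, key_points_2d), one tuple per camera.
    Stacks equation (12); rows of additional cameras are simply appended."""
    A, b = [], []
    for P, pts3d, pts2d in cameras:
        P = np.asarray(P, dtype=float)
        for XYZ, xy in zip(pts3d, pts2d):
            Ar, br = point_rows(P, XYZ, xy)
            A.extend(Ar)
            b.extend(br)
    return np.array(A), np.array(b)
```

Solving A x ≈ b (step 2.2 below) then gives x = (φ1, φ2, φ3, t1, t2, t3).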
Case 2: if the small-angle approximation is not made, two cameras are likewise needed, and the system of equations is:
$$\begin{bmatrix} x_1' - p_{14} \\ y_1' - p_{24} \\ \vdots \\ x_n' - pp_{14} \\ y_n' - pp_{24} \end{bmatrix} = \begin{bmatrix} X_1 p_{11} & X_1 p_{12} & X_1 p_{13} & Y_1 p_{11} & Y_1 p_{12} & Y_1 p_{13} & Z_1 p_{11} & Z_1 p_{12} & Z_1 p_{13} & p_{11} & p_{12} & p_{13} \\ X_1 p_{21} & X_1 p_{22} & X_1 p_{23} & Y_1 p_{21} & Y_1 p_{22} & Y_1 p_{23} & Z_1 p_{21} & Z_1 p_{22} & Z_1 p_{23} & p_{21} & p_{22} & p_{23} \\ \vdots & & & & & & & & & & & \\ X_n pp_{11} & X_n pp_{12} & X_n pp_{13} & Y_n pp_{11} & Y_n pp_{12} & Y_n pp_{13} & Z_n pp_{11} & Z_n pp_{12} & Z_n pp_{13} & pp_{11} & pp_{12} & pp_{13} \\ X_n pp_{21} & X_n pp_{22} & X_n pp_{23} & Y_n pp_{21} & Y_n pp_{22} & Y_n pp_{23} & Z_n pp_{21} & Z_n pp_{22} & Z_n pp_{23} & pp_{21} & pp_{22} & pp_{23} \end{bmatrix} \begin{bmatrix} r_{11} \\ r_{21} \\ r_{31} \\ r_{12} \\ r_{22} \\ r_{32} \\ r_{13} \\ r_{23} \\ r_{33} \\ t_1 \\ t_2 \\ t_3 \end{bmatrix} \quad (13)$$
For the non-approximated algorithm, to guarantee that matrix A has full rank, m ≥ 4 and k ≥ 4 must be satisfied, so at least 4 + 4 = 8 feature points are needed in total.
Likewise, this system of equations can be extended to multiple cameras.
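For the non-approximated case, only the per-point row construction changes; the sketch below follows the unknown ordering (r11, r21, r31, r12, r22, r32, r13, r23, r33, t1, t2, t3) of equation (13). Dropping it into the build_system helper of the previous sketch in place of point_rows is an assumption made purely for illustration.

```python
def point_rows_full(P, XYZ, xy_key):
    """The two rows of the non-approximated system (13) for one matched point.
    Unknown vector: (r11, r21, r31, r12, r22, r32, r13, r23, r33, t1, t2, t3)."""
    X, Y, Z = XYZ
    rows, rhs = [], []
    for r, observed in ((0, xy_key[0]), (1, xy_key[1])):
        p1, p2, p3, p4 = P[r]
        rows.append([X * p1, X * p2, X * p3,
                     Y * p1, Y * p2, Y * p3,
                     Z * p1, Z * p2, Z * p3,
                     p1, p2, p3])
        rhs.append(observed - p4)
    return rows, rhs
```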
2.2 Step 2: solve the system of linear equations b = Ax in the least-mean-square-error sense
The QR decomposition (orthogonal matrix times upper triangular matrix) method is adopted; the solution is as follows:
Because of the presence of error, Ax = b + ε, and the problem is to find the x that minimizes the norm $\|\varepsilon\|_2^2$.
An orthogonal matrix Q can be found such that

$$QA = \begin{bmatrix} R \\ O \end{bmatrix}$$

where Q is an orthogonal matrix and R is a nonsingular upper triangular matrix.
$$\|\varepsilon\|_2^2 = \|Ax - b\|_2^2 = \|Q(Ax - b)\|_2^2 = \|QAx - Qb\|_2^2 = \left\| \begin{bmatrix} R \\ O \end{bmatrix} x - Qb \right\|_2^2$$

Write $Qb = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$; since these are column vectors,

$$\|Ax - b\|_2^2 = \left\| \begin{bmatrix} Rx \\ O \end{bmatrix} - \begin{bmatrix} b_1 \\ b_2 \end{bmatrix} \right\|_2^2 = \|Rx - b_1\|_2^2 + \|b_2\|_2^2$$

Note that $\|b_2\|_2^2$ is a constant, so the original problem of minimizing $\|Ax - b\|_2^2$ is converted into minimizing $\|Rx - b_1\|_2^2$, that is, $Rx - b_1 = 0$.
Hence:

$$x = R^{-1} b_1 \quad (14)$$

and the vector x is obtained.
For the approximated system of equations (12), x = (φ1, φ2, φ3, t1, t2, t3)^T;
for the non-approximated system of equations (13), x = (r11, r21, r31, r12, r22, r32, r13, r23, r33, t1, t2, t3)^T.
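The QR solve and the assembly of M under the small-angle form (7) can be sketched as follows; np.linalg.lstsq would give the same least-squares solution, and the function name and the assumption that A has full column rank are illustrative.

```python
import numpy as np

def solve_motion_small_angle(A, b):
    """Least-squares solution of b = Ax via QR (equation (14)), then assembly
    of the motion matrix M from (phi1, phi2, phi3, t1, t2, t3)."""
    Q, R = np.linalg.qr(A)            # reduced QR: A = Q R, R upper triangular
    x = np.linalg.solve(R, Q.T @ b)   # x = R^{-1} Q^T b, cf. equation (14)
    phi1, phi2, phi3, t1, t2, t3 = x
    M = np.array([[1.0, -phi3,  phi2, t1],
                  [phi3,  1.0, -phi1, t2],
                  [-phi2, phi1,  1.0, t3],
                  [0.0,   0.0,   0.0, 1.0]])
    return x, M
```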
Step three: from the three-dimensional motion vector calculated in step two, compute the three-dimensional target model of the key frame; for each of the two cameras, back-project it through that camera's matrix onto the two-dimensional plane, compare with the two-dimensional corner coordinates of that camera's key frame, reject the points whose error exceeds the threshold, and then recompute the three-dimensional motion vector by step two.
A false match causes a very large back-projection error, and this error directly affects the result, so the present invention proposes rejecting erroneous points. The back-projection error caused by a false match is generally more than 10 pixels, and sometimes 20, whereas a normal point is mostly within 2 pixels, so the threshold is set to 10.
A point is rejected as erroneous as soon as its error in the x or y direction exceeds the threshold. Step two is then repeated for the l (l ≤ n) retained points, and the three-dimensional motion vector is recomputed by formula (15) or formula (16).
$$\begin{bmatrix} x_1' - p_{14} - X_1 p_{11} - Y_1 p_{12} - Z_1 p_{13} \\ y_1' - p_{24} - X_1 p_{21} - Y_1 p_{22} - Z_1 p_{23} \\ \vdots \\ x_l' - pp_{14} - X_l pp_{11} - Y_l pp_{12} - Z_l pp_{13} \\ y_l' - pp_{24} - X_l pp_{21} - Y_l pp_{22} - Z_l pp_{23} \end{bmatrix} = \begin{bmatrix} Y_1 p_{13} - Z_1 p_{12} & Z_1 p_{11} - X_1 p_{13} & X_1 p_{12} - Y_1 p_{11} & p_{11} & p_{12} & p_{13} \\ Y_1 p_{23} - Z_1 p_{22} & Z_1 p_{21} - X_1 p_{23} & X_1 p_{22} - Y_1 p_{21} & p_{21} & p_{22} & p_{23} \\ \vdots & & & & & \\ Y_l pp_{13} - Z_l pp_{12} & Z_l pp_{11} - X_l pp_{13} & X_l pp_{12} - Y_l pp_{11} & pp_{11} & pp_{12} & pp_{13} \\ Y_l pp_{23} - Z_l pp_{22} & Z_l pp_{21} - X_l pp_{23} & X_l pp_{22} - Y_l pp_{21} & pp_{21} & pp_{22} & pp_{23} \end{bmatrix} \begin{bmatrix} \varphi_1 \\ \varphi_2 \\ \varphi_3 \\ t_1 \\ t_2 \\ t_3 \end{bmatrix} \quad (15)$$

$$\begin{bmatrix} x_1' - p_{14} \\ y_1' - p_{24} \\ \vdots \\ x_l' - pp_{14} \\ y_l' - pp_{24} \end{bmatrix} = \begin{bmatrix} X_1 p_{11} & X_1 p_{12} & X_1 p_{13} & Y_1 p_{11} & Y_1 p_{12} & Y_1 p_{13} & Z_1 p_{11} & Z_1 p_{12} & Z_1 p_{13} & p_{11} & p_{12} & p_{13} \\ X_1 p_{21} & X_1 p_{22} & X_1 p_{23} & Y_1 p_{21} & Y_1 p_{22} & Y_1 p_{23} & Z_1 p_{21} & Z_1 p_{22} & Z_1 p_{23} & p_{21} & p_{22} & p_{23} \\ \vdots & & & & & & & & & & & \\ X_l pp_{11} & X_l pp_{12} & X_l pp_{13} & Y_l pp_{11} & Y_l pp_{12} & Y_l pp_{13} & Z_l pp_{11} & Z_l pp_{12} & Z_l pp_{13} & pp_{11} & pp_{12} & pp_{13} \\ X_l pp_{21} & X_l pp_{22} & X_l pp_{23} & Y_l pp_{21} & Y_l pp_{22} & Y_l pp_{23} & Z_l pp_{21} & Z_l pp_{22} & Z_l pp_{23} & pp_{21} & pp_{22} & pp_{23} \end{bmatrix} \begin{bmatrix} r_{11} \\ r_{21} \\ r_{31} \\ r_{12} \\ r_{22} \\ r_{32} \\ r_{13} \\ r_{23} \\ r_{33} \\ t_1 \\ t_2 \\ t_3 \end{bmatrix} \quad (16)$$
The result is taken as the final three-dimensional motion vector.
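A sketch of step three, under the same assumed data layout as the earlier sketches (build_system and solve_motion_small_angle refer to those sketches, not to anything defined in the patent): back-project the moved base-frame points through each camera's projection matrix, drop matches whose x or y error exceeds the 10-pixel threshold, and re-solve.

```python
import numpy as np

THRESHOLD = 10.0   # pixels, the threshold stated in the text

def backproject(P, M, XYZ):
    """Project a moved base-frame point into the key frame: P M [X, Y, Z, 1]^T."""
    Xh = np.append(np.asarray(XYZ, dtype=float), 1.0)
    xy = np.asarray(P, dtype=float) @ (M @ Xh)
    return xy[:2]                       # (x', y') predicted in the key frame

def reject_and_resolve(cameras, M):
    """Drop matches whose x or y back-projection error exceeds THRESHOLD,
    then recompute the motion from the surviving points (step two again)."""
    kept = []
    for P, pts3d, pts2d in cameras:
        good3d, good2d = [], []
        for XYZ, xy in zip(pts3d, pts2d):
            err = np.abs(backproject(P, M, XYZ) - np.asarray(xy, dtype=float))
            if err.max() <= THRESHOLD:
                good3d.append(XYZ)
                good2d.append(xy)
        kept.append((P, good3d, good2d))
    A, b = build_system(kept)           # helpers from the step-two sketches
    return solve_motion_small_angle(A, b)
```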
A comparison experiment was carried out between the present invention and the relatively mature Han-Kanade method in existing research. The Han-Kanade method is based on three-dimensional reconstruction from uncalibrated images and can also recover motion. In the experiment, a real object was rotated by 5 degrees about the x direction; with identical control input points, the computed rotation angles are compared in Fig. 3. It can be seen that the accuracy is at the same level. In terms of time, however, the MATLAB implementation takes 5 to 10 minutes to process three frames, and a full 360-degree reconstruction of the object takes even longer; with 5 cameras it took more than half an hour. Moreover, the fewer the cameras, the higher the demand on wide-angle matching, so the number of cameras cannot simply be reduced. Therefore, in a free-viewpoint display system, the time-consuming three-dimensional reconstruction can be performed only at long intervals, while the motion is computed in real time with the present method to adjust the pose of the three-dimensional object, giving better display real-time performance without sacrificing accuracy.
The present invention provides an idea and a method for calculating three-dimensional motion vectors in stereoscopic video display. There are many specific ways and approaches to implement this technical solution; the above is only a preferred embodiment of the invention. It should be pointed out that those skilled in the art may make several improvements and modifications without departing from the principle of the invention, and such improvements and modifications should also be regarded as falling within the protection scope of the invention. All components not explicitly specified in this embodiment can be implemented with the prior art.

Claims (1)

1. A method for calculating a three-dimensional object motion vector, characterized in that it comprises the following steps:
Step (1), corner detection and matching: identify the target in each of the two cameras' video streams, mark the corner points of the target in each camera's base frame and key frame, and match the key frame to the base frame for each of the two cameras;
Step (2), from the matched corner points of the base frame and the key frame, calculate the three-dimensional motion matrix of the key frame relative to the base frame;
Step (3), calculate the three-dimensional target model of the key frame from the base three-dimensional model and said three-dimensional motion matrix, back-project the target three-dimensional model onto the two-dimensional plane, reject erroneous points, and calculate the final target motion vector;
Step (1) comprises the following steps:
identify the target in the key frame and the base frame of each of the two cameras' video streams;
mark the corner points of the target in each camera's base frame and key frame;
match the corner points of the base frame and the key frame: find the corresponding corner points of each camera's key frame and base frame, and obtain the two-dimensional coordinates of the corner points;
Step (2) comprises the following steps:
based on camera 1 and camera 2: for any point, let its three-dimensional coordinates in the base frame and the key frame be (X1, Y1, Z1) and (X1', Y1', Z1') respectively, and its two-dimensional coordinates be (x1, y1) and (x1', y1') respectively;
the mapping from three-dimensional to two-dimensional coordinates is written as:

$$\begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} = P \begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \\ 1 \end{bmatrix} \quad (1)$$

where

$$P = \begin{bmatrix} p_{11} & p_{12} & p_{13} & p_{14} \\ p_{21} & p_{22} & p_{23} & p_{24} \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

is the projection matrix of camera 1;
the motion of the key frame relative to the base frame is then expressed as:

$$\begin{bmatrix} X_1' \\ Y_1' \\ Z_1' \\ 1 \end{bmatrix} = M \begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \\ 1 \end{bmatrix} \quad (2)$$

where the three-dimensional motion matrix M is

$$M = \begin{bmatrix} R & T \\ O & 1 \end{bmatrix} = \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad (3)$$

in which R is the rotation matrix with parameters r11, r21, r31, r12, r22, r32, r13, r23, r33, T is the translation matrix with parameters t1, t2, t3, and all elements of the 1 × 3 matrix O are zero;
the rotation matrix is represented by rotations about the x, y and z axes of a right-handed Cartesian coordinate system, θ being the rotation angle:

$$R = \begin{bmatrix} \cos\theta_y \cos\theta_z & -\cos\theta_y \sin\theta_z & \sin\theta_y \\ \sin\theta_x \sin\theta_y \cos\theta_z + \cos\theta_x \sin\theta_z & -\sin\theta_x \sin\theta_y \sin\theta_z + \cos\theta_x \cos\theta_z & -\sin\theta_x \cos\theta_y \\ -\cos\theta_x \sin\theta_y \cos\theta_z + \sin\theta_x \sin\theta_z & \cos\theta_x \sin\theta_y \sin\theta_z + \sin\theta_x \cos\theta_z & \cos\theta_x \cos\theta_y \end{bmatrix} \quad (4)$$
the motion is solved in 2 steps:
solving for the motion is reduced to solving the system of equations b = Ax:
from the three relations

$$\begin{bmatrix} x_1 \\ y_1 \\ 1 \end{bmatrix} = P \begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \\ 1 \end{bmatrix}, \qquad \begin{bmatrix} x_1' \\ y_1' \\ 1 \end{bmatrix} = P \begin{bmatrix} X_1' \\ Y_1' \\ Z_1' \\ 1 \end{bmatrix}, \qquad \begin{bmatrix} X_1' \\ Y_1' \\ Z_1' \\ 1 \end{bmatrix} = M \begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \\ 1 \end{bmatrix}$$

it follows that:

$$\begin{bmatrix} x_1' \\ y_1' \\ 1 \end{bmatrix} = PM \begin{bmatrix} X_1 \\ Y_1 \\ Z_1 \\ 1 \end{bmatrix} \quad (5)$$

each point provides two equations, as follows:

$$\begin{cases} x_1' = p_{14} + p_{11} t_1 + p_{12} t_2 + p_{13} t_3 + X_1(p_{11} r_{11} + p_{12} r_{21} + p_{13} r_{31}) + Y_1(p_{11} r_{12} + p_{12} r_{22} + p_{13} r_{32}) + Z_1(p_{11} r_{13} + p_{12} r_{23} + p_{13} r_{33}) \\ y_1' = p_{24} + p_{21} t_1 + p_{22} t_2 + p_{23} t_3 + X_1(p_{21} r_{11} + p_{22} r_{21} + p_{23} r_{31}) + Y_1(p_{21} r_{12} + p_{22} r_{22} + p_{23} r_{32}) + Z_1(p_{21} r_{13} + p_{22} r_{23} + p_{23} r_{33}) \end{cases} \quad (6)$$

if the small-angle approximation is made, the rotation matrix of formula (6) is reduced to:

$$R = \begin{bmatrix} 1 & -n_3\theta & n_2\theta \\ n_3\theta & 1 & -n_1\theta \\ -n_2\theta & n_1\theta & 1 \end{bmatrix} = \begin{bmatrix} 1 & -\varphi_3 & \varphi_2 \\ \varphi_3 & 1 & -\varphi_1 \\ -\varphi_2 & \varphi_1 & 1 \end{bmatrix} \quad (7)$$

where $(n_1\theta)^2 + (n_2\theta)^2 + (n_3\theta)^2 = \theta^2$, so $\varphi_1^2 + \varphi_2^2 + \varphi_3^2 = \theta^2$ (8), φ1, φ2, φ3 being the simplified three-dimensional motion matrix parameters;
the system of equations (6) is rewritten as:

$$\begin{bmatrix} x_1' - p_{14} - X_1 p_{11} - Y_1 p_{12} - Z_1 p_{13} \\ y_1' - p_{24} - X_1 p_{21} - Y_1 p_{22} - Z_1 p_{23} \end{bmatrix} = \begin{bmatrix} Y_1 p_{13} - Z_1 p_{12} & Z_1 p_{11} - X_1 p_{13} & X_1 p_{12} - Y_1 p_{11} & p_{11} & p_{12} & p_{13} \\ Y_1 p_{23} - Z_1 p_{22} & Z_1 p_{21} - X_1 p_{23} & X_1 p_{22} - Y_1 p_{21} & p_{21} & p_{22} & p_{23} \end{bmatrix} \begin{bmatrix} \varphi_1 \\ \varphi_2 \\ \varphi_3 \\ t_1 \\ t_2 \\ t_3 \end{bmatrix} \quad (9)$$
let

$$PP = \begin{bmatrix} pp_{11} & pp_{12} & pp_{13} & pp_{14} \\ pp_{21} & pp_{22} & pp_{23} & pp_{24} \\ 0 & 0 & 0 & 1 \end{bmatrix}$$

be the projection matrix of camera 2;
using the approximation method, the system of equations set up for the multiple points of the two cameras is:

$$\begin{bmatrix} x_1' - p_{14} - X_1 p_{11} - Y_1 p_{12} - Z_1 p_{13} \\ y_1' - p_{24} - X_1 p_{21} - Y_1 p_{22} - Z_1 p_{23} \\ \vdots \\ x_n' - pp_{14} - X_n pp_{11} - Y_n pp_{12} - Z_n pp_{13} \\ y_n' - pp_{24} - X_n pp_{21} - Y_n pp_{22} - Z_n pp_{23} \end{bmatrix} = \begin{bmatrix} Y_1 p_{13} - Z_1 p_{12} & Z_1 p_{11} - X_1 p_{13} & X_1 p_{12} - Y_1 p_{11} & p_{11} & p_{12} & p_{13} \\ Y_1 p_{23} - Z_1 p_{22} & Z_1 p_{21} - X_1 p_{23} & X_1 p_{22} - Y_1 p_{21} & p_{21} & p_{22} & p_{23} \\ \vdots & & & & & \\ Y_n pp_{13} - Z_n pp_{12} & Z_n pp_{11} - X_n pp_{13} & X_n pp_{12} - Y_n pp_{11} & pp_{11} & pp_{12} & pp_{13} \\ Y_n pp_{23} - Z_n pp_{22} & Z_n pp_{21} - X_n pp_{23} & X_n pp_{22} - Y_n pp_{21} & pp_{21} & pp_{22} & pp_{23} \end{bmatrix} \begin{bmatrix} \varphi_1 \\ \varphi_2 \\ \varphi_3 \\ t_1 \\ t_2 \\ t_3 \end{bmatrix}$$

or, without the approximation, the following system of linear equations is obtained:

$$\begin{bmatrix} x_1' - p_{14} \\ y_1' - p_{24} \\ \vdots \\ x_n' - pp_{14} \\ y_n' - pp_{24} \end{bmatrix} = \begin{bmatrix} X_1 p_{11} & X_1 p_{12} & X_1 p_{13} & Y_1 p_{11} & Y_1 p_{12} & Y_1 p_{13} & Z_1 p_{11} & Z_1 p_{12} & Z_1 p_{13} & p_{11} & p_{12} & p_{13} \\ X_1 p_{21} & X_1 p_{22} & X_1 p_{23} & Y_1 p_{21} & Y_1 p_{22} & Y_1 p_{23} & Z_1 p_{21} & Z_1 p_{22} & Z_1 p_{23} & p_{21} & p_{22} & p_{23} \\ \vdots & & & & & & & & & & & \\ X_n pp_{11} & X_n pp_{12} & X_n pp_{13} & Y_n pp_{11} & Y_n pp_{12} & Y_n pp_{13} & Z_n pp_{11} & Z_n pp_{12} & Z_n pp_{13} & pp_{11} & pp_{12} & pp_{13} \\ X_n pp_{21} & X_n pp_{22} & X_n pp_{23} & Y_n pp_{21} & Y_n pp_{22} & Y_n pp_{23} & Z_n pp_{21} & Z_n pp_{22} & Z_n pp_{23} & pp_{21} & pp_{22} & pp_{23} \end{bmatrix} \begin{bmatrix} r_{11} \\ r_{21} \\ r_{31} \\ r_{12} \\ r_{22} \\ r_{32} \\ r_{13} \\ r_{23} \\ r_{33} \\ t_1 \\ t_2 \\ t_3 \end{bmatrix}$$
where (Xi, Yi, Zi) denotes the three-dimensional coordinates of any point in the base frame of camera 1 or camera 2, and (xi', yi') denotes the two-dimensional coordinates of its projection onto the two-dimensional plane in the key frame of camera 1 or camera 2;
solve the system of linear equations in the least-mean-square-error sense to obtain the three-dimensional motion matrix parameters and form the three-dimensional motion matrix;
Step (3) comprises the following steps:
for each of the two cameras, apply said three-dimensional motion matrix parameters to the base three-dimensional model to obtain the three-dimensional model of that camera's key frame;
back-project the three-dimensional model of each camera's key frame onto the two-dimensional plane through that camera's projection matrix, compare with the two-dimensional corner coordinates of that camera's key frame, reject the points whose error exceeds a threshold, and recompute the three-dimensional motion matrix parameters with step (2) to obtain the final target motion vector.
CN201110176736.0A 2011-06-28 2011-06-28 Method for calculating three-dimensional object motion vector Expired - Fee Related CN102222348B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110176736.0A CN102222348B (en) 2011-06-28 2011-06-28 Method for calculating three-dimensional object motion vector

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110176736.0A CN102222348B (en) 2011-06-28 2011-06-28 Method for calculating three-dimensional object motion vector

Publications (2)

Publication Number Publication Date
CN102222348A CN102222348A (en) 2011-10-19
CN102222348B true CN102222348B (en) 2013-04-24

Family

ID=44778892

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110176736.0A Expired - Fee Related CN102222348B (en) 2011-06-28 2011-06-28 Method for calculating three-dimensional object motion vector

Country Status (1)

Country Link
CN (1) CN102222348B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104123716B (en) * 2013-04-28 2016-08-10 腾讯科技(深圳)有限公司 The detection method of picture steadiness, device and terminal
US9317770B2 (en) 2013-04-28 2016-04-19 Tencent Technology (Shenzhen) Co., Ltd. Method, apparatus and terminal for detecting image stability
CN107230220B (en) * 2017-05-26 2020-02-21 深圳大学 Novel space-time Harris corner detection method and device
CN107808388B (en) * 2017-10-19 2021-10-12 中科创达软件股份有限公司 Image processing method and device containing moving object and electronic equipment
CN107707899B (en) * 2017-10-19 2019-05-10 中科创达软件股份有限公司 Multi-view image processing method, device and electronic equipment comprising moving target
DE102018105063A1 (en) 2018-03-06 2019-09-12 Ebm-Papst Mulfingen Gmbh & Co. Kg Apparatus and method for air volume detection
CN109063567B (en) * 2018-07-03 2021-04-13 百度在线网络技术(北京)有限公司 Human body recognition method, human body recognition device and storage medium
CN109146932B (en) * 2018-07-17 2021-08-24 北京旷视科技有限公司 Method, device and system for determining world coordinates of target point in image
CN111179271B (en) * 2019-11-22 2021-05-11 浙江众合科技股份有限公司 Object angle information labeling method based on retrieval matching and electronic equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7164800B2 (en) * 2003-02-19 2007-01-16 Eastman Kodak Company Method and system for constraint-consistent motion estimation
CN101336856B (en) * 2008-08-08 2010-06-02 西安电子科技大学 Information acquisition and transfer method of auxiliary vision system
CN101729920B (en) * 2009-11-23 2011-10-19 南京大学 Method for displaying stereoscopic video with free visual angles

Also Published As

Publication number Publication date
CN102222348A (en) 2011-10-19

Similar Documents

Publication Publication Date Title
CN102222348B (en) Method for calculating three-dimensional object motion vector
CN109166077B (en) Image alignment method and device, readable storage medium and computer equipment
EP3067861B1 (en) Determination of a coordinate conversion parameter
CN102697508B (en) Method for performing gait recognition by adopting three-dimensional reconstruction of monocular vision
CN108399643A (en) A kind of outer ginseng calibration system between laser radar and camera and method
CN102842117B (en) Method for correcting kinematic errors in microscopic vision system
CN104376552A (en) Virtual-real registering algorithm of 3D model and two-dimensional image
US20130136302A1 (en) Apparatus and method for calculating three dimensional (3d) positions of feature points
CN101729920B (en) Method for displaying stereoscopic video with free visual angles
CN104677330A (en) Small binocular stereoscopic vision ranging system
CN111415375B (en) SLAM method based on multi-fisheye camera and double-pinhole projection model
CN102831601A (en) Three-dimensional matching method based on union similarity measure and self-adaptive support weighting
CN110065075A (en) A kind of spatial cell robot external status cognitive method of view-based access control model
CN111429571A (en) Rapid stereo matching method based on spatio-temporal image information joint correlation
CN113487726B (en) Motion capture system and method
CN103426170A (en) Hidden target imaging method based on non-structural light field synthesis aperture imaging
Huang et al. MC-VEO: A visual-event odometry with accurate 6-DoF motion compensation
CN103006332A (en) Scalpel tracking method and device and digital stereoscopic microscope system
CN112329723A (en) Binocular camera-based multi-person human body 3D skeleton key point positioning method
Gaschler et al. Epipolar-based stereo tracking without explicit 3d reconstruction
CN116630423A (en) ORB (object oriented analysis) feature-based multi-target binocular positioning method and system for micro robot
Grundmann et al. A gaussian measurement model for local interest point based 6 dof pose estimation
CN109712195A (en) The method for carrying out homography estimation using the public self-polar triangle of ball picture
Chen et al. End-to-end multi-view structure-from-motion with hypercorrelation volume
Liu et al. Improved template matching based stereo vision sparse 3D reconstruction algorithm

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20130424

Termination date: 20160628

CF01 Termination of patent right due to non-payment of annual fee