CN101794459A - Seamless integration method of stereoscopic vision image and three-dimensional virtual object


Info

Publication number
CN101794459A
CN101794459A
Authority
CN
China
Prior art keywords
virtual
model
image
stereoscopic vision
virtual image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201010110440A
Other languages
Chinese (zh)
Inventor
王晨升 (Wang Chensheng)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.): 2010-02-09
Filing date: 2010-02-09
Publication date: 2010-08-04
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN201010110440A
Publication of CN101794459A
Legal status: Pending

Landscapes

  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a seamless integration method for a stereoscopic vision image and a three-dimensional virtual object, which comprises the following steps: 1) acquiring an actual stereoscopic vision image; 2) performing feature extraction and matching on the acquired stereoscopic vision image and a virtual image model; 3) determining the pose of the matched virtual image model by matching-score calculation; 4) adjusting the scale of the matched virtual image model; and 5) replacing the real object at the teleoperation site with the fully matched virtual image model. The method achieves seamless integration of the stereoscopic image of the real environment with the three-dimensional virtual object by fusing the poses of the stereoscopic image of the teleoperation scene with those of the three-dimensional virtual object and the virtual robot. It helps break through the bottleneck whereby teleoperation has long depended on two-dimensional images and suffered poor positioning accuracy, and it provides a deeply immersive virtual simulation environment for teleoperation and control.

Description

Seamless integration method of a stereoscopic vision image and a three-dimensional virtual object
Technical field
The present invention relates to the field of robot teleoperation, and in particular to a seamless integration method of a stereoscopic vision image and a three-dimensional virtual object for robot manipulation positioning.
Background technology
Existing robot teleoperation control has long depended on two-dimensional images, which differ considerably from the three-dimensional view of the actual environment. Teleoperated robots have therefore always been positioned with two-dimensional images, and inaccurate operation positioning is a pervasive problem.
Achieving seamless fusion between a stereoscopic image of the real environment and a three-dimensional target object requires dynamically fusing the stereoscopic image of the teleoperation scene with the poses of the virtual three-dimensional object and the virtual robot. The purpose is to break through the bottleneck whereby teleoperation control has long depended on two-dimensional images with poor positioning accuracy, and to provide a deeply immersive virtual simulation environment for teleoperation control.
Summary of the invention
In view of the problems in the prior art, the present invention provides a seamless integration method of a stereoscopic vision image and a three-dimensional virtual object that supplies a virtual simulation environment for robot teleoperation control.
To achieve the above object, the seamless integration method of a stereoscopic vision image and a three-dimensional virtual object of the present invention comprises: 1) acquiring an actual stereoscopic vision image; 2) performing feature extraction on the acquired stereoscopic vision image and on the virtual image models according to a feature-extraction algorithm, and then matching the stereoscopic vision image against the virtual image models; 3) rotating the matched virtual image model and computing matching scores to determine the pose of the matched virtual image model; 4) adjusting the scale of the matched virtual image model; and 5) replacing the real object at the teleoperation site with the fully matched virtual image model.
Further, the virtual image models are stored in a model library, and the model information can be called and accessed by different instructions.
Further, the concrete steps for determining the pose of the virtual image model in step 3) are: establish an empirical threshold for pose matching; after each rotation of the virtual image model, compute the actual matching value; when the actual value is below the empirical threshold, the pose of the virtual image model is deemed inconsistent with the stereoscopic vision image and rotation continues; when the actual value exceeds the empirical threshold, the pose is recorded; finally, the recorded pose with the highest actual value is taken as consistent with the pose of the target object in the stereoscopic vision image.
Further, the scale adjustment in step 4) requires scaling the virtual image model in each dimension; the scaling ratio is adjusted using, as its coefficient, the ratio of the bounding box of the target object in the real-scene image to the bounding box of the virtual image model at the same pose.
Further, in step 5), placing the virtual image model, scaled to match the target object, back into the real scene requires a transformation from the world coordinate system to the camera coordinate system.
The seamless integration method of a stereoscopic vision image and a three-dimensional virtual object of the present invention achieves seamless fusion between the stereoscopic image of the real environment and the three-dimensional target object by dynamically fusing the stereoscopic image of the teleoperation scene with the poses of the virtual three-dimensional object and the virtual robot. This helps break through the bottleneck whereby teleoperation control has long depended on two-dimensional images with poor positioning accuracy, and provides a deeply immersive virtual simulation environment for teleoperation control.
Description of drawings
Fig. 1 is a flow chart of the fusion method of the present invention;
Fig. 2 is a flow chart of the virtual image model pose calculation;
Fig. 3 is a schematic diagram of binocular stereo imaging;
Fig. 4a and Fig. 4b are examples of virtual-real fusion based on two-channel video.
Embodiment
As shown in Fig. 1, the seamless integration method of a stereoscopic vision image and a three-dimensional virtual object of the present invention proceeds as follows:
1. Interactive selection of the target object:
The target object can be selected in either the left or the right video image; the operation is similar to selecting a screen region.
2. Model feature extraction and matching retrieval:
A feature-extraction algorithm is used to extract features from the selected target and match them against features extracted from the models in the model knowledge base.
Feature extraction depends on the viewing angle of the selected target object in the video image; a good angle is one that exhibits the object's distinctive characteristics to the greatest extent. During feature matching, the system rotates the model incrementally to guarantee matching accuracy.
Detailed steps of the SIFT feature-extraction algorithm:
(1) Generation of the scale space
The purpose of scale-space theory is to simulate the multi-scale characteristics of image data.
The Gaussian convolution kernel is the only linear kernel that realizes scale change, so the scale space of a two-dimensional image is defined as:
L(x, y, σ) = G(x, y, σ) * I(x, y)   (1)
where G(x, y, σ) is the variable-scale Gaussian function
G(x, y, σ) = (1 / (2πσ²)) · exp(−(x² + y²) / (2σ²))   (2)
(x, y) are spatial coordinates and σ is the scale coordinate.
To detect stable keypoints efficiently, the difference-of-Gaussian scale space (DoG scale space) was proposed on top of the scale space; it is generated by convolving the image with difference-of-Gaussian kernels at different scales:
D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) * I(x, y) = L(x, y, kσ) − L(x, y, σ)   (3)
The DoG operator is simple to compute and approximates the scale-normalized LoG operator.
(2) Parameters to be determined when constructing the scale space
σ — scale-space coordinate
O — octave coordinate
S — sub-level coordinate
σ relates to o and s by σ(o, s) = σ₀ · 2^(o + s/S), with o ∈ o_min + [0, …, O−1] and s ∈ [0, …, S−1],
where σ₀ is the base scale. Note that the octave index may be negative: the index of the first octave is usually set to 0 or −1; when it is −1, the image is doubled in size before the Gaussian scale space is computed.
The spatial coordinate x is a function of the octave. Let x₀ be the spatial coordinate in octave 0; then
x = 2^o · x₀, o ∈ Z, x₀ ∈ [0, …, N₀−1] × [0, …, M₀−1].
If (M₀, N₀) is the resolution of the base octave o = 0, the resolution of the other octaves is obtained from
M_o = ⌊M₀ / 2^o⌋, N_o = ⌊N₀ / 2^o⌋.
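To make formulas (1)–(3) concrete, the following Python sketch builds one octave of the Gaussian scale space and its DoG layers with NumPy and SciPy. The defaults σ₀ = 1.6 and S = 3 are conventional SIFT values assumed here, not values fixed by the patent.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def dog_octave(image, sigma0=1.6, S=3):
    """One octave of the Gaussian scale space L(x, y, sigma) and its
    difference-of-Gaussian layers D(x, y, sigma), per formulas (1)-(3).
    sigma0 and S are assumed defaults, not values fixed by the patent."""
    k = 2.0 ** (1.0 / S)                       # scale step between sub-levels
    # S + 3 Gaussian images yield S + 2 DoG layers, of which S can host extrema.
    sigmas = [sigma0 * k ** s for s in range(S + 3)]
    gaussians = [gaussian_filter(image.astype(np.float64), s) for s in sigmas]
    # D(x, y, sigma) = L(x, y, k*sigma) - L(x, y, sigma)   (formula (3))
    dogs = [g2 - g1 for g1, g2 in zip(gaussians, gaussians[1:])]
    return gaussians, dogs
```

Lower octaves would be produced by halving the image and repeating, per the resolution relation above.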
(3) Accurate localization of extreme points
Keypoint position and scale are determined accurately (to sub-pixel precision) by fitting a three-dimensional quadratic function; at the same time, low-contrast keypoints and unstable edge-response points are removed (the DoG operator produces a strong edge response), which strengthens matching stability and improves noise immunity.
Removal of edge responses
A poorly defined extremum of the difference-of-Gaussian operator has a large principal curvature across the edge and a small principal curvature in the perpendicular direction. The principal curvatures are obtained from a 2×2 Hessian matrix H:
H = [ D_xx  D_xy ; D_xy  D_yy ]   (4)
The derivatives are estimated from differences of neighboring sample points.
The principal curvatures of D are proportional to the eigenvalues of H. Let α be the largest eigenvalue and β the smallest; then
Tr(H) = D_xx + D_yy = α + β,
Det(H) = D_xx · D_yy − (D_xy)² = α · β.
Let α = r·β; then
Tr(H)² / Det(H) = (α + β)² / (αβ) = (rβ + β)² / (rβ²) = (r + 1)² / r.
The value of (r + 1)² / r is smallest when the two eigenvalues are equal and increases with r. Therefore, to check whether the principal-curvature ratio is below some threshold r, it suffices to test
Tr(H)² / Det(H) < (r + 1)² / r.
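This edge test reduces to a few lines of code. The sketch below estimates the Hessian entries of formula (4) by central differences on a DoG layer; the threshold r = 10 is a conventional assumed value, not one specified by the patent.

```python
def passes_edge_test(dog, x, y, r=10.0):
    """Keep the keypoint at (x, y) only if Tr(H)^2 / Det(H) < (r + 1)^2 / r,
    where H is the 2x2 Hessian of formula (4); dog is a 2-D NumPy array
    holding one DoG layer. r = 10 is a conventional assumed threshold."""
    dxx = dog[y, x + 1] + dog[y, x - 1] - 2.0 * dog[y, x]
    dyy = dog[y + 1, x] + dog[y - 1, x] - 2.0 * dog[y, x]
    dxy = (dog[y + 1, x + 1] - dog[y + 1, x - 1]
           - dog[y - 1, x + 1] + dog[y - 1, x - 1]) / 4.0
    tr = dxx + dyy
    det = dxx * dyy - dxy * dxy
    if det <= 0:                       # curvatures of opposite sign: reject
        return False
    return tr * tr / det < (r + 1.0) ** 2 / r
```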
(4) Keypoint orientation assignment
Each keypoint is assigned a direction parameter using the gradient-direction distribution of the pixels in its neighborhood, which makes the operator rotation invariant.
m(x, y) = sqrt( (L(x+1, y) − L(x−1, y))² + (L(x, y+1) − L(x, y−1))² )
θ(x, y) = atan2( L(x, y+1) − L(x, y−1), L(x+1, y) − L(x−1, y) )   (5)
Formula (5) gives the gradient magnitude and direction at (x, y), where L is taken at the scale at which each keypoint lies.
In actual computation, we sample in a neighborhood window centered on the keypoint and build a histogram of the gradient directions of the neighborhood pixels. The gradient histogram covers 0–360 degrees with one bin per 10 degrees, 36 bins in total. The histogram peak represents the principal direction of the neighborhood gradients at the keypoint and is taken as the keypoint's direction.
When the gradient-orientation histogram contains another peak with at least 80% of the energy of the main peak, that direction is regarded as an auxiliary direction of the keypoint. A keypoint may thus be assigned several directions (one principal, more than one auxiliary), which strengthens matching robustness.
At this point the keypoints of the image have been detected, each carrying three pieces of information: position, scale, and direction. A SIFT feature region (drawn as an ellipse or an arrow in the experimental sections) can thus be determined.
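The orientation assignment of formula (5) and the 36-bin histogram can be sketched as follows; the fixed square sampling window (radius 8) is an assumption, since the text only calls for a neighborhood window centered on the keypoint.

```python
import numpy as np

def keypoint_orientations(L, x, y, radius=8):
    """Gradient-orientation histogram around (x, y) on the Gaussian image L
    at the keypoint's scale (formula (5)): 36 bins of 10 degrees; returns the
    principal direction plus auxiliary directions >= 80% of the peak."""
    hist = np.zeros(36)
    h, w = L.shape
    for j in range(max(1, y - radius), min(h - 1, y + radius)):
        for i in range(max(1, x - radius), min(w - 1, x + radius)):
            dx = L[j, i + 1] - L[j, i - 1]
            dy = L[j + 1, i] - L[j - 1, i]
            m = np.hypot(dx, dy)                      # gradient magnitude
            theta = np.degrees(np.arctan2(dy, dx)) % 360.0
            hist[int(theta // 10) % 36] += m          # 10-degree bins
    peak = hist.max()
    return [b * 10.0 for b in range(36) if hist[b] >= 0.8 * peak]
```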
(5) Feature-point descriptor generation
First the coordinate axes are rotated to the keypoint's direction to guarantee rotation invariance. Next an 8×8 window centered on the keypoint is taken, and an 8-direction gradient-orientation histogram is computed on each 4×4 sub-block; the accumulated value of each gradient direction forms one seed point. This pooling of neighborhood directional information strengthens the algorithm's resistance to noise and also provides good tolerance for feature matches containing localization errors.
In actual computation, to strengthen matching robustness, it is recommended to describe each keypoint with 4×4 = 16 seed points, producing 128 values per keypoint, i.e. a final 128-dimensional SIFT feature vector. At this point the SIFT feature vector is free of geometric deformation factors such as scale change and rotation; normalizing the feature vector's length then further removes the influence of illumination change.
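A minimal sketch of the recommended 128-dimensional descriptor follows (16 seed points × 8 directions, i.e. a 16×16 window). For brevity it omits the rotation of the coordinate axes to the keypoint direction and the Gaussian weighting of full SIFT, and it assumes the window plus a one-pixel border lies inside the image.

```python
import numpy as np

def sift_descriptor(L, x, y):
    """Simplified 128-D descriptor: a 16x16 window around the keypoint split
    into 4x4 sub-blocks of 4x4 pixels, with an 8-bin orientation histogram
    per sub-block (4*4*8 = 128). Omits the axis rotation to the keypoint
    direction and the Gaussian weighting used in full SIFT."""
    desc = np.zeros((4, 4, 8))
    for j in range(16):
        for i in range(16):
            yy, xx = y - 8 + j, x - 8 + i
            dx = L[yy, xx + 1] - L[yy, xx - 1]
            dy = L[yy + 1, xx] - L[yy - 1, xx]
            m = np.hypot(dx, dy)
            theta = np.degrees(np.arctan2(dy, dx)) % 360.0
            desc[j // 4, i // 4, int(theta // 45) % 8] += m   # 45-degree bins
    v = desc.ravel()
    n = np.linalg.norm(v)
    return v / n if n > 0 else v       # length normalization (illumination)
```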
After the SIFT feature vectors of the two images have been generated, the next step uses the Euclidean distance between keypoint feature vectors as the similarity measure between keypoints in the two images. Take a keypoint in image 1 and find the two keypoints in image 2 with the smallest Euclidean distances; if the nearest distance divided by the second-nearest distance is below a proportion threshold, the pair is accepted as a match. Lowering this proportion threshold reduces the number of SIFT matches but makes them more stable.
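The nearest/second-nearest ratio test translates directly into code; the 0.8 ratio below is an assumed illustrative value, since the text leaves the proportion threshold open.

```python
import numpy as np

def match_ratio_test(desc1, desc2, ratio=0.8):
    """Match two sets of SIFT vectors (rows of 2-D arrays) by Euclidean
    distance with the nearest/second-nearest ratio test described above."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        nearest, second = np.argsort(dists)[:2]
        if dists[nearest] < ratio * dists[second]:   # distinctive match only
            matches.append((i, int(nearest)))
    return matches
```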
3. Computing the pose transformation and spatial position of the matched model:
The main purpose is to determine the orientation, exact position in the scene, and pose of the virtual model. The pose is determined by continuously rotating the virtual model and scoring the matching result; when the matching score is highest and reaches the preset empirical threshold, the system regards the model pose as consistent with the pose of the target object in the real scene. The computation process is shown in Fig. 2.
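A minimal sketch of this rotation search follows. The scoring function and the 10-degree rotation increment are hypothetical stand-ins: the patent specifies only incremental rotation, an empirical threshold, and selection of the highest-scoring pose (Fig. 2).

```python
import itertools

def estimate_pose(score_fn, threshold, step_deg=10):
    """Incremental-rotation pose search: try every (yaw, pitch, roll) on a
    fixed grid, record poses whose matching score reaches the empirical
    threshold, and return the highest-scoring one (or None).
    score_fn(yaw, pitch, roll) -> float is a caller-supplied, hypothetical
    matching-score interface; the 10-degree step is an assumption."""
    best_pose, best_score = None, float("-inf")
    angles = range(0, 360, step_deg)
    for yaw, pitch, roll in itertools.product(angles, angles, angles):
        score = score_fn(yaw, pitch, roll)
        if score >= threshold and score > best_score:
            best_pose, best_score = (yaw, pitch, roll), score
    return best_pose, best_score
```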
4. Model scale adjustment:
A matched model may sometimes be inconsistent in scale with the real object, so the virtual model must be scaled in each dimension. As shown in Fig. 4a and Fig. 4b, the scaling ratio uses as its coefficient the ratio of the bounding box of the target object in the real-scene image to the bounding box of the virtual model at the same pose. Before the virtual model is put back into the real scene, this scale transformation is applied first to guarantee that the virtual model and the real object agree in spatial scale.
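Under the assumption that the bounding boxes are axis-aligned (width, height) pairs, the scaling coefficient can be sketched as below; averaging the two axis ratios is an illustrative choice, since the text states only that the box ratio serves as the coefficient.

```python
def scale_coefficient(target_box, model_box):
    """Scaling coefficient from the bounding boxes described above; boxes are
    assumed (width, height) pairs, target_box measured in the real-scene image
    and model_box on the virtual model rendered at the same pose."""
    tw, th = target_box
    mw, mh = model_box
    return 0.5 * (tw / mw + th / mh)   # average of the two axis ratios
```

The virtual model is multiplied by this coefficient in each dimension before being placed back into the scene.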
5. Virtual-real fusion
The core of virtual-real fusion is replacing the real object at the teleoperation site with the matched model retrieved from the model library. In this process, placing the virtual model, scaled to the target object, back into the real scene requires a transformation from the world coordinate system to the camera coordinate system.
Usually the two camera centers are kept on the same horizontal plane, i.e. the cameras are placed horizontally, so the image Y coordinates of a feature point P are identical: Y_left = Y_right = Y. From the triangle geometry shown in Fig. 3, the following equations are obtained:
X_left = f · x_c / z_c
X_right = f · (x_c − B) / z_c
Y = f · y_c / z_c
where the disparity is Disparity = X_left − X_right. The three-dimensional coordinates of feature point P in the camera coordinate system can then be calculated as:
x_c = B · X_left / Disparity
y_c = B · Y / Disparity
z_c = B · f / Disparity   (4-1-3)
Here (X_left, Y) and (X_right, Y) are the image points of the spatial point P in the left and right images; corresponding image points are obtained by computing pixel correlation:
Corr(x, y) = Σ_{i=1..m} (x_i − x̄)(y_i − ȳ) / [ Σ_{i=1..m} (x_i − x̄)² · Σ_{i=1..m} (y_i − ȳ)² ]^(1/2)
where (x, y) denotes the given pixel, (x_i, y_i) the pixels paired with it in the other image, and m the template size. For the projection pixel (x, y) of a given P(x_c, y_c, z_c) in one image of the stereo pair, when the formula attains its extreme value (maximal correlation) under the given template size m, (x_i, y_i) is taken as the projection corresponding to (x, y) in the other image of the stereo pair.
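A minimal sketch of this correlation score for two equal-size template patches follows; scanning candidate patches in the other image and keeping the one that maximizes the score yields the corresponding pixel.

```python
import numpy as np

def correlation(patch1, patch2):
    """Correlation score of two equal-size template patches, per the formula
    above: zero-mean dot product over the product of patch norms."""
    a = patch1.astype(np.float64).ravel()
    b = patch2.astype(np.float64).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0
```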
Formula (4-1-3) is used to relate the actual coordinates of the model in the world coordinate system to the projected coordinates of its virtual model in the camera coordinate system.
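Given matched image points, formula (4-1-3) recovers camera-frame coordinates directly; in the sketch below, f and B denote the focal length and baseline of the parallel rig in Fig. 3, in consistent units.

```python
import numpy as np

def triangulate(x_left, x_right, y, f, B):
    """Camera-frame coordinates of P from its left/right image projections,
    per formula (4-1-3); assumes a parallel rig and nonzero disparity."""
    disparity = x_left - x_right               # Disparity = X_left - X_right
    z_c = B * f / disparity
    x_c = B * x_left / disparity
    y_c = B * y / disparity
    return np.array([x_c, y_c, z_c])
```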
Fig. 4a and Fig. 4b give screenshots of a virtual-real fusion example in the left and right video images. In this example the target object is replaced by a virtual three-dimensional model, and the model can be selected.
It should be noted that any variation made according to the specific embodiments of the present invention does not depart from the spirit of the invention or the scope set forth in the claims.

Claims (5)

1. A seamless integration method of a stereoscopic vision image and a three-dimensional virtual object, comprising: 1) acquiring an actual stereoscopic vision image; 2) performing feature extraction on the acquired stereoscopic vision image and on the virtual image models according to a feature-extraction algorithm, and then matching the stereoscopic vision image against the virtual image models; 3) rotating the matched virtual image model and computing matching scores to determine the pose of the matched virtual image model; 4) adjusting the scale of the matched virtual image model; and 5) replacing the real object at the teleoperation site with the fully matched virtual image model.
2. The seamless integration method of a stereoscopic vision image and a three-dimensional virtual object as claimed in claim 1, characterized in that the virtual image models are stored in a model library and the model information can be called and accessed by different instructions.
3. The seamless integration method of a stereoscopic vision image and a three-dimensional virtual object as claimed in claim 1, characterized in that the concrete steps for determining the pose of the virtual image model in step 3) are: establishing an empirical threshold for pose matching; after each rotation of the virtual image model, computing the actual matching value; when the actual value is below the empirical threshold, deeming the pose of the virtual image model inconsistent with the stereoscopic vision image and continuing to rotate the virtual image model; when the actual value exceeds the empirical threshold, recording the pose; and finally taking the recorded pose with the highest actual value as consistent with the pose of the target object in the stereoscopic vision image.
4. The seamless integration method of a stereoscopic vision image and a three-dimensional virtual object as claimed in claim 1, characterized in that the scale adjustment of the virtual image model in step 4) requires scaling the virtual image model in each dimension, the scaling ratio being adjusted with, as its coefficient, the ratio of the bounding box of the target object in the real-scene image to the bounding box of the virtual image model at the same pose.
5. The seamless integration method of a stereoscopic vision image and a three-dimensional virtual object as claimed in claim 1, characterized in that, in step 5), placing the virtual image model, scaled to match the target object, back into the real scene requires a transformation from the world coordinate system to the camera coordinate system.
CN201010110440A 2010-02-09 2010-02-09 Seamless integration method of stereoscopic vision image and three-dimensional virtual object Pending CN101794459A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010110440A CN101794459A (en) 2010-02-09 2010-02-09 Seamless integration method of stereoscopic vision image and three-dimensional virtual object


Publications (1)

Publication Number Publication Date
CN101794459A true CN101794459A (en) 2010-08-04

Family

ID=42587130

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010110440A Pending CN101794459A (en) 2010-02-09 2010-02-09 Seamless integration method of stereoscopic vision image and three-dimensional virtual object

Country Status (1)

Country Link
CN (1) CN101794459A (en)

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102157011A (en) * 2010-12-10 2011-08-17 北京大学 Method for carrying out dynamic texture acquisition and virtuality-reality fusion by using mobile shooting equipment
CN102096941B (en) * 2011-01-30 2013-03-20 北京航空航天大学 Consistent lighting method under falsehood-reality fused environment
CN102096941A (en) * 2011-01-30 2011-06-15 北京航空航天大学 Consistent lighting method under falsehood-reality fused environment
CN103258350A (en) * 2013-03-28 2013-08-21 广东欧珀移动通信有限公司 Method and device for displaying 3D images
CN105488801B (en) * 2015-12-01 2019-02-15 深圳华强数码电影有限公司 The method and system that spherical screen stereoscopic film real scene shooting and three-dimensional virtual scene combine
CN105488801A (en) * 2015-12-01 2016-04-13 深圳华强数码电影有限公司 Method and system for combining real shooting of full dome film with three-dimensional virtual scene
CN106097454A (en) * 2016-06-06 2016-11-09 成都天福创造机器人有限公司 A kind of man-machine interactive system and exchange method
CN107341827A (en) * 2017-07-27 2017-11-10 腾讯科技(深圳)有限公司 A kind of method for processing video frequency, device and storage medium
CN109003305A (en) * 2018-07-18 2018-12-14 江苏实景信息科技有限公司 A kind of positioning and orientation method and device
CN109003305B (en) * 2018-07-18 2021-07-20 江苏实景信息科技有限公司 Positioning and attitude determining method and device
CN112639888A (en) * 2018-08-09 2021-04-09 祖克斯有限公司 Programmed world generation
US11861790B2 (en) 2018-08-09 2024-01-02 Zoox, Inc. Procedural world generation using tertiary data
CN110969682A (en) * 2019-11-27 2020-04-07 深圳追一科技有限公司 Virtual image switching method and device, electronic equipment and storage medium
CN110969682B (en) * 2019-11-27 2021-03-02 深圳追一科技有限公司 Virtual image switching method and device, electronic equipment and storage medium
CN114359522A (en) * 2021-12-23 2022-04-15 阿依瓦(北京)技术有限公司 AR model placing method and device


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Open date: 20100804