CN103226830B - Automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment - Google Patents

Automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment

Info

Publication number: CN103226830B (granted); other version: CN103226830A (application)
Application number: CN201310148771.0A
Authority: CN (China)
Original language: Chinese (zh)
Prior art keywords: video, texture, scene, virtual, image
Legal status: Active (granted)
Inventors: 高鑫光, 兰江, 李胜, 汪国平
Original assignee: Peking University (application filed by Peking University)
Current assignee: Beijing weishiwei Information Technology Co., Ltd.
Priority and filing date: 2013-04-25
Publication dates: CN103226830A published 2013-07-31; CN103226830B granted 2016-02-10


Abstract

The present invention relates to an automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment, and to a method for fusing real video images with a virtual scene. The steps of the correction method are: building the virtual scene, acquiring video data, fusing the video texture, and correcting the projector. Real captured video is projected as texture onto complex scene surfaces such as terrain and buildings and fused with the virtual scene, which improves the expression of dynamic scene information in the virtual environment and enhances the scene's sense of depth. By increasing the number of videos taken from different shooting angles, dynamic video texture mapping over a large-scale virtual scene is achieved, producing a realistic, dynamic virtual-real fusion of the virtual environment and the displayed scene. Color-consistency preprocessing of the video frames eliminates obvious color jumps and improves the visual effect. The automatic correction algorithm proposed by the present invention makes the fusion of the virtual scene and the real video more accurate.

Description

Automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment
Technical field
The present invention relates to virtual reality, and in particular to a method for fusing real video images with a virtual scene and correcting the result, belonging to the technical fields of virtual reality, computer graphics, computer vision, and human-computer interaction.
Background art
In virtual reality systems, static images are the most common means of depicting the surface details of buildings or the ground, usually realized through texture mapping. The deficiency of this approach is that once the surface texture of the scene is set it never changes; changing elements of the scene surface are ignored, which reduces the realism of the virtual environment and cannot give people an immersive feeling. To eliminate the lack of realism caused by static images, replacing pictures with video is an intuitive idea. Some existing systems do add video elements, but mostly in the form of playback windows: an existing video player plays the video, achieving only a global-monitoring effect and not a true fusion of video and scene. Some research work improves on this by constructing additional planes in space and playing video on those planes to enhance realism (see K. Kim, S. Oh, J. Lee, I. Essa, "Augmenting Aerial Earth Maps with Dynamic Information," IEEE International Symposium on Mixed and Augmented Reality, Science and Technology Proceedings, 19-22 Oct. 2009, Orlando, Florida, USA; and Y. Wang, D. Bowman, D. Krum, E. Coelho, T. Smith-Jackson, D. Bailey, S. Peck, S. Anand, T. Kennedy, and Y. Abdrazakov, "Effects of Video Placement and Spatial Context Presentation on Path Reconstruction Tasks with Contextualized Videos," IEEE Transactions on Visualization and Computer Graphics, Vol. 14, No. 6, November/December 2008). Although these methods add video to the virtual environment, their usable settings are very limited: the video can only be attached to large building facades or flat ground. For slightly more complicated scenes, such as building corners or irregular ground whose geometry cannot be approximated by a plane, these plane-based video playback methods are inapplicable.
On the other hand, the development of graphics and computer vision has produced many mature algorithms, such as matching based on color, on texture, and on features (edge direction, SIFT, HOG). However, these methods all operate on two-dimensional images and have considerable limitations in three-dimensional space. Existing projector-correction algorithms are mostly keystone-correction algorithms for the projection region in a "projector-screen" system. For example, "Multi-projector image correction method and device" (application number 201010500209.6) limits correction to two-dimensional space: correction parameters corresponding to each individual image are obtained from the image information of the non-overlapping regions gathered separately by one camera, and the video data of the corresponding camera is corrected according to those parameters; the correction applies only to images with little or no overlap. "Touchable true three-dimensional display method based on a multi-projector rotating-panel three-dimensional image" (application number 200810114457.X) describes obtaining cross-sectional images of a three-dimensional space at different angles, so that a hand can directly touch the stereoscopic image while its contrast is improved; but that application relies mainly on a rotating screen to make the three-dimensional image touchable, which differs from the application scenario of the present work.
Neither the above patent applications nor prior-art feature matching methods offer much guidance for the correction of projectors in three-dimensional space.
Summary of the invention
The object of the present invention is to use real captured video, projected as texture, to carry out virtual-scene fusion on complex scene surfaces such as terrain and buildings, improving the expression of dynamic scene information in the virtual environment and enhancing the scene's sense of depth; and, by increasing the number of videos taken from different shooting angles, to achieve dynamic video texture mapping over a large-scale virtual scene, thereby producing the dynamic, realistic virtual-real fusion of the virtual environment and the displayed scene.
To achieve this object, the present invention adopts the following technical scheme:
An automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment comprises the steps of:
1) building, from remote-sensing image data obtained in advance, a terrain model whose surface carries a static texture image, and a virtual scene composed of multiple models with three-dimensional geometry and textures; acquiring multiple segments of real captured video streams and recording the camera pose at shooting time;
2) adding to the virtual scene, according to the recorded camera pose, a virtual projector model and a viewing volume corresponding to the camera parameters, and setting the initial pose of the virtual projector in the virtual scene from the camera pose information;
3) preprocessing the frames of the real captured video streams to obtain a dynamic video texture, and projecting the preprocessed video data into the virtual environment by projective texturing;
4) fusing the static textures of the model surfaces in the virtual environment and/or the original remote-sensing terrain texture with the dynamic video texture, to obtain the final texture values covering the scene surface;
5) rendering, from the virtual projector model and according to the final texture values, the image under the projector's viewpoint, matching it with the corresponding image in the real captured video stream, and constructing an energy function;
6) resetting the initial pose of the projector in the virtual scene with the optimal solution of the energy function, completing the projector correction.
Further, in step 4), the texture fusion method is as follows:
1) reset the modelview and projection matrices to switch the virtual viewpoint to the projector's viewpoint, draw the virtual scene, and obtain the depth values under the current projector's viewpoint (depth buffering is realized with a Z-buffer);
2) reset the modelview and projection matrices back to the virtual viewpoint, redraw the virtual scene, and obtain the real depth value of each corresponding point in the scene;
3) draw the virtual scene successively under each projector's viewpoint, obtain the projective texture coordinates of each point in the scene through automatic texture-coordinate generation, and compare the real depth values of step 2) with the depth values of step 1);
4) if the two are equal, use the projector's video texture; if not, use the scene model's own texture; iterate by setting the texture combiner function until all projectors in the scene have been traversed, obtaining the final texture value of each point in the scene (a minimal sketch of this depth test follows the list).
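The depth comparison in steps 1)-4) is essentially a shadow-map visibility test. Below is a minimal NumPy sketch of that test on the CPU; in the actual method it runs on the GPU through Z-buffers and texture combiners, and all function and parameter names here are illustrative assumptions. The projector depth map is assumed to be already rendered as in step 1) and to share one resolution with the video frame.

```python
import numpy as np

def project(points_world, view, proj):
    """Transform Nx3 world points by 4x4 view and projection matrices;
    return normalized device coordinates plus the clip-space w."""
    homo = np.hstack([points_world, np.ones((len(points_world), 1))])
    clip = homo @ (proj @ view).T
    return clip[:, :3] / clip[:, 3:4], clip[:, 3]

def fuse_textures(points, proj_view, proj_proj, depth_map, video_tex,
                  model_tex, eps=1e-3):
    """Per-point depth test: use the projector's video texture where the
    point is visible from the projector, else the model's own texture."""
    ndc, w = project(points, proj_view, proj_proj)
    uv = (ndc[:, :2] + 1.0) * 0.5            # bias NDC [-1,1] into [0,1]
    depth = (ndc[:, 2] + 1.0) * 0.5
    h, wid = depth_map.shape
    px = np.clip((uv[:, 0] * (wid - 1)).astype(int), 0, wid - 1)
    py = np.clip((uv[:, 1] * (h - 1)).astype(int), 0, h - 1)
    inside = (w > 0) & (uv >= 0).all(axis=1) & (uv <= 1).all(axis=1)
    visible = inside & (np.abs(depth_map[py, px] - depth) < eps)
    return np.where(visible[:, None], video_tex[py, px], model_tex)
```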
Further, in step 5), matching with the corresponding image in the real captured video stream and constructing the energy function with the pose information as its independent variable proceed as follows:
Step 1: reset the modelview and projection matrices to move the viewpoint in the virtual scene to the projector's position, draw the scene to obtain an image of the virtual environment, segment the image with the mean-shift algorithm, and then binarize it;
Step 2: extract a key frame from the real captured video stream and binarize it with the method of Step 1;
Step 3: compute the contour error within the viewing-volume region formed by the projector: XOR the images obtained in the first two steps pixel by pixel and count the pixels equal to 1; this result is the first part of the energy function;
Step 4: use the SIFT consistency operator to add features carrying local information: collect matching point pairs from the non-binarized images obtained in Steps 1 and 2, and obtain the error value of the matched pairs through key-point constraint processing; this error value is the second part of the energy function;
Step 5: assign different weights to the two parts of the energy function;
Step 6: solve for the optimal value of the energy function;
Step 7: replace the initial pose of the projector with the optimal solution (a sketch of the assembled two-part energy function follows the list).
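A hedged sketch of the two-part energy function is given below, using OpenCV for binarization and SIFT matching. The weights and the render_view callback (assumed to redraw the virtual scene at a candidate projector pose) are illustrative assumptions; the patent fixes neither the matcher nor the threshold values.

```python
import cv2
import numpy as np

def contour_error(binary_virtual, binary_real):
    """Part 1: pixelwise XOR of the two binarized images; the count of
    1-pixels measures the silhouette mismatch."""
    return int(np.count_nonzero(cv2.bitwise_xor(binary_virtual, binary_real)))

def keypoint_error(gray_virtual, gray_real):
    """Part 2: SIFT matches between the non-binarized images; the summed
    distance of matched pairs acts as the key-point constraint error."""
    sift = cv2.SIFT_create()
    k1, d1 = sift.detectAndCompute(gray_virtual, None)
    k2, d2 = sift.detectAndCompute(gray_real, None)
    if d1 is None or d2 is None:
        return 1e9  # no features found: heavily penalize this pose
    matches = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True).match(d1, d2)
    pts = [(k1[m.queryIdx].pt, k2[m.trainIdx].pt) for m in matches]
    return sum(np.hypot(p[0] - q[0], p[1] - q[1]) for p, q in pts)

def energy(pose, key_frame, render_view, w_contour=0.7, w_keypoint=0.3):
    """pose = (x, y, z, phi, theta, gamma); the contour term carries the
    larger weight, as the description assigns more weight to it."""
    virt = render_view(pose)
    gv = cv2.cvtColor(virt, cv2.COLOR_BGR2GRAY)
    gr = cv2.cvtColor(key_frame, cv2.COLOR_BGR2GRAY)
    bv = cv2.threshold(gv, 127, 255, cv2.THRESH_BINARY)[1]
    br = cv2.threshold(gr, 127, 255, cv2.THRESH_BINARY)[1]
    return (w_contour * contour_error(bv, br)
            + w_keypoint * keypoint_error(gv, gr))
```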
Further, the optimal value of the energy function is solved as follows:
Simulated annealing is first applied to the energy function to narrow the solution space of the function to the neighborhood of the optimal solution; the downhill simplex algorithm then compresses the approximate solution space, yielding the optimal solution.
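This two-stage solver can be sketched with SciPy, letting dual_annealing stand in for the simulated-annealing stage and Nelder-Mead for the downhill simplex stage; the search bounds around the initial pose are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import dual_annealing, minimize

def correct_pose(energy_fn, initial_pose, pos_range=5.0, ang_range=0.1):
    """energy_fn: pose -> scalar (e.g. functools.partial over the energy
    above). Searches a box around the initial pose: x, y, z in meters and
    phi, theta, gamma in radians."""
    p = np.asarray(initial_pose, dtype=float)
    bounds = [(p[i] - pos_range, p[i] + pos_range) for i in range(3)] + \
             [(p[i] - ang_range, p[i] + ang_range) for i in range(3, 6)]
    # Stage 1: simulated annealing narrows the solution space.
    coarse = dual_annealing(energy_fn, bounds, maxiter=200)
    # Stage 2: downhill simplex compresses the approximate solution space.
    fine = minimize(energy_fn, coarse.x, method='Nelder-Mead',
                    options={'xatol': 1e-4, 'fatol': 1e-4})
    return fine.x  # optimal pose replaces the projector's initial pose
```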
Further, when Steps 1 and 2 segment the image with the mean-shift algorithm, the color characteristics of buildings and roads are used: the pixel values of non-building, non-road regions are set to white, the regions corresponding to building models or roads are preserved, the image is then binarized, and the building- and road-related regions are set to black.
Further, the video frames of the real captured video streams are preprocessed as follows:
The video data is decoded into individual video frame images; a sample frame is extracted from each video stream and the SIFT operator is used to find matching feature points between sample frames; and color-consistency processing is performed.
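As one possible realization of the decoding step, the sketch below reads a stream with OpenCV, saves each frame as JPEG, and returns a sample frame for the cross-stream SIFT matching; the stream URL and file layout are assumptions.

```python
import cv2

def decode_stream(url, out_dir, sample_index=0):
    """Decode a video stream into individual frames, saving each as JPEG
    and returning one sample frame for feature matching across streams."""
    cap = cv2.VideoCapture(url)
    sample, i = None, 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(f"{out_dir}/frame_{i:06d}.jpg", frame)  # one JPEG per frame
        if i == sample_index:
            sample = frame
        i += 1
    cap.release()
    return sample
```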
Further, the color-consistency processing is:
1) extract one sample frame from each of the two videos being matched, build the color histogram of all pixels in the frame, and apply histogram equalization and specification so that the two video frames have the same color histogram distribution;
2) apply the same histogram equalization and specification as for the corresponding sample frame to each frame in the same video stream, thereby completing consistency processing for the whole video stream (a sketch of the histogram matching follows the list);
3) create a cache for the video frames, sized to hold about 50 frames (at a video frame resolution of 1920*1080);
4) load the frame data through a first-in-first-out (FIFO) list structure.
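Histogram specification can be realized by aligning cumulative distribution functions per channel, as in the hedged sketch below; the patent does not prescribe an exact formula, so this is one standard construction.

```python
import numpy as np

def match_histogram(channel, ref_channel):
    """Remap a uint8 'channel' so its histogram matches 'ref_channel' by
    aligning the two cumulative distribution functions (CDFs)."""
    src_hist = np.bincount(channel.ravel(), minlength=256).astype(float)
    ref_hist = np.bincount(ref_channel.ravel(), minlength=256).astype(float)
    src_cdf = np.cumsum(src_hist) / src_hist.sum()
    ref_cdf = np.cumsum(ref_hist) / ref_hist.sum()
    # For each source level, pick the reference level with the closest CDF.
    lut = np.searchsorted(ref_cdf, src_cdf).clip(0, 255).astype(np.uint8)
    return lut[channel]

def match_frame(frame, ref_frame):
    """Per-channel histogram matching; the same lookup tables would then be
    reused for every frame of the same video stream."""
    return np.dstack([match_histogram(frame[..., c], ref_frame[..., c])
                      for c in range(3)])
```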
Further, the real captured video streams are obtained over the HTTP protocol, the video data is decoded locally, and the video frames are saved in JPEG format.
Further, multi-resolution processing is performed on the video frame images: the same image is loaded as video frames of different resolutions according to the situation, and pixelwise bilinear interpolation is used to resample the image to one or more of 1/4, 1/16, and 1/64 of the original.
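A minimal sketch of the resolution pyramid: halving width and height at each level yields 1/4, 1/16, and 1/64 of the original pixel count, with bilinear interpolation doing the resampling.

```python
import cv2

def build_pyramid(image, levels=3):
    """Return [1/4, 1/16, 1/64]-area versions of 'image' using bilinear
    interpolation, so a frame can be loaded at whichever resolution the
    display situation requires."""
    pyramid = []
    current = image
    for _ in range(levels):
        h, w = current.shape[:2]
        current = cv2.resize(current, (w // 2, h // 2),
                             interpolation=cv2.INTER_LINEAR)
        pyramid.append(current)
    return pyramid
```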
Further, an Alpha channel is added to the JPEG-format video images.
The present invention also proposes a method for fusing real video images with a virtual scene, whose steps are:
1) build, from remote-sensing image data obtained in advance, models whose surfaces carry static texture images, and the virtual scene; the spatial positions of the models and the relative positions, orientations, and sizes between models in the virtual scene are kept consistent with the real scene;
2) acquire multiple segments of real captured video streams and record the camera pose at shooting time;
3) the method of the invention can be implemented on a virtual-reality platform based on a digital earth, where each virtual projector carries two coordinate representations, geolocation information and the Cartesian coordinates of the virtual environment; accordingly, convert the latitude-longitude coordinates of the shooting location on the earth's surface into the Cartesian world coordinates of the virtual scene (a sketch of this conversion follows the list), add the virtual projector model and the corresponding viewing volume to the virtual scene, and set the initial pose of the virtual projector in the virtual scene under the world coordinate system from the camera pose information;
4) preprocess the frames of the real captured video streams to obtain a dynamic video texture, and project the preprocessed video data into the virtual environment by projective texturing;
5) fuse the static model textures and/or the original remote-sensing terrain texture in the virtual environment with the dynamic video texture;
6) apply texture fusion to the intersecting coverage regions of different virtual projectors.
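As referenced in step 3), converting a latitude-longitude position on the earth's surface into Cartesian world coordinates can be sketched with the standard WGS84 geodetic-to-ECEF formula; the digital-earth platform in question may use a different datum or a scene-local origin, so treat this as an assumption.

```python
import math

WGS84_A = 6378137.0              # semi-major axis (m)
WGS84_E2 = 6.69437999014e-3      # first eccentricity squared

def geodetic_to_ecef(lat_deg, lon_deg, alt_m=0.0):
    """Return (x, y, z) earth-centered Cartesian coordinates in meters."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + alt_m) * math.cos(lat) * math.cos(lon)
    y = (n + alt_m) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + alt_m) * math.sin(lat)
    return x, y, z
```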
Beneficial effects of the present invention:
(a) It overcomes complicated scene conditions and achieves the fusion of video with the virtual scene: video texture replaces the original terrain remote-sensing texture and the model's inherently coarse static image texture, adding dynamic information to the virtual scene texture and improving the visual effect; increasing the number of videos widens the coverage.
(b) It provides a buffer structure for the video and builds a data pyramid, improving display efficiency; data from adjacent pyramid levels can substitute for each other.
(c) It provides an automatic correction algorithm that adjusts the initial pose of the virtual projector, making the fusion of the virtual scene and the real video more accurate. Compared with the initial pose, the improvement is reflected in the value of the energy function: the more accurate the pose, the closer the energy value approaches zero.
(d) Color-consistency preprocessing of the video frames eliminates obvious color jumps and improves the visual effect.
Brief description of the drawings
Fig. 1 is a schematic flow chart of a concrete implementation in one embodiment of the automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment of the present invention;
Fig. 2a and Fig. 2b are schematic views of the scene without the projected texture in one embodiment of the method;
Fig. 3a and Fig. 3b are schematic views of the scene with the projected texture added in one embodiment of the method;
Fig. 4 is a schematic view of the scene with the projector uncorrected in one embodiment of the method;
Fig. 5 is a schematic view of the scene with the projector corrected in one embodiment of the method.
Embodiments
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention; all other embodiments obtained by those skilled in the art without creative work on the basis of these embodiments fall within the protection scope of the present invention.
(1) Building the virtual scene. Using remote-sensing images obtained in advance, set the terrain texture and build in the virtual space the models whose surfaces carry static texture images and the virtual scene they compose; the spatial positions of the models and the relative positions, orientations, sizes, and other elements between models in the scene should be kept as consistent with the real scene as possible.
(2) Acquiring video data. The data source can be a surveillance camera or video captured by a mobile device; the camera parameters at shooting time are obtained at the same time, for the initial placement of the projector in the virtual space. The video stream is decomposed into single-frame images, and multi-resolution processing and color/illumination consistency processing are applied to the images.
(3) Fusing the video texture. According to the camera latitude-longitude information obtained in step (2), the virtual projector model and its viewing volume are added to the scene of step (1), and the orientation of the virtual projector is set from the camera pose obtained in step (2). At present, owing to implementation limits, at most 32 virtual projector models can be loaded simultaneously within the viewing-volume space of a single virtual viewpoint. The video data is projected into the virtual environment by projective texturing, and the original remote-sensing texture of the models or ground is fused with the dynamic video texture. If different projectors have intersecting coverage regions, a fusion operation is also applied to those regions.
(4) Correcting the projector. The image under the virtual projector's viewpoint is rendered and matched against the corresponding real video. With the algorithm of the present invention, the disparity range of buildings or roads between the virtual image and the real image, together with the local feature differences, is computed to construct the energy function, whose optimal solution is then found. The projector in the virtual scene is reset with the optimal solution, completing the correction and improving the effect.
The method of the invention is illustrated below from several aspects.
First, some concepts are clarified:
Camera: the video source in real space, used to acquire video data.
Projector: a virtual model in the virtual scene, used to project the video texture into the scene.
Texture fusion: one model may carry textures from several separate sources, so the different texture color values at the same point must be fused to obtain the final color value.
Depth value: for any point in space, the value obtained after perspective transformation that represents its distance from the virtual viewpoint along the Z direction.
Depth buffer (Z-buffer): a buffer of the same size as the color buffer, kept while the scene is rendered; each of its elements stores one scene depth value, namely the depth of the object surface nearest the viewpoint in the three-dimensional scene at that element.
Projective texturing: unlike the traditional four-point texture mapping, the texture is applied to the virtual scene in the form of a projection and fused with the buildings and/or terrain in the scene as their final texture (a sketch of the underlying texture-coordinate generation follows these definitions).
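A small sketch of the automatic texture-coordinate generation behind projective texturing: a world point is run through the projector's view and projection matrices and a bias matrix that maps [-1, 1] clip space into [0, 1] texture space. The OpenGL-style matrix conventions are an assumption.

```python
import numpy as np

# Bias matrix: scales and offsets normalized device coordinates into [0,1].
BIAS = np.array([[0.5, 0.0, 0.0, 0.5],
                 [0.0, 0.5, 0.0, 0.5],
                 [0.0, 0.0, 0.5, 0.5],
                 [0.0, 0.0, 0.0, 1.0]])

def projective_texcoord(point_world, proj_view, proj_projection):
    """Return (s, t) texture coordinates of a 3D point in the projector's
    video texture, or None when the point lies behind the projector."""
    p = np.append(np.asarray(point_world, dtype=float), 1.0)
    q = BIAS @ proj_projection @ proj_view @ p
    if q[3] <= 0.0:
        return None                     # behind the projector
    return (q[0] / q[3], q[1] / q[3])   # perspective divide
```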
Technical scheme (2) is implemented as follows. The video data is obtained over the conventional HTTP protocol and decoded locally into individual video frames, which are saved in JPEG format. A sample frame is extracted from each video stream, and the SIFT operator is used to find matching feature points between sample frames (the video sources can be pre-sorted here, grouping together sources that film the same building, which shortens the matching time and improves preprocessing efficiency); color-consistency processing is then performed.
The concrete operation of color-consistency processing is: extract one sample frame from each of the two videos being matched, build the color histogram of all pixels in the frame, and apply histogram equalization and specification so that the two frames have the same color histogram distribution. All frames within one video stream have approximately the same color histogram distribution as its sample frame, so the same equalization and specification as for the sample frame is applied to every frame in the stream, completing consistency processing for the whole stream. The purpose is to give the video textures of overlapping regions the same texture color, improving the fusion effect between videos and avoiding obvious jump artifacts. Because the decoded video frames are large and memory is precious, a cache of 50 video frames is created. The number 50 is chosen because, in acquiring and parsing the streams, the maximum resolution of a single frame is 1920*1080 with 4 bytes per pixel, i.e. RGB plus an Alpha channel, where Alpha determines the translucency of the image over a range of 0 to 255 (here 0 denotes opaque and 255 fully transparent); by this estimate, reading one video frame consumes about 1 MB, 30 video channels consume about 30 MB, and within a memory budget of 1 GB roughly 50 frames can be cached. A FIFO list structure is used (a sketch follows below). If the cache is full, incoming data is suspended and waits; if, owing to network problems, display runs faster than loading, the previous frame is returned until new frame data arrives. One optimization provided by the invention saves further space through multi-resolution processing of the frames: a pyramid is built for each image and frames of different resolutions are loaded in different situations, saving memory overhead; concretely, pixelwise bilinear interpolation resamples the image to 1/4, 1/16, and 1/64 of the original. Another optimization provided by the invention: so that the video frames can be used directly by the texture projection algorithm, an Alpha channel is added to each video frame image as the fusion parameter between video and scene or between videos.
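The 50-frame FIFO cache mentioned above can be sketched with a bounded deque; the blocking behavior on a full cache and the fallback to the previous frame on underrun are simplified here, and the frame type is left abstract.

```python
from collections import deque

class FrameCache:
    def __init__(self, capacity=50):
        self.frames = deque()
        self.capacity = capacity
        self.last = None          # previous frame, replayed on underrun

    def push(self, frame):
        """Load a decoded frame; callers should pause when the cache is full."""
        if len(self.frames) >= self.capacity:
            raise BufferError("cache full: suspend loading and wait")
        self.frames.append(frame)

    def pop(self):
        """Take the oldest frame (first in, first out); if the network lags
        and the cache is empty, fall back to the previous frame."""
        if self.frames:
            self.last = self.frames.popleft()
        return self.last
```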
Technical scheme (3) is implemented as follows:
First, the modelview and projection matrices are reset to switch the virtual viewpoint to the projector's viewpoint; the depth buffer of the virtual viewpoint is cleared, the polygon offset and color mask are set, and the virtual scene of technical scheme (1) is rendered, giving the depth buffer under the current projector's viewpoint and forming a depth texture.
Second, the modelview and projection matrices are reset back to the virtual viewpoint; the color and depth buffers are cleared, and the virtual scene of technical scheme (1), including its surface textures, is rendered again, giving the real depth value of each corresponding point in the scene.
Finally, the modelview and projection matrices are reset and the scene is drawn successively under each projector's viewpoint. The projective texture coordinates of each point in the scene are obtained through automatic texture-coordinate generation, and the final texture value of each point is determined by comparing the real depth values from the first two passes with the Z-buffer values: if the two are equal, the projector's video texture is used; if not, the scene model's own texture is used. This process is iterated until all projectors in the scene have been traversed.
Fusion between different videos is realized by setting the texture combiner function. Because color-consistency correction is already completed during video-frame preprocessing, the Replace mode is used here, i.e. the later texture fragment substitutes the original value.
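The sketch below contrasts the Replace combiner used here with an alpha-blend alternative on RGBA arrays in [0, 1]; it is purely illustrative of the compositing choice, not the GPU texture-combiner API.

```python
import numpy as np

def combine_replace(dst, src, mask):
    """Replace mode: where the projector covers the surface (mask), the
    later video fragment substitutes the existing texture value."""
    out = dst.copy()
    out[mask] = src[mask]
    return out

def combine_blend(dst, src, mask):
    """Alpha-blend alternative, useful if color consistency had NOT been
    corrected during preprocessing."""
    out = dst.copy()
    a = src[mask][:, 3:4]
    out[mask, :3] = a * src[mask][:, :3] + (1 - a) * dst[mask][:, :3]
    return out
```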
Technical scheme (4) is implemented as follows. Correcting the projector pose means correcting the projector's three-dimensional spatial coordinates x, y, z in the virtual scene and its three deflection angles φ, θ, γ.
The pose value obtained at video-acquisition time serves as the initial value of the virtual projector in the virtual scene, but owing to the limited precision of the equipment this value cannot make the projected texture fuse completely with the virtual space, so an additional correction process is needed. The present invention constructs an energy function with the pose information as its independent variable and corrects the virtual projector by solving the energy function for its optimal value.
First, the modelview and projection matrices are reset to move the viewpoint in the virtual scene to the projector's position, and the scene is drawn to obtain an image of the virtual environment. The image is segmented with the mean-shift algorithm, using the color characteristics of buildings and roads: the pixel values of non-building, non-road regions are set to white, the regions corresponding to building models or roads are preserved, the image is then binarized, and the building- and road-related regions are set to black.
Second, a key frame is extracted from the video and binarized with a method similar to the first step, preserving the regions corresponding to building models or roads.
Third, the contour error within the projector's viewing-volume region is computed:
the images from the first two steps are XORed pixel by pixel, and the pixels equal to 1 are counted; this result is the first part of the energy function.
Fourth, for buildings with a rotationally symmetric appearance, shape matching alone may give wrong results, so features carrying local information must be added. Using the SIFT consistency operator, matching point pairs are collected from the non-binarized images of the first two steps, and the error value of the matched pairs is obtained through key-point constraint processing; this value is the second part of the energy function.
Fifth, different weights are assigned to the two parts of the energy function; the present invention assigns more weight to the overall contour error. The energy function is now complete.
Sixth, to solve for the optimal value of the energy function, simulated annealing is first applied to narrow the solution space of the function to the neighborhood of the optimum; the downhill simplex algorithm then compresses the approximate solution space further, yielding the optimal solution.
Seventh, the optimal solution replaces the initial projector pose of step (3).
According to the video-fusion and correction product flow, the present embodiment can be implemented in the following steps:
1. Building the virtual scene
First, for a virtual campus, a data pyramid is built from existing terrain remote-sensing data, and terrain textures of different levels are bound under different viewpoints. Landmark building models of the campus are created and, guided by the terrain remote-sensing data, manually placed at the corresponding positions, keeping the relative positions between buildings as consistent with reality as possible. See Fig. 1, steps (2) remote-sensing terrain data transmission → (3) LOD processing → (4) terrain texture values → (6) model data → (7) spatial position calibration → (8) model texture.
2. Acquiring video data
Raw video streams are obtained from different video sources, such as campus surveillance cameras, video cameras, or mobile phones. Each stream is decomposed into single-frame images; multi-resolution processing creates images of different resolutions, and an Alpha channel is added to each image to ease the later fusion between videos and between video and scene. The longitude-latitude, view angle, and direction information of each video source is also recorded for the initial placement of the projector in the virtual space. See Fig. 1, step (11) frame decomposition, multi-resolution construction, Alpha-channel addition → (13) video stream → (14) projector texture.
3. Fusing the video texture
The video images are fused with the terrain and model textures. This is realized through three scene drawings: the first pass draws the objects under the projector's view and obtains the corresponding Z-buffer; the second pass draws the scene under the virtual viewpoint and obtains the real depth value of each point in the scene; the third pass compares the depth values obtained by the first two passes to determine the texture value of each point in the scene. The fusion itself is realized by setting different texture combiners. See Fig. 1, steps (1) automatic texture-coordinate generation → (5) comparison of Z-buffer values with real depth values → (9) multi-pass drawing of the virtual scene → (10) texture combiner function → (12) final image.
4. Correcting the projector
The viewpoint is placed at the projector's position and the scene is drawn to obtain the virtual scene image. The projector's video stream in the real scene is found and one key frame is extracted from it. Both images are then segmented; many algorithms are available, such as mean-shift, normalized cut, JSEG, and pixel affinity, and the present invention uses mean-shift. The image is segmented into regions, the ground or parts of buildings are extracted according to color characteristics, and irrelevant parts are rejected. The separated images are normalized, with non-model parts set to white and model parts set to black. The two images are then XORed pixel by pixel (a consistency step is added if the two image resolutions differ), and the 1-pixels of the result are counted; this count is the first part of the energy function. The outer contour is a global comparison means and needs some local features as a supplementary match: SIFT feature matching is used here, a few groups of feature points are selected, and their key-point error value is computed; this value is superposed as the second part of the energy function. The projector correction problem has now become a multivariable optimization problem over the energy function. The present invention combines simulated annealing with the downhill simplex algorithm: simulated annealing finds an approximate optimal solution, and the downhill simplex algorithm then optimizes the projector pose within a small range. Once the optimal solution is obtained, it replaces the pose of the virtual projector, completing the correction. See Fig. 1, steps (15) image segmentation based on mean-shift → (16) SIFT-operator extraction of local feature match points → (17) virtual projector pose calibration values → (18) building or road extraction and XOR operation → (19) key-point error → (20) downhill simplex → (21) simulated annealing → (22) energy function construction.

Claims (10)

1. An automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment, comprising the steps of:
1) building, from remote-sensing image data obtained in advance, a terrain model whose surface carries a static texture image, and a virtual scene; acquiring multiple segments of real captured video streams and recording the camera pose at shooting time;
2) adding to the virtual scene, according to the recorded camera pose, a virtual projector model and a viewing volume corresponding to the camera parameters, and setting the initial pose of the virtual projector in the virtual scene from the camera pose information;
3) preprocessing the frames of the real captured video streams to obtain a dynamic video texture, and projecting the preprocessed video data into the virtual environment by projective texturing;
4) fusing the static texture of the terrain model surface in the virtual environment and/or the original remote-sensing terrain texture with the dynamic video texture, to obtain the final texture values covering the scene surface, comprising:
4-1) resetting the modelview and projection matrices to switch the virtual viewpoint to the projector's viewpoint, drawing the virtual scene, and obtaining the depth values under the current projector's viewpoint;
4-2) resetting the modelview and projection matrices back to the virtual viewpoint, redrawing the virtual scene, and obtaining the real depth value of each corresponding point in the scene;
4-3) drawing the virtual scene successively under each projector's viewpoint, obtaining the projective texture coordinates of each point in the scene through automatic texture-coordinate generation, and comparing the real depth values of step 4-2) with the depth values of step 4-1);
4-4) if the two are equal, using the projector's video texture, and if not, using the scene model's own texture, iterating by setting the texture combiner function until all projectors in the scene have been traversed, to obtain the final texture value of each point in the scene;
5) rendering, from the virtual projector model and according to the final texture values, the image under the projector's viewpoint, matching it with the corresponding image in the real captured video stream, and constructing an energy function;
6) resetting the initial pose of the projector in the virtual scene with the optimal solution of the energy function, completing the projector correction.
2. The automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment as claimed in claim 1, characterized in that in step 5) the image is matched with the corresponding image in the real captured video stream, and the energy function with the pose information as its independent variable is constructed as follows:
Step 1: reset the modelview and projection matrices to move the viewpoint in the virtual scene to the projector's position, draw the scene to obtain an image of the virtual environment, segment the image with the mean-shift algorithm, and then binarize it;
Step 2: extract a key frame from the real captured video stream and binarize it with the method of Step 1;
Step 3: compute the contour error within the viewing-volume region formed by the projector: XOR the images obtained in the first two steps pixel by pixel and count the pixels equal to 1; this result is the first part of the energy function;
Step 4: use the SIFT consistency operator to add features carrying local information: collect matching point pairs from the non-binarized images obtained in Steps 1 and 2, and obtain the error value of the matched pairs through key-point constraint processing; this error value is the second part of the energy function;
Step 5: assign different weights to the two parts of the energy function;
Step 6: solve for the optimal value of the energy function;
Step 7: replace the initial pose of the projector with the optimal solution.
3. The automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment as claimed in claim 1 or 2, characterized in that the optimal value of the energy function is solved as follows:
Simulated annealing is first applied to the energy function to narrow the solution space of the function to the neighborhood of the optimal solution; the downhill simplex algorithm then compresses the approximate solution space, yielding the optimal solution.
4. The automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment as claimed in claim 2, characterized in that when Steps 1 and 2 segment the image with the mean-shift algorithm, the color characteristics of buildings and roads are used: the pixel values of non-building, non-road regions are set to white, the regions corresponding to building models or roads are preserved, the image is then binarized, and the building- and road-related regions are set to black.
5. The automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment as claimed in claim 1, characterized in that the video frames of the real captured video streams are preprocessed as follows:
The video data is decoded into individual video frame images; a sample frame is extracted from each video stream and the SIFT operator is used to find matching feature points between sample frames; and color-consistency processing is performed.
6. The automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment as claimed in claim 5, characterized in that the color-consistency processing is:
1) extracting one sample frame from each of the two videos being matched, building the color histogram of all pixels in the frame, and applying histogram equalization and specification so that the two video frames have the same color histogram distribution;
2) applying the same histogram equalization and specification as for the corresponding sample frame to each frame in the same video stream, thereby completing consistency processing for the whole video stream;
3) creating a cache for the video frames, sized to hold about 50 frames;
4) loading the video frame data through a first-in-first-out (FIFO) list structure.
7. The automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment as claimed in claim 5, characterized in that the real captured video streams are obtained over the HTTP protocol, the video data is decoded locally, and the video frames are saved in JPEG format.
8. The automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment as claimed in claim 5, characterized in that multi-resolution processing is performed on the video frame images: the same image is loaded as video frames of different resolutions according to the situation, and pixelwise bilinear interpolation is used to resample the image to one or more of 1/4, 1/16, and 1/64 of the original.
9. The automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment as claimed in claim 7, characterized in that an Alpha channel is added to the JPEG-format video images.
10. A method for fusing real video images with a virtual scene, comprising the steps of:
1) building, from remote-sensing image data obtained in advance, models whose surfaces carry static texture images, and a virtual scene, wherein the spatial positions of the models and the relative positions, orientations, and sizes between models in the virtual scene are kept consistent with the real scene;
2) acquiring multiple segments of real captured video streams and recording the camera pose at shooting time;
3) converting the latitude-longitude coordinates of the shooting location on the earth's surface into the Cartesian world coordinates of the virtual scene, adding the virtual projector model and the corresponding viewing volume to the virtual scene, and setting the initial pose of the virtual projector in the virtual scene under the world coordinate system from the camera pose information;
4) preprocessing the frames of the real captured video streams to obtain a dynamic video texture, and projecting the preprocessed video data into the virtual environment by projective texturing;
5) fusing the static model textures and/or the original remote-sensing terrain texture in the virtual environment with the dynamic video texture, comprising:
5-1) resetting the modelview and projection matrices to switch the virtual viewpoint to the projector's viewpoint, drawing the virtual scene, and obtaining the depth values under the current projector's viewpoint;
5-2) resetting the modelview and projection matrices back to the virtual viewpoint, redrawing the virtual scene, and obtaining the real depth value of each corresponding point in the scene;
5-3) drawing the virtual scene successively under each projector's viewpoint, obtaining the projective texture coordinates of each point in the scene through automatic texture-coordinate generation, and comparing the real depth values of step 5-2) with the depth values of step 5-1);
5-4) if the two are equal, using the projector's video texture, and if not, using the scene model's own texture, iterating by setting the texture combiner function until all projectors in the scene have been traversed, to obtain the final texture value of each point in the scene;
6) applying texture fusion to the intersecting coverage regions of different virtual projectors.
CN201310148771.0A 2013-04-25 2013-04-25 Automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment Active CN103226830B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310148771.0A CN103226830B (en) 2013-04-25 2013-04-25 Automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310148771.0A CN103226830B (en) 2013-04-25 2013-04-25 Automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment

Publications (2)

Publication Number Publication Date
CN103226830A CN103226830A (en) 2013-07-31
CN103226830B true CN103226830B (en) 2016-02-10

Family

ID=48837265

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310148771.0A Active CN103226830B (en) Automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment

Country Status (1)

Country Link
CN (1) CN103226830B (en)

Families Citing this family (46)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104348892B (en) * 2013-08-09 2019-01-25 阿里巴巴集团控股有限公司 A kind of information displaying method and device
CN103533318A (en) * 2013-10-21 2014-01-22 北京理工大学 Building outer surface projection method
CN103716586A (en) * 2013-12-12 2014-04-09 中国科学院深圳先进技术研究院 Monitoring video fusion system and monitoring video fusion method based on three-dimension space scene
CN104320616A (en) * 2014-10-21 2015-01-28 广东惠利普路桥信息工程有限公司 Video monitoring system based on three-dimensional scene modeling
US9911232B2 (en) * 2015-02-27 2018-03-06 Microsoft Technology Licensing, Llc Molding and anchoring physically constrained virtual environments to real-world environments
CN104715479A (en) * 2015-03-06 2015-06-17 上海交通大学 Scene reproduction detection method based on augmented virtuality
CN105357511B (en) * 2015-12-08 2018-05-15 上海图漾信息科技有限公司 depth data detecting system
CN105023294B (en) * 2015-07-13 2018-01-19 中国传媒大学 With reference to the fixed point mobile augmented reality method of sensor and Unity3D
CN106406508A (en) * 2015-07-31 2017-02-15 联想(北京)有限公司 Information processing method and relay equipment
CN105118061A (en) * 2015-08-19 2015-12-02 刘朔 Method used for registering video stream into scene in three-dimensional geographic information space
CN105916022A (en) * 2015-12-28 2016-08-31 乐视致新电子科技(天津)有限公司 Video image processing method and apparatus based on virtual reality technology
CN106127743B (en) * 2016-06-17 2018-07-20 武汉大势智慧科技有限公司 The method and system of automatic Reconstruction bidimensional image and threedimensional model accurate relative location
CN106204656A (en) * 2016-07-21 2016-12-07 中国科学院遥感与数字地球研究所 Target based on video and three-dimensional spatial information location and tracking system and method
CN107507263B (en) * 2017-07-14 2020-11-24 西安电子科技大学 Texture generation method and system based on image
CN108257164A (en) * 2017-12-07 2018-07-06 中国航空工业集团公司西安航空计算技术研究所 A kind of actual situation what comes into a driver's matching fusion embedded-type software architecture
CN109003250B (en) * 2017-12-20 2023-05-30 罗普特科技集团股份有限公司 Fusion method of image and three-dimensional model
CN108196679B (en) * 2018-01-23 2021-10-08 河北中科恒运软件科技股份有限公司 Gesture capturing and texture fusion method and system based on video stream
CN108536281B (en) * 2018-02-09 2021-05-14 腾讯科技(深圳)有限公司 Weather reproduction method and apparatus in virtual scene, storage medium, and electronic apparatus
CN108600771B (en) * 2018-05-15 2019-10-25 东北农业大学 Recorded broadcast workstation system and operating method
CN109034031A (en) * 2018-07-17 2018-12-18 江苏实景信息科技有限公司 Monitor processing method, processing unit and the electronic equipment of video
CN109087402B (en) * 2018-07-26 2021-02-12 上海莉莉丝科技股份有限公司 Method, system, device and medium for overlaying a specific surface morphology on a specific surface of a 3D scene
CN110555822B (en) * 2019-09-05 2023-08-29 北京大视景科技有限公司 Color consistency adjustment method for real-time video fusion
CN110753265B (en) * 2019-10-28 2022-04-19 北京奇艺世纪科技有限公司 Data processing method and device and electronic equipment
CN111064946A (en) * 2019-12-04 2020-04-24 广东康云科技有限公司 Video fusion method, system, device and storage medium based on indoor scene
CN111061421B (en) * 2019-12-19 2021-07-20 北京澜景科技有限公司 Picture projection method and device and computer storage medium
CN111145362B (en) * 2020-01-02 2023-05-09 中国航空工业集团公司西安航空计算技术研究所 Virtual-real fusion display method and system for airborne comprehensive vision system
CN111582022B (en) * 2020-03-26 2023-08-29 深圳大学 Fusion method and system of mobile video and geographic scene and electronic equipment
CN111445535A (en) * 2020-04-16 2020-07-24 浙江科澜信息技术有限公司 Camera calibration method, device and equipment
CN111540022B (en) * 2020-05-14 2024-04-19 深圳市艾为智能有限公司 Image unification method based on virtual camera
CN111737518A (en) * 2020-06-16 2020-10-02 浙江大华技术股份有限公司 Image display method and device based on three-dimensional scene model and electronic equipment
CN112053446B (en) * 2020-07-11 2024-02-02 南京国图信息产业有限公司 Real-time monitoring video and three-dimensional scene fusion method based on three-dimensional GIS
CN112052751A (en) * 2020-08-21 2020-12-08 上海核工程研究设计院有限公司 Containment water film coverage rate detection method
CN114143528A (en) * 2020-09-04 2022-03-04 北京大视景科技有限公司 Multi-video stream fusion method, electronic device and storage medium
CN112437276B (en) * 2020-11-20 2023-04-07 埃洛克航空科技(北京)有限公司 WebGL-based three-dimensional video fusion method and system
CN112637582B (en) * 2020-12-09 2021-10-08 吉林大学 Three-dimensional fuzzy surface synthesis method for monocular video virtual view driven by fuzzy edge
CN112584060A (en) * 2020-12-15 2021-03-30 北京京航计算通讯研究所 Video fusion system
CN112584120A (en) * 2020-12-15 2021-03-30 北京京航计算通讯研究所 Video fusion method
TWI790732B (en) * 2021-08-31 2023-01-21 宏碁股份有限公司 Image correction method and image correction device
CN114220312B (en) * 2022-01-21 2024-05-07 北京京东方显示技术有限公司 Virtual training method, device and virtual training system
CN114363600B (en) * 2022-03-15 2022-06-21 视田科技(天津)有限公司 Remote rapid 3D projection method and system based on structured light scanning
CN114972599A (en) * 2022-05-31 2022-08-30 京东方科技集团股份有限公司 Method for virtualizing scene
CN115314690B (en) * 2022-08-09 2023-09-26 北京淳中科技股份有限公司 Image fusion belt processing method and device, electronic equipment and storage medium
CN115866218B (en) * 2022-11-03 2024-04-16 重庆化工职业学院 Scene image fusion vehicle-mounted AR-HUD brightness self-adaptive adjustment method
CN116012564B (en) * 2023-01-17 2023-10-20 宁波艾腾湃智能科技有限公司 Equipment and method for intelligent fusion of three-dimensional model and live-action photo
CN117041511B (en) * 2023-09-28 2024-01-02 青岛欧亚丰科技发展有限公司 Video image processing method for visual interaction enhancement of exhibition hall
CN117459663B (en) * 2023-12-22 2024-02-27 北京天图万境科技有限公司 Projection light self-correction fitting and multicolor repositioning method and device


Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102598651A (en) * 2009-11-02 2012-07-18 索尼计算机娱乐公司 Video processing program, device and method, and imaging device mounted with video processing device
CN102142153A (en) * 2010-01-28 2011-08-03 香港科技大学 Image-based remodeling method of three-dimensional model

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
Christian Früh and Avideh Zakhor, "Constructing 3D City Models by Merging Aerial and Ground Views," IEEE Computer Graphics and Applications, Vol. 23, No. 6, 2003. *
Jianxiong Xiao et al., "Image-based Street-side City Modeling," ACM Transactions on Graphics, Vol. 28, No. 5, December 2009. *
王邦松 et al., "Research on color consistency processing algorithms for aerial images" (航空影像色彩一致性处理算法研究), Remote Sensing Information (遥感信息), 2011, sections 4.1-4.2. *
段晓娟 et al., "Research on texture mapping technology in virtual real-scene space" (虚拟实景空间中的纹理映射技术研究), Computer Engineering (计算机工程), Vol. 27, No. 5, May 2001. *

Also Published As

Publication number Publication date
CN103226830A (en) 2013-07-31

Similar Documents

Publication Publication Date Title
CN103226830B (en) Automatic matching and correction method for video texture projection in a three-dimensional virtual-real fusion environment
US11463678B2 (en) System for and method of social interaction using user-selectable novel views
CN109314753B (en) Method and computer-readable storage medium for generating intermediate views using optical flow
CN105825544B (en) A kind of image processing method and mobile terminal
CN100594519C Method for real-time generation of augmented reality surroundings by a spherical panoramic camera
KR101319805B1 (en) Photographing big things
CN108876926A (en) Navigation methods and systems, AR/VR client device in a kind of panoramic scene
JP6778163B2 (en) Video synthesizer, program and method for synthesizing viewpoint video by projecting object information onto multiple surfaces
CN106875437A (en) A kind of extraction method of key frame towards RGBD three-dimensional reconstructions
CN112784621A (en) Image display method and apparatus
CN113436559B (en) Sand table dynamic landscape real-time display system and display method
CN113379901A (en) Method and system for establishing house live-action three-dimension by utilizing public self-photographing panoramic data
CN108205822B (en) Picture pasting method and device
WO2020184174A1 (en) Image processing device and image processing method
CN108564654B (en) Picture entering mode of three-dimensional large scene
Alshawabkeh et al. Automatic multi-image photo texturing of complex 3D scenes
CN110035275B (en) Urban panoramic dynamic display system and method based on large-screen fusion projection
Tan et al. Large scale texture mapping of building facades
JP2000329552A (en) Three-dimensional map preparing method
US20170228926A1 (en) Determining Two-Dimensional Images Using Three-Dimensional Models
CN111652807A (en) Eye adjustment method, eye live broadcast method, eye adjustment device, eye live broadcast device, electronic equipment and storage medium
US11830140B2 (en) Methods and systems for 3D modeling of an object by merging voxelized representations of the object
Vallet et al. Fast and accurate visibility computation in urban scenes
Kunita et al. Layered probability maps: basic framework and prototype system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20200724

Address after: 830-3, 8 / F, No. 8, Sijiqing Road, Haidian District, Beijing 100195

Patentee after: Beijing weishiwei Information Technology Co.,Ltd.

Address before: Peking University, No. 5 Yiheyuan Road (Summer Palace Road), Haidian District, Beijing 100871

Patentee before: Peking University