CN102006425B - Method for splicing video in real time based on multiple cameras - Google Patents

Method for splicing video in real time based on multiple cameras

Info

Publication number
CN102006425B
CN102006425B CN2010105873947A CN201010587394A
Authority
CN
China
Prior art keywords
point
image
characteristic point
video
frame
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN2010105873947A
Other languages
Chinese (zh)
Other versions
CN102006425A (en)
Inventor
张春雨
齐彤岩
李斌
蔡胜昔
蔡蕾
汪林
孔涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Research Institute of Highway Ministry of Transport
Original Assignee
Research Institute of Highway Ministry of Transport
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Research Institute of Highway Ministry of Transport
Priority to CN2010105873947A
Publication of CN102006425A
Application granted
Publication of CN102006425B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a method for splicing video in real time based on multiple cameras, comprising the following steps: acquiring synchronized multi-channel video data; preprocessing the frame images captured at the same moment, converting the color images to grayscale and expanding the dynamic range of the gray levels by histogram equalization; extracting the feature points of corresponding frames with the speeded-up robust features (SURF) algorithm; finding the matched feature-point pairs between corresponding frame images using nearest-neighbor matching and the random sample consensus matching algorithm; solving for the optimal homography matrix over the initial k frames of the video by determining the splicing overlap regions from the matched feature-point pairs and taking the homography matrix of the frame with the highest overlap-region similarity as the optimal one; splicing the subsequent video frame scenes with that matrix; and outputting the spliced video. The method reduces the computation needed to splice single video frame images, raises the splicing speed of traffic-monitoring video, and achieves real-time processing.

Description

A real-time video splicing method based on multiple cameras
Technical field
The invention belongs to the field of video monitoring in intelligent transportation, and relates to a monitoring-range expansion technique based on computer vision.
Technical background
Video splicing is currently applied mainly to real-time video monitoring in the transportation industry, and research concentrates on splicing traffic-monitoring video from cameras whose positions are relatively fixed. The comparatively mature algorithms address the splicing of still images. Video-splicing techniques mainly exploit the inter-frame correlation of video images. Because splicing involves a large amount of computation, a fast matching algorithm must be chosen, under the premise of guaranteed image quality, to reduce the computation of the whole splicing process and raise the splicing speed. Many papers on video-splicing principles and methods have been published at home and abroad, but no general video-splicing software or product that reaches real-time processing speed has yet appeared.
Summary of the invention
The purpose of the invention is to provide a real-time video splicing method based on multiple cameras, in order to solve the technical problems that splicing single video frame images requires heavy computation and that video splicing is slow.
In the real-time video splicing method of the present invention, the hardware is a multi-camera traffic-monitoring video-splicing device that mainly comprises a processor, a video acquisition module, a power module, a display module and a memory module. The steps are as follows:
Step 1: start the device and acquire synchronized multi-channel video data;
Step 2: preprocess the frame images captured at the same moment: convert the color images to 256-level grayscale images, and enhance the images by histogram equalization;
Step 3: extract the feature points of the corresponding frames with the SURF algorithm;
Step 4: obtain the homography matrix between the initial corresponding frames of the videos by nearest-neighbor matching and the random sample consensus (RANSAC) matching algorithm;
Step 5: solve for the optimal homography matrix over the initial k frames of the video:
Step 5-1: determine the overlapping region of the stitched images.
According to the perspective mapping relation between the images, the match of each retained feature point in the other corresponding frame image is computed from the homography matrix. If the computed match point does not fall inside the image, the point is not in the overlap region of the two images; conversely, if both the feature point and its mapping fall inside the images, the point lies in the overlap region. Traversing in this way, i.e. applying the perspective mapping transformation to all retained feature points of one image, yields the region determined by the mapped corresponding points. Taking the edges of the other image as the boundary, the minimum inscribed polygon of the overlap area is taken as the overlapping region;
Step 5-2: compute the similarity of the splicing overlap regions of the first k corresponding frame pairs.
The similarity measure is defined by the normalized covariance correlation function, as in formula (1):
C(I_1, I_2) = \frac{\sum_{m=0}^{w} \sum_{n=0}^{h} \left[I_1(i_1,j_1) - \overline{I_1(i_1,j_1)}\right] \left[I_2(i_2,j_2) - \overline{I_2(i_2,j_2)}\right]}{\sqrt{\left\{\sum_{m=0}^{w} \sum_{n=0}^{h} \left[I_1(i_1,j_1) - \overline{I_1(i_1,j_1)}\right]^2\right\} \left\{\sum_{m=0}^{w} \sum_{n=0}^{h} \left[I_2(i_2,j_2) - \overline{I_2(i_2,j_2)}\right]^2\right\}}} \quad (1)
where w and h are the width and height of the overlap region. The similarity C takes values in (−1, 1); the larger the value, the higher the correlation of the overlap regions;
Step 5-3: determine the optimal homography matrix of the first k frames.
Take the maximum of the overlap-region similarity over the first k consecutive frames; the homography matrix of the image frame with the highest correlation is taken as the optimal homography matrix and used as the homography matrix for the subsequent frames;
Step 6: splice the scenes of the subsequent video frames according to the optimal homography matrix;
Step 7: output the spliced video.
The image graying of step 2 above converts the color image to a 256-level grayscale image using the weighted-average formula:
f(i,j) = 0.3*R(i,j) + 0.59*G(i,j) + 0.11*B(i,j).
In step 2 above, the image is enhanced by histogram equalization, a nonlinear stretching that increases the dynamic range of the image gray values and raises contrast. Its steps are:
Step 2-1: count the number of pixels nk whose gray level is k in the image;
Step 2-2: compute the histogram values pk = nk/(M × N);
Step 2-3: compute the cumulative histogram values sk = Σ pk;
Step 2-4: determine the mapping relation: find i (i = 0, 1, …, L; rounded to a positive integer) such that i/L is closest to sk;
Step 2-5: change the value of each pixel whose gray level is k in the original image to i;
where L is the number of gray levels and M × N is the image size.
The extraction of corresponding-frame feature points with the SURF algorithm in step 3 above comprises:
Step 3-1: scale-space extremum detection, to preliminarily determine the feature-point positions and their scales. Form an image pyramid of different scales on the original image by enlarging the size of the box filter template; build the scale-space image pyramid, forming a pyramidal voxel space; and compute the delta value of every point;
The delta value of every point is computed as follows:
The Hessian matrix is
H = \begin{pmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{pmatrix}
The SURF operator quantitatively analyzes the delta value of every point in scale space; delta is computed as shown below:
delta = D_{xx} D_{yy} - (0.9 D_{xy})^2
The maximum is sought among the 26 points around each extreme point, yielding rough feature points, so as to find the maximum points of scale space. The discrete points in the neighborhood of an extreme point are fitted by quadratic interpolation (polynomial interpolation) to obtain a three-dimensional quadratic curve; the extremum of this curve is the sub-pixel location of the extreme point. If this sub-pixel location deviates from the original integer extreme-point coordinates by no more than 0.5 pixel in all three dimensions, the extreme point is considered stable; otherwise it is rejected. This accurately determines the position and scale of the feature point;
Step 3-2: choose a neighborhood corresponding to the scale around each feature point and obtain its principal direction. With the feature point as the center, build a circular region around the feature point to be examined; compute the Haar wavelet responses of the points in the neighborhood in the x and y directions, and weight these responses with Gaussian coefficients so that responses near the feature point contribute strongly and responses far from it contribute little. Inside a sliding window covering π/3 (a 60° range), sum the horizontal and vertical wavelet responses into a new vector; traverse the whole circular region and select the direction of the longest vector as the principal direction of the feature point. Computing feature point by feature point yields the principal direction of each feature point;
Step 3-3: form the SURF feature vector. With the feature point as the center, first rotate the coordinate axes to the principal direction; choose a square region of side 20s along the principal direction and divide this window into 4 × 4 subregions. In each subregion compute the wavelet responses over a 5s × 5s range with sampling step s; denote the Haar wavelet responses horizontal and vertical to the principal direction as dx and dy respectively, and weight the responses likewise, to increase robustness to geometric transformations. Within each subregion, sum the responses and the absolute values of the responses into Σdx, Σdy, Σ|dx|, Σ|dy|, so that each subregion forms a four-dimensional vector Vsub = (Σdx, Σ|dx|, Σdy, Σ|dy|). Each feature point thus forms a description vector of 4 × (4 × 4) = 64 dimensions, which is finally normalized, giving some robustness to illumination.
The matching of the feature points of corresponding video frames in step 4 above, by nearest-neighbor matching and the RANSAC matching algorithm, comprises:
Step 4-1: for each feature point of the source image, compute its distance to the feature points of the corresponding frame image; use a priority k-d tree for the first search, finding the 2 approximate nearest-neighbor feature points of each feature point; use the ratio of the nearest-neighbor distance to the second-nearest-neighbor distance as the criterion for whether two feature points match; if the ratio is below a proportion threshold, accept the pair as a match;
Step 4-2: sample repeatedly n times, each time drawing 4 groups of corresponding points at random to form a sample and computing the homography matrix H;
Step 4-3: the computation of the homography matrix:
Let I be a plane in space whose images under two viewpoints are denoted I1 and I2, and let [x y 1]^T ∈ I1, [X Y 1]^T ∈ I2 be any pair of corresponding points. The transformation between images I1 and I2 can then be expressed by I1 = kH × I2, specifically:
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = kH \begin{pmatrix} X \\ Y \\ 1 \end{pmatrix} \quad (1)
where k is a scale factor representing the zoom relation between the images. H is called the homography matrix; it is a 3 × 3 non-singular matrix, usually written in the following form:
H = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \quad (2)
The lower-right entry h33 is 1;
The calculation of the homography matrix H:
In each sampling, whenever 4 pairs of exactly matched points are obtained, check whether any three points are collinear. To judge whether 3 points with coordinates (x1, y1), (x2, y2), (x3, y3) are collinear, take the vector coordinates of any two of the connecting lines, e.g. a = x1 − x2, b = y1 − y2 and c = x1 − x3, d = y1 − y3, giving (a, b) and (c, d), and check whether ad = bc, i.e. whether the two vectors are parallel. If equal, the points are collinear; otherwise they are not. If any three points are collinear, matched pairs must be chosen again.
Step 4-4: compute the distance d, i.e. the Euclidean distance, corresponding to each hypothesis, and compare it with a threshold (the initial parameter value is preset; 0.001 in the embodiment below); use this threshold to decide whether an extracted point satisfies the estimated parameters. Take the point pairs consistent with H as inliers. After a certain number of such iterations, choose the point set containing the most inliers; when the inlier counts are equal, choose the point set with the smallest standard variance. Optimize the parameters estimated from the input data with the highest inlier ratio together with the elected inliers;
Step 4-5: recompute the homography matrix H from the matches of the selected point set by least squares to minimize the error.
The advantage of the present invention is that, while reducing the computation of video splicing and raising the splicing speed of traffic-monitoring video, it performs real-time splicing and display of traffic-monitoring video images and presents them to traffic-monitoring administrators in an intuitive form, enlarging the field of view of traffic monitoring and enabling effective monitoring of traffic incidents.
Description of drawings
Fig. 1 is a schematic diagram of the x-direction template values of the box filter template of the present invention.
Fig. 2 is a schematic diagram of the y-direction template values of the box filter template of the present invention.
Fig. 3 is a schematic diagram of the xy-direction template values of the box filter template of the present invention.
Fig. 4 is a schematic diagram of the principle of finding the maximum points of scale space in the present invention.
Embodiment
The main functions of the present invention are realized by two principal modules. The video acquisition module consists mainly of the cameras and is responsible for collecting the video data and converting the data format; the image processing unit in the processor mainly accomplishes the splicing and fusion of the two input video signals.
The concrete steps of the present invention are:
One: acquire synchronized multi-channel video images.
Start the device and acquire synchronized multi-channel video data.
Two: preprocess the frame images captured at the same moment.
1. Image graying
Convert the color image to a 256-level grayscale image using the weighted-average formula:
f(i,j) = 0.3*R(i,j) + 0.59*G(i,j) + 0.11*B(i,j)
2. Histogram equalization
Histogram equalization increases the dynamic range of the image gray values by nonlinearly stretching the image. Its essence is to merge several gray levels of different frequencies into the same gray level, in exchange for an expansion of contrast. Seen from the histogram, histogram equalization transforms the histogram distribution of the given image into a uniform distribution.
The histogram equalization steps:
Count the number of pixels nk whose gray level is k in the image;
Compute the histogram values pk = nk/(M × N);
Compute the cumulative histogram values sk = Σ pk;
Determine the mapping relation: find i (i = 0, 1, …, L; rounded to a positive integer) such that i/L is closest to sk;
Change the value of each pixel whose gray level is k in the original image to i.
where L is the number of gray levels and M × N is the image size. A sketch of this preprocessing is given below.
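A minimal Python/NumPy sketch of the two preprocessing steps; the function names are illustrative, not part of the patent, and the mapping uses the common i = round(sk·(L−1)) realization of the i/L ≈ sk rule:

```python
import numpy as np

def to_gray(rgb):
    """Weighted-average graying: f(i,j) = 0.3*R + 0.59*G + 0.11*B."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    return (0.3 * r + 0.59 * g + 0.11 * b).astype(np.uint8)

def equalize_hist(gray, L=256):
    """Histogram equalization following steps 2-1 to 2-5."""
    M, N = gray.shape
    nk = np.bincount(gray.ravel(), minlength=L)       # 2-1: pixel count per gray level k
    pk = nk / (M * N)                                 # 2-2: histogram values
    sk = np.cumsum(pk)                                # 2-3: cumulative histogram
    mapping = np.rint(sk * (L - 1)).astype(np.uint8)  # 2-4: i with i/L closest to sk
    return mapping[gray]                              # 2-5: remap the pixel values
```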
Three: extract the feature points of corresponding frames with the SURF algorithm.
Steps:
1. Scale-space extremum detection, to preliminarily determine the key-point positions and their scales.
A) Construct the box filter templates.
An image pyramid of different scales is formed on the original image by enlarging the size of the boxes. Taking 9 × 9 as an example, Figs. 1 to 3 show the corresponding simplified box filter templates; the corresponding scale value is s = 1.2.
B) Build the scale-space image pyramid, forming a pyramidal voxel space.
Build the scale image pyramid, selecting 4 layers of scale images in each octave; the construction parameters of the 4 octaves are as shown in Fig. 3, where the numbers on the grey background indicate the sizes of the box filter templates. If the image size is much larger than the template size, further octaves can be added. For a filter template of size N × N the corresponding scale is s = 1.2 × N/9.
C) Compute the delta value of every point.
The Hessian matrix is:
H = \begin{pmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{pmatrix}
The SURF operator quantitatively analyzes the delta value of every point in scale space; delta is computed as shown below:
delta = D_{xx} D_{yy} - (0.9 D_{xy})^2
D) Find the maximum points of scale space.
The maximum is sought among the 26 points around each extreme point, yielding rough feature points. The pixel marked with a cross in Fig. 4 is compared with the 8 pixels of its neighborhood at the same scale and the 9 × 2 pixels of the neighborhoods at the corresponding positions in the two adjacent scales, 26 pixels in total, to guarantee that local extrema are detected in both scale space and the two-dimensional image space. See Fig. 4.
E) Stability determination of the extreme points: accurately determine the position and scale of the key points.
The discrete points in the neighborhood of an extreme point are fitted by quadratic interpolation (polynomial interpolation) to obtain a three-dimensional quadratic curve.
The extremum of this curve is the sub-pixel location of the extreme point. If this sub-pixel location deviates from the original integer extreme-point coordinates by no more than 0.5 pixel in all three dimensions, the extreme point is considered stable; otherwise it is rejected. Both tests of sub-steps C) and D) are sketched below.
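A minimal sketch of the delta computation and the 26-neighbor comparison, with illustrative names; the box-filtered derivative maps Dxx, Dyy, Dxy and the response stack are assumed to be given:

```python
import numpy as np

def delta_map(Dxx, Dyy, Dxy):
    """Approximated Hessian determinant: delta = Dxx*Dyy - (0.9*Dxy)**2."""
    return Dxx * Dyy - (0.9 * Dxy) ** 2

def is_extremum_26(stack, s, y, x):
    """Compare a point with its 26 scale-space neighbours (8 at the same scale
    plus 9 in each adjacent scale layer), as in Fig. 4. Assumes an interior point."""
    cube = stack[s - 1:s + 2, y - 1:y + 2, x - 1:x + 2].astype(np.float64)
    v = cube[1, 1, 1]
    cube[1, 1, 1] = -np.inf   # exclude the centre itself from the comparison
    return v > cube.max()
```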
2. Choose a neighborhood corresponding to the scale around each feature point and obtain its principal direction, which gives invariance to image rotation.
A) With the feature point as the center, build a circular region around the feature point to be examined, of radius 6s (s is the scale value at the feature point).
B) Compute the Haar wavelet responses of the points in the neighborhood in the x and y directions (the Haar wavelet side length is taken as 4s).
The Haar wavelet responses are computed and weighted with Gaussian coefficients (σ = 2.5w, where w is the number of intervals each component of the scale space is divided into; the resolution setting here is w = 2), so that responses near the feature point contribute strongly and responses far from the feature point contribute little;
C) Inside a sliding window covering π/3 (a 60° range), the horizontal and vertical wavelet responses are summed into a new vector; the whole circular region is traversed and the direction of the longest vector is selected as the principal direction of the feature point;
D) Computing feature point by feature point yields the principal direction of each feature point, as in the sketch below.
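A sketch of the sliding-window search of sub-step C), assuming the Gaussian-weighted Haar responses dx, dy and the polar angle of each neighborhood sample are already computed (all names illustrative):

```python
import numpy as np

def principal_direction(dx, dy, ang, steps=72):
    """Slide a pi/3 window around the circle; the window whose summed
    response vector is longest gives the principal direction."""
    best_len2, best_dir = -1.0, 0.0
    for start in np.linspace(0.0, 2 * np.pi, steps, endpoint=False):
        inside = ((ang - start) % (2 * np.pi)) < (np.pi / 3)   # 60-degree window
        sx, sy = dx[inside].sum(), dy[inside].sum()            # new summed vector
        if sx * sx + sy * sy > best_len2:
            best_len2, best_dir = sx * sx + sy * sy, np.arctan2(sy, sx)
    return best_dir
```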
3. Form the SURF feature vector.
A) With the feature point as the center, first rotate the coordinate axes to the principal direction;
B) Choose a square region of side 20s along the principal direction and divide this window into 4 × 4 subregions;
C) In each subregion compute the wavelet responses over a 5s × 5s range (sampling step s). The Haar wavelet responses horizontal and vertical to the principal direction are denoted dx and dy respectively, and are likewise weighted to increase robustness to geometric transformations;
D) Within each subregion, sum the responses and the absolute values of the responses into Σdx, Σdy, Σ|dx|, Σ|dy|;
E) Each subregion forms a four-dimensional vector Vsub = (Σdx, Σ|dx|, Σdy, Σ|dy|);
Each feature point thus forms a description vector of 4 × (4 × 4) = 64 dimensions;
Finally the vector is normalized, giving some robustness to illumination.
Many other methods, such as the SIFT method, can also extract the feature points; a reference sketch using OpenCV follows.
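For reference, OpenCV exposes both detectors; SURF lives in the contrib/nonfree module, so whether cv2.xfeatures2d.SURF_create is available depends on the build. A sketch, not the patent's own implementation (the file name is illustrative):

```python
import cv2

gray = cv2.imread("frame_left.png", cv2.IMREAD_GRAYSCALE)  # illustrative file name

try:
    detector = cv2.xfeatures2d.SURF_create(hessianThreshold=400)  # contrib/nonfree builds only
except AttributeError:
    detector = cv2.SIFT_create()  # SIFT, mentioned above as an alternative

keypoints, descriptors = detector.detectAndCompute(gray, None)
```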
Four: obtain the matched feature-point pairs between corresponding video frame images by nearest-neighbor matching and the RANSAC matching algorithm.
1. Coarse matching: the nearest-neighbor method
Steps:
1) For each feature point of the source image, compute its distance to all feature points of the matching frame image;
2) Use a priority k-d tree for the first search, finding the 2 approximate nearest-neighbor feature points of each feature point;
3) Use the ratio of the nearest-neighbor distance to the second-nearest-neighbor distance as the criterion for whether two feature points match;
4) If the ratio is below a proportion threshold, accept the pair as a match. In this algorithm the ratio is 0.6. Lowering this proportion threshold reduces the number of SURF matches but makes them more stable;
Because coarse matching may contain wrongly matched pairs, the coarsely matched features must then be matched accurately. A sketch of the coarse matching follows.
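A sketch of the coarse matching with a FLANN k-d tree and the 0.6 ratio test; desc1, desc2 are float32 descriptor arrays from the previous step (names illustrative, and each query is assumed to return 2 neighbours):

```python
import cv2

FLANN_INDEX_KDTREE = 1
matcher = cv2.FlannBasedMatcher(dict(algorithm=FLANN_INDEX_KDTREE, trees=4),
                                dict(checks=50))
pairs = matcher.knnMatch(desc1, desc2, k=2)  # 2 approximate nearest neighbours each

# ratio of nearest to second-nearest distance; accept below the 0.6 threshold
coarse = [m for m, n in pairs if m.distance < 0.6 * n.distance]
```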
2. Accurate matching: random sample consensus
Coarse matching yields potential matches, among which some erroneous matches are unavoidable; erroneous matches must therefore be eliminated according to geometric limits and other additional constraints, to improve robustness. The random sample consensus (RANSAC) algorithm is applied to filter the obtained matches.
Steps:
1) Sample repeatedly n times (n is determined adaptively by the sampling), each time drawing 4 groups of corresponding points at random to form a sample and computing the homography matrix H;
2) The calculation of the homography matrix H
Theoretically, choosing 4 pairs of exactly matched points from the accurately matched feature set suffices to solve for H. In practice errors often arise; for instance, if three points are collinear no solution can be obtained, so the feature pairs of the equation must be nonlinearly optimized;
In each sampling, whenever 4 pairs of exactly matched points are obtained, check whether any three points are collinear. To judge whether 3 points with coordinates (x1, y1), (x2, y2), (x3, y3) are collinear, take the vector coordinates of any two of the connecting lines, e.g. a = x1 − x2, b = y1 − y2 and c = x1 − x3, d = y1 − y3, giving (a, b) and (c, d), and check whether ad = bc, i.e. whether the two vectors are parallel. If equal, the points are collinear; otherwise they are not. If any three points are collinear, matched pairs must be chosen again, as in the sketch below.
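The collinearity test in code form, a direct transcription of the rule above (names illustrative):

```python
from itertools import combinations

def collinear(p1, p2, p3):
    """(a, b) = p1 - p2, (c, d) = p1 - p3; collinear iff a*d == b*c."""
    a, b = p1[0] - p2[0], p1[1] - p2[1]
    c, d = p1[0] - p3[0], p1[1] - p3[1]
    return a * d == b * c

def sample_usable(pts4):
    """A 4-point RANSAC sample is usable only if no three points are collinear."""
    return not any(collinear(*trio) for trio in combinations(pts4, 3))
```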
3) Compute the distance d (Euclidean distance) corresponding to each hypothesis and compare it with a threshold, whose initial parameter value is taken as 0.001; use this threshold to decide whether an extracted point satisfies the estimated parameters. Take the point pairs consistent with H as inliers;
4) After a certain number of such iterations, choose the point set containing the most inliers (when the inlier counts are equal, choose the point set with the smallest standard variance). Optimize the parameters estimated from the input data with the highest inlier ratio together with the elected inliers, then recompute the homography matrix from the matches of the selected point set by least squares to minimize the error. A sketch follows.
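In practice steps 1) to 4) and the least-squares re-estimate map onto OpenCV's findHomography; kp1, kp2 and coarse come from the sketches above, and the reprojection threshold below is a tuning value in pixels, not the patent's 0.001 (which applies to its own distance normalization):

```python
import cv2
import numpy as np

src = np.float32([kp1[m.queryIdx].pt for m in coarse]).reshape(-1, 1, 2)
dst = np.float32([kp2[m.trainIdx].pt for m in coarse]).reshape(-1, 1, 2)

# RANSAC: repeated 4-point samples, inlier election by a distance threshold;
# H maps the src points onto the dst points
H, mask = cv2.findHomography(src, dst, cv2.RANSAC, ransacReprojThreshold=3.0)

# least-squares re-estimate over the elected inliers (method 0 = plain fit)
inliers = mask.ravel().astype(bool)
H_refined, _ = cv2.findHomography(src[inliers], dst[inliers], 0)
```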
Many other methods, such as the normalized mutual information metric and similarity measures, can also realize the matching.
Five: solve for the optimal homography matrix over the initial k frames of the video.
A well-performing homography matrix allows two images to be stitched together well. However, because feature points may be matched wrongly during feature-point matching, the homography matrix actually solved from the feature-point pairs of a single frame is not necessarily optimal. Moreover, splicing a video frame by frame as single images exploits neither the correlation between video frames nor the relation between the homography matrices of successive frames, which also lowers the efficiency of video splicing. The correlation between video frames can therefore be used during splicing: according to the maximum correlation of the overlap regions of the first k frames, the optimal homography matrix is found and used as the homography matrix of the subsequent frames.
1. Determine the overlapping region of the stitched images.
According to the perspective mapping relation between the images, the match of each retained feature point in the other corresponding frame image is computed from the homography matrix. If the computed match point does not fall inside the image, the point is not in the overlap region of the two images; conversely, if both the feature point and its mapping fall inside the images, the point lies in the overlap region. Traversing in this way, i.e. applying the perspective mapping transformation to all retained feature points of one image, yields the region determined by the mapped corresponding points. Taking the edges of the other image as the boundary, the minimum inscribed polygon of the overlap area is taken as the overlapping region.
2. Compute the similarity of the overlap regions of the first k corresponding stitched frames.
The similarity measure is defined by the normalized covariance correlation function, as in formula (1):
C(I_1, I_2) = \frac{\sum_{m=0}^{w} \sum_{n=0}^{h} \left[I_1(i_1,j_1) - \overline{I_1(i_1,j_1)}\right] \left[I_2(i_2,j_2) - \overline{I_2(i_2,j_2)}\right]}{\sqrt{\left\{\sum_{m=0}^{w} \sum_{n=0}^{h} \left[I_1(i_1,j_1) - \overline{I_1(i_1,j_1)}\right]^2\right\} \left\{\sum_{m=0}^{w} \sum_{n=0}^{h} \left[I_2(i_2,j_2) - \overline{I_2(i_2,j_2)}\right]^2\right\}}} \quad (1)
where w and h are the width and height of the overlap region. The similarity C takes values in (−1, 1); the larger the value, the higher the correlation of the overlap regions. A sketch of this measure follows.
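Formula (1) in code, for two equal-size overlap crops; the square root realizes the normalization of the covariance (a sketch, names illustrative):

```python
import numpy as np

def overlap_similarity(I1, I2):
    """Normalized covariance correlation of two overlap regions, in (-1, 1)."""
    a = I1.astype(np.float64) - I1.mean()
    b = I2.astype(np.float64) - I2.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0
```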
3. Determine the optimal homography matrix of the first k frames.
Take the maximum of the similarity over the first k consecutive frames; the homography matrix of the image frame with the highest correlation is taken as the optimal homography matrix and used as the homography matrix of the subsequent frames, so that feature-point extraction, feature-point matching and homography computation need not be repeated for the subsequent frame images. This greatly shortens the time the algorithm requires and satisfies the real-time requirement of video splicing. A selection sketch follows.
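Steps 5-1 to 5-3 as a selection loop; estimate_homography and overlap_crops stand for the matching of step Four and the overlap construction of step 5-1, and both are hypothetical helpers, not library calls:

```python
best_H, best_score = None, -2.0   # the similarity lies in (-1, 1)
for f in range(k):                # first k synchronized frame pairs
    H_f = estimate_homography(frames_a[f], frames_b[f])            # steps Three-Four
    crop_a, crop_b = overlap_crops(frames_a[f], frames_b[f], H_f)  # step 5-1
    score = overlap_similarity(crop_a, crop_b)                     # formula (1)
    if score > best_score:
        best_H, best_score = H_f, score
# best_H is reused for every subsequent frame; no per-frame re-estimation
```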
Six: splice the scenes of the subsequent video frames according to the optimal homography matrix.
The optimal homography matrix is used as the homography matrix of the subsequent frames for splicing; feature points and matches need not be computed again for the subsequent frame images, which greatly shortens the time the algorithm requires and satisfies the real-time requirement of video splicing. A warping sketch follows.
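A minimal warp-and-paste sketch of the subsequent-frame splicing, with no seam blending; the canvas size is chosen by the caller and the names are illustrative:

```python
import cv2

def splice(frame_a, frame_b, H, canvas_w, canvas_h):
    """Warp frame_b into frame_a's coordinates with the fixed optimal H
    (the I1 = kH * I2 convention of step 4-3) and paste frame_a on top."""
    canvas = cv2.warpPerspective(frame_b, H, (canvas_w, canvas_h))
    canvas[:frame_a.shape[0], :frame_a.shape[1]] = frame_a
    return canvas
```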
Seven: output the spliced video.

Claims (5)

1. A real-time video splicing method based on multiple cameras, wherein the hardware is a multi-camera traffic-monitoring video-splicing device mainly comprising a processor, a video acquisition module, a power module, a display module and a memory module, and the steps are as follows:
Step 1: start the device and acquire synchronized multi-channel video data;
Step 2: preprocess the frame images captured at the same moment: convert the color images to 256-level grayscale images, and enhance the images by histogram equalization;
Step 3: extract the feature points of the corresponding frames with the SURF algorithm;
Step 4: match the feature points of corresponding video frame images by nearest-neighbor matching and the random sample consensus (RANSAC) matching algorithm;
Step 5: solve for the optimal homography matrix over the initial k frames of the video:
Step 5-1: determine the overlapping region of the stitched images.
According to the perspective mapping relation between the images, the match of each retained feature point in the other corresponding frame image is computed from the homography matrix; if the computed match point does not fall inside the image, the point is not in the overlap region of the two images; conversely, if both the feature point and its mapping fall inside the images, the point lies in the overlap region; traversing in this way, i.e. applying the perspective mapping transformation to all retained feature points of one image, yields the region determined by the mapped corresponding points; taking the edges of the other image as the boundary, the minimum inscribed polygon of the overlap area is taken as the overlapping region;
Step 5-2: compute the similarity of the splicing overlap regions of the first k corresponding images.
The similarity measure is defined by the normalized covariance correlation function, as in formula (1):
C(I_1, I_2) = \frac{\sum_{m=0}^{w} \sum_{n=0}^{h} \left[I_1(i_1,j_1) - \overline{I_1(i_1,j_1)}\right] \left[I_2(i_2,j_2) - \overline{I_2(i_2,j_2)}\right]}{\sqrt{\left\{\sum_{m=0}^{w} \sum_{n=0}^{h} \left[I_1(i_1,j_1) - \overline{I_1(i_1,j_1)}\right]^2\right\} \left\{\sum_{m=0}^{w} \sum_{n=0}^{h} \left[I_2(i_2,j_2) - \overline{I_2(i_2,j_2)}\right]^2\right\}}} \quad (1)
where w and h are the width and height of the overlap region; the similarity C takes values in (−1, 1), and the larger the value, the higher the correlation of the overlap regions;
Step 5-3: determine the optimal homography matrix of the first k frames.
Take the maximum of the overlap-region similarity over the first k consecutive frames; the homography matrix of the image frame with the highest correlation is taken as the optimal homography matrix and used as the homography matrix for the subsequent frames;
Step 6: splice the scenes of the subsequent video frames according to the optimal homography matrix;
Step 7: output the spliced video.
2. The real-time video splicing method based on multiple cameras according to claim 1, wherein the image graying converts the color image to a 256-level grayscale image using the weighted-average formula:
f(i,j) = 0.3*R(i,j) + 0.59*G(i,j) + 0.11*B(i,j).
3. The real-time video splicing method based on multiple cameras according to claim 1, wherein the image is enhanced by histogram equalization, which enlarges the dynamic range of the image gray levels, with the steps:
Step 2-1: count the number of pixels nk whose gray level is k in the image;
Step 2-2: compute the histogram values pk = nk/(M × N);
Step 2-3: compute the cumulative histogram values sk = Σ pk;
Step 2-4: determine the mapping relation: find i (i = 0, 1, …, L) such that i/L is closest to sk;
Step 2-5: change the value of each pixel whose gray level is k in the original image to i;
where L is the number of gray levels and M × N is the image size.
4. The real-time video splicing method based on multiple cameras according to claim 1, wherein the extraction of corresponding-frame feature points with the SURF algorithm comprises:
Step 3-1: scale-space extremum detection, to preliminarily determine the feature-point positions and their scales: form an image pyramid of different scales on the original image by enlarging the size of the box filter template; build the scale-space image pyramid, forming a pyramidal voxel space; and compute the delta value of every point;
The delta value of every point is computed as follows:
The Hessian matrix is
H = \begin{pmatrix} D_{xx} & D_{xy} \\ D_{xy} & D_{yy} \end{pmatrix}
The SURF operator quantitatively analyzes the delta value of every point in scale space; delta is computed as shown below:
delta = D_{xx} D_{yy} - (0.9 D_{xy})^2
The maximum is sought among the 26 points around each extreme point, yielding rough feature points, so as to find the maximum points of scale space; the discrete points in the neighborhood of an extreme point are fitted by quadratic interpolation to obtain a three-dimensional quadratic curve; the extremum of this curve is the sub-pixel location of the extreme point; if this sub-pixel location deviates from the original integer extreme-point coordinates by no more than 0.5 pixel in all three dimensions, the extreme point is considered stable, otherwise it is rejected, so as to accurately determine the position and scale of the feature point;
Step 3-2: choose a neighborhood corresponding to the scale around each feature point and obtain its principal direction: with the feature point as the center, build a circular region around the feature point to be examined; compute the Haar wavelet responses of the points in the neighborhood in the x and y directions, and weight these responses with Gaussian coefficients so that responses near the feature point contribute strongly and responses far from it contribute little; inside a sliding window covering π/3 (a 60° range), sum the horizontal and vertical wavelet responses into a new vector; traverse the whole circular region and select the direction of the longest vector as the principal direction of the feature point; computing feature point by feature point yields the principal direction of each feature point;
Step 3-3: form the SURF feature vector: with the feature point as the center, first rotate the coordinate axes to the principal direction; choose a square region of side 20s along the principal direction and divide this window into 4 × 4 subregions; in each subregion compute the wavelet responses over a 5s × 5s range with sampling step s; denote the Haar wavelet responses horizontal and vertical to the principal direction as dx and dy respectively;
weight the responses likewise, to increase robustness to geometric transformations; within each subregion, sum the responses and the absolute values of the responses into Σdx, Σdy, Σ|dx|, Σ|dy|; each subregion forms a four-dimensional vector Vsub = (Σdx, Σ|dx|, Σdy, Σ|dy|); each feature point thus forms a description vector of 4 × (4 × 4) = 64 dimensions; finally normalize the vector, giving some robustness to illumination.
5. The real-time video splicing method based on multiple cameras according to claim 1, wherein matching the feature points of corresponding video frame images by nearest-neighbor matching and the RANSAC matching algorithm comprises:
Step 4-1: nearest-neighbor matching: for each feature point of the source image, compute its distance to the feature points of the corresponding frame image; use a priority k-d tree for the first search, finding the 2 approximate nearest-neighbor feature points of each feature point; use the ratio of the nearest-neighbor distance to the second-nearest-neighbor distance as the criterion for whether two feature points match; if the ratio is below a proportion threshold, accept the pair as a match;
Step 4-2: the RANSAC matching algorithm: sample repeatedly n times, each time drawing 4 groups of corresponding points at random to form a sample, and compute the homography matrix H;
Step 4-3: the computation of the homography matrix:
Let I be a plane in space whose images under two viewpoints are denoted I1 and I2, and let [x y 1]^T ∈ I1, [X Y 1]^T ∈ I2 be any pair of corresponding points; the transformation between images I1 and I2 can then be expressed by I1 = kH × I2, specifically:
\begin{pmatrix} x \\ y \\ 1 \end{pmatrix} = kH \begin{pmatrix} X \\ Y \\ 1 \end{pmatrix} \quad (1)
where k is a scale factor representing the zoom relation between the images; H is called the homography matrix, a 3 × 3 non-singular matrix, usually written in the following form:
H = \begin{pmatrix} h_{11} & h_{12} & h_{13} \\ h_{21} & h_{22} & h_{23} \\ h_{31} & h_{32} & h_{33} \end{pmatrix} \quad (2)
The lower-right entry h33 is 1;
In each sampling, whenever 4 pairs of exactly matched points are obtained, check whether any three points are collinear; to judge whether 3 points with coordinates (x1, y1), (x2, y2), (x3, y3) are collinear, take the vector coordinates of any two of the connecting lines, e.g. a = x1 − x2, b = y1 − y2 and c = x1 − x3, d = y1 − y3, giving (a, b) and (c, d), and check whether ad = bc, i.e. whether the two vectors are parallel; if equal, the points are collinear, otherwise not; if any three points are collinear, matched pairs must be chosen again;
Step 4-4: compute the distance d, i.e. the Euclidean distance, corresponding to each hypothesis; compare it with a threshold whose initial parameter value is taken as x, and use this threshold to decide whether an extracted point satisfies the estimated parameters; take the point pairs consistent with H as inliers; after a certain number of such iterations, choose the point set containing the most inliers, and when the inlier counts are equal choose the point set with the smallest standard variance; optimize the parameters estimated from the input data with the highest inlier ratio together with the elected inliers;
Step 4-5: recompute the homography matrix from the matched feature points of the selected point set by least squares to minimize the error.
CN2010105873947A 2010-12-13 2010-12-13 Method for splicing video in real time based on multiple cameras Expired - Fee Related CN102006425B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2010105873947A CN102006425B (en) 2010-12-13 2010-12-13 Method for splicing video in real time based on multiple cameras


Publications (2)

Publication Number Publication Date
CN102006425A CN102006425A (en) 2011-04-06
CN102006425B true CN102006425B (en) 2012-01-11

Family

ID=43813458

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2010105873947A Expired - Fee Related CN102006425B (en) 2010-12-13 2010-12-13 Method for splicing video in real time based on multiple cameras

Country Status (1)

Country Link
CN (1) CN102006425B (en)




Legal Events

Date Code Title Description
C06 / PB01: Publication
C10 / SE01: Entry into substantive examination (entry into force of request for substantive examination)
C14 / GR01: Grant of patent or utility model (patent grant)
CF01: Termination of patent right due to non-payment of annual fee (granted publication date: 2012-01-11; termination date: 2021-12-13)