CN109376641B - Moving vehicle detection method based on unmanned aerial vehicle aerial video - Google Patents

Moving vehicle detection method based on unmanned aerial vehicle aerial video

Info

Publication number
CN109376641B
CN109376641B (application CN201811203391.1A)
Authority
CN
China
Prior art keywords
image
vehicle
level
order
registered
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811203391.1A
Other languages
Chinese (zh)
Other versions
CN109376641A (en)
Inventor
Zhu Xu
Sun Siqi
Xu Wei
Yan Maode
Yang Panpan
Zuo Lei
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CHINA HIGHWAY ENGINEERING CONSULTING GROUP Co Ltd
CHECC Data Co Ltd
Original Assignee
Changan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chang'an University
Priority to CN201811203391.1A
Publication of CN109376641A
Application granted
Publication of CN109376641B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G06V 20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V 20/42 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G06V 20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V 20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V 20/584 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of vehicle lights or traffic lights

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a moving vehicle detection method based on unmanned aerial vehicle aerial video. The method first applies the SURF algorithm to match feature points between images and to eliminate abnormal points, then obtains a transformation matrix with an unmanned aerial vehicle image registration algorithm that combines a global homography matrix with local homography matrices, compensating the adverse effects produced by the motion of the onboard camera. A 2-frame difference method then narrows the region to be detected, which is traversed according to superpixel centers, further improving the efficiency of moving vehicle detection. Next, a multi-channel HOG feature algorithm extracts the low-order features of the vehicle, context information of the vehicle is introduced to obtain its high-order features, and the two are fused into the multi-order features of the target vehicle. Finally, the multi-order features are combined with a dictionary learning algorithm to detect moving vehicles. The method can suppress the influence of the motion of the unmanned aerial vehicle's onboard camera, handle vehicle deformation and background interference in the image, and improve the robustness and real-time performance of moving vehicle detection.

Description

Moving vehicle detection method based on unmanned aerial vehicle aerial video
Technical Field
The invention relates to a method for detecting a moving vehicle, in particular to a method for detecting a moving vehicle based on an unmanned aerial vehicle aerial video.
Background
Unmanned aerial vehicle (UAV) aerial photography is a novel means of remote sensing data acquisition, with unique advantages: flexible deployment, a large monitoring range, fine-grained information acquisition, and freedom from ground traffic interference. A UAV's flight speed and altitude are adjustable, its viewing angle is flexible, and it acquires ground traffic image information efficiently, at low cost and low risk, enabling traffic monitoring at scales from local areas to wide areas. With the further development and fusion of UAV aerial photography and image processing technology, the reasonable use and analysis of UAV imagery has broad application prospects in traffic planning, design, and management.
Commonly used moving vehicle detection methods include background extraction, the optical flow method, and others. Background extraction, however, is extremely sensitive to illumination and background variation, and the optical flow method is computationally expensive. To improve the robustness of moving vehicle detection, some scholars have built dynamic Bayesian networks and detected vehicles with a sliding-window method; although this achieves a certain effect, the computational load of sliding windows remains too large, which limits its application.
Thus, although many moving vehicle detection algorithms exist and achieve a certain detection effect, the stability, robustness, and real-time performance of moving vehicle detection based on UAV aerial video still need improvement.
Disclosure of Invention
The invention aims to provide a moving vehicle detection method based on an unmanned aerial vehicle aerial video, so as to overcome the defects of the prior art.
In order to achieve the purpose, the invention adopts the following technical scheme:
a moving vehicle detection method based on unmanned aerial vehicle aerial video comprises the following steps:
step 1), acquiring an aerial video of a moving vehicle, extracting a continuous image sequence of the aerial video, then extracting SURF (speeded up robust features) feature points of a reference image and an image to be registered, then performing feature point matching, and performing abnormal point elimination on the matched feature points by adopting a random sampling consistency algorithm;
step 2), aiming at the characteristic points after the abnormal points are removed, obtaining a conversion matrix of the image through an unmanned aerial vehicle image registration algorithm;
step 3), aiming at the image processed in the step 2), determining a to-be-detected area of the moving vehicle by adopting a 2-frame difference method, performing superpixel segmentation on the image, and determining a scanning frame according to the center of the superpixel so as to traverse the to-be-detected area;
step 4), extracting the texture and color of the vehicle to form low-order features of the vehicle by using the image processed in the step 3); introducing context information of the vehicle, and extracting high-order characteristics of the vehicle; after the low-order characteristic and the high-order characteristic of the target vehicle are obtained, the low-order characteristic and the high-order characteristic are fused to obtain the multi-order characteristic of the target vehicle;
and 5) training the dictionary by using a dictionary learning algorithm for the obtained multi-order features of the vehicle, and detecting the moving vehicle by using the trained dictionary.
Further, Haar features and the integral image concept are adopted to extract the SURF feature points of the reference image and the image to be registered.
Further, the Euclidean distance between any SURF feature point in the reference image and the feature points in the image to be registered is calculated; the smaller the Euclidean distance, the higher the similarity, and when the Euclidean distance is smaller than a set threshold the match is judged successful. If a SURF feature point in the image to be registered matches multiple feature points in the reference image, the match is regarded as unsuccessful.
Further, after the abnormal points are eliminated, an image pyramid is introduced, and the global and local homography matrices are determined from the feature point pairing results in a top-down manner: first, an (L+1)-level pyramid of the reference image and the image to be registered is established; when determining the global homography matrix, one starts from the L-th level global homography matrix and increases the resolution step by step down to level 0, thereby obtaining the level-0 global homography matrix.
Further, define $(x_r^L, y_r^L)$ and $(x_t^L, y_t^L)$ as the corresponding coordinates of the L-th level reference image and image to be registered, where $x_r^L$ and $y_r^L$ are the x- and y-coordinates of the L-th level reference image, and $x_t^L$ and $y_t^L$ are the x- and y-coordinates of the L-th level image to be registered.
the L-th level global homography matrix is determined by:
Figure BDA0001830584680000037
wherein, wLIs an intermediate variable and has
Figure BDA0001830584680000038
Figure BDA0001830584680000039
The L level global homography matrix is defined by the matrix elements as follows:
Figure BDA00018305846800000310
is abbreviated as
Figure BDA00018305846800000311
$H_g^L$ is determined as follows: four groups of feature point matching results are randomly selected each time to determine a homography matrix, and the $l_2$ norm is used to screen the remaining feature matching points according to

$$ \left\| \begin{bmatrix} x_r^L \\ y_r^L \end{bmatrix} - \begin{bmatrix} \hat{x}_r^L \\ \hat{y}_r^L \end{bmatrix} \right\|_2 \le t_r $$

where $(\hat{x}_r^L, \hat{y}_r^L)$ is the projection of $(x_t^L, y_t^L)$ under the candidate homography matrix and $t_r$ is the threshold for outlier screening. A remaining feature matching point satisfying this formula is regarded as a valid feature matching point; otherwise it is regarded as invalid. The homography matrix for which the number of valid feature matching points is greatest is the finally determined L-th level global homography matrix $H_g^L$.
The homography matrix of level L-1 is obtained by increasing the image resolution. Introducing a scale factor $\mu$, the corresponding pixel points at level L-1 of the reference image and the image to be registered can be expressed as

$$ \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \end{bmatrix} = \mu \begin{bmatrix} x_r^{L} \\ y_r^{L} \end{bmatrix}, \qquad \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \end{bmatrix} = \mu \begin{bmatrix} x_t^{L} \\ y_t^{L} \end{bmatrix} $$

where $x_r^{L-1}$ and $y_r^{L-1}$ are the x- and y-coordinates of the (L-1)-th level reference image, and $x_t^{L-1}$ and $y_t^{L-1}$ are the x- and y-coordinates of the (L-1)-th level image to be registered. To find the homography matrix of level L-1, one has

$$ w^{L-1} \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \\ 1 \end{bmatrix} = S\, H_g^{L}\, S^{-1} \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \\ 1 \end{bmatrix}, \qquad S = \begin{bmatrix} \mu & 0 & 0 \\ 0 & \mu & 0 \\ 0 & 0 & 1 \end{bmatrix}. $$

Letting $H_g^{L-1} = S H_g^{L} S^{-1}$, the above formula can be rewritten as

$$ w^{L-1} \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \\ 1 \end{bmatrix} = H_g^{L-1} \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \\ 1 \end{bmatrix} $$

where $H_g^{L-1}$ is the global homography matrix of level L-1.

By applying this level-L-to-level-(L-1) derivation repeatedly and increasing the resolution step by step, the global homography matrix $H_g^{0}$ corresponding to level 0 is obtained, namely

$$ w^{0} \begin{bmatrix} x_r^{0} \\ y_r^{0} \\ 1 \end{bmatrix} = H_g^{0} \begin{bmatrix} x_t^{0} \\ y_t^{0} \\ 1 \end{bmatrix}, \qquad H_g^{0} = S_L\, H_g^{L}\, S_L^{-1}, \quad S_L = \begin{bmatrix} \mu_L & 0 & 0 \\ 0 & \mu_L & 0 \\ 0 & 0 & 1 \end{bmatrix} $$

where $x_r^{0}$ and $y_r^{0}$ are the x- and y-coordinates of the level-0 reference image, $x_t^{0}$ and $y_t^{0}$ are those of the level-0 image to be registered, and $\mu_L = \mu^{L}$ is the scale factor of the level-0 homography matrix.
Further, $F(k-1)$ and $F(k)$ denote the (k-1)-th and k-th frames of the unmanned aerial vehicle image sequence, and $F_r(k-1)$ and $F_r(k)$ are the registered images; the region to be detected is determined from the registered images $F_r(k-1)$ and $F_r(k)$ by the 2-frame difference method.
Further, the image is first divided into small connected regions, i.e. cell units; the histogram of gradient or edge orientations of each pixel point within a cell unit is then collected; finally, combining the features of these cell units forms the HOG feature descriptor. The image is converted from the RGB color space to the HSV color space, the H, S, V three-channel data templates of the image are extracted separately and stored as two-dimensional matrices $M_H$, $M_S$ and $M_V$, and the HOG features $H_H$, $H_S$ and $H_V$ of the three matrices are computed respectively.
Further, the three-channel HOG features are fused by weighting, namely $H_l = w_H H_H + w_S H_S + w_V H_V$, where $H_l$ denotes the low-order feature of the vehicle, $w_H$, $w_S$ and $w_V$ are the weights of the HOG features $H_H$, $H_S$ and $H_V$, and $w_H + w_S + w_V = 1$. The weights of the three channels are determined adaptively from the respective channel data templates. The low-order feature of the vehicle, i.e. the fused H, S, V three-channel HOG feature, is thus determined.
Further, when determining the high-order features, the context information of the vehicle is introduced. Positive and negative samples are manually selected to initialize the positive and negative dictionaries, and the final positive dictionary $D_p$ and negative dictionary $D_n$ are then determined according to dictionary learning and an autonomous sample selection strategy. The high-order features are determined by calculating the reconstruction error of the target region and the reconstruction errors of the other image blocks in its neighborhood.

For a vehicle $t_v$, the reconstruction error is denoted $e(t_v) = [e(t_v, D_p), e(t_v, D_n)]^T$, where $e(t_v, D_p)$ and $e(t_v, D_n)$ are the reconstruction errors of $t_v$ on the positive and negative dictionaries. For a neighborhood image block $a_\iota$ of the vehicle, the reconstruction error is $e(a_\iota) = [e(a_\iota, D_p), e(a_\iota, D_n)]^T$, where the subscript $\iota$ indexes the image blocks in the neighborhood of the target vehicle $t_v$, and $e(a_\iota, D_p)$ and $e(a_\iota, D_n)$ are the reconstruction errors of $a_\iota$ on the positive and negative dictionaries. For a neighborhood image block $a_\iota$, the high-order feature of the target vehicle $t_v$ is defined as the difference of the reconstruction errors of $t_v$ and $a_\iota$, expressed as $H(t_v, a_\iota) = \|e(t_v) - e(a_\iota)\|_2$, where $H(t_v, a_\iota)$ is the high-order feature of target vehicle $t_v$ relative to neighborhood $a_\iota$.

When the target vehicle $t_v$ has M image blocks in its neighborhood, the high-order feature of $t_v$ is $H_h = [H(t_v, a_1), H(t_v, a_2), \dots, H(t_v, a_M)]^T$.

The obtained high-order and low-order features of the vehicle are fused to form the multi-order feature of the target vehicle: $F_v = [H_l, H_h]$. The multi-order features of the target vehicle are thus obtained by combining its low-order and high-order features.
Further, in the correlation-based dictionary learning algorithm, the dictionary update stage first determines the atoms involved in the sparse representation of the new samples and updates only those atoms; sparsity is introduced into the dictionary update stage; the update process is iterated until convergence, achieving fast and efficient dictionary training and finally completing moving vehicle detection.
Compared with the prior art, the invention has the following beneficial technical effects:
the invention discloses a moving vehicle detection method based on unmanned aerial vehicle aerial video, which comprises the steps of firstly adopting an SURF algorithm to carry out feature point matching and abnormal point elimination on an image, utilizing an unmanned aerial vehicle image registration algorithm combining a global homography matrix and a local homography matrix to obtain a conversion matrix, compensating adverse effects generated by movement of an onboard camera, then adopting a 2-frame difference method to reduce a region to be detected, traversing the region to be detected according to the center of a superpixel, further improving the moving vehicle detection efficiency, then utilizing a multichannel HOG feature algorithm to extract low-order features of a vehicle, introducing context information of the vehicle to obtain high-order features of the vehicle, fusing the two features to obtain multi-order features of a target vehicle, and finally combining the multi-order features and a dictionary learning algorithm to realize moving vehicle detection. The method can inhibit the influence caused by the motion of the airborne camera of the unmanned aerial vehicle, process the vehicle deformation and background interference in the image, and improve the robustness and real-time performance of the moving vehicle detection. The invention compensates the adverse effect generated by the movement of the airborne camera and lays a foundation for the detection of moving vehicles; the method combining the 2-frame difference method and the center traversal of the superpixel is adopted, so that the efficiency of acquiring the region to be detected is improved; aiming at the obtained region to be detected, when the low-order features of the vehicle are extracted, a multi-channel HOG feature extraction method is adopted, so that false detection and missing detection are reduced; when the high-order characteristics of the vehicle are extracted, the context information of the vehicle is introduced, so that the deformation and background interference of the vehicle are effectively inhibited, and the accuracy of detecting the moving vehicle is improved. The method for detecting the unmanned aerial vehicle video motion vehicle can realize accurate detection of the vehicle running on the road.
Furthermore, an image registration algorithm combining global and local homography matrixes is provided in a top-down mode according to the feature point pairing result. The global homography matrix describes global position changes, and the local homography matrix describes local position changes.
Furthermore, the 2-frame difference method is utilized to reduce the area to be detected, superpixel segmentation is introduced, the area to be detected is determined and scanned according to the center of the superpixel, and the calculated amount of detection of the moving vehicles is effectively reduced.
Furthermore, when extracting the high-order features of the vehicle, positive and negative samples are first manually selected to initialize the positive and negative dictionaries; after the final positive and negative dictionaries are determined according to dictionary learning and an autonomous sample selection strategy, the high-order features are determined by calculating the reconstruction error of the target region and the reconstruction errors of the other image blocks in the neighborhood, which reduces the computational load of dictionary learning and enables fast, efficient dictionary training.
Drawings
FIG. 1 is a flow chart of the detection method in the embodiment of the present invention.
FIG. 2 is the image pyramid in an example of the invention.
FIG. 3 is a framework of a method for detecting moving vehicles based on image registration and superpixel segmentation according to an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings:
a moving vehicle detection method based on an unmanned aerial vehicle aerial video mainly aims to suppress influence caused by movement of an airborne camera of the unmanned aerial vehicle, process vehicle deformation and background interference in an image and improve robustness and real-time performance of moving vehicle detection. The invention is further described below with reference to the accompanying drawings.
Fig. 1 in the drawings shows a flow chart of the detection method of the present invention, and the specific implementation manner is as follows:
Step 1): vehicles on the road are photographed from the air with the UAV's onboard camera to obtain the aerial video, a continuous image sequence is extracted from it, SURF feature points of the reference image and the image to be registered are extracted, and feature point matching is performed. Mismatches may remain among the matched feature points, so abnormal points are further eliminated with the random sample consensus (RANSAC) algorithm:
specifically, Harr characteristics and integral image concepts are adopted for extracting SURF characteristic points of a reference image and an image to be registered. Finding out correctly matched feature points in the reference image and the image to be registered according to the following two principles:
1) calculating the Euclidean distance between any SURF characteristic point in the reference image and the characteristic point in the image to be registered; the smaller the Euclidean distance is, the higher the similarity is, and when the Euclidean distance is smaller than a set threshold value, the matching is judged to be successful; the threshold was taken to be 6.
2) If a SURF feature point in the image to be registered matches multiple feature points in the reference image, the match is regarded as unsuccessful.
After feature point matching, mismatches may still exist; to eliminate them, abnormal points are removed with the random sample consensus algorithm.
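As an illustration of step 1), a minimal Python/OpenCV sketch of the SURF matching and RANSAC screening might look as follows. It assumes opencv-contrib-python with the non-free xfeatures2d module; the Hessian threshold, distance threshold, and RANSAC reprojection threshold are illustrative choices rather than values fixed by the patent (the scale of the distance test also depends on descriptor normalization).

```python
import cv2
import numpy as np

# Sketch of step 1: SURF feature extraction and matching between the reference
# frame and the frame to be registered, then RANSAC outlier elimination.
def match_and_screen(ref_gray, tgt_gray, dist_thresh=6.0):
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)  # needs contrib build
    kp_r, des_r = surf.detectAndCompute(ref_gray, None)
    kp_t, des_t = surf.detectAndCompute(tgt_gray, None)

    # One-to-one Euclidean-distance matching; crossCheck discards points that
    # would match more than one candidate, mirroring rule 2) above.
    matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
    matches = [m for m in matcher.match(des_t, des_r) if m.distance < dist_thresh]
    if len(matches) < 4:
        return None, None                     # too few pairs for a homography

    pts_t = np.float32([kp_t[m.queryIdx].pt for m in matches])
    pts_r = np.float32([kp_r[m.trainIdx].pt for m in matches])

    # Random sample consensus removes the remaining abnormal (mismatched) pairs.
    _, mask = cv2.findHomography(pts_t, pts_r, cv2.RANSAC, 5.0)
    keep = mask.ravel().astype(bool)
    return pts_t[keep], pts_r[keep]
```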
Step 2), aiming at the characteristic points after the abnormal points are removed, obtaining a conversion matrix of the image through an unmanned aerial vehicle image registration algorithm, and compensating the adverse effect of the movement of an onboard camera of the unmanned aerial vehicle on the image during shooting;
and after the abnormal points are eliminated, introducing an image pyramid, and determining a global homography matrix and a local homography matrix according to the feature point pairing result in a top-down mode. First, as shown in fig. 2 of the drawings, an L +1 level pyramid of a reference image and an image to be registered is established. The 0 th level is a reference image or an image to be registered, and the resolution is highest. When moving to the upper pyramid layer, both image size and resolution are reduced. At the top of the pyramid, the lth level, is the lowest resolution. When the global homography matrix is determined, the global homography matrix corresponding to the 0 th level can be obtained by starting from the global homography matrix of the L th level and then increasing the resolution step by step until the 0 th level.
Define $(x_r^L, y_r^L)$ and $(x_t^L, y_t^L)$ as the corresponding coordinates of the L-th level reference image and image to be registered, where $x_r^L$ and $y_r^L$ are the x- and y-coordinates of the L-th level reference image, and $x_t^L$ and $y_t^L$ are the x- and y-coordinates of the L-th level image to be registered.
The L-th level global homography matrix is determined by:

$$ w^L \begin{bmatrix} x_r^L \\ y_r^L \\ 1 \end{bmatrix} = H_g^L \begin{bmatrix} x_t^L \\ y_t^L \\ 1 \end{bmatrix} $$

where $w^L$ is an intermediate variable with $w^L = h_{31}^L x_t^L + h_{32}^L y_t^L + h_{33}^L$, and the L-th level global homography matrix is defined element-wise as

$$ H_g^L = \begin{bmatrix} h_{11}^L & h_{12}^L & h_{13}^L \\ h_{21}^L & h_{22}^L & h_{23}^L \\ h_{31}^L & h_{32}^L & h_{33}^L \end{bmatrix} $$

For convenience, it is abbreviated as $H_g^L$.
$H_g^L$ is determined as follows: four groups of feature point matching results are randomly selected each time to determine a homography matrix, and the $l_2$ norm is used to screen the remaining feature matching points according to

$$ \left\| \begin{bmatrix} x_r^L \\ y_r^L \end{bmatrix} - \begin{bmatrix} \hat{x}_r^L \\ \hat{y}_r^L \end{bmatrix} \right\|_2 \le t_r $$

where $(\hat{x}_r^L, \hat{y}_r^L)$ is the projection of $(x_t^L, y_t^L)$ under the candidate homography matrix and $t_r$ is the threshold for outlier screening. A remaining feature matching point satisfying this formula is regarded as a valid feature matching point; otherwise it is regarded as invalid. The homography matrix for which the number of valid feature matching points is greatest is the finally determined L-th level global homography matrix $H_g^L$.
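A plain-NumPy sketch of this randomized selection and $l_2$ screening follows; the hypothesis count n_iter and threshold t_r are illustrative assumptions, since the patent fixes only the 4-pair sampling and the $l_2$ test.

```python
import cv2
import numpy as np

# Sketch of estimating the L-th level global homography H_g^L: repeatedly fit a
# homography to 4 random matched pairs and keep the one with the most matches
# whose l2 reprojection error stays below the screening threshold t_r.
def estimate_Hg(pts_t, pts_r, t_r=3.0, n_iter=500):
    rng = np.random.default_rng(0)
    pts_t_h = np.hstack([pts_t, np.ones((len(pts_t), 1))])   # homogeneous coords
    best_H, best_valid = None, -1

    for _ in range(n_iter):
        idx = rng.choice(len(pts_t), 4, replace=False)
        H = cv2.getPerspectiveTransform(pts_t[idx].astype(np.float32),
                                        pts_r[idx].astype(np.float32))
        proj = pts_t_h @ H.T                   # rows are w^L * [x_r, y_r, 1]
        proj = proj[:, :2] / proj[:, 2:3]      # divide out the variable w^L
        err = np.linalg.norm(proj - pts_r, axis=1)
        n_valid = int((err <= t_r).sum())      # count of valid matching points
        if n_valid > best_valid:
            best_H, best_valid = H, n_valid
    return best_H
```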
The homography matrix of level L-1 can be obtained by increasing the image resolution. Introducing a scale factor $\mu$, the corresponding pixel points at level L-1 of the reference image and the image to be registered can be expressed as

$$ \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \end{bmatrix} = \mu \begin{bmatrix} x_r^{L} \\ y_r^{L} \end{bmatrix}, \qquad \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \end{bmatrix} = \mu \begin{bmatrix} x_t^{L} \\ y_t^{L} \end{bmatrix} $$

where $x_r^{L-1}$ and $y_r^{L-1}$ are the x- and y-coordinates of the (L-1)-th level reference image, and $x_t^{L-1}$ and $y_t^{L-1}$ are the x- and y-coordinates of the (L-1)-th level image to be registered. To find the homography matrix of level L-1, one has

$$ w^{L-1} \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \\ 1 \end{bmatrix} = S\, H_g^{L}\, S^{-1} \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \\ 1 \end{bmatrix}, \qquad S = \begin{bmatrix} \mu & 0 & 0 \\ 0 & \mu & 0 \\ 0 & 0 & 1 \end{bmatrix}. $$

Letting $H_g^{L-1} = S H_g^{L} S^{-1}$, the above formula can be rewritten as

$$ w^{L-1} \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \\ 1 \end{bmatrix} = H_g^{L-1} \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \\ 1 \end{bmatrix} $$

where $H_g^{L-1}$ is the global homography matrix of level L-1.

By applying this level-L-to-level-(L-1) derivation repeatedly and increasing the resolution step by step, the global homography matrix $H_g^{0}$ corresponding to level 0 is obtained, namely

$$ w^{0} \begin{bmatrix} x_r^{0} \\ y_r^{0} \\ 1 \end{bmatrix} = H_g^{0} \begin{bmatrix} x_t^{0} \\ y_t^{0} \\ 1 \end{bmatrix}, \qquad H_g^{0} = S_L\, H_g^{L}\, S_L^{-1}, \quad S_L = \begin{bmatrix} \mu_L & 0 & 0 \\ 0 & \mu_L & 0 \\ 0 & 0 & 1 \end{bmatrix} $$

where $x_r^{0}$ and $y_r^{0}$ are the x- and y-coordinates of the level-0 reference image, $x_t^{0}$ and $y_t^{0}$ are those of the level-0 image to be registered, and $\mu_L = \mu^{L}$ is the scale factor of the level-0 homography matrix.
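Under the reconstruction above, propagating the level-L estimate down the pyramid is a conjugation by the scaling matrix; a short sketch, assuming a uniform per-level scale factor $\mu$:

```python
import numpy as np

# Sketch: rescale a homography estimated at pyramid level L down to level 0 by
# conjugating with S = diag(mu, mu, 1) once per level: H^{l-1} = S H^l S^{-1}.
def homography_to_level0(H_L, mu, L):
    S = np.diag([mu, mu, 1.0])
    S_inv = np.diag([1.0 / mu, 1.0 / mu, 1.0])
    H = H_L.copy()
    for _ in range(L):
        H = S @ H @ S_inv
    return H / H[2, 2]      # normalize so the bottom-right entry is 1
```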
Taking level L-1 as an example of how registration combines the global and local homography matrices, let the scale factor be $\mu = 2$. As shown in FIG. 2, the L-1 level image is divided evenly into four blocks, and the homography matrix corresponding to each sub-block is defined as a local homography matrix, denoted $H_l^{L-1,\zeta}$, the local homography matrix of the $\zeta$-th image block of level L-1. The local homography matrix is solved by the same algorithm as the global homography matrix: invalid feature matching points are further removed, and the local homography matrix is then determined.

For image block 1 of level L-1 in FIG. 2, combining the L-1 level global homography matrix $H_g^{L-1}$ and the L-1 level local homography matrix $H_l^{L-1,1}$, the coordinate transformation relation between image block 1 of the reference image and of the image to be registered is obtained as

$$ w_l^{L-1,1}\, w_g^{L-1,1} \begin{bmatrix} x_r^{L-1,1} \\ y_r^{L-1,1} \\ 1 \end{bmatrix} = H_l^{L-1,1} H_g^{L-1} \begin{bmatrix} x_t^{L-1,1} \\ y_t^{L-1,1} \\ 1 \end{bmatrix} $$

where $(x_r^{L-1,1}, y_r^{L-1,1})$ and $(x_t^{L-1,1}, y_t^{L-1,1})$ are the corresponding coordinates of image block 1 of the L-1 level reference image and image to be registered, and $w_l^{L-1,1}$ and $w_g^{L-1,1}$ are the local and global intermediate variables corresponding to image block 1 of level L-1. Denoting $F_{L-1,1} = H_l^{L-1,1} H_g^{L-1}$, the above formula can be abbreviated with $F_{L-1,1}$ as the transformation matrix of image block 1 of the L-1 level image.
Similarly, for image blocks 2, 3, 4 of level L-1 in FIG. 2:

$$ w_l^{L-1,\zeta}\, w_g^{L-1,\zeta} \begin{bmatrix} x_r^{L-1,\zeta} \\ y_r^{L-1,\zeta} \\ 1 \end{bmatrix} = F_{L-1,\zeta} \begin{bmatrix} x_t^{L-1,\zeta} \\ y_t^{L-1,\zeta} \\ 1 \end{bmatrix}, \qquad \zeta = 2, 3, 4 $$

where $F_{L-1,2}$, $F_{L-1,3}$, $F_{L-1,4}$ are the transformation matrices of image blocks 2, 3, 4 of the L-1 level image, $(x_t^{L-1,\zeta}, y_t^{L-1,\zeta})$ are the corresponding coordinates of image blocks 2, 3, 4 of the L-1 level image to be registered, and $(x_r^{L-1,\zeta}, y_r^{L-1,\zeta})$ those of the L-1 level reference image.
Combining the transformation matrices $F_{L-1,1}$, $F_{L-1,2}$, $F_{L-1,3}$, $F_{L-1,4}$ of the four L-1 level image blocks, the coordinate transformation relation between the L-1 level reference image and image to be registered is obtained as

$$ \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \\ 1 \end{bmatrix} \sim F_{L-1} \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \\ 1 \end{bmatrix}, \qquad F_{L-1} = \sum_{\zeta=1}^{4} \lambda_{L-1,\zeta}\, F_{L-1,\zeta} $$

where $F_{L-1}$ is the joint transformation matrix of the L-1 level image, $F_{L-1,\zeta}$ is the transformation matrix of the $\zeta$-th image block of level L-1, and $\lambda_{L-1,\zeta}$ is the weight of the $\zeta$-th image block's transformation matrix.

The resolution is increased step by step until level 0 of the image pyramid, giving the coordinate transformation relation between the reference image and the image to be registered:

$$ \begin{bmatrix} x_r^{0} \\ y_r^{0} \\ 1 \end{bmatrix} \sim F_{0} \begin{bmatrix} x_t^{0} \\ y_t^{0} \\ 1 \end{bmatrix} $$

where $F_0$ is the joint transformation matrix of the level-0 image, i.e. the final transformation matrix combining the global and local homography matrices, $(x_t^0, y_t^0)$ are the corresponding coordinates of the level-0 image to be registered, and $(x_r^0, y_r^0)$ those of the level-0 reference image.
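A sketch of forming the joint transform from the four block transforms and warping the frame with it follows. The patent specifies weights $\lambda_{L-1,\zeta}$ but not their rule, so the uniform weights here are an assumption:

```python
import cv2
import numpy as np

# Sketch: combine the global homography with the four local (per-block)
# homographies into the joint transformation matrix, then register the frame.
def joint_register(img_t, H_global, H_locals, weights=None, out_size=None):
    weights = weights or [0.25] * len(H_locals)       # assumed uniform lambdas
    F_blocks = [H_l @ H_global for H_l in H_locals]   # F_zeta = H_l,zeta @ H_g
    F = sum(w * Fb for w, Fb in zip(weights, F_blocks))
    F = F / F[2, 2]                                   # normalize the homography
    h, w = out_size or img_t.shape[:2]
    return cv2.warpPerspective(img_t, F, (w, h))      # registered image
```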
Step 3), aiming at the image processed in the step 2), determining a to-be-detected area of the moving vehicle by adopting a 2-frame difference method; performing superpixel segmentation on the image, and determining a scanning frame according to the center of the superpixel to traverse the area to be detected;
as shown in fig. 3 of the drawings, F (k-1) and F (k) respectively represent the k-1 th frame and the k-th frame in the drone image sequence. Fr(k-1) and Fr(k) Is the registered image. To reduce the amount of computation for moving vehicle detection, the registered images F are processedr(k-1) and Fr(k) The 2-frame difference method is adopted to determine the region to be detected, which is shown as a rectangular box in a small picture 2-frame difference method in figure 3 in the attached drawing. Taking 2 moving vehicles as an example, 4 regions to be detected are generated after 2-frame difference method is used.
After the region to be detected is determined by the 2-frame difference method, the image is segmented into superpixels and a scanning frame is determined from each superpixel center; the region to be detected is then traversed to detect moving vehicles. While traversing the region to be detected, the scanning frame must undergo affine transformation to account for rotation, translation, and similar motion of the target vehicle, reducing the miss rate of moving vehicle detection.
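A sketch of this step using OpenCV for the 2-frame difference and scikit-image's SLIC for superpixels; the threshold, segment count, and window size are illustrative, and border clipping of the scanning frames is omitted:

```python
import cv2
import numpy as np
from skimage.segmentation import slic

# Sketch of step 3: a 2-frame difference on the registered frames proposes
# changed regions; SLIC superpixel centers inside them seed scanning frames.
def scanning_frames(Fr_prev, Fr_curr, win=48):
    diff = cv2.absdiff(Fr_curr, Fr_prev)
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 25, 255, cv2.THRESH_BINARY)

    labels = slic(cv2.cvtColor(Fr_curr, cv2.COLOR_BGR2RGB), n_segments=400)
    frames = []
    for lab in np.unique(labels):
        ys, xs = np.nonzero(labels == lab)
        cy, cx = int(ys.mean()), int(xs.mean())   # superpixel center
        if mask[cy, cx]:                          # center lies in a changed region
            frames.append((cx - win // 2, cy - win // 2, win, win))
    return frames
```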
Step 4), extracting the texture and color of the vehicle to form low-order features of the vehicle by using the image processed in the step 3); introducing context information of the vehicle, and extracting high-order characteristics of the vehicle; after the low-order characteristic and the high-order characteristic of the target vehicle are obtained, the low-order characteristic and the high-order characteristic are fused to obtain the multi-order characteristic of the target vehicle;
specifically, the image is first divided into small connected regions, called cells. And then acquiring the direction histogram of the gradient or edge of each pixel point in the cell unit. Finally, the features of these cell units are combined to form a HOG feature descriptor. Firstly converting the image into HSV color space, respectively extracting HOG characteristics from three channels, finally making characteristic fusion, converting the image from RGB color space to HSV color space, and dividing them into different portionsH, S, V three-channel data template of the image is extracted and stored as a two-dimensional matrix MH、MSAnd MVSimultaneously calculating HOG characteristics H of three matrixes respectivelyH、HSAnd HV. And fusing the three-channel HOG characteristics in a weighting mode, namely: hl=wHHH+wSHS+wVHV. Wherein HlRepresenting low-order features of the vehicle; w is aH、wSAnd wVRespectively HOG characteristic HH、HSAnd HVAnd w isH+wS+w V1 is ═ 1; the weight of three channels is determined by each channel data template in a self-adaptive mode, and is specifically determined by the following formula:
Figure BDA0001830584680000131
to this end, a low-order feature of the vehicle, i.e., a fused H, S, V three-channel HOG feature, is determined.
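A sketch of the multi-channel HOG low-order feature using scikit-image follows. The patent's adaptive weight formula sits in an unreproduced figure, so weighting each channel by its normalized variance is an assumption standing in for it:

```python
import cv2
import numpy as np
from skimage.feature import hog

# Sketch of step 4 (low-order features): HOG on each HSV channel, fused with
# adaptive weights satisfying w_H + w_S + w_V = 1.
def low_order_feature(patch_bgr):
    hsv = cv2.cvtColor(patch_bgr, cv2.COLOR_BGR2HSV)
    channels = [hsv[:, :, i] for i in range(3)]               # M_H, M_S, M_V
    feats = [hog(c, orientations=9, pixels_per_cell=(8, 8),
                 cells_per_block=(2, 2)) for c in channels]   # H_H, H_S, H_V
    var = np.array([c.var() for c in channels], dtype=np.float64)
    w = var / var.sum()               # assumed adaptive rule: normalized variance
    return w[0] * feats[0] + w[1] * feats[1] + w[2] * feats[2]   # H_l
```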
When determining the high-order features, the context information of the vehicle is introduced. Positive and negative samples are manually selected to initialize the positive and negative dictionaries, and the final positive dictionary $D_p$ and negative dictionary $D_n$ are then determined according to dictionary learning and an autonomous sample selection strategy. Next, the high-order features are determined by calculating the reconstruction error of the target region and the reconstruction errors of the other image blocks in its neighborhood.

For a vehicle $t_v$, the reconstruction error is denoted $e(t_v) = [e(t_v, D_p), e(t_v, D_n)]^T$, where $e(t_v, D_p)$ and $e(t_v, D_n)$ are the reconstruction errors of $t_v$ on the positive and negative dictionaries. For a neighborhood image block $a_\iota$ of the vehicle, the reconstruction error is $e(a_\iota) = [e(a_\iota, D_p), e(a_\iota, D_n)]^T$, where the subscript $\iota$ indexes the image blocks in the neighborhood of the target vehicle $t_v$, and $e(a_\iota, D_p)$ and $e(a_\iota, D_n)$ are the reconstruction errors of $a_\iota$ on the positive and negative dictionaries. For a neighborhood image block $a_\iota$, the high-order feature of the target vehicle $t_v$ is defined as the difference of the reconstruction errors of $t_v$ and $a_\iota$, expressed as $H(t_v, a_\iota) = \|e(t_v) - e(a_\iota)\|_2$, where $H(t_v, a_\iota)$ is the high-order feature of target vehicle $t_v$ relative to neighborhood $a_\iota$.

When the target vehicle $t_v$ has M image blocks in its neighborhood, the high-order feature of $t_v$ is $H_h = [H(t_v, a_1), H(t_v, a_2), \dots, H(t_v, a_M)]^T$.

The obtained high-order and low-order features of the vehicle are fused to form the multi-order feature of the target vehicle: $F_v = [H_l, H_h]$. Thus, the multi-order features of the target vehicle are obtained by combining its low-order and high-order features.
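A sketch of the high-order feature computation with scikit-learn's sparse coder; dictionaries $D_p$ and $D_n$ hold atoms as rows, and the OMP sparsity level is an illustrative choice:

```python
import numpy as np
from sklearn.decomposition import SparseCoder

# Sketch: reconstruction error of a feature vector on a dictionary (rows = atoms).
def recon_error(x, D, n_nonzero=5):
    coder = SparseCoder(dictionary=D, transform_algorithm='omp',
                        transform_n_nonzero_coefs=n_nonzero)
    code = coder.transform(x.reshape(1, -1))
    return float(np.linalg.norm(x - code @ D))

# Sketch: H_h = [H(t_v, a_1), ..., H(t_v, a_M)] from reconstruction-error
# differences between the target vehicle and its M neighborhood blocks.
def high_order_feature(t_v, neighbors, D_p, D_n):
    e_t = np.array([recon_error(t_v, D_p), recon_error(t_v, D_n)])
    H_h = []
    for a in neighbors:
        e_a = np.array([recon_error(a, D_p), recon_error(a, D_n)])
        H_h.append(np.linalg.norm(e_t - e_a))   # H(t_v, a) = ||e(t_v) - e(a)||_2
    return np.array(H_h)
```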
And 5) training the dictionary by using a dictionary learning algorithm for the obtained multi-order features of the vehicle, and detecting the moving vehicle by using the trained dictionary.
Specifically, in the correlation-based dictionary learning algorithm, the dictionary update stage first determines the atoms involved in the sparse representation of the new samples and updates only those atoms, reducing the computational load of dictionary learning. In addition, sparsity is introduced into the dictionary update stage. This update process is iterated until convergence, achieving fast and efficient dictionary training and finally completing moving vehicle detection.
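A sketch of step 5 with scikit-learn's generic dictionary learner standing in for the patent's correlation-based update (which touches only the atoms involved in the new samples' sparse representations); it reuses recon_error from the previous sketch, and the atom count is illustrative:

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Sketch of step 5: train positive/negative dictionaries on multi-order
# features F_v, then classify a candidate by its smaller reconstruction error.
def train_dictionaries(F_pos, F_neg, n_atoms=64):
    fit = lambda X: DictionaryLearning(n_components=n_atoms,
                                       transform_algorithm='omp').fit(X).components_
    return fit(F_pos), fit(F_neg)       # D_p, D_n (rows are atoms)

def is_moving_vehicle(f_v, D_p, D_n):
    return recon_error(f_v, D_p) < recon_error(f_v, D_n)
```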
And 2) introducing an image pyramid, and providing an image registration algorithm combining global and local homography matrixes according to the feature point pairing result in a top-down mode. The global homography matrix describes global position changes, and the local homography matrix describes local position changes.
And 3) introducing a 2-frame difference method and superpixel segmentation, reducing the region to be detected by using the 2-frame difference method, introducing superpixel segmentation, determining and scanning the region to be detected according to the center of the superpixel, and effectively reducing the calculated amount of moving vehicle detection.
Step 4) is specifically implemented as follows: when extracting the high-order features of the vehicle, positive and negative samples are first manually selected to initialize the positive and negative dictionaries; after the final positive and negative dictionaries are determined according to dictionary learning and an autonomous sample selection strategy, the high-order features are determined by calculating the reconstruction error of the target region and the reconstruction errors of the other image blocks in the neighborhood.

Claims (10)

1. A moving vehicle detection method based on unmanned aerial vehicle aerial video is characterized by comprising the following steps:
step 1), acquiring an aerial video of a moving vehicle, extracting a continuous image sequence of the aerial video, then extracting SURF (speeded up robust features) feature points of a reference image and an image to be registered, then performing feature point matching, and performing abnormal point elimination on the matched feature points by adopting a random sampling consistency algorithm;
step 2), aiming at the characteristic points after the abnormal points are removed, obtaining a conversion matrix of the image through an unmanned aerial vehicle image registration algorithm;
step 3), aiming at the image processed in the step 2), determining a to-be-detected area of the moving vehicle by adopting a 2-frame difference method, performing superpixel segmentation on the image, and determining a scanning frame according to the center of the superpixel so as to traverse the to-be-detected area;
step 4), extracting the texture and color of the vehicle to form low-order features of the vehicle by using the image processed in the step 3); introducing context information of the vehicle, and extracting high-order characteristics of the vehicle; after the low-order characteristic and the high-order characteristic of the target vehicle are obtained, the low-order characteristic and the high-order characteristic are fused to obtain the multi-order characteristic of the target vehicle;
and 5) training the dictionary by using a dictionary learning algorithm for the obtained multi-order features of the vehicle, and detecting the moving vehicle by using the trained dictionary.
2. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video according to claim 1, wherein Haar features and the integral image concept are adopted to extract the SURF feature points of the reference image and the image to be registered.
3. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video according to claim 2, wherein the Euclidean distance between any SURF feature point in the reference image and the feature points in the image to be registered is calculated; the smaller the Euclidean distance, the higher the similarity, and when the Euclidean distance is smaller than a set threshold the match is judged successful; if a SURF feature point in the image to be registered matches multiple feature points in the reference image, the match is regarded as unsuccessful.
4. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video according to claim 1, wherein after the abnormal points are eliminated, an image pyramid is introduced and the global and local homography matrices are determined from the feature point pairing results in a top-down manner: first, an (L+1)-level pyramid of the reference image and the image to be registered is established; when determining the global homography matrix, one starts from the L-th level global homography matrix and increases the resolution step by step down to level 0, thereby obtaining the level-0 global homography matrix.
5. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video according to claim 4, wherein $(x_r^L, y_r^L)$ and $(x_t^L, y_t^L)$ are defined as the corresponding coordinates of the L-th level reference image and image to be registered, where $x_r^L$ and $y_r^L$ are the x- and y-coordinates of the L-th level reference image, and $x_t^L$ and $y_t^L$ are the x- and y-coordinates of the L-th level image to be registered.
the L-th level global homography matrix is determined by:
Figure FDA0002934808530000027
wherein, wLIs an intermediate variable and has
Figure FDA0002934808530000028
Figure FDA0002934808530000029
The L level global homography matrix is defined by the matrix elements as follows:
Figure FDA00029348085300000210
is abbreviated as
Figure FDA00029348085300000211
$H_g^L$ is determined as follows: four groups of feature point matching results are randomly selected each time to determine a homography matrix, and the $l_2$ norm is used to screen the remaining feature matching points according to

$$ \left\| \begin{bmatrix} x_r^L \\ y_r^L \end{bmatrix} - \begin{bmatrix} \hat{x}_r^L \\ \hat{y}_r^L \end{bmatrix} \right\|_2 \le t_r $$

where $(\hat{x}_r^L, \hat{y}_r^L)$ is the projection of $(x_t^L, y_t^L)$ under the candidate homography matrix and $t_r$ is the threshold for outlier screening. A remaining feature matching point satisfying this formula is regarded as a valid feature matching point; otherwise it is regarded as invalid. The homography matrix for which the number of valid feature matching points is greatest is the finally determined L-th level global homography matrix $H_g^L$.
The homography matrix of level L-1 is obtained by increasing the image resolution. Introducing a scale factor $\mu$, the corresponding pixel points at level L-1 of the reference image and the image to be registered can be expressed as

$$ \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \end{bmatrix} = \mu \begin{bmatrix} x_r^{L} \\ y_r^{L} \end{bmatrix}, \qquad \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \end{bmatrix} = \mu \begin{bmatrix} x_t^{L} \\ y_t^{L} \end{bmatrix} $$

where $x_r^{L-1}$ and $y_r^{L-1}$ are the x- and y-coordinates of the (L-1)-th level reference image, and $x_t^{L-1}$ and $y_t^{L-1}$ are the x- and y-coordinates of the (L-1)-th level image to be registered. To find the homography matrix of level L-1, one has

$$ w^{L-1} \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \\ 1 \end{bmatrix} = S\, H_g^{L}\, S^{-1} \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \\ 1 \end{bmatrix}, \qquad S = \begin{bmatrix} \mu & 0 & 0 \\ 0 & \mu & 0 \\ 0 & 0 & 1 \end{bmatrix}. $$

Letting $H_g^{L-1} = S H_g^{L} S^{-1}$, the above formula can be rewritten as

$$ w^{L-1} \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \\ 1 \end{bmatrix} = H_g^{L-1} \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \\ 1 \end{bmatrix} $$

where $H_g^{L-1}$ is the global homography matrix of level L-1.

By applying this level-L-to-level-(L-1) derivation repeatedly and increasing the resolution step by step, the global homography matrix $H_g^{0}$ corresponding to level 0 is obtained, namely

$$ w^{0} \begin{bmatrix} x_r^{0} \\ y_r^{0} \\ 1 \end{bmatrix} = H_g^{0} \begin{bmatrix} x_t^{0} \\ y_t^{0} \\ 1 \end{bmatrix}, \qquad H_g^{0} = S_L\, H_g^{L}\, S_L^{-1}, \quad S_L = \begin{bmatrix} \mu_L & 0 & 0 \\ 0 & \mu_L & 0 \\ 0 & 0 & 1 \end{bmatrix} $$

where $x_r^{0}$ and $y_r^{0}$ are the x- and y-coordinates of the level-0 reference image, $x_t^{0}$ and $y_t^{0}$ are those of the level-0 image to be registered, and $\mu_L = \mu^{L}$ is the scale factor of the level-0 homography matrix.
6. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video according to claim 1, wherein $F(k-1)$ and $F(k)$ denote the (k-1)-th and k-th frames of the unmanned aerial vehicle image sequence, and $F_r(k-1)$ and $F_r(k)$ are the registered images; the region to be detected is determined from the registered images $F_r(k-1)$ and $F_r(k)$ by the 2-frame difference method.
7. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video according to claim 1, wherein the image is first divided into small connected regions, i.e. cell units; the histogram of gradient or edge orientations of each pixel point within a cell unit is then collected; finally, combining the features of these cell units forms the HOG feature descriptor. The image is converted from the RGB color space to the HSV color space, the H, S, V three-channel data templates of the image are extracted separately and stored as two-dimensional matrices $M_H$, $M_S$ and $M_V$, and the HOG features $H_H$, $H_S$ and $H_V$ of the three matrices are computed respectively.
8. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video according to claim 7, wherein the three-channel HOG features are fused by weighting, namely $H_l = w_H H_H + w_S H_S + w_V H_V$, where $H_l$ denotes the low-order feature of the vehicle, $w_H$, $w_S$ and $w_V$ are the weights of the HOG features $H_H$, $H_S$ and $H_V$, and $w_H + w_S + w_V = 1$; the weights of the three channels are determined adaptively from the respective channel data templates. The low-order feature of the vehicle, i.e. the fused H, S, V three-channel HOG feature, is thus determined.
9. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video according to claim 7, wherein the context information of the vehicle is introduced when determining the high-order features; positive and negative samples are manually selected to initialize the positive and negative dictionaries, and the final positive dictionary $D_p$ and negative dictionary $D_n$ are then determined according to dictionary learning and an autonomous sample selection strategy; the high-order features are determined by calculating the reconstruction error of the target region and the reconstruction errors of the other image blocks in its neighborhood.

For a vehicle $t_v$, the reconstruction error is denoted $e(t_v) = [e(t_v, D_p), e(t_v, D_n)]^T$, where $e(t_v, D_p)$ and $e(t_v, D_n)$ are the reconstruction errors of $t_v$ on the positive and negative dictionaries; for a neighborhood image block $a_\iota$ of the vehicle, the reconstruction error is $e(a_\iota) = [e(a_\iota, D_p), e(a_\iota, D_n)]^T$, where the subscript $\iota$ indexes the image blocks in the neighborhood of the target vehicle $t_v$, and $e(a_\iota, D_p)$ and $e(a_\iota, D_n)$ are the reconstruction errors of $a_\iota$ on the positive and negative dictionaries; for a neighborhood image block $a_\iota$, the high-order feature of the target vehicle $t_v$ is defined as the difference of the reconstruction errors of $t_v$ and $a_\iota$, expressed as $H(t_v, a_\iota) = \|e(t_v) - e(a_\iota)\|_2$, where $H(t_v, a_\iota)$ is the high-order feature of target vehicle $t_v$ relative to neighborhood $a_\iota$.

When the target vehicle $t_v$ has M image blocks in its neighborhood, the high-order feature of $t_v$ is $H_h = [H(t_v, a_1), H(t_v, a_2), \dots, H(t_v, a_M)]^T$.

The obtained high-order and low-order features of the vehicle are fused to form the multi-order feature of the target vehicle: $F_v = [H_l, H_h]$; the multi-order features of the target vehicle are obtained by combining its low-order and high-order features.
10. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video, wherein, specifically, in the correlation-based dictionary learning algorithm, the dictionary update stage first determines the atoms involved in the sparse representation of the new samples and updates only those atoms; sparsity is introduced into the dictionary update stage; and the update process is iterated until convergence, achieving fast and efficient dictionary training and finally completing moving vehicle detection.
CN201811203391.1A 2018-10-16 2018-10-16 Moving vehicle detection method based on unmanned aerial vehicle aerial video Active CN109376641B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811203391.1A CN109376641B (en) 2018-10-16 2018-10-16 Moving vehicle detection method based on unmanned aerial vehicle aerial video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811203391.1A CN109376641B (en) 2018-10-16 2018-10-16 Moving vehicle detection method based on unmanned aerial vehicle aerial video

Publications (2)

Publication Number Publication Date
CN109376641A CN109376641A (en) 2019-02-22
CN109376641B true CN109376641B (en) 2021-04-27

Family

ID=65400009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811203391.1A Active CN109376641B (en) 2018-10-16 2018-10-16 Moving vehicle detection method based on unmanned aerial vehicle aerial video

Country Status (1)

Country Link
CN (1) CN109376641B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110136104B (en) * 2019-04-25 2021-04-13 上海交通大学 Image processing method, system and medium based on unmanned aerial vehicle ground station
CN110598613B (en) * 2019-09-03 2022-10-25 长安大学 Expressway agglomerate fog monitoring method
CN112749779A (en) * 2019-10-30 2021-05-04 北京市商汤科技开发有限公司 Neural network processing method and device, electronic equipment and computer storage medium
CN111552269B (en) * 2020-04-27 2021-05-28 武汉工程大学 Multi-robot safety detection method and system based on attitude estimation
CN111612966B (en) * 2020-05-21 2021-05-07 广东乐佳印刷有限公司 Bill certificate anti-counterfeiting detection method and device based on image recognition
CN111881853B (en) * 2020-07-31 2022-09-16 中北大学 Method and device for identifying abnormal behaviors in oversized bridge and tunnel

Citations (1)

Publication number Priority date Publication date Assignee Title
CN105554456A (en) * 2015-12-21 2016-05-04 北京旷视科技有限公司 Video processing method and apparatus

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US8913783B2 (en) * 2009-10-29 2014-12-16 Sri International 3-D model based method for detecting and classifying vehicles in aerial imagery

Patent Citations (1)

Publication number Priority date Publication date Assignee Title
CN105554456A (en) * 2015-12-21 2016-05-04 北京旷视科技有限公司 Video processing method and apparatus

Non-Patent Citations (2)

Title
Vehicle detection in high-resolution aerial images based on fast sparse representation classification and multiorder feature; Chen Z et al.; IEEE Transactions on Intelligent Transportation Systems; 2016-02-18; Vol. 17, No. 8; pp. 2296-2309 *
Vehicle detection method in UAV aerial videos; Wang Suqin et al.; Journal of System Simulation; 2018-07; Vol. 30, No. 07; pp. 359-369 *

Also Published As

Publication number Publication date
CN109376641A (en) 2019-02-22

Similar Documents

Publication Publication Date Title
CN109376641B (en) Moving vehicle detection method based on unmanned aerial vehicle aerial video
CN110569704B (en) Multi-strategy self-adaptive lane line detection method based on stereoscopic vision
CN109903331B (en) Convolutional neural network target detection method based on RGB-D camera
CN104574347B (en) Satellite in orbit image geometry positioning accuracy evaluation method based on multi- source Remote Sensing Data data
CN112686935B (en) Airborne sounding radar and multispectral satellite image registration method based on feature fusion
CN108446634B (en) Aircraft continuous tracking method based on combination of video analysis and positioning information
CN112883850B (en) Multi-view space remote sensing image matching method based on convolutional neural network
CN109215053B (en) Method for detecting moving vehicle with pause state in aerial video shot by unmanned aerial vehicle
CN108428220A (en) Satellite sequence remote sensing image sea island reef region automatic geometric correction method
CN112929626B (en) Three-dimensional information extraction method based on smartphone image
CN111553845B (en) Quick image stitching method based on optimized three-dimensional reconstruction
CN106530313A (en) Sea-sky line real-time detection method based on region segmentation
CN113838064B (en) Cloud removal method based on branch GAN using multi-temporal remote sensing data
CN112016478A (en) Complex scene identification method and system based on multispectral image fusion
CN107705295B (en) Image difference detection method based on robust principal component analysis method
CN115063447A (en) Target animal motion tracking method based on video sequence and related equipment
CN112946679A (en) Unmanned aerial vehicle surveying and mapping jelly effect detection method and system based on artificial intelligence
CN114663880A (en) Three-dimensional target detection method based on multi-level cross-modal self-attention mechanism
CN114742864A (en) Belt deviation detection method and device
CN113034398A (en) Method and system for eliminating jelly effect in urban surveying and mapping based on artificial intelligence
CN117274627A (en) Multi-temporal snow remote sensing image matching method and system based on image conversion
CN110298347B (en) Method for identifying automobile exhaust analyzer screen based on GrayWorld and PCA-CNN
CN116563104A (en) Image registration method and image stitching method based on particle swarm optimization
CN114693755B (en) Non-rigid registration method and system for multimode image maximum moment and space consistency
CN111833384B (en) Method and device for rapidly registering visible light and infrared images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20211224

Address after: 908, block a, floor 8, No. 116, Zizhuyuan Road, Haidian District, Beijing 100089

Patentee after: ZHONGZI DATA CO.,LTD.

Patentee after: China Highway Engineering Consulting Group Co., Ltd.

Address before: 710064 middle section of South Second Ring Road, Beilin District, Xi'an City, Shaanxi Province

Patentee before: CHANG'AN University