CN109376641B - Moving vehicle detection method based on unmanned aerial vehicle aerial video - Google Patents
- Publication number
- CN109376641B CN109376641B CN201811203391.1A CN201811203391A CN109376641B CN 109376641 B CN109376641 B CN 109376641B CN 201811203391 A CN201811203391 A CN 201811203391A CN 109376641 B CN109376641 B CN 109376641B
- Authority
- CN
- China
- Prior art keywords
- image
- vehicle
- level
- order
- registered
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
- G06V20/42—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/22—Matching criteria, e.g. proximity measures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/50—Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/50—Context or environment of the image
- G06V20/56—Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
- G06V20/58—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
- G06V20/584—Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of vehicle lights or traffic lights
Abstract
The invention discloses a moving vehicle detection method based on unmanned aerial vehicle aerial video. The method first applies the SURF algorithm to match feature points between images and eliminate abnormal points, then obtains a conversion matrix through an unmanned aerial vehicle image registration algorithm that combines a global homography matrix with local homography matrices, compensating for the adverse effects of onboard camera motion. A 2-frame difference method then narrows the region to be detected, which is traversed according to superpixel centers, further improving detection efficiency. Next, a multichannel HOG feature algorithm extracts the low-order features of the vehicle, and the vehicle's context information is introduced to obtain its high-order features; the two are fused into the multi-order features of the target vehicle. Finally, the multi-order features are combined with a dictionary learning algorithm to detect moving vehicles. The method suppresses the influence of the motion of the unmanned aerial vehicle's onboard camera, handles vehicle deformation and background interference in the image, and improves the robustness and real-time performance of moving vehicle detection.
Description
Technical Field
The invention relates to a method for detecting a moving vehicle, and in particular to a method for detecting a moving vehicle based on an unmanned aerial vehicle aerial video.
Background
Unmanned aerial vehicle aerial photography is a novel means of remote sensing data acquisition, with unique advantages: flexible deployment, a large monitoring range, fine-grained information acquisition, and freedom from ground traffic interference. The flying speed and height of an unmanned aerial vehicle are adjustable and its visual angle is flexible, so it acquires ground traffic image information efficiently, at low cost and at low risk, enabling traffic monitoring at scales from local to wide-area. With the further development and fusion of unmanned aerial vehicle aerial photography technology and image processing technology, the rational use and analysis of unmanned aerial vehicle images has broad application prospects in traffic planning, design and management.
Commonly used moving vehicle detection methods include the background extraction method and the optical flow method. The background extraction method is extremely sensitive to illumination and background variation, while the optical flow method is computationally too expensive. To improve the robustness of moving vehicle detection, some scholars have established dynamic Bayesian networks and detected vehicles with a sliding-window method; although this achieves a certain effect, the computational load of the sliding window remains too large, which limits its application.
Therefore, although many moving vehicle detection algorithms already exist and achieve a certain detection effect, the stability, robustness and real-time performance of moving vehicle detection based on unmanned aerial vehicle aerial video still need to be improved.
Disclosure of Invention
The invention aims to provide a moving vehicle detection method based on an unmanned aerial vehicle aerial video, so as to overcome the defects of the prior art.
In order to achieve the purpose, the invention adopts the following technical scheme:
a moving vehicle detection method based on unmanned aerial vehicle aerial video comprises the following steps:
step 1), acquiring an aerial video of a moving vehicle, extracting a continuous image sequence of the aerial video, then extracting SURF (speeded up robust features) feature points of a reference image and an image to be registered, then performing feature point matching, and performing abnormal point elimination on the matched feature points by adopting a random sampling consistency algorithm;
step 2), aiming at the characteristic points after the abnormal points are removed, obtaining a conversion matrix of the image through an unmanned aerial vehicle image registration algorithm;
step 3), aiming at the image processed in the step 2), determining a to-be-detected area of the moving vehicle by adopting a 2-frame difference method, performing superpixel segmentation on the image, and determining a scanning frame according to the center of the superpixel so as to traverse the to-be-detected area;
step 4), extracting the texture and color of the vehicle to form low-order features of the vehicle by using the image processed in the step 3); introducing context information of the vehicle, and extracting high-order characteristics of the vehicle; after the low-order characteristic and the high-order characteristic of the target vehicle are obtained, the low-order characteristic and the high-order characteristic are fused to obtain the multi-order characteristic of the target vehicle;
and 5) training the dictionary by using a dictionary learning algorithm for the obtained multi-order features of the vehicle, and detecting the moving vehicle by using the trained dictionary.
Further, Haar features and the integral image concept are adopted for extracting the SURF feature points of the reference image and the image to be registered.
Further, the Euclidean distance between any SURF feature point in the reference image and the feature points in the image to be registered is calculated; the smaller the Euclidean distance, the higher the similarity, and when the Euclidean distance is smaller than a set threshold, the matching is judged successful. If a certain SURF feature point in the image to be registered matches multiple feature points in the reference image, the matching is regarded as unsuccessful.
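As a concrete illustration of this matching rule, the sketch below matches descriptors by Euclidean distance against a fixed threshold and rejects ambiguous matches. It is a minimal NumPy sketch, not the patented implementation; the 2-D vectors stand in for real 64-D SURF descriptors.

```python
import numpy as np

def match_features(ref_desc, reg_desc, threshold=6.0):
    """Match each descriptor of the image to be registered against the
    reference descriptors by Euclidean distance.

    A match succeeds only when exactly one reference descriptor lies
    within `threshold`; ambiguous matches are rejected, mirroring the
    rule that a point matching multiple reference points is unsuccessful.
    """
    matches = []
    for j, d in enumerate(reg_desc):
        dists = np.linalg.norm(ref_desc - d, axis=1)
        close = np.flatnonzero(dists < threshold)
        if close.size == 1:
            matches.append((int(close[0]), j))
    return matches

# toy 2-D "descriptors" standing in for 64-D SURF descriptors
ref = np.array([[0.0, 0.0], [100.0, 0.0], [0.0, 100.0]])
reg = np.array([[1.0, 0.0], [100.0, 1.0], [50.0, 50.0]])
pairs = match_features(ref, reg)  # the third point matches nothing
```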
Further, after the abnormal point elimination is completed, an image pyramid is introduced, and the global and local homography matrices are determined from the feature point pairing results in a top-down manner: first, an (L+1)-level pyramid of the reference image and the image to be registered is established; when determining the global homography matrix, the computation starts from the L-th level global homography matrix and the resolution is then increased step by step down to level 0, yielding the level 0 global homography matrix.
Further, define $(x_L, y_L)$ and $(x'_L, y'_L)$ as the corresponding coordinates of the L-th level reference image and image to be registered, where $x_L$ and $y_L$ are the x and y coordinates of the L-th level reference image, and $x'_L$ and $y'_L$ are the x and y coordinates of the L-th level image to be registered.
The L-th level global homography matrix is determined by:
$$ w_L \begin{bmatrix} x'_L \\ y'_L \\ 1 \end{bmatrix} = H^g_L \begin{bmatrix} x_L \\ y_L \\ 1 \end{bmatrix} $$
where $w_L$ is an intermediate variable, and the L-th level global homography matrix is defined by its elements as:
$$ H^g_L = \begin{bmatrix} h_1 & h_2 & h_3 \\ h_4 & h_5 & h_6 \\ h_7 & h_8 & 1 \end{bmatrix} $$
$H^g_L$ is determined as follows: 4 groups of feature point matching results are randomly selected each time to determine a homography matrix, and the $l_2$ norm is used to screen the remaining feature matching points according to:
$$ \left\| \begin{bmatrix} x'_L \\ y'_L \end{bmatrix} - \pi\!\left( H^g_L \begin{bmatrix} x_L \\ y_L \\ 1 \end{bmatrix} \right) \right\|_2 < t_r $$
where $\pi(\cdot)$ denotes division by the homogeneous coordinate and $t_r$ is the threshold for outlier screening. A remaining feature matching point that satisfies this inequality is regarded as a valid feature matching point; otherwise it is regarded as invalid. The homography matrix that maximizes the number of valid feature matching points is the finally determined L-th level global homography matrix $H^g_L$.
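The random selection of 4 point pairs followed by $l_2$-norm inlier screening can be sketched as follows. The direct linear transform (DLT) used to solve the 4-point homography and the fixed iteration count are standard choices for illustration, not details taken from the patent.

```python
import numpy as np

def homography_from_4pts(src, dst):
    # Direct linear transform: solve A h = 0 for the 9 entries of H.
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y, -u])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y, -v])
    _, _, vt = np.linalg.svd(np.array(rows))
    H = vt[-1].reshape(3, 3)
    return H / H[2, 2]

def project(H, pts):
    # Apply H to inhomogeneous points and divide by the third coordinate.
    p = np.column_stack([pts, np.ones(len(pts))]) @ H.T
    return p[:, :2] / p[:, 2:3]

def ransac_homography(src, dst, t_r=3.0, iters=50, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    best_H, best_inliers = None, np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(src), 4, replace=False)
        H = homography_from_4pts(src[idx], dst[idx])
        err = np.linalg.norm(project(H, src) - dst, axis=1)  # l2 screening
        inliers = err < t_r                                  # valid matches
        if inliers.sum() > best_inliers.sum():
            best_H, best_inliers = H, inliers
    return best_H, best_inliers

# synthetic matches: 9 points under a known homography plus one outlier
rng = np.random.default_rng(1)
src = rng.uniform(0, 100, (10, 2))
H_true = np.array([[1.1, 0.02, 5.0], [0.01, 0.95, -3.0], [5e-4, 2e-4, 1.0]])
dst = project(H_true, src)
dst[9] += 40.0          # one gross mismatch
H_est, inliers = ransac_homography(src, dst)
```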
The level L−1 homography matrix is obtained by increasing the image resolution. Introducing a scale factor $\mu$, the pixel points of the reference image and the image to be registered at level L−1 can be expressed as:
$$ x_{L-1} = \mu x_L,\quad y_{L-1} = \mu y_L,\quad x'_{L-1} = \mu x'_L,\quad y'_{L-1} = \mu y'_L $$
where $x_{L-1}$ and $y_{L-1}$ are the x and y coordinates of the level L−1 reference image, $x'_{L-1}$ and $y'_{L-1}$ are those of the level L−1 image to be registered, and $\mu$ is the scale factor. To find the level L−1 homography matrix, with $S = \operatorname{diag}(\mu, \mu, 1)$:
$$ H^g_{L-1} = S\, H^g_L\, S^{-1} $$
where $H^g_{L-1}$ is the level L−1 global homography matrix.
By applying this homography matrix derivation from level L to level L−1 repeatedly, the global homography matrix $H^g_0$ corresponding to level 0 can be obtained by gradually increasing the resolution, namely:
$$ w_0 \begin{bmatrix} x'_0 \\ y'_0 \\ 1 \end{bmatrix} = H^g_0 \begin{bmatrix} x_0 \\ y_0 \\ 1 \end{bmatrix}, \qquad H^g_0 = S_L\, H^g_L\, S_L^{-1},\quad S_L = \operatorname{diag}(\mu_L, \mu_L, 1) $$
where $x_0$ and $y_0$ are the x and y coordinates of the level 0 reference image, $x'_0$ and $y'_0$ are those of the level 0 image to be registered, and $\mu_L$ is the scale factor of the level 0 homography matrix.
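Under the standard pyramid relation between levels — coordinates at level L−1 are $\mu$ times those at level L — a homography can be propagated upward by conjugating with the scaling matrix $S = \operatorname{diag}(\mu, \mu, 1)$, i.e. $H_{L-1} = S H_L S^{-1}$. The small sketch below (an illustration under that assumption, not patent text) checks this numerically:

```python
import numpy as np

def upscale_homography(H_L, mu=2.0):
    # Conjugate by S = diag(mu, mu, 1): the same mapping expressed in
    # coordinates rescaled by mu.
    S = np.diag([mu, mu, 1.0])
    return S @ H_L @ np.linalg.inv(S)

def apply_h(H, p):
    # Homogeneous application of H to a 2-D point.
    q = H @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]

H_L = np.array([[1.2, 0.1, 3.0], [0.05, 0.9, -2.0], [1e-3, 2e-3, 1.0]])
p_L = np.array([10.0, 20.0])   # a point at level L
mu = 2.0
H_Lm1 = upscale_homography(H_L, mu)
```

Mapping the rescaled point with $H_{L-1}$ gives exactly the rescaled image of the original point, confirming the conjugation rule.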
Further, F(k-1) and F(k) are adopted to represent the (k-1)-th and k-th frames in the unmanned aerial vehicle image sequence, and $F_r(k-1)$ and $F_r(k)$ are the corresponding registered images; for the registered images $F_r(k-1)$ and $F_r(k)$, the region to be detected is determined by the 2-frame difference method.
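A minimal sketch of the 2-frame difference on registered frames follows; the threshold value and the synthetic frames are illustrative assumptions.

```python
import numpy as np

def frame_difference_mask(f_prev, f_curr, thresh=25):
    # Pixels whose absolute inter-frame difference exceeds the threshold
    # form the candidate region of the moving vehicle.
    diff = np.abs(f_curr.astype(np.int32) - f_prev.astype(np.int32))
    return (diff > thresh).astype(np.uint8)

# synthetic registered frames: a bright 3x3 "vehicle" shifting 2 px right
f1 = np.zeros((20, 20), dtype=np.uint8)
f2 = np.zeros((20, 20), dtype=np.uint8)
f1[5:8, 5:8] = 200
f2[5:8, 7:10] = 200
mask = frame_difference_mask(f1, f2)  # nonzero only where content changed
```

Note the overlap column of the two squares cancels out, so the mask marks only the vacated and newly occupied pixels — the regions that then bound the area to be detected.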
Further, the image is first divided into small connected regions, i.e., cell units; the histogram of gradient (or edge) directions of the pixels in each cell unit is then collected; finally, combining the features of these cell units forms a HOG feature descriptor. The image is converted from RGB color space to HSV color space, the H, S and V channel data templates of the image are extracted separately and stored as two-dimensional matrices $M_H$, $M_S$ and $M_V$, and the HOG features $H_H$, $H_S$ and $H_V$ of the three matrices are computed respectively.
Further, the three-channel HOG features are fused in a weighted manner, namely: $H_l = w_H H_H + w_S H_S + w_V H_V$, where $H_l$ represents the low-order feature of the vehicle; $w_H$, $w_S$ and $w_V$ are the weights of the HOG features $H_H$, $H_S$ and $H_V$ respectively, and $w_H + w_S + w_V = 1$. The weights of the three channels are determined adaptively from the respective channel data templates.
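The weighted fusion can be sketched as below. The patent's exact adaptive-weight formula is not reproduced in this text, so the sketch uses each channel's template variance as a stand-in weighting rule — this choice is an assumption, not the patented formula.

```python
import numpy as np

def fuse_hog(hogs, templates):
    # Adaptive weighted fusion of per-channel HOG vectors. Each channel's
    # weight is proportional to the variance of its data template (an
    # assumed stand-in), normalized so the weights sum to 1.
    var = np.array([t.var() for t in templates])
    w = var / var.sum()
    return sum(wi * h for wi, h in zip(w, hogs)), w

H_H, H_S, H_V = np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([1.0, 1.0])
M_H = np.array([[0.0, 1.0], [1.0, 0.0]])   # varying template
M_S = np.zeros((2, 2))                     # flat templates carry no weight
M_V = np.zeros((2, 2))
H_l, w = fuse_hog([H_H, H_S, H_V], [M_H, M_S, M_V])
```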
The low-order feature of the vehicle, i.e., the fused H, S, V three-channel HOG feature, is thus determined.
Further, when determining the high-order features, the context information of the vehicle is introduced. Positive and negative samples are manually selected to initialize a positive dictionary and a negative dictionary, and the final positive dictionary $D_p$ and negative dictionary $D_n$ are then determined according to dictionary learning and an autonomous sample-selection strategy. The high-order features are determined by computing the reconstruction error of the target region and the reconstruction errors of the other image blocks in its neighborhood.
For a vehicle $t_v$, the reconstruction error is denoted $e(t_v) = [e(t_v, D_p), e(t_v, D_n)]^T$, where $e(t_v, D_p)$ and $e(t_v, D_n)$ are the reconstruction errors of $t_v$ on the positive and negative dictionaries respectively. For a neighborhood image block $a_\iota$ of the vehicle, the reconstruction error is $e(a_\iota) = [e(a_\iota, D_p), e(a_\iota, D_n)]^T$, where the subscript $\iota$ indexes the image blocks in the neighborhood of the target vehicle $t_v$, and $e(a_\iota, D_p)$ and $e(a_\iota, D_n)$ are the reconstruction errors of $a_\iota$ on the positive and negative dictionaries. For a neighborhood image block $a_\iota$, the high-order feature of the target vehicle $t_v$ is defined as the distance between the reconstruction errors of $t_v$ and $a_\iota$, expressed as $H(t_v, a_\iota) = \| e(t_v) - e(a_\iota) \|_2$, where $H(t_v, a_\iota)$ is the high-order feature of $t_v$ relative to the neighborhood block $a_\iota$.
When the target vehicle $t_v$ has M image blocks in its neighborhood, its high-order feature is: $H_h = [H(t_v, a_1), H(t_v, a_2), \ldots, H(t_v, a_M)]^T$.
The obtained high-order and low-order features of the vehicle are fused to obtain the multi-order feature of the target vehicle: $F_v = [H_l, H_h]$; combining the low-order and high-order features of the vehicle yields the multi-order feature of the target vehicle.
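The reconstruction-error comparison above can be sketched in NumPy. Unconstrained least squares is used here in place of the sparse coding step — a simplifying assumption for illustration, not the patent's solver.

```python
import numpy as np

def recon_error(x, D):
    # Reconstruction error of sample x on dictionary D (columns = atoms).
    # Least squares stands in for sparse coding (an assumption).
    a, *_ = np.linalg.lstsq(D, x, rcond=None)
    return float(np.linalg.norm(x - D @ a))

def high_order_feature(t_v, neighbors, D_p, D_n):
    # H(t_v, a_i) = || e(t_v) - e(a_i) ||_2, errors taken on both
    # the positive and the negative dictionary.
    e_t = np.array([recon_error(t_v, D_p), recon_error(t_v, D_n)])
    return np.array([
        np.linalg.norm(e_t - np.array([recon_error(a, D_p),
                                       recon_error(a, D_n)]))
        for a in neighbors
    ])

D_p = np.array([[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]])  # spans first two axes
D_n = np.array([[0.0], [0.0], [1.0]])                  # spans the third axis
t_v = np.array([1.0, 2.0, 0.0])     # perfectly represented by D_p
a_1 = np.array([0.0, 0.0, 1.0])     # perfectly represented by D_n
H_h = high_order_feature(t_v, [a_1], D_p, D_n)
```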
Further, in the correlation-based dictionary learning algorithm, during the dictionary update stage the atoms involved in the sparse representation of a new sample are determined first, and only those atoms are updated, thereby introducing sparsity into the dictionary update stage. The update process is iterated until convergence, achieving fast and efficient dictionary training, and the detection of the moving vehicle is finally completed.
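A toy sketch of updating only the atoms that are active in a new sample's sparse code is given below; the gradient step and learning rate are illustrative choices, not the patent's exact update rule.

```python
import numpy as np

def update_used_atoms(D, x, a, lr=0.1):
    # Only atoms with nonzero coefficients in the sparse code `a` are
    # touched; each gets a gradient step along the residual, then is
    # renormalized to unit length.
    r = x - D @ a
    D_new = D.copy()
    for k in np.flatnonzero(a):
        D_new[:, k] += lr * a[k] * r
        D_new[:, k] /= np.linalg.norm(D_new[:, k])
    return D_new

D = np.eye(3)                      # 3 unit-norm atoms
x = np.array([1.0, 0.5, 0.0])      # new training sample
a = np.array([1.0, 1.0, 0.0])      # third atom is inactive in the code
D_up = update_used_atoms(D, x, a)
```

Skipping the inactive atoms is what keeps the update sparse and cheap, which is the point of the correlation-based scheme described above.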
Compared with the prior art, the invention has the following beneficial technical effects:
the invention discloses a moving vehicle detection method based on unmanned aerial vehicle aerial video, which comprises the steps of firstly adopting an SURF algorithm to carry out feature point matching and abnormal point elimination on an image, utilizing an unmanned aerial vehicle image registration algorithm combining a global homography matrix and a local homography matrix to obtain a conversion matrix, compensating adverse effects generated by movement of an onboard camera, then adopting a 2-frame difference method to reduce a region to be detected, traversing the region to be detected according to the center of a superpixel, further improving the moving vehicle detection efficiency, then utilizing a multichannel HOG feature algorithm to extract low-order features of a vehicle, introducing context information of the vehicle to obtain high-order features of the vehicle, fusing the two features to obtain multi-order features of a target vehicle, and finally combining the multi-order features and a dictionary learning algorithm to realize moving vehicle detection. The method can inhibit the influence caused by the motion of the airborne camera of the unmanned aerial vehicle, process the vehicle deformation and background interference in the image, and improve the robustness and real-time performance of the moving vehicle detection. 
The invention compensates the adverse effect generated by the movement of the airborne camera and lays a foundation for the detection of moving vehicles; the method combining the 2-frame difference method and the center traversal of the superpixel is adopted, so that the efficiency of acquiring the region to be detected is improved; aiming at the obtained region to be detected, when the low-order features of the vehicle are extracted, a multi-channel HOG feature extraction method is adopted, so that false detection and missing detection are reduced; when the high-order characteristics of the vehicle are extracted, the context information of the vehicle is introduced, so that the deformation and background interference of the vehicle are effectively inhibited, and the accuracy of detecting the moving vehicle is improved. The method for detecting the unmanned aerial vehicle video motion vehicle can realize accurate detection of the vehicle running on the road.
Furthermore, an image registration algorithm combining global and local homography matrixes is provided in a top-down mode according to the feature point pairing result. The global homography matrix describes global position changes, and the local homography matrix describes local position changes.
Furthermore, the 2-frame difference method is utilized to reduce the area to be detected, superpixel segmentation is introduced, the area to be detected is determined and scanned according to the center of the superpixel, and the calculated amount of detection of the moving vehicles is effectively reduced.
Furthermore, when the high-order features of the vehicle are extracted, firstly, positive and negative samples are manually selected to initialize the positive dictionary and the negative dictionary, then, after the final positive dictionary and the final negative dictionary are determined according to the dictionary learning and sample self-selection strategies, the high-order features are determined by calculating the reconstruction errors of the target area and the reconstruction errors of other image blocks in the neighborhood, the calculation amount of the dictionary learning is reduced, and further, the fast and efficient dictionary training is realized.
Drawings
FIG. 1 is a flow chart of the detection method in the embodiment of the present invention.
FIG. 2 is the image pyramid in an example of the invention.
FIG. 3 is a framework of a method for detecting moving vehicles based on image registration and superpixel segmentation according to an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings:
a moving vehicle detection method based on an unmanned aerial vehicle aerial video mainly aims to suppress influence caused by movement of an airborne camera of the unmanned aerial vehicle, process vehicle deformation and background interference in an image and improve robustness and real-time performance of moving vehicle detection. The invention is further described below with reference to the accompanying drawings.
Fig. 1 in the drawings shows a flow chart of the detection method of the present invention, and the specific implementation manner is as follows:
step 1), aerial photography is carried out on vehicles on a road with the unmanned aerial vehicle's onboard camera to obtain aerial videos, the continuous image sequence of the aerial video is extracted, the SURF feature points of the reference image and the image to be registered are extracted, and feature point matching is then performed. Since mismatches may still exist among the matched feature points, abnormal point elimination is further performed using a random sampling consensus algorithm:
specifically, Harr characteristics and integral image concepts are adopted for extracting SURF characteristic points of a reference image and an image to be registered. Finding out correctly matched feature points in the reference image and the image to be registered according to the following two principles:
1) Calculate the Euclidean distance between any SURF feature point in the reference image and the feature points in the image to be registered; the smaller the Euclidean distance, the higher the similarity, and when the Euclidean distance is smaller than a set threshold, the matching is judged successful; the threshold is taken to be 6.
2) If a certain SURF feature point in the image to be registered matches multiple feature points in the reference image, the matching is regarded as unsuccessful.
After the feature points are matched, mismatching still possibly exists, and in order to eliminate the mismatching, a random sampling consistency algorithm is adopted to eliminate abnormal points.
Step 2), aiming at the characteristic points after the abnormal points are removed, obtaining a conversion matrix of the image through an unmanned aerial vehicle image registration algorithm, and compensating the adverse effect of the movement of an onboard camera of the unmanned aerial vehicle on the image during shooting;
and after the abnormal points are eliminated, introducing an image pyramid, and determining a global homography matrix and a local homography matrix according to the feature point pairing result in a top-down mode. First, as shown in fig. 2 of the drawings, an L +1 level pyramid of a reference image and an image to be registered is established. The 0 th level is a reference image or an image to be registered, and the resolution is highest. When moving to the upper pyramid layer, both image size and resolution are reduced. At the top of the pyramid, the lth level, is the lowest resolution. When the global homography matrix is determined, the global homography matrix corresponding to the 0 th level can be obtained by starting from the global homography matrix of the L th level and then increasing the resolution step by step until the 0 th level.
Define $(x_L, y_L)$ and $(x'_L, y'_L)$ as the corresponding coordinates of the L-th level reference image and image to be registered, where $x_L$ and $y_L$ are the x and y coordinates of the L-th level reference image, and $x'_L$ and $y'_L$ are the x and y coordinates of the L-th level image to be registered.
The L-th level global homography matrix is determined by:
$$ w_L \begin{bmatrix} x'_L \\ y'_L \\ 1 \end{bmatrix} = H^g_L \begin{bmatrix} x_L \\ y_L \\ 1 \end{bmatrix} $$
where $w_L$ is an intermediate variable, and the L-th level global homography matrix is defined by its elements as:
$$ H^g_L = \begin{bmatrix} h_1 & h_2 & h_3 \\ h_4 & h_5 & h_6 \\ h_7 & h_8 & 1 \end{bmatrix} $$
$H^g_L$ is determined as follows: 4 groups of feature point matching results are randomly selected each time to determine a homography matrix, and the $l_2$ norm is used to screen the remaining feature matching points according to:
$$ \left\| \begin{bmatrix} x'_L \\ y'_L \end{bmatrix} - \pi\!\left( H^g_L \begin{bmatrix} x_L \\ y_L \\ 1 \end{bmatrix} \right) \right\|_2 < t_r $$
where $\pi(\cdot)$ denotes division by the homogeneous coordinate and $t_r$ is the threshold for outlier screening. Remaining feature matching points that satisfy the above inequality are regarded as valid feature matching points; those that do not are regarded as invalid. The homography matrix that maximizes the number of valid feature matching points is the finally determined L-th level global homography matrix $H^g_L$.
The level L−1 homography matrix can be obtained by increasing the image resolution. Introducing a scale factor $\mu$, the pixel points of the reference image and the image to be registered at level L−1 can be expressed as $x_{L-1} = \mu x_L$, $y_{L-1} = \mu y_L$, $x'_{L-1} = \mu x'_L$ and $y'_{L-1} = \mu y'_L$, where $x_{L-1}$ and $y_{L-1}$ are the coordinates of the level L−1 reference image, $x'_{L-1}$ and $y'_{L-1}$ those of the level L−1 image to be registered, and $\mu$ is the scale factor. To find the level L−1 homography matrix, with $S = \operatorname{diag}(\mu, \mu, 1)$:
$$ H^g_{L-1} = S\, H^g_L\, S^{-1} $$
By applying this homography matrix derivation from level L to level L−1 repeatedly, the global homography matrix $H^g_0$ corresponding to level 0 can be obtained by gradually increasing the resolution, namely:
$$ w_0 \begin{bmatrix} x'_0 \\ y'_0 \\ 1 \end{bmatrix} = H^g_0 \begin{bmatrix} x_0 \\ y_0 \\ 1 \end{bmatrix}, \qquad H^g_0 = S_L\, H^g_L\, S_L^{-1},\quad S_L = \operatorname{diag}(\mu_L, \mu_L, 1) $$
where $x_0$ and $y_0$ are the x and y coordinates of the level 0 reference image, $x'_0$ and $y'_0$ are those of the level 0 image to be registered, and $\mu_L$ is the scale factor of the level 0 homography matrix.
Taking level L−1 as an example with scale factor $\mu = 2$, image registration is realized by combining the global and local homography matrices. As shown in fig. 2 of the drawings, the level L−1 image is divided evenly into four blocks; the homography matrix corresponding to each sub-block is defined as a local homography matrix, and $H^l_{L-1,\zeta}$ denotes the local homography matrix of the ζ-th image block of level L−1. The local homography matrices are solved by the same algorithm as the global homography matrix: invalid feature matching points are further removed, and the local homography matrix is then determined.
For image block 1 of level L−1 in fig. 2 of the drawings, the level L−1 global homography matrix $H^g_{L-1}$ and the level L−1 local homography matrix $H^l_{L-1,1}$ are combined to obtain the coordinate transformation relation between image block 1 of the reference image and of the image to be registered; this combined mapping is abbreviated as the conversion matrix $F_{L-1,1}$ of image block 1 of the level L−1 image.
Similarly, for image blocks 2, 3 and 4 of level L−1 in fig. 2 of the drawings, the conversion matrices $F_{L-1,2}$, $F_{L-1,3}$ and $F_{L-1,4}$ map the corresponding coordinates of each image block of the level L−1 image to be registered to those of the level L−1 reference image.
Combining the conversion matrices $F_{L-1,1}$, $F_{L-1,2}$, $F_{L-1,3}$ and $F_{L-1,4}$ of the four level L−1 image blocks gives the coordinate transformation relation between the level L−1 reference image and image to be registered through the joint transformation matrix:
$$ F^c_{L-1} = \sum_{\zeta=1}^{4} \lambda_{L-1,\zeta}\, F_{L-1,\zeta} $$
where $F^c_{L-1}$ is the joint transformation matrix of the level L−1 image, $F_{L-1,\zeta}$ is the conversion matrix of the ζ-th image block of level L−1, and $\lambda_{L-1,\zeta}$ is the weight of that block's conversion matrix.
The resolution is increased step by step until level 0 of the image pyramid, giving the coordinate transformation relation between the reference image and the image to be registered through the joint transformation matrix $F^c_0$ of the level 0 image, i.e., the final conversion matrix combining the global and local homography matrices, which maps the corresponding coordinates of the level 0 image to be registered to those of the level 0 reference image.
Step 3), aiming at the image processed in the step 2), determining a to-be-detected area of the moving vehicle by adopting a 2-frame difference method; performing superpixel segmentation on the image, and determining a scanning frame according to the center of the superpixel to traverse the area to be detected;
As shown in FIG. 3, F(k-1) and F(k) respectively represent the (k-1)-th frame and the k-th frame of the drone image sequence, and F_r(k-1) and F_r(k) are the corresponding registered images. To reduce the amount of computation for moving vehicle detection, the 2-frame difference method is applied to the registered images F_r(k-1) and F_r(k) to determine the region to be detected, shown as rectangular boxes in the 2-frame-difference panel of FIG. 3. Taking 2 moving vehicles as an example, 4 regions to be detected are generated after applying the 2-frame difference method.
After the region to be detected is determined by the 2-frame difference method, the image is subjected to superpixel segmentation, a scanning frame is determined according to the superpixel centers, and the region to be detected is then traversed to detect moving vehicles. When traversing the region to be detected, affine transformations must be applied to the scanning frame to account for rotation, translation and similar changes of the target vehicle, reducing the miss rate of moving vehicle detection.
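A minimal sketch of the 2-frame difference step, assuming grayscale registered frames stored as NumPy arrays; the threshold value and the single bounding box (instead of per-vehicle connected-component labelling) are simplifications for illustration, not the patent's exact procedure.

```python
import numpy as np

def frame_difference_regions(f_prev, f_curr, thresh=25):
    """Threshold the absolute difference of two registered frames and
    return bounding boxes of the changed pixels as regions to detect."""
    diff = np.abs(f_curr.astype(np.int16) - f_prev.astype(np.int16))
    mask = diff > thresh
    ys, xs = np.nonzero(mask)
    if len(xs) == 0:
        return []
    # One box over all changed pixels for simplicity; a fuller version
    # would label connected components to get one box per moving vehicle.
    return [(int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max()))]

f1 = np.zeros((100, 100), dtype=np.uint8)
f2 = f1.copy()
f2[40:60, 50:70] = 200          # a "moving vehicle" appears in frame k
boxes = frame_difference_regions(f1, f2)
```

The returned boxes would then be traversed by the superpixel-centered scanning frame described above.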
Step 4), extracting the texture and color of the vehicle to form low-order features of the vehicle by using the image processed in the step 3); introducing context information of the vehicle, and extracting high-order characteristics of the vehicle; after the low-order characteristic and the high-order characteristic of the target vehicle are obtained, the low-order characteristic and the high-order characteristic are fused to obtain the multi-order characteristic of the target vehicle;
Specifically, the image is first divided into small connected regions, called cell units. The orientation histogram of the gradients or edges of the pixels in each cell unit is then computed. Finally, the features of these cell units are combined to form a HOG feature descriptor. The image is converted from RGB color space to HSV color space, HOG features are extracted from the three channels separately, and the features are then fused: the H, S and V channel data templates of the image are extracted and stored as two-dimensional matrices M_H, M_S and M_V, and the HOG features H_H, H_S and H_V of the three matrices are computed respectively. The three-channel HOG features are fused by weighting, namely: H_l = w_H·H_H + w_S·H_S + w_V·H_V, where H_l represents the low-order feature of the vehicle; w_H, w_S and w_V are respectively the weights of the HOG features H_H, H_S and H_V, with w_H + w_S + w_V = 1. The weight of each of the three channels is determined adaptively from its channel data template.
Thus the low-order feature of the vehicle, i.e. the fused H, S, V three-channel HOG feature, is determined.
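The channel-wise HOG fusion can be illustrated roughly as below. A simple global gradient-orientation histogram stands in for a full HOG descriptor, and uniform weights replace the adaptively determined w_H, w_S, w_V; both are assumptions made only for this sketch.

```python
import numpy as np

def orientation_histogram(channel, bins=9):
    """Gradient-orientation histogram of one channel (a stand-in for HOG)."""
    gy, gx = np.gradient(channel.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.arctan2(gy, gx) % np.pi             # unsigned orientation
    hist, _ = np.histogram(ang, bins=bins, range=(0, np.pi), weights=mag)
    s = hist.sum()
    return hist / s if s > 0 else hist

def low_order_feature(h, s, v, w=(1/3, 1/3, 1/3)):
    """Fuse per-channel histograms: H_l = wH*HH + wS*HS + wV*HV."""
    feats = [orientation_histogram(c) for c in (h, s, v)]
    return sum(wi * fi for wi, fi in zip(w, feats))

rng = np.random.default_rng(0)
h, s, v = (rng.random((16, 16)) for _ in range(3))   # toy HSV channels
Hl = low_order_feature(h, s, v)
```

In the patented method the weights would instead be derived from each channel's data template, and a blockwise HOG descriptor would replace the global histogram.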
When determining the high-order features, the context information of the vehicle is introduced. Positive and negative samples are manually selected to initialize a positive and a negative dictionary, and the final positive dictionary D_p and negative dictionary D_n are then determined by dictionary learning and a sample self-selection strategy. Next, the high-order features are determined by computing the reconstruction error of the target region and the reconstruction errors of the other image blocks in its neighborhood.
For a vehicle t_v, the reconstruction error is denoted e(t_v), with e(t_v) = [e(t_v, D_p), e(t_v, D_n)]^T, where e(t_v, D_p) and e(t_v, D_n) are respectively the reconstruction errors of t_v on the positive and negative dictionaries. For a neighborhood image block a_ι of the vehicle, the reconstruction error is e(a_ι) = [e(a_ι, D_p), e(a_ι, D_n)]^T, where the subscript ι indexes the image blocks in the neighborhood of the target vehicle t_v, and e(a_ι, D_p) and e(a_ι, D_n) are respectively the reconstruction errors of a_ι on the positive and negative dictionaries. For the neighborhood image block a_ι, the high-order feature of the target vehicle t_v is defined as the difference of the reconstruction errors of t_v and a_ι, expressed as H(t_v, a_ι) = ||e(t_v) − e(a_ι)||_2, where H(t_v, a_ι) is the high-order feature of the target vehicle t_v relative to the neighborhood block a_ι.
When there are M image blocks in the neighborhood of the target vehicle t_v, the high-order feature of t_v is: H_h = [H(t_v, a_1), H(t_v, a_2), …, H(t_v, a_M)]^T.
The obtained high-order feature and low-order feature of the vehicle are fused to obtain the multi-order feature of the target vehicle: F_v = [H_l, H_h].
Thus, the multi-order feature of the target vehicle is obtained by combining the low-order and high-order features of the vehicle.
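The high-order feature computation described above can be illustrated as follows, with plain least-squares coding standing in for true sparse coding on the dictionaries; the dictionary sizes and random data are illustrative assumptions only.

```python
import numpy as np

def recon_error(x, D):
    """Residual norm of x reconstructed on dictionary D (columns = atoms)."""
    alpha, *_ = np.linalg.lstsq(D, x, rcond=None)
    return np.linalg.norm(x - D @ alpha)

def high_order_feature(target, neighbours, Dp, Dn):
    """H(tv, a_i) = || e(tv) - e(a_i) ||_2 for each neighbourhood block."""
    e_t = np.array([recon_error(target, Dp), recon_error(target, Dn)])
    return np.array([
        np.linalg.norm(e_t - np.array([recon_error(a, Dp),
                                       recon_error(a, Dn)]))
        for a in neighbours
    ])

rng = np.random.default_rng(1)
Dp, Dn = rng.random((20, 5)), rng.random((20, 5))    # toy dictionaries
target = rng.random(20)                              # target vehicle patch
neighbours = [rng.random(20) for _ in range(3)]      # M = 3 neighbour blocks
Hh = high_order_feature(target, neighbours, Dp, Dn)
```

Concatenating `Hh` with the low-order feature `Hl` would then give the multi-order feature F_v = [H_l, H_h].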
And 5) training the dictionary by using a dictionary learning algorithm for the obtained multi-order features of the vehicle, and detecting the moving vehicle by using the trained dictionary.
Specifically, in the correlation-based dictionary learning algorithm, the dictionary update stage first determines the atoms involved in the sparse representation of the new sample and updates only those atoms, reducing the computational cost of dictionary learning. In addition, sparsity is introduced into the dictionary update stage. This process is iterated until convergence, realizing fast and efficient dictionary training and finally completing the detection of moving vehicles.
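A much-reduced sketch of the selective atom update idea: only the atoms with non-zero coefficients in the new sample's sparse code are re-fitted, while the rest of the dictionary is left untouched. The rank-one refit rule used here is an assumption for illustration and is not the patent's exact update.

```python
import numpy as np

def update_active_atoms(D, x, alpha, tol=1e-8):
    """Refit only the atoms that the sparse code alpha actually uses."""
    D = D.copy()
    active = np.nonzero(np.abs(alpha) > tol)[0]      # atoms used by the code
    for j in active:
        # residual with atom j's contribution removed
        residual = x - D @ alpha + D[:, j] * alpha[j]
        atom = residual / alpha[j]                   # least-squares refit
        n = np.linalg.norm(atom)
        if n > 0:
            D[:, j] = atom / n                       # keep atoms unit-norm
    return D, active

rng = np.random.default_rng(2)
D = rng.random((8, 4))                               # toy dictionary
x = rng.random(8)                                    # new training sample
alpha = np.array([0.0, 1.5, 0.0, -0.7])             # sparse code: 2 active atoms
D_new, active = update_active_atoms(D, x, alpha)
```

Iterating sparse coding and this selective update until convergence mirrors the fast training loop described above, at a fraction of the cost of updating every atom.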
Step 2) introduces an image pyramid and, proceeding top-down from the feature point pairing results, provides an image registration algorithm that combines global and local homography matrices: the global homography matrix describes global position changes, and the local homography matrices describe local position changes.
Step 3) introduces the 2-frame difference method and superpixel segmentation: the 2-frame difference method narrows the region to be detected, and the scanning frame determined from the superpixel centers traverses that region, effectively reducing the computational cost of moving vehicle detection.
The step 4) is implemented as follows: when the high-order features of the vehicle are extracted, firstly, positive and negative samples are manually selected to initialize a positive dictionary and a negative dictionary, then, after the final positive dictionary and the final negative dictionary are determined according to dictionary learning and sample autonomous selection strategies, the high-order features are determined by calculating the reconstruction errors of a target area and the reconstruction errors of other image blocks in a neighborhood.
Claims (10)
1. A moving vehicle detection method based on unmanned aerial vehicle aerial video is characterized by comprising the following steps:
step 1), acquiring an aerial video of a moving vehicle, extracting a continuous image sequence from the aerial video, then extracting SURF (speeded-up robust features) feature points of a reference image and an image to be registered, performing feature point matching, and eliminating abnormal points from the matched feature points by a random sample consensus (RANSAC) algorithm;
step 2), aiming at the characteristic points after the abnormal points are removed, obtaining a conversion matrix of the image through an unmanned aerial vehicle image registration algorithm;
step 3), aiming at the image processed in the step 2), determining a to-be-detected area of the moving vehicle by adopting a 2-frame difference method, performing superpixel segmentation on the image, and determining a scanning frame according to the center of the superpixel so as to traverse the to-be-detected area;
step 4), extracting the texture and color of the vehicle to form low-order features of the vehicle by using the image processed in the step 3); introducing context information of the vehicle, and extracting high-order characteristics of the vehicle; after the low-order characteristic and the high-order characteristic of the target vehicle are obtained, the low-order characteristic and the high-order characteristic are fused to obtain the multi-order characteristic of the target vehicle;
and 5) training the dictionary by using a dictionary learning algorithm for the obtained multi-order features of the vehicle, and detecting the moving vehicle by using the trained dictionary.
2. The method for detecting the moving vehicle based on the unmanned aerial vehicle aerial video of claim 1, wherein Haar features and the integral image concept are adopted for SURF feature point extraction on the reference image and the image to be registered.
3. The method for detecting the moving vehicle based on the unmanned aerial vehicle aerial video according to claim 2, wherein the Euclidean distance between each SURF feature point in the reference image and the feature points in the image to be registered is calculated; the smaller the Euclidean distance, the higher the similarity, and matching is judged successful when the Euclidean distance is smaller than a set threshold; if a SURF feature point in the image to be registered matches multiple feature points in the reference image, the match is regarded as unsuccessful.
4. The method for detecting the moving vehicle based on the unmanned aerial vehicle aerial video according to claim 1, wherein, after the abnormal point elimination is completed, an image pyramid is introduced and the global and local homography matrices are determined from the feature point pairing results in a top-down manner: an (L+1)-level pyramid of the reference image and the image to be registered is first established; when determining the global homography matrix, the process starts from the level L global homography matrix, and the resolution is then gradually increased down to level 0, yielding the level 0 global homography matrix.
5. The method according to claim 4, wherein p'_L and p_L are defined to respectively represent the corresponding coordinates of the level L reference image and image to be registered, where x'_L is the x coordinate of the level L reference image, y'_L is the y coordinate of the level L reference image, x_L is the x coordinate of the level L image to be registered, and y_L is the y coordinate of the level L image to be registered:
the level L global homography matrix H^g_L is determined by the relation w_L [x'_L, y'_L, 1]^T = H^g_L [x_L, y_L, 1]^T, where w_L is an intermediate (homogeneous scale) variable and H^g_L is defined element-wise by its matrix entries;
H^g_L is determined as follows: 4 groups of matched feature points are randomly selected each time to determine a candidate homography matrix, and the remaining matched feature points are screened with the l2 norm according to ||p'_L − H^g_L p_L||_2 < t_r, where t_r is the threshold for outlier screening; a matched feature point pair satisfying this inequality is regarded as a valid feature match, otherwise as invalid; the homography matrix attaining the maximum number of valid feature matches is the finally determined level L global homography matrix H^g_L;
the level L-1 homography matrix is obtained by increasing the image resolution: introducing a scale factor μ, the pixel points corresponding to level L-1 of the reference image and the image to be registered can be expressed as x'_{L-1} = μ·x'_L, y'_{L-1} = μ·y'_L, x_{L-1} = μ·x_L, y_{L-1} = μ·y_L, where x'_{L-1} and y'_{L-1} are the x and y coordinates of the level L-1 reference image, x_{L-1} and y_{L-1} are the x and y coordinates of the level L-1 image to be registered, and μ is the scale factor; the level L-1 homography matrix then satisfies w_{L-1} [x'_{L-1}, y'_{L-1}, 1]^T = H^g_{L-1} [x_{L-1}, y_{L-1}, 1]^T with H^g_{L-1} = S H^g_L S^{-1} and S = diag(μ, μ, 1), where H^g_{L-1} is the level L-1 global homography matrix;
by repeatedly applying this level L to level L-1 homography derivation, the global homography matrix H^g_0 corresponding to level 0 can be obtained by gradually increasing the resolution, namely: w_0 [x'_0, y'_0, 1]^T = H^g_0 [x_0, y_0, 1]^T, where x'_0 and y'_0 are the x and y coordinates of the level 0 reference image, x_0 and y_0 are the x and y coordinates of the level 0 image to be registered, and μ^L is the accumulated scale factor of the level 0 homography matrix.
6. The method for detecting the moving vehicle based on the aerial video of the unmanned aerial vehicle as claimed in claim 1, wherein F(k-1) and F(k) respectively represent the (k-1)-th frame and the k-th frame of the unmanned aerial vehicle image sequence, and F_r(k-1) and F_r(k) are the corresponding registered images; the region to be detected is determined by applying the 2-frame difference method to the registered images F_r(k-1) and F_r(k).
7. The method for detecting the moving vehicle based on the unmanned aerial vehicle aerial video according to claim 1, wherein the image is first divided into small connected regions, namely cell units; the orientation histogram of the gradients or edges of the pixels in each cell unit is then collected; and the features of these cell units are finally combined to form a HOG feature descriptor: the image is converted from RGB color space to HSV color space, the H, S, V three-channel data templates of the image are extracted separately and stored as two-dimensional matrices M_H, M_S and M_V, and the HOG features H_H, H_S and H_V of the three matrices are computed respectively.
8. The method for detecting the moving vehicle based on the aerial video of the unmanned aerial vehicle as claimed in claim 7, wherein the three-channel HOG features are fused by weighting, namely: H_l = w_H·H_H + w_S·H_S + w_V·H_V, where H_l represents the low-order feature of the vehicle; w_H, w_S and w_V are respectively the weights of the HOG features H_H, H_S and H_V, with w_H + w_S + w_V = 1; the weight of each of the three channels is determined adaptively from its channel data template, yielding the low-order feature of the vehicle, i.e. the fused H, S, V three-channel HOG feature.
9. The method for detecting the moving vehicle based on the unmanned aerial vehicle aerial video according to claim 7, wherein context information of the vehicle is introduced when determining the high-order features; positive and negative samples are manually selected to initialize a positive and a negative dictionary, and the final positive dictionary D_p and negative dictionary D_n are then determined by dictionary learning and a self-selection strategy; the high-order features are determined by computing the reconstruction error of the target region and the reconstruction errors of the other image blocks in its neighborhood;
for a vehicle t_v, the reconstruction error is denoted e(t_v), with e(t_v) = [e(t_v, D_p), e(t_v, D_n)]^T, where e(t_v, D_p) and e(t_v, D_n) are respectively the reconstruction errors of t_v on the positive and negative dictionaries; for a neighborhood image block a_ι of the vehicle, the reconstruction error is e(a_ι) = [e(a_ι, D_p), e(a_ι, D_n)]^T, where the subscript ι indexes the image blocks in the neighborhood of the target vehicle t_v, and e(a_ι, D_p) and e(a_ι, D_n) are respectively the reconstruction errors of a_ι on the positive and negative dictionaries; for a neighborhood image block a_ι, the high-order feature of the target vehicle t_v is defined as the difference of the reconstruction errors of t_v and a_ι, expressed as H(t_v, a_ι) = ||e(t_v) − e(a_ι)||_2, where H(t_v, a_ι) is the high-order feature of the target vehicle t_v relative to the neighborhood block a_ι;
when there are M image blocks in the neighborhood of the target vehicle t_v, the high-order feature of t_v is: H_h = [H(t_v, a_1), H(t_v, a_2), …, H(t_v, a_M)]^T;
and the obtained high-order feature and low-order feature of the vehicle are fused to give the multi-order feature of the target vehicle: F_v = [H_l, H_h].
10. The method for detecting the moving vehicle based on the unmanned aerial vehicle aerial video, wherein, in the correlation-based dictionary learning algorithm, the dictionary update stage first determines the atoms involved in the sparse representation of the new sample and updates only those atoms; sparsity is introduced into the dictionary update stage; and the update process is iterated until convergence, realizing fast and efficient dictionary training and finally completing the detection of moving vehicles.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811203391.1A CN109376641B (en) | 2018-10-16 | 2018-10-16 | Moving vehicle detection method based on unmanned aerial vehicle aerial video |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109376641A CN109376641A (en) | 2019-02-22 |
CN109376641B true CN109376641B (en) | 2021-04-27 |
Family
ID=65400009
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811203391.1A Active CN109376641B (en) | 2018-10-16 | 2018-10-16 | Moving vehicle detection method based on unmanned aerial vehicle aerial video |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109376641B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110136104B (en) * | 2019-04-25 | 2021-04-13 | 上海交通大学 | Image processing method, system and medium based on unmanned aerial vehicle ground station |
CN110598613B (en) * | 2019-09-03 | 2022-10-25 | 长安大学 | Expressway agglomerate fog monitoring method |
CN112749779A (en) * | 2019-10-30 | 2021-05-04 | 北京市商汤科技开发有限公司 | Neural network processing method and device, electronic equipment and computer storage medium |
CN111552269B (en) * | 2020-04-27 | 2021-05-28 | 武汉工程大学 | Multi-robot safety detection method and system based on attitude estimation |
CN111612966B (en) * | 2020-05-21 | 2021-05-07 | 广东乐佳印刷有限公司 | Bill certificate anti-counterfeiting detection method and device based on image recognition |
CN111881853B (en) * | 2020-07-31 | 2022-09-16 | 中北大学 | Method and device for identifying abnormal behaviors in oversized bridge and tunnel |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105554456A (en) * | 2015-12-21 | 2016-05-04 | 北京旷视科技有限公司 | Video processing method and apparatus |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8913783B2 (en) * | 2009-10-29 | 2014-12-16 | Sri International | 3-D model based method for detecting and classifying vehicles in aerial imagery |
- 2018-10-16 CN CN201811203391.1A patent/CN109376641B/en active Active
Patent Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105554456A (en) * | 2015-12-21 | 2016-05-04 | 北京旷视科技有限公司 | Video processing method and apparatus |
Non-Patent Citations (2)
Title |
---|
Vehicle detection in high-resolution aerial images based on fast sparse representation classification and multiorder feature;Chen Z 等;《IEEE transactions on intelligent transportation systems》;20160218;第17卷(第8期);第2296-2309页 * |
无人机航拍视频中的车辆检测方法;王素琴 等;《系统仿真学报》;20180731;第30卷(第07期);第359-369页 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
TR01 | Transfer of patent right | ||
Effective date of registration: 20211224 Address after: 908, block a, floor 8, No. 116, Zizhuyuan Road, Haidian District, Beijing 100089 Patentee after: ZHONGZI DATA CO.,LTD. Patentee after: China Highway Engineering Consulting Group Co., Ltd. Address before: 710064 middle section of South Second Ring Road, Beilin District, Xi'an City, Shaanxi Province Patentee before: CHANG'AN University |