CN109376641B - Moving vehicle detection method based on unmanned aerial vehicle aerial video - Google Patents

Moving vehicle detection method based on unmanned aerial vehicle aerial video

Info

Publication number
CN109376641B
CN109376641B (application CN201811203391.1A)
Authority
CN
China
Prior art keywords
image
vehicle
level
order
registered
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811203391.1A
Other languages
Chinese (zh)
Other versions
CN109376641A (en)
Inventor
Zhu Xu
Sun Siqi
Xu Wei
Yan Maode
Yang Panpan
Zuo Lei
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
CHINA HIGHWAY ENGINEERING CONSULTING GROUP Co Ltd
CHECC Data Co Ltd
Original Assignee
Changan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chang'an University
Priority to CN201811203391.1A
Publication of CN109376641A
Application granted
Publication of CN109376641B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G06V 20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V 20/42 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/40 Extraction of image or video features
    • G06V 10/50 Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/40 Scenes; Scene-specific elements in video content
    • G06V 20/46 Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 20/00 Scenes; Scene-specific elements
    • G06V 20/50 Context or environment of the image
    • G06V 20/56 Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V 20/58 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • G06V 20/584 Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads of vehicle lights or traffic lights

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a moving vehicle detection method based on unmanned aerial vehicle aerial video. The method first applies the SURF algorithm to match feature points between images and to eliminate abnormal points, then obtains a transformation matrix with an unmanned aerial vehicle image registration algorithm that combines a global homography matrix with local homography matrices, compensating the adverse effects produced by the motion of the onboard camera. A 2-frame difference method then narrows the region to be detected, which is traversed according to superpixel centers, further improving the efficiency of moving vehicle detection. Next, a multi-channel HOG feature algorithm extracts the low-order features of the vehicle, context information of the vehicle is introduced to obtain its high-order features, and the two are fused into the multi-order features of the target vehicle. Finally, the multi-order features are combined with a dictionary learning algorithm to detect moving vehicles. The method can suppress the influence of the motion of the unmanned aerial vehicle's onboard camera, handle vehicle deformation and background interference in the image, and improve the robustness and real-time performance of moving vehicle detection.

Description

Moving vehicle detection method based on unmanned aerial vehicle aerial video
Technical Field
The invention relates to a method for detecting a moving vehicle, in particular to a method for detecting a moving vehicle based on an unmanned aerial vehicle aerial video.
Background
Unmanned aerial vehicle (UAV) aerial photography is a novel means of remote sensing data acquisition, with unique advantages: flexible deployment, a large monitoring range, fine-grained information acquisition, and freedom from ground traffic interference. A UAV's flight speed and altitude are adjustable, its viewing angle is flexible, and it acquires ground traffic image information efficiently, at low cost and low risk, enabling traffic monitoring at scales from local areas to wide areas. With the further development and fusion of UAV aerial photography and image processing technology, the reasonable use and analysis of UAV imagery has broad application prospects in traffic planning, design, and management.
Commonly used moving vehicle detection methods include background extraction, the optical flow method, and others. Background extraction, however, is extremely sensitive to illumination and background variation, and the optical flow method is computationally expensive. To improve the robustness of moving vehicle detection, some scholars have built dynamic Bayesian networks and detected vehicles with a sliding-window method; although this achieves a certain effect, the computational load of sliding windows remains too large, which limits its application.
Thus, although many moving vehicle detection algorithms exist and achieve a certain detection effect, the stability, robustness, and real-time performance of moving vehicle detection based on UAV aerial video still need improvement.
Disclosure of Invention
The invention aims to provide a moving vehicle detection method based on an unmanned aerial vehicle aerial video, so as to overcome the defects of the prior art.
In order to achieve the purpose, the invention adopts the following technical scheme:
a moving vehicle detection method based on unmanned aerial vehicle aerial video comprises the following steps:
step 1), acquiring an aerial video of a moving vehicle, extracting a continuous image sequence of the aerial video, then extracting SURF (speeded up robust features) feature points of a reference image and an image to be registered, then performing feature point matching, and performing abnormal point elimination on the matched feature points by adopting a random sampling consistency algorithm;
step 2), aiming at the characteristic points after the abnormal points are removed, obtaining a conversion matrix of the image through an unmanned aerial vehicle image registration algorithm;
step 3), aiming at the image processed in the step 2), determining a to-be-detected area of the moving vehicle by adopting a 2-frame difference method, performing superpixel segmentation on the image, and determining a scanning frame according to the center of the superpixel so as to traverse the to-be-detected area;
step 4), extracting the texture and color of the vehicle to form low-order features of the vehicle by using the image processed in the step 3); introducing context information of the vehicle, and extracting high-order characteristics of the vehicle; after the low-order characteristic and the high-order characteristic of the target vehicle are obtained, the low-order characteristic and the high-order characteristic are fused to obtain the multi-order characteristic of the target vehicle;
and 5) training the dictionary by using a dictionary learning algorithm for the obtained multi-order features of the vehicle, and detecting the moving vehicle by using the trained dictionary.
Further, Haar features and the integral image concept are adopted to extract the SURF feature points of the reference image and the image to be registered.
Further, the Euclidean distance between any SURF feature point in the reference image and the feature points in the image to be registered is calculated; the smaller the Euclidean distance, the higher the similarity, and when the Euclidean distance is smaller than a set threshold the match is judged successful. If a SURF feature point in the image to be registered matches multiple feature points in the reference image, the match is regarded as unsuccessful.
Further, after the abnormal points are eliminated, an image pyramid is introduced, and the global and local homography matrices are determined from the feature point pairing results in a top-down manner: first, an (L+1)-level pyramid of the reference image and the image to be registered is established; when determining the global homography matrix, one starts from the L-th level global homography matrix and increases the resolution step by step down to level 0, thereby obtaining the level-0 global homography matrix.
Further, define $(x_r^L, y_r^L)$ and $(x_t^L, y_t^L)$ as the corresponding coordinates of the L-th level reference image and image to be registered, where $x_r^L$ and $y_r^L$ are the x- and y-coordinates of the L-th level reference image, and $x_t^L$ and $y_t^L$ are the x- and y-coordinates of the L-th level image to be registered.
the L-th level global homography matrix is determined by:
Figure BDA0001830584680000037
wherein, wLIs an intermediate variable and has
Figure BDA0001830584680000038
Figure BDA0001830584680000039
The L level global homography matrix is defined by the matrix elements as follows:
Figure BDA00018305846800000310
is abbreviated as
Figure BDA00018305846800000311
$H_g^L$ is determined as follows: four groups of feature point matching results are randomly selected each time to determine a homography matrix, and the $l_2$ norm is used to screen the remaining feature matching points according to

$$ \left\| \begin{bmatrix} x_r^L \\ y_r^L \end{bmatrix} - \begin{bmatrix} \hat{x}_r^L \\ \hat{y}_r^L \end{bmatrix} \right\|_2 \le t_r $$

where $(\hat{x}_r^L, \hat{y}_r^L)$ is the projection of $(x_t^L, y_t^L)$ under the candidate homography matrix and $t_r$ is the threshold for outlier screening. A remaining feature matching point satisfying this formula is regarded as a valid feature matching point; otherwise it is regarded as invalid. The homography matrix for which the number of valid feature matching points is greatest is the finally determined L-th level global homography matrix $H_g^L$.
The homography matrix of level L-1 is obtained by increasing the image resolution. Introducing a scale factor $\mu$, the corresponding pixel points at level L-1 of the reference image and the image to be registered can be expressed as

$$ \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \end{bmatrix} = \mu \begin{bmatrix} x_r^{L} \\ y_r^{L} \end{bmatrix}, \qquad \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \end{bmatrix} = \mu \begin{bmatrix} x_t^{L} \\ y_t^{L} \end{bmatrix} $$

where $x_r^{L-1}$ and $y_r^{L-1}$ are the x- and y-coordinates of the (L-1)-th level reference image, and $x_t^{L-1}$ and $y_t^{L-1}$ are the x- and y-coordinates of the (L-1)-th level image to be registered. To find the homography matrix of level L-1, one has

$$ w^{L-1} \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \\ 1 \end{bmatrix} = S\, H_g^{L}\, S^{-1} \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \\ 1 \end{bmatrix}, \qquad S = \begin{bmatrix} \mu & 0 & 0 \\ 0 & \mu & 0 \\ 0 & 0 & 1 \end{bmatrix}. $$

Letting $H_g^{L-1} = S H_g^{L} S^{-1}$, the above formula can be rewritten as

$$ w^{L-1} \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \\ 1 \end{bmatrix} = H_g^{L-1} \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \\ 1 \end{bmatrix} $$

where $H_g^{L-1}$ is the global homography matrix of level L-1.

By applying this level-L-to-level-(L-1) derivation repeatedly and increasing the resolution step by step, the global homography matrix $H_g^{0}$ corresponding to level 0 is obtained, namely

$$ w^{0} \begin{bmatrix} x_r^{0} \\ y_r^{0} \\ 1 \end{bmatrix} = H_g^{0} \begin{bmatrix} x_t^{0} \\ y_t^{0} \\ 1 \end{bmatrix}, \qquad H_g^{0} = S_L\, H_g^{L}\, S_L^{-1}, \quad S_L = \begin{bmatrix} \mu_L & 0 & 0 \\ 0 & \mu_L & 0 \\ 0 & 0 & 1 \end{bmatrix} $$

where $x_r^{0}$ and $y_r^{0}$ are the x- and y-coordinates of the level-0 reference image, $x_t^{0}$ and $y_t^{0}$ are those of the level-0 image to be registered, and $\mu_L = \mu^{L}$ is the scale factor of the level-0 homography matrix.
Further, $F(k-1)$ and $F(k)$ denote the (k-1)-th and k-th frames of the unmanned aerial vehicle image sequence, and $F_r(k-1)$ and $F_r(k)$ are the registered images; the region to be detected is determined from the registered images $F_r(k-1)$ and $F_r(k)$ by the 2-frame difference method.
Further, the image is first divided into small connected regions, i.e. cell units; the histogram of gradient or edge orientations of each pixel point within a cell unit is then collected; finally, combining the features of these cell units forms the HOG feature descriptor. The image is converted from the RGB color space to the HSV color space, the H, S, V three-channel data templates of the image are extracted separately and stored as two-dimensional matrices $M_H$, $M_S$ and $M_V$, and the HOG features $H_H$, $H_S$ and $H_V$ of the three matrices are computed respectively.
Further, the three-channel HOG features are fused by weighting, namely $H_l = w_H H_H + w_S H_S + w_V H_V$, where $H_l$ denotes the low-order feature of the vehicle, $w_H$, $w_S$ and $w_V$ are the weights of the HOG features $H_H$, $H_S$ and $H_V$, and $w_H + w_S + w_V = 1$. The weights of the three channels are determined adaptively from the respective channel data templates. The low-order feature of the vehicle, i.e. the fused H, S, V three-channel HOG feature, is thus determined.
Further, when determining the high-order features, the context information of the vehicle is introduced. Positive and negative samples are manually selected to initialize the positive and negative dictionaries, and the final positive dictionary $D_p$ and negative dictionary $D_n$ are then determined according to dictionary learning and an autonomous sample selection strategy. The high-order features are determined by calculating the reconstruction error of the target region and the reconstruction errors of the other image blocks in its neighborhood.

For a vehicle $t_v$, the reconstruction error is denoted $e(t_v) = [e(t_v, D_p), e(t_v, D_n)]^T$, where $e(t_v, D_p)$ and $e(t_v, D_n)$ are the reconstruction errors of $t_v$ on the positive and negative dictionaries. For a neighborhood image block $a_\iota$ of the vehicle, the reconstruction error is $e(a_\iota) = [e(a_\iota, D_p), e(a_\iota, D_n)]^T$, where the subscript $\iota$ indexes the image blocks in the neighborhood of the target vehicle $t_v$, and $e(a_\iota, D_p)$ and $e(a_\iota, D_n)$ are the reconstruction errors of $a_\iota$ on the positive and negative dictionaries. For a neighborhood image block $a_\iota$, the high-order feature of the target vehicle $t_v$ is defined as the difference of the reconstruction errors of $t_v$ and $a_\iota$, expressed as $H(t_v, a_\iota) = \|e(t_v) - e(a_\iota)\|_2$, where $H(t_v, a_\iota)$ is the high-order feature of target vehicle $t_v$ relative to neighborhood $a_\iota$.

When the target vehicle $t_v$ has M image blocks in its neighborhood, the high-order feature of $t_v$ is $H_h = [H(t_v, a_1), H(t_v, a_2), \dots, H(t_v, a_M)]^T$.

The obtained high-order and low-order features of the vehicle are fused to form the multi-order feature of the target vehicle: $F_v = [H_l, H_h]$. The multi-order features of the target vehicle are thus obtained by combining its low-order and high-order features.
Further, in the correlation-based dictionary learning algorithm, the dictionary update stage first determines the atoms involved in the sparse representation of the new samples and updates only those atoms; sparsity is introduced into the dictionary update stage; the update process is iterated until convergence, achieving fast and efficient dictionary training and finally completing moving vehicle detection.
Compared with the prior art, the invention has the following beneficial technical effects:
the invention discloses a moving vehicle detection method based on unmanned aerial vehicle aerial video, which comprises the steps of firstly adopting an SURF algorithm to carry out feature point matching and abnormal point elimination on an image, utilizing an unmanned aerial vehicle image registration algorithm combining a global homography matrix and a local homography matrix to obtain a conversion matrix, compensating adverse effects generated by movement of an onboard camera, then adopting a 2-frame difference method to reduce a region to be detected, traversing the region to be detected according to the center of a superpixel, further improving the moving vehicle detection efficiency, then utilizing a multichannel HOG feature algorithm to extract low-order features of a vehicle, introducing context information of the vehicle to obtain high-order features of the vehicle, fusing the two features to obtain multi-order features of a target vehicle, and finally combining the multi-order features and a dictionary learning algorithm to realize moving vehicle detection. The method can inhibit the influence caused by the motion of the airborne camera of the unmanned aerial vehicle, process the vehicle deformation and background interference in the image, and improve the robustness and real-time performance of the moving vehicle detection. The invention compensates the adverse effect generated by the movement of the airborne camera and lays a foundation for the detection of moving vehicles; the method combining the 2-frame difference method and the center traversal of the superpixel is adopted, so that the efficiency of acquiring the region to be detected is improved; aiming at the obtained region to be detected, when the low-order features of the vehicle are extracted, a multi-channel HOG feature extraction method is adopted, so that false detection and missing detection are reduced; when the high-order characteristics of the vehicle are extracted, the context information of the vehicle is introduced, so that the deformation and background interference of the vehicle are effectively inhibited, and the accuracy of detecting the moving vehicle is improved. The method for detecting the unmanned aerial vehicle video motion vehicle can realize accurate detection of the vehicle running on the road.
Furthermore, an image registration algorithm combining global and local homography matrixes is provided in a top-down mode according to the feature point pairing result. The global homography matrix describes global position changes, and the local homography matrix describes local position changes.
Furthermore, the 2-frame difference method is utilized to reduce the area to be detected, superpixel segmentation is introduced, the area to be detected is determined and scanned according to the center of the superpixel, and the calculated amount of detection of the moving vehicles is effectively reduced.
Furthermore, when extracting the high-order features of the vehicle, positive and negative samples are first manually selected to initialize the positive and negative dictionaries; after the final positive and negative dictionaries are determined according to dictionary learning and an autonomous sample selection strategy, the high-order features are determined by calculating the reconstruction error of the target region and the reconstruction errors of the other image blocks in the neighborhood, which reduces the computational load of dictionary learning and enables fast, efficient dictionary training.
Drawings
FIG. 1 is a flow chart of the detection method in the embodiment of the present invention.
FIG. 2 is the image pyramid in an example of the invention.
FIG. 3 is a framework of a method for detecting moving vehicles based on image registration and superpixel segmentation according to an embodiment of the present invention.
Detailed Description
The invention is described in further detail below with reference to the accompanying drawings:
a moving vehicle detection method based on an unmanned aerial vehicle aerial video mainly aims to suppress influence caused by movement of an airborne camera of the unmanned aerial vehicle, process vehicle deformation and background interference in an image and improve robustness and real-time performance of moving vehicle detection. The invention is further described below with reference to the accompanying drawings.
Fig. 1 in the drawings shows a flow chart of the detection method of the present invention, and the specific implementation manner is as follows:
Step 1): vehicles on the road are photographed from the air with the UAV's onboard camera to obtain the aerial video, a continuous image sequence is extracted from it, SURF feature points of the reference image and the image to be registered are extracted, and feature point matching is performed. Mismatches may remain among the matched feature points, so abnormal points are further eliminated with the random sample consensus (RANSAC) algorithm:
specifically, Harr characteristics and integral image concepts are adopted for extracting SURF characteristic points of a reference image and an image to be registered. Finding out correctly matched feature points in the reference image and the image to be registered according to the following two principles:
1) calculating the Euclidean distance between any SURF characteristic point in the reference image and the characteristic point in the image to be registered; the smaller the Euclidean distance is, the higher the similarity is, and when the Euclidean distance is smaller than a set threshold value, the matching is judged to be successful; the threshold was taken to be 6.
2) If a SURF feature point in the image to be registered matches multiple feature points in the reference image, the match is regarded as unsuccessful.
After feature point matching, mismatches may still exist; to eliminate them, abnormal points are removed with the random sample consensus algorithm.
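As an illustration of step 1), a minimal Python/OpenCV sketch of the SURF matching and RANSAC screening might look as follows. It assumes opencv-contrib-python with the non-free xfeatures2d module; the Hessian threshold, distance threshold, and RANSAC reprojection threshold are illustrative choices rather than values fixed by the patent (the scale of the distance test also depends on descriptor normalization).

```python
import cv2
import numpy as np

# Sketch of step 1: SURF feature extraction and matching between the reference
# frame and the frame to be registered, then RANSAC outlier elimination.
def match_and_screen(ref_gray, tgt_gray, dist_thresh=6.0):
    surf = cv2.xfeatures2d.SURF_create(hessianThreshold=400)  # needs contrib build
    kp_r, des_r = surf.detectAndCompute(ref_gray, None)
    kp_t, des_t = surf.detectAndCompute(tgt_gray, None)

    # One-to-one Euclidean-distance matching; crossCheck discards points that
    # would match more than one candidate, mirroring rule 2) above.
    matcher = cv2.BFMatcher(cv2.NORM_L2, crossCheck=True)
    matches = [m for m in matcher.match(des_t, des_r) if m.distance < dist_thresh]
    if len(matches) < 4:
        return None, None                     # too few pairs for a homography

    pts_t = np.float32([kp_t[m.queryIdx].pt for m in matches])
    pts_r = np.float32([kp_r[m.trainIdx].pt for m in matches])

    # Random sample consensus removes the remaining abnormal (mismatched) pairs.
    _, mask = cv2.findHomography(pts_t, pts_r, cv2.RANSAC, 5.0)
    keep = mask.ravel().astype(bool)
    return pts_t[keep], pts_r[keep]
```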
Step 2), aiming at the characteristic points after the abnormal points are removed, obtaining a conversion matrix of the image through an unmanned aerial vehicle image registration algorithm, and compensating the adverse effect of the movement of an onboard camera of the unmanned aerial vehicle on the image during shooting;
and after the abnormal points are eliminated, introducing an image pyramid, and determining a global homography matrix and a local homography matrix according to the feature point pairing result in a top-down mode. First, as shown in fig. 2 of the drawings, an L +1 level pyramid of a reference image and an image to be registered is established. The 0 th level is a reference image or an image to be registered, and the resolution is highest. When moving to the upper pyramid layer, both image size and resolution are reduced. At the top of the pyramid, the lth level, is the lowest resolution. When the global homography matrix is determined, the global homography matrix corresponding to the 0 th level can be obtained by starting from the global homography matrix of the L th level and then increasing the resolution step by step until the 0 th level.
Define $(x_r^L, y_r^L)$ and $(x_t^L, y_t^L)$ as the corresponding coordinates of the L-th level reference image and image to be registered, where $x_r^L$ and $y_r^L$ are the x- and y-coordinates of the L-th level reference image, and $x_t^L$ and $y_t^L$ are the x- and y-coordinates of the L-th level image to be registered.
The L-th level global homography matrix is determined by:

$$ w^L \begin{bmatrix} x_r^L \\ y_r^L \\ 1 \end{bmatrix} = H_g^L \begin{bmatrix} x_t^L \\ y_t^L \\ 1 \end{bmatrix} $$

where $w^L$ is an intermediate variable with $w^L = h_{31}^L x_t^L + h_{32}^L y_t^L + h_{33}^L$, and the L-th level global homography matrix is defined element-wise as

$$ H_g^L = \begin{bmatrix} h_{11}^L & h_{12}^L & h_{13}^L \\ h_{21}^L & h_{22}^L & h_{23}^L \\ h_{31}^L & h_{32}^L & h_{33}^L \end{bmatrix} $$

For convenience, it is abbreviated as $H_g^L$.
$H_g^L$ is determined as follows: four groups of feature point matching results are randomly selected each time to determine a homography matrix, and the $l_2$ norm is used to screen the remaining feature matching points according to

$$ \left\| \begin{bmatrix} x_r^L \\ y_r^L \end{bmatrix} - \begin{bmatrix} \hat{x}_r^L \\ \hat{y}_r^L \end{bmatrix} \right\|_2 \le t_r $$

where $(\hat{x}_r^L, \hat{y}_r^L)$ is the projection of $(x_t^L, y_t^L)$ under the candidate homography matrix and $t_r$ is the threshold for outlier screening. A remaining feature matching point satisfying this formula is regarded as a valid feature matching point; otherwise it is regarded as invalid. The homography matrix for which the number of valid feature matching points is greatest is the finally determined L-th level global homography matrix $H_g^L$.
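A plain-NumPy sketch of this randomized selection and $l_2$ screening follows; the hypothesis count n_iter and threshold t_r are illustrative assumptions, since the patent fixes only the 4-pair sampling and the $l_2$ test.

```python
import cv2
import numpy as np

# Sketch of estimating the L-th level global homography H_g^L: repeatedly fit a
# homography to 4 random matched pairs and keep the one with the most matches
# whose l2 reprojection error stays below the screening threshold t_r.
def estimate_Hg(pts_t, pts_r, t_r=3.0, n_iter=500):
    rng = np.random.default_rng(0)
    pts_t_h = np.hstack([pts_t, np.ones((len(pts_t), 1))])   # homogeneous coords
    best_H, best_valid = None, -1

    for _ in range(n_iter):
        idx = rng.choice(len(pts_t), 4, replace=False)
        H = cv2.getPerspectiveTransform(pts_t[idx].astype(np.float32),
                                        pts_r[idx].astype(np.float32))
        proj = pts_t_h @ H.T                   # rows are w^L * [x_r, y_r, 1]
        proj = proj[:, :2] / proj[:, 2:3]      # divide out the variable w^L
        err = np.linalg.norm(proj - pts_r, axis=1)
        n_valid = int((err <= t_r).sum())      # count of valid matching points
        if n_valid > best_valid:
            best_H, best_valid = H, n_valid
    return best_H
```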
The homography matrix of level L-1 can be obtained by increasing the image resolution. Introducing a scale factor $\mu$, the corresponding pixel points at level L-1 of the reference image and the image to be registered can be expressed as

$$ \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \end{bmatrix} = \mu \begin{bmatrix} x_r^{L} \\ y_r^{L} \end{bmatrix}, \qquad \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \end{bmatrix} = \mu \begin{bmatrix} x_t^{L} \\ y_t^{L} \end{bmatrix} $$

where $x_r^{L-1}$ and $y_r^{L-1}$ are the x- and y-coordinates of the (L-1)-th level reference image, and $x_t^{L-1}$ and $y_t^{L-1}$ are the x- and y-coordinates of the (L-1)-th level image to be registered. To find the homography matrix of level L-1, one has

$$ w^{L-1} \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \\ 1 \end{bmatrix} = S\, H_g^{L}\, S^{-1} \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \\ 1 \end{bmatrix}, \qquad S = \begin{bmatrix} \mu & 0 & 0 \\ 0 & \mu & 0 \\ 0 & 0 & 1 \end{bmatrix}. $$

Letting $H_g^{L-1} = S H_g^{L} S^{-1}$, the above formula can be rewritten as

$$ w^{L-1} \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \\ 1 \end{bmatrix} = H_g^{L-1} \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \\ 1 \end{bmatrix} $$

where $H_g^{L-1}$ is the global homography matrix of level L-1.

By applying this level-L-to-level-(L-1) derivation repeatedly and increasing the resolution step by step, the global homography matrix $H_g^{0}$ corresponding to level 0 is obtained, namely

$$ w^{0} \begin{bmatrix} x_r^{0} \\ y_r^{0} \\ 1 \end{bmatrix} = H_g^{0} \begin{bmatrix} x_t^{0} \\ y_t^{0} \\ 1 \end{bmatrix}, \qquad H_g^{0} = S_L\, H_g^{L}\, S_L^{-1}, \quad S_L = \begin{bmatrix} \mu_L & 0 & 0 \\ 0 & \mu_L & 0 \\ 0 & 0 & 1 \end{bmatrix} $$

where $x_r^{0}$ and $y_r^{0}$ are the x- and y-coordinates of the level-0 reference image, $x_t^{0}$ and $y_t^{0}$ are those of the level-0 image to be registered, and $\mu_L = \mu^{L}$ is the scale factor of the level-0 homography matrix.
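Under the reconstruction above, propagating the level-L estimate down the pyramid is a conjugation by the scaling matrix; a short sketch, assuming a uniform per-level scale factor $\mu$:

```python
import numpy as np

# Sketch: rescale a homography estimated at pyramid level L down to level 0 by
# conjugating with S = diag(mu, mu, 1) once per level: H^{l-1} = S H^l S^{-1}.
def homography_to_level0(H_L, mu, L):
    S = np.diag([mu, mu, 1.0])
    S_inv = np.diag([1.0 / mu, 1.0 / mu, 1.0])
    H = H_L.copy()
    for _ in range(L):
        H = S @ H @ S_inv
    return H / H[2, 2]      # normalize so the bottom-right entry is 1
```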
Taking level L-1 as an example of how registration combines the global and local homography matrices, let the scale factor be $\mu = 2$. As shown in FIG. 2, the L-1 level image is divided evenly into four blocks, and the homography matrix corresponding to each sub-block is defined as a local homography matrix, denoted $H_l^{L-1,\zeta}$, the local homography matrix of the $\zeta$-th image block of level L-1. The local homography matrix is solved by the same algorithm as the global homography matrix: invalid feature matching points are further removed, and the local homography matrix is then determined.

For image block 1 of level L-1 in FIG. 2, combining the L-1 level global homography matrix $H_g^{L-1}$ and the L-1 level local homography matrix $H_l^{L-1,1}$, the coordinate transformation relation between image block 1 of the reference image and of the image to be registered is obtained as

$$ w_l^{L-1,1}\, w_g^{L-1,1} \begin{bmatrix} x_r^{L-1,1} \\ y_r^{L-1,1} \\ 1 \end{bmatrix} = H_l^{L-1,1} H_g^{L-1} \begin{bmatrix} x_t^{L-1,1} \\ y_t^{L-1,1} \\ 1 \end{bmatrix} $$

where $(x_r^{L-1,1}, y_r^{L-1,1})$ and $(x_t^{L-1,1}, y_t^{L-1,1})$ are the corresponding coordinates of image block 1 of the L-1 level reference image and image to be registered, and $w_l^{L-1,1}$ and $w_g^{L-1,1}$ are the local and global intermediate variables corresponding to image block 1 of level L-1. Denoting $F_{L-1,1} = H_l^{L-1,1} H_g^{L-1}$, the above formula can be abbreviated with $F_{L-1,1}$ as the transformation matrix of image block 1 of the L-1 level image.
Similarly, for image blocks 2, 3, 4 of level L-1 in FIG. 2:

$$ w_l^{L-1,\zeta}\, w_g^{L-1,\zeta} \begin{bmatrix} x_r^{L-1,\zeta} \\ y_r^{L-1,\zeta} \\ 1 \end{bmatrix} = F_{L-1,\zeta} \begin{bmatrix} x_t^{L-1,\zeta} \\ y_t^{L-1,\zeta} \\ 1 \end{bmatrix}, \qquad \zeta = 2, 3, 4 $$

where $F_{L-1,2}$, $F_{L-1,3}$, $F_{L-1,4}$ are the transformation matrices of image blocks 2, 3, 4 of the L-1 level image, $(x_t^{L-1,\zeta}, y_t^{L-1,\zeta})$ are the corresponding coordinates of image blocks 2, 3, 4 of the L-1 level image to be registered, and $(x_r^{L-1,\zeta}, y_r^{L-1,\zeta})$ those of the L-1 level reference image.
Combining the transformation matrices $F_{L-1,1}$, $F_{L-1,2}$, $F_{L-1,3}$, $F_{L-1,4}$ of the four L-1 level image blocks, the coordinate transformation relation between the L-1 level reference image and image to be registered is obtained as

$$ \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \\ 1 \end{bmatrix} \sim F_{L-1} \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \\ 1 \end{bmatrix}, \qquad F_{L-1} = \sum_{\zeta=1}^{4} \lambda_{L-1,\zeta}\, F_{L-1,\zeta} $$

where $F_{L-1}$ is the joint transformation matrix of the L-1 level image, $F_{L-1,\zeta}$ is the transformation matrix of the $\zeta$-th image block of level L-1, and $\lambda_{L-1,\zeta}$ is the weight of the $\zeta$-th image block's transformation matrix.

The resolution is increased step by step until level 0 of the image pyramid, giving the coordinate transformation relation between the reference image and the image to be registered:

$$ \begin{bmatrix} x_r^{0} \\ y_r^{0} \\ 1 \end{bmatrix} \sim F_{0} \begin{bmatrix} x_t^{0} \\ y_t^{0} \\ 1 \end{bmatrix} $$

where $F_0$ is the joint transformation matrix of the level-0 image, i.e. the final transformation matrix combining the global and local homography matrices, $(x_t^0, y_t^0)$ are the corresponding coordinates of the level-0 image to be registered, and $(x_r^0, y_r^0)$ those of the level-0 reference image.
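A sketch of forming the joint transform from the four block transforms and warping the frame with it follows. The patent specifies weights $\lambda_{L-1,\zeta}$ but not their rule, so the uniform weights here are an assumption:

```python
import cv2
import numpy as np

# Sketch: combine the global homography with the four local (per-block)
# homographies into the joint transformation matrix, then register the frame.
def joint_register(img_t, H_global, H_locals, weights=None, out_size=None):
    weights = weights or [0.25] * len(H_locals)       # assumed uniform lambdas
    F_blocks = [H_l @ H_global for H_l in H_locals]   # F_zeta = H_l,zeta @ H_g
    F = sum(w * Fb for w, Fb in zip(weights, F_blocks))
    F = F / F[2, 2]                                   # normalize the homography
    h, w = out_size or img_t.shape[:2]
    return cv2.warpPerspective(img_t, F, (w, h))      # registered image
```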
Step 3), aiming at the image processed in the step 2), determining a to-be-detected area of the moving vehicle by adopting a 2-frame difference method; performing superpixel segmentation on the image, and determining a scanning frame according to the center of the superpixel to traverse the area to be detected;
as shown in fig. 3 of the drawings, F (k-1) and F (k) respectively represent the k-1 th frame and the k-th frame in the drone image sequence. Fr(k-1) and Fr(k) Is the registered image. To reduce the amount of computation for moving vehicle detection, the registered images F are processedr(k-1) and Fr(k) The 2-frame difference method is adopted to determine the region to be detected, which is shown as a rectangular box in a small picture 2-frame difference method in figure 3 in the attached drawing. Taking 2 moving vehicles as an example, 4 regions to be detected are generated after 2-frame difference method is used.
After the region to be detected is determined by the 2-frame difference method, the image is segmented into superpixels and a scanning frame is determined from each superpixel center; the region to be detected is then traversed to detect moving vehicles. While traversing the region to be detected, the scanning frame must undergo affine transformation to account for rotation, translation, and similar motion of the target vehicle, reducing the miss rate of moving vehicle detection.
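A sketch of this step using OpenCV for the 2-frame difference and scikit-image's SLIC for superpixels; the threshold, segment count, and window size are illustrative, and border clipping of the scanning frames is omitted:

```python
import cv2
import numpy as np
from skimage.segmentation import slic

# Sketch of step 3: a 2-frame difference on the registered frames proposes
# changed regions; SLIC superpixel centers inside them seed scanning frames.
def scanning_frames(Fr_prev, Fr_curr, win=48):
    diff = cv2.absdiff(Fr_curr, Fr_prev)
    gray = cv2.cvtColor(diff, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 25, 255, cv2.THRESH_BINARY)

    labels = slic(cv2.cvtColor(Fr_curr, cv2.COLOR_BGR2RGB), n_segments=400)
    frames = []
    for lab in np.unique(labels):
        ys, xs = np.nonzero(labels == lab)
        cy, cx = int(ys.mean()), int(xs.mean())   # superpixel center
        if mask[cy, cx]:                          # center lies in a changed region
            frames.append((cx - win // 2, cy - win // 2, win, win))
    return frames
```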
Step 4), extracting the texture and color of the vehicle to form low-order features of the vehicle by using the image processed in the step 3); introducing context information of the vehicle, and extracting high-order characteristics of the vehicle; after the low-order characteristic and the high-order characteristic of the target vehicle are obtained, the low-order characteristic and the high-order characteristic are fused to obtain the multi-order characteristic of the target vehicle;
specifically, the image is first divided into small connected regions, called cells. And then acquiring the direction histogram of the gradient or edge of each pixel point in the cell unit. Finally, the features of these cell units are combined to form a HOG feature descriptor. Firstly converting the image into HSV color space, respectively extracting HOG characteristics from three channels, finally making characteristic fusion, converting the image from RGB color space to HSV color space, and dividing them into different portionsH, S, V three-channel data template of the image is extracted and stored as a two-dimensional matrix MH、MSAnd MVSimultaneously calculating HOG characteristics H of three matrixes respectivelyH、HSAnd HV. And fusing the three-channel HOG characteristics in a weighting mode, namely: hl=wHHH+wSHS+wVHV. Wherein HlRepresenting low-order features of the vehicle; w is aH、wSAnd wVRespectively HOG characteristic HH、HSAnd HVAnd w isH+wS+w V1 is ═ 1; the weight of three channels is determined by each channel data template in a self-adaptive mode, and is specifically determined by the following formula:
Figure BDA0001830584680000131
to this end, a low-order feature of the vehicle, i.e., a fused H, S, V three-channel HOG feature, is determined.
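A sketch of the multi-channel HOG low-order feature using scikit-image follows. The patent's adaptive weight formula sits in an unreproduced figure, so weighting each channel by its normalized variance is an assumption standing in for it:

```python
import cv2
import numpy as np
from skimage.feature import hog

# Sketch of step 4 (low-order features): HOG on each HSV channel, fused with
# adaptive weights satisfying w_H + w_S + w_V = 1.
def low_order_feature(patch_bgr):
    hsv = cv2.cvtColor(patch_bgr, cv2.COLOR_BGR2HSV)
    channels = [hsv[:, :, i] for i in range(3)]               # M_H, M_S, M_V
    feats = [hog(c, orientations=9, pixels_per_cell=(8, 8),
                 cells_per_block=(2, 2)) for c in channels]   # H_H, H_S, H_V
    var = np.array([c.var() for c in channels], dtype=np.float64)
    w = var / var.sum()               # assumed adaptive rule: normalized variance
    return w[0] * feats[0] + w[1] * feats[1] + w[2] * feats[2]   # H_l
```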
When determining the high-order features, the context information of the vehicle is introduced. Positive and negative samples are manually selected to initialize the positive and negative dictionaries, and the final positive dictionary $D_p$ and negative dictionary $D_n$ are then determined according to dictionary learning and an autonomous sample selection strategy. Next, the high-order features are determined by calculating the reconstruction error of the target region and the reconstruction errors of the other image blocks in its neighborhood.

For a vehicle $t_v$, the reconstruction error is denoted $e(t_v) = [e(t_v, D_p), e(t_v, D_n)]^T$, where $e(t_v, D_p)$ and $e(t_v, D_n)$ are the reconstruction errors of $t_v$ on the positive and negative dictionaries. For a neighborhood image block $a_\iota$ of the vehicle, the reconstruction error is $e(a_\iota) = [e(a_\iota, D_p), e(a_\iota, D_n)]^T$, where the subscript $\iota$ indexes the image blocks in the neighborhood of the target vehicle $t_v$, and $e(a_\iota, D_p)$ and $e(a_\iota, D_n)$ are the reconstruction errors of $a_\iota$ on the positive and negative dictionaries. For a neighborhood image block $a_\iota$, the high-order feature of the target vehicle $t_v$ is defined as the difference of the reconstruction errors of $t_v$ and $a_\iota$, expressed as $H(t_v, a_\iota) = \|e(t_v) - e(a_\iota)\|_2$, where $H(t_v, a_\iota)$ is the high-order feature of target vehicle $t_v$ relative to neighborhood $a_\iota$.

When the target vehicle $t_v$ has M image blocks in its neighborhood, the high-order feature of $t_v$ is $H_h = [H(t_v, a_1), H(t_v, a_2), \dots, H(t_v, a_M)]^T$.

The obtained high-order and low-order features of the vehicle are fused to form the multi-order feature of the target vehicle: $F_v = [H_l, H_h]$. Thus, the multi-order features of the target vehicle are obtained by combining its low-order and high-order features.
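A sketch of the high-order feature computation with scikit-learn's sparse coder; dictionaries $D_p$ and $D_n$ hold atoms as rows, and the OMP sparsity level is an illustrative choice:

```python
import numpy as np
from sklearn.decomposition import SparseCoder

# Sketch: reconstruction error of a feature vector on a dictionary (rows = atoms).
def recon_error(x, D, n_nonzero=5):
    coder = SparseCoder(dictionary=D, transform_algorithm='omp',
                        transform_n_nonzero_coefs=n_nonzero)
    code = coder.transform(x.reshape(1, -1))
    return float(np.linalg.norm(x - code @ D))

# Sketch: H_h = [H(t_v, a_1), ..., H(t_v, a_M)] from reconstruction-error
# differences between the target vehicle and its M neighborhood blocks.
def high_order_feature(t_v, neighbors, D_p, D_n):
    e_t = np.array([recon_error(t_v, D_p), recon_error(t_v, D_n)])
    H_h = []
    for a in neighbors:
        e_a = np.array([recon_error(a, D_p), recon_error(a, D_n)])
        H_h.append(np.linalg.norm(e_t - e_a))   # H(t_v, a) = ||e(t_v) - e(a)||_2
    return np.array(H_h)
```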
And 5) training the dictionary by using a dictionary learning algorithm for the obtained multi-order features of the vehicle, and detecting the moving vehicle by using the trained dictionary.
Specifically, in the correlation-based dictionary learning algorithm, the dictionary update stage first determines the atoms involved in the sparse representation of the new samples and updates only those atoms, reducing the computational load of dictionary learning. In addition, sparsity is introduced into the dictionary update stage. This update process is iterated until convergence, achieving fast and efficient dictionary training and finally completing moving vehicle detection.
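A sketch of step 5 with scikit-learn's generic dictionary learner standing in for the patent's correlation-based update (which touches only the atoms involved in the new samples' sparse representations); it reuses recon_error from the previous sketch, and the atom count is illustrative:

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning

# Sketch of step 5: train positive/negative dictionaries on multi-order
# features F_v, then classify a candidate by its smaller reconstruction error.
def train_dictionaries(F_pos, F_neg, n_atoms=64):
    fit = lambda X: DictionaryLearning(n_components=n_atoms,
                                       transform_algorithm='omp').fit(X).components_
    return fit(F_pos), fit(F_neg)       # D_p, D_n (rows are atoms)

def is_moving_vehicle(f_v, D_p, D_n):
    return recon_error(f_v, D_p) < recon_error(f_v, D_n)
```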
And 2) introducing an image pyramid, and providing an image registration algorithm combining global and local homography matrixes according to the feature point pairing result in a top-down mode. The global homography matrix describes global position changes, and the local homography matrix describes local position changes.
And 3) introducing a 2-frame difference method and superpixel segmentation, reducing the region to be detected by using the 2-frame difference method, introducing superpixel segmentation, determining and scanning the region to be detected according to the center of the superpixel, and effectively reducing the calculated amount of moving vehicle detection.
Step 4) is specifically implemented as follows: when extracting the high-order features of the vehicle, positive and negative samples are first manually selected to initialize the positive and negative dictionaries; after the final positive and negative dictionaries are determined according to dictionary learning and an autonomous sample selection strategy, the high-order features are determined by calculating the reconstruction error of the target region and the reconstruction errors of the other image blocks in the neighborhood.

Claims (10)

1. A moving vehicle detection method based on unmanned aerial vehicle aerial video is characterized by comprising the following steps:
step 1), acquiring an aerial video of a moving vehicle, extracting a continuous image sequence of the aerial video, then extracting SURF (speeded up robust features) feature points of a reference image and an image to be registered, then performing feature point matching, and performing abnormal point elimination on the matched feature points by adopting a random sampling consistency algorithm;
step 2), aiming at the characteristic points after the abnormal points are removed, obtaining a conversion matrix of the image through an unmanned aerial vehicle image registration algorithm;
step 3), aiming at the image processed in the step 2), determining a to-be-detected area of the moving vehicle by adopting a 2-frame difference method, performing superpixel segmentation on the image, and determining a scanning frame according to the center of the superpixel so as to traverse the to-be-detected area;
step 4), extracting the texture and color of the vehicle to form low-order features of the vehicle by using the image processed in the step 3); introducing context information of the vehicle, and extracting high-order characteristics of the vehicle; after the low-order characteristic and the high-order characteristic of the target vehicle are obtained, the low-order characteristic and the high-order characteristic are fused to obtain the multi-order characteristic of the target vehicle;
and 5) training the dictionary by using a dictionary learning algorithm for the obtained multi-order features of the vehicle, and detecting the moving vehicle by using the trained dictionary.
2. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video according to claim 1, wherein Haar features and the integral image concept are adopted to extract the SURF feature points of the reference image and the image to be registered.
3. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video according to claim 2, wherein the Euclidean distance between any SURF feature point in the reference image and the feature points in the image to be registered is calculated; the smaller the Euclidean distance, the higher the similarity, and when the Euclidean distance is smaller than a set threshold the match is judged successful; if a SURF feature point in the image to be registered matches multiple feature points in the reference image, the match is regarded as unsuccessful.
4. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video according to claim 1, wherein after the abnormal points are eliminated, an image pyramid is introduced and the global and local homography matrices are determined from the feature point pairing results in a top-down manner: first, an (L+1)-level pyramid of the reference image and the image to be registered is established; when determining the global homography matrix, one starts from the L-th level global homography matrix and increases the resolution step by step down to level 0, thereby obtaining the level-0 global homography matrix.
5. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video according to claim 4, wherein $(x_r^L, y_r^L)$ and $(x_t^L, y_t^L)$ are defined as the corresponding coordinates of the L-th level reference image and image to be registered, where $x_r^L$ and $y_r^L$ are the x- and y-coordinates of the L-th level reference image, and $x_t^L$ and $y_t^L$ are the x- and y-coordinates of the L-th level image to be registered.
the L-th level global homography matrix is determined by:
Figure FDA0002934808530000027
wherein, wLIs an intermediate variable and has
Figure FDA0002934808530000028
Figure FDA0002934808530000029
The L level global homography matrix is defined by the matrix elements as follows:
Figure FDA00029348085300000210
is abbreviated as
Figure FDA00029348085300000211
$H_g^L$ is determined as follows: four groups of feature point matching results are randomly selected each time to determine a homography matrix, and the $l_2$ norm is used to screen the remaining feature matching points according to

$$ \left\| \begin{bmatrix} x_r^L \\ y_r^L \end{bmatrix} - \begin{bmatrix} \hat{x}_r^L \\ \hat{y}_r^L \end{bmatrix} \right\|_2 \le t_r $$

where $(\hat{x}_r^L, \hat{y}_r^L)$ is the projection of $(x_t^L, y_t^L)$ under the candidate homography matrix and $t_r$ is the threshold for outlier screening. A remaining feature matching point satisfying this formula is regarded as a valid feature matching point; otherwise it is regarded as invalid. The homography matrix for which the number of valid feature matching points is greatest is the finally determined L-th level global homography matrix $H_g^L$.
The homography matrix of level L-1 is obtained by increasing the image resolution. Introducing a scale factor $\mu$, the corresponding pixel points at level L-1 of the reference image and the image to be registered can be expressed as

$$ \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \end{bmatrix} = \mu \begin{bmatrix} x_r^{L} \\ y_r^{L} \end{bmatrix}, \qquad \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \end{bmatrix} = \mu \begin{bmatrix} x_t^{L} \\ y_t^{L} \end{bmatrix} $$

where $x_r^{L-1}$ and $y_r^{L-1}$ are the x- and y-coordinates of the (L-1)-th level reference image, and $x_t^{L-1}$ and $y_t^{L-1}$ are the x- and y-coordinates of the (L-1)-th level image to be registered. To find the homography matrix of level L-1, one has

$$ w^{L-1} \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \\ 1 \end{bmatrix} = S\, H_g^{L}\, S^{-1} \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \\ 1 \end{bmatrix}, \qquad S = \begin{bmatrix} \mu & 0 & 0 \\ 0 & \mu & 0 \\ 0 & 0 & 1 \end{bmatrix}. $$

Letting $H_g^{L-1} = S H_g^{L} S^{-1}$, the above formula can be rewritten as

$$ w^{L-1} \begin{bmatrix} x_r^{L-1} \\ y_r^{L-1} \\ 1 \end{bmatrix} = H_g^{L-1} \begin{bmatrix} x_t^{L-1} \\ y_t^{L-1} \\ 1 \end{bmatrix} $$

where $H_g^{L-1}$ is the global homography matrix of level L-1.

By applying this level-L-to-level-(L-1) derivation repeatedly and increasing the resolution step by step, the global homography matrix $H_g^{0}$ corresponding to level 0 is obtained, namely

$$ w^{0} \begin{bmatrix} x_r^{0} \\ y_r^{0} \\ 1 \end{bmatrix} = H_g^{0} \begin{bmatrix} x_t^{0} \\ y_t^{0} \\ 1 \end{bmatrix}, \qquad H_g^{0} = S_L\, H_g^{L}\, S_L^{-1}, \quad S_L = \begin{bmatrix} \mu_L & 0 & 0 \\ 0 & \mu_L & 0 \\ 0 & 0 & 1 \end{bmatrix} $$

where $x_r^{0}$ and $y_r^{0}$ are the x- and y-coordinates of the level-0 reference image, $x_t^{0}$ and $y_t^{0}$ are those of the level-0 image to be registered, and $\mu_L = \mu^{L}$ is the scale factor of the level-0 homography matrix.
6. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video according to claim 1, wherein $F(k-1)$ and $F(k)$ denote the (k-1)-th and k-th frames of the unmanned aerial vehicle image sequence, and $F_r(k-1)$ and $F_r(k)$ are the registered images; the region to be detected is determined from the registered images $F_r(k-1)$ and $F_r(k)$ by the 2-frame difference method.
7. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video according to claim 1, wherein the image is first divided into small connected regions, i.e. cell units; the histogram of gradient or edge orientations of each pixel point within a cell unit is then collected; finally, combining the features of these cell units forms the HOG feature descriptor. The image is converted from the RGB color space to the HSV color space, the H, S, V three-channel data templates of the image are extracted separately and stored as two-dimensional matrices $M_H$, $M_S$ and $M_V$, and the HOG features $H_H$, $H_S$ and $H_V$ of the three matrices are computed respectively.
8. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video according to claim 7, wherein the three-channel HOG features are fused by weighting, namely $H_l = w_H H_H + w_S H_S + w_V H_V$, where $H_l$ denotes the low-order feature of the vehicle, $w_H$, $w_S$ and $w_V$ are the weights of the HOG features $H_H$, $H_S$ and $H_V$, and $w_H + w_S + w_V = 1$; the weights of the three channels are determined adaptively from the respective channel data templates. The low-order feature of the vehicle, i.e. the fused H, S, V three-channel HOG feature, is thus determined.
9. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video according to claim 7, wherein the context information of the vehicle is introduced when determining the high-order features; positive and negative samples are manually selected to initialize the positive and negative dictionaries, and the final positive dictionary $D_p$ and negative dictionary $D_n$ are then determined according to dictionary learning and an autonomous sample selection strategy; the high-order features are determined by calculating the reconstruction error of the target region and the reconstruction errors of the other image blocks in its neighborhood.

For a vehicle $t_v$, the reconstruction error is denoted $e(t_v) = [e(t_v, D_p), e(t_v, D_n)]^T$, where $e(t_v, D_p)$ and $e(t_v, D_n)$ are the reconstruction errors of $t_v$ on the positive and negative dictionaries; for a neighborhood image block $a_\iota$ of the vehicle, the reconstruction error is $e(a_\iota) = [e(a_\iota, D_p), e(a_\iota, D_n)]^T$, where the subscript $\iota$ indexes the image blocks in the neighborhood of the target vehicle $t_v$, and $e(a_\iota, D_p)$ and $e(a_\iota, D_n)$ are the reconstruction errors of $a_\iota$ on the positive and negative dictionaries; for a neighborhood image block $a_\iota$, the high-order feature of the target vehicle $t_v$ is defined as the difference of the reconstruction errors of $t_v$ and $a_\iota$, expressed as $H(t_v, a_\iota) = \|e(t_v) - e(a_\iota)\|_2$, where $H(t_v, a_\iota)$ is the high-order feature of target vehicle $t_v$ relative to neighborhood $a_\iota$.

When the target vehicle $t_v$ has M image blocks in its neighborhood, the high-order feature of $t_v$ is $H_h = [H(t_v, a_1), H(t_v, a_2), \dots, H(t_v, a_M)]^T$.

The obtained high-order and low-order features of the vehicle are fused to form the multi-order feature of the target vehicle: $F_v = [H_l, H_h]$; the multi-order features of the target vehicle are obtained by combining its low-order and high-order features.
10. The method for detecting moving vehicles based on unmanned aerial vehicle aerial video, wherein, specifically, in the correlation-based dictionary learning algorithm, the dictionary update stage first determines the atoms involved in the sparse representation of the new samples and updates only those atoms; sparsity is introduced into the dictionary update stage; and the update process is iterated until convergence, achieving fast and efficient dictionary training and finally completing moving vehicle detection.
CN201811203391.1A 2018-10-16 2018-10-16 Moving vehicle detection method based on unmanned aerial vehicle aerial video Active CN109376641B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811203391.1A CN109376641B (en) 2018-10-16 2018-10-16 Moving vehicle detection method based on unmanned aerial vehicle aerial video

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811203391.1A CN109376641B (en) 2018-10-16 2018-10-16 Moving vehicle detection method based on unmanned aerial vehicle aerial video

Publications (2)

Publication Number Publication Date
CN109376641A CN109376641A (en) 2019-02-22
CN109376641B true CN109376641B (en) 2021-04-27

Family

ID=65400009

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811203391.1A Active CN109376641B (en) 2018-10-16 2018-10-16 Moving vehicle detection method based on unmanned aerial vehicle aerial video

Country Status (1)

Country Link
CN (1) CN109376641B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110136104B (en) * 2019-04-25 2021-04-13 上海交通大学 Image processing method, system and medium based on unmanned aerial vehicle ground station
CN110598613B (en) * 2019-09-03 2022-10-25 长安大学 Expressway agglomerate fog monitoring method
CN112749779A (en) * 2019-10-30 2021-05-04 北京市商汤科技开发有限公司 Neural network processing method and device, electronic equipment and computer storage medium
CN111552269B (en) * 2020-04-27 2021-05-28 武汉工程大学 Multi-robot safety detection method and system based on attitude estimation
CN111612966B (en) * 2020-05-21 2021-05-07 广东乐佳印刷有限公司 Bill certificate anti-counterfeiting detection method and device based on image recognition
CN111881853B (en) * 2020-07-31 2022-09-16 中北大学 Method and device for identifying abnormal behaviors in oversized bridge and tunnel

Citations (1)

Publication number Priority date Publication date Assignee Title
CN105554456A (en) * 2015-12-21 2016-05-04 北京旷视科技有限公司 Video processing method and apparatus

Family Cites Families (1)

Publication number Priority date Publication date Assignee Title
US8913783B2 (en) * 2009-10-29 2014-12-16 Sri International 3-D model based method for detecting and classifying vehicles in aerial imagery

Patent Citations (1)

Publication number Priority date Publication date Assignee Title
CN105554456A (en) * 2015-12-21 2016-05-04 北京旷视科技有限公司 Video processing method and apparatus

Non-Patent Citations (2)

Title
Vehicle detection in high-resolution aerial images based on fast sparse representation classification and multiorder feature; Chen Z et al.; IEEE Transactions on Intelligent Transportation Systems; 2016-02-18; Vol. 17, No. 8; pp. 2296-2309 *
Vehicle detection method in UAV aerial videos; Wang Suqin et al.; Journal of System Simulation; 2018-07; Vol. 30, No. 07; pp. 359-369 *

Also Published As

Publication number Publication date
CN109376641A (en) 2019-02-22

Similar Documents

Publication Publication Date Title
CN109376641B (en) Moving vehicle detection method based on unmanned aerial vehicle aerial video
CN110569704B (en) Multi-strategy self-adaptive lane line detection method based on stereoscopic vision
CN109903331B (en) Convolutional neural network target detection method based on RGB-D camera
CN104574347B (en) Satellite in orbit image geometry positioning accuracy evaluation method based on multi- source Remote Sensing Data data
CN112686935B (en) Airborne sounding radar and multispectral satellite image registration method based on feature fusion
CN108446634B (en) Aircraft continuous tracking method based on combination of video analysis and positioning information
CN112883850B (en) Multi-view space remote sensing image matching method based on convolutional neural network
CN109215053B (en) Method for detecting moving vehicle with pause state in aerial video shot by unmanned aerial vehicle
CN108428220A (en) Satellite sequence remote sensing image sea island reef region automatic geometric correction method
CN112929626B (en) Three-dimensional information extraction method based on smartphone image
CN111553845B (en) Quick image stitching method based on optimized three-dimensional reconstruction
CN106530313A (en) Sea-sky line real-time detection method based on region segmentation
CN113838064B (en) Cloud removal method based on branch GAN using multi-temporal remote sensing data
CN112016478A (en) Complex scene identification method and system based on multispectral image fusion
CN107705295B (en) Image difference detection method based on robust principal component analysis method
CN115063447A (en) Target animal motion tracking method based on video sequence and related equipment
CN112946679A (en) Unmanned aerial vehicle surveying and mapping jelly effect detection method and system based on artificial intelligence
CN114663880A (en) Three-dimensional target detection method based on multi-level cross-modal self-attention mechanism
CN114742864A (en) Belt deviation detection method and device
CN113034398A (en) Method and system for eliminating jelly effect in urban surveying and mapping based on artificial intelligence
CN117274627A (en) Multi-temporal snow remote sensing image matching method and system based on image conversion
CN110298347B (en) Method for identifying automobile exhaust analyzer screen based on GrayWorld and PCA-CNN
CN116563104A (en) Image registration method and image stitching method based on particle swarm optimization
CN114693755B (en) Non-rigid registration method and system for multimode image maximum moment and space consistency
CN111833384B (en) Method and device for rapidly registering visible light and infrared images

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20211224

Address after: 908, block a, floor 8, No. 116, Zizhuyuan Road, Haidian District, Beijing 100089

Patentee after: ZHONGZI DATA CO.,LTD.

Patentee after: China Highway Engineering Consulting Group Co., Ltd.

Address before: 710064 middle section of South Second Ring Road, Beilin District, Xi'an City, Shaanxi Province

Patentee before: CHANG'AN University