CN114529584A - Single-target vehicle tracking method based on unmanned aerial vehicle aerial photography - Google Patents

Single-target vehicle tracking method based on unmanned aerial vehicle aerial photography

Info

Publication number
CN114529584A
Authority
CN
China
Prior art keywords
target
frame
tracking
neighborhood
similarity
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210156746.6A
Other languages
Chinese (zh)
Inventor
吕艳辉
郭向坤
李彬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenyang Ligong University
Original Assignee
Shenyang Ligong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenyang Ligong University
Priority to CN202210156746.6A
Publication of CN114529584A
Legal status: Pending (Current)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/20: Analysis of motion
    • G06T 7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/248: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/10: Image acquisition modality
    • G06T 2207/10016: Video; Image sequence
    • G06T 2207/30: Subject of image; Context of image processing
    • G06T 2207/30181: Earth observation

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a single-target vehicle tracking method based on unmanned aerial vehicle aerial photography, and relates to the technical field of computer vision. Building on the results of target detection, the method adopts an image matching algorithm based on multi-feature fusion that combines the color histogram feature and the HOG feature of the image, which markedly improves, over using a single feature alone, both how well the image is represented and the accuracy of the image matching process. To predict accurately where the target will appear in the video, a K++ neighborhood search algorithm is designed; it reduces the amount of computation, achieves higher precision, and effectively eliminates the interference caused by the appearance of similar targets. When the tracked target becomes completely occluded during tracking, an anti-occlusion algorithm based on vehicle motion state estimation keeps the single-target tracking going. The method can quickly and accurately perform single-target tracking of a given vehicle in video shot by an unmanned aerial vehicle, and has good universality and expandability.

Description

Single-target vehicle tracking method based on unmanned aerial vehicle aerial photography
Technical Field
The invention relates to the technical field of computer vision, in particular to a single-target vehicle tracking method based on unmanned aerial vehicle aerial photography.
Background
Target tracking tasks fall into two categories: multi-target tracking and single-target tracking. Multi-target tracking is the task of tracking all targets, or all individuals of one class of target, in a video sequence. It involves not only continuously tracking each target but also handling recognition, self-occlusion, and mutual occlusion between different targets, and associating detection results with tracking results. Single-target tracking, by contrast, concentrates on one object; because attention in a video sequence usually focuses on the motion of a single object, single-target tracking technology has developed rapidly and attracted intense interest. Among traditional methods that process video frames directly, the representative background subtraction, frame-difference, optical-flow, MeanShift and CamShift algorithms belong to the early single-target tracking methods; they are widely used in practice, run at high FPS (frames per second), and place low demands on device computing power. Later, combinations of detection and tracking began to appear: image features are extracted and a classifier (such as an SVM) is trained with machine-learning methods, so that the trained classifier finds the optimal region in the next frame. One family is based on generative models and the other on discriminative models; together they are referred to as tracking-by-detection. The most representative single-target tracking methods at present are kernelized correlation filtering and the algorithms that combine it with deep learning. These have repeatedly raised the precision and speed records of single-target tracking, but because deep models involve heavy computation and demand capable hardware, and many trackers rely on online fine-tuning, their speed is often unsatisfactory, practical application is limited, and great room for development remains.
Among recent deep-learning single-target tracking algorithms, SiamFC, based on the twin (Siamese) network and proposed at CVPR 2016, pushed single-target tracking into a new era. Exploiting the twin-network structure, the template and the search image pass through the same feature-extraction network, the template's feature vector is cross-correlated with the feature map of the search image to obtain a response map, and the position of maximum response is the position of the target. Almost all subsequent deep-learning single-target tracking algorithms build on this idea. For example, the SiamMask algorithm adds a simple two-channel 1 × 1 convolution on top of the cross-correlation and feeds the two branch outputs to different task heads; another representative work, SiamRPN, fuses the classification and regression of the RPN from Faster R-CNN into SiamFC, greatly improving both tracking precision and speed. The recent DiMP (Learning Discriminative Model Prediction for Tracking) algorithm, published by Martin Danelljan and colleagues, adds an online-trained classifier on top of the SiamFC framework and optimizes the classifier using information from preceding and succeeding video frames, so that the model adjusts in real time and tracking precision improves.
Although twin-network single-target tracking algorithms currently achieve good tracking results, the information available to the network is provided only by the first frame, so the amount of information obtained is in fact very small. The present invention therefore addresses the problems in the current target tracking field of low precision caused by insufficient samples and of tracking speed that struggles to meet real-time requirements.
Disclosure of Invention
The invention aims to overcome the above shortcomings of the prior art by providing a single-target vehicle tracking method based on unmanned aerial vehicle aerial photography that can quickly and accurately track a given vehicle target in video shot by an unmanned aerial vehicle, with good universality and expandability.
In order to solve the technical problems, the technical scheme adopted by the invention is as follows:
a single-target vehicle tracking method based on unmanned aerial vehicle aerial photography comprises the following steps:
Step 1: load an unmanned aerial vehicle aerial video in which a vehicle is to be tracked and pause at the first frame; manually frame the target vehicle with the mouse, the framed area being the area to be tracked and the target vehicle inside it being the tracking target. Then, starting from the second frame of the video, detect the vehicles appearing in each video frame with a target detection algorithm.
Step 2: judge whether the target vehicle in the current frame is completely occluded; if not, continue to step 3; otherwise, execute step 5.
Step 3: establish a K++ neighborhood around the tracking box, use it to screen out redundant target detection boxes so that only vehicles that may be the tracking target remain inside the K++ neighborhood, and compute the IoU and the center-point offset between the tracking box and each remaining detection box.
Step 4: crop the targets in the screened detection boxes into pictures and match them against the tracking target framed in the first frame using the multi-feature-fusion image matching algorithm; compute and rank the image similarity between the tracking target and each screened target, combine this with the results of step 3 to decide which detected target in the current frame is the target to be tracked, and then update the tracking box. Execute step 6.
Step 5: when the tracking target is occluded, apply the anti-occlusion algorithm based on vehicle motion state estimation. Once tracking has run for more than 20 frames, record the average speed at which the vehicle moves in the video every 20 frames. When the target disappears from view, store the disappearance coordinates, stop recording the target's moving speed, and keep the moving speed of the preceding 20 frames. If the target has been gone for no more than 50 frames, extrapolate its trajectory and coordinates during the disappearance as normal, obtain a K neighborhood around the currently estimated position, record where the target may appear after 50 frames, and set a K neighborhood there to wait to capture the target. If the vehicle is re-detected and captured by the K neighborhood, invoke the image matching algorithm of step 4; if the match succeeds, continue tracking. If it does not succeed, start full-image matching: discard the recorded speed and coordinates, let the tracker search automatically using the tracking procedure of steps 3-4, use the multi-feature-fusion matching algorithm to match the target in the current field of view with the greatest similarity to the initially framed tracking target, and re-establish the K++ neighborhood at that target's coordinates. Execute step 6.
Step 6: judge whether the video has finished; if so, end detection; otherwise, receive the next frame and return to step 2.
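For orientation, the following minimal Python sketch shows how steps 1-6 fit together as a per-frame loop. The helper names (detect_vehicles, is_fully_occluded, kpp_screen, match_target, estimate_position) are hypothetical stand-ins for the detector and the algorithms described above, not code disclosed by the patent.

```python
import cv2

def track(video_path, detect_vehicles, is_fully_occluded,
          kpp_screen, match_target, estimate_position):
    """Per-frame tracking loop (steps 1-6); the five callables are
    hypothetical stand-ins for the components described in the text."""
    cap = cv2.VideoCapture(video_path)
    ok, first = cap.read()                        # step 1: pause on frame 1
    x, y, w, h = cv2.selectROI("frame 1", first)  # manual framing with the mouse
    template = first[y:y + h, x:x + w]            # tracked-target appearance
    track_box = (x, y, w, h)
    while True:
        ok, frame = cap.read()
        if not ok:                                # step 6: video finished
            break
        detections = detect_vehicles(frame)       # e.g. a YOLOv4 detector
        if not is_fully_occluded(track_box, detections):       # step 2
            candidates = kpp_screen(track_box, detections)     # step 3: K++ screening
            track_box = match_target(template, frame,
                                     candidates, track_box)    # step 4: matching
        else:
            track_box = estimate_position(track_box)           # step 5: motion estimate
    cap.release()
```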
The beneficial effects of the above technical solution are as follows. The single-target vehicle tracking method based on unmanned aerial vehicle aerial photography provided by the invention realizes single-target tracking with an image matching algorithm based on multi-feature fusion, a target-prediction K++ neighborhood search algorithm, and an anti-occlusion algorithm based on vehicle motion state estimation. The image matching algorithm fuses the color histogram feature and the HOG feature of the image, markedly improving, over any single feature alone, both how well the image is represented and the accuracy of the image matching process. The target-prediction K++ neighborhood search algorithm screens the target detection results, which reduces the amount of computation; compared with the original K neighborhood search algorithm, it searches for the tracked target within the neighborhood with higher precision and effectively eliminates the interference caused by similar targets. When the tracked target becomes completely occluded during tracking, the anti-occlusion algorithm based on vehicle motion state estimation is applied: provided the vehicle's motion during the occlusion does not differ much from its motion before the occlusion, the algorithm computes the target's speed and coordinates in the video before the occlusion, predicts its motion during the period in which it is out of view, estimates where it is likely to reappear after leaving the occluding object, and thereby re-locates the target. The method can quickly and accurately perform single-target tracking of a given vehicle in video shot by an unmanned aerial vehicle, and has good universality and expandability.
Drawings
Fig. 1 is a flowchart of a single-target vehicle tracking method based on unmanned aerial vehicle aerial photography according to an embodiment of the present invention;
FIG. 2 illustrates the manner of selecting a tracking target according to an embodiment of the present invention; Fig. 2a shows an incorrect frame selection and Fig. 2b a correct one;
FIG. 3 is a diagram illustrating the tracking effect when a small amount of occlusion occurs, according to an embodiment of the present invention; Figs. 3a, 3b and 3c are the images of frames 370, 400 and 420, respectively;
FIG. 4 shows the appearance of a similar vehicle in accordance with an embodiment of the present invention; Figs. 4a, 4b and 4c are the images of frames 146, 219 and 241, respectively;
FIG. 5 illustrates a fully occluded target according to an embodiment of the present invention; Figs. 5a, 5b and 5c are the images of frames 134, 140 and 144, respectively.
Detailed Description
The following detailed description of embodiments of the present invention is provided in connection with the accompanying drawings and examples. The following examples are intended to illustrate the invention but are not intended to limit the scope of the invention.
Target tracking is an important task in computer vision; from traditional tracking algorithms to today's deep-learning-based target tracking, many scholars and researchers have invested great time and energy and made vital contributions to the task. Within current computer vision, the deep-learning-based target tracking direction is developing the fastest; this invention likewise starts from the deep-learning target detection network YOLOv4 and combines image matching and target prediction to realize single-target tracking. For single-target tracking, the invention proposes the following idea: first detect the targets and know their category, then add mechanisms between video frames that link the same target from frame to frame and find and label it within its target category, thereby realizing single-target tracking. Accordingly, the main research content of the invention comprises an image matching algorithm based on multi-feature fusion, a target-prediction K++ neighborhood search algorithm, and an anti-occlusion algorithm based on vehicle motion state estimation, summarized in the following three points:
(1) Multi-feature-fusion image matching algorithm (IMF). Starting from the basic features of the image, each basic feature was tested in image similarity calculations to find the mode best suited to matching a target in drone footage of vehicles. After the color histogram feature proved the most suitable for tracking vehicles, its similarity calculation was stress-tested on a road section crowded with vehicles; the result showed that a single image feature cannot express the target well. The contour-oriented HOG feature of the vehicles was therefore fused in, the similarities obtained were combined with corresponding weights into a final similarity score, and after comprehensive judgment the target is screened. On this basis of the image's basic features, the multi-feature-fusion image matching algorithm is proposed.
(2) Target-prediction K++ neighborhood search algorithm. The classification algorithm KNN is studied first, then the K neighborhood search algorithm derived from its classification idea is introduced, and the K++ neighborhood search algorithm is proposed on that basis. The K++ neighborhood search algorithm fuses the idea of the K neighborhood search algorithm with the IoU and the center-point Euclidean distance between the tracking box and the prediction box. Experiments show that, compared with the original K neighborhood search algorithm, it copes better with the appearance of vehicles identical to the tracked target and still guarantees the correctness of the tracked target.
(3) Anti-occlusion algorithm based on vehicle motion state estimation (AOE). Once the target has moved for more than 20 frames, the algorithm records the vehicle's motion information, comprising the average speed and the direction of displacement, every 20 frames; when the target becomes completely occluded, the motion information of the preceding 20 frames is retained and used to estimate the target's motion trend during the occlusion period. Experiments prove that the AOE anti-occlusion algorithm can, to a certain extent, handle the reappearance of a fully occluded target.
As shown in fig. 1, the method for tracking a single-target vehicle based on unmanned aerial vehicle aerial photography in the present embodiment is specifically described as follows.
Step 1: obtain the unmanned aerial vehicle aerial video, pause at the first frame after the video loads, and frame the target to be tracked with the mouse in that first frame; the framed area is the area to be tracked and the target vehicle inside it is the tracking target. The tracking algorithm then keeps locating the target in the subsequent video frames. When framing the target, the selection should cover the target itself as tightly as possible, as in Fig. 2(b), and should not include much information beyond the target, so that its appearance features can be extracted more accurately; Fig. 2(a) selects too much extraneous information. This part is implemented with OpenCV, registering a callback function that listens for mouse events, as sketched below.
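A minimal OpenCV sketch of this selection step follows. It assumes a drag-to-frame interaction and a hypothetical video file name; OpenCV's built-in cv2.selectROI could replace the hand-written callback entirely.

```python
import cv2

drawing = {"start": None, "box": None}

def on_mouse(event, x, y, flags, param):
    # Drag with the left button held down to frame the target vehicle.
    if event == cv2.EVENT_LBUTTONDOWN:
        drawing["start"] = (x, y)
    elif event == cv2.EVENT_LBUTTONUP and drawing["start"] is not None:
        x0, y0 = drawing["start"]
        drawing["box"] = (min(x0, x), min(y0, y), abs(x - x0), abs(y - y0))

cap = cv2.VideoCapture("uav_video.mp4")    # hypothetical input file
ok, first_frame = cap.read()               # pause on the first frame
cv2.namedWindow("frame 1")
cv2.setMouseCallback("frame 1", on_mouse)  # listen for mouse events
while drawing["box"] is None:
    cv2.imshow("frame 1", first_frame)
    cv2.waitKey(20)
x, y, w, h = drawing["box"]
template = first_frame[y:y + h, x:x + w]   # the tracked-target region
cv2.destroyWindow("frame 1")
```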
Step 2: judge whether the target vehicle in the current frame is completely occluded; if not, continue to step 3; if so, perform step 5.
Step 3: use the K++ neighborhood search algorithm to predict the position of the tracking target.
After the tracking target is selected, a target detection algorithm detects all vehicles appearing in the second and subsequent frames of the video, and the K++ neighborhood search algorithm processes the detection results. To strengthen the prediction effect, stricter constraints are added on top of the existing K neighborhood. In this embodiment the concepts of IoU and center-point offset are combined to improve the K neighborhood search algorithm, and the K++ neighborhood search algorithm is designed to predict the position of the tracking target.
The execution process of the K++ neighborhood search algorithm is as follows:
step 3.1: calculating a K neighborhood range corresponding to the tracking frame when K is 2 according to the size of the tracking frame of the previous frame, and reducing the detection range of the current frame to the K neighborhood; the K neighborhood should satisfy formula (1):
Figure BDA0003512513660000051
wherein, WkAnd HkK is the width and height of the neighborhood search area, W and H are the width and height of the target tracking frame of the previous frame, and K is the aspect ratio of the two;
step 3.2: if only one target of the current frame is detected in the K neighborhood, namely at least two thirds of the area of the detection frame of the target is in the range of the K neighborhood, the target is the target of the previous frame, the tracking frame is updated, and the step 4 is continuously executed; if more than two targets appear in the K neighborhood in the current frame, executing the step 3.3;
step 3.3: respectively carrying out similarity calculation on the targets in the K neighborhood and the tracking target to obtain similarity scores and carrying out sequencing;
step 3.4: making IoU Euclidean distance between the target detection frame corresponding to the sequenced similarity score and the tracking frame of the previous frame and the central point; calculation IoU satisfies formula (2):
Figure BDA0003512513660000052
wherein gt is the tracking frame of the previous frame; bb (bounding box) is a detection box of the current frame appearing in the K neighborhood, IoU is calculated by using gt and the detection box in the K neighborhood respectively, and the detection box with the largest value IoU is selected for reservation, which satisfies the formula (3):
IoU(gt,bb)max=Max(IoU(gt,bb1),...,IoU(gt,bbn)) (3)
wherein IoU is the intersection ratio of the detection frame and the previous frame tracking frame, gt is the tracking frame of the previous frame, bb is the detection frame in the K + + neighborhood of the current frame, and n is the number of the detection frames in the K + + neighborhood;
calculating the Euclidean distance of the central point to satisfy the formula (4):
Figure BDA0003512513660000053
wherein d is the Euclidean distance between two points, c is the central point of the frame, and the coordinates are (x, y); c. CgtFor the center point of the previous frame, cbbDetecting the central point of a frame for the current frame; and (4) selecting the detection frame corresponding to the central point with the minimum Euclidean distance, and combining the detection frame with the similarity calculated by the previous image matching and the maximum IoU to judge which detection frame detects the tracking target.
The order of judgment is: first compare image similarity, then exclude similar vehicles by IoU, and finally select the tracking target by the center-point Euclidean distance, as in the sketch below.
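The following Python sketch puts formulas (1), (2) and (4) together for (x, y, w, h) boxes. The scaling form of formula (1) is reconstructed from the definitions above, and similarity is a stand-in for the step 4 matcher; none of the helper names come from the patent itself.

```python
import math

def k_neighborhood(box, k=2.0):
    # formula (1): search area of width K*W and height K*H,
    # centered on the previous frame's tracking box
    x, y, w, h = box
    cx, cy = x + w / 2.0, y + h / 2.0
    return (cx - k * w / 2.0, cy - k * h / 2.0, k * w, k * h)

def inside_ratio(det, region):
    # fraction of the detection box lying inside the K neighborhood;
    # step 3.2 keeps a detection when this is at least two thirds
    x, y, w, h = det
    rx, ry, rw, rh = region
    ix = max(0.0, min(x + w, rx + rw) - max(x, rx))
    iy = max(0.0, min(y + h, ry + rh) - max(y, ry))
    return ix * iy / (w * h)

def iou(a, b):
    # formula (2): intersection area over union area of two boxes
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    return inter / (aw * ah + bw * bh - inter)

def center_dist(a, b):
    # formula (4): Euclidean distance between the two box centers
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return math.hypot(ax + aw / 2 - bx - bw / 2, ay + ah / 2 - by - bh / 2)

def kpp_screen(prev_box, detections, similarity):
    # steps 3.1-3.4: screen by the K neighborhood, then judge by image
    # similarity first, IoU second, and center distance last
    region = k_neighborhood(prev_box)
    cands = [d for d in detections if inside_ratio(d, region) >= 2 / 3]
    return max(cands, key=lambda d: (similarity(d), iou(prev_box, d),
                                     -center_dist(prev_box, d)))
```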
Step 4: after the detection boxes have been screened in step 3, match the tracking target against the targets in the screened detection boxes to select where the current frame's tracking target appears. Image matching uses the multi-feature-fusion image matching algorithm, executed as follows:
step 4.1: after the target to be tracked is selected in a frame mode, extracting color histogram features and HOG features of the tracked target, and converting the two features into feature vectors.
Step 4.2: in subsequent frames, crop the detected objects of the same category into pictures and extract each object's color histogram feature and HOG feature in the same way, obtaining feature vectors. In computing the color histogram feature vector, if each primary color could take all 256 values (0-255), the whole color space would contain about 16 million colors (256 to the third power), resulting in a huge amount of calculation; the range 0-255 is therefore divided into four equal regions: [0,63] is region 0, [64,127] is region 1, [128,191] is region 2, and [192,255] is region 3. Each primary color then takes one of four values, giving 64 color types for the three primary colors (4 to the third power), and any color appearing in the image necessarily belongs to one of these regions. The number of pixels falling in each region is then counted; partitioning in this way reduces the amount of calculation as far as possible and yields a 64-dimensional feature vector, as in the sketch below.
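A NumPy sketch of this 64-bin quantization, assuming an 8-bit BGR image as read by OpenCV; normalizing the counts is an added convenience for the later cosine comparison, not something stated in the text.

```python
import numpy as np

def color_hist_64(bgr):
    """64-dimensional color histogram: each 0-255 channel value is
    quantized into the four regions [0,63], [64,127], [128,191] and
    [192,255], giving 4**3 = 64 combined bins for the three channels."""
    quant = bgr.astype(np.uint8) // 64             # per-channel region index 0..3
    bins = (quant[..., 0].astype(np.int32) * 16
            + quant[..., 1] * 4 + quant[..., 2])   # combined bin index 0..63
    hist = np.bincount(bins.ravel(), minlength=64).astype(np.float64)
    return hist / hist.sum()                       # normalized pixel counts
```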
Step 4.3: compute the color histogram feature similarity and the HOG feature similarity between the tracked target and every target obtained in step 4.2, apply weighted scoring, and finally rank all the scores. The color histogram feature similarity is computed as cosine similarity, which satisfies formula (5):

cos θ = (P · Q) / (|P| · |Q|) = (Σ_{i=1..m} P_i Q_i) / (sqrt(Σ_{i=1..m} P_i^2) · sqrt(Σ_{i=1..m} Q_i^2))    (5)

The smaller the angle between two vectors in space, the more nearly their directions coincide and the more similar they are; correspondingly, the smaller the angle θ, the larger its cosine. The size of the vector angle in the coordinate space can therefore serve as the basis for judging vector similarity, and the same cosine calculation holds for multidimensional vectors. Here P and Q are two multidimensional vectors, P = [P_1, P_2, ..., P_m] and Q = [Q_1, Q_2, ..., Q_m], with m the vector dimension.
The HOG feature similarity uses the HOG feature descriptors, i.e., the feature vectors, and finally computes the Euclidean distance between them; the smaller the distance, the more similar the two pictures. Since a HOG feature vector is n-dimensional, the corresponding Euclidean distance satisfies formula (6):

d(x, y) = sqrt(Σ_{i=1..n} (x_i - y_i)^2)    (6)

where d is the Euclidean distance between the vectors and x_i, y_i are the values of the two vectors at index i of the multidimensional space. As the formula shows, the closer the values at corresponding indexes of the two feature vectors, the smaller the distance d and the more similar the two images (see the sketch below).
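A sketch using OpenCV's stock HOG descriptor; the 64 × 128 window and the resize are assumptions made so that every patch yields an equal-length vector, since the text does not fix the HOG parameters.

```python
import cv2
import numpy as np

hog = cv2.HOGDescriptor()          # default 64 x 128 detection window

def hog_feature(bgr_patch):
    # resize so every candidate yields a vector of the same dimension n
    resized = cv2.resize(bgr_patch, (64, 128))
    return hog.compute(resized).ravel()

def hog_distance(patch_a, patch_b):
    # formula (6): Euclidean distance between two HOG feature vectors;
    # a smaller distance means the two pictures are more similar
    return float(np.linalg.norm(hog_feature(patch_a) - hog_feature(patch_b)))
```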
A weighting coefficient is assigned to the color histogram feature similarity and the HOG feature similarity respectively, to obtain the calculation best suited to measuring similarity between vehicles. Based on the experimental data and the algorithm's application scenario, the weight of the color histogram feature is set to 1 and the weight of the HOG feature to 2, and the candidate-box picture with the greatest similarity is selected as the target to be tracked.
For each screened candidate box participating in the calculation, the cosine similarity of the color histogram features and the Euclidean-distance similarity of the HOG features are computed against the tracked target, each result is multiplied by its corresponding weight, and the products are summed to give the final similarity score, satisfying formula (7):
S_i = W_1 · S(c_i, c_t) + W_2 · S(h_i, h_t)    (7)
where S is the similarity-calculation function of the parameters in parentheses; S_i is the total similarity score between the image in the ith candidate box and the tracking target; W_1 is the color-histogram-feature similarity weight, with value 1; W_2 is the HOG-feature similarity weight, with value 2; S(c_i, c_t) is the color histogram feature similarity of the ith candidate box and the tracking target t, with c_i and c_t their respective color histogram feature vectors; S(h_i, h_t) is the HOG feature similarity of the ith candidate box and the tracking target t, with h_i the HOG feature of the ith candidate box and h_t the HOG feature of the tracked target. Finally, the candidate box with the greatest total similarity score is selected as the real tracking object; a combined sketch follows.
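Putting formulas (5)-(7) together, reusing color_hist_64 and hog_distance from the sketches above. The text gives the HOG term as a similarity, so the conversion of the Euclidean distance d into 1/(1 + d) is an assumed monotone mapping, not specified by the patent.

```python
import numpy as np

W1, W2 = 1.0, 2.0   # weights from the text: color histogram 1, HOG 2

def cosine_similarity(p, q):
    # formula (5): cos(theta) of two feature vectors
    return float(np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q)))

def fused_score(candidate_patch, template_patch):
    # formula (7): S_i = W1 * S(c_i, c_t) + W2 * S(h_i, h_t)
    s_color = cosine_similarity(color_hist_64(candidate_patch),
                                color_hist_64(template_patch))
    d_hog = hog_distance(candidate_patch, template_patch)
    return W1 * s_color + W2 / (1.0 + d_hog)   # assumed distance-to-similarity map

# the candidate with the highest fused score is taken as the tracked object
```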
After the features are fused, the algorithm expresses the template object more fully: contour features are added alongside the apparent color features, ensuring that the tracker still judges correctly when traffic flow is heavy.
Step 4.4: take the highest score from the ranking of step 4.3 and, combining it with the results computed in step 3, select the corresponding target as the position where the current frame's tracking target appears, under the judgment conditions that the IoU between the tracking box and the detection box is largest and the Euclidean distance between their center points is shortest.
Step 5: when the tracking target is occluded, apply the anti-occlusion algorithm based on vehicle motion state estimation. Once tracking has run for more than 20 frames, record the average speed at which the vehicle moves in the video every 20 frames. When the target disappears from view, store the disappearance coordinates, stop recording the target's moving speed, and keep the moving speed of the preceding 20 frames. If the target has been gone for no more than 50 frames (about 3 seconds), extrapolate its trajectory and coordinates during the disappearance as normal, obtain a K neighborhood around the currently estimated position, record where the target may appear after 3 seconds, and set a K neighborhood there to wait to capture the target. If the vehicle is re-detected and captured by the K neighborhood, invoke the image matching algorithm of step 4; if the match succeeds, continue tracking. If it does not succeed, start full-image matching: discard the recorded speed and coordinates, let the tracker search automatically using the tracking procedure of steps 3-4, use the multi-feature-fusion matching algorithm to match the target in the current field of view with the greatest similarity to the initially framed tracking target, update the tracking box, and re-establish the K neighborhood at that target's coordinates. A sketch of the motion-state bookkeeping follows below.
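A minimal sketch of the motion-state bookkeeping behind the AOE algorithm, assuming a linear extrapolation of the box center; the 20-frame window and the 50-frame limit come from the text, while the constant-velocity assumption is the stated precondition of the method.

```python
class MotionEstimator:
    """Record the mean per-frame velocity of the box center over each
    20-frame window; on full occlusion, extrapolate linearly for up to
    50 frames (a sketch, assuming roughly constant vehicle motion)."""

    def __init__(self):
        self.centers = []
        self.velocity = (0.0, 0.0)

    def update(self, center):
        # called once per tracked frame with the box center (x, y)
        self.centers.append(center)
        if len(self.centers) >= 20 and len(self.centers) % 20 == 0:
            (x0, y0), (x1, y1) = self.centers[-20], self.centers[-1]
            self.velocity = ((x1 - x0) / 19.0, (y1 - y0) / 19.0)

    def predict(self, frames_since_lost):
        # estimated center while occluded; used for frames_since_lost <= 50
        x, y = self.centers[-1]
        vx, vy = self.velocity
        return (x + vx * frames_since_lost, y + vy * frames_since_lost)
```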
Step 6: judging whether the video is finished or not; if yes, ending the detection; otherwise, receiving the next frame and returning to the step 2.
As shown in fig. 3, fig. 3a (frame 370) shows the tracking effect with no occlusion, while fig. 3b and fig. 3c (frames 400 and 420) show tracking while the vehicle is partially occluded by a traffic light; the tracking algorithm of this embodiment handles these cases well.
As shown in fig. 4, where fig. 4a, fig. 4b and fig. 4c are the images of frames 146, 219 and 241 respectively, the image matching algorithm of this embodiment combined with the K++ neighborhood search algorithm copes well even when vehicles with similar color and contour appear near the target vehicle.
As shown in fig. 5, where fig. 5a, fig. 5b and fig. 5c are the images of frames 134, 140 and 144 respectively: in frame 134 the vehicle is about to disappear; frame 140 shows the motion estimation of the AOE anti-occlusion algorithm while the vehicle is completely occluded; and in frame 144, as the vehicle emerges from the occlusion, the re-detected target is captured by the K++ neighborhood generated from the estimation box and matched with the tracked target. The AOE anti-occlusion algorithm can thus solve the full-occlusion problem to a certain extent.
The method of this embodiment was used for single-target vehicle tracking and detection on urban roads, expressways and congested road sections; the tracking accuracy is shown in Table 1. The average accuracy of the single-target vehicle tracking algorithm based on unmanned aerial vehicle aerial photography of this embodiment is 91.1%.
TABLE 1 Tracking accuracy for each scene

Scene          Tracking accuracy
Urban road     0.887
Expressway     0.935
Traffic jam    0.912
The accuracy of the method of this embodiment was compared with that of the TLD, SiamFC, SiamRPN++ and DiMP tracking algorithms; the results are shown in Table 2.
TABLE 2 Comparison of tracking accuracy across algorithms

Algorithm        Tracking accuracy
The invention    0.911
TLD              0.735
SiamFC           0.814
SiamRPN++        0.876
DiMP             0.925
As shown in Table 2, the accuracy of the proposed tracker is influenced by the detection results; nevertheless, compared with the traditional TLD algorithm and the twin-network-based single-target tracking algorithms, its tracking accuracy is, by comprehensive evaluation, improved by 5.9% on average.
Finally, it should be noted that: the above embodiments are only used to illustrate the technical solution of the present invention, and not to limit the same; while the invention has been described in detail and with reference to the foregoing embodiments, it will be understood by those skilled in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some or all of the technical features may be equivalently replaced; such modifications and substitutions do not depart from the spirit of the corresponding technical solutions and scope of the present invention as defined in the appended claims.

Claims (7)

1. A single-target vehicle tracking method based on unmanned aerial vehicle aerial photography, characterized in that it comprises the following steps:
step 1: loading an unmanned aerial vehicle aerial video in which a vehicle is to be tracked and pausing at the first frame; manually framing the target vehicle to be tracked with a mouse, the framed area being the area to be tracked and the target vehicle within it being the tracking target; then, starting from the second frame of the video, executing step 2 and detecting the target vehicles appearing in each video frame;
step 2: judging whether the target vehicle in the current frame is completely occluded; if not, continuing to step 3; otherwise, executing step 5;
step 3: using the K++ neighborhood search algorithm to predict the position of the tracking target: establishing a K++ neighborhood around the tracking box, using the K++ neighborhood to screen out redundant target detection boxes so that only vehicles that may be the tracking target remain within it, and computing the IoU and the center-point offset between the tracking box and each remaining detection box;
step 4: cropping the targets in the screened detection boxes into pictures, matching them against the tracking target framed in the first frame using the multi-feature-fusion image matching algorithm, computing and ranking the image similarity between the tracking target and each screened target, judging, in combination with the results of step 3, which detected target in the current frame is the target to be tracked, then updating the tracking box and executing step 6;
step 5: when the tracking target is occluded, applying the anti-occlusion algorithm based on vehicle motion state estimation: once tracking has run for more than 20 frames, recording the average speed at which the vehicle moves in the video every 20 frames; when the target disappears from view, storing the disappearance coordinates, stopping recording the target's moving speed, and keeping the moving speed of the preceding 20 frames; if the target has been gone for no more than 50 frames, extrapolating its trajectory and coordinates during the disappearance as normal, obtaining a K neighborhood around the currently estimated position, recording where the target may appear after 50 frames, and setting a K neighborhood there to wait to capture the target; if the vehicle is re-detected and captured by the K neighborhood, invoking the image matching algorithm of step 4 and, if the match succeeds, continuing to track; if the match does not succeed, starting full-image matching: discarding the recorded speed and coordinates, letting the tracker search automatically using the tracking procedure of steps 3-4, using the multi-feature-fusion matching algorithm to match the target in the current field of view with the greatest similarity to the initially framed tracking target, and re-establishing the K++ neighborhood at that target's coordinates; executing step 6;
step 6: judging whether the video has finished; if so, ending the detection; otherwise, receiving the next frame and returning to step 2.
2. The unmanned aerial vehicle aerial photography-based single-target vehicle tracking method according to claim 1, wherein: when the target is framed in step 1, only the target itself is selected.
3. The unmanned aerial vehicle aerial photography-based single-target vehicle tracking method according to claim 1, wherein: the specific method of step 3 comprises the following steps:
step 3.1: from the size of the previous frame's tracking box, calculating the K neighborhood range corresponding to the tracking box with K = 2 and narrowing the detection range of the current frame to the K neighborhood; the K neighborhood satisfies formula (1):

W_k = K · W,  H_k = K · H    (1)

wherein W_k and H_k are respectively the width and height of the K neighborhood search area, W and H are respectively the width and height of the previous frame's target tracking box, and K is the scale ratio between the two;
step 3.2: if only one target of the current frame is detected within the K neighborhood, namely at least two thirds of the area of that target's detection box lies within the K neighborhood, the target is the target of the previous frame; updating the tracking box and continuing to execute step 4; if two or more targets appear within the K neighborhood in the current frame, executing step 3.3;
step 3.3: computing the similarity between each target in the K neighborhood and the tracking target to obtain similarity scores and ranking them;
step 3.4: computing the IoU and the center-point Euclidean distance between the previous frame's tracking box and each target detection box taken in ranked-similarity order; taking the detection box whose center point has the smallest Euclidean distance and, together with the image-matching similarity computed earlier and the largest IoU, judging which detection box contains the tracking target, the order of judgment being: first comparing image similarity, then excluding similar vehicles by IoU, and finally selecting the tracking target by the center-point Euclidean distance.
4. The unmanned aerial vehicle aerial photography-based single-target vehicle tracking method according to claim 3, wherein: in step 3.4, IoU is calculated to satisfy equation (2):
IoU(gt, bb) = area(gt ∩ bb) / area(gt ∪ bb)    (2)

wherein gt is the tracking box of the previous frame and bb is a detection box of the current frame within the K neighborhood range; the IoU is calculated between gt and each detection box in the K neighborhood, and the detection box with the largest IoU value is retained, satisfying formula (3):

IoU(gt, bb)_max = Max(IoU(gt, bb_1), ..., IoU(gt, bb_n))    (3)

wherein IoU(·) is the intersection-over-union of a detection box with the previous frame's tracking box, gt is the tracking box of the previous frame, bb_i is the ith detection box appearing in the K++ neighborhood of the current frame, and n is the total number of detection boxes in the K++ neighborhood;
the center-point Euclidean distance is calculated to satisfy formula (4):

d(c_1, c_2) = sqrt((x_1 - x_2)^2 + (y_1 - y_2)^2)    (4)

wherein d(·) is the Euclidean distance between the two points in parentheses, c_1 and c_2 are the center points of the two boxes, with coordinates (x_1, y_1) and (x_2, y_2) respectively; c_gt is the center point of the previous frame's tracking box and c_bb is the center point of a detection box of the current frame.
5. The unmanned aerial vehicle aerial photography-based single-target vehicle tracking method according to claim 1, wherein: the specific method of the step 4 comprises the following steps:
step 4.1: after the target to be tracked is selected, extracting color histogram features and HOG features of the tracked target, and converting the two features into feature vectors;
step 4.2: extracting the detected targets of the same category as pictures in subsequent frames, extracting the color histogram feature and the HOG feature of each target in the same way, and obtaining feature vectors;
step 4.3: respectively calculating the color histogram feature similarity and the HOG feature similarity between the tracked target and all the targets obtained in the step 4.2, carrying out weighted scoring, and finally sorting all the scores; finally, selecting the candidate frame with the maximum similarity total score as a real tracking object for tracking;
step 4.4: taking the highest score from the ranking of step 4.3 and, combining it with the results computed in step 3, selecting the corresponding target as the position where the current frame's tracking target appears, under the judgment conditions that the IoU between the tracking box and the detection box is largest and the Euclidean distance between their center points is shortest.
6. The unmanned aerial vehicle aerial photography-based single-target vehicle tracking method according to claim 5, wherein: in the calculation of the color histogram feature vector in step 4.1 and step 4.2, the value range 0-255 of each primary color is divided into four equal regions: [0,63] is region 0, [64,127] is region 1, [128,191] is region 2, and [192,255] is region 3; after partitioning, each primary color corresponds to one of four values, any color appearing in the image necessarily belongs to one of the four regions, and the number of pixels appearing in each region is counted to obtain a 64-dimensional feature vector.
7. The unmanned aerial vehicle aerial photography-based single-target vehicle tracking method according to claim 5, wherein: in step 4.3, the color histogram feature similarity is computed as cosine similarity, which satisfies formula (5):

cos θ = (P · Q) / (|P| · |Q|) = (Σ_{i=1..m} P_i Q_i) / (sqrt(Σ_{i=1..m} P_i^2) · sqrt(Σ_{i=1..m} Q_i^2))    (5)

the smaller the angle, the larger the corresponding cosine value and the more similar the vectors; P and Q are two multidimensional vectors, P = [P_1, P_2, ..., P_m] and Q = [Q_1, Q_2, ..., Q_m], with m the vector dimension;
the HOG feature similarity uses a HOG feature descriptor, namely feature vectors, and finally calculates Euclidean distance between the feature vectors, wherein the smaller the distance is, the more similar the two pictures are proved; since the HOG feature vector is an n-dimensional vector, the corresponding euclidean distance satisfies equation (6):
Figure FDA0003512513650000032
where d is the Euclidean distance of the vector, xi、yiTwo coordinate values of a vector in a multi-dimensional space;
after the cosine similarity calculation of the color histogram features and the Euclidean-distance similarity calculation of the HOG features, the results are respectively multiplied by the corresponding weights and added to obtain the final similarity score, satisfying formula (7):
S_i = W_1 · S(c_i, c_t) + W_2 · S(h_i, h_t)    (7)

in the formula, S_i is the total similarity score between the image in the ith candidate box and the tracking target; W_1 is the color-histogram-feature similarity weight coefficient; W_2 is the HOG-feature similarity weight coefficient; S is the similarity-calculation function of the parameters in parentheses; S(c_i, c_t) is the color histogram feature similarity function of the ith candidate box and the tracking target t, with c_i and c_t their respective color histogram feature vectors; S(h_i, h_t) is the HOG feature similarity function of the ith candidate box and the tracking target t, with h_i the HOG feature of the ith candidate box and h_t the HOG feature of the tracked target.
CN202210156746.6A 2022-02-21 2022-02-21 Single-target vehicle tracking method based on unmanned aerial vehicle aerial photography Pending CN114529584A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210156746.6A CN114529584A (en) 2022-02-21 2022-02-21 Single-target vehicle tracking method based on unmanned aerial vehicle aerial photography

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210156746.6A CN114529584A (en) 2022-02-21 2022-02-21 Single-target vehicle tracking method based on unmanned aerial vehicle aerial photography

Publications (1)

Publication Number Publication Date
CN114529584A 2022-05-24

Family

ID=81625192

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210156746.6A Pending CN114529584A (en) 2022-02-21 2022-02-21 Single-target vehicle tracking method based on unmanned aerial vehicle aerial photography

Country Status (1)

Country Link
CN (1) CN114529584A (en)


Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115273078A (en) * 2022-09-30 2022-11-01 南通炜秀环境技术服务有限公司 Sewage treatment method and system based on image data
CN117372719A (en) * 2023-12-07 2024-01-09 四川迪晟新达类脑智能技术有限公司 Target detection method based on screening
CN117689907A (en) * 2024-02-04 2024-03-12 福瑞泰克智能系统有限公司 Vehicle tracking method, device, computer equipment and storage medium
CN117689907B (en) * 2024-02-04 2024-04-30 福瑞泰克智能系统有限公司 Vehicle tracking method, device, computer equipment and storage medium

Similar Documents

Publication Publication Date Title
CN114529584A (en) Single-target vehicle tracking method based on unmanned aerial vehicle aerial photography
Wen et al. Visdrone-sot2018: The vision meets drone single-object tracking challenge results
CN111161317A (en) Single-target tracking method based on multiple networks
CN109214403B (en) Image recognition method, device and equipment and readable medium
Zhu et al. Multi-drone-based single object tracking with agent sharing network
CN113011329A (en) Pyramid network based on multi-scale features and dense crowd counting method
CN111445497B (en) Target tracking and following method based on scale context regression
CN111127519A (en) Target tracking control system and method for dual-model fusion
Bouachir et al. Structure-aware keypoint tracking for partial occlusion handling
CN111429485B (en) Cross-modal filtering tracking method based on self-adaptive regularization and high-reliability updating
CN112329784A (en) Correlation filtering tracking method based on space-time perception and multimodal response
CN112614161A (en) Three-dimensional object tracking method based on edge confidence
CN107330918B (en) Football video player tracking method based on online multi-instance learning
CN112767440A (en) Target tracking method based on SIAM-FC network
CN111091583B (en) Long-term target tracking method
Fan et al. Covered vehicle detection in autonomous driving based on faster rcnn
CN116381672A (en) X-band multi-expansion target self-adaptive tracking method based on twin network radar
CN112560651B (en) Target tracking method and device based on combination of depth network and target segmentation
CN112614158B (en) Sampling frame self-adaptive multi-feature fusion online target tracking method
CN112487927B (en) Method and system for realizing indoor scene recognition based on object associated attention
CN111242980B (en) Point target-oriented infrared focal plane blind pixel dynamic detection method
CN113920155A (en) Moving target tracking algorithm based on kernel correlation filtering
CN113269808A (en) Video small target tracking method and device
CN113129332A (en) Method and apparatus for performing target object tracking
Wang et al. Adaptive weight collaborative complementary learning for robust visual tracking

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination