CN112819856A - Target tracking method and self-positioning method applied to unmanned aerial vehicle - Google Patents

Target tracking method and self-positioning method applied to unmanned aerial vehicle Download PDF

Info

Publication number
CN112819856A
CN112819856A CN202110086693.0A CN202110086693A CN112819856A CN 112819856 A CN112819856 A CN 112819856A CN 202110086693 A CN202110086693 A CN 202110086693A CN 112819856 A CN112819856 A CN 112819856A
Authority
CN
China
Prior art keywords
frame image
target
aerial vehicle
unmanned aerial
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110086693.0A
Other languages
Chinese (zh)
Other versions
CN112819856B (en
Inventor
叶俊杰
符长虹
林付凌
丁方强
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tongji University
Original Assignee
Tongji University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tongji University filed Critical Tongji University
Priority to CN202110086693.0A priority Critical patent/CN112819856B/en
Publication of CN112819856A publication Critical patent/CN112819856A/en
Application granted granted Critical
Publication of CN112819856B publication Critical patent/CN112819856B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/14Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/141Discrete Fourier transforms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/50Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; Projection analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Discrete Mathematics (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention relates to a target tracking method and a self-positioning method applied to an unmanned aerial vehicle, wherein the method is based on relevant filtering, the interframe change rate of a response image is smoothly detected by constraining the second-order difference of response in the tracking process, the capability of a tracker for adapting to the target apparent change is enhanced, the weight distribution of each characteristic channel is iteratively optimized by introducing a channel weight regular term and using an alternative direction multiplier method in the training process, the self-adaptive distribution of channel weight is realized, the tracker focuses on the characteristic channel with higher reliability, and the judgment force of the tracker is enhanced; on the basis of the target tracking method, the invention provides the unmanned aerial vehicle self-positioning system which has better feasibility and universality.

Description

Target tracking method and self-positioning method applied to unmanned aerial vehicle
Technical Field
The invention relates to the technical field of visual target tracking and self-positioning, relates to a target tracking method and a self-positioning method applied to an unmanned aerial vehicle, and particularly relates to an unmanned aerial vehicle target tracking and self-positioning method based on multi-regularization correlation filtering.
Background
Visual target tracking is an important research direction in the field of computer vision. The process of target tracking is essentially a process of dynamic information extraction and analysis of the target according to the initial information of the target. With the rapid development of computer vision technology and image processing technology, the target tracking technology has great progress, and the applications in the aspects of automatic driving, man-machine interaction, intelligent monitoring systems and the like are developed.
Unmanned aerial vehicle is because of its detection range is wide, the flexibility is strong, the deployment is fast, characteristics such as with low costs, no matter in military field still civilian field, all receives favour. The development of the target tracking technology undoubtedly brings huge opportunity for multi-field application of the unmanned aerial vehicle, and the applications of autonomous landing, intelligent inspection, traffic management, video shooting, intelligent monitoring and the like are developed. However, as the tracked target and the background are usually in a dynamic change process, the target tracking task faces many unpredictable visual uncertainties, such as target deformation, scale change, occlusion, and the like, which makes the target tracking task extremely challenging. Due to the particularity of the carrier of the drone, the application of visual target tracking on the drone platform faces unique challenges: (1) due to the high visual angle and high speed of the unmanned aerial vehicle, challenging scenes such as visual angle change, rapid lens movement, motion blur and the like frequently occur in the target tracking process of the unmanned aerial vehicle, so that the target appearance change is caused, further model training is interfered, and even tracking failure is caused; (2) in consideration of cost and cruising ability of the unmanned aerial vehicle, the computational power level of an onboard computer carried by the unmanned aerial vehicle is limited, but the tracking task of the unmanned aerial vehicle generally has strict real-time performance requirements, so that the tracking algorithm facing the unmanned aerial vehicle has to well balance the computational complexity and the tracking performance.
The existing target tracking algorithms with excellent performance can be divided into two types: a target tracking algorithm based on correlation filtering and a target tracking algorithm based on deep learning. The method based on deep learning exhibits excellent performance depending on strong discrimination brought by deep semantic features, but the deployment of the method depends on an expensive high-performance GPU, so that the method is particularly difficult to apply to unmanned planes which are generally only provided with a single CPU. In recent years, a target tracking algorithm based on correlation filtering has received wide attention in the field of unmanned aerial vehicle target tracking due to high calculation efficiency and good tracking performance. Henriques et al propose a nuclear Correlation filter tracker in the document High-Speed Tracking with Kernelized Correlation Filters, exhibiting fast and excellent Tracking performance. Galoogahi et al introduce a binary clipping matrix into a related filtering frame in Learning Background-Aware Correlation Filters for Visual Tracking, which alleviates the boundary effect and further improves the Tracking performance. However, these trackers typically only utilize the training samples in the current frame to train the filter, which results in a lack of historical information. Therefore, they are prone to tracking drift when faced with scenes such as similar objects, target scale changes, target/drone motion, etc. For this reason, some studies have attempted to introduce temporal information into the correlation filtering. Li et al, in the document left Spatial-Temporal regulated Correlation Filters for Visual Tracking, propose to limit the variation of successive inter-frame Filters. The document Visual Tracking Visual Adaptive mapping-regulated Correlation Filters considers the temporal continuity of the spatial regularization term to cope with sudden appearance changes. Huang et al, in the document Learning interference reconstructed Filters for Real-Time UAV Tracking, propose to suppress the variation of the continuous inter-frame response map to suppress possible Tracking result anomalies. These methods are commonly implemented by limiting the difference of components in the current frame from the previous frame, which enhances robustness to some extent. However, in the case of sharp changes in the appearance of the target, such as motion blur, fast motion, the above approach forces the components to be constrained rather than tolerating reasonable changes to accommodate changes in the appearance of the target, resulting in tracking failure. In addition, the existing correlation filtering tracking method generally considers all characteristic channels equally, and some channels with redundant information often cannot help to locate an object, which limits further improvement of tracking robustness.
The vision-based self-positioning method is a basic subtask in many unmanned aerial vehicle related applications, and the existing vision-based unmanned aerial vehicle self-positioning method generally utilizes local features to perform 2D-3D matching. Faessler et al propose an Infrared LED based Monocular positioning System in the document a singular position System based on infra red LEDs, but their application relies on the deployment of Infrared cameras.
Disclosure of Invention
The invention aims to overcome the defects in the prior art and provide a target tracking method and a self-positioning method applied to an unmanned aerial vehicle.
The purpose of the invention can be realized by the following technical scheme:
a target tracking method applied to an unmanned aerial vehicle comprises the following steps:
reading a t frame image acquired by an unmanned aerial vehicle, inputting the position, width and height of a tracking target in the t frame image, confirming a training area of the target in the t frame image, extracting the characteristics of the training area of the t frame image, updating an appearance model of the t frame image according to the characteristics of the training area of the t frame image, and training a filter model and channel weight distribution of the t frame image;
the method comprises the steps of taking a training area of a target in a t +1 th frame image as a search area of the t +1 th frame image, extracting search area characteristics of the t +1 th frame image, obtaining a detection response image of the t +1 th frame image according to the search area characteristics of the t +1 th frame image and a filter model and channel weight distribution of the t +1 th frame image, updating the position, width and height of the target in the t +1 th frame image, judging whether a video frame input collected by an unmanned aerial vehicle exists subsequently, if yes, repeating the steps to track the target of a next frame image, and if not, ending the tracking process.
Preferably, the method comprises the steps of:
a1: reading a t frame image acquired by the unmanned aerial vehicle, and inputting the position, the width w and the height h of a tracking target in the t frame image;
a2: extracting target bits from the t frame image according to the position of the target in the t frame imageIs arranged as a center and has a side length of
Figure BDA0002910995750000031
Wherein alpha is a predefined training area proportion, and extracting HOG characteristics, CN characteristics and gray-scale characteristics of the training area in the t-th frame image;
a3: updating an appearance model of the t frame image according to the HOG characteristics, the CN characteristics and the gray-scale characteristics of the training area in the t frame image;
a4: training a filter model and channel weight distribution of the t frame image according to the appearance model of the t frame image;
a5: the position of the target in the t frame image is taken as the center, and the side length is
Figure BDA0002910995750000032
The square of the image is used as a search area of the t +1 th frame image, HOG characteristics, CN characteristics and gray-scale characteristics of the search area of the t +1 th frame image are extracted, and the search area characteristics of the t +1 th frame image are obtained;
a6: acquiring a detection response diagram of the t +1 frame image according to the search region characteristics of the t +1 frame image, the filter model of the t frame image and channel weight distribution;
a7: updating the position, the width w and the height h of the target in the t +1 frame image according to the detection response image of the t +1 frame image;
a8: and judging whether video frames acquired by the unmanned aerial vehicle are input subsequently, if so, making t equal to t +1, repeating the steps A2-A8 to track the target of the next frame of image, and otherwise, ending the tracking process.
Preferably, the step a3 includes:
a3-1: the HOG characteristics, the CN characteristics and the gray-scale characteristics of the training area in the t frame image are fused to obtain the training area characteristics of the t frame image with D channels
Figure BDA0002910995750000033
A3-2: for the characteristics of the training area
Figure BDA0002910995750000041
Performing discrete Fourier transform to obtain Fourier domain of training region characteristics
Figure BDA0002910995750000042
A3-3: judging whether the t frame image is the 1 st frame image collected by the unmanned aerial vehicle, if so, updating the appearance model of the t frame image
Figure BDA0002910995750000043
Wherein
Figure BDA0002910995750000044
Is an apparent model of the image of the t-th frame,
Figure BDA0002910995750000045
expressing a Fourier domain, otherwise, updating an appearance model of the t frame image by adopting a linear interpolation formula based on a preset learning rate eta
Figure BDA0002910995750000046
Preferably, the linear interpolation formula is:
Figure BDA0002910995750000047
preferably, the specific step of the step a4 includes:
apparent model based on the t frame image
Figure BDA0002910995750000048
Detection response graph R of preset Gaussian training label y, t-1 frame image and t-2 frame imaget-1、Rt-2And training a filter model and channel weight distribution of the t-th frame image by minimizing a preset multi-regularization objective function.
Preferably, the multi-regularization objective function is:
Figure BDA0002910995750000049
wherein h istIs a filter model for the image of the t-th frame,
Figure BDA00029109957500000410
filter model for the d characteristic channel of the t frame image, betatChannel weight distribution for the t-th frame image, D is the number of characteristic channels, symbol ≧ represents the circular convolution,
Figure BDA00029109957500000411
Figure BDA00029109957500000412
is the feature of the d characteristic channel of the appearance model of the t frame image,
Figure BDA00029109957500000413
is a diagonal matrix composed of the weights of the D eigen-channels,
Figure BDA00029109957500000414
the weight of the d characteristic channel of the t frame image is defined, P is a binary cutting matrix, the last three items in the multi-regularization target function are respectively a filter regularization item, a channel weight regularization item and a response difference regularization item to form a plurality of regularization items, kappa, gamma and lambda are respectively coefficients of the filter regularization item, the channel weight regularization item and the response difference regularization item,
Figure BDA00029109957500000415
respectively representing the first difference of the d characteristic channel detection response graphs of the t frame image and the t-1 frame image,
and:
Figure BDA00029109957500000416
wherein the content of the first and second substances,
Figure BDA00029109957500000419
representing a shift operator, the effect of which is to shift the matrix maxima to a central position, and
Figure BDA00029109957500000417
preferably, in step a4, the multi-regularization objective function is minimized by using an alternating direction multiplier method, and a filter model and a channel weight assignment of the t-th frame are obtained.
Preferably, in the step a5, the HOG feature, the CN feature and the gray-scale feature of the search region in the t +1 th frame image are fused to obtain the search region feature of the t +1 th frame image with D channels
Figure BDA00029109957500000418
Preferably, the step a6 is based on the characteristics of the search region in the t +1 th frame image
Figure BDA0002910995750000051
Filter model h of the t-th frame imagetChannel weight assignment β of the t-th frame imagetObtaining a detection response image R of the t +1 th frame image by a detection formulat+1
Preferably, the detection formula is:
Figure BDA0002910995750000052
wherein the content of the first and second substances,
Figure BDA0002910995750000053
representing the inverse discrete fourier transform and,
Figure BDA0002910995750000054
is the weight of the d characteristic channel of the t frame image,
Figure BDA0002910995750000055
for the d characteristic channel of the t +1 th frame imageThe characteristics of the area are searched for,
Figure BDA0002910995750000056
Figure BDA0002910995750000057
which is representative of the fourier domain,
Figure BDA0002910995750000058
p is a binary clipping matrix for the filter model of the d-th eigen channel of the t-th frame image.
A self-positioning method applied to an unmanned aerial vehicle comprises the following steps:
b1: reading a t frame image of a self-positioning video sequence acquired by an unmanned aerial vehicle, and acquiring the position, width w and height h of a mark point in the t frame image, wherein the t frame image is provided with 4 or more mark points;
b2: the target tracking method applied to the unmanned aerial vehicle is operated in parallel, and the positions of the mark points in the subsequent image frames are tracked respectively;
b3: converting the coordinate position of the mark point in the image into a world coordinate system;
b4: and outputting the coordinate position of the unmanned aerial vehicle in the world coordinate system after iteratively optimizing the reconstruction error.
Preferably, B4 specifically includes: and (3) iteratively optimizing the reprojection error of the image coordinate system-world coordinate system of the mark point by using a nonlinear least square method, and outputting the coordinate position of the unmanned aerial vehicle in the world coordinate system.
Compared with the prior art, the invention has the beneficial effects that:
(1) according to the target tracking method applied to the unmanned aerial vehicle, the time regular term based on the response second-order difference is designed, and the interframe change rate of the response image is smoothly detected by reasonably introducing the historical response image information, so that the capability of the tracking method for adapting to the target apparent change is enhanced;
(2) the invention designs a channel weight regular term, and realizes the self-adaptive distribution of channel weights by iteratively optimizing the weight distribution of each characteristic channel by using an alternating direction multiplier method in the training process, so that the tracking method focuses on the characteristic channel with higher reliability and enhances the discrimination of the tracking method;
(3) based on the response difference regular term, the channel weight regular term and the filter regular term, the invention constructs a multi-regularization related filtering unmanned aerial vehicle target tracking method, and the tracking robustness is greatly improved;
(4) the invention designs a self-positioning method applied to the unmanned aerial vehicle, and the target tracking algorithm can aim at any target, so the self-positioning method can be applied to a wide range of complex scenes, and provides a new solution for the self-positioning task of the unmanned aerial vehicle.
Drawings
Fig. 1 is a flowchart of a target tracking method applied to an unmanned aerial vehicle according to the present invention;
fig. 2 is an overall framework diagram of a target tracking method applied to an unmanned aerial vehicle according to the present invention;
FIG. 3 is a qualitative comparison of a target tracking method applied to an unmanned aerial vehicle according to the present invention with an existing tracking method;
FIG. 4 is a visualization of a channel weight regularization term applied to a target tracking method of an unmanned aerial vehicle according to the present invention;
FIG. 5 is a comparison of the performance of the UAVDT data set of the target tracking method of the present invention applied to the UAV and the existing excellent tracking method;
fig. 6 is a flow chart of a self-positioning method applied to a drone;
FIG. 7 is a diagram illustrating a scenario in which a self-positioning method applied to an unmanned aerial vehicle is applied to an unmanned aerial vehicle-self-guided vehicle cooperative work scenario;
fig. 8 is a schematic diagram of the marking points on the self-guiding carriage.
Detailed Description
The invention is described in detail below with reference to the figures and specific embodiments. Note that the following description of the embodiments is merely a substantial example, and the present invention is not intended to be limited to the application or the use thereof, and is not limited to the following embodiments.
Examples
A target tracking method applied to an unmanned aerial vehicle comprises the following steps:
reading a t frame image acquired by an unmanned aerial vehicle, inputting the position, width and height of a tracking target in the t frame image, confirming a training area of the target in the t frame image, extracting the characteristics of the training area of the t frame image, updating an appearance model of the t frame image according to the characteristics of the training area of the t frame image, and training a filter model and channel weight distribution of the t frame image;
the method comprises the steps of taking a training area of a target in a t +1 th frame image as a search area of the t +1 th frame image, extracting search area characteristics of the t +1 th frame image, obtaining a detection response image of the t +1 th frame image according to the search area characteristics of the t +1 th frame image and a filter model and channel weight distribution of the t +1 th frame image, updating the position, width and height of the target in the t +1 th frame image, judging whether a video frame input collected by an unmanned aerial vehicle exists subsequently, if yes, repeating the steps to track the target of a next frame image, and if not, ending the tracking process.
As shown in fig. 1, fig. 2, and fig. 4, specifically, the method includes the following steps:
a1: and reading a t frame image acquired by the unmanned aerial vehicle, and inputting the position, the width w and the height h of the tracking target in the t frame image.
In a1, t is 1,2,3 ….
A2: according to the position of the target in the t frame image, extracting the position of the target as the center and the side length as the side length from the t frame image
Figure BDA0002910995750000071
Wherein α is a predefined training region proportion, and extracting the HOG features, CN features and gray-scale features of the training region in the t-th frame image.
A3: and updating the appearance model of the image of the t frame according to the HOG characteristic, the CN characteristic and the gray-scale characteristic of the training area in the image of the t frame.
Step a3 specifically includes:
a3-1: for the HOG characteristic of the training area in the t frame image,Performing fusion processing on the CN features and the gray-scale features to obtain training region features of the t frame image with D channels
Figure BDA0002910995750000072
A3-2: for the characteristics of the training area
Figure BDA0002910995750000073
Performing discrete Fourier transform to obtain Fourier domain of training region characteristics
Figure BDA0002910995750000074
A3-3: judging whether the t frame image is the 1 st frame image collected by the unmanned aerial vehicle, if so, updating the appearance model of the t frame image
Figure BDA0002910995750000075
Wherein
Figure BDA0002910995750000076
Is an apparent model of the image of the t-th frame,
Figure BDA0002910995750000077
expressing a Fourier domain, otherwise, updating an appearance model of the t frame image by adopting a linear interpolation formula based on a preset learning rate eta
Figure BDA0002910995750000078
In this embodiment, the linear interpolation formula is:
Figure BDA0002910995750000079
a4: and training a filter model and channel weight distribution of the t frame image according to the appearance model of the t frame image.
The step A4 specifically comprises the following steps:
apparent model based on the t frame image
Figure BDA00029109957500000710
Detection response graph R of preset Gaussian training label y, t-1 frame image and t-2 frame imaget-1、Rt-2And training a filter model and channel weight distribution of the t-th frame image by minimizing a preset multi-regularization objective function.
The multi-regularization objective function is as follows:
Figure BDA00029109957500000711
wherein h istIs a filter model for the image of the t-th frame,
Figure BDA00029109957500000712
filter model for the d characteristic channel of the t frame image, betatChannel weight distribution for the t-th frame image, D is the number of characteristic channels, symbol ≧ represents the circular convolution,
Figure BDA00029109957500000713
Figure BDA00029109957500000714
is the feature of the d characteristic channel of the appearance model of the t frame image,
Figure BDA00029109957500000715
is a diagonal matrix composed of the weights of the D eigen-channels,
Figure BDA00029109957500000716
the weight of the d characteristic channel of the t frame image is defined, P is a binary cutting matrix, the last three items in the multi-regularization target function are respectively a filter regularization item, a channel weight regularization item and a response difference regularization item to form a plurality of regularization items, kappa, gamma and lambda are respectively coefficients of the filter regularization item, the channel weight regularization item and the response difference regularization item,
Figure BDA0002910995750000081
respectively representing the t-th frame and the t-1 th frameDetects the first order difference of the response map of the d-th eigenchannel of (1),
and:
Figure BDA0002910995750000082
wherein the content of the first and second substances,
Figure BDA0002910995750000083
representing a shift operator, the effect of which is to shift the matrix maxima to a central position, and
Figure BDA0002910995750000084
in this embodiment, in step a4, an alternating direction multiplier method is used to minimize the multi-regularization objective function, and a filter model and channel weight assignment of the t-th frame are obtained.
A5: the position of the target in the t frame image is taken as the center, and the width is
Figure BDA0002910995750000085
The square of the image is used as a search area of the t +1 th frame image, the HOG characteristic, the CN characteristic and the gray-scale characteristic of the search area of the t +1 th frame image are extracted, and the search area characteristic of the t +1 th frame image is obtained.
Similar to A3, in step a5, the HOG feature, the CN feature, and the gray-scale feature of the search region in the t +1 th frame image are fused to obtain the search region feature of the t +1 th frame image with D channels
Figure BDA0002910995750000086
A6: and acquiring a detection response diagram of the t +1 frame image according to the search region characteristics of the t +1 frame image, the filter model of the t frame image and channel weight distribution. Step A6 is based on the characteristics of the search region in the t +1 th frame image
Figure BDA0002910995750000087
Filter model h of the t-th frame imagetThe image of the t-th frameChannel weight assignment βtObtaining a detection response image R of the t +1 th frame image by a detection formulat+1The detection formula is as follows:
Figure BDA0002910995750000088
wherein the content of the first and second substances,
Figure BDA0002910995750000089
representing the inverse discrete fourier transform and,
Figure BDA00029109957500000810
is the weight of the d characteristic channel of the t frame image,
Figure BDA00029109957500000811
the search region feature of the d-th feature channel of the t + 1-th frame image,
Figure BDA00029109957500000812
Figure BDA00029109957500000813
which is representative of the fourier domain,
Figure BDA00029109957500000814
p is a binary clipping matrix for the filter model of the d-th eigen channel of the t-th frame image.
A7: updating the position, the width w and the height h of the target in the t +1 frame image according to the detection response image of the t +1 frame image;
a8: and judging whether video frames acquired by the unmanned aerial vehicle are input subsequently, if so, making t equal to t +1, repeating the steps A2-A8 to track the target of the next frame of image, and otherwise, ending the tracking process.
As shown in fig. 3, in this embodiment, the target is tracked by using the method of the present invention and similar methods, a second order difference change curve of the inter-frame detection response graph of the method of the present invention and similar methods is drawn in the graph, the change rate of the inter-frame detection response graph is reflected, the inter-frame detection response graph of similar methods changes drastically, the method of the present invention effectively smoothes the change of the inter-frame detection response graph, and the robustness of the tracking algorithm under the condition that the target appearance model changes rapidly is enhanced.
As shown in FIG. 5, the method of the invention and other 35 excellent similar methods at present are evaluated on a UAVDT unmanned aerial vehicle target tracking data set, the method of the invention shows higher precision and success rate, and is very suitable for unmanned aerial vehicle target tracking tasks when the running speed of a single CPU reaches 50.5 frames per second.
A self-positioning method applied to an unmanned aerial vehicle comprises the following steps:
b1: reading a t frame image of a self-positioning video sequence acquired by an unmanned aerial vehicle, and acquiring the position, width w and height h of a mark point in the t frame image, wherein the t frame image is provided with 4 or more mark points;
b2: in parallel to the target tracking method applied to the unmanned aerial vehicle, the positions of the mark points in the subsequent image frames are tracked respectively;
b3: converting the coordinate position of the mark point in the image into a world coordinate system;
b4: and outputting the coordinate position of the unmanned aerial vehicle in the world coordinate system after iteratively optimizing the reconstruction error.
In this embodiment, B4 specifically includes: and (3) iteratively optimizing the reprojection error of the image coordinate system-world coordinate system of the mark point by using a nonlinear least square method, and outputting the coordinate position of the unmanned aerial vehicle in the world coordinate system.
In one embodiment of the present invention, as shown in fig. 7 and 8, the video sequence of the AGV is acquired by the unmanned aerial vehicle, and the AGV is provided with 4 mark points.
The above embodiments are merely examples and do not limit the scope of the present invention. These embodiments may be implemented in other various manners, and various omissions, substitutions, and changes may be made without departing from the technical spirit of the present invention.

Claims (10)

1. A target tracking method applied to an unmanned aerial vehicle is characterized by comprising the following steps:
reading a t frame image acquired by an unmanned aerial vehicle, inputting the position, width and height of a tracking target in the t frame image, confirming a training area of the target in the t frame image, extracting the characteristics of the training area of the t frame image, updating an appearance model of the t frame image according to the characteristics of the training area of the t frame image, and training a filter model and channel weight distribution of the t frame image;
the method comprises the steps of taking a training area of a target in a t +1 th frame image as a search area of the t +1 th frame image, extracting search area characteristics of the t +1 th frame image, obtaining a detection response image of the t +1 th frame image according to the search area characteristics of the t +1 th frame image and a filter model and channel weight distribution of the t +1 th frame image, updating the position, width and height of the target in the t +1 th frame image, judging whether a video frame input collected by an unmanned aerial vehicle exists subsequently, if yes, repeating the steps to track the target of a next frame image, and if not, ending the tracking process.
2. The target tracking method applied to the unmanned aerial vehicle as claimed in claim 1, wherein the method comprises the following steps:
a1: reading a t frame image acquired by the unmanned aerial vehicle, and inputting the position, the width w and the height h of a tracking target in the t frame image;
a2: according to the position of the target in the t frame image, extracting the position of the target as the center and the side length as the side length from the t frame image
Figure FDA0002910995740000011
Wherein alpha is a predefined training area proportion, and extracting HOG characteristics, CN characteristics and gray-scale characteristics of the training area in the t-th frame image;
a3: updating an appearance model of the t frame image according to the HOG characteristics, the CN characteristics and the gray-scale characteristics of the training area in the t frame image;
a4: training a filter model and channel weight distribution of the t frame image according to the appearance model of the t frame image;
a5: the position of the target in the t frame image is taken asCenter, side length of
Figure FDA0002910995740000012
The square of the image is used as a search area of the t +1 th frame image, HOG characteristics, CN characteristics and gray-scale characteristics of the search area of the t +1 th frame image are extracted, and the search area characteristics of the t +1 th frame image are obtained;
a6: acquiring a detection response diagram of the t +1 frame image according to the search region characteristics of the t +1 frame image, the filter model of the t frame image and channel weight distribution;
a7: updating the position, the width w and the height h of the target in the t +1 frame image according to the detection response image of the t +1 frame image;
a8: and judging whether video frames acquired by the unmanned aerial vehicle are input subsequently, if so, making t equal to t +1, repeating the steps A2-A8 to track the target of the next frame of image, and otherwise, ending the tracking process.
3. The method of claim 2, wherein the step a3 includes:
a3-1: the HOG characteristics, the CN characteristics and the gray-scale characteristics of the training area in the t frame image are fused to obtain the training area characteristics of the t frame image with D channels
Figure FDA0002910995740000021
A3-2: for the characteristics of the training area
Figure FDA0002910995740000022
Performing discrete Fourier transform to obtain Fourier domain of training region characteristics
Figure FDA0002910995740000023
A3-3: judging whether the t frame image is the 1 st frame image collected by the unmanned aerial vehicle, if so, updating the appearance model of the t frame image
Figure FDA0002910995740000024
Figure FDA0002910995740000025
Wherein
Figure FDA0002910995740000026
Is an apparent model of the image of the t-th frame,
Figure FDA0002910995740000027
expressing a Fourier domain, otherwise, updating an appearance model of the t frame image by adopting a linear interpolation formula based on a preset learning rate eta
Figure FDA0002910995740000028
4. The method of claim 3, wherein the linear interpolation formula is as follows:
Figure FDA0002910995740000029
5. the target tracking method applied to the unmanned aerial vehicle as claimed in claim 2, wherein the step a4 comprises the following steps:
apparent model based on the t frame image
Figure FDA00029109957400000210
Detection response graph R of preset Gaussian training label y, t-1 frame image and t-2 frame imaget-1、Rt-2And training a filter model and channel weight distribution of the t-th frame image by minimizing a preset multi-regularization objective function.
6. The method of claim 5, wherein the multi-regularized objective function is:
Figure FDA00029109957400000211
wherein h istIs a filter model for the image of the t-th frame,
Figure FDA00029109957400000212
filter model for the d characteristic channel of the t frame image, betatChannel weight distribution for the t-th frame image, D is the number of characteristic channels, symbol ≧ represents the circular convolution,
Figure FDA00029109957400000213
Figure FDA00029109957400000214
is the feature of the d characteristic channel of the appearance model of the t frame image,
Figure FDA00029109957400000215
is a diagonal matrix composed of the weights of the D eigen-channels,
Figure FDA00029109957400000216
the weight of the d characteristic channel of the t frame image is defined, P is a binary cutting matrix, the last three items in the multi-regularization target function are respectively a filter regularization item, a channel weight regularization item and a response difference regularization item to form a plurality of regularization items, kappa, gamma and lambda are respectively coefficients of the filter regularization item, the channel weight regularization item and the response difference regularization item,
Figure FDA0002910995740000031
respectively representing the first difference of the d characteristic channel detection response graphs of the t frame image and the t-1 frame image,
and:
Figure FDA0002910995740000032
wherein the content of the first and second substances,
Figure FDA0002910995740000033
representing a shift operator, the effect of which is to shift the matrix maxima to a central position, and
Figure FDA0002910995740000034
7. the method of claim 5, wherein the step A4 is to minimize the multi-regularized objective function by using an alternating direction multiplier method to obtain a filter model and a channel weight distribution for the t-th frame.
8. The method of claim 2, wherein the step A6 is based on the characteristics of the search area in the t +1 th frame image
Figure FDA0002910995740000035
Filter model h of the t-th frame imagetChannel weight assignment β of the t-th frame imagetObtaining a detection response image R of the t +1 th frame image by a detection formulat+1
9. The method of claim 8, wherein the detection formula is:
Figure FDA0002910995740000036
wherein the content of the first and second substances,
Figure FDA0002910995740000037
representing the inverse discrete fourier transform and,
Figure FDA0002910995740000038
is the weight of the d characteristic channel of the t frame image,
Figure FDA0002910995740000039
the search region feature of the d-th feature channel of the t + 1-th frame image,
Figure FDA00029109957400000310
Figure FDA00029109957400000311
which is representative of the fourier domain,
Figure FDA00029109957400000312
p is a binary clipping matrix for the filter model of the d-th eigen channel of the t-th frame image.
10. A self-positioning method applied to an unmanned aerial vehicle, based on any one of claims 1 to 9, and a target tracking method applied to the unmanned aerial vehicle, characterized by comprising the following steps:
b1: reading a t frame image of a self-positioning video sequence acquired by an unmanned aerial vehicle, and acquiring the position, width w and height h of a mark point in the t frame image, wherein the t frame image is provided with 4 or more mark points;
b2: the method for tracking the target applied to the unmanned aerial vehicle, as claimed in any one of claims 1-9, is operated in parallel, and positions of mark points in subsequent image frames are tracked respectively;
b3: converting the coordinate position of the mark point in the image into a world coordinate system;
b4: and outputting the coordinate position of the unmanned aerial vehicle in the world coordinate system after iteratively optimizing the reconstruction error.
CN202110086693.0A 2021-01-22 2021-01-22 Target tracking method and self-positioning method applied to unmanned aerial vehicle Active CN112819856B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110086693.0A CN112819856B (en) 2021-01-22 2021-01-22 Target tracking method and self-positioning method applied to unmanned aerial vehicle

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110086693.0A CN112819856B (en) 2021-01-22 2021-01-22 Target tracking method and self-positioning method applied to unmanned aerial vehicle

Publications (2)

Publication Number Publication Date
CN112819856A true CN112819856A (en) 2021-05-18
CN112819856B CN112819856B (en) 2022-10-25

Family

ID=75858752

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110086693.0A Active CN112819856B (en) 2021-01-22 2021-01-22 Target tracking method and self-positioning method applied to unmanned aerial vehicle

Country Status (1)

Country Link
CN (1) CN112819856B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113379804A (en) * 2021-07-12 2021-09-10 闽南师范大学 Unmanned aerial vehicle target tracking method, terminal equipment and storage medium
CN113470075A (en) * 2021-07-09 2021-10-01 郑州轻工业大学 Target tracking method based on interference suppression appearance modeling

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108776975A (en) * 2018-05-29 2018-11-09 安徽大学 Visual tracking method based on semi-supervised feature and filter joint learning
CN108986140A (en) * 2018-06-26 2018-12-11 南京信息工程大学 Target scale adaptive tracking method based on correlation filtering and color detection
CN109741366A (en) * 2018-11-27 2019-05-10 昆明理工大学 A kind of correlation filtering method for tracking target merging multilayer convolution feature
CN110211157A (en) * 2019-06-04 2019-09-06 重庆邮电大学 A kind of target long time-tracking method based on correlation filtering
CN110349190A (en) * 2019-06-10 2019-10-18 广州视源电子科技股份有限公司 Target tracking method, device and equipment for adaptive learning and readable storage medium
CN110675423A (en) * 2019-08-29 2020-01-10 电子科技大学 Unmanned aerial vehicle tracking method based on twin neural network and attention model
CN111951298A (en) * 2020-06-25 2020-11-17 湖南大学 Target tracking method fusing time series information

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108776975A (en) * 2018-05-29 2018-11-09 安徽大学 Visual tracking method based on semi-supervised feature and filter joint learning
CN108986140A (en) * 2018-06-26 2018-12-11 南京信息工程大学 Target scale adaptive tracking method based on correlation filtering and color detection
CN109741366A (en) * 2018-11-27 2019-05-10 昆明理工大学 A kind of correlation filtering method for tracking target merging multilayer convolution feature
CN110211157A (en) * 2019-06-04 2019-09-06 重庆邮电大学 A kind of target long time-tracking method based on correlation filtering
CN110349190A (en) * 2019-06-10 2019-10-18 广州视源电子科技股份有限公司 Target tracking method, device and equipment for adaptive learning and readable storage medium
CN110675423A (en) * 2019-08-29 2020-01-10 电子科技大学 Unmanned aerial vehicle tracking method based on twin neural network and attention model
CN111951298A (en) * 2020-06-25 2020-11-17 湖南大学 Target tracking method fusing time series information

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
佟磊 等: ""基于强跟踪滤波器预测的主动表观模型人脸特征点跟踪"", 《计算机应用》 *
孙梦宇 等: ""基于通道可靠性的时空正则项目标跟踪算法"", 《计算机应用与软件》 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113470075A (en) * 2021-07-09 2021-10-01 郑州轻工业大学 Target tracking method based on interference suppression appearance modeling
CN113470075B (en) * 2021-07-09 2022-09-23 郑州轻工业大学 Target tracking method based on interference suppression appearance modeling
CN113379804A (en) * 2021-07-12 2021-09-10 闽南师范大学 Unmanned aerial vehicle target tracking method, terminal equipment and storage medium
CN113379804B (en) * 2021-07-12 2023-05-09 闽南师范大学 Unmanned aerial vehicle target tracking method, terminal equipment and storage medium

Also Published As

Publication number Publication date
CN112819856B (en) 2022-10-25

Similar Documents

Publication Publication Date Title
TWI750498B (en) Method and device for processing video stream
US10719940B2 (en) Target tracking method and device oriented to airborne-based monitoring scenarios
Chen et al. Augmented ship tracking under occlusion conditions from maritime surveillance videos
EP2956891B1 (en) Segmenting objects in multimedia data
CN112819856B (en) Target tracking method and self-positioning method applied to unmanned aerial vehicle
Ringwald et al. UAV-Net: A fast aerial vehicle detector for mobile platforms
CN107590432A (en) A kind of gesture identification method based on circulating three-dimensional convolutional neural networks
CN113591968A (en) Infrared weak and small target detection method based on asymmetric attention feature fusion
CN111027377B (en) Double-flow neural network time sequence action positioning method
EP3690744B1 (en) Method for integrating driving images acquired from vehicles performing cooperative driving and driving image integrating device using same
CN112215074A (en) Real-time target identification and detection tracking system and method based on unmanned aerial vehicle vision
CN112861808B (en) Dynamic gesture recognition method, device, computer equipment and readable storage medium
Li et al. UAV object tracking by background cues and aberrances response suppression mechanism
CN103198491A (en) Indoor visual positioning method
CN112785626A (en) Twin network small target tracking method based on multi-scale feature fusion
Abdullah et al. Objects detection and tracking using fast principle component purist and kalman filter.
CN113743357A (en) Video representation self-supervision contrast learning method and device
CN114029941A (en) Robot grabbing method and device, electronic equipment and computer medium
CN110889460A (en) Mechanical arm specified object grabbing method based on cooperative attention mechanism
CN113723468B (en) Object detection method of three-dimensional point cloud
CN116110095A (en) Training method of face filtering model, face recognition method and device
CN115512263A (en) Dynamic visual monitoring method and device for falling object
CN115035397A (en) Underwater moving target identification method and device
Angus et al. Real-time video anonymization in smart city intersections
Yang et al. Locator slope calculation via deep representations based on monocular vision

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant