CN113128605A - Target tracking method based on particle filtering and depth distance measurement learning - Google Patents

Target tracking method based on particle filtering and depth distance measurement learning

Info

Publication number
CN113128605A
Authority
CN
China
Prior art keywords
target
tracking
automatic driving
sample
depth
Prior art date
Legal status
Pending
Application number
CN202110442516.1A
Other languages
Chinese (zh)
Inventor
王洪雁
张莉彬
袁海
张鼎卓
周贺
薛喜扬
Current Assignee
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date
Filing date
Publication date
Application filed by Zhejiang University of Technology ZJUT filed Critical Zhejiang University of Technology ZJUT
Priority to CN202110442516.1A priority Critical patent/CN113128605A/en
Publication of CN113128605A publication Critical patent/CN113128605A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Abstract

The invention discloses a target tracking method based on particle filtering and depth distance measurement learning, relating to the field of visual tracking of automatic driving targets. The method comprises the following steps: constructing a nonlinear depth metric learning model; training the model on a given set of positive and negative automatic driving target samples and optimizing its parameters by gradient descent; constructing a particle-filter-based target observation model to obtain an optimal estimate of the automatic driving target state; and updating the target template through an online tracking strategy combining short-term updating with long-term stable updating, so as to track the automatic driving target effectively. The method performs well under partial occlusion, illumination change, and other challenging scenes; compared with the baseline algorithms, it achieves a lower average center error and a higher average overlap rate in most test scenes, and its overall tracking performance is superior.

Description

Target tracking method based on particle filtering and depth distance measurement learning
Technical Field
The invention relates to the field of automatic driving target visual tracking, in particular to an automatic driving target visual tracking method based on particle filtering and depth distance measurement learning.
Background
Automatic driving involves information perception, information processing, decision execution, and related fields. Information perception, the basic module that collects driving-environment information, relies on numerous sensors such as lidar, millimeter-wave radar, ultrasonic radar, GPS, and cameras. As a sensor that captures rich scene information at low cost, the camera has been widely adopted by industry as the scene-information sensing device for automatic driving. Camera-based automatic driving target tracking has therefore become one of the research hotspots in computer vision. In recent years, many efficient and robust visual tracking algorithms for automatic driving have been proposed, greatly advancing the practical application of visual target tracking. However, owing to the complexity of real driving scenes, the tracking process is subject to many interfering and uncertain factors, such as illumination change, scale change, and target occlusion, which significantly degrade tracking performance. How to improve the accuracy and robustness of visual tracking of automatic driving targets in complex scenes therefore remains one of the open difficulties in the visual tracking field.
To address the degradation of visual tracking performance in complex scenes, Nam H et al. proposed a deep-learning tracking method that first trains a network offline and then fine-tunes the network parameters to obtain a relatively good network model; however, it suffers from long training time and poorly targeted feature learning. Zhang K H et al. proposed a convolutional-network visual tracking algorithm (CNT) that first constructs a feature-map set with the K-means algorithm, then denoises the training result images with an adaptive threshold shrinkage algorithm, and finally builds the target model via sparse representation; however, its convolution operations reduce the resolution of the extracted feature maps. To mitigate these problems, Lu X K et al. proposed a regression network that maps samples to a labeled response map; since a dimensional mismatch can arise between target and background, the authors performed regression with a loss function incorporating shrinkage loss. Hu J et al. proposed learning a nonlinear distance metric with stacked independent subspace analysis networks to improve target-background discrimination, and Discriminative Deep Metric Learning (DDML) explicitly obtains a nonlinear distance metric by imposing a large-margin criterion on top of a trained deep network; however, because it requires a very large auxiliary data set that may be inconsistent with objects captured online, the learned features may not adapt to those objects.
Disclosure of Invention
Aiming at the problem of performance degradation of the traditional target tracking method in a complex environment, the invention provides a target tracking method based on particle filtering and depth distance measurement learning, which comprises the following steps:
constructing a nonlinear depth measurement learning model;
training the nonlinear depth measurement learning model based on a given set of positive and negative automatic driving target samples, and optimizing the parameters of the nonlinear depth measurement learning model based on a gradient descent method;
constructing a target observation model based on particle filtering to obtain an optimal estimation of the state of the automatic driving target;
and updating the target template through an online tracking strategy combining short-term and long-term stable updating, so as to effectively track the automatic driving target.
Owing to the adoption of the above technical scheme, the invention achieves the following technical effects. The proposed automatic driving target visual tracking method, which combines depth distance measurement learning with particle filtering, attains higher tracking precision and robustness when tracking targets in complex environments. The method first constructs a nonlinear depth metric learning model based on a deep network; it then optimizes the model parameters with a gradient descent algorithm; next, it constructs an observation model from the obtained optimal candidate-target predictions to estimate the automatic driving target state; finally, it updates the target template with a strategy combining short-term and long-term stable updating, achieving effective tracking of the automatic driving target. Qualitative analysis shows that the method performs well under partial occlusion, illumination change, and other challenging scenes; quantitative analysis shows that, compared with the baseline algorithms, the method attains a lower average center error and a higher average overlap rate in most test scenes, giving better overall tracking performance.
Drawings
In order to illustrate the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below cover only some embodiments of the present application; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a flow chart of an implementation of the present invention;
FIG. 2 is a graph of the tracking results of five different tracking algorithms;
FIG. 3 is a graph of tracking success rate for different tracking methods;
FIG. 4 is a graph of overall accuracy for different tracking methods.
Detailed Description
The implementation steps of the present invention are described in further detail below with reference to the accompanying drawings and specific embodiments. Aiming at the significant degradation of target tracking performance caused by illumination change, target deformation, partial occlusion, and similar factors in complex environments, the invention provides an automatic driving target visual tracking method based on particle filtering and depth distance measurement learning. The method first learns layered nonlinear transformations in a feedforward neural network by deep metric learning; it then maps the template and the particles into the same feature space, where it minimizes the intra-class differences of positive training pairs while maximizing the inter-class differences of negative training pairs; next, within a particle filter framework, it identifies as the true target the candidate most similar to the template under the learned deep metric; finally, it updates the template online with a strategy combining short-term updating and long-term stable updating, reducing the influence of adverse factors and achieving effective tracking of the automatic driving target. Experimental results show that, compared with existing mainstream tracking algorithms, the proposed method attains higher tracking precision and better robustness in complex environments. The specific steps are as follows:
step 1, constructing a nonlinear depth measurement learning model, specifically:
constructing a deep network model by combining the samples
Figure BDA0003035527750000031
Passed into a multi-layer non-linear transformation to learn its non-linear representation. The number of network layers K is 1,2(k)Representing the dimension of the samples in the k-th layer network, the first layer samples are output as follows:
Figure BDA0003035527750000032
wherein, W(1)Projection matrix representing the first layer network, b(1)Is the amount of deviation of the first layer network. Phi is an S-shaped nonlinear activation function.
The output of the first layer network can be used as the input of the second layer network and recursion is performed in turn, and then the k-th layer network output can be expressed as:
Figure BDA0003035527750000033
wherein the content of the first and second substances,
Figure BDA0003035527750000034
sample(s)
Figure BDA0003035527750000035
The output of the top-most layer of the network can be expressed as:
Figure BDA0003035527750000036
wherein the mapping f is a parametric nonlinear function consisting of parameters
Figure BDA0003035527750000037
And
Figure BDA0003035527750000038
and (4) jointly determining.
Based on the method, the sample x is represented by the constructed depth network modeliAnd xjEuclidean distance between them to measure the similarity between them:
Figure BDA0003035527750000039
for learning parameters in the deep network model
Figure BDA00030355277500000310
And
Figure BDA00030355277500000311
based on the edge Fisher analysis criterion, the following nonlinear depth metric learning function is constructed:
Figure BDA00030355277500000312
wherein the content of the first and second substances,
Figure BDA0003035527750000041
represents a sample xiAnd xjBelongs to the technical field of the direct-alignment,
Figure BDA0003035527750000042
represents a sample xiAnd xjBelonging to a negative pair. Alpha is a positive parameter, the internal compactness of the positive sample pair and the separability of the negative sample pair to the sample can be balanced, and beta (beta is more than 0) is a regularization parameter.
Figure BDA0003035527750000043
The Frobenius norm of the matrix is represented, and M and N represent the total number of positive and negative pairs in the training data, respectively.
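As a concrete illustration of the layered mapping and the learned distance, the forward pass $h^{(k)} = \varphi(W^{(k)}h^{(k-1)} + b^{(k)})$ and the metric $d_f^2$ can be sketched as follows. This is an illustrative NumPy sketch, not code from the invention; the layer sizes, random initialization, and patch dimension are our own assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical layer sizes: a 32x32 patch flattened to d = 1024, two layers (K = 2).
dims = [1024, 256, 64]                      # p^(0) = d, p^(1), p^(2)
W = [rng.normal(0.0, 0.01, (dims[k + 1], dims[k])) for k in range(len(dims) - 1)]
b = [np.zeros(dims[k + 1]) for k in range(len(dims) - 1)]

def sigmoid(z):
    """S-shaped nonlinear activation phi."""
    return 1.0 / (1.0 + np.exp(-z))

def f(x):
    """Topmost representation h^(K): recursively h^(k) = phi(W^(k) h^(k-1) + b^(k))."""
    h = x
    for Wk, bk in zip(W, b):
        h = sigmoid(Wk @ h + bk)
    return h

def dist2(xi, xj):
    """Squared Euclidean distance d_f^2(x_i, x_j) in the learned feature space."""
    diff = f(xi) - f(xj)
    return float(diff @ diff)
```

Identical samples map to distance zero and the distance is always non-negative; the metric learning objective then shapes this distance on positive and negative pairs.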
Step 2, training the nonlinear depth metric learning model based on a given set of positive and negative automatic driving target samples, and optimizing its parameters based on a gradient descent method, specifically:

Given a training sample set $x = (x_1, x_2, \dots, x_n)$, positive and negative samples are drawn according to a zero-mean Gaussian distribution around the target sample image patch. The six-parameter diagonal covariance matrix of the positive automatic driving samples is $\mathrm{diag}([1, 1, 0, 0, 0, 0])$, representing samples within a radius of two pixels around the target; the six-parameter diagonal covariance matrix of the negative automatic driving samples is $\mathrm{diag}([w, h, 0, 0, 0, 0])$, where $w$ and $h$ denote the width and height of the target sample respectively, so that negative samples are drawn far from the target object. As a result, some negative samples may contain both background and part of the target object itself. Based on the obtained training set, $M$ positive pairs and $N$ negative pairs are randomly formed to train the deep network.

Because the constructed nonlinear depth metric model is non-convex, a closed-form solution is difficult to obtain directly. To solve the optimization problem, the parameters $W^{(k)}$ and $b^{(k)}$ are solved by a gradient-descent-based method, specifically:

$$\frac{\partial L}{\partial W^{(k)}} = \sum_{i,j}\left(\delta_{ij}^{(k)}\left(h_i^{(k-1)}\right)^{\!\top} + \delta_{ji}^{(k)}\left(h_j^{(k-1)}\right)^{\!\top}\right) + 2\beta W^{(k)}$$

$$\frac{\partial L}{\partial b^{(k)}} = \sum_{i,j}\left(\delta_{ij}^{(k)} + \delta_{ji}^{(k)}\right) + 2\beta b^{(k)}$$

where $h_i^{(0)} = x_i$ denotes the original input sample. For a sample pair $(x_i, x_j)$, the error terms $\delta_{ij}^{(K)}$ and $\delta_{ji}^{(K)}$ at the topmost layer can be expressed as follows:

$$\delta_{ij}^{(K)} = c_{ij}\left(h_i^{(K)} - h_j^{(K)}\right)\odot\varphi'\!\left(z_i^{(K)}\right), \qquad \delta_{ji}^{(K)} = c_{ij}\left(h_j^{(K)} - h_i^{(K)}\right)\odot\varphi'\!\left(z_j^{(K)}\right)$$

where $c_{ij}$ is the pair weight determined by the objective ($c_{ij} = 2/M$ for a positive pair and $c_{ij} = -2\alpha/N$ for a negative pair) and $\varphi'(\cdot)$ denotes the derivative of the activation function. The error terms of the other layers, $\delta_{ij}^{(k)}$ and $\delta_{ji}^{(k)}$ for $k = K-1, \dots, 1$, are expressed as follows:

$$\delta_{ij}^{(k)} = \left(\left(W^{(k+1)}\right)^{\!\top}\delta_{ij}^{(k+1)}\right)\odot\varphi'\!\left(z_i^{(k)}\right), \qquad \delta_{ji}^{(k)} = \left(\left(W^{(k+1)}\right)^{\!\top}\delta_{ji}^{(k+1)}\right)\odot\varphi'\!\left(z_j^{(k)}\right)$$

where $\odot$ denotes element-wise multiplication and $z_i^{(k)}$ can be expressed as follows:

$$z_i^{(k)} = W^{(k)}h_i^{(k-1)} + b^{(k)}$$

The parameters $W^{(k)}$ and $b^{(k)}$, $k = 1, 2, \dots, K$, are then updated by the gradient descent algorithm until convergence:

$$W^{(k)} \leftarrow W^{(k)} - \eta\,\frac{\partial L}{\partial W^{(k)}}, \qquad b^{(k)} \leftarrow b^{(k)} - \eta\,\frac{\partial L}{\partial b^{(k)}}$$

where $\eta$ is the learning rate, which controls the convergence speed of the objective function $L$.
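For a single-layer network ($K = 1$) the objective and its analytic gradient can be sketched as follows. This is an illustrative reduction under stated assumptions (pair weights $c_{ij} = 2/M$ for positive pairs and $-2\alpha/N$ for negative pairs, sigmoid derivative $\varphi'(z) = \varphi(z)(1-\varphi(z))$, small illustrative dimensions and hyperparameter values), not the invention's multi-layer implementation:

```python
import numpy as np

rng = np.random.default_rng(1)

d_in, d_out = 8, 4                     # illustrative input/feature dimensions
W = rng.normal(0.0, 0.1, (d_out, d_in))
bvec = np.zeros(d_out)
alpha, beta, eta = 0.5, 0.01, 1e-2     # illustrative hyperparameter values

def phi(z):
    return 1.0 / (1.0 + np.exp(-z))

def feat(x, W, bvec):
    return phi(W @ x + bvec)

def objective(pos, neg, W, bvec):
    """L = (1/M) sum_pos d^2 - (alpha/N) sum_neg d^2 + beta (||W||_F^2 + ||b||^2)."""
    dp = sum(np.sum((feat(xi, W, bvec) - feat(xj, W, bvec)) ** 2) for xi, xj in pos)
    dn = sum(np.sum((feat(xi, W, bvec) - feat(xj, W, bvec)) ** 2) for xi, xj in neg)
    return dp / len(pos) - alpha * dn / len(neg) + beta * (np.sum(W ** 2) + np.sum(bvec ** 2))

def gradients(pos, neg, W, bvec):
    """Analytic dL/dW and dL/db via the element-wise chain rule (single layer)."""
    gW, gb = 2.0 * beta * W, 2.0 * beta * bvec
    for pairs, c in ((pos, 1.0 / len(pos)), (neg, -alpha / len(neg))):
        for xi, xj in pairs:
            fi, fj = feat(xi, W, bvec), feat(xj, W, bvec)
            di = 2.0 * c * (fi - fj) * fi * (1.0 - fi)   # delta for x_i at top layer
            dj = 2.0 * c * (fj - fi) * fj * (1.0 - fj)   # delta for x_j at top layer
            gW += np.outer(di, xi) + np.outer(dj, xj)
            gb += di + dj
    return gW, gb
```

One step of the update rule above, W minus eta times dL/dW with a small learning rate, then lowers the objective on the sampled pairs.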
Step 3, constructing a target observation model based on the particle filter to obtain the optimal estimation of the automatic driving target state, specifically:

Suppose the automatic driving target state vector at time $r$ is $h_r = \{h_{rx}, h_{ry}, sc_r, \theta_r, \eta_r, \phi_r\}$, whose components are the six-degree-of-freedom affine transformation parameters (horizontal and vertical translation, scale, rotation angle, aspect ratio, and skew). The motion model of the target between adjacent frames can then be expressed as follows:

$$p(h_r \mid h_{r-1}) = \mathcal{N}\!\left(h_r;\; h_{r-1},\; \Sigma\right)$$

where $\mathcal{N}(h_r; h_{r-1}, \Sigma)$ denotes a Gaussian distribution over $h_r$ with mean $h_{r-1}$ and covariance $\Sigma$, and $\Sigma$ is a diagonal covariance matrix.

Since the candidate target updates its estimate only from the nearest neighboring frame, the motion model $p(h_r \mid h_{r-1})$ can be regarded as stationary, and the optimal candidate target can be selected directly according to the observation model $p(y_r^i \mid h_r^i)$, namely:

$$\hat{h}_r = \arg\max_{i}\; p(y_r^i \mid h_r^i), \qquad p(y_r^i \mid h_r^i) = \frac{1}{\Gamma}\exp\!\left(-\frac{d_f^2(y_r^i,\, t)}{\gamma}\right)$$

where $y_r^i$ is the observation of the $i$-th particle, $t$ is the target template, $\Gamma$ is a normalization factor, and $\gamma$ is a constant controlling the shape of the Gaussian kernel, set to 0.01 in the simulations.
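A minimal sketch of one particle-filter step under the motion model $p(h_r \mid h_{r-1}) = \mathcal{N}(h_r; h_{r-1}, \Sigma)$ and the exponential observation likelihood follows. Only $\gamma = 0.01$ is taken from the text; the state layout, the diagonal of $\Sigma$, and the assumption that each particle's image patch has already been mapped through the learned network $f$ are our own:

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical 6-D affine state: [x, y, scale, rotation, aspect ratio, skew].
sigma_diag = np.array([4.0, 4.0, 0.01, 0.005, 0.001, 0.001])  # diagonal of Sigma
gamma = 0.01   # Gaussian-kernel shape constant (0.01 in the simulations)

def propagate(particles):
    """Motion model: each particle does a Gaussian random walk, h_r ~ N(h_{r-1}, Sigma)."""
    return particles + rng.normal(0.0, sigma_diag, particles.shape)

def observe(features, template):
    """Observation likelihood p(y|h) proportional to exp(-d_f^2 / gamma), normalized."""
    d2 = np.sum((features - template) ** 2, axis=1)
    w = np.exp(-d2 / gamma)
    return w / w.sum()    # division plays the role of the normalization factor

def estimate(particles, weights):
    """Optimal candidate: the particle with the highest observation likelihood."""
    return particles[np.argmax(weights)]
```

With 600 particles per frame, as in the simulations below, the particle whose learned features are closest to the template receives the dominant weight.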
Step 4, updating the target template through an online tracking strategy combining short-term and long-term stable updating, so as to effectively track the automatic driving target, specifically:

In the actual tracking process, keeping the target template fixed cannot effectively track a changing target in a complex scene, so template updating has long been a central problem of online target tracking. If tracking relies on a template fixed from the first frame, the tracker cannot capture the target well under illumination change, background clutter, or partial occlusion; conversely, if the template is updated too quickly, each update introduces an error that gradually accumulates over time and causes the tracker to drift away from the target. To address these problems, the invention introduces an online tracking strategy combining short-term updating and long-term stable updating to update the target template.

Template initialization: first, the position of the target in the first frame is determined; then the tracking results of the previous $n$ frames are obtained with the tracking method described above and normalized; finally, they are combined into a template set $T = [t_1, t_2, \dots, t_n] \in \mathbb{R}^{b \times n}$.

Dynamic template updating: the similarities between the templates and the tracking result are expressed as $\psi = [\psi_1, \psi_2, \dots, \psi_n]$, with update threshold $\rho$. The similarity $\psi_i$ between the tracking result and the $i$-th template is expressed as:

$$\psi_i = \exp\!\left(-\left\|\hat{y}_r - t_i\right\|_2^2\right)$$

where $\hat{y}_r$ is the tracking result of the $r$-th frame; a larger similarity value $\psi_i$ indicates that the tracking result is more similar to the template.

Let the maximum similarity be $\Lambda$, expressed as:

$$\Lambda = \max_{i}\, \psi_i$$

The maximum similarity $\Lambda$ is compared with the threshold $\rho$: if $\Lambda > \rho$, the tracking result is most similar to the corresponding target template, and that template is updated; otherwise, no update is made. In the simulation experiments, the threshold is set to $\rho = 0.7$.
The effects of the present invention can be further illustrated by the following simulations:
simulation conditions are as follows: the hardware environment is as follows: intel Core (TM) i5-4258 CPU, dominant frequency 2.4GHz, memory 8GB, and experimental software test environment: python3.7, MATLAB 2017a, and open source deep learning framework Caffe. The simulation conditions were set as follows: the number of positive and negative samples extracted from the first frame by the tracking algorithm is respectively 100 and 400, and the number of positive and negative samples of each subsequent frame is respectively 30 and 120, so that 300 positive pairs and 900 negative pairs are generated. The tracking performance and the calculation complexity are balanced, if too many particles increase the calculation amount of the algorithm, and if too few particles cannot obtain the optimal target state, based on which the number of particles per frame is set to be 600, and the weight of the particles is initialized to be 1/600. The video tracking data set OTB-100 selected 6 video sequences of MotorRolling, Boy, Skating1, Bird2, Tiger2, Basketball, as test sets, which contained multiple tracking challenges. The CNN network used in the invention adopts a deep learning framework Caffe, the network weight value is updated by a gradient descent method, and a local area normalization parameter alpha is set to be 0.0001 and tau is set to be 0.75, so that the function of side inhibition is achieved, and the generalization capability of the network for extracting complex environment information is enhanced; the learning rate was set to 0.001 and the training period was 300 to minimize the occurrence of the "overfitting" phenomenon. The method adopts the average tracking overlapping rate and the average center position error to quantitatively analyze the tracking performance of the method.
Simulation content:
simulation 1: and (3) qualitative analysis: fig. 2 is a comparison of the results of 5 tracking algorithms for 6 test sequences. The MotorRolling sequence comprises the challenging factors of rapid motion, background clutter, illumination change and the like, the target in the 112 th frame is obviously changed from air falling, MIL and BACF have tracking drift or the phenomenon that a tracking frame is inconsistent with a real target, and the algorithm can always better track the target and can be attributed to the fact that the background clutter and the rapid motion influence are considered by the algorithm so as to accurately estimate the moving target. The target in the Basketball has obvious size change, the algorithm and the BCAF can locate the target and effectively track the target, and the method has a good tracking effect under the condition of size change. The target in Boy moves rapidly, meanwhile, factor interference such as scale change and rotation occurs, and the tracking drift phenomenon occurs in MIL after 418 frames. Skating1 belongs to a more complex scene, the target background contrast is lower, and there is a stronger illumination change. The target resolution ratio is low in the scene, and the template is timely updated by the algorithm through long and short time combination with an online updating strategy, so that stable tracking is realized. The algorithms proposed in Bird2 video sequences and Tiger2 video sequences can lock well to the target.
Simulation 2: quantitative analysis: as can be seen from Tables 1 and 2, on the 6 test sequences selected from OTB-100 the proposed algorithm achieves a better tracking effect than the comparison algorithms, which can be attributed to its use of depth distance metric learning and its introduction of error terms when constructing the likelihood model, reducing the sensitivity to backgrounds similar to the target. Compared with the comparison algorithms, the proposed algorithm performs well under occlusion, noise, and similar conditions, mainly because:
(1) the constructed model considers the correlation among candidate target templates, improving the tracking robustness of the algorithm in complex scenes;
(2) the depth distance metric measures the similarity of the particles, improving the tracking effectiveness;
(3) the combined long- and short-term updating strategy improves the robustness and tracking accuracy of the algorithm under noise and occlusion.
TABLE 1 average overlap ratio for different tracking methods
[table values not reproduced in the extracted text]
TABLE 2 average center position error for different tracking methods
[table values not reproduced in the extracted text]
The invention evaluates the overall performance of the trackers with success-rate curves and overall precision plots. The overall precision plot gives the percentage of frames whose center-position error falls within a distance threshold. The success-rate and overall precision curves of the compared algorithms are shown in FIG. 3 and FIG. 4, respectively. As can be seen from FIGS. 3 and 4, the tracking success rate of the proposed algorithm is higher than that of the comparison algorithms on most sequences; on the Tiger2 sequence its precision curve is slightly inferior to BACF, but its success-rate curve is still superior, and on the other sequences its overall precision also exceeds that of the comparison algorithms. The proposed algorithm therefore outperforms the comparison methods overall in complex scenes and is more robust.
Simulation 3: average running speed of the different tracking methods on each test sequence: to verify the timeliness of the proposed algorithm, its speed is measured in frames per second (FPS); each algorithm is run 50 times and the average FPS is used as the evaluation index. The FPS obtained by each algorithm on the different test sequences is listed in Table 3. As can be seen from Table 3, the proposed algorithm is faster than Struck, BACF, and DFT, and slower than MIL. However, as discussed above, its tracking performance on each test sequence is on the whole better than that of the comparison algorithms.
TABLE 3 average running speed (FPS) for different tracking methods under each test sequence
[table values not reproduced in the extracted text]
In summary, the invention provides an automatic driving target visual tracking method based on particle filtering and depth distance measurement learning. The method constructs a nonlinear depth distance metric learning model based on a deep network; it then constructs an observation model from the obtained optimal candidate-target predictions; finally, it updates the target template with a strategy combining short-term and long-term stable updating. Six test sequences containing occlusion, illumination change, and other factors were selected from the OTB-100 data set, and the effectiveness of the method was verified by comparison with four mainstream trackers: BACF, MIL, Struck, and DFT. Qualitative analysis shows that the method performs well under partial occlusion, illumination change, and other scenes; quantitative analysis shows that, compared with the comparison algorithms, it attains a lower average center error and a higher average overlap rate in most test scenes, with better overall tracking performance. The proposed algorithm can therefore provide a solid theoretical and engineering basis for visual tracking of automatic driving targets in complex driving scenes.
The embodiments of the present invention are illustrative and do not restrict the invention in any manner. The technical features or combinations of technical features described in the embodiments should not be considered in isolation; they may be combined with one another to achieve a better technical effect. The scope of the preferred embodiments of the present invention may also include additional implementations, as will be understood by those skilled in the art to which the embodiments pertain.

Claims (5)

1. The target tracking method based on particle filtering and depth distance measurement learning is characterized by comprising the following steps of:
constructing a nonlinear depth measurement learning model;
training the nonlinear depth measurement learning model based on a given set of positive and negative automatic driving target samples, and optimizing the parameters of the nonlinear depth measurement learning model based on a gradient descent method;
constructing a target observation model based on particle filtering to obtain an optimal estimation of the state of the automatic driving target;
updating the target template through an online tracking strategy combining short-term and long-term stable updating, so as to effectively track the automatic driving target.
2. The target tracking method based on particle filtering and depth distance metric learning of claim 1, wherein a nonlinear depth metric learning model is further constructed by using positive and negative samples of an automatic driving target obtained by nonlinear change of a depth network, specifically:
constructing a deep network model by combining the samples
Figure FDA0003035527740000011
Passing into a multi-layer non-linear transformation to learn its non-linear representation; the number of network layers K is 1,2(k)Representing the dimension of the samples in the k-th layer network, the first layer samples are output as follows:
Figure FDA0003035527740000012
wherein, W(1)Projection matrix representing the first layer network, b(1)Is the deviation amount of the first layer network; phi is an S-shaped nonlinear activation function;
the output result of the first layer network is used as the input of the second layer network and recurses in turn, and then the output of the k-th layer network is expressed as:
Figure FDA0003035527740000013
wherein the content of the first and second substances,
Figure FDA0003035527740000014
sample(s)
Figure FDA0003035527740000015
The output at the top of the network is represented as:
Figure FDA0003035527740000016
wherein the mapping f is a parametric nonlinear function consisting of parameters
Figure FDA0003035527740000017
And
Figure FDA0003035527740000018
jointly determining;
On this basis, the constructed deep network model measures the similarity between samples x_i and x_j by the Euclidean distance between their top-layer representations:

d²_f(x_i, x_j) = ||f(x_i) − f(x_j)||²₂

To learn the parameters W^(k) and b^(k) in the deep network model, the following nonlinear depth metric learning objective is constructed based on the marginal Fisher analysis criterion:

L = (1/M) Σ_{l_ij=1} d²_f(x_i, x_j) − (α/N) Σ_{l_ij=−1} d²_f(x_i, x_j) + β Σ_{k=1}^{K} ( ||W^(k)||²_F + ||b^(k)||²₂ )

wherein l_ij = 1 indicates that samples x_i and x_j belong to a positive pair, and l_ij = −1 indicates that they belong to a negative pair; α is a positive parameter balancing the internal compactness of the automatic driving target positive samples against the separability of the automatic driving target negative samples; β (β > 0) is a regularization parameter; ||·||_F represents the Frobenius norm of a matrix; and M and N represent the total numbers of positive and negative pairs in the training data, respectively.
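The objective above can be sketched numerically as follows. This is an illustrative sketch, not the patented implementation; the network shape, the pair data, and the default α and β values are assumptions of the example:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def f(x, weights, biases):
    # Top-layer representation of the deep network
    h = x
    for W, b in zip(weights, biases):
        h = sigmoid(W @ h + b)
    return h

def metric_loss(pos_pairs, neg_pairs, weights, biases, alpha=1.0, beta=0.01):
    """Marginal-Fisher-style objective: mean squared top-layer distance of
    positive pairs, minus alpha times that of negative pairs, plus beta
    times a Frobenius-norm regularizer on all layer parameters."""
    d2 = lambda xi, xj: np.sum((f(xi, weights, biases) - f(xj, weights, biases)) ** 2)
    compact = np.mean([d2(xi, xj) for xi, xj in pos_pairs])   # positive-pair compactness
    separate = np.mean([d2(xi, xj) for xi, xj in neg_pairs])  # negative-pair separability
    reg = sum(np.sum(W ** 2) + np.sum(b ** 2) for W, b in zip(weights, biases))
    return float(compact - alpha * separate + beta * reg)

rng = np.random.default_rng(0)
weights = [rng.standard_normal((3, 4)) * 0.1, rng.standard_normal((2, 3)) * 0.1]
biases = [np.zeros(3), np.zeros(2)]
pos = [(rng.standard_normal(4), rng.standard_normal(4)) for _ in range(5)]
neg = [(rng.standard_normal(4), rng.standard_normal(4)) for _ in range(5)]
val = metric_loss(pos, neg, weights, biases)
```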
3. The target tracking method based on particle filtering and depth distance metric learning of claim 1, wherein based on a given set of positive and negative samples of an automatic driving target, the nonlinear depth metric learning model is trained and its nonlinear deep learning model parameters are optimized based on a gradient descent method, specifically:
Given a training sample set X = (x_1, x_2, ..., x_n), positive and negative samples are drawn from a zero-mean Gaussian distribution around the target sample image block. The six-parameter diagonal covariance matrix of the automatic driving target positive samples is diag([1, 1, 0, 0, 0, 0]), representing samples within a radius of two pixels around the target; the six-parameter diagonal covariance matrix of the automatic driving target negative samples is diag([w, h, 0, 0, 0, 0]), where w and h respectively represent the width and height of the target sample, and sampling is performed from negative samples far away from the target object. Based on the obtained training set, M positive pairs and N negative pairs are randomly formed to train the parameters of the depth network;
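The sampling scheme above can be sketched as follows. This is an illustrative sketch only; the concrete target state, the sample counts, and the width/height values are assumptions of the example:

```python
import numpy as np

def sample_states(center, cov_diag, n, rng):
    """Draw n six-parameter affine states from a zero-mean Gaussian
    perturbation around `center`, with the given diagonal covariance."""
    return center + rng.standard_normal((n, 6)) * np.sqrt(cov_diag)

rng = np.random.default_rng(1)
# Hypothetical target state: [x, y, scale, rotation, aspect, skew]
target = np.array([100.0, 80.0, 1.0, 0.0, 1.0, 0.0])
w, h = 32.0, 48.0
# Positives: diag([1,1,0,0,0,0]) -> perturb position only, ~2 px radius
pos = sample_states(target, np.array([1, 1, 0, 0, 0, 0]), 50, rng)
# Negatives: diag([w,h,0,0,0,0]) -> wide spread, drawn far from the target
neg = sample_states(target, np.array([w, h, 0, 0, 0, 0]), 50, rng)
```

A practical implementation would also reject negative draws that land too close to the target; that filtering step is omitted here for brevity.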
The parameters W^(k) and b^(k) are solved using a gradient descent algorithm, specifically:

∂L/∂W^(k) = Σ_{i,j} ( δ_ij^(k) (h_i^(k−1))^T + δ_ji^(k) (h_j^(k−1))^T ) + 2βW^(k)

∂L/∂b^(k) = Σ_{i,j} ( δ_ij^(k) + δ_ji^(k) ) + 2βb^(k)

wherein h_i^(0) = x_i represents the original input sample, and (·)^T denotes the transpose of the vector; the gradient terms δ_ij^(k) and δ_ji^(k) between samples x_i and x_j at the top-most layer K are represented as follows:

δ_ij^(K) = c_ij (h_i^(K) − h_j^(K)) ⊙ φ′(z_i^(K))

δ_ji^(K) = c_ij (h_j^(K) − h_i^(K)) ⊙ φ′(z_j^(K))

wherein φ′(·) represents the derivative of the activation function φ, and c_ij = 2/M for positive pairs and c_ij = −2α/N for negative pairs, consistent with the objective function L; the variables for the other layers, k = 1, 2, ..., K − 1, are represented as follows:

δ_ij^(k) = ( (W^(k+1))^T δ_ij^(k+1) ) ⊙ φ′(z_i^(k))

δ_ji^(k) = ( (W^(k+1))^T δ_ji^(k+1) ) ⊙ φ′(z_j^(k))

wherein ⊙ indicates element-by-element multiplication, and z_i^(k) is:

z_i^(k) = W^(k) h_i^(k−1) + b^(k)
The parameters W^(k) and b^(k), k = 1, 2, ..., K, are updated based on the gradient descent algorithm until convergence:

W^(k) ← W^(k) − η ∂L/∂W^(k)

b^(k) ← b^(k) − η ∂L/∂b^(k)

wherein η is the learning rate, used to control the convergence speed of the objective function L.
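The update rule above can be demonstrated on a toy problem. This sketch is not the claimed method: it replaces the analytic layer-wise gradients with a finite-difference gradient (an assumption made so the example stays short), and the loss, data, and learning rate are illustrative:

```python
import numpy as np

def numeric_grad(loss_fn, param, eps=1e-5):
    """Central finite-difference gradient of loss_fn w.r.t. param,
    standing in for the analytic delta-rule gradients."""
    g = np.zeros_like(param)
    for idx in np.ndindex(param.shape):
        old = param[idx]
        param[idx] = old + eps
        lp = loss_fn()
        param[idx] = old - eps
        lm = loss_fn()
        param[idx] = old  # restore
        g[idx] = (lp - lm) / (2 * eps)
    return g

# One gradient-descent step: W <- W - eta * dL/dW
eta = 0.1
W = np.array([[1.0, -0.5]])
loss = lambda: float(np.sum((W @ np.array([1.0, 2.0]) - 3.0) ** 2))
before = loss()
W -= eta * numeric_grad(loss, W)
after = loss()
```

For this quadratic loss a single step with η = 0.1 already reduces the objective, mirroring the iterate-until-convergence loop in the claim.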
4. The target tracking method based on particle filtering and depth distance metric learning of claim 1, wherein a target observation model is constructed based on particle filtering to obtain an optimal estimate of the automatic driving target state, specifically: suppose the automatic driving target state vector at time r is h_r = {h_rx, h_ry, sc_r, θ_r, ρ_r, φ_r}, where h_rx, h_ry, sc_r, θ_r, ρ_r, φ_r are the six-degree-of-freedom affine transformation parameters; the motion model of the driving target between adjacent frames is expressed as follows:

p(h_r | h_{r−1}) = N(h_r; h_{r−1}, Σ)

wherein N(h_r; h_{r−1}, Σ) indicates that h_r obeys a Gaussian distribution with mean h_{r−1} and variance Σ, Σ being a diagonal covariance matrix; with the motion model p(h_r | h_{r−1}) held fixed, the optimal candidate target is selected directly based on the observation model p(y_r | h_r), namely:

p(y_r | h_r) = (1/Γ) exp(−γ d²_f)

wherein y_r is the candidate observation corresponding to state h_r, d_f is the learned depth metric distance between the candidate and the target template, Γ is a normalization factor, and γ is a constant that controls the shape of the Gaussian kernel.
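One predict-and-weight step of the particle filter above can be sketched as follows. This is a toy illustration, not the patented tracker: the distance function, the particle count, the diagonal covariance values, and γ are all assumptions of the example:

```python
import numpy as np

def particle_filter_step(particles, sigma_diag, distance_to_template, gamma, rng):
    """Propagate particles with the Gaussian motion model h_r ~ N(h_{r-1}, Sigma),
    weight each by the Gaussian-kernel likelihood p(y|h) ∝ exp(-gamma * d^2),
    and return the maximum-likelihood candidate state."""
    particles = particles + rng.standard_normal(particles.shape) * np.sqrt(sigma_diag)
    d2 = np.array([distance_to_template(p) for p in particles])
    w = np.exp(-gamma * d2)
    w /= w.sum()  # normalization, the 1/Gamma factor
    return particles[np.argmax(w)], particles, w

rng = np.random.default_rng(2)
# 100 particles, all initialized at the previous six-parameter state
parts = np.tile(np.array([50.0, 40.0, 1.0, 0.0, 1.0, 0.0]), (100, 1))
sigma = np.array([4.0, 4.0, 0.01, 0.0, 0.0, 0.0])  # diagonal covariance
true_pos = np.array([52.0, 41.0])
dist = lambda p: float(np.sum((p[:2] - true_pos) ** 2))  # toy stand-in for d_f^2
best, parts, w = particle_filter_step(parts, sigma, dist, gamma=0.05, rng=rng)
```

In the patented method the toy distance would be replaced by the learned depth metric between the candidate image patch and the target template.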
5. The target tracking method based on particle filtering and depth distance metric learning according to claim 1, characterized in that the target template is updated by an online tracking strategy combining short-term and long-term stable updating, so as to realize effective tracking of the automatic driving target, specifically:
Firstly, the position of the driving target in the first frame is determined; then the tracking results of the previous n frames are obtained based on the proposed tracking method and normalized; finally, they are combined into a template set T = [t_1, t_2, ..., t_n] ∈ R^{b×n}.
The similarity between the templates and the tracking result is expressed as ψ = [ψ_1, ψ_2, ..., ψ_n], with threshold ρ; the similarity ψ_i between the tracking result and the i-th template is expressed as:

ψ_i = exp(−||ŷ_r − t_i||²₂)

wherein ŷ_r is the tracking result of the r-th frame; the larger the similarity value ψ_i, the more similar the tracking result is to the template.
Let the maximum similarity be Λ, expressed as:

Λ = max_i ψ_i

The maximum similarity Λ is compared with the threshold ρ; if Λ > ρ, the tracking result is maximally similar to one of the target templates, and the corresponding template is updated; otherwise, no update is carried out.
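The thresholded template-update rule above can be sketched as follows. This is an illustrative sketch; the Gaussian similarity form, the toy template set, and the replace-most-similar policy are assumptions of the example:

```python
import numpy as np

def update_templates(templates, result, rho):
    """Compute the similarity of the tracking result to each template
    column; if the maximum similarity exceeds rho, replace the most
    similar template with the result. Returns whether an update occurred."""
    # psi_i = exp(-||result - t_i||^2), one value per template column
    psi = np.exp(-np.sum((templates - result[:, None]) ** 2, axis=0))
    lam = psi.max()  # maximum similarity Lambda
    if lam > rho:
        templates[:, psi.argmax()] = result
        return True
    return False

T = np.eye(4)[:, :3]  # toy template set: b = 4 features, n = 3 templates
res = np.array([0.9, 0.1, 0.0, 0.0])  # toy normalized tracking result
updated = update_templates(T, res, rho=0.3)
```

Here the result is close to the first template, so Λ exceeds ρ and that template is replaced; with a dissimilar result the set would be left unchanged.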
CN202110442516.1A 2021-04-23 2021-04-23 Target tracking method based on particle filtering and depth distance measurement learning Pending CN113128605A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110442516.1A CN113128605A (en) 2021-04-23 2021-04-23 Target tracking method based on particle filtering and depth distance measurement learning


Publications (1)

Publication Number Publication Date
CN113128605A true CN113128605A (en) 2021-07-16

Family

ID=76779390

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110442516.1A Pending CN113128605A (en) 2021-04-23 2021-04-23 Target tracking method based on particle filtering and depth distance measurement learning

Country Status (1)

Country Link
CN (1) CN113128605A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110082106A (en) * 2019-04-17 2019-08-02 武汉科技大学 A kind of Method for Bearing Fault Diagnosis of the depth measure study based on Yu norm
CN111160119A (en) * 2019-12-11 2020-05-15 常州工业职业技术学院 Multi-task depth discrimination metric learning model construction method for cosmetic face verification
CN112085765A (en) * 2020-09-15 2020-12-15 浙江理工大学 Video target tracking method combining particle filtering and metric learning



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination