CN110009663B - Target tracking method, device, equipment and computer readable storage medium - Google Patents

Target tracking method, device, equipment and computer readable storage medium Download PDF

Info

Publication number
CN110009663B
CN110009663B (Application CN201910285431.XA)
Authority
CN
China
Prior art keywords
target
tracking
scale
current frame
filter
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910285431.XA
Other languages
Chinese (zh)
Other versions
CN110009663A (en)
Inventor
边丽娜
马小虎
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University filed Critical Suzhou University
Priority to CN201910285431.XA priority Critical patent/CN110009663B/en
Publication of CN110009663A publication Critical patent/CN110009663A/en
Application granted granted Critical
Publication of CN110009663B publication Critical patent/CN110009663B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/251Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving models
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a target tracking method comprising the following steps: extracting gradient histogram features from the current frame by using a scale filtering adaptive algorithm, and extracting color features from the current frame by using a mean shift algorithm; training the gradient histogram features and the color features with a multi-feature fusion scale filtering adaptive algorithm, obtained in advance by fusing the two feature types, to obtain a target position filter and a target scale filter; and predicting the first target region of the tracking target among the candidate regions of the next frame by combining the target position filter and the target scale filter. The technical scheme provided by the embodiments of the invention effectively solves the problems of drift and interference caused by occlusion and the inability to adapt to scale changes, and greatly improves target tracking accuracy. The invention also discloses a target tracking apparatus, device and storage medium with corresponding technical effects.

Description

Target tracking method, device, equipment and computer readable storage medium
Technical Field
The present invention relates to the field of positioning technologies, and in particular, to a target tracking method, apparatus, device, and computer readable storage medium.
Background
At present, target tracking is widely applied to video monitoring, human-computer interaction, unmanned driving, medical diagnosis and the like. Target tracking is the process of continuously processing newly received images in order to maintain a continuous estimate of the target's current state. The target state mainly comprises motion components, such as target position and velocity, and also some other characteristic components, such as the signal intensity of radiation and general appearance characteristics. Target tracking is a typical uncertainty problem, and as monitoring and anti-monitoring technologies develop and target mobility improves, the uncertainty of the target tracking problem becomes more serious. These uncertainties mainly manifest as uncertainty of the target motion state, uncertainty of the information source, data blurring caused by background clutter, and the like. The essence of target tracking is therefore to estimate and predict the target motion state by filtering, so as to eliminate the uncertainty.
The existing target tracking approaches mainly comprise the scale filter adaptive algorithm DSST and the mean shift algorithm MS. However, both algorithms have their own disadvantages: with the DSST algorithm, when the target is partially occluded or rotates, the target position can be completely lost and drift occurs; the mean shift algorithm cannot adapt to scale changes and is susceptible to interference, so its tracking effect on the target is poor.
In summary, how to effectively solve the problems of drift and interference caused by occlusion and the inability to adapt to scale changes, which result in poor target tracking, is a problem that those skilled in the art currently need to solve.
Disclosure of Invention
The invention aims to provide a target tracking method that effectively solves the problems of drift and interference caused by occlusion and the inability to adapt to scale changes, and greatly improves the accuracy of tracking a target; it is another object of the present invention to provide an object tracking apparatus, device and computer readable storage medium.
In order to solve the technical problems, the invention provides the following technical scheme:
a target tracking method, comprising:
receiving a target tracking request, and determining a tracking target corresponding to the target tracking request and a current frame of the tracking target;
extracting gradient histogram features of the tracking target from the current frame by using a scale filtering adaptive algorithm, and extracting color features of the tracking target from the current frame by using a mean shift algorithm;
training the gradient histogram features and the color features by utilizing a multi-feature fusion scale filtering adaptive algorithm which is obtained by fusing the gradient histogram features and the color features in advance to obtain a target position filter and a target scale filter;
predicting a first target area of the tracking target in each candidate area of the next frame by combining the target position filter and the target scale filter.
In a specific embodiment of the present invention, after determining the tracking target corresponding to the target tracking request and the current frame of the tracking target, the method further includes:
predicting a second target region of the tracking target in each of the candidate regions using the mean shift algorithm;
and linearly combining the first target area and the second target area to obtain a corrected target area.
In one embodiment of the present invention, after obtaining the corrected target area, the method further includes:
and verifying the corrected target area by using a center error algorithm.
In one embodiment of the present invention, predicting a second target region of the tracking target in each of the candidate regions using the mean shift algorithm includes:
calculating probability densities of the tracking target corresponding to the color features in the current frame by using the mean shift algorithm to obtain a current frame model, and calculating probability densities of the color features in the candidate areas to obtain candidate models;
calculating the Bhattacharyya coefficients between the current frame model and each candidate model according to the probability density corresponding to each color feature in the current frame and the probability density corresponding to each color feature in each candidate region;
and predicting the candidate region corresponding to the maximum value among the Bhattacharyya coefficients as the second target region.
In one embodiment of the present invention, predicting, in combination with the target position filter and the target scale filter, a first target region of the tracking target in each candidate region of a next frame includes:
combining the target position filter and the target scale filter, and respectively calculating the correlation scores of the current frame of the tracking target and each candidate region;
respectively predicting first position information and first scale information of the tracking target in a next frame according to candidate areas corresponding to the maximum value in each correlation score;
the first target region is predicted by combining the first position information and the first scale information.
An object tracking device comprising:
the target and current frame determining module is used for receiving a target tracking request and determining a tracking target corresponding to the target tracking request and a current frame of the tracking target;
the feature extraction module is used for extracting gradient histogram features of the tracking target from the current frame by utilizing a scale filtering adaptive algorithm and extracting color features of the tracking target from the current frame by utilizing a mean shift algorithm;
the filter obtaining module is used for training the gradient histogram features and the color features by utilizing a multi-feature fusion scale filtering adaptive algorithm which is obtained by fusing the gradient histogram features and the color features in advance to obtain a target position filter and a target scale filter;
and the first region prediction module is used for predicting a first target region of the tracking target in each candidate region of the next frame by combining the target position filter and the target scale filter.
In one embodiment of the present invention, the method further comprises:
a second region prediction module, configured to predict a second target region of the tracking target in each of the candidate regions by using the mean shift algorithm after determining a tracking target corresponding to the target tracking request and a current frame of the tracking target;
and the corrected target area obtaining module is used for linearly combining the first target area and the second target area to obtain the corrected target area.
In one embodiment of the present invention, the first region prediction module includes:
a correlation score calculation sub-module, configured to combine the target position filter and the target scale filter, and calculate a correlation score of the current frame of the tracking target and each candidate region;
the position and scale prediction sub-module is used for predicting first position information and first scale information of the tracking target in a next frame according to candidate areas corresponding to the maximum value in each correlation score;
and a first region prediction sub-module, configured to predict the first target region by combining the first position information and the first scale information.
A target tracking device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the object tracking method as described above when executing the computer program.
A computer readable storage medium having stored thereon a computer program which when executed by a processor implements the steps of the target tracking method as described above.
By applying the method provided by the embodiment of the invention, the gradient histogram features and the color features of the current frame of the tracking target are extracted, and the extracted features are trained with the multi-feature fusion scale filtering adaptive algorithm, obtained in advance by fusing the gradient histogram features and the color features, to obtain the target position filter and the target scale filter. The first target region of the tracking target in each candidate region of the next frame is then predicted by combining the target position filter and the target scale filter. Because the gradient histogram features and the color features complement each other, the fused multi-feature scale filtering adaptive algorithm yields more accurate position and scale filters, and hence a more accurate predicted first target region; the problems of drift and interference caused by occlusion and the inability to adapt to scale are effectively solved, and target tracking accuracy is greatly improved.
Correspondingly, the embodiment of the invention also provides a target tracking device, equipment and a computer readable storage medium corresponding to the target tracking method, which have the technical effects and are not repeated herein.
Drawings
In order to more clearly illustrate the embodiments of the invention or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, it being obvious that the drawings in the following description are only some embodiments of the invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of an implementation of a target tracking method according to an embodiment of the present invention;
FIG. 2 is a flowchart of another implementation of the target tracking method according to an embodiment of the present invention;
FIG. 3 is a block diagram of a target tracking apparatus according to an embodiment of the present invention;
fig. 4 is a block diagram of a target tracking apparatus according to an embodiment of the present invention.
Detailed Description
In order to better understand the aspects of the present invention, the present invention will be described in further detail with reference to the accompanying drawings and detailed description. It will be apparent that the described embodiments are only some, but not all, embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Embodiment one:
referring to fig. 1, fig. 1 is a flowchart showing an implementation of a target tracking method according to an embodiment of the present invention, where the method may include the following steps:
S101: And receiving the target tracking request, and determining a tracking target corresponding to the target tracking request and a current frame of the tracking target.
When a preset target needs to be tracked, a target tracking request can be sent to the target tracking terminal; the target tracking terminal receives the target tracking request and determines the tracking target corresponding to the request and the current frame of the tracking target. In video monitoring, for example, face information can be preset as the tracking target: when the request generating end detects face information in the current environment, it automatically generates a target tracking request and sends it to the target tracking terminal, where the current face information can be framed with a preset frame, the framed face information is determined as the tracking target, and the image information inside the frame is taken as the current frame.
S102: and extracting gradient histogram features of the tracking target from the current frame by using a scale filtering adaptive algorithm, and extracting color features of the tracking target from the current frame by using a mean shift algorithm.
After the tracking target and the current frame of the tracking target are determined, the gradient histogram feature HOG of the tracking target may be extracted from the current frame using a scale filtering adaptive algorithm, and the color feature of the tracking target may be extracted from the current frame using a mean shift algorithm. With the above example, after determining the current face information as the tracking target and taking the framed image information as the current frame, the gradient histogram feature of the current face information may be extracted from the current frame by using a scale filtering adaptive algorithm, and the color feature of the current face information may be extracted from the image information by using a mean shift algorithm.
The scale filter adaptive algorithm, based on discriminative correlation filtering, realizes target tracking mainly through two filters: a translation filter and a scale filter. The translation filter is used to determine the target position in the current frame, and the scale filter is used to estimate the target scale; to this end, the gradient histogram features of the tracked target are extracted from the current frame. The two filters are mutually independent, which makes the tracking more efficient.
The mean shift algorithm is a non-parametric probability density estimation method. The algorithm is derived from the gradient of the probability density function of the pixel feature points, and converges to a local maximum of the probability density function through iterative operations; that is, the mean shift algorithm is used to extract the color features of the tracking target from the current frame, realizing the localization and tracking of the target. Because the mean shift algorithm determines the target from the probability distribution of the tracking target's color, it has strong robustness to deformation, rotation and similar changes of the tracking target.
S103: and training the gradient histogram features and the color features by utilizing a multi-feature fusion scale filtering adaptive algorithm which is obtained by fusing the gradient histogram features and the color features in advance to obtain a target position filter and a target scale filter.
Color features convey the perceived color of an object and often contain important information about it. Gradient histogram features extract gradient information from cells consisting of a series of pixels and bin the discretized gradient directions to form a histogram. Gradient histogram features emphasize image gradients, while color features focus on color information. The two feature types complement each other, so the color features and the gradient histogram features can be fused in advance to obtain the multi-feature fusion scale filtering adaptive algorithm MSMF, and the extracted gradient histogram features and color features are then trained with this multi-feature fusion scale filtering adaptive algorithm to obtain the target position filter and the target scale filter.
By adopting the above example, after the gradient histogram feature of the current face information can be extracted from the current frame by using the scale filtering adaptive algorithm and the color feature of the current face information can be extracted from the image information by using the mean shift algorithm, the gradient histogram feature and the color feature can be trained by using the multi-feature fusion scale filtering adaptive algorithm obtained by fusing the gradient histogram feature and the color feature in advance, so as to obtain the target position filter and the target scale filter required by predicting the next frame of the face information.
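For illustration, the fusion can be as simple as channel-wise concatenation of the two feature maps. The sketch below is not from the patent (the function name and the equal spatial resolution of the two maps are assumptions); the channel counts match the example configuration given later in step S203 (31 gradient orientation bins, 11-dimensional color features):

```python
import numpy as np

def fuse_features(hog_feat, color_feat):
    """Channel-wise fusion f = [f^1, ..., f^C] of the two feature types.

    hog_feat   : (H, W, 31) gradient histogram features (31 orientation bins)
    color_feat : (H, W, 11) color features (11-dimensional, as in the example)
    Returns an (H, W, 42) fused feature map used to train both filters.
    """
    return np.concatenate([hog_feat, color_feat], axis=2)
```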
S104: the first target region of the tracking target in each candidate region of the next frame is predicted in combination with the target position filter and the target scale filter.
After the target position filter and the target scale filter are obtained through training, they can be combined to predict the first target area of the tracking target in each candidate area of the next frame: the target position filter predicts the position information of the tracking target in each candidate area of the next frame, and the target scale filter predicts the corresponding scale information. Because the target position filter and the target scale filter are obtained by training the extracted gradient histogram features and color features with the multi-feature fusion scale filter adaptive algorithm, they are not troubled by occlusion; the problems of drift and interference caused by occlusion and the inability to adapt to scale are effectively solved, and the accuracy of target tracking is greatly improved.
With the above example in mind, after obtaining the target position filter and the target scale filter required for predicting the next frame of face information, the target position filter and the target scale filter may be combined to predict the first target region of the current face information in each candidate region of the next frame.
By applying the method provided by the embodiment of the invention, the gradient histogram features and the color features of the current frame of the tracking target are extracted, and the extracted features are trained with the multi-feature fusion scale filtering adaptive algorithm, obtained in advance by fusing the gradient histogram features and the color features, to obtain the target position filter and the target scale filter. The first target region of the tracking target in each candidate region of the next frame is then predicted by combining the target position filter and the target scale filter. Because the gradient histogram features and the color features complement each other, the fused multi-feature scale filtering adaptive algorithm yields more accurate position and scale filters, and hence a more accurate predicted first target region; the problems of drift and interference caused by occlusion and the inability to adapt to scale are effectively solved, and target tracking accuracy is greatly improved.
It should be noted that, based on the first embodiment, the embodiment of the present invention further provides a corresponding improvement scheme. The following embodiments relate to the same steps as those in the first embodiment or the steps corresponding to the first embodiment, and the corresponding beneficial effects can also be referred to each other, so that the following modified embodiments will not be repeated.
Embodiment two:
referring to fig. 2, fig. 2 is a flowchart of another implementation of a target tracking method according to an embodiment of the present invention, where the method may include the following steps:
S201: And receiving the target tracking request, and determining a tracking target corresponding to the target tracking request and a current frame of the tracking target.
S202: and extracting gradient histogram features of the tracking target from the current frame by using a scale filtering adaptive algorithm, and extracting color features of the tracking target from the current frame by using a mean shift algorithm.
Extracting the gradient histogram features of the tracking target from the current frame by using the scale filtering adaptive algorithm: the scale filtering adaptive algorithm represents the target appearance with a d-dimensional feature map; let $f$ be a training sample target block extracted from the feature map, and let $f^l$ denote the $l$-th feature dimension of $f$, where $l \in \{1, \ldots, d\}$. Extracting the color features of the tracking target from the current frame by using the mean shift algorithm: in the d-dimensional space $\mathbb{R}^d$, given $n$ sample points denoted $\{x_i\}_{i=1,\ldots,n}$, the basic form of the mean shift vector at a reference point $x$ is:

$$M_h(x) = \frac{1}{k} \sum_{x_i \in S_h} (x_i - x)$$

where $k$ is the number of the $n$ sample points that fall into the region $S_h$, and $S_h$ is a high-dimensional spherical region of radius $h$, i.e. the set of points $y$ satisfying:

$$S_h \equiv \{\, y : (y - x)^{T}(y - x) \le h^{2} \,\}$$
the above equation is a definition of the initial start of the mean shift algorithm, which indicates that the region S in the initial mean shift algorithm h The contribution of each sample point to the reference point x is the same. Considering the actual situation that the sample points around the datum point x with small distance from the datum point have great influence on the mean shift vector, changing the theory that the contribution of any sample point in the above formula to the mean shift vector is the same, introducing a kernel function, and enabling the contribution of the sample point in the mean shift vector to be related to the distance of the sample point from the datum point. Sample points near the fiducial point contribute significantly to the mean shift vector, whereas sample points far from the fiducial point contribute little to the mean shift vector. In the parameter-free density estimation of the mean shift vector, not only the distance affects the size of the mean shift vector, but also the importance of different sample points to the mean shift vector is different, and when the actual situation is considered, a weight function can be introduced.
Given a unit kernel function $G(x)$ whose profile function $g(x)$ satisfies $G(x) = g(\lVert x \rVert^2)$, the new mean shift expression is:

$$M_E(x) = \frac{\sum_{i=1}^{n} G_E(x_i - x)\, w(x_i)\,(x_i - x)}{\sum_{i=1}^{n} G_E(x_i - x)\, w(x_i)}$$

Taking the case where the sample points are pixel points, $\{x_i\}_{i=1,\ldots,n}$ are the $n$ pixels of the region where the tracking target is located, $w(x_i) \ge 0$ is the weight corresponding to point $x_i$, $G_E(x_i - x) = |E|^{-1/2}\, G\bigl(E^{-1/2}(x_i - x)\bigr)$, $G(\cdot)$ is a unit kernel function, and $E$ is a positive definite symmetric matrix. In practical applications one usually takes $E = e^2 I$, where $I$ is the identity matrix, which gives the following expanded form of the mean shift vector:

$$M_e(x) = \frac{\sum_{i=1}^{n} G\!\left(\frac{x_i - x}{e}\right) w(x_i)\,(x_i - x)}{\sum_{i=1}^{n} G\!\left(\frac{x_i - x}{e}\right) w(x_i)}$$
where $e$ is the kernel window size, i.e. the kernel width. The choice of $e$ determines how many candidate regions there are and also affects the probability density estimation of each color feature: the larger $e$ is, the more candidate regions participate in the calculation, the fewer peaks the probability density estimate has, and the smoother the target probability model distribution becomes.
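As an illustration only, here is a minimal numpy sketch of the weighted mean shift iteration above; the Gaussian profile chosen for $G(\cdot)$ and all function names are assumptions for the example, not part of the patent:

```python
import numpy as np

def mean_shift_vector(x, points, weights, e):
    """One weighted mean shift step at reference point x.

    points : (n, d) array of sample points {x_i}
    weights: (n,) array of per-point weights w(x_i) >= 0
    e      : kernel width (E = e^2 * I); a Gaussian profile is assumed for G
    """
    diff = points - x                                   # x_i - x
    g = np.exp(-0.5 * np.sum((diff / e) ** 2, axis=1))  # G((x_i - x) / e)
    coeff = g * weights
    # Expanded mean shift vector M_e(x)
    return (coeff[:, None] * diff).sum(axis=0) / coeff.sum()

def mean_shift(x0, points, weights, e, tol=1e-3, max_iter=50):
    """Iterate until the shift is below tol (converges to a local mode)."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        shift = mean_shift_vector(x, points, weights, e)
        x += shift
        if np.linalg.norm(shift) < tol:
            break
    return x
```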
S203: and training the gradient histogram features and the color features by utilizing a multi-feature fusion scale filtering adaptive algorithm which is obtained by fusing the gradient histogram features and the color features in advance to obtain a target position filter and a target scale filter.
Determining the target position of the tracking target in the next frame amounts to finding an optimal correlation filter $h$. The target appearance of the tracking target is represented with a d-dimensional feature map; let $f$ be a training sample target block extracted from the feature map, and let $f^l$ denote the $l$-th feature dimension of the target block $f$, where $l \in \{1, \ldots, d\}$. Each feature dimension has a corresponding filter $h^l$, obtained by minimizing the sum of squared errors:

$$\varepsilon = \left\lVert g - \sum_{l=1}^{d} h^{l} \star f^{l} \right\rVert^{2} + \lambda \sum_{l=1}^{d} \lVert h^{l} \rVert^{2}$$

where $g$ is the desired correlation response output of the filter, $\lambda \ge 0$ is a regularization term coefficient, and $\star$ denotes circular correlation. The solution of the above formula is:

$$H^{l} = \frac{\bar{G}\, F^{l}}{\sum_{k=1}^{d} \overline{F^{k}}\, F^{k} + \lambda}$$
where the capital letters $G$ and $F$ are the frequency-domain values of $g$ and $f$ obtained by the Fourier transform, the bar denotes complex conjugation, and $k \in \{1, \ldots, d\}$. The regularization term coefficient $\lambda$ alleviates the problem of zero-frequency components in the spectrum of $F$ and avoids division by zero. By minimizing the output error over all training blocks, an optimal filter can be obtained. To obtain a robust approximation, the correlation filter $H^{l}$ in the above equation may be updated using the multi-feature fusion scale filtering adaptive algorithm obtained by fusing the gradient histogram features and the color features in advance. For example, 31 gradient orientation bins may be used, fusing 11-dimensional color features with gradient histogram features computed on single cells of 4×4 pixels. Let $f = [f^{1}, f^{2}, \ldots, f^{C}]$ be the fused representation of the multi-dimensional features and $h = [h^{1}, h^{2}, \ldots, h^{C}]$ the corresponding filter, so that the solving formula of the filter improves to:

$$H^{c} = \frac{\bar{G}\, F^{c}}{\sum_{j=1}^{C} \overline{F^{j}}\, F^{j} + \lambda}$$

where $C$ is the number of different feature types; in the present example, the fusion of the gradient histogram feature and the color feature gives $C = 2$, and when $C = 1$ the formula degenerates into the filter solving formula that considers only the gradient histogram feature.
The gradient histogram features and the color features are trained with the multi-feature fusion scale filter adaptive algorithm to update the $t$-th frame correlation filter $H_t^{l} = A_t^{l} / B_t$ in the above formula; its numerator $A_t^{l}$ and denominator $B_t$ are updated as follows:

$$A_t^{l} = (1 - \eta)\, A_{t-1}^{l} + \eta\, \bar{G}_t\, F_t^{l}$$

$$B_t = (1 - \eta)\, B_{t-1} + \eta \sum_{k=1}^{d} \overline{F_t^{k}}\, F_t^{k}$$

where $\eta$ is a learning rate parameter; for example, $\eta$ may take the value 0.025.

Thereby the target position filter and the target scale filter are obtained.
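For illustration, here is a minimal numpy sketch of this closed-form training and running update (not taken from the patent; the array shapes and function names are assumptions, and the regularization term $\lambda$ is added later at detection time, as in the correlation-score formula of step S204):

```python
import numpy as np

def filter_numerator_denominator(f, g):
    """Per-channel numerator A^l = conj(G) F^l and denominator
    B = sum_k conj(F^k) F^k for one training block.

    f : (H, W, C) fused feature block (e.g. HOG + color channels)
    g : (H, W) desired Gaussian-shaped correlation output
    """
    G = np.fft.fft2(g)
    F = np.fft.fft2(f, axes=(0, 1))
    A = np.conj(G)[..., None] * F                   # conj(G) F^l per channel
    B = np.sum(np.conj(F) * F, axis=2).real         # sum_k conj(F^k) F^k
    return A, B

def update_filter(A_prev, B_prev, f_t, g, eta=0.025):
    """Frame-t update A_t = (1-eta) A_{t-1} + eta conj(G_t) F_t^l,
    and B_t analogously, with learning rate eta (0.025 in the example)."""
    A_t, B_t = filter_numerator_denominator(f_t, g)
    return ((1 - eta) * A_prev + eta * A_t,
            (1 - eta) * B_prev + eta * B_t)
```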
S204: and combining the target position filter and the target scale filter to respectively calculate the correlation scores of the current frame of the tracking target and each candidate region.
After the gradient histogram features and the color features are trained with the multi-feature fusion scale filtering adaptive algorithm to obtain the target position filter and the target scale filter, the two filters can be combined to calculate the correlation scores between the current frame of the tracking target and each candidate region. Denoting the correlation score by $y$ and the frequency-domain features of a candidate region by $Z$, the correlation score of each candidate region can be expressed as:

$$y = \mathcal{F}^{-1}\left\{ \frac{\sum_{l=1}^{d} \overline{A^{l}}\, Z^{l}}{B + \lambda} \right\}$$
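A matching sketch of this correlation-score computation, reusing the numerator $A$ and denominator $B$ maintained by the update sketch in step S203 (again an illustration under assumed shapes, not the patent's code):

```python
import numpy as np

def correlation_score(A, B, z, lam=0.01):
    """Correlation-score map y for one candidate region.

    A  : (H, W, C) filter numerator (see update_filter above)
    B  : (H, W) filter denominator
    z  : (H, W, C) feature block of the candidate region
    The peak of y predicts the target location within the region.
    """
    Z = np.fft.fft2(z, axes=(0, 1))
    num = np.sum(np.conj(A) * Z, axis=2)            # sum_l conj(A^l) Z^l
    return np.fft.ifft2(num / (B + lam)).real       # inverse Fourier transform
```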
S205: And respectively predicting first position information and first scale information of the tracking target in the next frame according to the candidate region corresponding to the maximum value in each correlation score.
After the correlation scores between the current frame of the tracking target and each candidate region are calculated, the first position information and the first scale information of the tracking target in the next frame can be predicted from the candidate region corresponding to the maximum value among the correlation scores. The position of the tracking target in the next frame is estimated by extracting the feature map $Z$ at the predicted target position. To confirm the scale of the tracking target in the next frame, a joint translation-scale tracking method based on a three-dimensional scale-space correlation filter can be used: the filter size is fixed at $M \times N \times S$, where $M$ and $N$ are the height and width of the filter and $S$ is the number of scales; the filter can be regarded as $S$ layers, arranged from contracting to gradually expanding, each layer marked as one scale candidate region. To update the filter, the algorithm first computes a feature pyramid over a rectangular area around the target and, according to the estimated scale, constructs from the pyramid a rectangular cuboid centered on the target size $M \times N$; the training sample $f$ is then set to this cuboid of the feature pyramid, of size $M \times N \times S$. The algorithm uses a three-dimensional Gaussian function, centered on the estimated position and scale of the tracking target, as the corresponding expected correlation output $g$, and updates the filter according to the numerator/denominator update formulas of step S203. To locate the target in a new frame, the algorithm extracts an $M \times N \times S$ cuboid from the feature pyramid; for the scale dimension, $S$ may take the value 33, i.e. each of the $S$ scale candidate regions has its correlation score with the current frame calculated, and the scale of the tracking target in the next frame is obtained by finding the maximum of these correlation scores. The first position information and the first scale information of the tracking target in the next frame are thereby predicted respectively.
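As an illustration of the scale search, the following sketch builds the $S$-layer stack of scale candidate regions around the current target; the scale step $a = 1.02$, the use of OpenCV for resampling, and the single-channel frame are assumptions, the patent only fixes the number of layers (e.g. $S = 33$):

```python
import numpy as np
import cv2  # OpenCV, used here only for resizing patches

def scale_sample(frame, center, base_size, S=33, a=1.02, feat_size=(32, 32)):
    """Build the stack of S scale candidate regions around the target.

    For each of the S scale factors a^n, the patch of scaled size is cut
    out around `center` and resampled to a fixed template size, so that
    each layer of the stack is one scale candidate region (border
    handling is omitted for brevity).
    """
    cx, cy = center
    w0, h0 = base_size
    layers = []
    for n in np.arange(S) - (S - 1) // 2:
        s = a ** n
        w, h = int(w0 * s), int(h0 * s)
        x0, y0 = max(0, cx - w // 2), max(0, cy - h // 2)
        patch = frame[y0:y0 + h, x0:x0 + w]
        layers.append(cv2.resize(patch, feat_size))
    return np.stack(layers, axis=-1)  # feature extraction would follow
```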
S206: the first target region is predicted in combination with the first location and the first scale information.
After obtaining the first position information and the first scale information of the tracking target in the next frame, the first target area may be predicted in combination with the first position and the first scale information.
S207: and calculating the probability density corresponding to each color feature of the tracking target in the current frame by using a mean shift algorithm to obtain a current frame model, and calculating the probability density corresponding to each color feature in each candidate region to obtain each candidate model.
Taking the region of the tracking target as the current frame as an example, the current frame of the tracking target can be expressed through the probability values of all color features on the current frame. Let $\{x_i\}_{i=1,\ldots,n}$ be the pixel positions in the current frame and $x_0$ the center coordinates of the target. The probability density $q_u$ ($u = 1, 2, \ldots, m$) corresponding to each color feature in the current frame is:

$$q_u = C \sum_{i=1}^{n} k\!\left( \left\lVert \frac{x_i - x_0}{e} \right\rVert^{2} \right) \delta\left[\, b(x_i) - u \,\right]$$

where $k(x)$ is a kernel function for weighting pixels, $e$ is the kernel bandwidth, $n$ is the number of pixel points in the current frame, $i$ indexes the pixels, $u$ indexes the $m$ color features of the current frame (the superscript ms used below abbreviates mean shift), the function $b(x)$ maps a pixel position to the feature space, $\delta(x)$ is the one-dimensional Kronecker delta function, and $C$ is the normalization constant:

$$C = \frac{1}{\sum_{i=1}^{n} k\!\left( \left\lVert \frac{x_i - x_0}{e} \right\rVert^{2} \right)}$$
Similarly, the probability density $\{p_u(y)\}_{u=1,2,\ldots,m}$ corresponding to each color feature in each candidate region with center coordinates $y$ can be calculated.
S208: and calculating the Pasteur coefficients of the current frame model and each candidate model according to the probability density corresponding to each color feature in the current frame and the probability density corresponding to each color feature in each candidate region.
In order to determine the position of the target tracked by the mean shift algorithm in the new frame, the similarity between the current frame and each candidate region is measured with the Bhattacharyya coefficient. The Bhattacharyya coefficient between the current frame model and each candidate model can be calculated from the probability densities corresponding to the color features in the current frame and in each candidate region:

$$\rho(y) \equiv \rho\left[\, p(y),\, q \,\right] = \sum_{u=1}^{m} \sqrt{p_u(y)\, q_u}$$
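A minimal sketch of the color model and the Bhattacharyya similarity follows (an illustration only: the single-channel 8-bit image, the $m$ intensity bins for $b(x)$, and the Epanechnikov-style profile for $k(\cdot)$ are assumptions, not fixed by the patent):

```python
import numpy as np

def color_model(patch, x0, e, m=16):
    """Kernel-weighted color histogram q_u of an 8-bit image patch.

    patch : (H, W) single-channel image region (indices = pixel positions)
    x0    : (x, y) target center inside the patch
    e     : kernel bandwidth
    m     : number of color bins; b(x) here quantizes intensity into m bins
    Uses the Epanechnikov-style profile k(r) = max(0, 1 - r).
    """
    H, W = patch.shape
    ys, xs = np.mgrid[0:H, 0:W]
    r = ((xs - x0[0]) / e) ** 2 + ((ys - x0[1]) / e) ** 2
    k = np.maximum(0.0, 1.0 - r)                        # kernel weights
    bins = (patch.astype(np.int64) * m) // 256          # b(x_i)
    q = np.bincount(bins.ravel(), weights=k.ravel(), minlength=m)
    return q / k.sum()                                  # normalization C

def bhattacharyya(p, q):
    """Bhattacharyya coefficient rho = sum_u sqrt(p_u q_u)."""
    return float(np.sum(np.sqrt(p * q)))
```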
S209: And predicting the candidate region corresponding to the maximum value among the Bhattacharyya coefficients as the second target region.
After the Bhattacharyya coefficients between the current frame model and each candidate model are calculated, the candidate region corresponding to the maximum value among the Bhattacharyya coefficients may be predicted as the second target region.
S210: and linearly combining the first target area and the second target area to obtain a corrected target area.
In the target tracking process, errors accumulate as the filter is continuously updated, so a gap between the estimated and the actual target position is likely to arise. To reduce this gap as much as possible, the mean shift algorithm and the multi-feature fusion scale adaptive algorithm can be linearly combined to jointly determine the target region of the tracking target in the next frame. Assume the first target region obtained by the multi-feature fusion scale filtering adaptive algorithm is $p_t^{msmf}$ and the second target region obtained by the mean shift algorithm is $p_t^{ms}$; the corrected target region obtained after the linear combination is:

$$p_t = \lambda\, p_t^{msmf} + (1 - \lambda)\, p_t^{ms}$$

where $\lambda$ is a weight coefficient, $0 < \lambda < 1$, and the superscripts distinguish the target regions obtained by the respective algorithms.
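As a sketch, the linear combination amounts to a weighted average of the two predicted regions (the box parameterization and the value $\lambda = 0.7$ are assumptions for illustration, not taken from the patent):

```python
import numpy as np

def fuse_regions(p_msmf, p_ms, lam=0.7):
    """Linear combination p_t = lam * p^msmf + (1 - lam) * p^ms.

    p_msmf, p_ms : (4,) boxes [cx, cy, w, h] from the multi-feature
                   fusion DSST and from mean shift, respectively
    lam          : weight coefficient with 0 < lam < 1
    """
    return lam * np.asarray(p_msmf, float) + (1 - lam) * np.asarray(p_ms, float)
```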
S211: and verifying the corrected target area by using a central error algorithm.
After obtaining the corrected target region by linearly combining the first target region and the second target region, the corrected target region may be verified using the center location error (CLE). The formula for the center error is as follows:

$$CLE = \left| Center_{GT} - Center_{algorithm} \right|$$

where $Center_{GT}$ is the true center of the tracking target and $Center_{algorithm}$ is the center of the corrected target region obtained as the tracking result of the tracking algorithm.
The corrected target region can also be verified by calculating the average center error ACLE. During the continuous tracking of the tracking target, the prediction of the center in each frame of the multi-frame data set carries a certain error; taking the mean of the center errors gives the average center error, which can be expressed as:

$$ACLE = \frac{1}{b} \sum_{t=1}^{b} CLE_t$$

where $b$ is the total number of frames in the data set.
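A short sketch of this CLE/ACLE verification (the Euclidean distance for $|Center_{GT} - Center_{algorithm}|$ is an assumption; the patent does not specify the norm):

```python
import numpy as np

def center_errors(centers_gt, centers_pred):
    """Per-frame center location error and its average over b frames.

    centers_gt, centers_pred : (b, 2) arrays of ground-truth and
                               predicted target centers
    Returns (CLE per frame, ACLE).
    """
    diff = np.asarray(centers_gt, float) - np.asarray(centers_pred, float)
    cle = np.linalg.norm(diff, axis=1)
    return cle, float(cle.mean())
```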
Corresponding to the above method embodiments, the present invention further provides an object tracking device, where the object tracking device described below and the object tracking method described above may be referred to correspondingly.
Referring to fig. 3, fig. 3 is a block diagram of an object tracking device according to an embodiment of the present invention, where the device may include:
the target and current frame determining module 31 is configured to receive a target tracking request, and determine a tracking target and a current frame of the tracking target corresponding to the target tracking request;
a feature extraction module 32, configured to extract gradient histogram features of the tracking target from the current frame by using a scale filtering adaptive algorithm, and extract color features of the tracking target from the current frame by using a mean shift algorithm;
a filter obtaining module 33, configured to train the gradient histogram feature and the color feature by using a multi-feature fusion scale filtering adaptive algorithm obtained by fusing the gradient histogram feature and the color feature in advance, so as to obtain a target position filter and a target scale filter;
a first region prediction module 34, configured to predict a first target region of the tracking target in each candidate region of the next frame in combination with the target position filter and the target scale filter.
By applying the device provided by the embodiment of the invention, the gradient histogram features and the color features of the current frame of the tracking target are extracted, and the extracted features are trained with the multi-feature fusion scale filtering adaptive algorithm, obtained in advance by fusing the gradient histogram features and the color features, to obtain the target position filter and the target scale filter. The first target region of the tracking target in each candidate region of the next frame is then predicted by combining the target position filter and the target scale filter. Because the gradient histogram features and the color features complement each other, the fused multi-feature scale filtering adaptive algorithm yields more accurate position and scale filters, and hence a more accurate predicted first target region; the problems of drift and interference caused by occlusion and the inability to adapt to scale are effectively solved, and target tracking accuracy is greatly improved.
In one embodiment of the present invention, the apparatus may further include:
the second region prediction module is used for predicting a second target region of the tracking target in each candidate region by using a mean shift algorithm after determining the tracking target corresponding to the target tracking request and the current frame of the tracking target;
and the corrected target area obtaining module is used for linearly combining the first target area and the second target area to obtain the corrected target area.
In one embodiment of the present invention, the apparatus may further include:
and the verification module is used for verifying the corrected target area by using a central error algorithm after the corrected target area is obtained.
In one embodiment of the present invention, the second region prediction module includes:
the model obtaining submodule is used for calculating probability densities of the tracking target corresponding to the color features in the current frame by means of a mean shift algorithm to obtain a current frame model, and calculating probability densities of the color features in the candidate areas to obtain candidate models;
the Bhattacharyya coefficient calculation sub-module is used for calculating the Bhattacharyya coefficients between the current frame model and each candidate model according to the probability density corresponding to each color feature in the current frame and the probability density corresponding to each color feature in each candidate region respectively;
and the second region prediction submodule is used for predicting the candidate region corresponding to the maximum value among the Bhattacharyya coefficients as the second target region.
In one embodiment of the present invention, the first region prediction module 34 includes:
the correlation score calculation sub-module is used for combining the target position filter and the target scale filter to calculate the correlation scores of the current frame of the tracking target and each candidate region respectively;
the position and scale prediction sub-module is used for respectively predicting first position information and first scale information of the tracking target in the next frame according to candidate areas corresponding to the maximum value in each correlation score;
and the first region prediction submodule is used for predicting the first target region by combining the first position information and the first scale information.
Corresponding to the above method embodiment, referring to fig. 4, fig. 4 is a schematic diagram of an object tracking device provided by the present invention, where the device may include:
a memory 41 for storing a computer program;
the processor 42 is configured to execute the computer program stored in the memory 41, and implement the following steps:
receiving a target tracking request, and determining a tracking target corresponding to the target tracking request and a current frame of the tracking target; extracting gradient histogram features of the tracking target from the current frame by using a scale filtering self-adaptive algorithm, and extracting color features of the tracking target from the current frame by using a mean shift algorithm; training the gradient histogram features and the color features by utilizing a multi-feature fusion scale filtering adaptive algorithm which is obtained by fusing the gradient histogram features and the color features in advance to obtain a target position filter and a target scale filter; the first target region of the tracking target in each candidate region of the next frame is predicted in combination with the target position filter and the target scale filter.
For the description of the apparatus provided by the present invention, please refer to the above method embodiment, and the description of the present invention is omitted herein.
Corresponding to the above method embodiments, the present invention also provides a computer readable storage medium having a computer program stored thereon, which when executed by a processor, performs the steps of:
receiving a target tracking request, and determining a tracking target corresponding to the target tracking request and a current frame of the tracking target; extracting gradient histogram features of the tracking target from the current frame by using a scale filtering self-adaptive algorithm, and extracting color features of the tracking target from the current frame by using a mean shift algorithm; training the gradient histogram features and the color features by utilizing a multi-feature fusion scale filtering adaptive algorithm which is obtained by fusing the gradient histogram features and the color features in advance to obtain a target position filter and a target scale filter; the first target region of the tracking target in each candidate region of the next frame is predicted in combination with the target position filter and the target scale filter.
The computer readable storage medium may include: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (Random Access Memory, RAM), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
For the description of the computer-readable storage medium provided by the present invention, refer to the above method embodiments, and the disclosure is not repeated here.
In this specification, each embodiment is described in a progressive manner, and each embodiment is mainly described in a different point from other embodiments, so that the same or similar parts between the embodiments are referred to each other. The apparatus, device and computer readable storage medium of the embodiments are described more simply because they correspond to the methods of the embodiments, and the description thereof will be given with reference to the method section.
The principles and embodiments of the present invention have been described herein with reference to specific examples, but the description of the examples above is only for aiding in understanding the technical solution of the present invention and its core ideas. It should be noted that it will be apparent to those skilled in the art that various modifications and adaptations of the invention can be made without departing from the principles of the invention and these modifications and adaptations are intended to be within the scope of the invention as defined in the following claims.

Claims (7)

1. A target tracking method, comprising:
receiving a target tracking request, and determining a tracking target corresponding to the target tracking request and a current frame of the tracking target;
extracting gradient histogram features of the tracking target from the current frame by using a scale filtering adaptive algorithm, and extracting color features of the tracking target from the current frame by using a mean shift algorithm;
training the gradient histogram features and the color features by utilizing a multi-feature fusion scale filtering adaptive algorithm which is obtained by fusing the gradient histogram features and the color features in advance to obtain a target position filter and a target scale filter;
predicting a first target area of the tracking target in each candidate area of the next frame by combining the target position filter and the target scale filter;
calculating probability densities of the tracking target corresponding to the color features in the current frame by using the mean shift algorithm to obtain a current frame model, and calculating probability densities of the color features in the candidate areas to obtain candidate models;
calculating the Bhattacharyya coefficients between the current frame model and each candidate model according to the probability density corresponding to each color feature in the current frame and the probability density corresponding to each color feature in each candidate region;
predicting the candidate region corresponding to the maximum value among the Bhattacharyya coefficients as a second target region;
and linearly combining the first target area and the second target area to obtain a corrected target area.
2. The target tracking method according to claim 1, further comprising, after obtaining the corrected target area:
and verifying the corrected target area by using a center error algorithm.
3. The target tracking method according to any one of claims 1 to 2, characterized in that predicting a first target area of the tracking target in each candidate area of a next frame in combination with the target position filter and the target scale filter includes:
combining the target position filter and the target scale filter, and respectively calculating the correlation scores of the current frame of the tracking target and each candidate region;
respectively predicting first position information and first scale information of the tracking target in a next frame according to candidate areas corresponding to the maximum value in each correlation score;
the first target region is predicted by combining the first position information and the first scale information.
4. An object tracking device, comprising:
the target and current frame determining module is used for receiving a target tracking request and determining a tracking target corresponding to the target tracking request and a current frame of the tracking target;
the feature extraction module is used for extracting gradient histogram features of the tracking target from the current frame by utilizing a scale filtering adaptive algorithm and extracting color features of the tracking target from the current frame by utilizing a mean shift algorithm;
the filter obtaining module is used for training the gradient histogram features and the color features by utilizing a multi-feature fusion scale filtering adaptive algorithm which is obtained by fusing the gradient histogram features and the color features in advance to obtain a target position filter and a target scale filter;
a first region prediction module, configured to predict a first target region of the tracking target in each candidate region of a next frame in combination with the target position filter and the target scale filter;
the second region prediction module comprises a model obtaining sub-module, a Pasteur coefficient calculation sub-module and a second region prediction sub-module, wherein the model obtaining sub-module is used for calculating probability densities corresponding to all color features of the tracking target in a current frame by using the mean shift algorithm to obtain a current frame model, and calculating probability densities corresponding to all color features in each candidate region to obtain candidate models;
the Pasteur coefficient calculation sub-module is used for calculating the Pasteur coefficients of the current frame model and each candidate model according to the probability density corresponding to each color feature in the current frame and the probability density corresponding to each color feature in each candidate region respectively;
the second region prediction submodule is used for predicting a candidate region corresponding to the maximum value in each Pasteur coefficient as a second target region;
and the corrected target area obtaining module is used for linearly combining the first target area and the second target area to obtain the corrected target area.
5. The object tracking device of claim 4, wherein the first region prediction module comprises:
a correlation score calculation sub-module, configured to combine the target position filter and the target scale filter, and calculate a correlation score of the current frame of the tracking target and each candidate region;
the position and scale prediction sub-module is used for predicting first position information and first scale information of the tracking target in a next frame according to candidate areas corresponding to the maximum value in each correlation score;
and a first region prediction sub-module, configured to predict the first target region by combining the first position information and the first scale information.
6. An object tracking device, comprising:
a memory for storing a computer program;
a processor for implementing the steps of the object tracking method according to any one of claims 1 to 3 when executing said computer program.
7. A computer readable storage medium, characterized in that the computer readable storage medium has stored thereon a computer program which, when executed by a processor, implements the steps of the object tracking method according to any of claims 1 to 3.
CN201910285431.XA 2019-04-10 2019-04-10 Target tracking method, device, equipment and computer readable storage medium Active CN110009663B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910285431.XA CN110009663B (en) 2019-04-10 2019-04-10 Target tracking method, device, equipment and computer readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910285431.XA CN110009663B (en) 2019-04-10 2019-04-10 Target tracking method, device, equipment and computer readable storage medium

Publications (2)

Publication Number Publication Date
CN110009663A CN110009663A (en) 2019-07-12
CN110009663B true CN110009663B (en) 2023-06-09

Family

ID=67170788

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910285431.XA Active CN110009663B (en) 2019-04-10 2019-04-10 Target tracking method, device, equipment and computer readable storage medium

Country Status (1)

Country Link
CN (1) CN110009663B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110473227B (en) * 2019-08-21 2022-03-04 图谱未来(南京)人工智能研究院有限公司 Target tracking method, device, equipment and storage medium
CN110930434B (en) * 2019-11-21 2023-05-12 腾讯科技(深圳)有限公司 Target object following method, device, storage medium and computer equipment
CN112070805B (en) * 2020-09-10 2021-05-14 深圳市豪恩汽车电子装备股份有限公司 Motor vehicle target real-time image tracking device and method
CN113706580B (en) * 2021-08-11 2022-12-09 西安交通大学 Target tracking method, system, equipment and medium based on relevant filtering tracker

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108596823A (en) * 2018-04-28 2018-09-28 苏州大学 A kind of insertion of the digital blind watermark based on sparse transformation and extracting method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103366370B (en) * 2013-07-03 2016-04-20 深圳市智美达科技股份有限公司 Method for tracking target in video monitoring and device

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108596823A (en) * 2018-04-28 2018-09-28 苏州大学 A kind of insertion of the digital blind watermark based on sparse transformation and extracting method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
联合判别与生成模型的目标跟踪算法研究 (Research on target tracking algorithms combining discriminative and generative models); 周华争 (Zhou Huazheng); China Master's Theses Full-text Database, Information Science and Technology; 2018-04-15; pp. I138-2924 *

Also Published As

Publication number Publication date
CN110009663A (en) 2019-07-12

Similar Documents

Publication Publication Date Title
CN110009663B (en) Target tracking method, device, equipment and computer readable storage medium
CN107424177B (en) Positioning correction long-range tracking method based on continuous correlation filter
CN106780442B (en) Stereo matching method and system
US11763485B1 (en) Deep learning based robot target recognition and motion detection method, storage medium and apparatus
US8755630B2 (en) Object pose recognition apparatus and object pose recognition method using the same
CN110276785B (en) Anti-shielding infrared target tracking method
CN109961506A (en) A kind of fusion improves the local scene three-dimensional reconstruction method of Census figure
CN112634333B (en) Tracking device method and device based on ECO algorithm and Kalman filtering
US10657625B2 (en) Image processing device, an image processing method, and computer-readable recording medium
CN115239760B (en) Target tracking method, system, equipment and storage medium
CN111179333B (en) Defocus blur kernel estimation method based on binocular stereo vision
CN109785372B (en) Basic matrix robust estimation method based on soft decision optimization
CN110827327B (en) Fusion-based long-term target tracking method
CN110147768B (en) Target tracking method and device
CN114565953A (en) Image processing method, image processing device, electronic equipment and computer readable storage medium
CN116381672A (en) X-band multi-expansion target self-adaptive tracking method based on twin network radar
CN110751671B (en) Target tracking method based on kernel correlation filtering and motion estimation
CN113470074B (en) Self-adaptive space-time regularization target tracking method based on block discrimination
EP1567986A2 (en) Improvements in image velocity estimation
CN113033356B (en) Scale-adaptive long-term correlation target tracking method
CN115272409A (en) Single-target long-time tracking method based on deep neural network
Baka et al. Confidence of model based shape reconstruction from sparse data
CN113344988B (en) Stereo matching method, terminal and storage medium
CN113112522A (en) Twin network target tracking method based on deformable convolution and template updating
CN113129332A (en) Method and apparatus for performing target object tracking

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant