CN106683110A

CN106683110A - User terminal and object tracking method and device thereof

Info

Publication number: CN106683110A
Application number: CN201510757638.4A
Authority: CN
Inventors: 潘博阳; 陈敏杰; 刘阳; 郭春磊; 林福辉
Original assignee: Spreadtrum Communications Tianjin Co Ltd
Current assignee: Spreadtrum Communications Tianjin Co Ltd
Priority date: 2015-11-09
Filing date: 2015-11-09
Publication date: 2017-05-17

Abstract

The invention relates to a user terminal and a target tracking method and device thereof. The method comprises: training a parameter model of a target according to a first image frame; and predicting the position of the target in a second image frame according to the trained parameter model of the target. Training the parameter model of the target comprises training a first parameter model and a second parameter model of the target, where the second parameter model describes the target using a combination of FHOG features and color features of the target. During tracking, the correlation filter of the KCF algorithm is used within a tracking-by-detection framework, and a combined feature formed by the FHOG and color features is added to improve the performance of the algorithm. The target tracking algorithm expresses information using multiple types of features, has high real-time performance, and can effectively handle the adverse influence on target tracking of factors such as complex backgrounds, illumination changes and non-rigid deformation.

Description

User terminal and target tracking method and device thereof

Technical Field

The present invention relates to the field of wireless communication technologies, and in particular, to a user terminal and a target tracking method and apparatus thereof.

Background

Intelligent human-computer interaction is the development direction of future mobile phone multimedia applications, and target tracking is the basis of intelligent human-computer interaction. Various target tracking algorithms exist in the prior art. Given a video sequence and the initial position of a target, a target tracking algorithm automatically locates the target in the subsequent frames of the sequence.

The following two types of target tracking algorithms exist in the prior art:

1) Tracking-by-detection algorithms

This type of algorithm generally involves two major steps, training and detection. Training generally refers to extracting samples based on the target location in the previous frame and then training a parametric model using a machine learning algorithm. Detection classifies samples of the current frame according to the parametric model trained on the previous frame and predicts the target position in the current frame; samples are then extracted from the current frame to update the parametric model, which is used to predict the target position in the next frame.

2) Correlation tracking algorithms

Correlation filters have been widely used in applications such as object detection and recognition. They have attracted a great deal of attention in the field of visual tracking because their operations are easily converted to element-wise multiplication in the Fourier domain. Bolme et al. proposed the Minimum Output Sum of Squared Error (MOSSE) tracking algorithm, which learns a filter on grayscale images. Henriques et al. proposed the Circulant Structure with Kernels (CSK) tracking algorithm, which uses correlation filters for target tracking in kernel space and achieved the highest speed in a recent benchmark. The same group subsequently proposed the Kernelized Correlation Filter (KCF) algorithm, which adds HOG features to CSK to further improve performance.
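As a non-limiting illustration of why correlation maps to element-wise multiplication in the Fourier domain, the following Python sketch compares a circular cross-correlation computed by explicit cyclic shifts with the same quantity computed through the FFT. The function names and the one-dimensional toy signals are assumptions made for this example and do not come from the prior-art algorithms named above.

```python
import numpy as np


def circular_correlation_direct(x, h):
    """Circular cross-correlation by explicit cyclic shifts.

    r[k] = sum_n x[(n + k) mod N] * h[n]
    """
    n = len(x)
    return np.array([np.dot(np.roll(x, -k), h) for k in range(n)])


def circular_correlation_fft(x, h):
    """Same quantity via the FFT: F(r) = F(x) * conj(F(h)), element-wise."""
    spectrum = np.fft.fft(x) * np.conj(np.fft.fft(h))
    return np.real(np.fft.ifft(spectrum))


rng = np.random.default_rng(0)
signal = rng.standard_normal(64)
template = rng.standard_normal(64)
direct = circular_correlation_direct(signal, template)
fast = circular_correlation_fft(signal, template)
```

The direct version costs on the order of N^2 operations while the FFT version costs on the order of N log N, which is the source of the speed of correlation-filter trackers.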

The prior art scheme including the above two types of target tracking algorithms mainly has the following two defects:

a) In the feature extraction step, a single feature such as grayscale, color or HOG is usually adopted. Because different features differ in expressive power across scenes, a single feature cannot adapt to many different scenes.

b) Some target tracking algorithms, such as CSK, MOSSE and KCF, can only estimate the positional offset of the target, so their tracking performance is poor when the scale of the tracked target changes significantly; other target tracking algorithms can estimate scale changes only at a low frame rate and cannot meet the requirement of real-time tracking. That is, it is difficult to handle scale changes of the tracked target while also maintaining real-time performance.

Disclosure of Invention

The technical problem solved by the invention is how, when tracking a target, to adapt to large scale changes of the tracked target and to a variety of different scenes while ensuring real-time tracking.

In order to solve the above technical problem, an embodiment of the present invention provides a target tracking method, including:

training a parameter model of a target according to the image frame A;

predicting the position of the target in the image frame B according to the trained parameter model of the target;

training the parameter model of the target comprises: training a first parameter model of the target and training a second parameter model of the target;

training the first parameter model of the target comprises: training the first parameter model of the target through a KCF algorithm;

the second parameter model of the target is a model that describes the target using a combination of FHOG features of the target and color features of the target.

Optionally, the first parametric model of the target is a model for predicting the target offset, the second parametric model of the target is a model for predicting the target scale, and both the first parametric model and the second parametric model of the target are regression models based on a kernel correlation filter.

Optionally, the FHOG features of the target are: the 9-dimensional contrast-insensitive features and the 4-dimensional texture features in the FHOG feature vector of the target.

Optionally, training the second parametric model of the target includes:

converting the image frame A from an RGB color space to an HSI color space;

acquiring a hue component and a saturation component of the target in the image frame A in the HSI color space;

mapping the hue component to the direction of the first-order gradient in the FHOG feature and the saturation component to the magnitude of the first-order gradient in the FHOG feature, and calculating a histogram feature of the saturation component over the hue direction to serve as the color feature of the target.

Optionally, the color features of the target are: a 9-dimensional histogram feature of the saturation component over the hue direction of the target in the HSI color space, and a 4-dimensional texture feature of the target in the HSI color space.

Optionally, training the first parametric model of the target through the KCF algorithm includes:

collecting an image sample x of size W × H in a neighborhood centered on the target to train a KCF classifier;

using the properties of circulant matrices and the padded image, KCF takes all cyclic shifts of the sample, $x_{w,h}$, $(w, h) \in \{0, 1, \ldots, W-1\} \times \{0, 1, \ldots, H-1\}$, as training samples of the KCF classifier;

the regression target y takes the value 1 at the center point and decays to 0 toward the target edge as the position moves away from the center point; $y_{w,h}$ denotes the label of $x_{w,h}$;

a function

$$f(z) = \mathbf{w}^{T} z$$

is sought such that the squared error between the samples $x_{w,h}$ and their regression targets $y_{w,h}$ is minimized (the weight vector $\mathbf{w}$ is written in bold to distinguish it from the shift index $w$), i.e.:

$$\min_{\mathbf{w}} \sum_{w,h} \left| \langle \phi(x_{w,h}), \mathbf{w} \rangle - y_{w,h} \right|^{2} + \lambda \lVert \mathbf{w} \rVert^{2}$$

where φ denotes the mapping of samples to a Hilbert space induced by the kernel function κ, and the inner product of x and x′ is expressed as:

$$\langle \phi(x), \phi(x') \rangle = \kappa(x, x')$$

and λ denotes the regularization coefficient;

after the input of the linear problem is mapped to the nonlinear feature space φ(x), the solution $\mathbf{w}$ is expressed as:

$$\mathbf{w} = \sum_{w,h} \alpha(w, h)\, \phi(x_{w,h})$$

the solution for the vector α is:

$$\alpha = F^{-1}\left( \frac{F(y)}{F(k^{x}) + \lambda} \right)$$

where F denotes the Fourier transform, $F^{-1}$ denotes the inverse Fourier transform, and $k^{x} = \kappa(x_{w,h}, x)$;

the vector α contains all the coefficients α(w, h); the target appearance model is updated when the target is processed in each image frame; the first parametric model includes the learned target appearance model and the classifier coefficients F(α).

Optionally, predicting the position of the target in the image frame B according to the trained parametric model of the target includes: predicting the target position in the current frame by calculating the confidence score in the current frame, wherein the prediction formula is as follows:

$$\hat{y} = F^{-1}\left( F(k^{z}) \odot F(\alpha) \right), \qquad k^{z} = \kappa(z_{w,h}, \hat{x})$$

where z denotes the image sample taken from the current frame, ⊙ denotes element-wise multiplication, and $\hat{x}$ denotes the learned target appearance model.
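By way of a hedged illustration only, the training and prediction steps above can be sketched in Python for a single-channel patch with a Gaussian kernel. The choice of a Gaussian kernel, the division of the squared distance by the patch size, the placement of the label peak at shift (0, 0), and all function and parameter names (`sigma`, `sigma_y`, `lam`) are assumptions of this sketch rather than details stated in the method.

```python
import numpy as np


def gaussian_kernel_correlation(a, b, sigma):
    """Gaussian kernel between b and every cyclic shift of a, via the FFT."""
    cross = np.real(np.fft.ifft2(np.fft.fft2(a) * np.conj(np.fft.fft2(b))))
    dist2 = np.sum(a ** 2) + np.sum(b ** 2) - 2.0 * cross
    # Dividing by a.size is a common normalization (assumption of this sketch).
    return np.exp(-np.maximum(dist2, 0.0) / (sigma ** 2 * a.size))


def gaussian_labels(shape, sigma_y):
    """Regression target y: 1 at shift (0, 0), decaying with cyclic distance."""
    rows = np.arange(shape[0])
    cols = np.arange(shape[1])
    dr = np.minimum(rows, shape[0] - rows)[:, None]
    dc = np.minimum(cols, shape[1] - cols)[None, :]
    return np.exp(-(dr ** 2 + dc ** 2) / (2.0 * sigma_y ** 2))


def train(x, y, sigma, lam):
    """Return F(alpha) = F(y) / (F(k^x) + lambda)."""
    k_x = gaussian_kernel_correlation(x, x, sigma)
    return np.fft.fft2(y) / (np.fft.fft2(k_x) + lam)


def detect(alpha_f, x_model, z, sigma):
    """Return the response map F^-1(F(k^z) * F(alpha)) for a new sample z."""
    k_z = gaussian_kernel_correlation(z, x_model, sigma)
    return np.real(np.fft.ifft2(np.fft.fft2(k_z) * alpha_f))


rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))            # toy appearance patch
y = gaussian_labels(x.shape, sigma_y=2.0)
alpha_f = train(x, y, sigma=0.5, lam=1e-4)
z = np.roll(x, shift=(3, 5), axis=(0, 1))    # same patch, cyclically shifted
response = detect(alpha_f, x, z, sigma=0.5)
peak = tuple(int(i) for i in np.unravel_index(np.argmax(response), response.shape))
```

In this toy setting the peak of the response map falls at the cyclic shift that was applied to the patch, which is how the confidence map is read as a position offset.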

Optionally, the target tracking method further includes: presetting a first threshold; and, after predicting the position of the target in the image frame B, performing scale estimation if the confidence of the predicted offset position of the target is greater than the first threshold.

Optionally, performing the scale estimation includes:

assuming that P × Q is the predicted target size in the image frame B and N denotes the number of scales, the scale set S is expressed as:

$$S = \left\{ a^{n} \;\middle|\; n = \left\lfloor -\tfrac{N-1}{2} \right\rfloor, \ldots, \left\lfloor \tfrac{N-1}{2} \right\rfloor \right\}$$

where a denotes the scale parameter; for each s ∈ S:

an image sample centered at the predicted target position and of size sP × sQ is rescaled to a sample of size P × Q;

a scale feature pyramid is constructed using FHOG features, where $\hat{y}_{s}$ denotes the correlation response map with respect to the regression target at scale s;

the optimal scale of the target is then:

$$s^{*} = \arg\max_{s \in S} \max\left( \hat{y}_{s} \right)$$
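The scale search can be sketched as follows, purely as an illustration. The sketch assumes that the scale set takes the commonly used form a^n with n running from floor(-(N-1)/2) to floor((N-1)/2), and that the best scale is the one whose response map has the highest peak; both the function names and the toy response maps are hypothetical.

```python
import numpy as np


def scale_set(a, n_scales):
    """Scales a**n for n = floor(-(N-1)/2), ..., floor((N-1)/2)."""
    exponents = np.arange(-(n_scales // 2), (n_scales - 1) // 2 + 1)
    return float(a) ** exponents


def best_scale(scales, response_maps):
    """Pick the scale whose response map has the highest peak."""
    peaks = [float(np.max(r)) for r in response_maps]
    return float(scales[int(np.argmax(peaks))])


scales = scale_set(a=1.02, n_scales=5)
# Toy response maps: the fourth scale (a**1) is given the strongest peak.
toy_maps = [np.full((4, 4), v) for v in (0.1, 0.3, 0.5, 0.9, 0.2)]
chosen = best_scale(scales, toy_maps)
```

In a real tracker each response map would come from evaluating the trained filter on the sample rescaled from sP × sQ to P × Q; that resampling step is left out here.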

in order to solve the above technical problem, an embodiment of the present invention further provides a target tracking apparatus, including: a model training unit and a target prediction unit; wherein:

the model training unit is suitable for training a parameter model of the target according to the image frame A;

the target prediction unit is suitable for predicting the position of the target in the image frame B according to the trained parameter model of the target after the model training unit executes the operation;

Optionally, the model training unit includes: an HSI conversion subunit, a component acquisition subunit and a color feature calculation subunit; wherein:

the HSI conversion subunit is adapted to convert the image frame A from an RGB color space to an HSI color space;

the component acquisition subunit is adapted to acquire a hue component and a saturation component of the target in the image frame A in the HSI color space after the HSI conversion subunit performs its operation;

the color feature calculation subunit is adapted to, after the component acquisition subunit performs its operation, map the hue component to the direction of the first-order gradient in the FHOG feature and the saturation component to the magnitude of the first-order gradient in the FHOG feature, and calculate a histogram feature of the saturation component over the hue direction as the color feature of the target.

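Optionally, training the first parametric model of the target through the KCF algorithm includes seeking a function

$$f(z) = \mathbf{w}^{T} z$$

such that the squared error between the samples $x_{w,h}$ and their regression targets $y_{w,h}$ is minimized, i.e.:

$$\min_{\mathbf{w}} \sum_{w,h} \left| \langle \phi(x_{w,h}), \mathbf{w} \rangle - y_{w,h} \right|^{2} + \lambda \lVert \mathbf{w} \rVert^{2}$$

where φ denotes the mapping of samples to a Hilbert space induced by the kernel function κ, so that

$$\langle \phi(x), \phi(x') \rangle = \kappa(x, x')$$

and λ denotes the regularization coefficient;

the solution for the vector α is:

$$\alpha = F^{-1}\left( \frac{F(y)}{F(k^{x}) + \lambda} \right)$$

where F denotes the Fourier transform, $F^{-1}$ denotes the inverse Fourier transform, and $k^{x} = \kappa(x_{w,h}, x)$.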

Optionally, the predicting, by the target prediction unit, of the position of the target in the image frame B according to the trained parameter model of the target includes: predicting the target position in the current frame by calculating the confidence score in the current frame, wherein the prediction formula is as follows:

$$\hat{y} = F^{-1}\left( F(k^{z}) \odot F(\alpha) \right), \qquad k^{z} = \kappa(z_{w,h}, \hat{x})$$

where z denotes the image sample taken from the current frame, ⊙ denotes element-wise multiplication, and $\hat{x}$ denotes the learned target appearance model.

Optionally, the target tracking apparatus further includes: a threshold setting unit adapted to set a first threshold in advance; and a scale estimation unit adapted to perform scale estimation, after the target prediction unit performs its operation, if the confidence of the predicted target offset position is greater than the first threshold.

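Optionally, performing the scale estimation includes:

assuming that P × Q is the predicted target size in the image frame B and N denotes the number of scales, the scale set S is expressed as:

$$S = \left\{ a^{n} \;\middle|\; n = \left\lfloor -\tfrac{N-1}{2} \right\rfloor, \ldots, \left\lfloor \tfrac{N-1}{2} \right\rfloor \right\}$$

where a denotes the scale parameter; for each s ∈ S, an image sample centered at the predicted target position and of size sP × sQ is rescaled to a sample of size P × Q, and a scale feature pyramid is constructed using FHOG features, where $\hat{y}_{s}$ denotes the correlation response map with respect to the regression target at scale s;

the optimal scale of the target is then:

$$s^{*} = \arg\max_{s \in S} \max\left( \hat{y}_{s} \right)$$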

in order to solve the above technical problem, an embodiment of the present invention further provides a user terminal, including the above target tracking apparatus.

Optionally, the user terminal is a smart phone or a tablet computer.

Compared with the prior art, the technical scheme of the invention has the following beneficial effects:

When target tracking is carried out, the correlation filter of the KCF algorithm is used within a tracking-by-detection framework, and a combined feature consisting of FHOG features and (HSI) color features is added to improve the performance of the algorithm. The scheme is a target tracking algorithm that expresses information using multiple features, has high real-time performance, and can effectively handle the adverse effects on target tracking of factors such as complex backgrounds, illumination changes and non-rigid deformation.

Further, the FHOG feature vector of the target does not include the 18-dimensional contrast-sensitive features. Owing to the presence of the (HSI) color features, omitting the 18-dimensional contrast-sensitive features of the FHOG feature vector does not greatly affect target tracking. The resulting combined feature formed by the FHOG features and the (HSI) color features has 13 + 13 = 26 dimensions, fewer than the 31 dimensions of the prior-art FHOG feature (which includes the 18-dimensional contrast-sensitive features).

Further, the image frame is converted from the RGB color space to the HSI color space, the hue component is mapped to the direction of the first-order gradient in the FHOG feature, the saturation component is mapped to the magnitude of the first-order gradient in the FHOG feature, and the histogram feature of the saturation component over the hue direction is calculated to serve as the color feature of the target. As a result, the color feature and the FHOG feature in the embodiment have similar structures (both are cell-based), which facilitates concatenating them and using them together.

Further, scale estimation is added to further improve the performance of the algorithm.

Drawings

FIG. 1 is a flow chart of a target tracking method in an embodiment of the present invention;

FIG. 2 is a flow chart of a method of training a second parametric model of a target in an embodiment of the present invention;

FIG. 3 is a block diagram of a target tracking device according to an embodiment of the present invention.

Detailed Description

From the analysis in the Background section, two types of target tracking algorithms exist in the prior art: 1) tracking-by-detection algorithms; and 2) correlation tracking algorithms. Prior-art schemes based on these two types of algorithms mainly have the following two defects: a) in the feature extraction step, a single feature such as grayscale, color or HOG is usually adopted, and because different features differ in expressive power across scenes, a single feature cannot suit many different scenes; b) some target tracking algorithms, such as CSK, MOSSE and KCF, can only estimate the positional offset of the target, so their tracking performance is poor when the scale of the tracked target changes significantly, while other target tracking algorithms can estimate scale changes only at a low frame rate and cannot meet the requirement of real-time tracking. That is, it is difficult to handle scale changes of the tracked target while also maintaining real-time performance.

After research, the inventor proposes a new target tracking method that uses the correlation filter of the KCF algorithm within a tracking-by-detection framework and adds a combined feature composed of FHOG features and (HSI) color features to improve the performance of the algorithm. The scheme is a target tracking algorithm that expresses information using multiple features, has high real-time performance, and can effectively handle the adverse effects on target tracking of factors such as complex backgrounds, illumination changes and non-rigid deformation.

Since the kernel correlation function only requires computing dot products and vector norms, image features can be computed over multiple channels, so the strength of feature fusion can be exploited by incorporating other powerful features.

For the defect a), the application tries to express more abundant information by utilizing a plurality of characteristics by considering the complementarity among different characteristics; for the defect b), in view of higher calculation efficiency of the KCF tracking algorithm, the application utilizes a correlation filter in the KCF algorithm under a tracking-detection framework, and adds a combination feature and a scale estimation to improve the performance of the algorithm.

In order that those skilled in the art will better understand and realize the present invention, the following detailed description is given by way of specific embodiments with reference to the accompanying drawings.

Example one

As described below, an embodiment of the present invention provides a target tracking method.

Referring to a flow chart of a target tracking method shown in fig. 1, the following detailed description is made through specific steps:

S101: training a parameter model of the target according to the image frame A.

The embodiment utilizes a correlation filter in the KCF algorithm under a tracking-detection framework, and adds a combination feature and scale estimation (scale estimation) to improve the performance of the algorithm.

As mentioned above, a tracking-by-detection algorithm involves two major steps, training and detection. Training generally refers to extracting samples based on the target location in the previous frame and then training a parametric model using a machine learning algorithm. Detection classifies samples of the current frame according to the parametric model trained on the previous frame and predicts the target position in the current frame; samples are then extracted from the current frame to update the parametric model, which is used to predict the target position in the next frame.
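The alternation of training, detection and model update described above can be sketched as a simple loop. This is an illustrative skeleton only: the callables `train`, `detect` and `update` and their signatures are hypothetical placeholders for whatever model is actually used, and the toy frames in the usage lines are not real images.

```python
def track(frames, initial_box, train, detect, update):
    """Generic tracking-by-detection loop.

    train(frame, box)            -> model learned from the first frame
    detect(model, frame, box)    -> predicted box in the current frame
    update(model, frame, box)    -> model refreshed with current-frame samples
    """
    model = train(frames[0], initial_box)
    boxes = [initial_box]
    for frame in frames[1:]:
        box = detect(model, frame, boxes[-1])   # predict with previous model
        model = update(model, frame, box)       # prepare for the next frame
        boxes.append(box)
    return boxes


# Toy usage: each "frame" is just the true target position.
toy_frames = [10, 12, 15]
toy_boxes = track(
    toy_frames,
    10,
    train=lambda frame, box: None,
    detect=lambda model, frame, prev_box: frame,
    update=lambda model, frame, box: model,
)
```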

In this embodiment, a parameter model of the target is trained according to the image frame A, and in a subsequent step the position of the target in the image frame B is predicted according to the trained parameter model of the target. Image frame A precedes image frame B.

The difference from the prior art is that, in this embodiment, training the parameter model of the target includes: training a first parametric model of the target and training a second parametric model of the target.

The first parametric model of the target and the second parametric model of the target are both regression models based on a kernel correlation filter.

In a specific implementation, the first parametric model of the target may be a model that predicts the target offset, and the second parametric model of the target may be a model that predicts the target scale.

The present embodiment trains a first parametric model predicting target offsets through the KCF algorithm.

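The KCF algorithm itself belongs to the prior art; how it is applied in the present embodiment to train the first parametric model predicting the target offset is briefly described below. Training the first parametric model of the target through the KCF algorithm comprises:

using the properties of circulant matrices (cyclic shift matrices) and the padded image, KCF takes all cyclic shifts of the sample x, $x_{w,h}$, $(w, h) \in \{0, 1, \ldots, W-1\} \times \{0, 1, \ldots, H-1\}$, as training samples of the KCF classifier;

a function

$$f(z) = \mathbf{w}^{T} z$$

is sought such that the squared error between the samples $x_{w,h}$ and their regression targets $y_{w,h}$ is minimized (the weight vector $\mathbf{w}$ is written in bold to distinguish it from the shift index $w$), i.e.:

$$\min_{\mathbf{w}} \sum_{w,h} \left| \langle \phi(x_{w,h}), \mathbf{w} \rangle - y_{w,h} \right|^{2} + \lambda \lVert \mathbf{w} \rVert^{2}$$

where φ denotes the mapping of samples to a Hilbert space induced by the kernel function κ, so that

$$\langle \phi(x), \phi(x') \rangle = \kappa(x, x')$$

and λ denotes the regularization coefficient;

the solution for the vector α is:

$$\alpha = F^{-1}\left( \frac{F(y)}{F(k^{x}) + \lambda} \right)$$

where F denotes the Fourier transform, $F^{-1}$ denotes the inverse Fourier transform, and $k^{x} = \kappa(x_{w,h}, x)$.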

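In a specific implementation, predicting the position of the target in the image frame B according to the trained parametric model of the target may include: predicting the target position in the current frame by calculating the confidence score in the current frame, wherein the prediction formula is as follows:

$$\hat{y} = F^{-1}\left( F(k^{z}) \odot F(\alpha) \right), \qquad k^{z} = \kappa(z_{w,h}, \hat{x})$$

where z denotes the image sample taken from the current frame, ⊙ denotes element-wise multiplication, and $\hat{x}$ denotes the learned target appearance model.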

The above describes how the KCF algorithm is applied to train the first parametric model that predicts the target shift in the present embodiment. The second parametric model is described below.

The inventor finds that, since the kernel correlation function only requires computing dot products and vector norms, image features can be computed over multiple channels, so the strength of feature fusion can be exploited by incorporating other powerful features.
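To make this observation concrete, the following sketch evaluates a Gaussian kernel correlation over a multi-channel feature map: the dot products for all cyclic shifts are obtained per channel through the FFT and summed over channels, and the norms are taken over all channels. The Gaussian kernel, the normalization by the number of elements, the 26-channel toy features and the function name are assumptions of this example.

```python
import numpy as np


def multichannel_gaussian_correlation(a, b, sigma):
    """Gaussian kernel between b and all cyclic shifts of a.

    a and b are feature maps of shape (H, W, C).
    """
    a_f = np.fft.fft2(a, axes=(0, 1))
    b_f = np.fft.fft2(b, axes=(0, 1))
    # Dot products for every shift, summed over the channel axis.
    cross = np.real(np.fft.ifft2(a_f * np.conj(b_f), axes=(0, 1))).sum(axis=2)
    # Squared distance from the two norms and the dot product.
    dist2 = np.sum(a ** 2) + np.sum(b ** 2) - 2.0 * cross
    return np.exp(-np.maximum(dist2, 0.0) / (sigma ** 2 * a.size))


rng = np.random.default_rng(1)
feat_a = rng.standard_normal((8, 8, 26))
feat_b = rng.standard_normal((8, 8, 26))
k = multichannel_gaussian_correlation(feat_a, feat_b, sigma=0.5)
```

Because the channels enter only through a sum, adding further feature channels changes neither the size of the kernel map nor the structure of the filter.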

The second parametric model in this embodiment describes the target by using a combination of the FHOG feature of the target and the color feature of the target.

The HOG feature was originally proposed by Dalal & Triggs. The operator calculates a first order gradient of each pixel, then gathers the gradients in corresponding cells (cells), then calculates a histogram of each cell, normalizes the histogram along four directions, and finally connects all normalized histograms into corresponding feature vectors.

The solution adopted in this embodiment is based on the improved FHOG feature proposed by Felzenszwalb, which is superior to the original HOG mainly in two respects. First, the cell features normalized in the four directions are summed rather than directly concatenated as in the original, which reduces the dimension of the feature vector to 1/4. Second, a 4-dimensional texture feature vector is added for each cell.
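The dimension reduction obtained by summing instead of concatenating can be illustrated with a toy 9-bin cell histogram and four normalization factors. This is a simplified sketch: the numeric values are invented, and the truncation and scaling steps of a full FHOG implementation are omitted.

```python
import numpy as np


def hog_concat(cell_hist, norm_factors):
    """Original HOG style: concatenate the four normalized histograms."""
    return np.concatenate([cell_hist / f for f in norm_factors])


def fhog_sum(cell_hist, norm_factors):
    """FHOG style: sum the four normalized histograms bin by bin."""
    return np.sum([cell_hist / f for f in norm_factors], axis=0)


cell_hist = np.arange(1.0, 10.0)                  # toy 9-bin histogram
norm_factors = np.array([1.0, 2.0, 4.0, 8.0])     # toy block energies
original = hog_concat(cell_hist, norm_factors)    # 4 x 9 = 36 values
compact = fhog_sum(cell_hist, norm_factors)       # 9 values
```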

In this embodiment, an FHOG feature algorithm is used to calculate the FHOG feature of the target.

Usually, the number of orientation bins of the histogram in the FHOG feature is set to 9, and the FHOG feature vector has 31 dimensions (9 contrast-insensitive features + 18 contrast-sensitive features + 4 texture features).

In this embodiment, the FHOG features of the target are the 9-dimensional contrast-insensitive features and the 4-dimensional texture features in the FHOG feature vector of the target; that is, the 18-dimensional contrast-sensitive features are not included.

Generally, a color image contains information in three channels: red, green and blue. However, feature extraction performed directly in the RGB color space works poorly, because the resulting features mix color information with grayscale (intensity) information. This embodiment converts the RGB color space into the Hue-Saturation-Intensity (HSI) color space, thereby separating color information from grayscale information.
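One common textbook form of the RGB-to-HSI conversion is sketched below for illustration. The embodiment does not state which conversion formula is used, so the formulas, the convention of assigning hue 0 to gray pixels, and the function name are assumptions of this example.

```python
import numpy as np


def rgb_to_hsi(rgb, eps=1e-12):
    """Convert RGB values in [0, 1] to (hue, saturation, intensity).

    Hue is returned in radians in [0, 2*pi). Gray pixels, whose hue is
    undefined, are assigned hue 0 by convention in this sketch.
    """
    rgb = np.asarray(rgb, dtype=np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    intensity = (r + g + b) / 3.0
    min_c = np.minimum(np.minimum(r, g), b)
    saturation = np.where(
        intensity > eps, 1.0 - min_c / np.maximum(intensity, eps), 0.0
    )
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt(np.maximum((r - g) ** 2 + (r - b) * (g - b), 0.0))
    cos_theta = np.where(den > eps, num / np.maximum(den, eps), 1.0)
    theta = np.arccos(np.clip(cos_theta, -1.0, 1.0))
    hue = np.where(b > g, 2.0 * np.pi - theta, theta)
    return hue, saturation, intensity


pixels = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0], [0.5, 0.5, 0.5]]
hue, saturation, intensity = rgb_to_hsi(pixels)
```

For the four toy pixels (red, green, blue, gray) the hue comes out at 0, 120, 240 and 0 degrees respectively, and only the gray pixel has zero saturation.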

Since the grayscale information has already been used in calculating the FHOG feature, the present embodiment, in order to avoid information redundancy, retains only the hue component and the saturation component of the HSI color space and does not consider the intensity component.

Without the intensity component, the hue component and the saturation component form a disk-like space in which hue corresponds to an angle and saturation corresponds to a radius. The hue component is therefore mapped to the direction of the first-order gradient in the FHOG feature, and the saturation component to the magnitude of the first-order gradient in the FHOG feature. The calculation process of the FHOG feature can then be reused to calculate a histogram feature of the saturation component over the hue direction as the color feature of the target. This histogram feature describes the color distribution in the image while avoiding information redundancy.

Since FHOG features and color features in this embodiment have similar structures (both based on cells), the above features can be concatenated to describe the target as a second parametric model.

The above description of the technical solution shows that: in this embodiment, the image frame is converted from the RGB color space to the HSI color space, the hue component corresponds to the direction of the first-order gradient in the FHOG feature, the saturation component corresponds to the magnitude of the first-order gradient in the FHOG feature, and the histogram feature of the saturation component in the hue component direction is calculated and obtained as the color feature of the target, so that the color feature and the FHOG feature in this embodiment have similar structures (both based on the cell), thereby facilitating the connection and the use of the two.

In a specific implementation, as shown in FIG. 2, training the second parametric model of the target may include:

S201: converting the image frame A from an RGB color space to an HSI color space.

S202: acquiring a hue component and a saturation component of the target in the image frame A in the HSI color space.

S203: mapping the hue component to the direction of the first-order gradient in the FHOG feature and the saturation component to the magnitude of the first-order gradient in the FHOG feature, and calculating a histogram feature of the saturation component over the hue direction to serve as the color feature of the target.

The color features of the target are: a 9-dimensional histogram feature of the saturation component over the hue direction of the target in the HSI color space, and a 4-dimensional texture feature of the target in the HSI color space.
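Step S203 can be sketched as a per-cell histogram in which each pixel votes into a hue bin with a weight equal to its saturation. This is a minimal illustration under stated assumptions: hue is taken in radians over the full circle, binning is hard (no interpolation between bins or cells), the normalization steps and the 4-dimensional texture part are omitted, and the cell size and function name are hypothetical.

```python
import numpy as np


def hue_saturation_histogram(hue, saturation, cell_size=4, n_bins=9):
    """Per-cell histogram over hue bins, weighted by saturation.

    hue is in radians in [0, 2*pi); saturation is in [0, 1].
    """
    n_rows = hue.shape[0] // cell_size
    n_cols = hue.shape[1] // cell_size
    bin_index = np.minimum(
        (hue / (2.0 * np.pi) * n_bins).astype(int), n_bins - 1
    )
    hist = np.zeros((n_rows, n_cols, n_bins))
    for i in range(n_rows):
        for j in range(n_cols):
            rows = slice(i * cell_size, (i + 1) * cell_size)
            cols = slice(j * cell_size, (j + 1) * cell_size)
            hist[i, j] = np.bincount(
                bin_index[rows, cols].ravel(),
                weights=saturation[rows, cols].ravel(),
                minlength=n_bins,
            )
    return hist


toy_hue = np.full((8, 8), np.pi)     # every pixel has hue 180 degrees
toy_sat = np.full((8, 8), 0.5)
toy_hist = hue_saturation_histogram(toy_hue, toy_sat)
```

The output has one 9-bin histogram per cell, which is the same cell-based layout as the FHOG orientation histograms.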

The inventors have found in practice that the contrast-sensitive features in the FHOG feature have a large effect when the FHOG feature is used alone, but a smaller effect when the combined feature of the FHOG feature and the (HSI) color feature is used. A possible reason is that the (HSI) color features compensate in some way for the role of the contrast-sensitive features. Removing the 18-dimensional contrast-sensitive features also shortens the feature extraction time. Therefore, the FHOG feature vector of the target in the present embodiment does not contain the 18-dimensional contrast-sensitive features.

As mentioned before, the FHOG features of the target are the 9-dimensional contrast-insensitive features and the 4-dimensional texture features in the FHOG feature vector of the target; the color features of the target are a 9-dimensional histogram feature of the saturation component over the hue direction of the target in the HSI color space, and a 4-dimensional texture feature of the target in the HSI color space.

Therefore, in this embodiment, the combined feature formed by the FHOG features and the (HSI) color features has 13 + 13 = 26 dimensions, fewer than the 31 dimensions of the prior-art FHOG feature (which includes the 18-dimensional contrast-sensitive features). Although converting the image frame from the RGB color space to the HSI color space adds some overhead, the overall overhead does not increase greatly, so high real-time performance is maintained during target tracking.
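The dimension count above can be checked with a small sketch that drops the contrast-sensitive channels from a 31-channel FHOG map and concatenates the remainder with a 13-channel color feature map. The channel ordering assumed here (18 contrast-sensitive, then 9 contrast-insensitive, then 4 texture channels) is an assumption of this example and may differ between FHOG implementations; the function names are hypothetical.

```python
import numpy as np

N_SENSITIVE, N_INSENSITIVE, N_TEXTURE = 18, 9, 4


def reduced_fhog(fhog31):
    """Keep the 9 contrast-insensitive and 4 texture channels.

    ASSUMPTION: channels are ordered [18 sensitive | 9 insensitive | 4 texture].
    """
    return fhog31[..., N_SENSITIVE:]


def combined_feature(fhog31, color13):
    """Concatenate the reduced FHOG map with the color map along channels."""
    return np.concatenate([reduced_fhog(fhog31), color13], axis=-1)


fhog31 = np.zeros((10, 10, N_SENSITIVE + N_INSENSITIVE + N_TEXTURE))
color13 = np.zeros((10, 10, 13))
combined = combined_feature(fhog31, color13)
```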

In addition, compared with the prior art in which RGB and gradient are directly connected based on pixel level, the combined feature formed by the FHOG feature and the (HSI) color feature in the present embodiment not only can reduce the dimension of feature mapping (feature map), but also has invariance (invariance) to small transformation (small transformation).

In an alternative embodiment, further scale estimation may be performed.

In a specific implementation, a first threshold may be preset, and the target tracking method may further include: after predicting the position of the target in the image frame B, performing scale estimation if the confidence of the predicted offset position of the target is greater than the first threshold.
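The threshold gate can be sketched as follows. The sketch assumes that the confidence is the peak value of the position response map, which the text does not state explicitly; `estimate_scale` is a hypothetical callable standing in for the scale estimation step, and the toy numbers are invented.

```python
import numpy as np


def track_step(response, current_scale, estimate_scale, first_threshold):
    """Locate the target, then re-estimate scale only if confident enough."""
    confidence = float(np.max(response))   # assumed confidence measure
    position = tuple(
        int(i) for i in np.unravel_index(np.argmax(response), response.shape)
    )
    if confidence > first_threshold:
        return position, estimate_scale(position)
    return position, current_scale          # keep the previous scale


toy_response = np.zeros((5, 5))
toy_response[2, 3] = 0.8
confident = track_step(toy_response, 1.0, lambda pos: 1.05, first_threshold=0.5)
uncertain = track_step(toy_response, 1.0, lambda pos: 1.05, first_threshold=0.9)
```

Skipping the scale search when the position estimate is unreliable avoids both wasted computation and scale updates driven by a wrong location.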

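Performing the scale estimation comprises:

assuming that P × Q is the predicted target size in the image frame B and N denotes the number of scales, the scale set S is expressed as:

$$S = \left\{ a^{n} \;\middle|\; n = \left\lfloor -\tfrac{N-1}{2} \right\rfloor, \ldots, \left\lfloor \tfrac{N-1}{2} \right\rfloor \right\}$$

where a denotes the scale parameter; for each s ∈ S, an image sample centered at the predicted target position and of size sP × sQ is rescaled to a sample of size P × Q, and a scale feature pyramid is constructed using FHOG features, where $\hat{y}_{s}$ denotes the correlation response map with respect to the regression target at scale s;

the optimal scale of the target is then:

$$s^{*} = \arg\max_{s \in S} \max\left( \hat{y}_{s} \right)$$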

after the parameter model of the target is trained according to the previous image frame A, the position of the target in the next image frame B can be predicted according to the trained parameter model of the target.

The above description of the technical solution shows that: in this embodiment, a scale estimation is added to further improve the performance of the algorithm.

S102, predicting the position of the target in the image frame B according to the trained parameter model of the target.

The above description of the technical solution shows that: in the embodiment, when the target tracking is carried out, a correlation filter in the KCF algorithm is utilized under a tracking-detection framework, and the combined feature formed by the FHOG feature and the (HSI) color feature is added to improve the performance of the algorithm. The scheme is a target tracking algorithm which utilizes various characteristics to express information, has high real-time performance, and can effectively deal with adverse effects on target tracking caused by adverse factors such as complex background, illumination, non-rigid transformation and the like.

On this basis, the present embodiment may further be modified as follows: 1) the discrete Fourier transform can be subjected to parallel acceleration processing through a Neon or GPU acceleration technology, so that the operation efficiency of the algorithm is improved; 2) convolutional neural network auto-learning features may also be employed for feature extraction.

Example two

As described below, an embodiment of the present invention provides a target tracking apparatus.

Referring to fig. 3, a block diagram of a target tracking apparatus is shown.

The target tracking apparatus includes: a model training unit 301 and a target prediction unit 302; the main functions of each unit are as follows:

a model training unit 301 adapted to train a parametric model of the target according to the image frame A;

a target prediction unit 302 adapted to predict the position of the target in the image frame B according to the trained parametric model of the target after the model training unit 301 performs its operation.

In a specific implementation, the first parametric model of the target may be a model for predicting a target offset, the second parametric model of the target may be a model for predicting a target scale, and both the first parametric model of the target and the second parametric model of the target may be regression models based on a kernel correlation filter.

In a specific implementation, the FHOG characteristic of the target may be: the 9-dimensional contrast insensitive feature and the 4-dimensional texture feature in the FHOG feature vector of the target.

The above description of the technical solution shows that, in this embodiment, the FHOG feature vector of the target does not include the 18-dimensional contrast-sensitive features. Owing to the presence of the (HSI) color features, omitting the 18-dimensional contrast-sensitive features of the FHOG feature vector does not greatly affect target tracking. The resulting combined feature formed by the FHOG features and the (HSI) color features has 13 + 13 = 26 dimensions, fewer than the 31 dimensions of the prior-art FHOG feature (which includes the 18-dimensional contrast-sensitive features).

In a specific implementation, the model training unit 301 may include: an HSI transformation subunit 3011, a component acquisition subunit 3012, and a color feature calculation subunit 3013; wherein:

an HSI transformation subunit 3011 adapted to transform image frame a from an RGB color space to an HSI color space;

a component acquisition subunit 3012 adapted to acquire a hue component and a saturation component of the target in image frame a in the HSI color space after the HSI transformation subunit 3011 performs its operation;

a color feature calculation subunit 3013 adapted to, after the component acquisition subunit 3012 performs its operation, map the hue component to the direction of the first-order gradient in the FHOG feature, map the saturation component to the magnitude of the first-order gradient in the FHOG feature, and calculate a histogram feature of the saturation component in the direction of the hue component as the color feature of the target.
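
The three subunits above can be sketched as follows, under stated assumptions. The RGB-to-HSI step uses the standard hue and saturation formulas. The cell size of 4 pixels, the use of 9 hue bins over the full circle, and the function names are hypothetical choices for illustration only. The sketch computes only the 9-bin histogram part of the color feature, with hue playing the role of gradient direction and saturation the role of gradient magnitude.

```python
import numpy as np

def rgb_to_hue_saturation(rgb):
    """rgb: float array (H, W, 3) in [0, 1]. Returns HSI hue in [0, 2*pi) and saturation."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    eps = 1e-12
    num = 0.5 * ((r - g) + (r - b))
    den = np.sqrt((r - g) ** 2 + (r - b) * (g - b)) + eps
    theta = np.arccos(np.clip(num / den, -1.0, 1.0))
    hue = np.where(b > g, 2.0 * np.pi - theta, theta)
    total = r + g + b
    # Saturation is undefined for black pixels; set it to 0 there.
    sat = np.where(total > 0, 1.0 - 3.0 * np.min(rgb, axis=-1) / (total + eps), 0.0)
    return hue, sat

def hue_saturation_histogram(rgb, cell=4, bins=9):
    """Per-cell histogram over hue bins, each pixel weighted by its saturation."""
    hue, sat = rgb_to_hue_saturation(rgb)
    rows, cols = hue.shape[0] // cell, hue.shape[1] // cell
    idx = np.minimum((hue / (2.0 * np.pi) * bins).astype(int), bins - 1)
    hist = np.zeros((rows, cols, bins))
    for i in range(rows):
        for j in range(cols):
            block = (slice(i * cell, (i + 1) * cell), slice(j * cell, (j + 1) * cell))
            hist[i, j] = np.bincount(idx[block].ravel(),
                                     weights=sat[block].ravel(),
                                     minlength=bins)
    return hist

# Hypothetical input: an 8 x 8 patch of pure red.
patch = np.zeros((8, 8, 3))
patch[..., 0] = 1.0
hist = hue_saturation_histogram(patch)
print(hist.shape)
```

Weighting by saturation means that nearly gray pixels, whose hue is unreliable, contribute little to the histogram, in the same way that weak gradients contribute little to an orientation histogram.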

In a specific implementation, the color feature of the target may be: a 9-dimensional histogram feature of the saturation component in the direction of the hue component of the target in the HSI color space, and a 4-dimensional texture feature of the target in the HSI color space.

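In a specific implementation, training the first parametric model of the target through the KCF algorithm may include:

finding the following function:

f(z) = w^{T} z

such that the regularized squared error between the samples x_{w,h} and their regression targets y_{w,h} is minimized, i.e.:

\min_{w} \sum_{w,h} \left| \langle φ(x_{w,h}), w \rangle - y(w,h) \right|^{2} + λ \| w \|^{2}

wherein φ is the feature mapping induced by the kernel κ:

\langle φ(x), φ(x') \rangle = κ(x, x')

wherein λ represents the regularization coefficient;

w = \sum_{w,h} α(w,h) φ(x_{w,h})

the solution for the vector α is:

α = F^{-1} \left( \frac{F(y)}{F(k^{x}) + λ} \right)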
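
The closed-form solution for α can be checked numerically with a small sketch. It assumes that F denotes the discrete Fourier transform, that k^{x} is the kernel between the sample x and each of its cyclic shifts, and that the kernel is Gaussian. The 1-D signal, the kernel width and the value of λ are hypothetical. The Fourier-domain result is compared with a direct solve of the equivalent circulant ridge-regression system.

```python
import numpy as np

def kernel_autocorrelation(x, sigma=1.0):
    """k^x: Gaussian kernel between x and each cyclic shift of x (brute force, 1-D)."""
    return np.array([np.exp(-np.sum((x - np.roll(x, s)) ** 2) / sigma ** 2)
                     for s in range(x.size)])

def solve_alpha(kx, y, lam):
    """alpha = F^{-1}( F(y) / (F(k^x) + lambda) )."""
    return np.real(np.fft.ifft(np.fft.fft(y) / (np.fft.fft(kx) + lam)))

n, lam = 16, 1e-2
rng = np.random.default_rng(0)
x = rng.standard_normal(n)                      # hypothetical training sample
d = np.minimum(np.arange(n), n - np.arange(n))  # cyclic distance from shift 0
y = np.exp(-0.5 * d ** 2)                       # regression target peaked at shift 0

kx = kernel_autocorrelation(x)
alpha = solve_alpha(kx, y, lam)

# Dense reference: the kernel matrix over all cyclic shifts is circulant.
K = np.array([[kx[(i - j) % n] for j in range(n)] for i in range(n)])
alpha_dense = np.linalg.solve(K + lam * np.eye(n), y)
print(np.max(np.abs(alpha - alpha_dense)))
```

The Fourier form replaces an n x n linear solve with element-wise division, which is what makes training at every frame affordable.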

In a specific implementation, predicting the position of the target in image frame b by the target prediction unit 302 according to the trained parametric model of the target may include: predicting the target position in the current frame by calculating the confidence score in the current frame, wherein the prediction formula is as follows:
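
As a hedged illustration of confidence-score prediction, the sketch below uses the detection step of the standard KCF formulation, in which the response map is F^{-1}(F(k^{xz}) ⊙ F(α)) and its peak gives the target offset. That formula is an assumption taken from the general KCF literature and is not quoted from this embodiment. The Gaussian kernel, its normalization by the patch size, and all function names and parameter values are likewise hypothetical.

```python
import numpy as np

def gaussian_labels(shape, sigma=2.0):
    """Regression target with its peak at offset (0, 0), wrapping cyclically."""
    h, w = shape
    dy = np.minimum(np.arange(h), h - np.arange(h))
    dx = np.minimum(np.arange(w), w - np.arange(w))
    return np.exp(-(dy[:, None] ** 2 + dx[None, :] ** 2) / (2.0 * sigma ** 2))

def gaussian_correlation(x, z, sigma=0.5):
    """Gaussian kernel between x and every cyclic shift of z, computed via FFT."""
    cross = np.real(np.fft.ifft2(np.conj(np.fft.fft2(x)) * np.fft.fft2(z)))
    dist = (np.sum(x ** 2) + np.sum(z ** 2) - 2.0 * cross) / x.size
    return np.exp(-np.maximum(dist, 0.0) / sigma ** 2)

def train(x, y, lam=1e-4):
    """Return F(alpha) = F(y) / (F(k^{xx}) + lambda)."""
    return np.fft.fft2(y) / (np.fft.fft2(gaussian_correlation(x, x)) + lam)

def detect(alpha_hat, x, z):
    """Response map for candidate patch z; returns (peak offset, confidence score)."""
    kxz = gaussian_correlation(x, z)
    response = np.real(np.fft.ifft2(np.fft.fft2(kxz) * alpha_hat))
    peak = np.unravel_index(np.argmax(response), response.shape)
    return peak, response[peak]

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32))              # learned appearance (hypothetical)
z = np.roll(x, shift=(3, 5), axis=(0, 1))      # same patch, cyclically displaced
alpha_hat = train(x, gaussian_labels(x.shape))
peak, confidence = detect(alpha_hat, x, z)
print(peak, confidence)
```

In this synthetic case the candidate patch is an exact cyclic shift of the training patch, so the response peak should fall at the applied displacement; on real frames the peak value serves as the confidence score.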

In a specific implementation, the target tracking apparatus may further include: a threshold setting unit 303 adapted to set a first threshold in advance; and a scale estimation unit 304 adapted to perform scale estimation when, after the target prediction unit 302 performs its operation, the confidence of the predicted target offset position is greater than the first threshold.

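In a specific implementation, performing the scale estimation may include: assuming that P × Q is the predicted target size in image frame b, and N represents the number of scales, the scale set S is expressed as:

S = \{ a^{n} \mid n = [-\frac{N-1}{2}], [-\frac{N-3}{2}], \ldots, [\frac{N-1}{2}] \}

wherein a represents a scale parameter; for s ∈ S:

a scale feature pyramid is constructed using FHOG features, wherein \hat{y}_{s} represents the correlation response map of the regression target at scale s;

the optimal scale for the target is then:

\hat{s} = \underset{s}{\arg\max} ( \max(\hat{y}_{1}), \max(\hat{y}_{2}), \ldots, \max(\hat{y}_{s}) ).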
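
The threshold-gated scale search can be sketched as follows. The sketch assumes an odd number of scales N, so that the exponents run over the integers from -(N-1)/2 to (N-1)/2, and it takes the per-scale response maps as given rather than computing them from a feature pyramid. The values a = 1.02, N = 5, the threshold of 0.5 and the function names are hypothetical.

```python
import numpy as np

def scale_set(a, N):
    """S = {a^n}, n = -(N-1)/2, ..., (N-1)/2; N is ASSUMED odd."""
    n = np.arange(N, dtype=float) - (N - 1) / 2.0
    return a ** n

def estimate_scale(responses, scales, confidence, threshold, current_scale=1.0):
    """Pick the scale whose response map has the highest peak.

    Scale estimation runs only when the offset-prediction confidence exceeds
    the first threshold; otherwise the current scale is kept.
    """
    if confidence <= threshold:
        return current_scale
    peaks = [np.max(r) for r in responses]
    return float(scales[int(np.argmax(peaks))])

scales = scale_set(a=1.02, N=5)
# Placeholder response maps, one per scale, with known peak values.
responses = [np.full((4, 4), v) for v in (0.1, 0.3, 0.4, 0.9, 0.2)]
best = estimate_scale(responses, scales, confidence=0.8, threshold=0.5)
kept = estimate_scale(responses, scales, confidence=0.3, threshold=0.5)
print(best, kept)
```

Gating on the confidence avoids updating the scale from a frame in which the position estimate itself is unreliable.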

Example three

As described below, an embodiment of the present invention provides a user terminal.

The difference from the prior art is that the user terminal further comprises a target tracking device as provided in the embodiments of the present invention. Therefore, when the user terminal carries out target tracking, a correlation filter in the KCF algorithm is utilized under a tracking-detection framework, and a combined feature formed by FHOG features and (HSI) color features is added to improve the performance of the algorithm. The scheme is a target tracking algorithm which utilizes various characteristics to express information, has high real-time performance, and can effectively deal with adverse effects on target tracking caused by adverse factors such as complex background, illumination, non-rigid transformation and the like.

In a specific implementation, the user terminal may be a smartphone or a tablet computer.

Those skilled in the art will understand that, in the methods of the embodiments, all or part of the steps can be performed by hardware associated with program instructions, and the program can be stored in a computer-readable storage medium, which can include: ROM, RAM, magnetic or optical disks, and the like.

Although the present invention is disclosed above, the present invention is not limited thereto. Various changes and modifications may be effected therein by one skilled in the art without departing from the spirit and scope of the invention as defined in the appended claims.

Claims

1. A target tracking method, comprising:

training a parameter model of a target according to the image frame A;

2. The target tracking method of claim 1, wherein the first parametric model of the target is a model that predicts a target offset, the second parametric model of the target is a model that predicts a target scale, and the first parametric model of the target and the second parametric model of the target are regression models based on a kernel correlation filter.

3. The target tracking method of claim 1, wherein the FHOG feature of the target is: the 9-dimensional contrast-insensitive feature and the 4-dimensional texture feature in the FHOG feature vector of the target.

4. The target tracking method of claim 1, wherein the second parametric model of the training target comprises:

converting the image frame A from an RGB color space to an HSI color space;

5. The target tracking method of claim 1, wherein the color feature of the target is: a 9-dimensional histogram feature of the saturation component in the direction of the hue component of the target in the HSI color space, and a 4-dimensional texture feature of the target in the HSI color space.

6. The target tracking method of claim 1, wherein the training of the first parametric model of the target by the KCF algorithm comprises:

finding the following function:

f(z) = w^{T} z

such that the regularized squared error between the samples x_{w,h} and their regression targets y_{w,h} is minimized, i.e.:

\min_{w} \sum_{w,h} \left| \langle φ(x_{w,h}), w \rangle - y(w,h) \right|^{2} + λ \| w \|^{2}

\langle φ(x), φ(x') \rangle = κ(x, x')

wherein λ represents the regularization coefficient;

w = \sum_{w,h} α(w,h) φ(x_{w,h})

the solution for the vector α is:

α = F^{-1} \left( \frac{F(y)}{F(k^{x}) + λ} \right)

7. The target tracking method of claim 1, wherein said predicting the position of said target in image frame b based on said trained parametric model of said target comprises the steps of: predicting the target position of the current frame by calculating the confidence score in the current frame, wherein the prediction formula is as follows:

where ⊙ denotes point-by-point multiplication,representing the learned target appearance model.

8. The target tracking method of claim 1, further comprising: presetting a first threshold; the target tracking method further comprises: after the predicting the position of the target in the image frame B, performing scale estimation under the condition that the confidence of predicting the offset position of the target is greater than the first threshold value.

9. The target tracking method of claim 8, wherein said performing a scale estimation comprises:

S = \{ a^{n} \mid n = [-\frac{N-1}{2}], [-\frac{N-3}{2}], \ldots, [\frac{N-1}{2}] \}

wherein a represents a scale parameter; for s ∈ S:

the optimal scale for the target is then:

\hat{s} = \underset{s}{\arg\max} ( \max(\hat{y}_{1}), \max(\hat{y}_{2}), \ldots, \max(\hat{y}_{s}) ).

10. A target tracking device, comprising: a model training unit and a target prediction unit;

wherein:

11. The target tracking device of claim 10, wherein the first parametric model of the target is a model that predicts a target offset, the second parametric model of the target is a model that predicts a target scale, and the first parametric model of the target and the second parametric model of the target are regression models based on a kernel correlation filter.

12. The target tracking device of claim 10, wherein the FHOG feature of the target is: the 9-dimensional contrast-insensitive feature and the 4-dimensional texture feature in the FHOG feature vector of the target.

13. The target tracking device of claim 10, wherein the model training unit comprises: an HSI conversion subunit, a component acquisition subunit and a color feature calculation subunit; wherein:

14. The target tracking device of claim 10, wherein the color feature of the target is: a 9-dimensional histogram feature of the saturation component in the direction of the hue component of the target in the HSI color space, and a 4-dimensional texture feature of the target in the HSI color space.

15. The target tracking device of claim 10, wherein training the first parametric model of the target via the KCF algorithm comprises:

finding the following function:

f(z) = w^{T} z

such that the regularized squared error between the samples x_{w,h} and their regression targets y_{w,h} is minimized, i.e.:

\min_{w} \sum_{w,h} \left| \langle φ(x_{w,h}), w \rangle - y(w,h) \right|^{2} + λ \| w \|^{2},

\langle φ(x), φ(x') \rangle = κ(x, x')

wherein λ represents the regularization coefficient;

w = \sum_{w,h} α(w,h) φ(x_{w,h})

the solution for the vector α is:

α = F^{-1} \left( \frac{F(y)}{F(k^{x}) + λ} \right)

16. The target tracking device of claim 10, wherein the target prediction unit predicting the position of the target in image frame b based on the trained parametric model of the target comprises: predicting the target position in the current frame by calculating the confidence score in the current frame, wherein the prediction formula is as follows:

17. The target tracking device of claim 10, wherein the target tracking device further comprises: a threshold setting unit adapted to set a first threshold in advance; the target tracking apparatus further includes: a scale estimation unit adapted to perform scale estimation in a case where the confidence of predicting the target offset position is greater than the first threshold after the target prediction unit performs the operation.

18. The target tracking device of claim 17, wherein said performing a scale estimation comprises: assuming that P × Q is the predicted target size in image frame b, and N represents the number of scales, then scale S is represented as:

S = \{ a^{n} \mid n = [-\frac{N-1}{2}], [-\frac{N-3}{2}], \ldots, [\frac{N-1}{2}] \}

wherein a represents a scale parameter; for s ∈ S:

the optimal scale for the target is then:

\hat{s} = \underset{s}{\arg\max} ( \max(\hat{y}_{1}), \max(\hat{y}_{2}), \ldots, \max(\hat{y}_{s}) ).

19. A user terminal, characterized in that it comprises a target tracking device according to any one of claims 10 to 18.

20. The user terminal of claim 19, wherein the user terminal is a smartphone or a tablet computer.