CN107194951B

CN107194951B - Target tracking method based on restricted structure graph search

Info

Publication number: CN107194951B
Application number: CN201710302422.8A
Authority: CN
Inventors: 黄庆明; 独大为; 齐洪钢
Original assignee: University of Chinese Academy of Sciences
Current assignee: University of Chinese Academy of Sciences
Priority date: 2017-05-02
Filing date: 2017-05-02
Publication date: 2020-08-21
Anticipated expiration: 2037-05-02
Also published as: CN107194951A

Abstract

The invention discloses a target tracking method based on restricted structure graph search, comprising the following steps: S0, initializing the target model; S1, inputting the next video frame; S2, solving the target component labels; S3, solving the target state; S4, updating the target model; S5, if the energy decreased in step S4, the model is updated and the method returns to step S2 to continue the iterative optimization; otherwise, the iterative loop exits, the optimal target state of the current frame is output, and the method returns to step S1. The invention has the following advantages: (1) the sequentially organized modules are considered jointly within a single energy minimization framework, so that the mutual support among the modules can be better exploited, the modules constrain and reinforce one another, and the tracking performance is improved; (2) an optimization method based on alternating iteration is adopted, which decomposes the original multivariable optimization problem into several more tractable energy minimization sub-problems that are solved one by one, improving the target tracking accuracy.

Description

Target tracking method based on restricted structure graph search

Technical Field

The invention relates to a target tracking method, in particular to a target tracking method based on restricted structure graph search, and belongs to the technical field of computers.

Background

Target tracking is one of the important research topics in computer vision. Accurate target tracking provides a reliable basis for further analysis of video data, and it is therefore widely applied in important settings such as autonomous driving, video surveillance, unmanned aerial vehicles and human-computer interaction. Although target tracking has advanced considerably, its performance is still limited by many challenges, such as geometric deformation of the target, partial occlusion and background clutter.

At present, most target tracking algorithms can be classified by their target representation into algorithms based on a holistic (whole-box) target model and algorithms based on a target component model. The former is robust to illumination change, background clutter and similar conditions, but is prone to tracking failure when the appearance changes drastically due to deformation, occlusion or scale change. The latter represents the target with a set of target components (pixels, superpixels, rectangular components and the like) and mainly learns the local structure information of the target.

Graph model based tracking algorithms typically contain the following three sequential modules: target component selection, target component matching, and target state estimation. The target component selection refers to distinguishing candidate target components from the background by using an appearance model, the target component matching refers to associating components of two adjacent frames according to appearance and structural similarity, and the target state estimation refers to estimating a target state (a target center position and a target scale) according to a matching result.

Such sequential mechanisms are not sufficient to enable robust tracking in complex scenarios.

First, the target component selection and the subsequent two modules are relatively independent, so that an inaccurate appearance model directly has negative influence on a matching result, and further causes tracking failure.

Second, they do not take into account global constraints, making them sensitive to background noise in a cluttered background.

Further, this mechanism is not sufficient to reflect the true relationship of these three modules:

(1) matching the components of two consecutive frames can provide supplementary information for component selection in the current frame, and vice versa;

(2) component selection and matching capture local appearance changes and contribute to the estimation of the overall target state;

(3) the estimated target state, in turn, can provide an overall constraint for component selection and matching, achieving higher accuracy.

If the mutual supporting and promoting relations among the three modules can be considered at the same time, the method can help to build a more accurate tracking model. Previous work did not take these issues into account.

Disclosure of Invention

To overcome the shortcomings of existing graph-model-based tracking algorithms, which are limited to local appearance modeling and cannot simultaneously account for the mutual support among the modules, the invention aims to provide a target tracking method based on restricted structure graph search.

In order to achieve the above object, the present invention adopts the following technical solutions:

A target tracking method based on restricted structure graph search, characterized by comprising the following steps:

S0, initializing the target model

A target search region R twice the size of the target is obtained from the target state annotated in the first frame, and the target search region R is then over-segmented into a series of superpixels, each consisting of pixels of similar color;

Superpixels inside the target box are collected as positive samples and superpixels outside the target box as negative samples; from these, a linear support vector machine is learned as the appearance model M, and a correlation filter model F is learned to serve as the global constraint; in addition, a target structure graph model G = {V, C} is established, where V denotes the set of target components and C denotes the set of edges formed by the relationships between adjacent target components;

S1, inputting the next video frame

The target search region of the current frame is determined from the target state B of the previous frame, that is, a region twice the current target scale centered on the target center position is searched, and the target search region is then over-segmented into a series of superpixels;

S2, solving the target component labels

Given the target models M, G, F and the target state B, the energy minimization function can be expressed as:

where $E^{PS}$ and $E^{PM}$ denote the energy of target component selection and the energy of target component matching, respectively,

and $P(l_p = f_0)$ respectively denote the proportions of foreground and background pixels within the rectangular box of the target state, and $\lambda^{1}$ is the coefficient balancing the two terms;

a graph cut algorithm is then used to solve for the target component labels L;

S3, solving the target state

After a target component label L is obtained, solving a target state B in combination with a correlation filter model F;

S4, updating the target model

The target models M, G and F are updated according to the solved target component labels L and target state B, as shown in the following formula:

where $E^{da}$ denotes the probability that a superpixel component p belongs to the foreground or the background, $E^{PM}$ denotes the energy of target component matching, $F(B, R)$ denotes the response score corresponding to any target state B in the search region R, and $\lambda^{1}$ is the coefficient balancing the two terms;

the terms are mutually independent, so each model can be updated separately: training samples are obtained from the target component labels L and the target state B, and new target models M, G and F are learned;

S5, outputting the target state of the current frame

If the energy decreased in step S4, the model is updated and the method returns to step S2 to continue the iterative optimization; otherwise, the iterative loop exits, the optimal target state of the current frame is output, and the method proceeds to step S1.

In the foregoing target tracking method based on restricted structure graph search, in step S0 the Simple Linear Iterative Clustering (SLIC) algorithm is used to over-segment the target search region R into a series of superpixels, each consisting of pixels of similar color.

In the foregoing target tracking method based on restricted structure graph search, in step S3 the target state B is solved as follows:

(1) a series of candidate target states is obtained by sampling;

(2) the target center position of the current frame is selected based on the target scale of the previous frame;

(3) the target scale that minimizes the energy of the objective function is selected.
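The three sub-steps above can be sketched as a two-stage search over sampled candidates. This is a minimal illustration only: the function names `sample_centers` and `solve_target_state`, the grid sampling pattern, the scale factors and the toy energy are all assumptions of this sketch, since the text does not specify the sampling scheme.

```python
from itertools import product


def sample_centers(cx, cy, radius, step):
    """Hypothetical sampler: a regular grid of candidate centers around (cx, cy)."""
    offsets = range(-radius, radius + 1, step)
    return [(cx + dx, cy + dy) for dx, dy in product(offsets, offsets)]


def solve_target_state(prev_state, energy, radius=8, step=4,
                       scales=(0.95, 1.0, 1.05)):
    """Pick the center first (at the previous scale), then the scale.

    prev_state is (cx, cy, w, h); energy maps a state tuple to a float,
    lower being better. Both conventions are assumptions of this sketch.
    """
    cx, cy, w, h = prev_state
    # Sub-steps (1) and (2): sample centers, keep the best at the previous scale.
    centers = sample_centers(cx, cy, radius, step)
    best_cx, best_cy = min(centers, key=lambda c: energy((c[0], c[1], w, h)))
    # Sub-step (3): with the center fixed, keep the scale with the lowest energy.
    candidates = [(best_cx, best_cy, w * s, h * s) for s in scales]
    return min(candidates, key=energy)


# Toy energy whose minimum lies at center (104, 96) and size (42, 21).
def toy_energy(state):
    cx, cy, w, h = state
    return (cx - 104) ** 2 + (cy - 96) ** 2 + (w - 42) ** 2 + (h - 21) ** 2


best = solve_target_state((100, 100, 40, 20), toy_energy)
```

Separating the center search from the scale search keeps the number of evaluated candidates at the sum, rather than the product, of the two sample counts.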

The invention has the advantages that:

(1) whereas previous target tracking algorithms based on a structure graph model consider the contribution of each module relatively independently, the invention considers the sequentially organized modules of the graph-model-based tracking algorithm jointly within a single energy minimization framework, which better exploits the mutual support among the modules, lets the modules constrain and reinforce one another, and improves the tracking performance;

(2) the invention adopts an optimization method based on alternating iteration, which decomposes the original multivariable optimization problem into several more tractable energy minimization sub-problems that are solved one by one, so that the learned target structure graph model better represents local changes of the target structure under the constraint of the global target representation, improving the target tracking accuracy.

Drawings

FIG. 1 is a schematic diagram of target tracking based on restricted structure graph search according to the present invention;

FIG. 2 is a flow chart of the target tracking method based on restricted structure graph search according to the present invention.

Detailed Description

The invention is described in detail below with reference to the figures and the embodiments.

Referring to FIG. 1 and FIG. 2, the target tracking method based on restricted structure graph search of the present invention includes the following steps:

S0, initializing the target model

A target search region R twice the target scale is obtained from the target state annotated in the first frame, and the Simple Linear Iterative Clustering algorithm (SLIC algorithm) is then used to over-segment the target search region R into a series of superpixels, each consisting of pixels of similar color.

Compared with a regular image block, the superpixel can better retain target edge information and reduce background noise.

Superpixels inside the target box are collected as positive samples and superpixels outside the target box as negative samples; from these, a linear support vector machine is learned as the appearance model M, and a correlation filter model F is learned to serve as the global constraint. In addition, a target structure graph model G = {V, C} is established, where V denotes the set of target components and C denotes the set of edges formed by the relationships between adjacent target components.
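The sample collection and the graph G = {V, C} can be illustrated on a toy superpixel label map. The sketch below assumes that superpixels are given as a 2D grid of integer ids, that a superpixel counts as positive when its centroid falls inside the target box, and that two superpixels are adjacent when they share a 4-neighbor pixel boundary. These rules and the function name `init_samples_and_graph` are assumptions; training the support vector machine and the correlation filter is omitted.

```python
from collections import defaultdict


def init_samples_and_graph(segments, box):
    """segments: 2D list of superpixel ids; box: (x0, y0, x1, y1), end-exclusive.

    Returns (positives, negatives, V, C), where V is the set of target
    components (positive superpixels) and C the edges between adjacent ones.
    """
    sums = defaultdict(lambda: [0, 0, 0])  # id -> [sum_x, sum_y, pixel_count]
    adjacent = set()
    rows, cols = len(segments), len(segments[0])
    for y in range(rows):
        for x in range(cols):
            sid = segments[y][x]
            acc = sums[sid]
            acc[0] += x
            acc[1] += y
            acc[2] += 1
            # Record 4-neighbor adjacency with the right and lower pixels.
            for ny, nx in ((y, x + 1), (y + 1, x)):
                if ny < rows and nx < cols and segments[ny][nx] != sid:
                    adjacent.add(tuple(sorted((sid, segments[ny][nx]))))
    x0, y0, x1, y1 = box
    positives, negatives = set(), set()
    for sid, (sx, sy, n) in sums.items():
        inside = x0 <= sx / n < x1 and y0 <= sy / n < y1
        (positives if inside else negatives).add(sid)
    V = positives
    C = {(a, b) for a, b in adjacent if a in V and b in V}
    return positives, negatives, V, C


# Toy 4x4 label map with four 2x2 superpixels; the box covers the left half.
segments = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [2, 2, 3, 3],
    [2, 2, 3, 3],
]
pos, neg, V, C = init_samples_and_graph(segments, box=(0, 0, 2, 4))
```

In this toy case superpixels 0 and 2 become the target components and the single edge (0, 2) connects them.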

S1, inputting the next video frame

The target search region of the current frame is determined from the target state B of the previous frame, that is, a region twice the target scale centered on the target center position is searched, and the target search region is then over-segmented into a series of superpixels.
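A minimal sketch of how such a search region might be computed is given below. The state layout `(cx, cy, w, h)`, the function name `search_region` and the clipping to the image bounds are assumptions of this sketch; the text states only that the region is twice the target scale and centered on the target.

```python
def search_region(state, image_size, factor=2.0):
    """Return (x0, y0, x1, y1) of a region `factor` times the target size,
    centered on the previous target center and clipped to the image."""
    cx, cy, w, h = state
    img_w, img_h = image_size
    half_w, half_h = factor * w / 2.0, factor * h / 2.0
    x0 = max(0.0, cx - half_w)
    y0 = max(0.0, cy - half_h)
    x1 = min(float(img_w), cx + half_w)
    y1 = min(float(img_h), cy + half_h)
    return x0, y0, x1, y1


# A target well inside the image, and one near the top-left corner.
region = search_region((100, 60, 40, 20), image_size=(320, 240))
clipped = search_region((10, 10, 40, 20), image_size=(320, 240))
```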

S2, solving the target component labels

For target component selection, the appearance model M is used to separate the target components from the background (denoted $f_0$).

To model the target structure information, a target structure graph model G is built to distinguish the individual target components. For convenience, a set of target component labels is used to express their states. On this basis, the correlation filter model F can provide a global constraint for the target state B.

Finally, the energy minimization model established by the present invention is shown as follows:

where $E^{PS}$, $E^{PM}$ and $E^{SE}$ denote the energy of target component selection, the energy of target component matching and the energy of target state estimation, respectively, and $\lambda^{SE}$ is the balance coefficient.

Each term is specifically defined as follows:

(1) The energy of target component selection, $E^{PS}$, is defined as:

where $E^{da}$ characterizes the probability that a superpixel component belongs to the foreground or the background, $E^{sm}$ enforces the continuity of the labels, and $\lambda^{b}$ is the balance coefficient.

(2) The energy of target component matching, $E^{PM}$, is defined as:

where $E^{ap}$ measures the similarity between superpixel features and the target features in the structure graph model G, $E^{ge}$ measures the similarity of the local structure of the target components, and $\lambda^{b}$ is the balance coefficient.

(3) The energy of target state estimation, $E^{SE}$, is defined as:

where $F(B, R)$ denotes the response score corresponding to any target state B in the search region R, and

and $P(l_p = f_0)$ respectively denote the proportions of foreground and background pixels within the rectangular box of the target state, and $\lambda^{1}$ is the coefficient balancing the two terms.
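As a schematic reading of the definitions above, and not the patent's exact expressions, the structure of the model can be written as follows. The sketch assumes that each energy is a weighted sum of the terms named in its definition, with the balance coefficient applied to the second term; summations over components and over adjacent component pairs are omitted, and $E^{SE}$ is left unexpanded because its exact form cannot be inferred from the surrounding text.

```latex
\begin{align}
E      &= E^{PS} + E^{PM} + \lambda^{SE} E^{SE}, \\
E^{PS} &= E^{da} + \lambda^{b} E^{sm}, \\
E^{PM} &= E^{ap} + \lambda^{b} E^{ge}.
\end{align}
```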

Given the target models M, G, F and the target state B, the energy minimization model of the present invention can be re-expressed as:

Because the energy minimization model contains multiple variables that are difficult to optimize simultaneously, the optimization is divided into three stages using alternating iterative optimization:

(1) fixing target models M, G and F and a target state B, and solving a target component label L;

(2) fixing target models M, G and F and a target component label L, and solving a target state B;

(3) fixing the target state B and the target component label L, and updating the target models M, G and F.

The three stages are carried out alternately until the total energy no longer decreases, at which point the iterative loop exits and the optimal target state $B^{*}$ of the current frame is obtained (as shown in FIG. 1).
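The alternating scheme can be sketched as a generic loop. In the sketch below, every callable (`solve_labels`, `solve_state`, `update_models`, `energy`) is a hypothetical placeholder supplied by the caller, and the toy example replaces the labels, the state and the models with three scalars so that each stage has a closed-form minimizer. It illustrates the control flow only, not the patent's actual sub-problem solvers.

```python
def alternating_minimization(L, B, models, solve_labels, solve_state,
                             update_models, energy, max_iters=50, tol=1e-9):
    """Cycle through the three stages until the total energy stops decreasing.

    All callables are hypothetical placeholders supplied by the caller.
    """
    best = energy(L, B, models)
    for _ in range(max_iters):
        new_L = solve_labels(B, models)           # stage 1: models and B fixed
        new_B = solve_state(new_L, models)        # stage 2: models and L fixed
        new_models = update_models(new_L, new_B)  # stage 3: L and B fixed
        new_energy = energy(new_L, new_B, new_models)
        if new_energy >= best - tol:
            break  # energy no longer decreases: keep the previous solution
        L, B, models, best = new_L, new_B, new_models, new_energy
    return L, B, models, best


# Toy coupled energy over three scalars standing in for L, B and the models.
def toy_energy(L, B, M):
    return (L - 1) ** 2 + (L - B) ** 2 + (B - M) ** 2 + (M - 4) ** 2


L_opt, B_opt, M_opt, E_opt = alternating_minimization(
    L=0.0, B=0.0, models=0.0,
    solve_labels=lambda B, M: (1 + B) / 2,   # argmin over L with B, M fixed
    solve_state=lambda L, M: (L + M) / 2,    # argmin over B with L, M fixed
    update_models=lambda L, B: (B + 4) / 2,  # argmin over M with L, B fixed
    energy=toy_energy,
)
```

Because each stage minimizes the energy exactly over its own variable, the toy energy never increases between iterations, which is what makes "the energy no longer decreases" a usable stopping test.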

In the step of solving the target component labels L, the first stage of the optimization process is performed: the target models M, G, F and the target state B are fixed, and the target component labels L are solved using a graph cut algorithm.
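For a binary foreground/background labeling, a graph cut reduces to an s-t minimum cut. The sketch below is a generic illustration under stated assumptions: it uses a simple energy made of per-component unary costs plus a constant (Potts) penalty on neighboring components with different labels, and it computes the cut with a plain Edmonds-Karp max-flow. How the patent's $E^{PS}$ and $E^{PM}$ terms map onto such costs is not specified here, and the function name `min_cut_labels` and the toy costs are hypothetical.

```python
from collections import defaultdict, deque


def min_cut_labels(unary, edges, smooth):
    """unary: {node: (cost_bg, cost_fg)}; edges: iterable of (p, q) pairs.

    Returns (labels, energy); labels[p] == 1 means foreground.
    """
    S, T = "_source", "_sink"
    cap = defaultdict(lambda: defaultdict(float))
    for p, (cost_bg, cost_fg) in unary.items():
        cap[S][p] += cost_bg  # cut when p ends on the sink side (background)
        cap[p][T] += cost_fg  # cut when p stays on the source side (foreground)
    for p, q in edges:
        cap[p][q] += smooth   # paid when p and q receive different labels
        cap[q][p] += smooth

    def bfs_parents():
        # Breadth-first search over arcs with remaining residual capacity.
        parent = {S: None}
        queue = deque([S])
        while queue:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0.0
    while True:
        parent = bfs_parents()
        if T not in parent:
            break  # no augmenting path left: the flow is maximal
        path, v = [], T
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        push = min(cap[u][w] for u, w in path)
        for u, w in path:
            cap[u][w] -= push
            cap[w][u] += push
        flow += push
    # Nodes still reachable from the source form the foreground side of the cut.
    labels = {p: 1 if p in parent else 0 for p in unary}
    return labels, flow


# Toy chain a-b-c-d-e: c weakly prefers background but has foreground neighbors.
unary = {"a": (5.0, 1.0), "b": (4.0, 1.0), "c": (1.0, 1.2),
         "d": (4.0, 0.5), "e": (0.5, 4.0)}
chain = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "e")]
labels, energy = min_cut_labels(unary, chain, smooth=1.0)
```

In the toy chain the smoothness penalty outweighs component c's slight preference for the background, so c is labeled foreground together with its neighbors, while e stays background.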

S3, solving the target state

In the step of solving the target state B, the second stage of the optimization process is executed, and after the target component label L is obtained, the target state B is solved in combination with the correlation filter model F. Specifically, the state with the minimum energy is selected from a series of candidate target states to satisfy the target formula

The method for solving the target state B specifically comprises the following steps:

(1) obtaining a series of candidate target states by using a sampling method;

S4, updating the target model

In the step of updating the target models M, G, and F, we perform the third stage of the optimization process, that is, in order to adapt to the appearance change of the target during the motion process, the target models M, G, and F are updated according to the target component label L and the target state B obtained by the solution, as shown in the following formula:

where $E^{da}$ denotes the probability that a superpixel component p belongs to the foreground or the background, $E^{PM}$ denotes the energy of target component matching, $F(B, R)$ denotes the response score corresponding to any target state B in the search region R, and $\lambda^{1}$ is the coefficient balancing the two terms.

Each item is independent, each model can be updated respectively, specifically, training samples are obtained through the target component label L and the target state B, and new target models M, G and F are learned.

S5, outputting the target state of the current frame

If the energy decreased in step S4, the model is updated and the method returns to step S2 to continue the iterative optimization; if the energy no longer decreases, the iterative loop exits, the optimal target state $B^{*}$ of the current frame is output, and the method proceeds to step S1.

In conclusion, by establishing a unified energy minimization model, the invention integrates all modules of the graph-model-based target tracking algorithm, taking into account not only the contribution of each module but also the mutual support among the modules, which further improves the tracking performance.

In addition, the invention solves for each variable of the energy minimization model by alternating iterative optimization; through this iterative process of gradually decreasing energy, a more reliable target structure graph model can be found under the constraint of the global target representation to express the appearance and structure of the target components.

It should be noted that the above-mentioned embodiments do not limit the present invention in any way, and all technical solutions obtained by using equivalent alternatives or equivalent variations fall within the protection scope of the present invention.

Claims

1. A target tracking method based on restricted structure graph search, characterized by comprising the following steps:

S0, initializing the target model

A target search region R twice the target size is obtained from the target state annotated in the first frame, and the target search region R is then over-segmented into a series of superpixels, each consisting of pixels of similar color;

Superpixels inside the target box are collected as positive samples and superpixels outside the target box as negative samples; from these, a linear support vector machine is learned as the appearance model M, and a correlation filter model F is learned to serve as the global constraint; in addition, a target structure graph model G = {V, C} is established, where V denotes the set of target components and C denotes the set of edges formed by the relationships between adjacent target components;

S1, inputting the next video frame

S2, solving the target component labels

wherein:

$E^{PS}$ denotes the energy of target component selection, which is defined as:

$E^{da}$ characterizes the probability that a superpixel component belongs to the foreground or the background, $E^{sm}$ enforces the continuity of the labels, $\lambda^{b}$ is the balance coefficient, and $l_p$ denotes the label state of target component p;

$E^{PM}$ denotes the energy of target component matching, which is defined as:

$E^{ap}$ measures the similarity between superpixel features and the target features in the structure graph model G, $E^{ge}$ measures the similarity of the local structure of the target components, $\lambda^{b}$ is the balance coefficient, and $l_p$ denotes the label state of target component p;

then, with the target models M, G, F and the target state B fixed, the target component labels L are solved using a graph cut algorithm;

S3, solving the target state

S4, updating the target model

S5, outputting the target state of the current frame

2. The target tracking method based on restricted structure graph search according to claim 1, wherein in step S0 the Simple Linear Iterative Clustering algorithm is used to over-segment the target search region R into a series of superpixels, each consisting of pixels of similar color.

3. The target tracking method based on restricted structure graph search according to claim 1, wherein in step S3 the target state B is solved as follows:

(1) obtaining a series of candidate target states by sampling;

(3) selecting the state with the minimum energy so as to satisfy the target formula

where $E^{SE}$ denotes the energy of target state estimation, defined as:

$F(B, R)$ denotes the response score corresponding to any target state B in the search region R.