CN103310466A - Single target tracking method and implementation device thereof - Google Patents

Single target tracking method and implementation device thereof

Info

Publication number
CN103310466A
CN103310466A CN2013102688346A CN201310268834A
Authority
CN
China
Prior art keywords
sample
target
span
frame
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2013102688346A
Other languages
Chinese (zh)
Other versions
CN103310466B (en)
Inventor
王军
陈先开
吴金勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI QINGTIAN ELECTRONIC TECHNOLOGY CO LTD
Original Assignee
China Security and Surveillance Technology PRC Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Security and Surveillance Technology PRC Inc filed Critical China Security and Surveillance Technology PRC Inc
Priority to CN201310268834.6A priority Critical patent/CN103310466B/en
Publication of CN103310466A publication Critical patent/CN103310466A/en
Application granted granted Critical
Publication of CN103310466B publication Critical patent/CN103310466B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical


Landscapes

  • Image Analysis (AREA)

Abstract

Provided are a single target tracking method and an implementation device thereof. In a video V = {F0, F1, ..., FN} formed by N frames of gray-level images, a target O0 is selected in frame F0, and the image is converted to grayscale and normalized in width and height to obtain the initialization parameters; classifier initialization and updating, comprising training-set construction, feature extraction and model updating, are then carried out; and target tracking is performed on frame Ft+1 with the model ft+1. The method represents the target appearance with a compressed-sensing dimensionality-reduction scheme based on binary features, so it can effectively express target deformation and improves resistance to occlusion and illumination changes, thereby achieving robust target tracking; at the same time it has the advantages of low memory consumption and a small amount of computation, reaching real-time tracking speed.

Description

A single target tracking method and an implementation device thereof
Technical field
The present invention relates to a method and device for tracking a target from a given initial position, and in particular to a single target tracking method and an implementation device thereof.
Background technology
Research on tracking techniques based on moving-target features has been a focus of computer vision in recent years. Although biometric features such as fingerprints, palm prints and veins have been studied extensively in the security field and have found preliminary applications, these features are contact-based, which greatly limits their range of application. By comparison, "contactless" recognition technologies such as gait recognition and face recognition, which ingeniously combine human motion with biometric characteristics, have become a key area of intelligent scene video surveillance. Gait recognition in particular collects the features of a moving human body while walking and performs identification, so the accuracy and real-time performance of the earlier human detection and tracking stage are prerequisites for the overall recognition performance. This poses a great challenge to video surveillance: because of the security requirements of the monitored scene, traditional manually operated video surveillance cannot meet the needs of real-world security monitoring. Realizing target tracking under realistic complex backgrounds is therefore not only the key to intelligent video surveillance systems, but also plays an important role in intelligent transportation and human-computer interaction, and target tracking algorithms have consequently developed very widely. However, the success of most pedestrian tracking algorithms depends on the complexity of the background and on the similarity between the pedestrian target and the background, and good results are obtained only when the color difference between target and background is large. To solve the problem of pedestrian tracking in complex scenes, increasingly robust algorithms must be designed, sufficient to cope with illumination changes, noise, obstacles, occlusion and other problems unavoidable in practical applications. Detecting and tracking moving targets accurately and quickly from a video sequence is extremely important and is one of the key technologies for identification and abnormal-behavior recognition. At present there are mainly two kinds of moving-target tracking methods: 1. statistical learning methods; 2. algorithms based on color features. The first kind has gradually become one of the mainstream techniques in pattern recognition, with successful applications to many classical problems, moving-target tracking being one example. The Adaboost algorithm proposed by Freund et al. is a cascade algorithm whose goal is to automatically pick several weak classifiers from the weak-classifier space and combine them into a strong classifier. The Haar-feature-based Adaboost algorithm proposed by Viola et al. is a successful application of Adaboost to face detection. Grabner et al. proposed an online Adaboost algorithm and applied Adaboost to target tracking with fairly good results. Unlike offline Adaboost, the training samples of online Adaboost are one or several data items obtained in real time. This algorithm adapts better to changes in the features of the moving target, but because online Adaboost relies on the classifier alone, it is prone to classification errors under complex backgrounds or large-area occlusion, causing the track to be lost.
The Camshift tracking algorithm proposed by Bradski, with its good real-time performance and robustness, has also received wide attention. Camshift takes the Meanshift algorithm as its core and overcomes Meanshift's inability to adapt the tracking window size, narrowing the target search range and improving accuracy and efficiency, so that fairly good tracking is obtained under simple backgrounds. However, Camshift is strongly affected by other moving targets around the tracked object and easily mistakes non-target points for target points; changes in target size cause the tracking to fail and the track is then lost. The traditional Camshift tracker uses color information as its feature, so tracking is also lost when the target color is close to that of the background or of non-targets; moreover it easily fails on fast-moving targets and cannot recover from failure. Since neither the online Adaboost algorithm nor the Camshift algorithm alone can obtain good tracking results, invention patent CN201210487250.3, entitled "A moving target tracking method", discloses a moving-target tracking method combining the online Adaboost algorithm with the Camshift algorithm: a confidence map is first obtained from the feature matrix and classifier of the online Adaboost tracker, the chosen features fusing local orientation-histogram features and color features; the Camshift algorithm is then applied to the confidence map, so that the features used by Camshift fuse texture and color information. The method comprises the following steps: first, a moving target is accurately detected with a fast moving-target detection method based on a codebook model; second, the online Adaboost weak-classifier group is initialized to obtain a strong classifier, the moving-target features fusing local orientation-histogram and color features; third, the feature matrix and weak classifiers of the online Adaboost tracker are used to compute a confidence map, the Camshift tracker is applied to the confidence map, the weak classifiers are updated according to the obtained target position, and finally the tracking result for the whole video sequence is obtained. This method solves the tracking problem with conventional means and has two problems: first, the extracted features are not robust enough, since local orientation histograms and color features are often sensitive to noise, so the appearance representation of the target lacks robustness; second, the Camshift method it adopts is prone to drift for targets undergoing illumination or color changes, which lowers tracking accuracy and cannot cope with the harsh environments found in real surveillance video. Designing robust features and a classifier with strong generalization ability are therefore the two key issues of target tracking.
The patent entitled "Single target tracking method based on weighted least squares" (publication number 103093482A, published 2013-05-08) discloses a single target tracking method based on weighted least squares, which tracks the target by means of reconstruction error. Its shortcoming is that solving the sparse reconstruction consumes a certain amount of time, so real-time tracking is difficult to achieve. Suppose a video V = {F_0, F_1, ..., F_N} formed by N frames of gray-level images is known, the width and height of each frame being w and h respectively. The problem the present invention addresses is: a target O_0 is selected in frame F_0, and a tracking method is proposed to track O_0 over the N consecutive frames.
For such tracking from a given initial position, the main technique currently adopted is to express the appearance of the target by extracting its color, contour and texture feature information, and then learn an appearance model of the target with a classification learning method; the position of the target is then detected in the next frame by the appearance model, or tracked with a simple tracking algorithm such as mean shift or optical flow, and the results of tracking and detection are integrated to obtain the most credible tracking position; finally the appearance model is adaptively updated according to some update strategy.
Summary of the invention
To address the above problems, the object of the present invention is to provide a compressed-sensing feature extraction method together with an online kernel-learning update method and device that overcome the deficiencies of the above tracking techniques: highly robust compressed-sensing features are extracted for the target to improve the expressive power of the target appearance, and the appearance model is then updated in an online learning manner.
To achieve the above object, the technical solution of the present invention is:
A single target tracking method, comprising:
In the first step, parameters are initialized; that is, in frame F_0 a rectangular box B_0 = [x_0, y_0, w_0, h_0] of the initial position of the target O_0 is obtained, whose components respectively represent the abscissa of the upper-left corner, the ordinate of the upper-left corner, the box width and the box height; on an image block I of fixed width and height, L random point pairs CP = {[p_1^l, p_2^l, p_3^l, p_4^l]}_{l=1}^{L} are generated, where p_1^l, p_2^l, p_3^l, p_4^l respectively represent the abscissa of the first point, the ordinate of the first point, the abscissa of the second point and the ordinate of the second point of the l-th pair, and the generation of random pairs is limited to the two orientations horizontal or vertical; a sparse random matrix A is generated and used for feature dimensionality reduction.
In the second step, classifier initialization and updating are performed: for t = 0, 1, ..., N−1 the model is iteratively updated, processing the t-th frame image F_t; this comprises the three processing procedures of training-set construction, feature extraction and model updating.
In the third step, target tracking is performed: the model f_{t+1} is used to track the target in frame F_{t+1}. The tracking step comprises: prediction-sample construction, feature extraction, sample classification, selection of the samples with the highest confidence, generation of a final (and also best) target bounding box, and output of the tracking box; then t = t + 1, and if t > N the tracking ends, otherwise the method returns to the second step. A high-level sketch of this loop is given below.
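By way of illustration only, the following minimal Python sketch shows the structure of the three-step loop just described. The function and parameter names (track_single_target, update_classifier, track_next) are illustrative placeholders and not part of the claimed method; the concrete contents of the second and third steps are supplied by the caller and are detailed in the embodiment section.

```python
def track_single_target(frames, initial_box, update_classifier, track_next):
    """frames: iterable of N+1 grayscale images F_0..F_N; initial_box: [x0, y0, w0, h0].

    update_classifier and track_next stand in for the second and third steps
    (training-set construction / feature extraction / model update, and
    candidate scoring in the next frame); they are supplied by the caller.
    """
    box, model = initial_box, None
    for t in range(len(frames) - 1):
        model = update_classifier(frames[t], box, model)   # second step on F_t
        box = track_next(frames[t + 1], box, model)        # third step on F_{t+1}
        yield box                                          # output tracking box B_{t+1}
```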
In the first step, the sparse random matrix A = [a_ij] has H rows, where H takes a value of 50-300 (preferably 100), and L columns. An equiprobable function rand is given which generates, with equal probability, one element of {1, 2, 3, ..., 2024}: if rand ∈ {1, 2, 3, ..., 16}, then a_ij takes a fixed positive value; if rand ∈ {17, 18, ..., 32}, then a_ij takes the corresponding negative value; otherwise a_ij = 0.
In the second step, the training-set construction comprises the following steps:
a) Positive sample set D_t^pos: from the neighborhood Q_t^pos = {(x, y) | x_t − 10 < x < x_t + 10, y_t − 10 < y < y_t + 10} of the target bounding box B_t = [x_t, y_t, w_t, h_t], 50-500 (preferably 100) positive sample pictures are randomly extracted. The acquisition method is translation and scaling, with the following steps:
i. The generation formula of a positive sample bounding box is
[x′, y′, w′, h′] = scale·[x, y, w_t, h_t] + shift (1)
where scale represents the scaling rate, with value range [0.8, 1.2], shift represents a positive integer offset, with value range [0, 20], and x, y take values in Q_t^pos.
ii. The following operation is carried out 80-150 times (preferably 100 times): random values of x, y, scale and shift are drawn from their value ranges; the sample bounding box [x′, y′, w′, h′] is computed from formula (1); the subimage I of F_t is cut out according to the bounding box [x′, y′, w′, h′]; and I is normalized to an image of fixed width and height (for example 32×32). After 80-150 such repetitions, 80-150 positive-class sample pictures have been generated, denoted D_t^pos = {{I_i, 1}}_i.
b) Negative sample set D_t^neg: from the peripheral region outside Q_t^pos, namely Q_t^neg = {(x, y) | 0 ≤ x, x ≥ x_t − 10, 0 ≤ y, y ≥ y_t + 10}, 50-1000 (preferably 200) negative-class samples are obtained at random. The acquisition method is translation or scaling, with the following steps:
i. The generation formula of a negative sample bounding box is
[x′, y′, w′, h′] = scale·[x, y, w_t, h_t] + shift (2)
where scale represents the scaling rate, with value range [0.8, 1.2], shift represents a positive integer offset, with value range [0, 20], and x, y take values in Q_t^neg.
ii. The following operation is carried out 150-500 times (preferably 200 times): random values of x, y, scale and shift are drawn from their value ranges; the sample bounding box [x′, y′, w′, h′] is computed from formula (2); the subimage I of F_t is cut out according to the box [x′, y′, w′, h′]; and I is normalized to an image of fixed width and height (for example 32×32). After 150-500 such repetitions, 150-500 negative-class sample pictures have been generated, denoted D_t^neg = {{I_i, −1}}_i.
c) The positive and negative class samples are merged to form the training sample set D_t = D_t^pos ∪ D_t^neg = {{I_i, y_i}}_i, where y_i ∈ {−1, 1} represents the class label of a sample, −1 denoting the negative class and 1 the positive class.
In the second step, the feature extraction extracts the features of all sample images in D_t. The steps for extracting the feature of a sample {I_i, y_i} are as follows:
a) The feature of the sample {I_i, y_i} is initialized as z̄ ∈ R^L; the feature length is the number of elements of CP, i.e. L.
b) The j-th component z̄_j, j ∈ {0, 1, ..., L}, of z̄ is computed from the gray-scale pixel values I_i(p, q) at the two points of the j-th point pair of CP, where I_i(p, q) denotes the gray-scale pixel value of point (p, q) in image I_i.
c) The sparse random matrix A is used to reduce the dimensionality of z̄; the reduced dimension is 50-300 (preferably 100), giving the new feature z, computed as
z = A z̄.
The training sample thus constructed is denoted {z_i, y_i}.
In the second step, the model updating uses the training set Z_t = {{z_i, y_i}}_i to update the classifier model f_t(ẑ_i) = w_t ẑ_i, where ẑ_i = [z_i; 1] ∈ R^{101×1}, i.e. the model parameter w_t ∈ R^{1×101} is updated, R denoting the real numbers. The steps are as follows:
a) If t = 0, initialize w_t ∈ R^{1×101} and set λ = 0.0001; otherwise execute step b).
b) Carry out the following iterative steps for t = 1, ..., T:
i. Select k samples at random from Z_t to form the subset A_t.
ii. From A_t, find the subset of samples satisfying the margin condition, A_t^+ = {(z, y) ∈ A_t | y⟨w_t, z⟩ < 1}.
iii. Compute the parameter value η_t = 1/(λt).
iv. First parameter update:
w_{t+1/2} = (1 − η_t λ) w_t + (η_t / k) Σ_{(z,y) ∈ A_t^+} y z
v. Second parameter update:
w_{t+1} = min{1, (1/√λ) / ||w_{t+1/2}||} · w_{t+1/2}
where min denotes the minimum of the two elements and ||·|| denotes the 2-norm.
c) Output w_{t+1}.
In the third step, the steps of using the model f_{t+1} to track the target in frame F_{t+1} are as follows:
1) Sample set D_{t+1}^u extraction: from the neighborhood Q_{t+1}^u = {(x, y) | x_t − 30 < x < x_t + 30, y_t − 30 < y < y_t + 30} of the target bounding box B_t = [x_t, y_t, w_t, h_t], 150-300 (preferably 200) sample pictures are randomly extracted. The acquisition method is translation or scaling, with the following steps:
i. The generation formula of a sample bounding box is
[x′, y′, w′, h′] = scale·[x, y, w_t, h_t] + shift (3)
where scale represents the scaling rate, with value range [0.8, 1.2], shift represents a positive integer offset, with value range [0, 20], and x, y take values in Q_{t+1}^u.
ii. The following operation is carried out 150-500 times (preferably 200 times): random values of x, y, scale and shift are drawn from their value ranges; the sample bounding box [x′, y′, w′, h′] is computed from formula (3); the subimage I of F_{t+1} is cut out according to the box [x′, y′, w′, h′]; and I is normalized to an image of fixed width and height (for example 32×32).
iii. According to step ii, 150-500 (preferably 200) sample pictures are generated, denoted D_{t+1}^U = {{I_i, 1}}_{i=1}^{200}; the set of sample boxes of the sample set in the image is denoted B_{t+1}^U = {[x_i, y_i, w_i, h_i]}_{i=1}^{200}.
2) The feature of every picture in D_{t+1}^U is computed. The method of computing the feature of a sample {I_i, 1} is as follows:
a) The feature of the sample is initialized as z̄ ∈ R^L; the feature length is the number of elements of CP, i.e. L.
b) The j-th component z̄_j, j ∈ {0, 1, ..., L}, of z̄ is computed from the gray-scale pixel values I_i(p, q) at the two points of the j-th point pair of CP, where I_i(p, q) denotes the gray-scale pixel value of point (p, q) in image I_i.
c) The sparse random matrix A is used to reduce the dimensionality of z̄; the reduced dimension is 50-300 (preferably 100), giving the new feature z, computed as
z = A z̄.
The sample thus constructed is denoted {z_i, 1}. In this way the sample set to be classified, U_{t+1}, is constituted.
3) The model f_{t+1} is used to classify all samples in U_{t+1}; each sample z_i ∈ U_{t+1} produces a corresponding confidence
Conf_i = f_{t+1}(ẑ_i) = w_{t+1} ẑ_i,
where ẑ_i = [z_i; 1]. The confidences are denoted Conf_{t+1} = {conf_1, conf_2, ..., conf_200}.
4) According to Conf_{t+1}, the bounding boxes with the highest confidence are selected from B_{t+1}^U; this number is preferably 1/20 of the number of samples. A final target bounding box B_{t+1} = [x_{t+1}, y_{t+1}, w_{t+1}, h_{t+1}] is generated from them.
5) t = t + 1; if t > N the tracking ends, otherwise the method returns to the second step.
To realize the above method, the present invention also provides an implementation device of the single target tracking method, comprising:
an image acquisition device, used to obtain a frame image from the video and to perform grayscale conversion and width-height normalization on the image;
a classifier initialization and updating device, used to initialize the model and update the model online; and
a target tracking device, used to search for the target in a new image so that the search result is as consistent with the target as possible.
The image acquisition device comprises a random point-pair generation unit and a sparse random matrix generation unit.
The classifier initialization and updating device comprises a sample-picture-set construction unit, a feature extraction unit and a model update unit, wherein the sample-picture-set construction unit is used to construct the positive-class and negative-class subimage sample sets from the sample pictures; the feature extraction unit is used to perform compressed-sensing feature extraction on the positive and negative sample images; and the model update unit uses the feature sample set obtained by the feature extraction unit to update the classifier model.
The target tracking device comprises a sample-picture-set construction unit, a feature extraction unit and a target tracking unit, wherein the sample-picture-set construction unit is used to construct the positive-class and negative-class subimage sample sets from the sample pictures; the feature extraction unit is used to perform compressed-sensing feature extraction on the positive and negative sample images; and the target tracking unit is used to classify all samples, select the bounding boxes with the highest confidence, and then cluster them to generate a best target bounding box.
In summary, the single target tracking method provided by the present invention creatively proposes a feature extraction method and a feature dimensionality-reduction method: highly robust compressed-sensing features are extracted for the target, improving the expressive power of the target appearance, and the appearance model of the target is then learned and updated online through the classifier initialization and updating steps, so that the precision and speed of target tracking are greatly improved.
Moreover, the implementation device provided by the present invention expresses the appearance of the target with a compressed-sensing dimensionality-reduction method based on binary features, which can effectively express target deformation and improves resistance to occlusion and illumination, so the target can be tracked robustly; at the same time it has the advantages of low memory consumption and a small amount of computation, reaching real-time tracking speed. It therefore has good practical value in real applications.
Description of drawings
Fig. 1 is a schematic diagram of randomly generated point pairs of the present invention;
Fig. 2 is an example diagram of the image-block lattice grid of the present invention;
Fig. 3 is a schematic diagram of the feature extraction method of the present invention;
Fig. 4 is an example diagram of feature dimensionality reduction of the present invention;
Fig. 5 is a flowchart of the single target tracking method of the present invention;
Fig. 6 is a structural diagram of the implementation device of the single target tracking method of the present invention;
Fig. 7 is a flowchart of the sample construction unit of the present invention;
Fig. 8 is a flowchart of the model update unit of the present invention;
Fig. 9 is a flowchart of the target tracking unit of the present invention;
Fig. 10 and Fig. 11 are tracking-effect figures of the present invention.
Embodiment
To make the object, technical solution and advantages of the present invention clearer, the present invention is further elaborated below in conjunction with the drawings and embodiments. It should be understood that the specific embodiments described here are only intended to explain the present invention and are not used to limit it.
The invention provides a single target tracking method. As shown in Fig. 5, suppose a video V = {F_0, F_1, ..., F_N} formed by N frames of grayscale pedestrian images is known, the width and height of each frame being w and h. The method comprises the following steps:
First step: parameter initialization
As shown at step 1 in Fig. 5, the concrete initialization steps are as follows:
1) The rectangular box B_0 = [x_0, y_0, w_0, h_0] of the initial position of the target O_0 is obtained manually in frame F_0, where x_0, y_0, w_0, h_0 respectively represent the abscissa of the upper-left corner, the ordinate of the upper-left corner, the box width and the box height.
2) On an image block I of fixed width and height (for example 32×32), L random point pairs CP = {[p_1^l, p_2^l, p_3^l, p_4^l]}_{l=1}^{L} are generated, where p_1^l, p_2^l, p_3^l, p_4^l respectively represent the abscissa of the first point, the ordinate of the first point, the abscissa of the second point and the ordinate of the second point of the l-th pair. The generation of random pairs is limited to the two orientations horizontal or vertical; Fig. 1 shows two schematic examples of random point pairs. The concrete steps are as follows:
a) A grid point set S = {[p_1^l, p_2^l]}_{l=1}^{1024}, p_1^l, p_2^l ∈ {0, 1, 2, ..., 31}, is generated in the image I, as shown in Fig. 2.
b) From S a pair set can be obtained: for each point pair, a random number is added to the two ordinates, giving a lower ordinate p_3^l = max(0, p_2^l(1 − rand)) and an upper ordinate p_4^l = min(h, p_2^l(1 + rand)), where rand represents a random number in the interval [0, 1]. The point-pair set for the vertical direction is thus generated: CP_v = {[p_1^l, p_3^l, p_1^l, p_4^l]}_{l=1}^{1024}.
c) From S a pair set can likewise be obtained: for each point pair, a random number is added to the two abscissas, giving a first abscissa p_1^l = max(0, p_1^l(1 − rand)) and a third abscissa p_3^l = min(w, p_1^l(1 + rand)), where rand represents a random number in the interval [0, 1]. The point-pair set for the horizontal direction is thus generated: CP_h = {[p_1^l, p_2^l, p_3^l, p_2^l]}_{l=1}^{1024}.
d) The point-pair sets for the vertical and horizontal directions are merged into CP = CP_v ∪ CP_h; note that repeated pairs are removed, and the number of elements of the set CP is L. A sketch of this construction is given below.
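As an illustration only, a minimal Python sketch of the point-pair construction is given below. The exact layout of the four coordinates in each pair is reproduced only partially in the published text, so the sketch follows the reconstruction above (same abscissa for vertical pairs, same ordinate for horizontal pairs); the function name and the use of numpy's random generator are assumptions of this sketch, not part of the patent.

```python
import numpy as np

def generate_point_pairs(w=32, h=32, grid=32, seed=0):
    """Generate the merged point-pair set CP on a w x h image block.

    A grid x grid point set S is built; each grid point spawns one vertical
    and one horizontal pair whose second coordinate is perturbed by a random
    factor in [0, 1], as sketched in steps a)-d) above.
    """
    rng = np.random.default_rng(seed)
    xs, ys = np.meshgrid(np.arange(grid), np.arange(grid))
    p1, p2 = xs.ravel(), ys.ravel()                # grid point set S

    pairs = set()
    for x, y in zip(p1, p2):
        r = rng.random()
        y_lo = int(max(0, y * (1 - r)))            # perturbed lower ordinate
        y_hi = int(min(h - 1, y * (1 + r)))        # perturbed upper ordinate
        pairs.add((x, y_lo, x, y_hi))              # vertical pair: same abscissa

        r = rng.random()
        x_lo = int(max(0, x * (1 - r)))
        x_hi = int(min(w - 1, x * (1 + r)))
        pairs.add((x_lo, y, x_hi, y))              # horizontal pair: same ordinate

    return np.array(sorted(pairs))                 # duplicates removed; length = L
```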
3) A sparse random matrix A = [a_ij] with H rows and L columns is generated, where the number of rows H takes a value of 50-300, the optimum being 100. An equiprobable function rand is given which generates, with equal probability, one element of {1, 2, 3, ..., 2024}: if rand ∈ {1, 2, 3, ..., 16}, then a_ij takes a fixed positive value; if rand ∈ {17, 18, ..., 32}, then a_ij takes the corresponding negative value; otherwise a_ij = 0. The sparse random matrix A is used for feature dimensionality reduction, reducing the amount of computation and improving noise resistance. A sketch of this construction is given below.
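For illustration, a minimal Python sketch of the sparse random matrix generation follows. The nonzero magnitudes of a_ij appear only as images in the published text, so the common compressed-sensing choice of +1 and −1 is assumed here; any fixed symmetric pair of values would follow the same pattern.

```python
import numpy as np

def generate_sparse_matrix(H, L, seed=0):
    """Sparse random projection matrix A (H x L) as described above.

    rand draws uniformly from {1, ..., 2024}; an entry is nonzero only when
    rand falls in {1, ..., 16} (positive) or {17, ..., 32} (negative).
    The actual nonzero magnitudes are not reproduced in the text; +1/-1 are
    assumed in this sketch.
    """
    rng = np.random.default_rng(seed)
    r = rng.integers(1, 2025, size=(H, L))        # uniform on {1, ..., 2024}
    A = np.zeros((H, L))
    A[r <= 16] = 1.0                              # rand in {1, ..., 16}
    A[(r >= 17) & (r <= 32)] = -1.0               # rand in {17, ..., 32}
    return A
```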
Second step: classifier initialization and updating
As shown at steps 3, 4 and 5 in Fig. 5, suppose t = 0, 1, ..., N−1 iterative updates are carried out, processing the t-th frame image F_t. The iterative process is as follows:
1) Training-set construction
a) Positive sample set D_t^pos: from the neighborhood Q_t^pos = {(x, y) | x_t − 10 < x < x_t + 10, y_t − 10 < y < y_t + 10} of the target bounding box B_t = [x_t, y_t, w_t, h_t], 50-500 (best 100) positive sample pictures are randomly extracted. The acquisition method is translation and scaling. As shown in the flowchart of Fig. 7, the cycle count T is set to 100; a sketch of this sampling procedure is given after step c) below. The steps are as follows:
i. The generation formula of a positive sample bounding box is
[x′, y′, w′, h′] = scale·[x, y, w_t, h_t] + shift (1)
where scale represents the scaling rate, with value range [0.8, 1.2], shift represents a positive integer offset, with value range [0, 20], and x, y take values in Q_t^pos.
ii. In the present embodiment the following operation is carried out 100 times as an example: random values of x, y, scale and shift are drawn from their value ranges; the bounding box [x′, y′, w′, h′] of the sample is computed from formula (1); the subimage I_i of F_t is cut out according to the box [x′, y′, w′, h′]; and I_i is normalized to a 32×32 image. After 100 such repetitions, 100 positive-class sample pictures have been generated, denoted D_t^pos = {{I_i, 1}}_{i=1}^{100}.
b) Negative sample set D_t^neg: from the peripheral region outside Q_t^pos, namely Q_t^neg = {(x, y) | 0 ≤ x, x ≥ x_t − 10, 0 ≤ y, y ≥ y_t + 10}, 50-1000 (best 200) negative-class samples are obtained at random. The acquisition method is translation and scaling. As shown in the flowchart of Fig. 7, the cycle count T is set to 150-500 (best 200). The steps are as follows:
i. The generation formula of a negative sample bounding box is
[x′, y′, w′, h′] = scale·[x, y, w_t, h_t] + shift (2)
where scale represents the scaling rate, with value range [0.8, 1.2], shift represents a positive integer offset, with value range [0, 20], and x, y take values in Q_t^neg.
ii. In the present embodiment the following operation is carried out 200 times as an example: random values of x, y, scale and shift are drawn from their value ranges; the bounding box [x′, y′, w′, h′] of the sample is computed from formula (2); the subimage I_i of F_t is cut out according to the box [x′, y′, w′, h′]; and I_i is normalized to an image of fixed width and height (for example 32×32). After 200 such repetitions, 200 negative-class sample pictures have been generated, denoted D_t^neg = {{I_i, −1}}_{i=1}^{200}.
c) The positive and negative class samples are merged to form the training sample set D_t = D_t^pos ∪ D_t^neg = {{I_i, y_i}}_i, where y_i ∈ {−1, 1} represents the class label of a sample, −1 denoting the negative class and 1 the positive class.
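The following Python sketch illustrates the positive sampling of step a) via formula (1); negative sampling follows the same pattern with the region Q_t^neg and label −1. The function name is illustrative, numpy's random generator is assumed, and cv2.resize is used only for the 32×32 normalization (any resampling routine would serve).

```python
import numpy as np
import cv2  # used only for the 32x32 resize

def sample_positive_boxes(frame, box, n=100, seed=0):
    """Draw n positive sample patches around box = [x_t, y_t, w_t, h_t] via formula (1).

    x and y are drawn from the neighbourhood Q_t^pos of the box corner,
    scale from [0.8, 1.2] and shift from the integers [0, 20].
    """
    x_t, y_t, w_t, h_t = box
    rng = np.random.default_rng(seed)
    samples = []
    for _ in range(n):
        x = rng.integers(x_t - 10, x_t + 10)
        y = rng.integers(y_t - 10, y_t + 10)
        scale = rng.uniform(0.8, 1.2)
        shift = rng.integers(0, 21)
        bx, by = int(scale * x + shift), int(scale * y + shift)      # formula (1)
        bw, bh = int(scale * w_t + shift), int(scale * h_t + shift)
        patch = frame[max(by, 0):by + bh, max(bx, 0):bx + bw]        # crop subimage of F_t
        if patch.size == 0:
            continue
        samples.append(cv2.resize(patch, (32, 32)))                  # normalize width/height
    return samples  # positive class, label +1
```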
2) Feature extraction, as shown in Fig. 3. The features of all sample images in D_t are extracted; the steps for extracting the feature of a sample {I_i, y_i} are as follows:
a) The feature of the sample {I_i, y_i} is initialized as z̄ ∈ R^L; the feature length is the number of elements of CP, i.e. L.
b) The j-th component z̄_j, j ∈ {0, 1, ..., L}, of z̄ is computed from the gray-scale pixel values I_i(p, q) at the two points of the j-th point pair of CP, where I_i(p, q) denotes the gray-scale pixel value of point (p, q) in image I_i.
c) As shown in Fig. 4, the sparse random matrix A is used to reduce the dimensionality of z̄; the reduced dimension is 50-300 (best 100), giving the new feature z, computed as
z = A z̄.
The training sample thus constructed is denoted {z_i, y_i}. A sketch of this feature computation is given below.
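A minimal Python sketch of the feature computation follows. The per-pair component formula appears only as an image in the published text; consistent with the abstract's "binary feature" wording, each component is assumed here to be a binary comparison of the gray values at the two points of the pair, an assumption of this sketch rather than the patent's exact formula. The sparse matrix A then performs the dimensionality reduction z = A z̄.

```python
import numpy as np

def extract_feature(patch, point_pairs, A):
    """Compressed-sensing feature of a 32x32 grayscale patch.

    point_pairs: (L, 4) array of [x1, y1, x2, y2]; A: (H, L) sparse matrix.
    Each component of z_bar is assumed to be a binary gray-value comparison
    between the two points of a pair; z = A @ z_bar reduces it to H dimensions.
    """
    x1, y1, x2, y2 = point_pairs.T
    z_bar = (patch[y1, x1] > patch[y2, x2]).astype(np.float64)  # binary components
    return A @ z_bar                                            # dimensionality reduction
```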
3) Classifier initialization or update, as shown in Fig. 8. The training set Z_t is used to update the classifier model f_t(ẑ_i) = w_t ẑ_i, where ẑ_i = [z_i; 1] ∈ R^{101×1}, i.e. the model parameter w_t ∈ R^{1×101} is updated, R denoting the real numbers. The steps are as follows:
a) If t = 0, initialize w_t ∈ R^{1×101} and set λ = 0.0001; otherwise execute step b).
b) Carry out the following iterative steps for t = 1, ..., T:
i. Select k samples at random from Z_t to form the subset A_t.
ii. From A_t, find the subset of samples satisfying the margin condition, A_t^+ = {(z, y) ∈ A_t | y⟨w_t, z⟩ < 1}.
iii. Compute the parameter value η_t = 1/(λt).
iv. First parameter update:
w_{t+1/2} = (1 − η_t λ) w_t + (η_t / k) Σ_{(z,y) ∈ A_t^+} y z
v. Second parameter update:
w_{t+1} = min{1, (1/√λ) / ||w_{t+1/2}||} · w_{t+1/2}
where min denotes the minimum of the two elements and ||·|| denotes the 2-norm.
c) Output w_{t+1}, i.e. the model f_{t+1}. A sketch of this update is given below.
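The following Python sketch illustrates the online update above, which has the form of a Pegasos-style stochastic subgradient step with a projection (the same equations are written out in claim 5). The values of T and k are not fixed in the text and are illustrative here; the features are augmented with a constant 1 so that w has H + 1 components (101 when H = 100).

```python
import numpy as np

def update_classifier(Z, Y, w=None, T=50, k=10, lam=1e-4, seed=0):
    """Online update of the linear classifier f(z_hat) = w @ z_hat.

    Z: (n, H) reduced features; Y: (n,) labels in {-1, +1}.  Each feature is
    augmented with a constant 1, so w has H + 1 components.  T and k are
    illustrative; lam corresponds to the lambda = 0.0001 of the text.
    """
    rng = np.random.default_rng(seed)
    Zh = np.hstack([Z, np.ones((len(Z), 1))])          # z_hat = [z; 1]
    if w is None:
        w = np.zeros(Zh.shape[1])                      # t = 0 initialization (assumed zero)
    for t in range(1, T + 1):
        idx = rng.choice(len(Zh), size=min(k, len(Zh)), replace=False)
        A_t, y_t = Zh[idx], Y[idx]                     # random subset A_t
        viol = y_t * (A_t @ w) < 1                     # A_t^+ : margin violators
        eta = 1.0 / (lam * t)                          # eta_t = 1 / (lambda * t)
        w = (1 - eta * lam) * w                        # first update, shrinkage part
        if viol.any():
            w = w + (eta / k) * (y_t[viol, None] * A_t[viol]).sum(axis=0)
        norm = np.linalg.norm(w)                       # second update: projection step
        if norm > 0:
            w = min(1.0, (1.0 / np.sqrt(lam)) / norm) * w
    return w
```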
Third step: target tracking
As shown at steps 6-9 in Fig. 5 and in particular in the flowchart of Fig. 9, the model f_{t+1} is used to track the target in frame F_{t+1}. The tracking steps are as follows:
1) Sample set D_{t+1}^u extraction: from the neighborhood Q_{t+1}^u = {(x, y) | x_t − 30 < x < x_t + 30, y_t − 30 < y < y_t + 30} of the target bounding box B_t = [x_t, y_t, w_t, h_t], 50-500 (best 200) sample pictures are randomly extracted. The acquisition method is translation and scaling. The steps are as follows:
i. The generation formula of a sample bounding box is
[x′, y′, w′, h′] = scale·[x, y, w_t, h_t] + shift (3)
where scale represents the scaling rate, with value range [0.8, 1.2], shift represents a positive integer offset, with value range [0, 20], and x, y take values in Q_{t+1}^u.
ii. The following operation is carried out 150-500 times, 200 times being taken as an example in the present embodiment: random values of x, y, scale and shift are drawn from their value ranges; the bounding box [x′, y′, w′, h′] of the sample is computed from formula (3); the subimage I of F_{t+1} is cut out according to the box [x′, y′, w′, h′]; and I is normalized to an image of fixed width and height (for example 32×32).
iii. According to step ii, 200 sample pictures are generated, denoted D_{t+1}^U = {{I_i, 1}}_{i=1}^{200}; the set of sample boxes of the sample set in the image is denoted B_{t+1}^U = {[x_i, y_i, w_i, h_i]}_{i=1}^{200}.
2) The feature of every picture in D_{t+1}^U is computed. The feature extraction method is the same as that shown in Fig. 3, and the sample set U_{t+1} to be classified is constituted.
3) The model f_{t+1} is used to classify all samples in U_{t+1}; each sample z_i ∈ U_{t+1} produces a corresponding confidence
Conf_i = f_{t+1}(ẑ_i) = w_{t+1} ẑ_i,
where ẑ_i = [z_i; 1]. The confidences are denoted Conf_{t+1} = {conf_1, conf_2, ..., conf_200}.
4) According to Conf_{t+1}, the 10 bounding boxes with the highest confidence are selected from B_{t+1}^U, and a weighted Meanshift clustering method (Dalal N. Finding people in images and videos [D]. Institut National Polytechnique de Grenoble - INPG, 2006.) is used to generate a final target bounding box B_{t+1} = [x_{t+1}, y_{t+1}, w_{t+1}, h_{t+1}].
5) t = t + 1; if t > N the tracking ends, otherwise the method returns to the classifier initialization and updating of the second step. A sketch of this tracking step is given below.
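The following Python sketch illustrates the tracking step: candidate boxes are drawn around the previous box via formula (3), scored by the classifier, and the top boxes are fused. The weighted Meanshift clustering of the original is replaced here by a simple confidence-weighted average purely for brevity; extract_feature refers to the feature-extraction sketch given earlier, and cv2.resize is again used only for the 32×32 normalization.

```python
import numpy as np
import cv2

def track_next_frame(frame, box, point_pairs, A, w, n=200, top=10, seed=0):
    """Score n candidate boxes in frame F_{t+1} and fuse the top ones.

    Each candidate is scored by conf_i = w @ [z_i; 1]; the 10 best boxes are
    combined with a confidence-weighted average (a stand-in for the weighted
    Meanshift clustering of the original).  extract_feature is the function
    from the feature-extraction sketch above.
    """
    x_t, y_t, w_t, h_t = box
    rng = np.random.default_rng(seed)
    boxes, confs = [], []
    for _ in range(n):
        x = rng.integers(x_t - 30, x_t + 30)           # neighbourhood Q_{t+1}^u
        y = rng.integers(y_t - 30, y_t + 30)
        scale = rng.uniform(0.8, 1.2)
        shift = rng.integers(0, 21)
        b = np.array([scale * x + shift, scale * y + shift,
                      scale * w_t + shift, scale * h_t + shift]).astype(int)
        patch = frame[max(b[1], 0):b[1] + b[3], max(b[0], 0):b[0] + b[2]]
        if patch.size == 0:
            continue
        z = extract_feature(cv2.resize(patch, (32, 32)), point_pairs, A)
        boxes.append(b)
        confs.append(w @ np.append(z, 1.0))            # conf_i = w . [z; 1]
    boxes, confs = np.array(boxes), np.array(confs)
    best = np.argsort(confs)[-top:]                    # highest-confidence boxes
    weights = confs[best] - confs[best].min() + 1e-6
    return (boxes[best] * weights[:, None]).sum(axis=0) / weights.sum()
```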
Fig. 10 shows the tracking effect on a pedestrian in different video frames; the pedestrian in the picture changes clothes, turns around and deforms, and the method proposed by this patent effectively handles these problems. Fig. 11 shows the tracking effect on a pedestrian in different video frames in which the pedestrian is blurred, the illumination varies, the target is small and the background is cluttered, so that the track tends to drift; the method proposed by this patent effectively overcomes these problems. This patent has extremely strong tracking capability, with resistance to illumination change, target deformation and appearance change, and low susceptibility to background influence.
In summary, the single target tracking method provided by the present invention creatively proposes a feature extraction method and a feature dimensionality-reduction method: highly robust compressed-sensing features are extracted for the target, improving the expressive power of the target appearance, and the appearance model of the target is then learned and updated online through the classifier initialization and updating steps, so that the precision and speed of target tracking are greatly improved.
For the target tracking method set forth above, the present invention also provides an implementation device of the method, as shown in Fig. 6. The image acquisition device is used to obtain a frame image from the video and to perform grayscale conversion and width-height normalization on the image;
the classifier initialization and updating device is used to initialize the model and update the model online;
the target tracking device is used to search for the target in a new image so that the search result is as consistent with the target as possible. The image acquisition device comprises a random point-pair generation unit and a sparse random matrix generation unit.
The classifier initialization and updating device comprises a sample-picture-set construction unit, a feature extraction unit and a model update unit, wherein the sample-picture-set construction unit is used to construct the positive-class and negative-class subimage sample sets from the sample pictures; the feature extraction unit is used to perform compressed-sensing feature extraction on the positive and negative sample images; and the model update unit uses the feature sample set obtained by the feature extraction unit to update the classifier model.
The target tracking device comprises a sample-picture-set construction unit, a feature extraction unit and a target tracking unit, wherein the sample-picture-set construction unit is used to construct the positive-class and negative-class subimage sample sets from the sample pictures; the feature extraction unit is used to perform compressed-sensing feature extraction on the positive and negative sample images; and the target tracking unit is used to classify all samples, select the bounding boxes with the highest confidence, and then cluster them to generate a best target bounding box. A skeletal arrangement of these devices is sketched below.
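Purely as an organizational illustration, the three devices can be arranged as in the Python skeleton below; the class and method names are illustrative, and the actual computations are those of the earlier sketches and of the workflow described next.

```python
class ImageAcquisitionDevice:
    """Reads a frame, converts it to grayscale and normalizes width/height;
    owns the random point-pair generation and sparse-matrix generation units."""
    def __init__(self, point_pairs, sparse_matrix):
        self.point_pairs = point_pairs
        self.sparse_matrix = sparse_matrix


class ClassifierUpdateDevice:
    """Sample-set construction + compressed-sensing feature extraction +
    model update (the second step of the method)."""
    def update(self, frame, box, model):
        raise NotImplementedError  # see the classifier-update sketch above


class TargetTrackingDevice:
    """Sample-set construction + feature extraction + a target tracking unit
    that classifies candidates and fuses the highest-confidence boxes."""
    def track(self, frame, box, model):
        raise NotImplementedError  # see the tracking-step sketch above
```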
The workflow of each unit of the implementation device of the single target tracking method of the present invention is further described below.
As shown in Fig. 6, the image acquisition device first selects the i-th frame image and obtains the rectangular box B_0 = [x_0, y_0, w_0, h_0] of the initial position of the target O_0, whose components respectively represent the abscissa of the upper-left corner, the ordinate of the upper-left corner, the box width and the box height.
On an image block I of width and height 32×32, L random point pairs CP = {[p_1^l, p_2^l, p_3^l, p_4^l]}_{l=1}^{L} are generated, where p_1^l, p_2^l, p_3^l, p_4^l respectively represent the abscissa of the first point, the ordinate of the first point, the abscissa of the second point and the ordinate of the second point of the l-th pair, and the generation of random pairs is limited to the two orientations horizontal or vertical, as follows:
a grid point set S = {[p_1^l, p_2^l]}_{l=1}^{1024}, p_1^l, p_2^l ∈ {0, 1, 2, ..., 31}, is generated in the image I;
for the vertical point pairs generated from S, a random number is added to the two ordinates of each pair, giving a lower ordinate p_3^l = max(0, p_2^l(1 − rand)) and an upper ordinate p_4^l = min(h, p_2^l(1 + rand)), where rand represents a random number in the interval [0, 1]; the point-pair set CP_v for the vertical direction is thus generated;
for the horizontal point pairs generated from S, a random number is added to the two abscissas of each pair, giving a first abscissa p_1^l = max(0, p_1^l(1 − rand)) and a third abscissa p_3^l = min(w, p_1^l(1 + rand)), where rand represents a random number in the interval [0, 1]; the point-pair set CP_h for the horizontal direction is thus generated;
the point-pair sets for the vertical and horizontal directions are merged into CP = CP_v ∪ CP_h, repeated pairs are removed, and the number of elements of the set CP is L.
A sparse random matrix A = [a_ij]_{100×L} is generated, with 100 rows and L columns. An equiprobable function rand is given which generates, with equal probability, one element of {1, 2, 3, ..., 2024}: if rand ∈ {1, 2, 3, ..., 16}, then a_ij takes a fixed positive value; if rand ∈ {17, 18, ..., 32}, then a_ij takes the corresponding negative value; otherwise a_ij = 0.
The classifier initialization and updating device, shown at steps 2-4 in Fig. 6, comprises a sample-picture-set construction unit, a feature extraction unit and a model update unit. The positive/negative sample-picture-set construction unit is mainly used to construct the positive-class and negative-class subimage sample sets from the sample pictures; the feature extraction unit is used to perform compressed-sensing feature extraction on the positive and negative sample images; and the model update unit is used to update the classifier model with the feature sample set obtained by the feature extraction unit.
Suppose t = 0, 1, ..., N−1 iterative updates are carried out, processing the t-th frame image F_t. The iterative process is as follows:
1) Training-set construction
a) Positive sample set D_t^pos: from the neighborhood Q_t^pos = {(x, y) | x_t − 10 < x < x_t + 10, y_t − 10 < y < y_t + 10} of the target bounding box B_t = [x_t, y_t, w_t, h_t], 100 positive sample pictures are randomly extracted. The acquisition method is translation and scaling. As shown in the flowchart of Fig. 7, the cycle count T is set to 100.
i. The generation formula of a positive sample bounding box is
[x′, y′, w′, h′] = scale·[x, y, w_t, h_t] + shift (1)
where scale represents the scaling rate, with value range [0.8, 1.2], shift represents a positive integer offset, with value range [0, 20], and x, y take values in Q_t^pos.
ii. The following operation is carried out 80-150 times, 100 times being taken as an example in the present embodiment: random values of x, y, scale and shift are drawn from their value ranges; the bounding box [x′, y′, w′, h′] of the sample is computed from formula (1); the subimage I_i of F_t is cut out according to the box [x′, y′, w′, h′]; and I_i is normalized to a 32×32 image. After 100 such repetitions, 100 positive-class sample pictures have been generated, denoted D_t^pos = {{I_i, 1}}_{i=1}^{100}.
b) Negative sample set D_t^neg: from the peripheral region outside Q_t^pos, namely Q_t^neg = {(x, y) | 0 ≤ x, x ≥ x_t − 10, 0 ≤ y, y ≥ y_t + 10}, 200 negative-class samples are obtained at random. The acquisition method is translation and scaling. As shown in the flowchart of Fig. 7, the cycle count T is set to 200. The steps are as follows:
i. The generation formula of a negative sample bounding box is
[x′, y′, w′, h′] = scale·[x, y, w_t, h_t] + shift (2)
where scale represents the scaling rate, with value range [0.8, 1.2], shift represents a positive integer offset, with value range [0, 20], and x, y take values in Q_t^neg.
ii. The following operation is carried out 150-500 times, 200 times in the present embodiment: random values of x, y, scale and shift are drawn from their value ranges; the bounding box [x′, y′, w′, h′] of the sample is computed from formula (2); the subimage I_i of F_t is cut out according to the box [x′, y′, w′, h′]; and I_i is normalized to a 32×32 image. After 200 such repetitions, 200 negative-class sample pictures have been generated, denoted D_t^neg = {{I_i, −1}}_{i=1}^{200}.
c) The positive and negative class samples are merged to form the training sample set D_t = D_t^pos ∪ D_t^neg = {{I_i, y_i}}_i, where y_i ∈ {−1, 1} represents the class label of a sample, −1 denoting the negative class and 1 the positive class.
2) Feature extraction: the features of all sample images in D_t are extracted; the steps for extracting the feature of a sample {I_i, y_i} are as follows:
a) The feature of the sample {I_i, y_i} is initialized as z̄ ∈ R^L; the feature length is the number of elements of CP, i.e. L.
b) The j-th component z̄_j, j ∈ {0, 1, ..., L}, of z̄ is computed from the gray-scale pixel values I_i(p, q) at the two points of the j-th point pair of CP, where I_i(p, q) denotes the gray-scale pixel value of point (p, q) in image I_i.
c) As shown in Fig. 4, the sparse random matrix A is used to reduce the dimensionality of z̄; the reduced dimension is 50-300, 100 being preferred in the present embodiment, giving the new feature z, computed as
z = A z̄.
The training sample thus constructed is denoted {z_i, y_i}.
3) The model update unit uses the training set Z_t to update the classifier model f_t(ẑ_i) = w_t ẑ_i, where ẑ_i = [z_i; 1] ∈ R^{101×1}, i.e. the model parameter w_t ∈ R^{1×101} is updated, R denoting the real numbers. The steps are as follows:
If t = 0, initialize w_t ∈ R^{1×101} and set λ = 0.0001; otherwise carry out the following iterative steps for t = 1, ..., T:
select k samples at random from Z_t to form the subset A_t;
from A_t, find the subset of samples satisfying the margin condition, A_t^+ = {(z, y) ∈ A_t | y⟨w_t, z⟩ < 1};
compute the parameter value η_t = 1/(λt);
first parameter update: w_{t+1/2} = (1 − η_t λ) w_t + (η_t / k) Σ_{(z,y) ∈ A_t^+} y z;
second parameter update: w_{t+1} = min{1, (1/√λ) / ||w_{t+1/2}||} · w_{t+1/2}, where min denotes the minimum of the two elements and ||·|| denotes the 2-norm.
Output w_{t+1}, i.e. the model f_{t+1}.
This classifier initialization and updating device mainly provides the online update function of the model: during target tracking, as the shape, illumination and size of the target change, the target appearance is continually learned so as to improve the robustness of the tracking.
The target tracking device is used to search for the target in a new image so that the search result is as consistent with the target as possible; the model classifier is used to classify the feature sample set obtained by the positive/negative sample image-feature-set construction module, giving the best target box position. The target tracking device comprises a sample-picture-set construction unit, a feature extraction unit and a target tracking unit, shown at steps 5-7 in Fig. 6; the sample-picture-set construction unit and feature extraction unit are identical in method and flow to those in the classifier initialization and updating device. Its workflow is as follows:
1) Sample set D_{t+1}^u extraction: from the neighborhood Q_{t+1}^u = {(x, y) | x_t − 30 < x < x_t + 30, y_t − 30 < y < y_t + 30} of the target bounding box B_t = [x_t, y_t, w_t, h_t], 200 sample pictures are randomly extracted. The acquisition method is translation and scaling. The steps are as follows:
i. The generation formula of a sample bounding box is
[x′, y′, w′, h′] = scale·[x, y, w_t, h_t] + shift (3)
where scale represents the scaling rate, with value range [0.8, 1.2], shift represents a positive integer offset, with value range [0, 20], and x, y take values in Q_{t+1}^u.
ii. The following operation is carried out 200 times as an example: random values of x, y, scale and shift are drawn from their value ranges; the bounding box [x′, y′, w′, h′] of the sample is computed from formula (3); the subimage I of F_{t+1} is cut out according to the box [x′, y′, w′, h′]; and I is normalized to a 32×32 image.
iii. According to step ii, 200 sample pictures are generated, denoted D_{t+1}^U = {{I_i, 1}}_{i=1}^{200}; the set of sample boxes of the sample set in the image is denoted B_{t+1}^U = {[x_i, y_i, w_i, h_i]}_{i=1}^{200}.
2) The feature of every picture in D_{t+1}^U is computed. The feature extraction method is the same as that shown in Fig. 3, and the sample set U_{t+1} to be classified is constituted.
3) The model f_{t+1} is used to classify all samples in U_{t+1}; each sample z_i ∈ U_{t+1} produces a corresponding confidence
Conf_i = f_{t+1}(ẑ_i) = w_{t+1} ẑ_i,
where ẑ_i = [z_i; 1]. The confidences are denoted Conf_{t+1} = {conf_1, conf_2, ..., conf_200}.
4) According to Conf_{t+1}, the 10 bounding boxes with the highest confidence are selected from B_{t+1}^U, and the weighted Meanshift method is used to generate a final target bounding box B_{t+1} = [x_{t+1}, y_{t+1}, w_{t+1}, h_{t+1}].
5) t = t + 1; if t > N the tracking ends, otherwise the flow returns to the model update unit of the classifier initialization and updating device.
To sum up, the implementation device provided by the present invention expresses the appearance of the target with a compressed-sensing dimensionality-reduction method based on binary features, which can effectively express target deformation and improves resistance to occlusion and illumination, so the target can be tracked robustly; at the same time it has the advantages of low memory consumption and a small amount of computation, reaching real-time tracking speed. It therefore has good practical value in real applications.
The above are only preferred embodiments of the present invention and are not intended to limit the present invention; any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims (10)

1. A single target tracking method, comprising:
a first step of parameter initialization, that is, obtaining in frame F_0 a rectangular box B_0 = [x_0, y_0, w_0, h_0] of the initial position of the target O_0, whose components respectively represent the abscissa of the upper-left corner, the ordinate of the upper-left corner, the box width and the box height; generating, on an image block I of fixed width and height, L random point pairs CP = {[p_1^l, p_2^l, p_3^l, p_4^l]}_{l=1}^{L}, where p_1^l, p_2^l, p_3^l, p_4^l respectively represent the abscissa of the first point, the ordinate of the first point, the abscissa of the second point and the ordinate of the second point of the l-th pair, the generation of random pairs being limited to the two orientations horizontal or vertical; and generating a sparse random matrix A used for feature dimensionality reduction;
a second step of classifier initialization and updating, in which t = 0, 1, ..., N−1 iterative updates are carried out, processing the t-th frame image F_t and comprising the three processing procedures of training-set construction, feature extraction and model updating; and
a third step of target tracking, in which the model f_{t+1} is used to track the target in frame F_{t+1}, the tracking step comprising: prediction-sample construction, feature extraction, sample classification, selection of the samples with the highest confidence, generation of a final target bounding box, and output of the tracking box; then t = t + 1, and if t > N the tracking ends, otherwise the method returns to the second step.
2. The single target tracking method as claimed in claim 1, characterized in that the generated sparse random matrix A = [a_ij] has H rows, where H takes a value of 50-300, and L columns; an equiprobable function rand is given which generates, with equal probability, one element of {1, 2, 3, ..., 2024}: if rand ∈ {1, 2, 3, ..., 16}, then a_ij takes a fixed positive value; if rand ∈ {17, 18, ..., 32}, then a_ij takes the corresponding negative value; otherwise a_ij = 0.
3. The single target tracking method as claimed in claim 1 or 2, characterized in that the training-set construction comprises the steps of:
a) a positive sample set D_t^pos: from the neighborhood Q_t^pos = {(x, y) | x_t − 10 < x < x_t + 10, y_t − 10 < y < y_t + 10} of the target bounding box B_t = [x_t, y_t, w_t, h_t], randomly extracting 50-500 positive sample pictures, as follows:
i. the generation formula of a positive sample bounding box is
[x′, y′, w′, h′] = scale·[x, y, w_t, h_t] + shift (1)
where scale represents the scaling rate, with value range [0.8, 1.2], shift represents a positive integer offset, with value range [0, 20], and x, y take values in Q_t^pos;
ii. carrying out the following operation 80-150 times: drawing random values of x, y, scale and shift from their value ranges; computing the bounding box [x′, y′, w′, h′] of the sample from formula (1); cutting out the subimage I of F_t according to the box [x′, y′, w′, h′]; and normalizing I to an image of fixed width and height; after 80-150 such repetitions, 80-150 positive-class sample pictures have been generated;
b) a negative sample set D_t^neg: from the peripheral region outside Q_t^pos, namely Q_t^neg = {(x, y) | 0 ≤ x, x ≥ x_t − 10, 0 ≤ y, y ≥ y_t + 10}, obtaining 50-1000 negative-class samples at random, the acquisition method being translation or scaling, as follows:
i. the generation formula of a negative sample bounding box is
[x′, y′, w′, h′] = scale·[x, y, w_t, h_t] + shift (2)
where scale represents the scaling rate, with value range [0.8, 1.2], shift represents a positive integer offset, with value range [0, 20], and x, y take values in Q_t^neg;
ii. carrying out the following operation 150-500 times: drawing random values of x, y, scale and shift from their value ranges; computing the bounding box [x′, y′, w′, h′] of the sample from formula (2); cutting out the subimage I of F_t according to the box [x′, y′, w′, h′]; and normalizing I to an image of fixed width and height; after 150-500 such repetitions, 150-500 negative-class sample pictures have been generated;
c) merging the positive and negative class samples to form the training sample set D_t.
4. The single target tracking method as claimed in claim 1 or 2, characterized in that the feature extraction extracts the features of all sample images in D_t, the steps of extracting the feature of a sample {I_i, y_i} being as follows:
a) initializing the feature of the sample {I_i, y_i} as z̄ ∈ R^L, the feature length being the number of elements of CP, i.e. L;
b) computing the j-th component z̄_j, j ∈ {0, 1, ..., L}, of z̄ from the gray-scale pixel values I_i(p, q) at the two points of the j-th point pair of CP, where I_i(p, q) denotes the gray-scale pixel value of point (p, q) in image I_i;
c) using the sparse random matrix A to reduce the dimensionality of z̄ to 50-300, giving the new feature z, computed as
z = A z̄;
the training sample thus constructed is denoted {z_i, y_i}.
5. monotrack method as claimed in claim 1 or 2 is characterized in that, described model modification is for utilizing training set z t = { { z i , y i } i } i = 1 L Upgrade sorter model f t ( z ^ i ) = w t z ^ i , Wherein z ^ i = [ z i ; 1 ] &Element; R 101 &times; 1 Namely upgrade model parameter w t∈ R 1 * 101, R represents real number, step is as follows:
a) If t = 0, initialize w_t ∈ R^{1×101} and set λ = 0.0001; otherwise, go to step b);
b) For t = 1, ..., T, perform the following iterative steps:
i. Randomly select k samples from Z_t to form a subset A_t;
ii. From A_t, find the subset of samples that satisfy the margin condition: A_t^+ = {(z, y) ∈ A_t | y⟨w_t, z⟩ < 1};
iii. Compute the parameter value η_t = 1/(λt);
iv. First parameter update:
w_{t+1/2} = (1 − η_t·λ)·w_t + (η_t/k)·Σ_{(z,y)∈A_t^+} y·z
v. Second parameter update:
w_{t+1} = min{1, (1/√λ)/‖w_{t+1/2}‖}·w_{t+1/2}
where min denotes the minimum of the two values and ‖·‖ denotes the 2-norm;
c) Output w_{T+1}.
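Steps a)-c) above amount to a Pegasos-style stochastic sub-gradient update with a norm projection. A minimal sketch, assuming NumPy, that features are already augmented with the bias term (ẑ = [z; 1]) and that labels are ±1; the mini-batch size k and iteration count T are illustrative choices.

```python
import numpy as np

def update_classifier(Z, y, w, lam=1e-4, k=32, T=100, rng=None):
    """Pegasos-style update of the linear classifier parameter w (claim 5)."""
    rng = rng or np.random.default_rng(0)
    for t in range(1, T + 1):
        idx = rng.choice(len(Z), size=min(k, len(Z)), replace=False)   # step i: subset A_t
        Zb, yb = Z[idx], y[idx]
        viol = yb * (Zb @ w) < 1.0                                     # step ii: A_t^+ (margin violators)
        eta = 1.0 / (lam * t)                                          # step iii
        grad_sum = (yb[viol, None] * Zb[viol]).sum(axis=0)
        w_half = (1.0 - eta * lam) * w + (eta / k) * grad_sum          # step iv
        norm = max(np.linalg.norm(w_half), 1e-12)
        w = min(1.0, (1.0 / np.sqrt(lam)) / norm) * w_half             # step v: projection
    return w

# Example: w0 = np.zeros(101); w1 = update_classifier(Z_t, labels, w0)
```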
6. The single-target tracking method as claimed in claim 1 or 2, characterized in that the steps of performing target tracking on the frame image F_{t+1} using the model f_{t+1} are as follows:
1) Sample set extraction: from the neighborhood Q_{t+1}^u = {(x, y) | x_t − 30 < x < x_t + 30, y_t − 30 < y < y_t + 30} of the target bounding box B_t = [x_t, y_t, w_t, h_t], randomly extract a set of 150-300 sample images; the acquisition method is translation or scaling, with the following steps:
i. The formula for generating a sample bounding box is:
[x', y', w', h'] = scale·[x, y, w_t, h_t] + shift    (3)
where scale denotes the scaling factor, with value range [0.8, 1.2], shift denotes a positive-integer offset, with value range [0, 20], and the value range of (x, y) is Q_{t+1}^u;
ii. Repeat the following operation 150-500 times: draw random values for x, y, scale and shift within their value ranges; substitute them into formula (3) to compute the sample bounding box [x', y', w', h']; crop the subimage I from frame F_{t+1} according to [x', y', w', h']; normalize I to a uniform width and height (for example 32×32);
iii. Following step ii, a set of 150-500 sample images is generated;
2) Compute the feature of every image; the method for computing the feature of a sample {I_i, 1} is as follows:
A) Initialize the feature of sample {I_i, 1}; the feature length is the number of elements in CP, i.e. L;
B) For each component index l ∈ {0, 1, ..., L}, compute the l-th component value of the feature from the gray-scale pixel values of the image, where I_i(p, q) denotes the gray-scale pixel value of image I_i at the point (p, q);
C) Use the sparse random matrix A to reduce the dimensionality of the raw feature z̄; the reduced dimension equals the number of rows of matrix A, giving the new feature z, computed as follows:
z = A·z̄
Proceeding as above for all images constitutes the sample set to be classified, U_{t+1};
3) Use the model f_{t+1} to classify all samples in U_{t+1}; each sample z_i ∈ U_{t+1} produces a corresponding confidence:
conf_i = f_{t+1}(ẑ_i) = w_{t+1}·ẑ_i
where ẑ_i = [z_i; 1]. The confidences are denoted Conf_{t+1} = {conf_1, conf_2, ..., conf_200};
4) According to Conf_{t+1}, select the bounding boxes with the highest confidences from the candidate set; this number is preferably 1/20 of the number of samples; from them generate the final target bounding box B_{t+1} = [x_{t+1}, y_{t+1}, w_{t+1}, h_{t+1}];
5) Set t = t + 1; if t > N, tracking ends; otherwise, return to the second step.
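A minimal sketch of steps 3)-5), assuming NumPy, that candidate boxes and their compressed features have already been produced by the sampling and feature-extraction steps above, and that the final box is obtained by averaging the top-confidence candidates (a simple stand-in for the clustering mentioned in claim 10).

```python
import numpy as np

def track_frame(candidate_boxes, candidate_feats, w, keep_ratio=1.0 / 20):
    """Score candidates with f_{t+1} and fuse the highest-confidence boxes into B_{t+1}."""
    feats = np.hstack([candidate_feats, np.ones((len(candidate_feats), 1))])  # z_hat = [z; 1]
    conf = feats @ w                                      # conf_i = w_{t+1} * z_hat_i
    n_keep = max(1, int(len(conf) * keep_ratio))          # preferably 1/20 of the samples
    top = np.argsort(conf)[-n_keep:]                      # highest-confidence candidates
    return np.asarray(candidate_boxes)[top].mean(axis=0)  # fused [x, y, w, h]

# Example: B_next = track_frame(boxes, feats, w_t1)
```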
7. An implementation device for the single-target tracking method, comprising:
an image acquisition device for obtaining a frame image from the video and performing gray-scale conversion and width-height normalization on the image;
a classifier initialization and update device for initializing the model and updating the model online;
a target tracking device for searching for the target in a new image, so that the search result is as consistent with the target as possible.
8. The implementation device of the single-target tracking method as claimed in claim 7, characterized in that the image acquisition device comprises a random point-pair generation unit and a sparse random matrix generation unit.
9. The implementation device of the single-target tracking method as claimed in claim 7 or 8, characterized in that the classifier initialization and update device comprises a sample image set construction unit, a feature extraction unit and a model update unit, wherein the sample image set construction unit constructs a positive-class sample subimage set and a negative-class sample subimage set from the sample images; the feature extraction unit performs compressed-sensing feature extraction on the positive and negative sample images; and the model update unit updates the classifier model using the feature sample set obtained by the feature extraction unit.
10. The implementation device of the single-target tracking method as claimed in claim 7 or 8, characterized in that the target tracking device comprises a sample image set construction unit, a feature extraction unit and a target tracking unit, wherein the sample image set construction unit constructs a positive-class sample subimage set and a negative-class sample subimage set from the sample images; the feature extraction unit performs compressed-sensing feature extraction on the positive and negative sample images; and the target tracking unit classifies all samples, selects the bounding boxes with the highest confidences, and then clusters them to generate an optimal target bounding box.
CN201310268834.6A 2013-06-28 2013-06-28 A kind of monotrack method and implement device thereof Expired - Fee Related CN103310466B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310268834.6A CN103310466B (en) 2013-06-28 2013-06-28 A kind of monotrack method and implement device thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310268834.6A CN103310466B (en) 2013-06-28 2013-06-28 A kind of monotrack method and implement device thereof

Publications (2)

Publication Number Publication Date
CN103310466A true CN103310466A (en) 2013-09-18
CN103310466B CN103310466B (en) 2016-02-17

Family

ID=49135643

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310268834.6A Expired - Fee Related CN103310466B (en) 2013-06-28 2013-06-28 A kind of monotrack method and implement device thereof

Country Status (1)

Country Link
CN (1) CN103310466B (en)

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103632143A (en) * 2013-12-05 2014-03-12 冠捷显示科技(厦门)有限公司 Cloud computing combined object identifying system on basis of images
CN103632382A (en) * 2013-12-19 2014-03-12 中国矿业大学(北京) Compressive sensing-based real-time multi-scale target tracking method
CN103870839A (en) * 2014-03-06 2014-06-18 江南大学 Online video target multi-feature tracking method
CN104008397A (en) * 2014-06-09 2014-08-27 华侨大学 Target tracking algorithm based on image set
CN104463192A (en) * 2014-11-04 2015-03-25 中国矿业大学(北京) Dark environment video target real-time tracking method based on textural features
CN104599289A (en) * 2014-12-31 2015-05-06 安科智慧城市技术(中国)有限公司 Target tracking method and device
CN104820998A (en) * 2015-05-27 2015-08-05 成都通甲优博科技有限责任公司 Human body detection and tracking method and device based on unmanned aerial vehicle mobile platform
CN105354252A (en) * 2015-10-19 2016-02-24 联想(北京)有限公司 Information processing method and apparatus
CN105631462A (en) * 2014-10-28 2016-06-01 北京交通大学 Behavior identification method through combination of confidence and contribution degree on the basis of space-time context
CN106570490A (en) * 2016-11-15 2017-04-19 华南理工大学 Pedestrian real-time tracking method based on fast clustering
CN108492314A (en) * 2018-01-24 2018-09-04 浙江科技学院 Wireless vehicle tracking based on color characteristics and structure feature
CN108537825A (en) * 2018-03-26 2018-09-14 西南交通大学 A kind of method for tracking target based on transfer learning Recurrent networks
CN108830204A (en) * 2018-06-01 2018-11-16 中国科学技术大学 The method for detecting abnormality in the monitor video of target
CN109472812A (en) * 2018-09-29 2019-03-15 深圳市锦润防务科技有限公司 A kind of method, system and the storage medium of target following template renewal
CN109521419A (en) * 2017-09-20 2019-03-26 比亚迪股份有限公司 Method for tracking target and device based on Radar for vehicle
CN109902623A (en) * 2019-02-27 2019-06-18 浙江大学 A kind of gait recognition method based on perception compression
CN111479062A (en) * 2020-04-15 2020-07-31 上海摩象网络科技有限公司 Target object tracking frame display method and device and handheld camera
CN112269401A (en) * 2020-09-04 2021-01-26 河南大学 Self-adaptive active sensor tracking method based on tracking precision and risk control
CN113591607A (en) * 2021-07-12 2021-11-02 辽宁科技大学 Station intelligent epidemic prevention and control system and method

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1794264A (en) * 2005-12-31 2006-06-28 北京中星微电子有限公司 Method and system of real time detecting and continuous tracing human face in video frequency sequence
CN101221620A (en) * 2007-12-20 2008-07-16 北京中星微电子有限公司 Human face tracing method
US7840061B2 (en) * 2007-02-28 2010-11-23 Mitsubishi Electric Research Laboratories, Inc. Method for adaptively boosting classifiers for object tracking
KR20120129301A (en) * 2011-05-19 2012-11-28 수원대학교산학협력단 Method and apparatus for extracting and tracking moving objects

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1794264A (en) * 2005-12-31 2006-06-28 北京中星微电子有限公司 Method and system of real time detecting and continuous tracing human face in video frequency sequence
US7840061B2 (en) * 2007-02-28 2010-11-23 Mitsubishi Electric Research Laboratories, Inc. Method for adaptively boosting classifiers for object tracking
CN101221620A (en) * 2007-12-20 2008-07-16 北京中星微电子有限公司 Human face tracing method
KR20120129301A (en) * 2011-05-19 2012-11-28 수원대학교산학협력단 Method and apparatus for extracting and tracking moving objects

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Kaihua Zhang et al., "Real-Time Compressive Tracking", Computer Vision - ECCV 2012, 13 October 2012 (2012-10-13), pages 864-877, XP047019010 *

Cited By (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103632143A (en) * 2013-12-05 2014-03-12 冠捷显示科技(厦门)有限公司 Cloud computing combined object identifying system on basis of images
CN103632143B (en) * 2013-12-05 2017-02-08 冠捷显示科技(厦门)有限公司 Cloud computing combined object identifying system on basis of images
CN103632382B (en) * 2013-12-19 2016-06-22 中国矿业大学(北京) A kind of real-time multiscale target tracking based on compressed sensing
CN103632382A (en) * 2013-12-19 2014-03-12 中国矿业大学(北京) Compressive sensing-based real-time multi-scale target tracking method
CN103870839A (en) * 2014-03-06 2014-06-18 江南大学 Online video target multi-feature tracking method
CN104008397A (en) * 2014-06-09 2014-08-27 华侨大学 Target tracking algorithm based on image set
CN104008397B (en) * 2014-06-09 2017-05-03 华侨大学 Target tracking algorithm based on image set
CN105631462A (en) * 2014-10-28 2016-06-01 北京交通大学 Behavior identification method through combination of confidence and contribution degree on the basis of space-time context
CN104463192A (en) * 2014-11-04 2015-03-25 中国矿业大学(北京) Dark environment video target real-time tracking method based on textural features
CN104463192B (en) * 2014-11-04 2018-01-05 中国矿业大学(北京) Dark situation video object method for real time tracking based on textural characteristics
CN104599289A (en) * 2014-12-31 2015-05-06 安科智慧城市技术(中国)有限公司 Target tracking method and device
CN104599289B (en) * 2014-12-31 2018-12-07 南京七宝机器人技术有限公司 Method for tracking target and device
CN104820998A (en) * 2015-05-27 2015-08-05 成都通甲优博科技有限责任公司 Human body detection and tracking method and device based on unmanned aerial vehicle mobile platform
CN104820998B (en) * 2015-05-27 2019-11-26 成都通甲优博科技有限责任公司 A kind of human testing based on unmanned motor platform and tracking and device
CN105354252A (en) * 2015-10-19 2016-02-24 联想(北京)有限公司 Information processing method and apparatus
CN106570490B (en) * 2016-11-15 2019-07-16 华南理工大学 A kind of pedestrian's method for real time tracking based on quick clustering
CN106570490A (en) * 2016-11-15 2017-04-19 华南理工大学 Pedestrian real-time tracking method based on fast clustering
CN109521419A (en) * 2017-09-20 2019-03-26 比亚迪股份有限公司 Method for tracking target and device based on Radar for vehicle
CN108492314A (en) * 2018-01-24 2018-09-04 浙江科技学院 Wireless vehicle tracking based on color characteristics and structure feature
CN108492314B (en) * 2018-01-24 2020-05-19 浙江科技学院 Vehicle tracking method based on color characteristics and structural features
CN108537825A (en) * 2018-03-26 2018-09-14 西南交通大学 A kind of method for tracking target based on transfer learning Recurrent networks
CN108830204A (en) * 2018-06-01 2018-11-16 中国科学技术大学 The method for detecting abnormality in the monitor video of target
CN108830204B (en) * 2018-06-01 2021-10-19 中国科学技术大学 Method for detecting abnormality in target-oriented surveillance video
CN109472812A (en) * 2018-09-29 2019-03-15 深圳市锦润防务科技有限公司 A kind of method, system and the storage medium of target following template renewal
CN109472812B (en) * 2018-09-29 2021-11-02 深圳市锦润防务科技有限公司 Method, system and storage medium for updating target tracking template
CN109902623A (en) * 2019-02-27 2019-06-18 浙江大学 A kind of gait recognition method based on perception compression
CN111479062A (en) * 2020-04-15 2020-07-31 上海摩象网络科技有限公司 Target object tracking frame display method and device and handheld camera
CN112269401A (en) * 2020-09-04 2021-01-26 河南大学 Self-adaptive active sensor tracking method based on tracking precision and risk control
CN113591607A (en) * 2021-07-12 2021-11-02 辽宁科技大学 Station intelligent epidemic prevention and control system and method
CN113591607B (en) * 2021-07-12 2023-07-04 辽宁科技大学 Station intelligent epidemic situation prevention and control system and method

Also Published As

Publication number Publication date
CN103310466B (en) 2016-02-17

Similar Documents

Publication Publication Date Title
CN103310466B (en) A kind of monotrack method and implement device thereof
Wang et al. Automatic building extraction from high-resolution aerial imagery via fully convolutional encoder-decoder network with non-local block
Enzweiler et al. Monocular pedestrian detection: Survey and experiments
Zhan et al. Face detection using representation learning
Timofte et al. Multi-view traffic sign detection, recognition, and 3D localisation
Jun et al. Robust face detection using local gradient patterns and evidence accumulation
Vishwakarma et al. Hybrid classifier based human activity recognition using the silhouette and cells
Cheng et al. Gait analysis for human identification through manifold learning and HMM
CN102609686B (en) Pedestrian detection method
CN106529499A (en) Fourier descriptor and gait energy image fusion feature-based gait identification method
CN105528794A (en) Moving object detection method based on Gaussian mixture model and superpixel segmentation
CN103020986A (en) Method for tracking moving object
CN109033954A (en) A kind of aerial hand-written discrimination system and method based on machine vision
CN102496001A (en) Method of video monitor object automatic detection and system thereof
CN103295016A (en) Behavior recognition method based on depth and RGB information and multi-scale and multidirectional rank and level characteristics
Armanfard et al. TED: A texture-edge descriptor for pedestrian detection in video sequences
Gao et al. Extended compressed tracking via random projection based on MSERs and online LS-SVM learning
Wei et al. Pedestrian detection in underground mines via parallel feature transfer network
Gao et al. Robust visual tracking using exemplar-based detectors
CN104050460B (en) The pedestrian detection method of multiple features fusion
Yılmaz et al. Recurrent binary patterns and cnns for offline signature verification
Hou et al. Human detection and tracking over camera networks: A review
Li et al. Foldover features for dynamic object behaviour description in microscopic videos
Haselhoff et al. An evolutionary optimized vehicle tracker in collaboration with a detection system
CN114612506B (en) Simple, efficient and anti-interference high-altitude parabolic track identification and positioning method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right
TR01 Transfer of patent right

Effective date of registration: 20171106

Address after: 200070 room 912, Gonghe Road, 504, Shanghai, Jingan District

Patentee after: SHANGHAI QINGTIAN ELECTRONIC TECHNOLOGY Co.,Ltd.

Address before: 518034 Guangdong province Shenzhen city Futian District District Shennan Road Press Plaza room 1306

Patentee before: ANKE SMART CITY TECHNOLOGY (PRC) Co.,Ltd.

CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160217