CN105261040A - Multi-target tracking method and apparatus - Google Patents

Multi-target tracking method and apparatus

Info

Publication number: CN105261040A (granted as CN105261040B)
Application number: CN201510676986.9A
Authority: CN (China)
Original language: Chinese (zh)
Legal status: Granted; Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Inventors: 傅慧源, 马华东, 张诚, 方瑞
Applicant and current assignee: Beijing University of Posts and Telecommunications (the listed assignees may be inaccurate; Google has not performed a legal analysis)
Priority: CN201510676986.9A (the priority date is an assumption and is not a legal conclusion)


Classifications

    • G06F18/2411 — Pattern recognition; classification techniques relating to the classification model, based on the proximity to a decision surface, e.g. support vector machines
    • G06F18/24147 — Pattern recognition; distances to closest patterns, e.g. nearest neighbour classification
    • G06T2207/10016 — Image analysis indexing scheme; image acquisition modality: video; image sequence
    • G06T2207/30232 — Image analysis indexing scheme; subject of image: surveillance


Abstract

An embodiment of the invention provides a multi-target tracking method and apparatus. The method comprises: determining, for each target object in a reference image frame, a first-type prediction region in the current target image frame, and determining the gradient-magnitude map corresponding to the target image frame; sliding a predetermined window pixel by pixel across the gradient-magnitude map to obtain a plurality of sub-image regions generated as the window slides; screening out, from these sub-image regions, the first-type sub-image regions that contain an object; and matching each first-type sub-image region in the target image frame against the first-type prediction region, in the current target image frame, of each target object in the reference image frame, thereby determining the region of each target object in the target image frame by matching rather than by predicting directly from the object's region information in the previous image frame alone, and thereby improving multi-target tracking accuracy.

Description

Multi-target tracking method and apparatus

Technical field

The present invention relates to the field of computer vision, and in particular to a multi-target tracking method and apparatus.

Background

Computer vision mainly studies the recognition of target objects, the localization of target objects, and the analysis of their motion. Multi-target tracking is one of the basic problems of computer vision and has very wide application. Its main task is to locate, in every image of a continuous video file, the position of each target of interest, thereby realizing tracking of those targets. Multi-target tracking is a very important part of computer vision: only when targets are tracked accurately can subsequent analysis of target behaviour proceed.

Existing multi-target tracking techniques predict the region of a target object in the current image frame from the object's region information in the previous image frame alone, and take the predicted region as the region of the target object in the current image frame.

However, because the change in a target object's position is ambiguous, this direct prediction based only on the region information from the previous image frame leads to low multi-target tracking accuracy.
Summary of the invention
The object of the embodiments of the present invention is to provide a multi-target tracking method and apparatus that improve multi-target tracking accuracy. The technical scheme is as follows:

Determine, for each target object in a reference image frame, a first-type prediction region in the current target image frame, wherein the reference image frame and the target image frame are different image frames of a video file, the reference image frame being the immediately preceding frame of the target image frame; and wherein a target object is at least one of the plurality of objects contained in the reference image frame;

Determine the gradient-magnitude map corresponding to the target image frame;

Slide a predetermined window pixel by pixel across the gradient-magnitude map, obtaining the plurality of sub-image regions generated as the window slides;

Screen, from the plurality of sub-image regions, the first-type sub-image regions containing an object;

Perform a predetermined region-analysis operation on each first-type sub-image region, the operation comprising at least:

Matching the current first-type sub-image region against the first-type prediction region of each target object; if there is a target first-type prediction region whose matching rate exceeds a predetermined threshold, taking this first-type sub-image region as the region, in the target image frame, of the target object corresponding to that target first-type prediction region.

Optionally, the region-analysis operation further comprises:

If no first-type prediction region has a matching rate above the predetermined threshold, then, when at least one image frame exists before the reference image frame:

From the image frames before the reference image frame that have not yet served as auxiliary image frames for this first-type sub-image region, select the frame nearest the reference image frame as the current auxiliary image frame, and determine, for each target object in the current auxiliary image frame, a second-type prediction region in the current target image frame; match this first-type sub-image region against each second-type prediction region; if there is a target second-type prediction region whose matching rate exceeds the predetermined threshold, take this first-type sub-image region as the region, in the target image frame, of the target object corresponding to that target second-type prediction region; otherwise, repeat the selection step, until every image frame before the reference image frame has served as an auxiliary image frame for this first-type sub-image region.
Optionally, determining the gradient-magnitude map corresponding to the target image frame comprises:

Calculating the gradient magnitude at each pixel position in the target image frame;

Obtaining, from the calculated gradient magnitudes, the gradient-magnitude map corresponding to the target image frame;

Wherein the gradient magnitude at each pixel position in the target image frame is calculated as:

f = min(|g_x| + |g_y|, 255)

g_x = I(x+1, y) − I(x, y)

g_y = I(x, y+1) − I(x, y)

where f is the gradient magnitude at pixel position (x, y), g_x is the horizontal gradient at pixel position (x, y), g_y is the vertical gradient at pixel position (x, y), and I(x, y) is the pixel value at position (x, y).
Optionally, screening the first-type sub-image regions containing an object from the plurality of sub-image regions comprises:

Calculating a score for each of the plurality of sub-image regions;

Sorting the sub-image regions in descending order of score, and extracting at least one sub-image region whose rank falls within a predetermined range;

Screening the extracted sub-image regions with a random-fern classifier and a nearest-neighbour classifier to obtain the first-type sub-image regions containing an object;

Wherein the score of each sub-image region is calculated as:

S = ω · f^T = Σ_{k=1}^{64} x_k y_k

where S is the sub-image region's score, ω = (x_1, x_2, …, x_64) is a preset feature vector, f = (y_1, y_2, …, y_64) is the vector form of the binarized gradient magnitudes of the sub-image region generated as the window slides, and k is the dimension index.
Optionally, screening the extracted sub-image regions with the random-fern classifier and the nearest-neighbour classifier to obtain the first-type sub-image regions containing an object comprises:

Screening the extracted sub-image regions with the random-fern classifier and the nearest-neighbour classifier to obtain a screening result;

Taking the screening result as the first-type sub-image regions containing an object;

Or:

Screening the extracted sub-image regions with the random-fern classifier and the nearest-neighbour classifier to obtain a screening result;

Judging whether each sub-image region in the screening result is a predetermined background image region;

Taking the sub-image regions in the screening result that do not belong to a background image region as the first-type sub-image regions containing an object.
An embodiment of the invention provides a multi-target tracking apparatus, comprising:

A first determination module, configured to determine, for each target object in a reference image frame, a first-type prediction region in the current target image frame, wherein the reference image frame and the target image frame are different image frames of a video file, the reference image frame being the immediately preceding frame of the target image frame; and wherein a target object is at least one of the plurality of objects contained in the reference image frame;

A second determination module, configured to determine the gradient-magnitude map corresponding to the target image frame;

A sub-image-region generation module, configured to slide a predetermined window pixel by pixel across the gradient-magnitude map and obtain the plurality of sub-image regions generated as the window slides;

A screening module, configured to screen the first-type sub-image regions containing an object from the plurality of sub-image regions;

An analysis module, configured to perform a predetermined region-analysis operation on each first-type sub-image region, the operation comprising at least:

Matching the current first-type sub-image region against the first-type prediction region of each target object; if there is a target first-type prediction region whose matching rate exceeds a predetermined threshold, taking this first-type sub-image region as the region, in the target image frame, of the target object corresponding to that target first-type prediction region.

Optionally, the region-analysis operation performed by the analysis module further comprises:

If no first-type prediction region has a matching rate above the predetermined threshold, then, when at least one image frame exists before the reference image frame:

From the image frames before the reference image frame that have not yet served as auxiliary image frames for this first-type sub-image region, select the frame nearest the reference image frame as the current auxiliary image frame, and determine, for each target object in the current auxiliary image frame, a second-type prediction region in the current target image frame; match this first-type sub-image region against each second-type prediction region; if there is a target second-type prediction region whose matching rate exceeds the predetermined threshold, take this first-type sub-image region as the region, in the target image frame, of the target object corresponding to that target second-type prediction region; otherwise, repeat the selection step, until every image frame before the reference image frame has served as an auxiliary image frame for this first-type sub-image region.
Optionally, the second determination module comprises:

A first calculation submodule, configured to calculate the gradient magnitude at each pixel position in the target image frame;

An obtaining submodule, configured to obtain, from the calculated gradient magnitudes, the gradient-magnitude map corresponding to the target image frame;

Wherein the gradient magnitude at each pixel position in the target image frame is calculated as:

f = min(|g_x| + |g_y|, 255)

g_x = I(x+1, y) − I(x, y)

g_y = I(x, y+1) − I(x, y)

where f is the gradient magnitude at pixel position (x, y), g_x is the horizontal gradient at pixel position (x, y), g_y is the vertical gradient at pixel position (x, y), and I(x, y) is the pixel value at position (x, y).
Optionally, the screening module comprises:

A second calculation submodule, configured to calculate a score for each of the plurality of sub-image regions;

An extraction submodule, configured to sort the sub-image regions in descending order of score and extract at least one sub-image region whose rank falls within a predetermined range;

A classification submodule, configured to screen the extracted sub-image regions with a random-fern classifier and a nearest-neighbour classifier to obtain the first-type sub-image regions containing an object;

Wherein the score of each sub-image region is calculated as:

S = ω · f^T = Σ_{k=1}^{64} x_k y_k

where S is the sub-image region's score, ω = (x_1, x_2, …, x_64) is a preset feature vector, f = (y_1, y_2, …, y_64) is the vector form of the binarized gradient magnitudes of the sub-image region generated as the window slides, and k is the dimension index.
Optionally, the classification submodule comprises:

A first screening submodule, configured to screen the extracted sub-image regions with the random-fern classifier and the nearest-neighbour classifier to obtain a screening result;

A first obtaining submodule, configured to take the screening result as the first-type sub-image regions containing an object;

Or:

The classification submodule comprises:

A second screening submodule, configured to screen the extracted sub-image regions with the random-fern classifier and the nearest-neighbour classifier to obtain a screening result;

A judging submodule, configured to judge whether each sub-image region in the screening result is a predetermined background image region;

A second obtaining submodule, configured to take the sub-image regions in the screening result that do not belong to a background image region as the first-type sub-image regions containing an object.
It can be seen that the embodiments of the present invention propose a multi-target tracking method and apparatus in which the first-type sub-image regions of the target image frame are matched against the first-type prediction region, in the current target image frame, of each target object in the reference image frame, so that the region of each target object in the target image frame is determined by matching rather than predicted directly from the object's region information in the previous image frame alone; multi-target tracking accuracy can therefore be improved. Of course, implementing any product or method of the present invention does not necessarily require all of the above advantages to be achieved simultaneously.
Brief description of the drawings

To illustrate the embodiments of the present invention or the technical schemes of the prior art more clearly, the drawings required by the embodiments or the prior-art description are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; those of ordinary skill in the art can obtain other drawings from them without creative effort.

Fig. 1 is a flow diagram of a multi-target tracking method provided by an embodiment of the present invention;

Fig. 2 is a graphical representation of the intermediate feature vector ω = (x_1, x_2, …, x_64) produced while training the linear SVM classifier;

Fig. 3 compares the per-frame processing time of the multi-target tracking method presented herein with that of the original TLD method: (a) on the AVSS-PV-Medium video frame dataset of the AVSS2007 files, and (b) on the AVSS-PV-Hard video frame dataset of the AVSS2007 files;

Fig. 4 compares the accuracy of the multi-target tracking method presented herein with that of the original TLD method: (a) on the AVSS-PV-Medium video frame dataset of the AVSS2007 files, and (b) on the AVSS-PV-Hard video frame dataset of the AVSS2007 files;

Fig. 5 is a schematic diagram of tracking results of the multi-target tracking method presented herein;

Fig. 6 is a structural diagram of a multi-target tracking apparatus provided by an embodiment of the present invention.
Detailed description

The technical schemes in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by those of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.

The present invention is described in detail below through specific embodiments.

Fig. 1 is a flow diagram of a multi-target tracking method provided by an embodiment of the present invention, comprising the following steps:
S101: Determine, for each target object in the reference image frame, a first-type prediction region in the current target image frame.

Here, the reference image frame and the target image frame are different image frames of a video file, the reference image frame being the immediately preceding frame of the target image frame; and a target object is at least one of the plurality of objects contained in the reference image frame.

Read a frame of the video file and take it as the reference image frame. Depending on actual conditions of use, the reference image frame may be forcibly resized to a fixed size, e.g. 320 × 240 pixels, to speed up computation. An optical-flow tracker that selects tracking points with the FB (forward-backward) error method is used to predict the first-type prediction region of each target object in the current target image frame. The concrete steps are as follows:

Sample points uniformly on a grid within each target object's region in the reference image frame, e.g. with a grid size of 10 × 10 pixels;

Then, for each sampled pixel position (x1, y1), track its position forward to the current target image frame with the optical-flow tracker, then track it backward to position (x2, y2) in the reference image frame, and compute the FB error:

FB-ERROR = sqrt((x2 − x1)^2 + (y2 − y1)^2)

where FB-ERROR is the FB error between pixel position (x1, y1) and pixel position (x2, y2).

Keep the 50% of points with the smallest FB error, and compute, from the coordinate and distance changes of these points, the first-type prediction region of the target object in the current target image frame, i.e. the predicted position of the target object in the current target image frame.

In addition, compute the residual of each point's displacement:

d = |d_i − d_m|

where d is the residual of a point's displacement, d_i is the displacement of one grid point, and d_m is the median of all grid-point displacements.

If the median of the residuals of all grid-point displacements exceeds 10 pixel units, the optical-flow tracker has failed to track the target object, and no first-type prediction region for that object can be obtained in the current target image frame.
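As a rough illustration (not part of the patent text), the FB-error filtering of step S101 can be sketched in NumPy. The sketch assumes the forward- and backward-tracked point coordinates have already been produced by some optical-flow tracker; the function and array names are the sketch's own.

```python
import numpy as np

def fb_error(pts_orig, pts_back):
    """Forward-backward error: Euclidean distance between each point's
    original position (x1, y1) and the position (x2, y2) it reaches after
    being tracked forward to the target frame and back again."""
    return np.sqrt(((pts_back - pts_orig) ** 2).sum(axis=1))

def filter_points(pts_orig, pts_fwd, pts_back):
    """Keep the 50% of grid points with the smallest FB error, as in S101;
    the surviving forward-tracked points then drive the region prediction."""
    err = fb_error(pts_orig, pts_back)
    keep = err <= np.median(err)          # smallest half by FB error
    return pts_orig[keep], pts_fwd[keep]
```

The region prediction itself (coordinate and scale change from the kept points) and the residual-median failure test would follow on the filtered point set.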
S102: Determine the gradient-magnitude map corresponding to the target image frame.

Concretely, determining the gradient-magnitude map corresponding to the target image frame comprises:

Calculating the gradient magnitude at each pixel position in the target image frame;

Obtaining, from the calculated gradient magnitudes, the gradient-magnitude map corresponding to the target image frame;

Wherein the gradient magnitude at each pixel position in the target image frame is calculated as:

f = min(|g_x| + |g_y|, 255)

g_x = I(x+1, y) − I(x, y)

g_y = I(x, y+1) − I(x, y)

where f is the gradient magnitude at pixel position (x, y), g_x is the horizontal gradient at pixel position (x, y), g_y is the vertical gradient at pixel position (x, y), and I(x, y) is the pixel value at position (x, y).

Further, depending on actual conditions of use, the target image frame may be scaled to a fixed size before the gradient magnitude at each pixel position is calculated.
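The forward-difference formulas of S102 translate directly into a few NumPy lines. This is an illustrative sketch, not the patent's implementation; the boundary handling (zero gradient on the last row/column, where no forward neighbour exists) is an assumption of the sketch.

```python
import numpy as np

def gradient_magnitude_map(img):
    """Gradient-magnitude map per S102: f = min(|gx| + |gy|, 255), with
    gx = I(x+1, y) - I(x, y) and gy = I(x, y+1) - I(x, y).
    img is a 2-D array of grey values; the last row and column, which have
    no forward neighbour, are assigned zero gradient."""
    img = img.astype(np.int32)             # avoid uint8 wrap-around
    gx = np.zeros_like(img)
    gy = np.zeros_like(img)
    gx[:, :-1] = img[:, 1:] - img[:, :-1]  # horizontal forward difference
    gy[:-1, :] = img[1:, :] - img[:-1, :]  # vertical forward difference
    return np.minimum(np.abs(gx) + np.abs(gy), 255).astype(np.uint8)
```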
S103: Slide the predetermined window pixel by pixel across the gradient-magnitude map, obtaining the plurality of sub-image regions generated as the window slides.

Depending on actual conditions of use, the predetermined window may be a window of 8 × 8 pixels, or of 10 × 10 pixels; of course, the embodiment is not limited to these.

Depending on actual conditions of use, the predetermined window may slide across the gradient-magnitude map pixel by pixel from top to bottom and left to right, or from bottom to top and right to left; of course, the embodiment is not limited to these either.
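A minimal sketch of the S103 sliding-window enumeration (top-to-bottom, left-to-right order), assuming the gradient-magnitude map is a 2-D NumPy array; names are the sketch's own.

```python
import numpy as np

def sliding_windows(grad_map, win=8):
    """Slide a win x win window pixel by pixel over the gradient-magnitude
    map, yielding each sub-image region together with its top-left corner."""
    h, w = grad_map.shape
    for y in range(h - win + 1):           # top to bottom
        for x in range(w - win + 1):       # left to right
            yield (x, y), grad_map[y:y + win, x:x + win]
```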
S104: Screen, from the plurality of sub-image regions, the first-type sub-image regions containing an object.

Concretely, calculate a score for each of the plurality of sub-image regions;

Sort the sub-image regions in descending order of score, and extract at least one sub-image region whose rank falls within a predetermined range;

Screen the extracted sub-image regions with a random-fern classifier and a nearest-neighbour classifier to obtain the first-type sub-image regions containing an object;

Wherein the score of each sub-image region is calculated as:

S = ω · f^T = Σ_{k=1}^{64} x_k y_k

where S is the sub-image region's score, ω = (x_1, x_2, …, x_64) is a preset feature vector, f = (y_1, y_2, …, y_64) is the vector form of the binarized gradient magnitudes of the sub-image region generated as the window slides, and k is the dimension index.
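The score S = ω · f^T and the descending-rank extraction can be sketched as follows (illustrative only; the `limit` default mirrors the 1-2000 rank-range example given later in the text, and the 64-dimensional features are assumed to be the flattened binarized gradients of 8 × 8 regions).

```python
import numpy as np

def region_scores(omega, features):
    """S = omega . f^T for each row of `features`, where each row is a
    64-dimensional binarized gradient vector of one sub-image region."""
    return features @ omega

def top_regions(omega, features, limit=2000):
    """Indices of sub-image regions in descending score order, truncated
    to the predetermined rank range (here ranks 1..limit)."""
    order = np.argsort(region_scores(omega, features))[::-1]
    return order[:limit]
```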
Concretely, screening the extracted sub-image regions with the random-fern classifier and the nearest-neighbour classifier to obtain the first-type sub-image regions containing an object comprises:

Screening the extracted sub-image regions with the random-fern classifier and the nearest-neighbour classifier to obtain a screening result;

Taking the screening result as the first-type sub-image regions containing an object;

Or:

Screening the extracted sub-image regions with the random-fern classifier and the nearest-neighbour classifier to obtain a screening result;

Judging whether each sub-image region in the screening result is a predetermined background image region;

Taking the sub-image regions in the screening result that do not belong to a background image region as the first-type sub-image regions containing an object.
The preset feature vector ω = (x_1, x_2, …, x_64) is obtained as follows:

Obtain a linear SVM (Support Vector Machine) classifier;

Obtain the regions of all objects in the training image frames;

Determine the gradient-magnitude map corresponding to each training image frame;

Shrink each object region to a preset size, e.g. 8 × 8 pixels, and compute the binarized gradient magnitudes of each object region with an acceleration algorithm;

Train the linear SVM classifier on the binarized gradient magnitudes of the object regions and of randomly sampled background image regions, and take the output 64-dimensional feature vector ω = (x_1, x_2, …, x_64) as the preset feature vector. A graphical representation of this intermediate feature vector ω is shown in Fig. 2.

The predetermined rank range is determined by actual conditions of use, e.g. ranks 1-2000.

Further, depending on actual conditions of use, the extracted sub-image regions may be scaled back to the original target-image-frame size before being screened with the random-fern classifier and the nearest-neighbour classifier.
The fern classifier comprises n basic classifiers. Each basic classifier makes m pixel grey-value comparisons on a sub-image region, producing a binary code of length m, of which there are 2^m possible values; a first-type sub-image region containing an object produces some binary code x of length m. When the number of sub-image regions is sufficient, the posterior probability that a sub-image region is a first-type sub-image region containing an object is obtained as:

p_i = #p / (#p + #n)

where p_i is the posterior probability, under the i-th basic classifier, that a sub-image region producing binary code x is a first-type sub-image region containing an object, #p is the number of first-type sub-image regions containing an object that produce binary code x, and #n is the number of background sub-image regions that produce binary code x.

The posterior probabilities obtained for each sub-image region from the basic classifiers are averaged, and the sub-image regions whose average exceeds 50% are taken as first-type sub-image regions containing an object.
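A minimal sketch of one such basic classifier and the posterior average, under stated assumptions: each basic classifier draws m random pixel pairs and compares their grey values, and the posterior for a code is the count ratio #p / (#p + #n). The class and function names are the sketch's own, not the patent's.

```python
import numpy as np

class Fern:
    """One basic classifier: m random pixel-pair grey comparisons -> m-bit code."""
    def __init__(self, shape, m, rng):
        idx = rng.integers(0, shape[0] * shape[1], size=(m, 2))
        self.a, self.b = idx[:, 0], idx[:, 1]   # pixel pairs to compare
        self.pos = {}                           # code -> #p (object count)
        self.neg = {}                           # code -> #n (background count)

    def code(self, region):
        flat = region.ravel()
        bits = flat[self.a] > flat[self.b]      # m binary comparisons
        return int(bits.dot(1 << np.arange(len(bits))))

    def train(self, region, is_object):
        c = self.code(region)
        d = self.pos if is_object else self.neg
        d[c] = d.get(c, 0) + 1

    def posterior(self, region):
        c = self.code(region)
        p, n = self.pos.get(c, 0), self.neg.get(c, 0)
        return p / (p + n) if p + n else 0.0    # p_i = #p / (#p + #n)

def fern_average(ferns, region):
    """Average posterior over the n basic classifiers; regions whose average
    exceeds 0.5 are kept as first-type (object) sub-image regions."""
    return sum(f.posterior(region) for f in ferns) / len(ferns)
```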
Then the sub-image regions are further screened by the nearest-neighbour classifier. Depending on actual conditions of use, each sub-image region is resized to a preset specification, e.g. 15 × 15 pixels, so that a sub-image region is a 225-dimensional vector, and the similarity between two sub-image regions is:

S(j, k) = 0.5 (NCC(j, k) + 1)

where S(j, k) is the similarity between sub-image regions j and k, and NCC is the normalized correlation coefficient, obtained as:

NCC(j, k) = Σ_{n=1}^{225} j(n) k(n) / sqrt(Σ_{n=1}^{225} j(n)^2 · Σ_{n=1}^{225} k(n)^2)

where NCC(j, k) is the normalized correlation coefficient between sub-image regions j and k, j(n) is the n-th component of the vector of sub-image region j, and k(n) is the n-th component of the vector of sub-image region k.
Suppose that sub-image area be the sub-image area of the first kind sub-image area comprising subject is positive sub-image region, sub-image area is the sub-image area of background image is negative sub-image area, calculates positive arest neighbors similarity and negative arest neighbors similarity respectively:
S + ( j , M ) = max j + ∈ M S ( j , j + )
S - ( j , M ) = max j - ∈ M S ( j , j - )
Wherein, S +the positive arest neighbors similarity that (j, M) is sub-image area j, S -the negative arest neighbors similarity that (j, M) is sub-image area j, M is the set M=(j of positive and negative sub-image area composition +, k +, z +, j -, k -, z -), j +for positive sub-image region, j -for negative sub-image area.
Calculate the relative similarity of sub-image area j:
S_j^r = S+(j, M) / (S+(j, M) + S-(j, M))
Wherein, S_j^r is the relative similarity of sub-image area j, S+(j, M) is the positive nearest-neighbor similarity of sub-image area j, and S-(j, M) is the negative nearest-neighbor similarity of sub-image area j.
The relative similarity takes values in [0, 1]; a larger value indicates that the sub-image area is closer to the region of a destination object in the target image frame.
If the relative similarity S_j^r of this sub-image area j satisfies:
S_j^r > θ_NN
wherein θ_NN is a predetermined coefficient whose value is 0.6,
then this sub-image area is determined to be a first-kind sub-image area comprising a subject.
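The nearest-neighbor decision above can be sketched as follows (an illustrative sketch; the similarity function is passed in so the example stays self-contained, and the `positives`/`negatives` lists stand in for the set M of the description):

```python
THETA_NN = 0.6  # predetermined coefficient from the description

def relative_similarity(j, positives, negatives, similarity):
    """S_j^r = S+(j, M) / (S+(j, M) + S-(j, M)), where S+ and S- are the
    largest similarities of patch j to the positive / negative examples."""
    s_pos = max(similarity(j, p) for p in positives)
    s_neg = max(similarity(j, n) for n in negatives)
    return s_pos / (s_pos + s_neg)

def is_object_region(j, positives, negatives, similarity):
    """Accept j as a first-kind sub-image area when S_j^r > theta_NN."""
    return relative_similarity(j, positives, negatives, similarity) > THETA_NN
```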
In another embodiment of the present invention, because an extracted sub-image area may either comprise a subject or comprise only background image, a further judgment is made on each sub-image area obtained by the screening: it is judged whether each screened sub-image area is a predetermined background image region, and only the sub-image areas in the screening result that do not belong to a background image region are taken as first-kind sub-image areas comprising a subject.
S105: presumptive area analysis operation is performed to each first kind sub-image area.
Wherein, described predetermined analysis operation at least comprises:
The current first-kind sub-image area is matched against the first-kind estimation ranges corresponding to the destination objects respectively; if there is a target first-kind estimation range whose matching rate is greater than a predetermined threshold, this first-kind sub-image area is taken as the region, in the target image frame, of the destination object corresponding to that target first-kind estimation range.
Concrete, described regional analysis operation also comprises:
If there is no target first-kind estimation range whose matching rate is greater than the predetermined threshold, and it is judged that at least one image frame exists before the reference image frame, then:
From the image frames before the reference image frame that have not yet served as auxiliary image frames for this first-kind sub-image area, select the frame nearest to the reference image frame as the current auxiliary image frame for this sub-image area, and determine the second-kind estimation range, in the current target image frame, of each destination object in the current auxiliary image frame. Match this first-kind sub-image area against each second-kind estimation range respectively. If there is a target second-kind estimation range whose matching rate is greater than the predetermined threshold, take this first-kind sub-image area as the region, in the target image frame, of the destination object corresponding to that target second-kind estimation range; otherwise, repeat the step of selecting, from the frames before the reference image frame that have not yet served as auxiliary image frames for this sub-image area, the frame nearest to the reference image frame as the current auxiliary image frame, until every frame before the reference image frame has served as an auxiliary image frame for this first-kind sub-image area.
For example, take the 5th frame of a video file as the reference image frame; the 6th frame is then the current target image frame, and the reference image frame contains a destination object A. If the matching rate between the first-kind sub-image area of destination object A in the 6th frame and the first-kind estimation range of destination object A predicted from the 5th frame is not greater than the predetermined threshold, the 4th frame is first selected as the current auxiliary image frame, and the second-kind estimation range of destination object A in the 6th frame is determined from the 4th frame. The first-kind sub-image area of destination object A is then matched against this second-kind estimation range: if the matching rate is greater than the predetermined threshold, this first-kind sub-image area is taken as the region of destination object A in the 6th frame; otherwise the step of selecting the frame nearest to the reference image frame is repeated, for example selecting the 3rd frame and then the 2nd frame as the current auxiliary image frame, until every frame before the 5th frame has served as an auxiliary image frame for the first-kind sub-image area of destination object A.
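The fallback search over auxiliary image frames described above can be sketched as follows (an illustrative sketch; `predictions_from` and `match_rate` are hypothetical stand-ins for the prediction and image-matching steps of the method, and `frames` is expected in order from the frame nearest the reference image frame backwards):

```python
def locate_object(candidate, frames, predictions_from, match_rate, threshold):
    """Walk through the not-yet-used frames (nearest to the reference frame
    first); each in turn serves as the auxiliary image frame. Return the
    first predicted region whose matching rate against the candidate
    sub-image area exceeds the threshold, or None once every frame before
    the reference frame has been tried without success."""
    for frame in frames:
        region = predictions_from(frame)  # predicted region in the target frame
        if match_rate(candidate, region) > threshold:
            return region
    return None
```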
It can be seen that the embodiment of the present invention proposes a multi-object tracking method in which the first-kind sub-image areas in the target image frame are matched, respectively, against the first-kind estimation ranges, in the current target image frame, of the destination objects in the reference image frame; the region of each destination object in the target image frame is determined by matching rather than predicted directly from the region information of the destination object in the previous frame alone, and therefore the multiple-target tracking accuracy can be improved.
Fig. 3 compares the time used to process one frame by the multi-object tracking method proposed herein and by the original TLD method: (a) is the comparison on the AVSS-PV-Medium video frame data set of the AVSS2007 file, and (b) is the comparison on the AVSS-PV-Hard video frame data set of the AVSS2007 file.
Wherein, the original TLD (Tracking-Learning-Detection) method, a method for multiple-target tracking, is indicated by the solid line, and the proposed method is represented by the dashed line. The AVSS2007 (Advanced Video and Signal based Surveillance 2007) file is a video frame data set file for multiple-target tracking; the AVSS-PV-Medium and AVSS-PV-Hard video frame data sets in it are two difficulty levels, namely a medium-difficulty and a hard-difficulty video frame data set.
It can be seen that, as the number of tracked destination objects increases, the processing time of the proposed multi-object tracking method also increases, but the total processing time remains at the level of hundreds of milliseconds: even when the number of tracked objects grows to 7, the proposed method takes no more than 160 ms to process one frame, whereas with more than 2 destination objects the processing time of the original TLD method already reaches the level of thousands of milliseconds. The proposed multi-object tracking method therefore achieves a large improvement over the TLD method in processing time.
Fig. 4 compares the accuracy of the multi-object tracking method proposed herein and of the original TLD method: (a) is the accuracy comparison on the AVSS-PV-Medium video frame data set of the AVSS2007 file, and (b) is the accuracy comparison on the AVSS-PV-Hard video frame data set of the AVSS2007 file.
Wherein, the original TLD method is indicated by the solid line, and the proposed method is represented by the dashed line.
If the Euclidean distance between the predicted center position coordinates of a destination object and its real center position coordinates does not exceed a threshold, the predicted position of the destination object is counted as correct; the accuracy is the ratio of the number of correctly predicted destination objects to the total number of destination objects.
It can be seen that when the number of destination objects is small, e.g. 1 or 2, the multi-object tracking method proposed herein maintains a high accuracy; as the number of destination objects increases, its accuracy declines to different degrees under different scenes. Under the medium (Medium) difficulty, even when the number of destination objects rises to 7, the proposed method still maintains an accuracy above 86%; under the hard (Hard) difficulty, the tracking accuracy declines faster as the number of destination objects increases, dropping to about 80% when the number of destination objects rises to 7.
Fig. 5 is a schematic diagram of the tracking results of the multi-object tracking method proposed herein.
Wherein, the number in the lower right corner of each picture is the current frame number.
Fig. 6 is a schematic structural diagram of a multiple-target tracking device provided by the embodiment of the present invention, corresponding to the flow chart shown in Fig. 1; it comprises a first determination module 601, a second determination module 602, a sub-image area generation module 603, a screening module 604 and an analysis module 605.
Wherein, the first determination module 601 is configured to determine the first-kind estimation range, in the current target image frame, of each destination object in the reference image frame, wherein the reference image frame and the target image frame are different image frames in a video file and the reference image frame is the image frame immediately preceding the target image frame; the destination object is at least one of the multiple subjects included in the reference image frame;
Described second determination module 602, for determining the gradient magnitude figure corresponding to described target image frame;
Described sub-image area generation module 603, for utilizing predetermined window to slide by pixel in described gradient magnitude figure, obtains multiple sub-image areas that described predetermined window generates in sliding process;
Described screening module 604, for from multiple sub-image area, screens and obtains the first kind sub-image area comprising subject;
Described analysis module 605, for performing presumptive area analysis operation to each first kind sub-image area, wherein, described predetermined analysis operation at least comprises:
The current first-kind sub-image area is matched against the first-kind estimation ranges corresponding to the destination objects respectively; if there is a target first-kind estimation range whose matching rate is greater than a predetermined threshold, this first-kind sub-image area is taken as the region, in the target image frame, of the destination object corresponding to that target first-kind estimation range.
The predetermined area analysis operation performed by the analysis module further comprises:
if there is no target first-kind estimation range whose matching rate is greater than the predetermined threshold, and it is judged that at least one image frame exists before the reference image frame:
From the image frames before the reference image frame that have not yet served as auxiliary image frames for this first-kind sub-image area, select the frame nearest to the reference image frame as the current auxiliary image frame for this sub-image area, and determine the second-kind estimation range, in the current target image frame, of each destination object in the current auxiliary image frame. Match this first-kind sub-image area against each second-kind estimation range respectively. If there is a target second-kind estimation range whose matching rate is greater than the predetermined threshold, take this first-kind sub-image area as the region, in the target image frame, of the destination object corresponding to that target second-kind estimation range; otherwise, repeat the step of selecting, from the frames before the reference image frame that have not yet served as auxiliary image frames for this sub-image area, the frame nearest to the reference image frame as the current auxiliary image frame, until every frame before the reference image frame has served as an auxiliary image frame for this first-kind sub-image area.
Described second determination module 602 comprises:
First calculating sub module, for calculating the gradient magnitude of each pixel position in described target image frame;
Obtain submodule, for the gradient magnitude based on calculated each pixel position, obtain the gradient magnitude figure corresponding to described target image frame;
Wherein, the gradient magnitude of each pixel position in the target image frame is calculated by:
f = min(|g_x| + |g_y|, 255)
g_x = I(x+1, y) - I(x, y)
g_y = I(x, y+1) - I(x, y)
Wherein, f is the gradient magnitude at pixel position (x, y), g_x is the horizontal gradient at pixel position (x, y), g_y is the vertical gradient at pixel position (x, y), and I(x, y) is the pixel value at position (x, y).
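Under these definitions, the gradient magnitude map can be sketched as follows (an illustrative sketch, not the patented implementation; the zero-gradient handling at the last row and column is an assumption, since the text does not specify the border behavior):

```python
def gradient_magnitude_map(img):
    """f(x, y) = min(|g_x| + |g_y|, 255), with forward differences
    g_x = I(x+1, y) - I(x, y) and g_y = I(x, y+1) - I(x, y).
    `img` is a list of rows of grey values; where the forward neighbour
    is missing (image border) the corresponding gradient is taken as 0."""
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = img[y][x + 1] - img[y][x] if x + 1 < w else 0
            gy = img[y + 1][x] - img[y][x] if y + 1 < h else 0
            out[y][x] = min(abs(gx) + abs(gy), 255)  # clamp to 255
    return out
```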
The screening module 604 comprises:
a second calculating submodule, configured to calculate the score of each of the multiple sub-image areas;
an extracting submodule, configured to sort the multiple sub-image areas in descending order of score and extract at least one sub-image area whose sorting position falls within a predetermined position range;
a classification submodule, configured to screen the extracted sub-image area(s) using an ensemble classifier and a nearest neighbor classifier, to obtain the first-kind sub-image areas comprising a subject;
Wherein, the score of each of the multiple sub-image areas is calculated by:
S = ω · f^T = Σ_{k=1}^{64} x_k y_k
Wherein, S is the sub-image area score, ω = (x_1, x_2, …, x_64) is a preset feature vector, f = (y_1, y_2, …, y_64) is the vector form of the binarized gradient magnitude of the sub-image area generated by the window during sliding, and k is the dimension index.
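The scoring step can be sketched as follows (an illustrative sketch; the text specifies only that both the preset feature vector and the binarized gradient feature of a window are 64-dimensional, so both are assumed to be given):

```python
def window_score(omega, f):
    """S = ω · f^T: dot product of the 64-d preset feature vector with
    the 64-d binarized gradient feature of one candidate window."""
    assert len(omega) == 64 and len(f) == 64
    return sum(x * y for x, y in zip(omega, f))
```

Windows can then be ranked in descending order of this score before the classifier stages are applied.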
The classification submodule comprises:
a first screening submodule, configured to screen the extracted sub-image area(s) using an ensemble classifier and a nearest neighbor classifier, to obtain a screening result;
a first obtaining submodule, configured to take the screening result as the first-kind sub-image areas comprising a subject;
or, the classification submodule comprises:
a second screening submodule, configured to screen the extracted sub-image area(s) using an ensemble classifier and a nearest neighbor classifier, to obtain a screening result;
a judging submodule, configured to judge whether each sub-image area in the screening result is a predetermined background image region;
a second obtaining submodule, configured to take the sub-image areas in the screening result that do not belong to a background image region as the first-kind sub-image areas comprising a subject.
It can be seen that the embodiment of the present invention proposes a multi-object tracking method and device in which the first-kind sub-image areas in the target image frame are matched, respectively, against the first-kind estimation ranges, in the current target image frame, of the destination objects in the reference image frame; the region of each destination object in the target image frame is determined by matching rather than predicted directly from the region information of the destination object in the previous frame alone, and therefore the multiple-target tracking accuracy can be improved.
It should be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device comprising a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article or device. Without further limitation, an element defined by the statement "comprising a ..." does not exclude the presence of other identical elements in the process, method, article or device that comprises that element.
Each embodiment in this specification is described in a progressive manner; identical or similar parts of the embodiments may refer to one another, and each embodiment focuses on its differences from the others. In particular, since the system embodiment is substantially similar to the method embodiment, its description is relatively simple, and for relevant parts reference may be made to the description of the method embodiment.
The above are only preferred embodiments of the present invention and are not intended to limit the protection scope of the present invention. Any modification, equivalent replacement, improvement, etc. made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (10)

1. a multi-object tracking method, is characterized in that, comprising:
determining the first-kind estimation range, in a current target image frame, of each destination object in a reference image frame, wherein the reference image frame and the target image frame are different image frames in a video file, the reference image frame is the image frame immediately preceding the target image frame, and the destination object is at least one of multiple subjects included in the reference image frame;
Determine the gradient magnitude figure corresponding to described target image frame;
Utilize predetermined window to slide by pixel in described gradient magnitude figure, obtain multiple sub-image areas that described predetermined window generates in sliding process;
From multiple sub-image area, screening obtains the first kind sub-image area comprising subject;
Perform presumptive area analysis operation to each first kind sub-image area, wherein, described predetermined analysis operation at least comprises:
The current first-kind sub-image area is matched against the first-kind estimation ranges corresponding to the destination objects respectively; if there is a target first-kind estimation range whose matching rate is greater than a predetermined threshold, this first-kind sub-image area is taken as the region, in the target image frame, of the destination object corresponding to that target first-kind estimation range.
2. The method according to claim 1, characterized in that the area analysis operation further comprises:
if there is no target first-kind estimation range whose matching rate is greater than the predetermined threshold, and it is judged that at least one image frame exists before the reference image frame:
From the image frames before the reference image frame that have not yet served as auxiliary image frames for this first-kind sub-image area, select the frame nearest to the reference image frame as the current auxiliary image frame for this sub-image area, and determine the second-kind estimation range, in the current target image frame, of each destination object in the current auxiliary image frame. Match this first-kind sub-image area against each second-kind estimation range respectively. If there is a target second-kind estimation range whose matching rate is greater than the predetermined threshold, take this first-kind sub-image area as the region, in the target image frame, of the destination object corresponding to that target second-kind estimation range; otherwise, repeat the step of selecting, from the frames before the reference image frame that have not yet served as auxiliary image frames for this sub-image area, the frame nearest to the reference image frame as the current auxiliary image frame, until every frame before the reference image frame has served as an auxiliary image frame for this first-kind sub-image area.
3. method according to claim 1, is characterized in that, the described gradient magnitude figure determined corresponding to described target image frame, comprising:
Calculate the gradient magnitude of each pixel position in described target image frame;
Based on the gradient magnitude of calculated each pixel position, obtain the gradient magnitude figure corresponding to described target image frame;
Wherein, the gradient magnitude of each pixel position in the target image frame is calculated by:
f = min(|g_x| + |g_y|, 255)
g_x = I(x+1, y) - I(x, y)
g_y = I(x, y+1) - I(x, y)
Wherein, f is the gradient magnitude at pixel position (x, y), g_x is the horizontal gradient at pixel position (x, y), g_y is the vertical gradient at pixel position (x, y), and I(x, y) is the pixel value at position (x, y).
4. The method according to claim 1, characterized in that said screening, from the multiple sub-image areas, of the first-kind sub-image areas comprising a subject comprises:
calculating the score of each of the multiple sub-image areas;
sorting the multiple sub-image areas in descending order of score, and extracting at least one sub-image area whose sorting position falls within a predetermined position range;
screening the extracted sub-image area(s) using an ensemble classifier and a nearest neighbor classifier, to obtain the first-kind sub-image areas comprising a subject;
Wherein, the score of each of the multiple sub-image areas is calculated by:
S = ω · f^T = Σ_{k=1}^{64} x_k y_k
Wherein, S is the sub-image area score, ω = (x_1, x_2, …, x_64) is a preset feature vector, f = (y_1, y_2, …, y_64) is the vector form of the binarized gradient magnitude of the sub-image area generated by the window during sliding, and k is the dimension index.
5. The method according to claim 4, characterized in that said screening of the extracted sub-image area(s) using an ensemble classifier and a nearest neighbor classifier, to obtain the first-kind sub-image areas comprising a subject, comprises:
screening the extracted sub-image area(s) using an ensemble classifier and a nearest neighbor classifier, to obtain a screening result;
taking the screening result as the first-kind sub-image areas comprising a subject;
or,
said screening of the extracted sub-image area(s) using an ensemble classifier and a nearest neighbor classifier, to obtain the first-kind sub-image areas comprising a subject, comprises:
screening the extracted sub-image area(s) using an ensemble classifier and a nearest neighbor classifier, to obtain a screening result;
judging whether each sub-image area in the screening result is a predetermined background image region;
taking the sub-image areas in the screening result that do not belong to a background image region as the first-kind sub-image areas comprising a subject.
6. a multiple target tracking device, is characterized in that, comprising:
a first determination module, configured to determine the first-kind estimation range, in a current target image frame, of each destination object in a reference image frame, wherein the reference image frame and the target image frame are different image frames in a video file, the reference image frame is the image frame immediately preceding the target image frame, and the destination object is at least one of multiple subjects included in the reference image frame;
Second determination module, for determining the gradient magnitude figure corresponding to described target image frame;
Sub-image area generation module, for utilizing predetermined window to slide by pixel in described gradient magnitude figure, obtains multiple sub-image areas that described predetermined window generates in sliding process;
Screening module, for from multiple sub-image area, screens and obtains the first kind sub-image area comprising subject;
Analysis module, for performing presumptive area analysis operation to each first kind sub-image area, wherein, described predetermined analysis operation at least comprises:
The current first-kind sub-image area is matched against the first-kind estimation ranges corresponding to the destination objects respectively; if there is a target first-kind estimation range whose matching rate is greater than a predetermined threshold, this first-kind sub-image area is taken as the region, in the target image frame, of the destination object corresponding to that target first-kind estimation range.
7. The device according to claim 6, characterized in that the predetermined area analysis operation performed by the analysis module further comprises:
if there is no target first-kind estimation range whose matching rate is greater than the predetermined threshold, and it is judged that at least one image frame exists before the reference image frame:
From the image frames before the reference image frame that have not yet served as auxiliary image frames for this first-kind sub-image area, select the frame nearest to the reference image frame as the current auxiliary image frame for this sub-image area, and determine the second-kind estimation range, in the current target image frame, of each destination object in the current auxiliary image frame. Match this first-kind sub-image area against each second-kind estimation range respectively. If there is a target second-kind estimation range whose matching rate is greater than the predetermined threshold, take this first-kind sub-image area as the region, in the target image frame, of the destination object corresponding to that target second-kind estimation range; otherwise, repeat the step of selecting, from the frames before the reference image frame that have not yet served as auxiliary image frames for this sub-image area, the frame nearest to the reference image frame as the current auxiliary image frame, until every frame before the reference image frame has served as an auxiliary image frame for this first-kind sub-image area.
8. device according to claim 6, is characterized in that, described second determination module comprises:
First calculating sub module, for calculating the gradient magnitude of each pixel position in described target image frame;
Obtain submodule, for the gradient magnitude based on calculated each pixel position, obtain the gradient magnitude figure corresponding to described target image frame;
Wherein, the gradient magnitude of each pixel position in the target image frame is calculated by:
f = min(|g_x| + |g_y|, 255)
g_x = I(x+1, y) - I(x, y)
g_y = I(x, y+1) - I(x, y)
Wherein, f is the gradient magnitude at pixel position (x, y), g_x is the horizontal gradient at pixel position (x, y), g_y is the vertical gradient at pixel position (x, y), and I(x, y) is the pixel value at position (x, y).
9. device according to claim 6, is characterized in that, described screening module, comprising:
a second calculating submodule, configured to calculate the score of each of the multiple sub-image areas;
an extracting submodule, configured to sort the multiple sub-image areas in descending order of score and extract at least one sub-image area whose sorting position falls within a predetermined position range;
a classification submodule, configured to screen the extracted sub-image area(s) using an ensemble classifier and a nearest neighbor classifier, to obtain the first-kind sub-image areas comprising a subject;
wherein the score of each of the multiple sub-image areas is calculated by:
S = ω · f^T = Σ_{k=1}^{64} x_k y_k
wherein S is the sub-image area score, ω = (x_1, x_2, …, x_64) is a preset feature vector, f = (y_1, y_2, …, y_64) is the vector form of the binarized gradient magnitude of the sub-image area generated by the window during sliding, and k is the dimension index.
10. The device according to claim 9, characterized in that the classification submodule comprises:
a first screening submodule, configured to screen the extracted sub-image area(s) using an ensemble classifier and a nearest neighbor classifier, to obtain a screening result;
a first obtaining submodule, configured to take the screening result as the first-kind sub-image areas comprising a subject;
or, the classification submodule comprises:
a second screening submodule, configured to screen the extracted sub-image area(s) using an ensemble classifier and a nearest neighbor classifier, to obtain a screening result;
a judging submodule, configured to judge whether each sub-image area in the screening result is a predetermined background image region;
a second obtaining submodule, configured to take the sub-image areas in the screening result that do not belong to a background image region as the first-kind sub-image areas comprising a subject.
CN201510676986.9A 2015-10-19 2015-10-19 A kind of multi-object tracking method and device Active CN105261040B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510676986.9A CN105261040B (en) 2015-10-19 2015-10-19 A kind of multi-object tracking method and device


Publications (2)

Publication Number Publication Date
CN105261040A true CN105261040A (en) 2016-01-20
CN105261040B CN105261040B (en) 2018-01-05

Family

ID=55100711

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510676986.9A Active CN105261040B (en) 2015-10-19 2015-10-19 A kind of multi-object tracking method and device

Country Status (1)

Country Link
CN (1) CN105261040B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109934849B (en) * 2019-03-08 2022-05-31 西北工业大学 Online multi-target tracking method based on trajectory metric learning

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20080252725A1 (en) * 2005-09-26 2008-10-16 Koninklijke Philips Electronics, N.V. Method and Device for Tracking a Movement of an Object or of a Person
CN101493889A (en) * 2008-01-23 2009-07-29 华为技术有限公司 Method and apparatus for tracking video object
CN102982340A (en) * 2012-10-31 2013-03-20 中国科学院长春光学精密机械与物理研究所 Target tracking method based on semi-supervised learning and random fern classifier

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ZDENEK KALAL ET AL: "Tracking-Learning-Detection", 《IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE》 *
ZHOU XIN ET AL: "Improved TLD video target tracking method", 《Journal of Image and Graphics (中国图象图形学报)》 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106875417A * 2017-01-10 2017-06-20 南京邮电大学 Multi-object tracking method based on high-order graph cross-temporal association
CN106875417B * 2017-01-10 2019-10-08 南京邮电大学 Multi-object tracking method based on high-order graph cross-temporal association
CN107452028A * 2017-07-28 2017-12-08 浙江华睿科技有限公司 Method and device for determining position information of a target image
CN108303094A * 2018-01-31 2018-07-20 深圳市拓灵者科技有限公司 Positioning and navigation system based on multi-vision-sensor fusion array and positioning and navigation method thereof
CN110222585A * 2019-05-15 2019-09-10 华中科技大学 Moving target tracking method based on cascade detectors
CN110222585B (en) * 2019-05-15 2021-07-27 华中科技大学 Moving target tracking method based on cascade detector

Also Published As

Publication number Publication date
CN105261040B (en) 2018-01-05

Similar Documents

Publication Publication Date Title
US9965719B2 (en) Subcategory-aware convolutional neural networks for object detection
CN110287826B (en) Video target detection method based on attention mechanism
CN105261040A (en) Multi-target tracking method and apparatus
CN102542289B (en) Pedestrian volume statistical method based on plurality of Gaussian counting models
CN103886325B (en) Cyclic matrix video tracking method with partition
CN108346159A (en) A kind of visual target tracking method based on tracking-study-detection
CN101835037B (en) Method and system for carrying out reliability classification on motion vector in video
CN103208008A (en) Fast adaptation method for traffic video monitoring target detection based on machine vision
CN104978567B (en) Vehicle checking method based on scene classification
CN104517095B (en) A kind of number of people dividing method based on depth image
CN101807257A (en) Method for identifying information of image tag
CN102521565A (en) Garment identification method and system for low-resolution video
CN102496001A (en) Method of video monitor object automatic detection and system thereof
CN104077596A (en) Landmark-free tracking registering method
CN109242883B (en) Optical remote sensing video target tracking method based on depth SR-KCF filtering
CN104834942A (en) Remote sensing image change detection method and system based on mask classification
CN106778570B (en) A kind of pedestrian detection and tracking in real time
CN105279772A (en) Trackability distinguishing method of infrared sequence image
CN103473786A (en) Gray level image segmentation method based on multi-objective fuzzy clustering
CN112001878A (en) Deep learning ore scale measuring method based on binarization neural network and application system
CN103413149B (en) Method for detecting and identifying static target in complicated background
CN106991686A (en) A kind of level set contour tracing method based on super-pixel optical flow field
CN103886585A (en) Video tracking method based on rank learning
CN115272652A (en) Dense object image detection method based on multiple regression and adaptive focus loss
CN105513080A (en) Infrared image target salience evaluating method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant