KR101681104B1 - A multiple object tracking method with partial occlusion handling using salient feature points - Google Patents
- Publication number
- KR101681104B1 (application KR1020150097839A)
- Authority
- KR
- South Korea
- Prior art keywords
- feature points
- minimum bounding
- bounding rectangle
- tracking
- next frame
- Prior art date
Classifications
- G06K9/34
- G06K9/4604
- G06T7/2033
Landscapes
- Image Analysis (AREA)
Abstract
The present invention relates to a salient feature point (SFP) based multi-object tracking method that extracts SFPs from each object and simultaneously tracks a plurality of objects in the image based on those feature points, the method comprising the steps of: (a) extracting main feature points in a corresponding frame of the image and calculating a minimum bounding rectangle of each object including all main feature points of that object; (c) predicting the positions of the main feature points in the next frame; (d) determining, using outlier analysis, whether the predicted main feature points of each object are mis-tracked or normally tracked; (e) calculating the minimum bounding rectangle of each object in the next frame using that object's correctly tracked main feature points; and (f) correcting the mis-tracked main feature points of each object using the minimum bounding rectangle of the next frame.
By the method described above, multiple SFPs can be tracked individually across video frames, and when some SFPs are mis-tracked because of occlusion, multiple objects can still be tracked successfully by exploiting the relative positions of the correctly tracked SFPs, so that significant tracking accuracy can be achieved.
Description
The present invention relates to a feature point based multi-object tracking method that tracks multiple objects moving in consecutive images when an object is partially obscured by a background object or when two or more objects overlap each other.
More particularly, the present invention relates to a feature point based multi-object tracking method that, when partial occlusion occurs between objects in an image, extracts salient feature points (SFPs) from each object and simultaneously tracks the plurality of objects in the image based on those feature points.
Visual object tracking is an important and complex task in the field of computer vision. The method has various applications such as automatic object detection, object monitoring, motion analysis, and human computer interaction [Non-Patent Documents 1-4]. For example, automated surveillance systems play an important role in plant, school, traffic, hospital, bank monitoring, and other areas, including object detection, tracking, and event analysis, according to various needs.
In a video stream, part of an object may be hidden from view by occlusion. Occlusion has been recognized as one of the major challenges in visual object tracking because it seriously degrades tracking accuracy.
A person can recognize an object even if it is partially hidden. If an object is only partially visible, the human brain can reconstruct the entire object by inference based on knowledge of the visible part and the overall structure of the object. For example, in FIG. 1(a) the whole body of a person is visible, whereas in FIG. 1(b) only part of the body is visible because of an obstacle. Despite the obstacle, a person can predict the size and shape of the object by gauging the posture of the visible part.
Sophisticated techniques may be needed to implement tracking systems similar to human object recognition mechanisms in the event of occlusion. Multiple object tracking, which tracks multiple objects at the same time when an occlusion of an object occurs in the image, has been recognized as a difficult problem in the field of computer vision, and research has been conducted steadily.
Many existing tracking methods can accurately track multiple objects when the objects are clearly separated from each other and their colors are not very similar to the background. Otherwise they may fail, and the tracker may drift to an incorrect position on the moving object or in the background.
Existing tracking methods will be described in more detail.
Object tracking is largely divided into single object tracking and multiple object tracking. Tracking a single object or a few isolated objects is relatively easy compared to tracking under occlusion and/or tracking multiple objects under harsh background conditions. The target object being tracked may be obscured by a background object or by another target object. Accordingly, a variety of single-view and multi-view approaches have been proposed to track obscured target objects.
Multi-view approaches [Non-Patent Documents 5-8] use information obtained from two or more cameras to reconstruct 3D spatial information, thereby reducing the hidden portions. However, a configuration in which the same scene is captured by several cameras may not be feasible in practice.
Existing single-view approaches can effectively track isolated objects, but occlusion, especially inter-object occlusion, is likely to severely disturb multi-object tracking and cause failure. If the appearance of an object changes, for example if its shape changes through rotation, most algorithms cannot track the object accurately. The present invention addresses a multi-object tracking method under occlusion using a single-view approach.
To date, much work has been done on tracking moving objects, and a variety of techniques have been used for effective tracking, including object detection, representation, and tracking algorithms [Non-Patent Documents 9, 10]. Various algorithms, such as the Kalman filter [Non-Patent Documents 11-13], mean shift [Non-Patent Documents 14-17], and the particle filter [Non-Patent Documents 18-27], have been proposed to track moving objects with a single-view approach.
Kalman filters are widely applied to object tracking, but they require linear models of the state dynamics, which are not guaranteed in all scenarios [Non-Patent Document 28]. Beymer and Konolige [Non-Patent Document 11] proposed a method of detecting the position and velocity of an object using a linear velocity model and estimating them during occlusion with a Kalman filter; in this method, an occluded object is tracked as a new object after the occlusion ends. Rowe et al. [Non-Patent Document 12] proposed a block-based color histogram matching algorithm for multi-object tracking with a Kalman filter, in which tracking proceeds in three steps: object detection, low-level tracking, and high-level tracking. This method struggles in cluttered environments and requires the initialization of many parameters, which can cause misclassification. Chang et al. [Non-Patent Document 13] represented each object by its center and bounding rectangle and tracked it with a Kalman filter; during occlusion, a new large bounding rectangle combining the occluded objects is created. This approach can recognize occlusion but does not track the individual occluded objects.
The mean shift (MS) tracking algorithm is widely used because of its simplicity and efficiency. In MS tracking, the target object is typically described by a weighted [Non-Patent Document 14] or feature [Non-Patent Document 15] histogram of the pixels inside the object's bounding rectangle. The object is then located in each video frame through template matching using vector similarity measures such as the Bhattacharyya coefficient [Non-Patent Document 29] or the Kullback-Leibler divergence [Non-Patent Document 30]. Ning et al. [Non-Patent Document 16] proposed a corrected background-weighted histogram (CBWH), which improves target position estimation by reducing the influence of background information; however, when the colors of the object and the background are similar, this algorithm may not remain consistent. Zhao et al. [Non-Patent Document 17] used 3D human models, color histograms, and foreground and background appearance models to track multiple people in crowded scenes; this method is very sophisticated but incurs computational overhead.
The particle filter (PF), also known as the sequential Monte Carlo method [Non-Patent Document 31], is the best-known algorithm for estimating the posterior probability density function (pdf) using the propagation rule of the state density [Non-Patent Document 32]. It produces better results than other algorithms, especially in nonlinear environments. Particle-filter-based algorithms mainly represent the target by object coordinates [Non-Patent Document 18], contour [Non-Patent Document 19], or color information [Non-Patent Document 20]. While it is not easy to find exact coordinates in the coordinate-based method, approaches based on contour and color information may fail to track the object when the background resembles the object.
Jin et al. [Non-Patent Document 21] divide the human body into three parts, represent each part by a combination of a color histogram and a histogram of oriented gradients descriptor, and track each part individually. This proposal does not track multiple objects and does not resolve occlusion. Chang and Ansari [Non-Patent Document 22] used an elliptical model and gradient estimation; their kernel particle filter (KPF) performs better than the conventional PF, but it does not track multiple target objects and provides no mechanism to handle occlusion. R. Cabido et al. [Non-Patent Document 23] proposed a tracking algorithm combining a particle filter and an imitation algorithm; it can track multiple objects but does not address occlusion. Liu and Sun [Non-Patent Document 24] used a PF to track a target object represented by a rectangle; traceability is improved by an incremental likelihood function combining histogram and Bhattacharyya similarity calculations. This method is very fast but less accurate. Yang et al. [Non-Patent Document 25] proposed a particle filtering method for multi-object tracking that includes pseudo-random sampling; it does not handle partial occlusion or shape changes of the tracked object. Cai et al. [Non-Patent Document 26] proposed a particle filter approach combined with a mean shift algorithm, while Okuma et al. [Non-Patent Document 27] used a boosted particle filter. Both methods use color histograms to represent the target model and are effective in tracking people and other non-rigid objects; however, a large number of particles are needed for pre-training.
The object of the present invention is to solve the above-mentioned problems. It focuses on tracking multiple objects moving in consecutive images when an object is partially obscured by a background object or when two or more objects obscure each other, and, to solve this occlusion problem, provides a main feature point based multi-object tracking method that uses a strategy similar to the human object reconstruction mechanism.
In particular, it is an object of the present invention to provide a main feature point based multi-object tracking method that extracts a plurality of salient feature points (SFPs) from each target object, tracks each SFP individually in successive video frames, and, when some SFPs of an object are mis-tracked because of occlusion or uncooperative background conditions, uses the relative positions of the correctly tracked SFPs of the same object to estimate the exact locations of the mis-tracked SFPs.
According to an aspect of the present invention, there is provided a method for tracking a plurality of objects detected in an image composed of a plurality of temporally consecutive frames, the method comprising the steps of: (a) extracting main feature points in a corresponding frame of the image and calculating a minimum bounding rectangle of each object including all main feature points of that object; (c) predicting the positions of the main feature points in the next frame; (d) determining, using outlier analysis, whether the predicted main feature points of each object are mis-tracked or normally tracked; (e) calculating the minimum bounding rectangle of each object in the next frame using that object's correctly tracked main feature points; and (f) correcting the mis-tracked main feature points of each object using the minimum bounding rectangle of the next frame.
According to another aspect of the present invention, there is provided a method for tracking a plurality of objects detected in an image composed of a plurality of temporally consecutive frames, the method comprising the steps of: (a) extracting main feature points in a corresponding frame of the image and calculating a minimum bounding rectangle of each object including all main feature points of that object; (b) calculating a feature descriptor for the main feature points of each object; (c) predicting the positions of the main feature points in the next frame; (d) determining, using outlier analysis, whether the predicted main feature points of each object are mis-tracked or normally tracked; (e) calculating the minimum bounding rectangle of each object in the next frame using that object's correctly tracked main feature points; (f) correcting the mis-tracked main feature points of each object using the minimum bounding rectangle of the next frame; and (g) modifying the feature descriptors of the main feature points.
According to another aspect of the present invention, in the feature point based multi-object tracking method, in step (c), a plurality of particles are generated from a Gaussian distribution centered at the position of each main feature point of the corresponding frame, displaced by the velocity of the main feature point; the weight of each particle is determined by the Bhattacharyya distance between the particle and the main feature point, and the position of the particle selected according to the obtained weights is taken as the predicted position of the main feature point in the next frame.
According to another aspect of the present invention, in the feature point based multi-object tracking method, in step (c), the position of the particle with the maximum weight is taken as the predicted position of the main feature point in the next frame.
According to another aspect of the present invention, in the feature point based multi-object tracking method, in step (d), the position of the minimum bounding rectangle of each object is predicted from the relative position of each predicted main feature point of the object; the distribution of the rectangle positions predicted from all predicted main feature points of the object is obtained, and a predicted main feature point is judged to be mis-tracked if its predicted rectangle position falls outside a predetermined range from the center of the distribution.
According to another aspect of the present invention, in the main feature point based multi-object tracking method, in step (d), a predicted main feature point is judged to be mis-tracked if its predicted rectangle position deviates from the mean (m) of the distribution by more than the standard deviation.
According to another aspect of the present invention, in the feature point based multi-object tracking method, the distribution is the distribution of the positions of the top-left corners of the minimum bounding rectangles predicted from all predicted main feature points of each object.
According to another aspect of the present invention, in the feature point based multi-object tracking method, in step (e), the position of the minimum bounding rectangle of the next frame is predicted as the average of the rectangle positions calculated from the relative positions of the correctly tracked main feature points of each object; the size of the minimum bounding rectangle of each object is predicted such that the ratio of the rectangle size to the relative positions of the correctly tracked main feature points equals the corresponding ratio in the current frame, and the size of the minimum bounding rectangle of the next frame is predicted as the average of the sizes so predicted.
According to another aspect of the present invention, in the feature point based multi-object tracking method, in step (f), the position of a mis-tracked main feature point is corrected using its relative position in the corresponding frame, adjusted according to the size ratio of the minimum bounding rectangles.
According to another aspect of the present invention, in the feature point based multi-object tracking method, when one main feature point is included in the minimum bounding rectangles of two or more objects, the main feature point is judged to be overlapped, and its position is predicted using its velocity.
In addition, the present invention is characterized in that, in the main feature point based multi-object tracking method, the feature descriptor is a Histogram of Oriented Gradients (HOG) descriptor.
As described above, according to the main feature point based multi-object tracking method of the present invention, a plurality of SFPs are tracked individually within the video frames, and when some of them are mis-tracked because of occlusion, the relative positions of the correctly tracked SFPs are exploited so that multiple objects can be tracked successfully and significant tracking accuracy can be achieved.
The major achievements of the method according to the present invention can be summarized as follows: (1) An effective way of handling partial occlusion, one of the difficult problems of moving multi-object tracking, is presented; as the experimental results show, the method achieves significant tracking accuracy. (2) An effective representation of moving objects using bounding rectangles that tightly enclose each object's SFPs is proposed; the tracking algorithm of the present invention enables more accurate tracking based on SFPs rather than on the entire object, as is common in existing techniques. (3) Outliers are defined to detect mis-tracked points, and an outlier detection method is proposed. (4) The tracking method is robust: the correctly tracked feature points play an important role in correcting the feature points that are mis-tracked because of occlusion.
FIG. 1 is a diagram illustrating the human inference process for reconstructing a partially occluded object: in (a) the whole body is visible, while in (b) the body is only partially visible; even so, a person can predict the entire object (shown using a bounding rectangle).
FIG. 2 is a block diagram of the overall system configuration for implementing a main feature point based multi-object tracking method according to an embodiment of the present invention.
FIG. 3 is a flow chart illustrating a main feature point based multi-object tracking method according to an embodiment of the present invention.
FIG. 4 is a data structure diagram of an object and an SFP according to the present invention, in which (a) shows the attributes of an object and (b) shows the attributes of an SFP.
FIG. 5 is a table defining the operators for vector operations according to the present invention.
FIG. 6 illustrates the object representation according to the present invention, in which each circle represents the position of a feature point, the rectangle surrounding the points is the minimum bounding rectangle, and the dotted line represents the relative position of feature point F_j within the rectangle.
FIG. 7 is a diagram showing an example of SFP position prediction using the maximum particle weight and the average particle weight according to the present invention; FIG. 7(a) shows the original position of the SFP.
FIG. 8 is a diagram for predicting the top-left corner position of the minimum bounding rectangle derived from the SFPs according to the present invention, in which (a) shows the ideal case and (b) shows that the prediction can change when a feature point inside the minimum bounding rectangle changes.
FIG. 9 is a diagram illustrating an example of position correction of an outlier SFP according to the present invention: (a) the original position of a feature point is indicated by a red dot; (b) shows the predicted positions of the feature points; and (c) shows the predicted position of the feature point after position correction, where the position of the outlier SFP detected by the outlier analysis is corrected based on its previous relative position.
FIG. 10 is an exemplary diagram of an overlapped SFP according to the present invention (indicated by the light blue dot); the SFP belongs to the person walking from the left (object 1), and modifying the SFP descriptor at this location can cause false tracking.
FIG. 11 shows the tracking results of each method for seven video frames (2nd, 5th, 8th, 11th, 13th, 16th, and 20th) in the experiment of the present invention: (a) KFOH, (b) MS [Non-Patent Document 14], (c) CBWH [Non-Patent Document 16], and (d) the method of the present invention. As can be seen from the figure, the minimum bounding rectangle predicted by KFOH is inaccurate, the MS method generates an overly large minimum bounding rectangle, and CBWH generates an elongated minimum bounding rectangle, whereas the method of the present invention tracks the two persons more accurately.
FIG. 12 is a graph comparing, for each video frame of the experiment, the Euclidean distance between the minimum bounding rectangle position (top-left coordinates) of the target object and the position predicted by each method: (a) comparison for the female, (b) comparison for the male.
FIG. 13 is a table comparing the tracking accuracy of each method for the female (object 1) and the male (object 2) in the experiment of the present invention.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS Hereinafter, the present invention will be described in detail with reference to the drawings.
In the description of the present invention, the same parts are denoted by the same reference numerals, and repetitive description thereof will be omitted.
First, the configuration of the overall system for implementing a main feature point based multi-object tracking method according to an embodiment of the present invention will be described with reference to FIG. 2.
As shown in FIG. 2, the main feature point based multi-object tracking method according to the present invention can be implemented as a program running on a computer system that receives the input video.
Meanwhile, as another embodiment, the main feature point based multi-object tracking method may be implemented as a single electronic circuit such as an ASIC (application-specific integrated circuit), rather than being run on a general-purpose computer, or as other forms of dedicated hardware.
Next, before describing the present invention, the techniques used in the present invention will be described in more detail.
First, the particle filter will be described.
The particle filter is a recursive Bayesian estimation method that predicts the posterior probability using state density propagation rules. Bayesian tracking recursively computes the posterior density p(x_t | z_1:t) of the current state based on previously observed information. The probability density function is estimated in two steps, a prediction step and an update step. Assuming a first-order Markov process, the predicted probability density function (pdf) is obtained as follows.
[Equation 1]
p(x_t | z_1:t-1) = ∫ p(x_t | x_t-1) p(x_t-1 | z_1:t-1) dx_t-1
The posterior is then obtained in the update step.
[Equation 2]
p(x_t | z_1:t) = κ p(z_t | x_t) p(x_t | z_1:t-1)
Here κ is a normalization constant independent of x_t, p(z_t | x_t) is the likelihood function, and p(x_t | x_t-1) is the dynamic model describing the transition from x_t-1 to x_t. The particle filter implements this recursive Bayesian filter through Monte Carlo simulation: a set of samples {s_t^(n) : n = 1, ..., N} represents the conditional state density at time t, and the weight w_t^(n) of each sample is estimated from the likelihood (conditional pdf) as follows.
[Equation 3]
w_t^(n) ∝ p(z_t | x_t = s_t^(n))
The posterior pdf is then approximated using the weighted particles as in Equation 4.
[Equation 4]
p(x_t | z_1:t) ≈ Σ_{n=1}^{N} w_t^(n) δ(x_t − s_t^(n))
Particles are propagated according to the dynamic model of the system. By maintaining multiple hypotheses, particle filters can handle nonlinear and non-Gaussian distributions.
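The prediction/update/resampling recursion above can be sketched compactly in Python. This is a minimal illustrative sketch, not the patent's implementation: the function names, the dynamic model, and the simple likelihood passed in below are assumptions for demonstration only.

```python
import random

def particle_filter_step(particles, weights, transition, likelihood, measurement):
    """One cycle of a generic particle filter.

    `transition` propagates a state sample through the dynamic model (Eq. 1);
    `likelihood` scores a sample against the measurement, i.e. p(z_t | x_t) (Eq. 3).
    Returns the resampled particle set with uniform weights.
    """
    # Prediction: propagate each sample through the dynamic model.
    predicted = [transition(s) for s in particles]
    # Update: reweight each sample by its likelihood and normalize.
    new_weights = [w * likelihood(measurement, s) for w, s in zip(weights, predicted)]
    total = sum(new_weights) or 1.0
    new_weights = [w / total for w in new_weights]
    # Resample: draw N samples with probability proportional to the weights.
    resampled = random.choices(predicted, weights=new_weights, k=len(predicted))
    uniform = [1.0 / len(resampled)] * len(resampled)
    return resampled, uniform
```

With a stationary dynamic model and a likelihood peaked at the true value, repeated application concentrates the particles around the measurement.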
Next, the corner detecting method will be described.
If two dominant and distinct edges intersect in the vicinity of a point, the point is considered a corner point; in other words, a corner point is the point of greatest curvature on a curve. Many corner detection methods have been devised to extract corner points [Non-Patent Documents 33, 34]. The present invention detects corner points using an algorithm [Non-Patent Document 35] that extracts contours from a Canny edge map and computes the absolute curvature at each contour point to obtain initial corner candidates. An adaptive curvature threshold is then applied to remove rounded corners from the initial list, and false corners caused by quantization noise are finally removed.
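The curvature-based candidate selection at the heart of this idea can be illustrated with a toy sketch. Note that the actual algorithm [Non-Patent Document 35] operates on Canny edge contours with an adaptive threshold; the discrete turning-angle proxy and fixed threshold below are simplifying assumptions.

```python
import math

def turning_angle(prev, pt, nxt):
    """Absolute turning angle at `pt` along a contour: a crude proxy for
    curvature; large values indicate corner candidates."""
    ang_a = math.atan2(pt[1] - prev[1], pt[0] - prev[0])
    ang_b = math.atan2(nxt[1] - pt[1], nxt[0] - pt[0])
    d = abs(ang_b - ang_a)
    return min(d, 2 * math.pi - d)

def corner_candidates(contour, threshold=0.5):
    """Return indices of interior contour points whose turning angle
    exceeds the threshold (the real method uses an adaptive threshold)."""
    return [i for i in range(1, len(contour) - 1)
            if turning_angle(contour[i - 1], contour[i], contour[i + 1]) > threshold]
```

For an L-shaped polyline, only the bend is flagged as a corner candidate.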
Next, the Histogram of Oriented Gradient (HOG) feature will be described.
HOG [Non-Patent Document 36] is a basic feature descriptor used in computer vision. HOG captures the edge or gradient structure that characterizes the local shape of an object and is robust to local photometric and geometric changes. HOG is computed over a block (or window) and is represented as a normalized histogram of the directions of the image gradient within a dense grid. A gradient operator (usually Sobel) is applied to compute the gradient; the gradient magnitude and direction are calculated using Equations 5 and 6, respectively.
[Equation 5]
m(x, y) = sqrt(g_x(x, y)^2 + g_y(x, y)^2)
[Equation 6]
θ(x, y) = arctan(g_y(x, y) / g_x(x, y))
where g_x and g_y are the horizontal and vertical image gradients.
Each pixel in the window votes into a histogram bin according to its gradient direction, with the vote weighted by the gradient magnitude. Finally, the accumulated histogram is normalized.
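The voting scheme just described can be sketched for a single cell. This is a simplified illustration assuming unsigned orientations, no bin interpolation, and no block normalization, all of which the full HOG descriptor [Non-Patent Document 36] adds.

```python
import math

def hog_cell(gx, gy, bins=9):
    """Orientation histogram for one cell.

    `gx` and `gy` are 2-D lists of horizontal/vertical gradients (Eq. 5/6
    inputs). Each pixel votes into a bin by gradient direction, weighted
    by gradient magnitude; the histogram is L1-normalized at the end.
    """
    hist = [0.0] * bins
    for row_x, row_y in zip(gx, gy):
        for dx, dy in zip(row_x, row_y):
            mag = math.hypot(dx, dy)                  # Eq. 5
            ang = math.atan2(dy, dx) % math.pi        # Eq. 6, unsigned
            b = min(int(ang / math.pi * bins), bins - 1)
            hist[b] += mag
    total = sum(hist) or 1.0
    return [h / total for h in hist]
```

A purely horizontal gradient field places all of its mass in the first orientation bin.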
Next, the Bhattacharyya distance will be described.
The Bhattacharyya distance measures the similarity, or proximity, of two discrete or continuous probability distributions [Non-Patent Document 29]. The distance D between two histograms H1 and H2 of equal size is calculated using the following equation.
[Equation 7]
D(H1, H2) = sqrt(1 − Σ_{i=1}^{n} sqrt(H1(i) · H2(i)))
Here n is the number of histogram bins, and the histograms are assumed to be normalized. A small distance indicates a good match between the two histograms; the distance score usually lies between 0 and 1.
Next, a main feature point based multi-object tracking method according to an embodiment of the present invention will be described in detail with reference to FIG.
In the main feature point based multi-object tracking method according to the present invention, salient feature points (SFPs) are extracted from each object when partial occlusion occurs between objects in the image, and the plurality of objects in the image are tracked simultaneously based on those feature points. The proposed method consists of seven steps, as shown in FIG. 3.
As shown in FIG. 3, the main feature point based multi-object tracking method according to the present invention includes: (a) a main feature point extraction step (S10); (b) a feature descriptor calculation step (S20); (c) a main feature point position prediction step (S30); (d) a step of detecting mis-tracked main feature points through outlier analysis (S40); (e) a minimum bounding rectangle calculation step using the correctly tracked main feature points (S50); (f) a step of correcting the positions of the mis-tracked main feature points (S60); and (g) a step of modifying the feature descriptors of the main feature points (S70).
First, in step (a), the objects of interest are detected in the first frame of the video stream, and salient feature points (SFPs) are extracted from each object using a corner detection algorithm. Each object is then represented by a minimum bounding rectangle containing all of its feature points (S10).
Next, in step (b), a feature descriptor is calculated for each SFP (S20). As the feature descriptor, the histogram of oriented gradients (HOG), well known in the field of computer vision, is used.
Next, in step (c), the SFP positions in the next frame are predicted using a particle filter (S30). Information on the shape of the object can be obtained from the relative position of each SFP, and this relative position information is an important factor in evaluating whether an SFP has been tracked correctly.
Next, in step (d), SFPs that have been mis-tracked are detected through outlier analysis (S40).
Next, in step (e), the minimum bounding rectangle is calculated using the correctly tracked SFPs (S50).
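The outlier analysis of step (d) can be illustrated as follows. Each predicted SFP, combined with its stored relative position, implies a top-left corner of the object's minimum bounding rectangle; points whose implied corner lies far from the consensus are flagged. This is an illustrative interpretation of the claims: the deviation statistic and the constant `k` are assumptions, not the patent's exact rule.

```python
import math

def detect_outliers(points, rel_positions, k=1.5):
    """Flag mis-tracked feature points.

    Each predicted point p_j with stored relative position r_j implies a
    rectangle top-left corner l_j = p_j - r_j. Points whose implied corner
    deviates from the mean corner by more than `k` standard deviations of
    the deviation distances are treated as outliers.
    """
    corners = [(px - rx, py - ry)
               for (px, py), (rx, ry) in zip(points, rel_positions)]
    n = len(corners)
    mx = sum(c[0] for c in corners) / n
    my = sum(c[1] for c in corners) / n
    dists = [math.hypot(cx - mx, cy - my) for cx, cy in corners]
    mean_d = sum(dists) / n
    std_d = math.sqrt(sum((d - mean_d) ** 2 for d in dists) / n)
    limit = mean_d + k * std_d
    return [d > limit for d in dists]
```

Four SFPs that agree on the same corner and one that does not yield exactly one outlier flag.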
Next, in step (f), the position of each mis-tracked SFP is corrected using its relative position in the previous frame with respect to the minimum bounding rectangle formed by the correctly tracked SFPs (S60).
Next, in step (g), the feature descriptors of the correctly positioned SFPs are modified (S70).
Steps (c) through (g) are repeated as long as frames remain; in this manner, the method of the present invention can successfully track multiple objects.
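The overall control flow of steps (a) through (g) can be sketched as a loop skeleton. The callable parameters are placeholders for the step implementations described above; their names and signatures are illustrative assumptions.

```python
def track_objects(frames, extract_sfps, compute_descriptors, predict,
                  find_outliers, fit_rectangle, correct, update_descriptors):
    """Skeleton of the tracking loop: steps (a)-(b) run once on the first
    frame; steps (c)-(g) repeat for every remaining frame."""
    objects = extract_sfps(frames[0])            # (a) SFPs + bounding rectangles
    compute_descriptors(objects, frames[0])      # (b) feature descriptors
    for frame in frames[1:]:
        predict(objects, frame)                  # (c) particle-filter prediction
        find_outliers(objects)                   # (d) outlier analysis
        fit_rectangle(objects)                   # (e) rectangle from inlier SFPs
        correct(objects)                         # (f) fix mis-tracked SFPs
        update_descriptors(objects, frame)       # (g) refresh descriptors
    return objects
```

Steps (c)-(g) execute once per remaining frame, in order, which is easy to verify with stub callables.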
Each step will be described in more detail below.
First, the main feature point extraction step (S10) and the feature descriptor calculation step (S20) will be described in more detail.
Objects of interest are detected from the first start frame of the video stream and key feature points (SFPs) are extracted from each object using a corner detection algorithm (Non-patent Document 35) (S10).
A minimum bounding rectangle containing the feature points extracted from each object is generated, and each object is represented by this minimum bounding rectangle. As shown in FIG. 4(a), each extracted object is represented by a minimum bounding rectangle (B), a velocity (u), and a set of salient feature points (SFPs). As shown in FIG. 4(b), each SFP consists of a location (p), a relative location (r), a descriptor (h), a velocity (v), and two flags.
The table in FIG. 5 defines operators so that calculations between these vector quantities are possible; for vectors r = (r1, r2), p = (p1, p2), and l = (l1, l2), element-wise addition and subtraction operators (⊕, ⊖) are defined as shown in the table. The minimum bounding rectangle (B) of a target object consists of the top-left coordinates l = (x, y) of the rectangle and the rectangle size s = (width, height). The velocity (u) of the object is calculated from the displacement of the top-left coordinates of the rectangle, as in the following equation.
[Equation 8]
u_t = l_t ⊖ l_{t-1}
Here, l_t and l_{t-1} are the top-left coordinates of the minimum bounding rectangles B_t and B_{t-1} of object O in the t-th and (t-1)-th frames, respectively.
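Interpreting the ⊖ operator of FIG. 5 as componentwise subtraction (an assumption, since the table itself is not reproduced here), Equation 8 reduces to a two-line sketch:

```python
def vsub(a, b):
    """Elementwise vector difference: the '⊖' operator of FIG. 5, assumed
    to be componentwise subtraction."""
    return (a[0] - b[0], a[1] - b[1])

def object_velocity(l_t, l_prev):
    """Eq. 8: object velocity as the displacement of the rectangle's
    top-left corner between consecutive frames, u_t = l_t ⊖ l_{t-1}."""
    return vsub(l_t, l_prev)
```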
As shown in FIG. 6, each object can be represented by a minimum bounding rectangle formed by a set of minutiae points. Each SFP (F j ) is composed of a plurality of attributes as shown in FIG. 4 (b).
p is the position of the SFP, and r is the position vector of the SFP (F_j) relative to the top-left coordinates of the rectangle.
F_j.r = F_j.p - B.l   (9)
Therefore, r represents the position of the SFP in the minimum bounding rectangle of the object. Conversely, the location and size of the minimum bounding rectangle can be estimated using the r values of the correctly tracked feature points.
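The relation between an SFP's frame position p, its relative position r, and the rectangle's left-top corner l can be sketched as follows (an illustrative sketch; the tuple representation and function names are assumptions):

```python
def relative_position(p, l):
    # Relative position of the SFP: r = p - B.l (left-top corner of the rectangle)
    return (p[0] - l[0], p[1] - l[1])

def corner_from_sfp(p, r):
    # Inverting the relation predicts the rectangle's left-top corner: l = p - r
    return (p[0] - r[0], p[1] - r[1])
```

A correctly tracked SFP thus "votes" for the rectangle corner by subtracting its stored relative position from its current position.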
The descriptor h represents the characteristic of the feature point and uses a Histogram of Oriented Gradient descriptor (Non-Patent Document 36) widely known in the field of computer vision. The velocity v of the feature point represents the change in position in two consecutive frames and is calculated by the following equation.
v = p_t - p_{t-1}   (10)
Here, p_t and p_{t-1} are the positions of the SFP (F_j) in the t-th and (t-1)-th frames, respectively. The SFP has two flags: an outlier flag indicating whether or not the SFP has been correctly tracked, and an overlapped flag indicating whether the SFP belongs to a position occupied by a plurality of objects.
Next, the steps (S30 to S70) for tracking the object will be described in more detail.
In general, object tracking using a particle filter (PF) estimates the posterior distribution of the position of a target object in a frame based on information obtained from past observations. The method according to the present invention also adopts this method to predict the position of the target object in each frame in the video stream.
In conventional methods, the tracking algorithm tracks the entire object using a descriptor or template representing the object. However, this approach has the problem that the error rate increases even when the position of the object or the imaging conditions change only slightly.
In the method according to the present invention, each SFP of an object is tracked individually, and the SFP tracking results are integrated to predict the position of the object in the next frame. Next, the SFP attributes are updated based on the current positions of the SFPs. This correction process reflects the current condition of each feature point and enables precise tracking even if the appearance of the feature points varies slightly between frames.
The particles are described next.
The state of the SFP (F_j) is represented by {F_j.p, F_j.r, F_j.v, F_j.h}, and a set of n particles Z = {z_i | i = 1, 2, 3, ..., n} is generated from each SFP based on its current state. The particle z_i of feature point F_j of object O is generated by the following equation.
O.F_j.z_i.q = O.F_j.p + O.F_j.v + N(0, d^2)   (11)
Here, O.F_j.v is the velocity of F_j calculated from the position of F_j in the previous frame, and N(0, d^2) denotes a Gaussian distribution with mean 0 and variance d^2.
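A minimal sketch of this particle-generation step is shown below. The function name and the noise scale d are illustrative assumptions; the default of 20 particles matches the value reported in the experiments later in the text.

```python
import random

def generate_particles(p, v, n=20, d=2.0, seed=0):
    """Sample n candidate SFP positions around the motion-predicted point
    p + v, each coordinate perturbed by Gaussian noise N(0, d^2).
    The seed is fixed here only to make the sketch reproducible."""
    rng = random.Random(seed)
    return [(p[0] + v[0] + rng.gauss(0.0, d),
             p[1] + v[1] + rng.gauss(0.0, d)) for _ in range(n)]
```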
The generated particles allow the possible positions of the SFP in the video frame to be estimated. The feature descriptor can be used to validate each particle; for this, the HOG feature O.F_j.z_i.h is calculated at each location O.F_j.z_i.q predicted by the particle. Each particle is then assigned a weight (w) using the following equation.
w_i = 1 - BD(O.F_j.h, O.F_j.z_i.h)   (12)
Here, BD (X, Y) is a Bhattacharyya distance (non-patent document 29) representing overlap or similarity between two distributions X and Y.
The weight of each particle is assigned by the similarity between the SFP being tracked and the SFP at the position predicted by the particle. Therefore, a larger weight means that the particle is likely to exhibit a similar SFP in the current frame.
According to the present invention, the weighted average of the particles does not always give the exact position of the SFP; the actual position of the SFP may lie anywhere within the range of predicted positions. Since the algorithm presented in the present invention is based on SFPs, it is important to predict the exact position of each SFP.
Therefore, in the method according to the present invention, each tracker selects the particle having the largest weight among the generated particles, using the following equation, and the position of that particle becomes the predicted position of the SFP being tracked. This approach gives a more accurate result than averaging the predicted positions of all particles.
O.F_j.p = O.F_j.z_{i*}.q, where i* = argmax_i w_i   (13)
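The weighting and maximum-weight selection can be sketched as follows. Note the assumptions: the Bhattacharyya distance for normalized histograms is computed as sqrt(1 - Σ sqrt(h1_i * h2_i)), and the similarity-to-weight mapping w = 1 - BD is an assumed form, since the text only states that more similar particles receive larger weights.

```python
import math

def bhattacharyya_distance(h1, h2):
    # BD for normalized histograms: sqrt(1 - sum_i sqrt(h1_i * h2_i))
    bc = sum(math.sqrt(a * b) for a, b in zip(h1, h2))
    return math.sqrt(max(0.0, 1.0 - bc))

def best_particle(sfp_hist, particle_hists, particle_positions):
    # Assign each particle a similarity weight (assumed form: w = 1 - BD)
    # and return the position of the particle with the largest weight.
    weights = [1.0 - bhattacharyya_distance(sfp_hist, h) for h in particle_hists]
    return particle_positions[weights.index(max(weights))]
```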
FIG. 7 compares SFP position prediction using the particle with the maximum weight against prediction using the weighted average of the particles. FIG. 7(a) shows the original position of the SFP.
Next, the outlier analysis and the minimum bounding rectangle correction step will be described.
In order to analyze the outliers, a method similar to that proposed by RANSAC [Non-Patent Document 37] is used. However, in the method proposed in the present invention, outliers are detected in a single pass instead of the multiple iterations used by RANSAC. Based on the relative position of the feature point SFP (F_j) of the object O, the position of the left-upper corner of the object's minimum bounding rectangle can be predicted using the following equation.
l̂_j = O.F_j.p - O.F_j.r   (14)
That is, p is obtained in the next frame (the frame to be predicted) using the particles, and the relative position r in the next frame is assumed to have the same magnitude and direction as in the current frame. The left-upper corner position of the minimum bounding rectangle (B) in the next frame is then predicted from p and r.
When the position of the left-upper corner of the minimum bounding rectangle is predicted from each SFP, the ideal case is that all of the predictions point to the left-upper corner of the minimum bounding rectangle, as shown in FIG. 8(a).
However, in general, it is difficult to accurately track all SFPs because the tracking can fail or the prediction may change due to changes in feature points within the minimum bounding rectangle.
Therefore, only the predictions from SFPs whose flags satisfy outlier = overlapped = 0 are considered, and a Hough-like method is used to predict the left-upper corner (l) of the minimum bounding rectangle of the object in the current frame. As shown in FIG. 8(b), the predictions by all feature points except F_2 are similar to one another. The prediction by F_2 may cause a tracking error; accordingly, the case in which a prediction falls outside the distribution C formed by the majority of the predictions is defined as abnormal, as follows.
O.F_j.outlier = 1 if ||l̂_j - m|| > 2σ, and 0 otherwise   (15)

Here, m and σ are the mean and standard deviation of the left-upper corner positions predicted by the SFPs.
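A single-pass sketch of this outlier test, flagging corner predictions that deviate from the mean by more than two standard deviations, might look like the following (the function name and the scalar-distance formulation are assumptions; k = 2 follows the 2σ criterion stated above):

```python
import math

def detect_outliers(corner_predictions, k=2.0):
    """Single-pass outlier test: a prediction is flagged (1) when its
    distance from the mean predicted corner exceeds k standard deviations;
    otherwise it is marked correctly tracked (0)."""
    n = len(corner_predictions)
    mx = sum(x for x, _ in corner_predictions) / n
    my = sum(y for _, y in corner_predictions) / n
    dists = [math.hypot(x - mx, y - my) for x, y in corner_predictions]
    sigma = math.sqrt(sum(dd * dd for dd in dists) / n)
    return [1 if dd > k * sigma else 0 for dd in dists]
```

Unlike RANSAC, no repeated random sampling is needed; the distribution of corner votes is evaluated once.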
The position (O.B.l) of the left-upper corner of the object's minimum bounding rectangle is calculated as the average of the predictions from the correctly tracked SFPs (outlier = overlapped = 0). The size of the minimum bounding rectangle is calculated using the relative positions (r) of the SFPs.
The ratio of the size of the minimum bounding rectangle between consecutive frames can be assumed to be approximately equal to the ratio of the relative positions of the correctly tracked SFPs (F_j).
O.B.s_t / O.B.s_{t-1} ≈ O.F_j.r_t / O.F_j.r_{t-1}   (16)
Here, O.B.s_{t-1} is the size of the minimum bounding rectangle in the (t-1)-th frame, and O.F_j.r_t and O.F_j.r_{t-1} are the relative positions of the SFP in the t-th and (t-1)-th frames, respectively.
O.B.s_t is calculated by the following equation.
O.B.s_t = O.B.s_{t-1} × (1/N) Σ_{j=1}^{N} (O.F_j.r_t / O.F_j.r_{t-1})   (17)
Here, B.s_t is the size of the minimum bounding rectangle in the t-th frame, and N is the number of correctly tracked SFPs.
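The size estimate of Equation (17) can be sketched as an average of component-wise ratios (an illustrative sketch; it assumes nonzero components in the previous relative positions):

```python
def estimate_size(prev_size, r_prev, r_curr):
    """Scale the previous rectangle size (width, height) by the average
    component-wise ratio of current to previous SFP relative positions.
    Assumes every component of the previous relative positions is nonzero."""
    n = len(r_curr)
    ratio_w = sum(c[0] / p[0] for c, p in zip(r_curr, r_prev)) / n
    ratio_h = sum(c[1] / p[1] for c, p in zip(r_curr, r_prev)) / n
    return (prev_size[0] * ratio_w, prev_size[1] * ratio_h)
```

If the correctly tracked SFPs have spread apart by 10 percent, the rectangle grows by the same factor.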
Since the relative positions of the SFPs of the target object vary from frame to frame, the relative positions of the correctly tracked SFPs are fixed again after the size of the minimum bounding rectangle has been adjusted.
The positions of the outlier SFPs, however, need to be corrected using the following equation, based on the previous relative position and the size of the minimum bounding rectangle.
O.F_j.p = O.B.l + O.F_j.r_{t-1} × (O.B.s_t / O.B.s_{t-1})   (18)
Here, O.F_j.p is the corrected position of each outlier SFP, and O.B.l is the position of the left-top corner of the minimum bounding rectangle in the t-th frame. O.F_j.r_{t-1} is the relative position of the outlier feature point in the (t-1)-th frame.
FIG. 9 shows the process of correcting the position of an outlier SFP. FIG. 9(a) shows the original position of the feature point as a red dot, FIG. 9(b) shows the predicted position of the feature point in the next frame, and FIG. 9(c) shows the corrected position. After the outlier analysis, the position of the outlier SFP is appropriately corrected based on the previous relative position.
Next, a method for handling partial occlusion will be described.
Occlusion is a common phenomenon when tracking multiple objects and is known to be a difficult problem to solve. When a target object is obscured by an obstacle, the occluded SFP becomes difficult to distinguish from the SFPs of other objects or of the background. When this occurs, the descriptor of the occluded SFP cannot be matched exactly, and erroneous position information is obtained compared with the positions of the correctly tracked SFPs. In the present invention, such an SFP is regarded as an outlier, and the problem is handled based on the positions of the correctly tracked SFPs.
Also, when two or more objects appear in close proximity or overlap in one frame, some SFPs of the target object in the next frame may be matched with SFPs of another object. In this case, such an SFP is more likely to be an outlier, so modifying the SFP descriptor at that location can lead to erroneous tracking of the SFPs of other objects. The algorithm proposed in the present invention effectively avoids this problem by not modifying the descriptor of any SFP whose overlapped flag is set to 1.
FIG. 10 shows an example of an overlapped SFP. The overlapped SFP marked by the light blue dot actually belongs to the person walking in from the left (object 1), but it matches best with the shoulder of the person wearing the black shirt (object 2). Therefore, modifying the SFP descriptor at this location may cause false tracking in the next frame. To avoid this problem, the overlapped SFP is defined as follows.
O.F_j.overlapped = 1 if O.F_j.p ∈ B_p ∩ B_q (p ≠ q), and 0 otherwise   (19)
Here, B is a minimum bounding rectangle, and p and q are indices of two different objects. O.F_j.p ∈ B_p ∩ B_q describes the situation (overlap) in which the position of the SFP belongs to two minimum bounding rectangles (B).
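The overlap test reduces to checking whether an SFP position lies inside at least two objects' rectangles, as in this sketch (the function names and the closed-rectangle containment convention are assumptions):

```python
def in_rect(p, l, s):
    # True when point p lies inside the rectangle with left-top corner l and size s
    return l[0] <= p[0] <= l[0] + s[0] and l[1] <= p[1] <= l[1] + s[1]

def is_overlapped(p, rects):
    # An SFP is overlapped when its position falls inside the minimum
    # bounding rectangles of two or more objects.
    return sum(1 for l, s in rects if in_rect(p, l, s)) >= 2
```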
The velocity of an overlapped SFP cannot be calculated accurately because the SFP lies on top of multiple objects. Therefore, for overlapped SFPs, the velocity is calculated using the following equation:
O.F_j.v = O.u   (20)
If a particular SFP lies in an overlapping region belonging to two minimum bounding rectangles (B), its flag is set to overlapped = 1; in this case, the velocity v of the SFP is not recomputed, and the velocity u of the object is used instead.
As a result, the correctly tracked SFPs can track objects accurately and effectively even when the speed and direction of the object change during partial occlusion.
Next, modification of the minutiae descriptor will be described.
The shape of the various parts of the target object varies from frame to frame due to body movement and posture changes. Therefore, tracking an SFP using only its initial descriptor (HOG) can lead to inaccurate tracking, because the position of the feature point may be predicted erroneously in the next frame. Therefore, the descriptors of correctly tracked SFPs are recomputed using the HOG at their positions in the current frame. However, even after position correction, the predicted position of an outlier SFP may not be accurate, and if the descriptor were updated using it, the desired SFP would not be obtained. In such a case, the descriptor is left unchanged until another exactly matching SFP is found.
Next, the effects of the present invention through experiments will be described in detail.
Three experimental data sets were used to verify the effectiveness of the algorithm developed in the present invention. Parameters such as the number of particles and the size of the HOG window were determined experimentally: the number of particles was 20, and the size of the window for calculating the HOG descriptor was 13x13. To verify the performance of the developed algorithm, it was compared against three techniques: KF with occlusion handling (KFOH) [Non-Patent Document 13], mean shift (MS) [Non-Patent Document 14], and CBWH [Non-Patent Document 16]. As a measure for performance evaluation, the Euclidean distance d between the minimum bounding rectangle position of the target object and the minimum bounding rectangle position predicted by the developed algorithm was used. The Euclidean distance d is defined by the following equation.
d = √((x_1 - x_GT)^2 + (y_1 - y_GT)^2)   (21)
Here, (x_1, y_1) is the coordinate of the left-upper corner of the minimum bounding rectangle predicted by the developed algorithm, and (x_GT, y_GT) is the left-upper corner of the minimum bounding rectangle of the target object identified with the naked eye. Precision and recall are used as measures to verify the accuracy of each method, and are defined by the following equations.
precision = area(B_GT ∩ B_Alg) / area(B_Alg),  recall = area(B_GT ∩ B_Alg) / area(B_GT)   (22)
Here, B_GT and B_Alg represent the minimum bounding rectangle of the target object identified by the naked eye and the minimum bounding rectangle predicted by the developed algorithm, respectively, and area(·) is calculated as the number of pixels in the minimum bounding rectangle.
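The evaluation measures, the Euclidean corner distance of Equation (21) and the area-based precision and recall of Equation (22), can be sketched as follows. For simplicity the sketch represents rectangles as (left-top, size) pairs and uses continuous areas rather than pixel counts.

```python
import math

def corner_distance(l_alg, l_gt):
    # Euclidean distance between the predicted and ground-truth
    # left-top corners of the minimum bounding rectangle
    return math.hypot(l_alg[0] - l_gt[0], l_alg[1] - l_gt[1])

def intersection_area(l_a, s_a, l_b, s_b):
    # Area of overlap between two axis-aligned rectangles
    w = min(l_a[0] + s_a[0], l_b[0] + s_b[0]) - max(l_a[0], l_b[0])
    h = min(l_a[1] + s_a[1], l_b[1] + s_b[1]) - max(l_a[1], l_b[1])
    return max(0, w) * max(0, h)

def precision_recall(gt_l, gt_s, alg_l, alg_s):
    # precision = |GT ∩ Alg| / |Alg|, recall = |GT ∩ Alg| / |GT|
    inter = intersection_area(gt_l, gt_s, alg_l, alg_s)
    return inter / (alg_s[0] * alg_s[1]), inter / (gt_s[0] * gt_s[1])
```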
PETS 2010 [Non-Patent Document 39] is used as the experimental data set. FIG. 11 shows the results obtained by applying the four algorithms to a video stream in which two persons approach each other, overlap, and then separate. (a) shows KFOH [Non-Patent Document 13], (b) MS [Non-Patent Document 14], (c) CBWH [Non-Patent Document 16], and (d) the results of tracking by the method developed in the present invention. As can be seen from the figure, KFOH generates one large minimum bounding rectangle in the case of occlusion, the minimum bounding rectangle predicted by the MS method contains large parts of the background, and CBWH creates one elongated minimum bounding rectangle. On the other hand, the method developed in the present invention tracks the two persons more accurately.
FIG. 12 compares the Euclidean distance d between the minimum bounding rectangle position (left-top coordinate) of the target object identified by the naked eye in each video frame and the minimum bounding rectangle position (left-top coordinate) predicted by each algorithm; (a) is the comparison for the woman and (b) for the man. In the case of occlusion, the distance for KFOH is large because this method cannot track multiple objects correctly. MS and CBWH show a significant error rate for the second person. On the other hand, the method developed in the present invention shows higher tracking accuracy for both persons than the other methods.
The table in FIG. 13 compares the accuracy of each method for the female (object 1) and the male (object 2) using the precision and recall measures. KFOH shows a high recall in all frames, but in frames 11-16 its precision is degraded. This is because this method merges the female object and the male object into a single large minimum bounding rectangle during occlusion. The MS and CBWH methods show fairly high precision and recall for both the woman and the man in almost every frame. However, in frames 13-16, the MS result is far from the actual position of the minimum bounding rectangle due to the mutual overlap of the objects.
The invention made by the present inventors has been described concretely with reference to the embodiments. However, it is needless to say that the present invention is not limited to the embodiments, and that various changes can be made without departing from the gist of the present invention.
Claims (11)
(a) extracting major feature points in a corresponding frame of the image, and calculating a minimum bounding rectangle of each object including all the major feature points of each object;
(c) predicting positions of major feature points of a next frame;
(d) determining whether the main feature points of the predicted next frame of each object are erroneously or normally tracked, using the outlier analysis;
(e) calculating a minimum bounding rectangle of a next frame of each object using the main feature points of the next frame of the normal track of each object; And
(f) using the minimum bounding rectangle of the next frame of each object to modify the main feature points of each object to be tracked.
(a) extracting major feature points in a corresponding frame of the image, and calculating a minimum bounding rectangle of each object including all the major feature points of each object;
(b) calculating a feature descriptor for the main feature points of each object;
(c) predicting positions of major feature points of a next frame;
(d) determining whether the main feature points of the predicted next frame of each object are erroneously or normally tracked, using the outlier analysis;
(e) calculating a minimum bounding rectangle of a next frame of each object using the main feature points of the next frame of the normal track of each object;
(f) using the minimum bounding rectangle of the next frame of each object to modify the main feature points of each object; And
(g) modifying the feature descriptor for the major feature points.
In the step (c), a plurality of particles are generated at the positions of the main feature points of the frame using the velocity of the main feature point and a Gaussian distribution, weights are obtained from the Bhattacharyya distance between the generated particles and the main feature points, and the position of the particle selected by the obtained weights is predicted as the position of the main feature point of the next frame.
And predicting the position of the particle having the maximum weight as the position of the main feature point of the next frame.
In the step (d), the outlier analysis predicts the position of the minimum bounding rectangle of each object on the basis of the relative positions of the predicted main feature points of each object, obtains a distribution of the predicted positions of the minimum bounding rectangle, and determines that a main feature point is mis-tracked when its predicted position deviates from the center of the distribution by more than a predetermined range.
Wherein in the step (d), if the predicted position of the main feature point deviates from the mean (m) of the distribution by more than the standard deviation (σ) × 2, it is determined that the main feature point is mis-tracked.
Wherein the distribution is a distribution of locations of left-upper corners of a minimum bounding rectangle predicted from all predicted major feature points of each object.
A position of the minimum bounding rectangle of the next frame of each object is calculated by averaging the positions of the minimum bounding rectangles predicted based on the relative positions of the correctly tracked main feature points of each object,
and, since the ratio of the size of the minimum bounding rectangle to the relative positions of the correctly tracked main feature points of each object is the same as the corresponding ratio in the frame, the size of the minimum bounding rectangle of the next frame is estimated as the average of the sizes predicted from each correctly tracked feature point.
In the step (f), the locations of the mis-tracked main feature points of each object are modified according to the relative positions of the main feature points in the frame and the ratio of the sizes of the minimum bounding rectangle in the frame and the next frame.
Characterized in that if one main feature point is included in the minimum bounding rectangles of at least two objects, the main feature point is judged to be overlapped, and the speed of the main feature point is set to the speed of the object.
Wherein the feature descriptor is a Histogram of Oriented Gradient (HOG) descriptor.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150097839A KR101681104B1 (en) | 2015-07-09 | 2015-07-09 | A multiple object tracking method with partial occlusion handling using salient feature points |
Publications (1)
Publication Number | Publication Date |
---|---|
KR101681104B1 true KR101681104B1 (en) | 2016-11-30 |
Family
ID=57707739
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
KR1020150097839A KR101681104B1 (en) | 2015-07-09 | 2015-07-09 | A multiple object tracking method with partial occlusion handling using salient feature points |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR101681104B1 (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101033243B1 (en) | 2010-11-17 | 2011-05-06 | 엘아이지넥스원 주식회사 | Object tracking method and apparatus |
KR101360349B1 (en) | 2013-10-18 | 2014-02-24 | 브이씨에이 테크놀러지 엘티디 | Method and apparatus for object tracking based on feature of object |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101885839B1 (en) | 2017-03-14 | 2018-08-06 | 중앙대학교 산학협력단 | System and Method for Key point Selecting for Object Tracking |
CN110753932A (en) * | 2017-04-16 | 2020-02-04 | 脸谱公司 | System and method for providing content |
KR102029860B1 (en) * | 2019-07-31 | 2019-11-08 | 주식회사 시그널웍스 | Method for tracking multi objects by real time and apparatus for executing the method |
KR20210067016A (en) * | 2019-11-29 | 2021-06-08 | 군산대학교산학협력단 | Method for object tracking using extended control of search window and object tracking system thereof |
KR102335755B1 (en) | 2019-11-29 | 2021-12-06 | 군산대학교 산학협력단 | Method for object tracking using extended control of search window and object tracking system thereof |
CN113012225A (en) * | 2021-04-14 | 2021-06-22 | 合肥高晶光电科技有限公司 | Method for quickly positioning minimum external rectangular frame of material image of color sorter |
CN113012225B (en) * | 2021-04-14 | 2024-04-16 | 合肥高晶光电科技有限公司 | Quick positioning method for minimum circumscribed rectangular frame of material image of color sorter |
KR20220046524A (en) * | 2021-06-03 | 2022-04-14 | 베이징 바이두 넷컴 사이언스 테크놀로지 컴퍼니 리미티드 | Image data correction method, apparatus, electronic device, storage medium, computer program, and autonomous vehicle |
KR102579124B1 (en) * | 2021-06-03 | 2023-09-14 | 베이징 바이두 넷컴 사이언스 테크놀로지 컴퍼니 리미티드 | Image data correction method, apparatus, electronic device, storage medium, computer program, and autonomous vehicle |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR101681104B1 (en) | A multiple object tracking method with partial occlusion handling using salient feature points | |
Sato et al. | Temporal spatio-velocity transform and its application to tracking and interaction | |
EP2192549B1 (en) | Target tracking device and target tracking method | |
Mangawati et al. | Object Tracking Algorithms for video surveillance applications | |
Ozyildiz et al. | Adaptive texture and color segmentation for tracking moving objects | |
EP2345999A1 (en) | Method for automatic detection and tracking of multiple objects | |
Ali et al. | Multiple object tracking with partial occlusion handling using salient feature points | |
Piater et al. | Multi-modal tracking of interacting targets using Gaussian approximations | |
Salih et al. | Comparison of stochastic filtering methods for 3D tracking | |
Ajith et al. | Unsupervised segmentation of fire and smoke from infra-red videos | |
US20110091074A1 (en) | Moving object detection method and moving object detection apparatus | |
Smith | ASSET-2: Real-time motion segmentation and object tracking | |
Nallasivam et al. | Moving human target detection and tracking in video frames | |
Zoidi et al. | Stereo object tracking with fusion of texture, color and disparity information | |
Huang et al. | Random sampling-based background subtraction with adaptive multi-cue fusion in RGBD videos | |
JP7488674B2 (en) | OBJECT RECOGNITION DEVICE, OBJECT RECOGNITION METHOD, AND OBJECT RECOGNITION PROGRAM | |
CN107665495B (en) | Object tracking method and object tracking device | |
Nguyen et al. | 3d pedestrian tracking using local structure constraints | |
Gautam et al. | Computer vision based asset surveillance for smart buildings | |
Truong et al. | Single object tracking using particle filter framework and saliency-based weighted color histogram | |
Cho et al. | Robust centroid target tracker based on new distance features in cluttered image sequences | |
Hammer et al. | Motion segmentation and appearance change detection based 2D hand tracking | |
Vezhnevets | Method for localization of human faces in color-based face detectors and trackers | |
Wu et al. | Depth image-based hand tracking in complex scene | |
Emami et al. | Novelty detection in human tracking based on spatiotemporal oriented energies |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
E701 | Decision to grant or registration of patent right | ||
GRNT | Written decision to grant | ||
FPAY | Annual fee payment |
Payment date: 20191107 Year of fee payment: 4 |