WO2016026370A1 - High-speed automatic multi-object tracking method and system with kernelized correlation filters - Google Patents
- Publication number
- WO2016026370A1 (PCT/CN2015/085270)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- foreground
- kernel
- matrix
- vector
- video
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/191—Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06V30/19173—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06F18/2411—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis
- G06T7/20—Analysis of motion
- G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T7/248—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving reference images or patches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/41—Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/49—Segmenting video sequences, i.e. computational techniques such as parsing or cutting the sequence, low-level clustering or determining units such as shots or scenes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/19—Recognition using electronic means
- G06V30/191—Design or setup of recognition systems or techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
- G06V30/19147—Obtaining sets of training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/10—Image acquisition modality
- G06T2207/10016—Video; Image sequence
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/20—Special algorithmic details
- G06T2207/20081—Training; Learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30232—Surveillance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T2207/00—Indexing scheme for image analysis or image enhancement
- G06T2207/30—Subject of image; Context of image processing
- G06T2207/30241—Trajectory
Definitions
- the present invention generally relates to the field of computer vision technologies and, more particularly, to high-speed automatic multi-object tracking methods and systems with kernelized correlation filters.
- Object tracking is an important research domain in computer vision. Object tracking is the basis for detailed analysis of an object. Based on the object tracking, object trajectory and behavioral analysis can be implemented. Currently, there are two types of object tracking models in the academic field: a recognition-based tracker and a generation-based tracker.
- a recognition-based tracker is better than a generation-based tracker.
- Online machine learning is generally required for the recognition-based tracker.
- a classifier generated through online machine learning is used to identify objects.
- Recognition-based tracking algorithms can adapt to object changes to a certain extent and are robust. However, they require a large number of training samples, the training process is very time-consuming, and it is difficult for them to solve multi-scale problems. To overcome these disadvantages, a circulant matrix method can be used to obtain the training samples. On one hand, a sufficient number of training samples can be obtained to train a classifier with a higher recognition rate; on the other hand, according to the characteristics of the circulant matrix, the Fourier transform and the kernel trick are used to reduce the time required for training the classifier. Thus, the method can resolve the training-sample problem and reduce training time. However, the method cannot solve the multi-scale problems and cannot accelerate the Fourier transform. In addition, the method cannot be extended to multi-object tracking scenarios.
- the disclosed methods and systems are directed to solve one or more problems set forth above and other problems.
- The high-speed automatic multi-object tracking method with kernelized correlation filters can be applied in battlefield surveillance, video monitoring, image compression, image retrieval, human-computer interaction, and so on.
- One aspect of the present disclosure includes a high-speed automatic multi-object tracking method with kernelized correlation filters.
- the method includes obtaining an image frame from a plurality of image frames in a video, extracting a foreground object sequence from the obtained image frame, and determining similarity between each foreground object of the extracted foreground object sequence and a tracked object.
- The method also includes calculating Histogram of Oriented Gradients (HOG) features of the foreground objects with a lower similarity, obtaining training samples for each of the foreground objects with the lower similarity using a circulant matrix, training a classifier via a kernel method accelerated by the Fastest Fourier Transform in the West (FFTW) library, and obtaining tracking points using a sparse optical flow.
- the method includes detecting object matching responses using a detection response function, performing multi-scale analysis for the object based on an optical flow method, and processing a next image frame of the plurality of image frames in the video until the video ends.
- the system includes a video capture module configured to capture a video, an obtaining module configured to obtain an image frame from a plurality of image frames in the video captured by the video capture module, and an extraction module configured to extract a foreground object sequence from the obtained image frame.
- The system also includes an image analyzer configured to determine similarity between each foreground object of the extracted foreground object sequence and a tracked object, obtain training samples for each of the foreground objects with the lower similarity using a circulant matrix, train a classifier via a kernel method accelerated by the Fastest Fourier Transform in the West (FFTW) library, and obtain tracking points using a sparse optical flow.
- the system includes a detection module configured to detect object matching responses using a detection response function and perform multi-scale analysis for the object based on an optical flow method, where a location with a maximum response is a new location of the object.
- Figure 1 illustrates a flow chart of an exemplary high-speed automatic multi-object tracking process with kernelized correlation filters consistent with the disclosed embodiments
- Figure 2 illustrates a schematic diagram of an exemplary high-speed automatic multi-object tracking system with kernelized correlation filters consistent with the disclosed embodiments
- Figure 3 illustrates a video stream being divided into different video components
- Figure 4 illustrates an exemplary computing system consistent with the disclosed embodiments.
- ridge regression is a biased estimation regression method for collinear data analysis.
- Ridge regression is a simple variant of the ordinary least squares method. It gives up the unbiasedness of least squares: by losing some information and reducing accuracy, it obtains regression coefficients that are more practical and more reliable. Ridge regression tolerates ill-conditioned data far better than the least squares method.
- A circulant matrix is a special kind of Toeplitz matrix, where each row vector is rotated one element to the right relative to the preceding row vector.
- Circulant matrices are important because they are diagonalized by a discrete Fourier transform, and hence linear equations that contain them may be quickly solved using a fast Fourier transform.
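As a rough illustration of why this property matters, the sketch below builds a small circulant matrix and solves a linear system with it twice: once with a general dense solver and once with the FFT. It uses NumPy, which the text does not mention; the helper name `circulant` and the sample values are assumptions made for the example.

```python
import numpy as np

def circulant(c):
    """Circulant matrix whose first row is c; each following row is the
    previous row rotated one element to the right."""
    return np.stack([np.roll(c, i) for i in range(len(c))])

# Hypothetical generating vector and right-hand side.
c = np.array([5.0, 1.0, 2.0, 0.5])
b = np.array([1.0, 2.0, 3.0, 4.0])
C = circulant(c)

# Dense solve: cubic cost in general.
x_dense = np.linalg.solve(C, b)

# FFT solve: O(n log n). With this row-rotation convention and NumPy's DFT,
# multiplying by C acts as an element-wise product with conj(fft(c)).
x_fft = np.fft.ifft(np.fft.fft(b) / np.conj(np.fft.fft(c))).real

print(np.allclose(x_dense, x_fft))
```

Which factor carries the complex conjugate depends on the rotation direction and on the sign convention of the DFT, so the conjugate here is specific to this sketch.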
- optical flow or optic flow is the pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer (an eye or a camera) and the scene.
- Optical flow has been commonly described as the apparent motion of brightness patterns in an image sequence. That is, optical flow is the projection of a 3-D motion vector of objects onto the 2-D image plane.
- Kernel methods owe the name to the use of kernel functions, which enable them to operate in a high-dimensional, implicit feature space without ever computing the coordinates of the data in that space, but rather by simply computing the inner products between the images of all pairs of data in the feature space.
- The kernel function may be defined by:
- $K(x_1, x_2) = \langle \varphi(x_1), \varphi(x_2) \rangle$
- $x_1$ and $x_2$ are points (scalars or vectors) in a high-dimensional space
- $\varphi(x_i)$ represents a point converted from a low-dimensional space into a high-dimensional space
- $\langle \cdot , \cdot \rangle$ represents an inner product of vectors.
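To make the kernel idea concrete, here is a minimal sketch using a degree-2 polynomial kernel on 2-D points. The choice of kernel, the explicit feature map `phi`, and the sample points are assumptions for illustration only; the text does not say which kernel the method uses.

```python
import math

def poly2_kernel(x, y):
    """Degree-2 polynomial kernel: k(x, y) = (x . y)^2."""
    return (x[0] * y[0] + x[1] * y[1]) ** 2

def phi(x):
    """Explicit feature map for poly2_kernel on 2-D inputs (3-D feature space)."""
    return (x[0] ** 2, math.sqrt(2.0) * x[0] * x[1], x[1] ** 2)

def inner(u, v):
    """Plain inner product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))

x1, x2 = (1.0, 2.0), (3.0, -1.0)
k_direct = poly2_kernel(x1, x2)      # computed without leaving the 2-D input space
k_mapped = inner(phi(x1), phi(x2))   # inner product in the 3-D feature space
print(math.isclose(k_direct, k_mapped))
```

Both routes give the same value, but the kernel never constructs the feature-space coordinates, which is the point of the kernel trick.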
- Classification (or regression) problems can be divided into two categories: parameter learning and instance-based learning.
- Parameter learning is implemented through a large amount of training data.
- The parameters of the corresponding model are learned from the training data, after which the training data is no longer needed. For new data, appropriate conclusions can be obtained using the learned parameters.
- Instance-based learning is also called memory-based learning.
- Examples of instance-based learning algorithms are the k-nearest neighbor (kNN) algorithm, kernel machines, and RBF networks.
- Instance-based learning stores the training set; when predicting a value or class for a new instance, it computes distances or similarities between this instance and the training instances to make a decision.
- The similarities between this instance and the training instances may be represented by an inner product of vectors. Therefore, kernel methods apply only to instance-based learning.
- obtaining a large number of samples is very important, because a classifier trained by a large number of samples has a higher recognition rate.
- it is time-consuming to train the classifier using a large number of samples. So it is very difficult to meet real-time requirements.
- the common strategy is to randomly select some samples to train the classifier. Although such a strategy makes some sense, the recognition rate of the classifier may be reduced and the tracking performance is decreased.
- The high-speed automatic multi-object tracking method with kernelized correlation filters obtains a large number of samples by using a circulant matrix.
- The time consumption of the method is very low, meeting real-time requirements.
- The high-speed automatic multi-object tracking method with kernelized correlation filters includes a learning phase and a detection phase.
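- A ridge regression algorithm is used. Training minimizes the regularized squared error over the samples $x_i$ and their regression targets $y_i$:
- $\min_{w} \sum_i (w^T x_i - y_i)^2 + \lambda \lVert w \rVert^2$ (1)
- $\lambda$ is a regularization parameter that controls overfitting.
- The goal of training is to determine the parameter w. Based on Equation (1), the parameter w is represented by:
- $w = (X^T X + \lambda I)^{-1} X^T y$ (2)
- X is a matrix of sample data; the data matrix X has one sample $x_i$ per row, and each element of vector y is a regression target $y_i$; I is an identity matrix; and T denotes the matrix transpose.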
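The closed form that the text calls Equation (2) is assumed here to be the standard ridge regression solution, w = (XᵀX + λI)⁻¹Xᵀy. The sketch below computes it with NumPy on synthetic data; the function name `ridge_fit`, the data, and the λ values are illustrative assumptions.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y for w without forming an explicit inverse."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))          # one sample per row
w_true = np.array([1.0, -2.0, 0.5])   # hypothetical ground-truth weights
y = X @ w_true                        # noiseless regression targets

w_ols = ridge_fit(X, y, lam=0.0)      # lam = 0 gives ordinary least squares
w_ridge = ridge_fit(X, y, lam=10.0)   # a larger lam shrinks the coefficients
```

With λ = 0 the noiseless weights are recovered exactly; raising λ trades that exactness for smaller, more stable coefficients, which is the bias the text describes.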
- In the Fourier domain, quantities are usually complex-valued. The complex version of Equation (2) is represented by:
- $w = (X^H X + \lambda I)^{-1} X^H y$ (3)
- $X^H$ is the Hermitian (conjugate) transpose of X. For real-valued data, Equation (3) reduces to Equation (2).
- A base sample is an n × 1 vector representing a patch with the object of interest, denoted x.
- The goal is to train a classifier with both the base sample (a positive example) and several virtual samples obtained by translating it (which serve as negative examples).
- One-dimensional translations of this vector can be modeled by a cyclic shift operator, which is the permutation matrix:
- $P = \begin{bmatrix} 0 & 0 & \cdots & 0 & 1 \\ 1 & 0 & \cdots & 0 & 0 \\ 0 & 1 & \cdots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix}$
- The product $Px = [x_n, x_1, x_2, \ldots, x_{n-1}]^T$ shifts x by one element, modeling a small translation. Chaining u shifts with the matrix power $P^u x$ achieves a larger translation, and a negative u shifts in the reverse direction. $P^u x$ thus denotes the sample shifted u times.
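The cyclic shift operator can be sketched in a few lines of NumPy. The vector length and values below are arbitrary, and `np.roll` is used only as a convenient stand-in for multiplying by powers of P.

```python
import numpy as np

n = 5
x = np.arange(1, n + 1)                          # base sample [1, 2, 3, 4, 5]

# Permutation matrix P with P @ x = [x_n, x_1, ..., x_{n-1}].
P = np.roll(np.eye(n, dtype=int), 1, axis=0)

shift_one = P @ x                                # one-element translation
shift_three = np.linalg.matrix_power(P, 3) @ x   # P^3 x, a larger translation
shift_back = P.T @ x                             # P^-1 equals P^T: reverse shift

# All n cyclic shifts stacked as rows form a circulant data matrix.
samples = np.stack([np.roll(x, u) for u in range(n)])
```

In practice the matrix P is never materialized; the shifts are implicit in the Fourier-domain formulas, and the explicit matrix is shown here only to connect the notation to something executable.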
- The sample data matrix X is obtained by applying the cyclic shift operator P repeatedly to the base sample, one shifted sample per row. Because of this construction, X is a circulant matrix, and all circulant matrices are made diagonal by the Discrete Fourier Transform (DFT), regardless of the generating vector x. This can be expressed as:
- $X = F \, \mathrm{diag}(\hat{x}) \, F^H$ (5)
- F is a constant matrix that does not depend on x, and $\hat{x}$ denotes the DFT of the generating vector.
- Equation (5) expresses the eigendecomposition of a general circulant matrix.
- The shared, deterministic eigenvectors F lie at the root of many uncommon features, such as commutativity or closed-form inversion.
- Equation (3) can be written as:
- where the fraction denotes element-wise division.
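The element-wise Fourier-domain solution can be checked numerically. The sketch below assumes the data matrix consists of all cyclic shifts of one base sample and compares a direct ridge solve against an FFT-based one. The exact form of the element-wise formula is an assumption here: with NumPy's DFT sign convention and right-rotated rows, the numerator turns out to be unconjugated, whereas other conventions place a conjugate there.

```python
import numpy as np

rng = np.random.default_rng(1)
n, lam = 16, 0.1
x = rng.normal(size=n)                                # base sample
offsets = np.minimum(np.arange(n), n - np.arange(n))  # circular distance to index 0
y = np.exp(-0.5 * (offsets / 2.0) ** 2)               # Gaussian-shaped targets

# Data matrix: row i is the base sample shifted by i elements.
X = np.stack([np.roll(x, i) for i in range(n)])

# Direct ridge regression on the full n x n system.
w_direct = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

# Same solution from element-wise operations on DFTs.
x_hat, y_hat = np.fft.fft(x), np.fft.fft(y)
w_fft = np.fft.ifft(x_hat * y_hat / (x_hat * np.conj(x_hat) + lam)).real

print(np.allclose(w_direct, w_fft))
```

The direct route solves an n × n system, while the FFT route needs only a few transforms and an element-wise division, which is where the training speed-up comes from.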
- Correlation filters and the kernel trick are further introduced to accelerate solving Equation (6).
- The solution w can be written as a linear combination of the samples. That is,
- $w = \sum_i \alpha_i \varphi(x_i)$
- The coefficients are given by $\alpha = (K + \lambda I)^{-1} y$, where K is the kernel matrix and $\alpha$ is the vector of coefficients $\alpha_i$ that represent the solution in the dual space.
- K is a circulant matrix.
- The function f(z), that is, the classifier, is evaluated on several image locations, i.e., for several candidate patches. These patches can be modeled by cyclic shifts.
- $f(z) = (K^z)^T \alpha$, where $K^z$ is the (asymmetric) kernel matrix between all training samples and all candidate patches.
- f(z) can be diagonalized to obtain a detection response function.
- The detection response function is represented by:
- $\hat{f}(Z) = \hat{K}^{xz} \odot \hat{\alpha}$
- Z is a candidate position vector of the object
- $\alpha$ is the vector of coefficients $\alpha_i$
- $K^{xz}$ is the kernel correlation of X and Z; a hat denotes the DFT and $\odot$ the element-wise product.
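The training and detection steps can be sketched for a 1-D signal as follows. This is an illustration under stated assumptions, not the patented implementation: it assumes a Gaussian kernel, defines the kernel correlation as k[m] = κ(x, z shifted by m), and uses NumPy's FFT. The names `gaussian_correlation`, `train`, and `detect` and all parameter values are hypothetical.

```python
import numpy as np

def gaussian_correlation(x, z, sigma):
    """k[m] = exp(-||x - roll(z, m)||^2 / sigma^2) for every shift m, via FFT."""
    cross = np.fft.ifft(np.fft.fft(x) * np.conj(np.fft.fft(z))).real
    dist2 = x @ x + z @ z - 2.0 * cross
    return np.exp(-np.maximum(dist2, 0.0) / sigma ** 2)

def train(x, y, sigma, lam):
    """Dual coefficients alpha = (K + lam * I)^-1 y, solved element-wise in the DFT domain."""
    k_xx = gaussian_correlation(x, x, sigma)
    return np.fft.ifft(np.fft.fft(y) / (np.fft.fft(k_xx) + lam)).real

def detect(alpha, x, z, sigma):
    """Classifier response for every cyclic shift of the candidate patch z."""
    k_xz = gaussian_correlation(x, z, sigma)
    return np.fft.ifft(np.fft.fft(k_xz) * np.fft.fft(alpha)).real

rng = np.random.default_rng(2)
n, sigma, lam = 32, 1.0, 1e-3
x = rng.normal(size=n)                                # training patch
offsets = np.minimum(np.arange(n), n - np.arange(n))
y = np.exp(-0.5 * (offsets / 2.0) ** 2)               # target peaks at zero shift
alpha = train(x, y, sigma, lam)

z = np.roll(x, 3)                                     # the object moved by 3 elements
response = detect(alpha, x, z, sigma)
best_shift = int(np.argmax(response))                 # shift of z that realigns it with x
displacement = (n - best_shift) % n                   # estimated motion of the object
```

The location of the maximum response gives the shift that best realigns the candidate with the training patch, which corresponds to taking the maximum-response location as the new object position.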
- Figure 1 illustrates a flow chart of an exemplary high-speed automatic multi-object tracking process with kernelized correlation filters consistent with the disclosed embodiments.
- the high-speed automatic multi-object tracking process with kernelized correlation filters may include the following steps.
- Step 1: An image frame is obtained from a plurality of image frames in a video.
- A video is a sequence of frames, and changes between consecutive frames are relatively small because of the typical video frame rate (e.g., 25 frames/second).
- Grouping or clustering techniques may be applied to separate the whole video into sets of similar frames for further processing.
- Figure 3 illustrates a video stream being divided into different video components.
- a video stream may be divided into scenes, a scene may be divided into shots, and a shot may be divided into frames, etc.
- the frame can be further divided into objects and features of the video frame may be extracted for further processing.
- Step 2: Based on a Gaussian mixture background modeling algorithm, a foreground object sequence is extracted from the image frame.
- the Gaussian Mixture background modeling algorithm is used as a statistical model of the background pixel color generation process. Effectively, the mixture is used as a multi-modal probability density function predicting the probability of occurrence of a pixel value as part of the background scene.
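A full Gaussian mixture model keeps several weighted Gaussians per pixel. The sketch below deliberately swaps in a much simpler stand-in, a single running Gaussian per pixel, just to show the general idea of flagging pixels that are unlikely under the background model. The class name, learning rate, threshold, and variance floor are all assumptions.

```python
import numpy as np

class RunningGaussianBackground:
    """Single-Gaussian-per-pixel background model (a simplification of a mixture)."""

    def __init__(self, first_frame, lr=0.05, k=2.5, init_var=25.0, min_var=4.0):
        self.mean = first_frame.astype(float)
        self.var = np.full(first_frame.shape, init_var, dtype=float)
        self.lr, self.k, self.min_var = lr, k, min_var

    def apply(self, frame):
        """Return a boolean foreground mask and update the model on background pixels."""
        frame = frame.astype(float)
        diff2 = (frame - self.mean) ** 2
        foreground = diff2 > (self.k ** 2) * self.var   # more than k std devs away
        background = ~foreground
        self.mean[background] += self.lr * (frame[background] - self.mean[background])
        self.var[background] += self.lr * (diff2[background] - self.var[background])
        np.maximum(self.var, self.min_var, out=self.var)  # keep variance from collapsing
        return foreground

# Toy usage: a flat background, then a bright 2x2 object appears.
background = np.full((6, 6), 100, dtype=np.uint8)
model = RunningGaussianBackground(background)
for _ in range(20):
    model.apply(background)

frame = background.copy()
frame[2:4, 2:4] = 200
mask = model.apply(frame)
```

Connected regions of the resulting mask would then be grouped into the foreground object sequence; a real mixture model additionally handles multi-modal backgrounds such as flickering lights or swaying foliage.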
- Step 3: Similarity between each foreground object of the extracted foreground object sequence and a tracked object is determined. All foreground objects with a higher similarity are discarded, and Histogram of Oriented Gradients (HOG) features are calculated only for the foreground objects with a lower similarity. A high similarity between a foreground object and a tracked object indicates that the foreground object is already being tracked, so it does not need to be tracked again.
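The text does not specify how similarity is measured. As one possible reading, the sketch below uses bounding-box intersection over union (IoU) and keeps only foreground boxes that do not overlap strongly with any tracked box. The IoU measure, the 0.5 threshold, and the function names are assumptions introduced for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, width, height)."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    inter_w = max(0, min(ax2, bx2) - max(a[0], b[0]))
    inter_h = max(0, min(ay2, by2) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def select_untracked(foreground_boxes, tracked_boxes, threshold=0.5):
    """Keep foreground boxes whose similarity to every tracked box is below threshold."""
    return [box for box in foreground_boxes
            if all(iou(box, t) < threshold for t in tracked_boxes)]

tracked = [(0, 0, 10, 10)]
foreground = [(1, 1, 10, 10), (50, 50, 10, 10)]
new_objects = select_untracked(foreground, tracked)
```

Here the first foreground box overlaps the tracked box heavily and is dropped, while the second is far away and is passed on for feature extraction and training.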
- HOG features are feature descriptors used in computer vision and image processing for the purpose of object detection.
- the technique counts occurrences of gradient orientation in localized portions of an image.
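The core of the descriptor, counting gradient orientations within a localized cell, can be sketched as follows. This is a simplified version under stated assumptions: nine unsigned orientation bins, hard bin assignment, and L2 normalization per cell, without the bin interpolation and block normalization of a full HOG pipeline. The function name is hypothetical.

```python
import numpy as np

def cell_orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell."""
    gy, gx = np.gradient(cell.astype(float))          # derivatives along rows, columns
    magnitude = np.hypot(gx, gy)
    angle = np.degrees(np.arctan2(gy, gx)) % 180.0    # unsigned orientation in [0, 180)
    bin_width = 180.0 / n_bins
    bins = np.minimum((angle / bin_width).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(), minlength=n_bins)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

# Toy cells: intensity increasing left-to-right, and top-to-bottom.
horizontal_ramp = np.tile(np.arange(8.0), (8, 1))
vertical_ramp = horizontal_ramp.T
h_hist = cell_orientation_histogram(horizontal_ramp)
v_hist = cell_orientation_histogram(vertical_ramp)
```

The horizontal ramp puts all of its weight in the 0-degree bin and the vertical ramp in the bin containing 90 degrees, showing how the histogram encodes local edge direction.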
- Step 4: For each of the foreground objects with the lower similarity in Step 3, training samples are obtained using a circulant matrix; using ridge regression, a classifier is obtained via a kernel method accelerated by the Fastest Fourier Transform in the West (FFTW) library; and tracking points are obtained using a sparse optical flow.
- Step 5: Object matching responses are detected using a detection response function, where a location with a maximum response is a new location of the object; and based on an optical flow method, multi-scale analysis for the object is performed.
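- f(z) can be diagonalized to obtain the detection response function:
- $\hat{f}(Z) = \hat{K}^{xz} \odot \hat{\alpha}$
- Z is a candidate position vector of the object
- $K^{xz}$ is the kernel correlation of X and Z; $\alpha$ is the vector of coefficients $\alpha_i$; a hat denotes the DFT and $\odot$ the element-wise product.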
- Step 6: Steps 3, 4, and 5 are repeated to process each foreground object.
- Step 7: A next image frame is obtained from the plurality of image frames in the video, and Steps 2, 3, 4, 5, and 6 are repeated until the video ends. Finally, the system outputs the results of the object detection.
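The control flow of Steps 1 through 7 can be summarized in a short skeleton. Everything concrete here is hypothetical: the function `track_video`, its callable parameters, and the dictionary-based tracker record are placeholders for the components described in the steps, and the exact ordering of update and initialization within a frame is an assumption.

```python
def track_video(frames, extract_foreground, is_tracked, start_tracker, update_tracker):
    """Run the per-frame tracking loop and return tracked positions per frame."""
    trackers = []   # one record per tracked object
    history = []    # tracked positions after each frame
    for frame in frames:                                 # Steps 1 and 7
        for tracker in trackers:                         # Step 5: detect new location
            update_tracker(tracker, frame)
        for obj in extract_foreground(frame):            # Step 2: foreground objects
            if is_tracked(obj, trackers):                # Step 3: similarity check
                continue
            trackers.append(start_tracker(obj, frame))   # Step 4: train a classifier
        history.append([t["position"] for t in trackers])
    return history

# Toy stand-ins: one object whose 1-D position advances by 1 per frame.
frames = [0, 1, 2]
history = track_video(
    frames,
    extract_foreground=lambda frame: [5 + frame],
    is_tracked=lambda obj, trackers: any(abs(t["position"] - obj) <= 1 for t in trackers),
    start_tracker=lambda obj, frame: {"position": obj},
    update_tracker=lambda tracker, frame: tracker.update(position=5 + frame),
)
```

Passing the stages in as callables keeps the loop independent of how foreground extraction, similarity, training, and detection are actually implemented.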
- FIG. 2 illustrates a schematic diagram of an exemplary high-speed automatic multi-object tracking system with kernelized correlation filters consistent with the disclosed embodiments.
- the high-speed automatic multi-object tracking system with kernelized correlation filters 200 may include a video capture module 202, an obtaining module 204, an extraction module 206, an image analyzer 208, and a detection module 210. Certain modules may be omitted and other modules may be included.
- the video capture module 202 may be configured to capture a video.
- the obtaining module 204 may be configured to obtain an image frame from a plurality of image frames in the video captured by the video capture module 202.
- the extraction module 206 may be configured to extract a foreground object sequence from the obtained image frame.
- The image analyzer 208 may be configured to determine similarity between each foreground object of the extracted foreground object sequence and a tracked object, wherein all foreground objects with a higher similarity are discarded, and Histogram of Oriented Gradients (HOG) features are calculated only for the foreground objects with a lower similarity.
- The image analyzer 208 may also be configured to obtain training samples for each of the foreground objects with the lower similarity using a circulant matrix, train a classifier via a kernel method accelerated by the Fastest Fourier Transform in the West (FFTW) library, and obtain tracking points using a sparse optical flow.
- the detection module 210 may be configured to detect object matching responses using a detection response function and perform multi-scale analysis for the object based on an optical flow method, wherein a location with a maximum response is a new location of the object.
- X is a matrix of sample data; the data matrix X has one sample $x_i$ per row, and each element of vector y is a regression target $y_i$. The detection response function is represented by:
- $\hat{f}(Z) = \hat{K}^{xz} \odot \hat{\alpha}$
- Z is a candidate position vector of the object
- $\alpha$ is a vector of coefficients $\alpha_i$
- $K^{xz}$ is the kernel correlation of X and Z; a hat denotes the DFT and $\odot$ the element-wise product.
- FIG. 4 illustrates an exemplary computing system consistent with the disclosed embodiments.
- computing system 400 may include a processor 402, a storage medium 404, a display 406, a communication module 408, a database 410, and peripherals 412. Certain devices may be omitted and other devices may be included.
- Processor 402 may include any appropriate processor or processors. Further, processor 402 can include multiple cores for multi-thread or parallel processing.
- Storage medium 404 may include memory modules, such as ROM, RAM, flash memory modules, and mass storages, such as CD-ROM and hard disk, etc. Storage medium 404 may store computer programs for implementing various processes when the computer programs are executed by processor 402.
- peripherals 412 may include various sensors and other I/O devices, such as keyboard and mouse, and communication module 408 may include certain network interface devices for establishing connections through communication networks.
- Database 410 may include one or more databases for storing certain data and for performing certain operations on the stored data, such as database searching.
- Embodiments consistent with the present disclosure may be implemented with a video camera control system to track multiple objects.
- the control system for the video camera may perform certain camera functions, such as zooming, re-scaling, target recognition, based on the output of the object tracking system in real time. For example, if the object tracking system detects the new location of the tracked object, the camera system may re-apply the zoom based on the newly determined location of the object. If the object tracking system detects the new locations of a plurality of tracked objects, the camera system may re-apply the zoom based on the newly determined locations of the objects.
- Embodiments consistent with the present disclosure may be implemented with a video camera control system to track multiple objects.
- the video camera system may be integrated with an LED (light emitting diode) lighting system.
- the control system for the video camera/LED lighting system may perform certain lighting related functions, such as adjusting lighting on the object for the camera, based on the output of the object tracking system in real time. For example, if the object tracking system detects the new location of the tracked object, the camera system may adjust the lighting, such as re-orient the LED lighting device or adjust the brightness of certain area lit, based on the newly determined location of the object. If the object tracking system detects the new location of a plurality of tracked objects, the camera system may adjust the lighting, such as re-orient the LED lighting device or adjust the brightness of certain area lit, based on the newly determined locations of the objects.
- a high-speed automatic multi-object tracking method with kernelized correlation filters can extract a foreground object sequence from an image frame based on a Gaussian mixture background modeling algorithm and can be extended to multi-object tracking scenarios.
- The method can obtain a sufficient number of training samples for each of the foreground objects using a circulant matrix and train a classifier via a kernel method accelerated by the Fastest Fourier Transform in the West (FFTW) library, reducing the time required for training the classifier.
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Data Mining & Analysis (AREA)
- Artificial Intelligence (AREA)
- Life Sciences & Earth Sciences (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- General Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Software Systems (AREA)
- Computing Systems (AREA)
- Image Analysis (AREA)
Abstract
A high-speed automatic multi-object tracking method with kernelized correlation filters is provided. The method includes obtaining an image frame from a plurality of image frames in a video, extracting a foreground object sequence from the obtained image frame, and determining similarity between each foreground object of the extracted foreground object sequence and a tracked object. The method also includes calculating HOG features of the foreground objects with a lower similarity, obtaining training samples for each of the foreground objects with the lower similarity using a circulant matrix, obtaining a classifier via a kernel method accelerated by FFTW, and obtaining tracking points using a sparse optical flow. Further, the method includes detecting object matching responses using a detection response function, performing multi-scale analysis for the object based on an optical flow method, and processing a next image frame of the plurality of image frames in the video until the video ends.
Description
CROSS-REFERENCES TO RELATED APPLICATIONS
This PCT application claims priority to Chinese Patent Application No. 201410418797.7, filed on August 22, 2014, the entire content of which is incorporated herein by reference.
The present invention generally relates to the field of computer vision technologies and, more particularly, to high-speed automatic multi-object tracking methods and systems with kernelized correlation filters.
Object tracking is an important research domain in computer vision. Object tracking is the basis for detailed analysis of an object. Based on the object tracking, object trajectory and behavioral analysis can be implemented. Currently, there are two types of object tracking models in the academic field: a recognition-based tracker and a generation-based tracker.
In general, a recognition-based tracker is better than a generation-based tracker. Online machine learning is generally required for the recognition-based tracker. Further, a classifier generated through online machine learning is used to identify objects.
In general, recognition-based tracking algorithms can adapt to object changes to a certain extent and are robust. However, they require a large number of training samples, the training process is very time-consuming, and it is difficult for them to solve multi-scale problems. To overcome these disadvantages, a circulant matrix method can be used to obtain the training samples. On one hand, a sufficient number of training samples can be obtained to train a classifier with a higher recognition rate; on the other hand, according to the characteristics of the circulant matrix, the Fourier transform and the kernel trick are used to reduce the time required for training the classifier. Thus, the method can resolve the training-sample problem and reduce training time. However, the method cannot solve the multi-scale problems and cannot accelerate the Fourier transform. In addition, the method cannot be extended to multi-object tracking scenarios.
The disclosed methods and systems are directed to solve one or more problems set forth above and other problems. For example, the high-speed automatic multi-object tracking method with kernelized correlation filters can be applied in battlefield surveillance, video monitoring, image compression, image retrieval, human-computer interaction, and so on.
BRIEF SUMMARY OF THE DISCLOSURE
One aspect of the present disclosure includes a high-speed automatic multi-object tracking method with kernelized correlation filters. The method includes obtaining an image frame from a plurality of image frames in a video, extracting a foreground object sequence from the obtained image frame, and determining similarity between each foreground object of the extracted foreground object sequence and a tracked object. The method also includes calculating Histogram of Oriented Gradients (HOG) features of the foreground objects with a lower similarity, obtaining training samples for each of the foreground objects with the lower similarity using a circulant matrix, training a classifier via a kernel method accelerated by the Fastest Fourier Transform in the West (FFTW) library, and obtaining tracking points using a sparse optical flow. Further, the method includes detecting object matching responses using a detection response function, performing multi-scale analysis for the object based on an optical flow method, and processing a next image frame of the plurality of image frames in the video until the video ends.
Another aspect of the present disclosure includes a high-speed automatic multi-object tracking system with kernelized correlation filters. The system includes a video capture module configured to capture a video, an obtaining module configured to obtain an image frame from a plurality of image frames in the video captured by the video capture module, and an extraction module configured to extract a foreground object sequence from the obtained image frame. The system also includes an image analyzer configured to determine similarity between each foreground object of the extracted foreground object sequence and a tracked object, obtain training samples for each of the foreground objects with the lower similarity using a circulant matrix, train a classifier via a kernel method accelerated by the Fastest Fourier Transform in the West (FFTW) library, and obtain tracking points using a sparse optical flow. Further, the system includes a detection module configured to detect object matching responses using a detection response function and perform multi-scale analysis for the object based on an optical flow method, where a location with a maximum response is a new location of the object.
Other aspects of the present disclosure can be understood by those skilled in the art in light of the description, the claims, and the drawings of the present disclosure.
The following drawings are merely examples for illustrative purposes according to various disclosed embodiments and are not intended to limit the scope of the present disclosure.
Figure 1 illustrates a flow chart of an exemplary high-speed automatic multi-object tracking process with kernelized correlation filters consistent with the disclosed embodiments;
Figure 2 illustrates a schematic diagram of an exemplary high-speed automatic multi-object tracking system with kernelized correlation filters consistent with the disclosed embodiments;
Figure 3 illustrates a video stream being divided into different video components; and
Figure 4 illustrates an exemplary computing system consistent with the disclosed embodiments.
Reference will now be made in detail to exemplary embodiments of the invention, which are illustrated in the accompanying drawings. Wherever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts.
The term “ridge regression” refers to a biased-estimation regression method for analyzing collinear data. Ridge regression is a simple variant of the ordinary least squares method that gives up the unbiasedness of least squares: by sacrificing some information and some accuracy, it yields regression coefficients that are more practical and more reliable. Ridge regression tolerates ill-conditioned data far better than the ordinary least squares method.
The term “circulant matrix” refers to a special kind of Toeplitz matrix, where each row vector is rotated one element to the right relative to the preceding row vector. In numerical analysis, circulant matrices are important because they are diagonalized by a discrete Fourier transform, and hence linear equations that contain them may be quickly solved using a fast Fourier transform.
The term “optical flow or optic flow” is the pattern of apparent motion of objects, surfaces, and edges in a visual scene caused by the relative motion between an observer (an eye or a camera) and the scene. Optical flow has been commonly described as the apparent motion of brightness patterns in an image sequence. That is, optical flow is the projection of a 3-D motion vector of objects onto the 2-D image plane.
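As an illustration of how apparent motion can be estimated at a single point, the following is a minimal sketch of a Lucas-Kanade style least-squares estimate. The disclosure does not name a specific sparse optical flow algorithm, so the choice of Lucas-Kanade, the function name `lucas_kanade_point`, the window size, and the synthetic `scene` are all assumptions made for illustration.

```python
import numpy as np

def lucas_kanade_point(prev, curr, row, col, half_window=7):
    """Estimate the (dx, dy) motion of the window centred at (row, col)."""
    grad_y, grad_x = np.gradient(prev)   # spatial brightness derivatives
    grad_t = curr - prev                 # temporal brightness derivative
    win = (slice(row - half_window, row + half_window + 1),
           slice(col - half_window, col + half_window + 1))
    # Brightness constancy: grad_x * dx + grad_y * dy = -grad_t for each pixel.
    A = np.stack([grad_x[win].ravel(), grad_y[win].ravel()], axis=1)
    b = -grad_t[win].ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow

# Hypothetical smooth synthetic scene, translated by a known sub-pixel amount.
def scene(x, y):
    return np.sin(0.2 * x) + np.cos(0.15 * y)

ys, xs = np.mgrid[0:40, 0:40].astype(float)
true_flow = np.array([0.3, -0.2])        # (dx, dy) in pixels
prev = scene(xs, ys)
curr = scene(xs - true_flow[0], ys - true_flow[1])
estimated_flow = lucas_kanade_point(prev, curr, row=20, col=20)
```

The estimate is only valid for small displacements, because it relies on a first-order approximation of the brightness change; practical trackers typically add image pyramids to handle larger motion.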
Kernel methods owe the name to the use of kernel functions, which enable them to operate in a high-dimensional, implicit feature space without ever computing the coordinates of the data in that space, but rather by simply computing the inner products between the images of all pairs of data in the feature space. The kernel function may be defined by:
$$K(x_1, x_2) = \langle \phi(x_1), \phi(x_2) \rangle,$$
where $x_1$ and $x_2$ are points (scalars or vectors) in the original low-dimensional space; $\phi(x_i)$ represents the point $x_i$ mapped from the low-dimensional space into the high-dimensional feature space; and $\langle \cdot , \cdot \rangle$ represents an inner product of vectors.
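A small numerical check can make the kernel identity concrete. The sketch below uses the homogeneous degree-2 polynomial kernel, chosen here only because its feature map is easy to write down; the names `phi` and `poly2_kernel` are illustrative and not taken from the disclosure.

```python
import numpy as np

def phi(x):
    """Explicit feature map: all pairwise products x_i * x_j."""
    return np.outer(x, x).ravel()

def poly2_kernel(x1, x2):
    """Kernel evaluated in the input space, without forming phi."""
    return float(np.dot(x1, x2)) ** 2

x1 = np.array([1.0, 2.0, 3.0])
x2 = np.array([0.5, -1.0, 2.0])

explicit = float(np.dot(phi(x1), phi(x2)))  # inner product in 9-D feature space
implicit = poly2_kernel(x1, x2)             # same value from a 3-D dot product
```

Both quantities equal 20.25 for these inputs, which is the point of the kernel function: the inner product in the feature space is obtained without computing the coordinates in that space.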
In machine learning, classification (or regression) problems can be divided into two categories: parameter learning and instance-based learning. Parameter learning relies on a large amount of training data: the parameters of the corresponding model are learned from the training data, after which the training data is no longer needed. For new data, appropriate conclusions can be obtained using the learned parameters. Instance-based learning (also called memory-based learning) is a family of learning algorithms that, instead of performing explicit generalization, compare new problem instances with instances seen in training, which have been stored in memory. Examples of instance-based learning algorithms are the k-nearest neighbor (kNN) algorithm, kernel machines, and RBF networks. Instance-based learning stores the training set; when predicting a value or class for a new instance, it computes distances or similarities between this instance and the training instances to make a decision. These similarities may be represented by an inner product of vectors. Therefore, kernel methods apply only to instance-based learning.
For recognition-based tracking algorithms, obtaining a large number of samples is very important, because a classifier trained on a large number of samples has a higher recognition rate. However, training the classifier on a large number of samples is time-consuming, which makes real-time requirements very difficult to meet. The common strategy is to randomly select some samples to train the classifier. Although such a strategy makes some sense, it may reduce the recognition rate of the classifier and degrade the tracking performance.
Therefore, the high-speed automatic multi-object tracking method with kernelized correlation filters obtains a large number of samples by using a circulant matrix. The time consumption of the method is very low, which meets real-time requirements.
The high-speed automatic multi-object tracking method with kernelized correlation filters includes a learning phase and a detection phase.
In the learning phase, a ridge regression algorithm is used. The goal of training is to find a function (that is, a classifier) $f(z) = w^T z$ that minimizes the squared error over the samples $x_i$ and their regression targets $y_i$,
$$\min_{w} \sum_{i} \left( f(x_i) - y_i \right)^2 + \lambda \lVert w \rVert^2 \qquad (1)$$
Here, λ is a regularization parameter that controls overfitting. The goal of training is to determine the parameter w. Based on Equation (1), the parameter w is represented by:
$$w = (X^T X + \lambda I)^{-1} X^T y \qquad (2)$$
where X is a matrix of sample data; the data matrix X has one sample $x_i$ per row, and each element of vector y is a regression target $y_i$; I is an identity matrix; and the superscript T denotes the matrix transpose.
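The closed-form ridge regression solution can be sketched in a few lines of NumPy. The data below are synthetic and the names `ridge_fit` and `w_true` are illustrative assumptions; the check confirms that the gradient of the regularized squared error vanishes at the returned solution.

```python
import numpy as np

def ridge_fit(X, y, lam):
    """Closed-form ridge regression: solve (X^T X + lam * I) w = X^T y."""
    n_features = X.shape[1]
    A = X.T @ X + lam * np.eye(n_features)
    return np.linalg.solve(A, X.T @ y)   # solve() avoids forming an explicit inverse

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 4))             # one sample per row
w_true = np.array([1.0, -2.0, 0.5, 3.0])
y = X @ w_true + 0.01 * rng.normal(size=50)

lam = 1e-3
w = ridge_fit(X, y, lam)

# Gradient of sum_i (w^T x_i - y_i)^2 + lam * ||w||^2 at the solution.
grad = 2.0 * X.T @ (X @ w - y) + 2.0 * lam * w
```

For a general data matrix this costs a dense linear solve, which is the expense that the circulant structure is used to avoid.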
In the Fourier domain, quantities are usually complex-valued. The complex version of Equation (2) is represented by:
$$w^{*} = (X^H X + \lambda I)^{-1} X^H y \qquad (3)$$
where $*$ denotes the complex conjugate; $X^H$ is the Hermitian transpose, i.e., $X^H = (X^{*})^T$, and $X^{*}$ is the complex conjugate of X. If X is real-valued, Equation (3) reduces to Equation (2).
In general, a large system of linear equations must be solved to compute the solution, which can become prohibitive in a real-time setting. A circulant matrix and a kernel trick are used to bypass this limitation.
Then, the matrix of sample data X is reconstructed using a specific circulant matrix (i.e., a permutation matrix) P. Specifically, a base sample is an n × 1 vector representing a patch with the object of interest, denoted x. The goal is to train a classifier with both the base sample (a positive example) and several virtual samples obtained by translating it (which serve as negative examples). One-dimensional translations of this vector can be modeled by a cyclic shift operator, which is the permutation matrix:
$$P = \begin{bmatrix} 0 & 0 & 0 & \cdots & 1 \\ 1 & 0 & 0 & \cdots & 0 \\ 0 & 1 & 0 & \cdots & 0 \\ \vdots & \vdots & \ddots & \ddots & \vdots \\ 0 & 0 & \cdots & 1 & 0 \end{bmatrix} \qquad (4)$$
The product $Px = [x_n, x_1, x_2, \ldots, x_{n-1}]^T$ shifts x by one element, modeling a small translation. A larger translation is achieved by chaining u shifts with the matrix power, $P^u x$; a negative u shifts in the reverse direction. That is, $P^u x$ is the sample shifted u times. For a given base sample x, the sample data matrix X is obtained by applying these cyclic shifts, so that each row of X is a shifted copy of x. By construction, the sample data matrix X is itself a circulant matrix, and all circulant matrices are made diagonal by the Discrete Fourier Transform (DFT), regardless of the generating vector x. This can be expressed as:
$$X = F \, \mathrm{diag}(\hat{x}) \, F^H \qquad (5)$$
where F is a constant matrix that does not depend on x, and $\hat{x}$ denotes the DFT of the generating vector, $\hat{x} = \mathcal{F}(x)$.
Here, a hat ^ is used as shorthand for the DFT of a vector. The constant matrix F is known as the DFT matrix, and is the unique matrix that computes the DFT of any input vector. This is possible because the DFT is a linear operation. Equation (5) expresses the eigendecomposition of a general circulant matrix. The shared, deterministic eigenvectors F lie at the root of many uncommon features, such as commutativity or closed-form inversion.
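The diagonalization property can be checked numerically. The sketch below builds the data matrix from cyclic shifts with `np.roll` (an assumed implementation of the shift operator P) and verifies that multiplying by it is an element-wise product in the Fourier domain. Where the complex conjugate appears depends on the DFT sign convention and on the shift direction; the placement used here is the one that matches `np.fft.fft` and `np.roll`.

```python
import numpy as np

def circulant_from_shifts(x):
    """Data matrix whose row u is the base sample shifted u times (P^u x)."""
    return np.stack([np.roll(x, u) for u in range(len(x))])

x = np.array([1.0, 2.0, 3.0, 4.0])
X = circulant_from_shifts(x)             # X[1] is [4, 1, 2, 3], i.e. P x

v = np.array([0.5, -1.0, 2.0, 0.0])
x_hat = np.fft.fft(x)

direct = X @ v                           # O(n^2) matrix-vector product
via_fft = np.fft.ifft(np.conj(x_hat) * np.fft.fft(v)).real  # O(n log n)

# The eigenvalues of X are the DFT coefficients of x (compared by magnitude).
eig_mag = np.sort(np.abs(np.linalg.eigvals(X)))
dft_mag = np.sort(np.abs(x_hat))
```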
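Equation (5) is applied to the full expression for linear regression (i.e., Equation (3)). Most quantities can be put inside the diagonal. Equation (3) can be written as:
$$\hat{w} = \frac{\hat{x}^{*} \odot \hat{y}}{\hat{x}^{*} \odot \hat{x} + \lambda} \qquad (6)$$
where $\hat{x}$ and $\hat{y}$ represent the DFT of vectors x and y, respectively, and $\odot$ denotes the element-wise product. In Equation (6), the fraction denotes element-wise division.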
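The speed-up can be illustrated by solving the same ridge regression problem twice: once with a dense solve on the explicit circulant data matrix and once element-wise in the Fourier domain. This is a sketch under stated assumptions: rows are built with `np.roll`, and with NumPy's FFT sign convention the numerator is `x_hat * y_hat`; other conventions move the complex conjugate to the numerator. The function name `ridge_circulant_fft` is illustrative.

```python
import numpy as np

def ridge_circulant_fft(x, y, lam):
    """Ridge regression on all cyclic shifts of x, solved element-wise."""
    x_hat = np.fft.fft(x)
    y_hat = np.fft.fft(y)
    # Conjugate placement matches np.fft.fft with np.roll-built rows.
    w_hat = x_hat * y_hat / (np.conj(x_hat) * x_hat + lam)
    return np.fft.ifft(w_hat).real

rng = np.random.default_rng(0)
n, lam = 16, 0.1
x = rng.normal(size=n)                   # base sample
y = rng.normal(size=n)                   # regression targets, one per shift

# Reference: dense solve on the explicit n x n data matrix.
X = np.stack([np.roll(x, u) for u in range(n)])
w_dense = np.linalg.solve(X.T @ X + lam * np.eye(n), X.T @ y)

w_fft = ridge_circulant_fft(x, y, lam)
```

The two solutions agree to numerical precision, while the Fourier-domain version never forms the n × n matrix.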
Correlation filters and the kernel trick are further introduced to accelerate solving Equation (6). With the kernel trick, the solution w can be written as a linear combination of the samples. That is,
$$w = \sum_{i} \alpha_i \, \phi(x_i) \qquad (7)$$
The variables under optimization are thus α, instead of w. Further, the solution to the kernelized version of ridge regression is given by:
$$\alpha = (K + \lambda I)^{-1} y \qquad (8)$$
where K is the kernel matrix and α is the vector of coefficients $\alpha_i$ that represent the solution in the dual space. It can be proven that when the selected kernel is a radial basis function kernel (e.g., Gaussian) or a dot-product kernel (e.g., linear, polynomial), K is a circulant matrix. Equation (8) can then be diagonalized in the same way as the linear case, giving:
$$\hat{\alpha} = \frac{\hat{y}}{\hat{k}^{xx} + \lambda} \qquad (9)$$
where $k^{xx}$ is the first row of the kernel matrix $K = C(k^{xx})$, and again a hat ^ denotes the DFT of a vector.
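The following sketch checks two statements numerically for a Gaussian kernel: that the kernel matrix over all cyclic shifts of a sample is circulant, and that the dual coefficients obtained by element-wise division in the Fourier domain match a dense solve. The kernel bandwidth, the random data, and the function name `gaussian_kernel` are assumptions for illustration.

```python
import numpy as np

def gaussian_kernel(a, b, sigma):
    """Gaussian (RBF) kernel between two vectors."""
    return np.exp(-np.sum((a - b) ** 2) / sigma ** 2)

n, sigma, lam = 8, 2.0, 0.1
rng = np.random.default_rng(1)
x = rng.normal(size=n)                   # base sample
y = rng.normal(size=n)                   # regression targets

# Brute-force kernel matrix over all cyclic shifts of x.
shifts = [np.roll(x, u) for u in range(n)]
K = np.array([[gaussian_kernel(a, b, sigma) for b in shifts] for a in shifts])
k_xx = K[0]                              # first row generates the whole matrix

# Dense dual solution versus the element-wise Fourier-domain solution.
alpha_dense = np.linalg.solve(K + lam * np.eye(n), y)
alpha_fft = np.fft.ifft(np.fft.fft(y) / (np.fft.fft(k_xx) + lam)).real

is_circulant = all(np.allclose(K[u], np.roll(k_xx, u)) for u in range(n))
```

Only the first row of K is ever needed, so training cost drops from a dense solve to a handful of FFTs.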
In the detection phase, to detect the object of interest, the function (that is, a classifier) f(z) is evaluated on several image locations, i.e., for several candidate patches. These patches can be modeled by cyclic shifts. $K^z$ is the (asymmetric) kernel matrix between all training samples and all candidate patches. $f(z) = (K^z)^T \alpha$ is a vector containing the output for all cyclic shifts of z, i.e., the full detection response. f(z) can be diagonalized to obtain a detection response function. The detection response function is represented by:
$$\hat{f}(z) = \hat{k}^{xz} \odot \hat{\alpha} \qquad (10)$$
where z is a candidate position vector of the object, α is the vector of coefficients $\alpha_i$, and $k^{xz}$ is the kernel correlation of x and z.
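The learning and detection phases can be put together in a one-dimensional toy example. This is a sketch, not the disclosed implementation: it uses a Gaussian kernel, a random 1-D signal in place of HOG features, and Gaussian-shaped regression targets peaked at zero shift. All function names and parameter values are assumptions.

```python
import numpy as np

def kernel_correlation(a, b, sigma):
    """k[m] = Gaussian kernel between a and b cyclically shifted by m."""
    cross = np.fft.ifft(np.fft.fft(a) * np.conj(np.fft.fft(b))).real
    sq_dist = np.dot(a, a) + np.dot(b, b) - 2.0 * cross
    return np.exp(-np.maximum(sq_dist, 0.0) / sigma ** 2)

def train(x, y, sigma, lam):
    """Return alpha_hat, the DFT of the dual coefficients."""
    k_xx = kernel_correlation(x, x, sigma)
    return np.fft.fft(y) / (np.fft.fft(k_xx) + lam)

def detect(alpha_hat, x, z, sigma):
    """Classifier response for every cyclic shift of the candidate z."""
    k_xz = kernel_correlation(x, z, sigma)
    return np.fft.ifft(np.fft.fft(k_xz) * alpha_hat).real

n = 64
rng = np.random.default_rng(2)
x = rng.normal(size=n)                               # base sample
dist = np.minimum(np.arange(n), n - np.arange(n))    # circular distance to 0
y = np.exp(-dist ** 2 / (2 * 2.0 ** 2))              # targets peak at shift 0

alpha_hat = train(x, y, sigma=4.0, lam=1e-4)

true_shift = 5
z = np.roll(x, true_shift)                           # object moved by 5 samples
response = detect(alpha_hat, x, z, sigma=4.0)

# The response peaks at the shift that re-aligns z with the base sample.
estimated_shift = (-int(np.argmax(response))) % n
```

The location of the maximum response recovers the displacement, which is how the new object location is read off in the detection phase.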
Figure 1 illustrates a flow chart of an exemplary high-speed automatic multi-object tracking process with kernelized correlation filters consistent with the disclosed embodiments. As shown in Figure 1, the high-speed automatic multi-object tracking process with kernelized correlation filters may include the following steps.
Step 1: an image frame is obtained from a plurality of image frames in a video.
Because a video is a sequence of frames and changes between consecutive frames are relatively small at typical video frame rates (e.g., 25 frames/second), instead of dealing with each frame individually, grouping or clustering techniques may be applied to separate the whole video into different sets of frames, each set containing similar frames, for further processing.
For example, Figure 3 illustrates a video stream being divided into different video components. As shown in Figure 3, a video stream may be divided into scenes, a scene may be divided into shots, and a shot may be divided into frames, etc. A frame can be further divided into objects, and features of the video frame may be extracted for further processing.
Returning to Figure 1, after the image frame is obtained, the process goes to Step 2.
Step 2: Based on a Gaussian mixture background modeling algorithm, a foreground object sequence is extracted from the image frame.
The Gaussian Mixture background modeling algorithm is used as a statistical model of the background pixel color generation process. Effectively, the mixture is used as a multi-modal probability density function predicting the probability of occurrence of a pixel value as part of the background scene.
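To illustrate the idea of a per-pixel statistical background model, the sketch below keeps a single running Gaussian per pixel. This is a deliberate simplification of the Gaussian mixture model named above (one component instead of several), and the class name, learning rate, threshold, and variance floor are all assumed values.

```python
import numpy as np

class RunningGaussianBackground:
    """Per-pixel single-Gaussian background model (simplified, not a mixture)."""

    def __init__(self, first_frame, learning_rate=0.05, threshold=2.5,
                 init_var=25.0, min_var=4.0):
        self.mean = first_frame.astype(np.float64)
        self.var = np.full(first_frame.shape, init_var, dtype=np.float64)
        self.lr = learning_rate
        self.threshold = threshold
        self.min_var = min_var

    def apply(self, frame):
        """Return a boolean foreground mask and update the background model."""
        frame = frame.astype(np.float64)
        diff = frame - self.mean
        # A pixel is foreground if it lies more than `threshold` std devs away.
        foreground = diff ** 2 > (self.threshold ** 2) * self.var
        background = ~foreground
        # Only pixels explained by the background update its statistics.
        self.mean[background] += self.lr * diff[background]
        self.var[background] += self.lr * (diff[background] ** 2
                                           - self.var[background])
        self.var = np.maximum(self.var, self.min_var)
        return foreground

# Synthetic example: static background, then a bright 2 x 3 object appears.
static = np.full((8, 8), 100.0)
model = RunningGaussianBackground(static)
for _ in range(5):
    model.apply(static)

frame = static.copy()
frame[2:4, 3:6] = 200.0
mask = model.apply(frame)
```

A full mixture model keeps several weighted Gaussians per pixel so that multi-modal backgrounds, such as swaying foliage, are not flagged as foreground.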
Step 3: similarity between each foreground object of the extracted foreground object sequence and a tracked object is determined, where all foreground objects with a higher similarity are discarded, and Histogram of Oriented Gradients (HOG) features are calculated only for the foreground objects with a lower similarity. A high similarity between a foreground object and a tracked object indicates that the foreground object is already being tracked. Therefore, the foreground objects with the higher similarity do not need to be tracked again.
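The filtering in Step 3 can be sketched as follows. The disclosure does not specify the similarity measure, so this example assumes bounding-box intersection over union (IoU) with a hypothetical threshold of 0.5; the function names are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def select_new_objects(foreground_boxes, tracked_boxes, threshold=0.5):
    """Keep only foreground objects that are not similar to any tracked object."""
    return [fg for fg in foreground_boxes
            if all(iou(fg, tr) < threshold for tr in tracked_boxes)]

foreground = [(0, 0, 10, 10), (50, 50, 60, 60)]
tracked = [(1, 1, 11, 11)]
new_objects = select_new_objects(foreground, tracked)
```

Here the first foreground box overlaps the tracked box heavily and is dropped, while the second is kept as a new object to be trained on.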
HOG features are feature descriptors used in computer vision and image processing for the purpose of object detection. The technique counts occurrences of gradient orientations in localized portions of an image.
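The core of the descriptor, counting gradient orientations in one localized cell, can be sketched as below. This is a single-cell orientation histogram only; a full HOG descriptor also groups cells into blocks and normalizes across them. The bin count of 9 over 0 to 180 degrees is a common choice assumed here, and the function name is illustrative.

```python
import numpy as np

def orientation_histogram(cell, n_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations."""
    grad_y, grad_x = np.gradient(cell.astype(np.float64))
    magnitude = np.hypot(grad_x, grad_y)
    angle = np.degrees(np.arctan2(grad_y, grad_x)) % 180.0   # unsigned, 0..180
    bin_width = 180.0 / n_bins
    bins = np.minimum((angle / bin_width).astype(int), n_bins - 1)
    hist = np.bincount(bins.ravel(), weights=magnitude.ravel(),
                       minlength=n_bins)
    norm = np.linalg.norm(hist)
    return hist / norm if norm > 0 else hist

# A horizontal brightness ramp has gradients pointing along x (0 degrees).
horizontal_ramp = np.tile(np.arange(8.0), (8, 1))
hist_h = orientation_histogram(horizontal_ramp)

# A vertical ramp has gradients pointing along y (90 degrees).
hist_v = orientation_histogram(horizontal_ramp.T)
```

All of the gradient energy falls into the first bin for the horizontal ramp and into the middle bin for the vertical ramp, which is what makes the histogram a compact description of local edge direction.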
Step 4: for each of the foreground objects with the lower similarity in Step 3, training samples are obtained using a circulant matrix; through ridge regression, a classifier is obtained using the formula $\hat{\alpha} = \hat{y} / (\hat{k}^{xx} + \lambda)$ via a kernel method accelerated by the Fastest Fourier Transform in the West (FFTW) library; and tracking points are obtained using a sparse optical flow.
In the formula, K is a kernel matrix and α is the vector of coefficients $\alpha_i$; λ is the regularization parameter; $k^{xx}$ is the first row of the kernel matrix $K = C(k^{xx})$; and a hat ^ denotes the DFT of a vector.
Step 5: object matching responses are detected using the detection response function $\hat{f}(z) = \hat{k}^{xz} \odot \hat{\alpha}$. A location with a maximum response is a new location of the object; and based on an optical flow method, multi-scale analysis for the object is performed.
$f(z) = (K^z)^T \alpha$ is a vector containing the output for all cyclic shifts of z, i.e., the full detection response. f(z) can be diagonalized to obtain the detection response function. In the detection response function, z is a candidate position vector of the object, and $k^{xz}$ is the kernel correlation of x and z.
Step 6: Steps 3, 4, 5 are repeated to process each foreground object.
Step 7: a next image frame is obtained from the plurality of image frames in the video and Steps 2, 3, 4, 5 and 6 are repeated until the video ends. Finally, the system outputs the results of the object detection.
Figure 2 illustrates a schematic diagram of an exemplary high-speed automatic multi-object tracking system with kernelized correlation filters consistent with the disclosed embodiments. As shown in Figure 2, the high-speed automatic multi-object tracking system with kernelized correlation filters 200 may include a video capture module 202, an obtaining module 204, an extraction module 206, an image analyzer 208, and a detection module 210. Certain modules may be omitted and other modules may be included.
The video capture module 202 may be configured to capture a video. The obtaining module 204 may be configured to obtain an image frame from a plurality of image frames in the video captured by the video capture module 202.
The extraction module 206 may be configured to extract a foreground object sequence from the obtained image frame.
The image analyzer 208 may be configured to determine similarity between each foreground object of the extracted foreground object sequence and a tracked object, wherein all foreground objects with a higher similarity are abandoned, and only Histogram of Oriented Gradients (HOG) features of the foreground objects with a lower similarity are calculated.
Further, the image analyzer 208 may be configured to obtain training samples for each of the foreground objects with the lower similarity using a circulant matrix, train a classifier via a kernel method accelerated by the Fastest Fourier Transform in the West (FFTW) library, and obtain tracking points using a sparse optical flow. The image analyzer 208 may obtain the classifier using the formula $\hat{\alpha} = \hat{y} / (\hat{k}^{xx} + \lambda)$ via the kernel method accelerated by the FFTW, where K is a kernel matrix; α is the vector of coefficients $\alpha_i$; λ is a regularization parameter that controls overfitting; $k^{xx}$ is a first row of the kernel matrix $K = C(k^{xx})$; and a hat ^ denotes the Discrete Fourier Transform (DFT) of a vector.
The detection module 210 may be configured to detect object matching responses using a detection response function and perform multi-scale analysis for the object based on an optical flow method, wherein a location with a maximum response is a new location of the object.
Assuming that X is a matrix of sample data, that the data matrix X has one sample $x_i$ per row, and that each element of vector y is a regression target $y_i$, the detection response function is represented by:
$$\hat{f}(z) = \hat{k}^{xz} \odot \hat{\alpha}$$
where z is a candidate position vector of the object; α is a vector of coefficients $\alpha_i$; and $k^{xz}$ is the kernel correlation of x and z.
Figure 4 illustrates an exemplary computing system consistent with the disclosed embodiments. As shown in Figure 4, computing system 400 may include a processor 402, a storage medium 404, a display 406, a communication module 408, a database 410, and peripherals 412. Certain devices may be omitted and other devices may be included.
Further, peripherals 412 may include various sensors and other I/O devices, such as keyboard and mouse, and communication module 408 may include certain network interface devices for establishing connections through communication networks. Database 410 may include one or more databases for storing certain data and for performing certain operations on the stored data, such as database searching.
Further, although the methods and systems are disclosed for illustrative purposes, similar concepts and approaches can be applied to other object tracking systems. For example, a high-speed automatic multi-object tracking method with kernelized correlation filters can be applied in battlefield surveillance, video monitoring, image compression, image retrieval, human-computer interaction, and so on. Other applications, advantages, alterations, modifications, or equivalents to the disclosed embodiments are obvious to those skilled in the art.
INDUSTRIAL APPLICABILITY AND ADVANTAGEOUS EFFECTS
Without limiting the scope of any claim and/or the specification, examples of industrial applicability and certain advantageous effects of the disclosed embodiments are listed for illustrative purposes. Various alterations, modifications, or equivalents to the technical solutions of the disclosed embodiments can be obvious to those skilled in the art and can be included in this disclosure.
Embodiments consistent with the present disclosure may be implemented with a video camera control system to track multi-objects. The control system for the video camera may perform certain camera functions, such as zooming, re-scaling, target recognition, based on the output of the object tracking system in real time. For example, if the object tracking system detects the new location of the tracked object, the camera system may re-apply the zoom based on the newly determined location of the object. If the object tracking system detects the new locations of a plurality of tracked objects, the camera system may re-apply the zoom based on the newly determined locations of the objects.
Embodiments consistent with the present disclosure may be implemented with a video camera control system to track multi-objects. The video camera system may be integrated with an LED (light emitting diode) lighting system. The control system for the video camera/LED lighting system may perform certain lighting related functions, such as adjusting lighting on the object for the camera, based on the output of the object tracking system in real time. For example, if the object tracking system detects the new location of the tracked object, the camera system may adjust the lighting, such as re-orient the LED lighting device or adjust the brightness of certain area lit, based on the newly determined location of the object. If the object tracking system detects the new location of a plurality of tracked objects, the camera system may adjust the lighting, such as re-orient the LED lighting device or adjust the brightness of certain area lit, based on the newly determined locations of the objects.
Compared to existing technologies, a high-speed automatic multi-object tracking method with kernelized correlation filters can extract a foreground object sequence from an image frame based on a Gaussian mixture background modeling algorithm and can be extended to multi-object tracking scenarios. The method can obtain a sufficient number of training samples for each of the foreground objects using a circulant matrix and train a classifier via a kernel method accelerated by the Fastest Fourier Transform in the West (FFTW) library, reducing the time required for training the classifier. At the same time, the method can handle the multi-scale problem.
Claims (10)
1. A high-speed automatic multi-object tracking method with kernelized correlation filters implemented by an object tracking system, comprising:
    obtaining an image frame from a plurality of image frames in a video;
    extracting a foreground object sequence from the obtained image frame;
    determining similarity between each foreground object of the extracted foreground object sequence and a tracked object;
    calculating Histogram of Oriented Gradients (HOG) features of the foreground objects with a lower similarity;
    obtaining training samples for each of the foreground objects with the lower similarity using a circulant matrix;
    training a classifier via a kernel method accelerated by the Fastest Fourier Transform in the West (FFTW) library;
    obtaining tracking points using a sparse optical flow;
    detecting object matching responses using a detection response function, wherein a location with a maximum response is a new location of the object;
    performing multi-scale analysis for the object based on an optical flow method; and
    processing a next image frame of the plurality of image frames in the video until the video ends.
2. The method according to claim 1, wherein extracting a foreground object sequence from the obtained image frame further includes:
    based on a Gaussian mixture background modeling algorithm, extracting the foreground object sequence from the obtained image frame.
3. The method according to claim 1, wherein:
    all foreground objects with a higher similarity are abandoned, and only Histogram of Oriented Gradients (HOG) features of the foreground objects with the lower similarity are calculated.
4. The method according to claim 1, wherein obtaining a classifier via a kernel method accelerated by FFTW further includes:
    obtaining the classifier using $\hat{\alpha} = \hat{y} / (\hat{k}^{xx} + \lambda)$ via the kernel method accelerated by the FFTW, wherein K is a kernel matrix; α is the vector of coefficients $\alpha_i$; λ is a regularization parameter that controls overfitting; $k^{xx}$ is a first row of the kernel matrix $K = C(k^{xx})$; and a hat ^ denotes the Discrete Fourier Transform (DFT) of a vector.
5. The method according to claim 2, wherein:
    provided that X is a matrix of sample data, the matrix X has one sample $x_i$ per row, and each element of vector y is a regression target $y_i$, the detection response function is represented by $\hat{f}(z) = \hat{k}^{xz} \odot \hat{\alpha}$, wherein z is a candidate position vector of the object; α is a vector of coefficients $\alpha_i$; and $k^{xz}$ is the kernel correlation of x and z.
6. A high-speed automatic multi-object tracking system with kernelized correlation filters, comprising:
    a video capture module configured to capture a video;
    an obtaining module configured to obtain an image frame from a plurality of image frames in the video captured by the video capture module;
    an extraction module configured to extract a foreground object sequence from the obtained image frame;
    an image analyzer configured to:
        determine similarity between each foreground object of the extracted foreground object sequence and a tracked object;
        obtain training samples for each of the foreground objects with the lower similarity using a circulant matrix;
        train a classifier via a kernel method accelerated by the Fastest Fourier Transform in the West (FFTW) library; and
        obtain tracking points using a sparse optical flow; and
    a detection module configured to detect object matching responses using a detection response function and perform multi-scale analysis for the object based on an optical flow method, wherein a location with a maximum response is a new location of the object.
7. The system according to claim 6, wherein the extraction module is further configured to:
    extract the foreground object sequence from the obtained image frame based on a Gaussian mixture background modeling algorithm.
8. The system according to claim 6, wherein:
    all foreground objects with a higher similarity are abandoned, and only Histogram of Oriented Gradients (HOG) features of the foreground objects with the lower similarity are calculated.
9. The system according to claim 6, wherein the image analyzer is further configured to:
    obtain the classifier using $\hat{\alpha} = \hat{y} / (\hat{k}^{xx} + \lambda)$ via the kernel method accelerated by the FFTW, wherein K is a kernel matrix; α is the vector of coefficients $\alpha_i$; λ is a regularization parameter that controls overfitting; $k^{xx}$ is a first row of the kernel matrix $K = C(k^{xx})$; and a hat ^ denotes the Discrete Fourier Transform (DFT) of a vector.
10. The system according to claim 6, wherein:
    provided that X is a matrix of sample data, the matrix X has one sample $x_i$ per row, and each element of vector y is a regression target $y_i$, the detection response function is represented by $\hat{f}(z) = \hat{k}^{xz} \odot \hat{\alpha}$, wherein z is a candidate position vector of the object; α is a vector of coefficients $\alpha_i$; and $k^{xz}$ is the kernel correlation of x and z.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP15833258.5A EP3183690A4 (en) | 2014-08-22 | 2015-07-28 | High-speed automatic multi-object tracking method and system with kernelized correlation filters |
US15/023,841 US9898827B2 (en) | 2014-08-22 | 2015-07-28 | High-speed automatic multi-object tracking method and system with kernelized correlation filters |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201410418797.7A CN104200237B (en) | 2014-08-22 | 2014-08-22 | One kind being based on the High-Speed Automatic multi-object tracking method of coring correlation filtering |
CN201410418797.7 | 2014-08-22 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2016026370A1 true WO2016026370A1 (en) | 2016-02-25 |
Family
ID=52085526
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2015/085270 WO2016026370A1 (en) | 2014-08-22 | 2015-07-28 | High-speed automatic multi-object tracking method and system with kernelized correlation filters |
Country Status (4)
Country | Link |
---|---|
US (1) | US9898827B2 (en) |
EP (1) | EP3183690A4 (en) |
CN (1) | CN104200237B (en) |
WO (1) | WO2016026370A1 (en) |
CN110717930B (en) * | 2019-08-28 | 2024-02-20 | 河南中烟工业有限责任公司 | Mutation moving target tracking method based on extended Wang-Landau Monte Carlo and KCF |
CN110827316A (en) * | 2019-10-29 | 2020-02-21 | 贵州民族大学 | Crowd panic scatter detection method and system, readable storage medium and electronic equipment |
CN111079539B (en) * | 2019-11-19 | 2023-03-21 | 华南理工大学 | Video abnormal behavior detection method based on abnormal tracking |
KR102318397B1 (en) * | 2020-02-07 | 2021-10-27 | 국방과학연구소 | Object tracking method and device that is robust against distance and environment change |
CN111382705A (en) * | 2020-03-10 | 2020-07-07 | 创新奇智(广州)科技有限公司 | Reverse behavior detection method and device, electronic equipment and readable storage medium |
CN111951298B (en) * | 2020-06-25 | 2024-03-08 | 湖南大学 | Target tracking method integrating time sequence information |
CN112561958B (en) * | 2020-12-04 | 2023-04-18 | 武汉华中天经通视科技有限公司 | Correlation filtering image tracking loss judgment method |
CN112785622B (en) * | 2020-12-30 | 2024-04-05 | 大连海事大学 | Method and device for tracking unmanned captain on water surface and storage medium |
CN112750146B (en) * | 2020-12-31 | 2023-09-12 | 浙江大华技术股份有限公司 | Target object tracking method and device, storage medium and electronic equipment |
CN113608618B (en) * | 2021-08-11 | 2022-07-29 | 兰州交通大学 | Hand region tracking method and system |
CN113971684B (en) * | 2021-09-16 | 2024-07-16 | 中国人民解放军火箭军工程大学 | Real-time robust target tracking method based on KCF and SURF features |
CN114596335B (en) * | 2022-03-01 | 2023-10-31 | 广东工业大学 | Unmanned ship target detection tracking method and system |
CN114708307B (en) * | 2022-05-17 | 2022-11-01 | 北京航天晨信科技有限责任公司 | Target tracking method, system, storage medium and device based on correlation filter |
CN115131401B (en) * | 2022-06-20 | 2024-04-12 | 武汉大学 | Remote sensing video target tracking method based on multi-scale multi-direction kernel correlation filtering |
CN115841048B (en) * | 2023-02-13 | 2023-05-12 | 中国人民解放军火箭军工程大学 | Multi-mode simulation data set preparation method based on target mechanism model |
CN117788511B (en) * | 2023-12-26 | 2024-06-25 | 兰州理工大学 | Multi-expansion target tracking method based on deep neural network |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102074034A (en) * | 2011-01-06 | 2011-05-25 | 西安电子科技大学 | Multi-model human motion tracking method |
US20120237082A1 (en) * | 2011-03-16 | 2012-09-20 | Kuntal Sengupta | Video based matching and tracking |
CN103237197A (en) * | 2013-04-10 | 2013-08-07 | 中国科学院自动化研究所 | Self-adaptive multi-feature fusion method for robust tracking |
CN104200237A (en) * | 2014-08-22 | 2014-12-10 | 浙江生辉照明有限公司 | High speed automatic multi-target tracking method based on coring relevant filtering |
Family Cites Families (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5384867A (en) * | 1991-10-23 | 1995-01-24 | Iterated Systems, Inc. | Fractal transform compression board |
US5875108A (en) * | 1991-12-23 | 1999-02-23 | Hoffberg; Steven M. | Ergonomic man-machine interface incorporating adaptive pattern recognition based control system |
WO2001028238A2 (en) * | 1999-10-08 | 2001-04-19 | Sarnoff Corporation | Method and apparatus for enhancing and indexing video and audio signals |
ITRM20060110A1 (en) * | 2006-03-03 | 2007-09-04 | Cnr Consiglio Naz Delle Ricerche | METHOD AND SYSTEM FOR THE AUTOMATIC DETECTION OF EVENTS IN SPORTS ENVIRONMENT |
US7403643B2 (en) * | 2006-08-11 | 2008-07-22 | Fotonation Vision Limited | Real-time face tracking in a digital image acquisition device |
US9036902B2 (en) * | 2007-01-29 | 2015-05-19 | Intellivision Technologies Corporation | Detector for chemical, biological and/or radiological attacks |
WO2010063463A2 (en) * | 2008-12-05 | 2010-06-10 | Fotonation Ireland Limited | Face recognition using face tracker classifier data |
US8659697B2 (en) * | 2010-11-11 | 2014-02-25 | DigitalOptics Corporation Europe Limited | Rapid auto-focus using classifier chains, MEMS and/or multiple object focusing |
US8648959B2 (en) * | 2010-11-11 | 2014-02-11 | DigitalOptics Corporation Europe Limited | Rapid auto-focus using classifier chains, MEMS and/or multiple object focusing |
CN102148921B (en) * | 2011-05-04 | 2012-12-12 | 中国科学院自动化研究所 | Multi-target tracking method based on dynamic group division |
US8873813B2 (en) * | 2012-09-17 | 2014-10-28 | Z Advanced Computing, Inc. | Application of Z-webs and Z-factors to analytics, search engine, learning, recognition, natural language, and other utilities |
US20140169663A1 (en) * | 2012-12-19 | 2014-06-19 | Futurewei Technologies, Inc. | System and Method for Video Detection and Tracking |
CN103871079B (en) * | 2014-03-18 | 2016-11-09 | 南京金智视讯技术有限公司 | Wireless vehicle tracking based on machine learning and light stream |
EP3134850B1 (en) * | 2014-04-22 | 2023-06-14 | Snap-Aid Patents Ltd. | Method for controlling a camera based on processing an image captured by other camera |
Date | Office | Application | Publication | Status |
---|---|---|---|---|
2014-08-22 | CN | CN201410418797.7A | CN104200237B (en) | Active |
2015-07-28 | WO | PCT/CN2015/085270 | WO2016026370A1 (en) | Application Filing |
2015-07-28 | EP | EP15833258.5A | EP3183690A4 (en) | Withdrawn |
2015-07-28 | US | US15/023,841 | US9898827B2 (en) | Active |
Non-Patent Citations (1)
Title |
---|
See also references of EP3183690A4 * |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106204638A (en) * | 2016-06-29 | 2016-12-07 | 西安电子科技大学 | A kind of based on dimension self-adaption with the method for tracking target of taking photo by plane blocking process |
CN107122795B (en) * | 2017-04-01 | 2020-06-02 | 同济大学 | Pedestrian re-identification method based on coring characteristics and random subspace integration |
CN107122795A (en) * | 2017-04-01 | 2017-09-01 | 同济大学 | A kind of pedestrian integrated based on coring feature and stochastic subspace discrimination method again |
CN111383252B (en) * | 2018-12-29 | 2023-03-24 | 曜科智能科技(上海)有限公司 | Multi-camera target tracking method, system, device and storage medium |
CN111383252A (en) * | 2018-12-29 | 2020-07-07 | 曜科智能科技(上海)有限公司 | Multi-camera target tracking method, system, device and storage medium |
CN109903281A (en) * | 2019-02-28 | 2019-06-18 | 中科创达软件股份有限公司 | It is a kind of based on multiple dimensioned object detection method and device |
CN110895820B (en) * | 2019-03-14 | 2023-03-24 | 河南理工大学 | KCF-based scale self-adaptive target tracking method |
CN110895820A (en) * | 2019-03-14 | 2020-03-20 | 河南理工大学 | KCF-based scale self-adaptive target tracking method |
CN110175649B (en) * | 2019-05-28 | 2022-06-07 | 南京信息工程大学 | Rapid multi-scale estimation target tracking method for re-detection |
CN110175649A (en) * | 2019-05-28 | 2019-08-27 | 南京信息工程大学 | It is a kind of about the quick multiscale estimation method for tracking target detected again |
CN110348492A (en) * | 2019-06-24 | 2019-10-18 | 昆明理工大学 | A kind of correlation filtering method for tracking target based on contextual information and multiple features fusion |
CN112686957A (en) * | 2019-10-18 | 2021-04-20 | 北京华航无线电测量研究所 | Quick calibration method for sequence image |
CN111028265B (en) * | 2019-11-11 | 2023-03-31 | 河南理工大学 | Target tracking method for constructing correlation filtering response based on iteration method |
CN111028265A (en) * | 2019-11-11 | 2020-04-17 | 河南理工大学 | Target tracking method for constructing correlation filtering response based on iteration method |
CN111105441A (en) * | 2019-12-09 | 2020-05-05 | 嘉应学院 | Related filtering target tracking algorithm constrained by previous frame target information |
CN111105441B (en) * | 2019-12-09 | 2023-05-05 | 嘉应学院 | Related filtering target tracking method constrained by previous frame target information |
CN111582266A (en) * | 2020-03-30 | 2020-08-25 | 西安电子科技大学 | Configuration target tracking hardware acceleration control method, system, storage medium and application |
CN111582266B (en) * | 2020-03-30 | 2023-04-07 | 西安电子科技大学 | Configuration target tracking hardware acceleration control method, system, storage medium and application |
CN112489034A (en) * | 2020-12-14 | 2021-03-12 | 广西科技大学 | Modeling method based on time domain information characteristic space background |
CN112614158A (en) * | 2020-12-18 | 2021-04-06 | 北京理工大学 | Sampling frame self-adaptive multi-feature fusion online target tracking method |
CN114373151A (en) * | 2021-12-30 | 2022-04-19 | 北京理工大学 | Automatic target tracking method for surveying and typing system |
CN116030098A (en) * | 2023-03-27 | 2023-04-28 | 齐鲁工业大学(山东省科学院) | Weld joint target tracking method and system based on directional characteristic driving |
Also Published As
Publication number | Publication date |
---|---|
US20160239982A1 (en) | 2016-08-18 |
CN104200237B (en) | 2019-01-11 |
US9898827B2 (en) | 2018-02-20 |
CN104200237A (en) | 2014-12-10 |
EP3183690A1 (en) | 2017-06-28 |
EP3183690A4 (en) | 2017-11-22 |
Similar Documents
Publication | Title |
---|---|
US9898827B2 (en) | High-speed automatic multi-object tracking method and system with kernelized correlation filters |
Chen et al. | Underwater object detection using Invert Multi-Class Adaboost with deep learning |
Cen et al. | Fully convolutional siamese fusion networks for object tracking |
Yang et al. | A multi-scale cascade fully convolutional network face detector |
Soomro et al. | Action recognition in realistic sports videos |
US9846821B2 (en) | Fast object detection method based on deformable part model (DPM) |
US9940553B2 (en) | Camera/object pose from predicted coordinates |
Yang et al. | Learning target-oriented dual attention for robust RGB-T tracking |
Sánchez-Lozano et al. | A functional regression approach to facial landmark tracking |
Zakaria et al. | Face detection using combination of Neural Network and Adaboost |
Khan et al. | An efficient algorithm for recognition of human actions |
Masci et al. | Descriptor learning for omnidirectional image matching |
Masood et al. | Classification of Deepfake videos using pre-trained convolutional neural networks |
Li et al. | Learning a dynamic feature fusion tracker for object tracking |
Zhong et al. | Improved localization accuracy by locnet for faster r-cnn based text detection |
Chavate et al. | A comparative analysis of video shot boundary detection using different approaches |
Khan et al. | Dimension invariant model for human head detection |
Khan et al. | Clip: Train faster with less data |
Villamizar et al. | Online learning and detection of faces with low human supervision |
Coppi et al. | Transductive people tracking in unconstrained surveillance |
Yi et al. | Single online visual object tracking with enhanced tracking and detection learning |
Lei et al. | Convolutional restricted Boltzmann machines learning for robust visual tracking |
Reddy et al. | View-Invariant Feature Representation for Action Recognition under Multiple Views. |
Zhong et al. | Jointly feature learning and selection for robust tracking via a gating mechanism |
Liu et al. | Semantic motion concept retrieval in non-static background utilizing spatial-temporal visual information |
Legal Events
Code | Title | Description |
---|---|---|
WWE | Wipo information: entry into national phase | Ref document number: 15023841; Country of ref document: US |
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 15833258; Country of ref document: EP; Kind code of ref document: A1 |
NENP | Non-entry into the national phase | Ref country code: DE |
REEP | Request for entry into the european phase | Ref document number: 2015833258; Country of ref document: EP |
WWE | Wipo information: entry into national phase | Ref document number: 2015833258; Country of ref document: EP |