CN108470354B - Video target tracking method and device and implementation device - Google Patents

Video target tracking method and device and implementation device

Info

Publication number
CN108470354B
Authority
CN
China
Prior art keywords
feature
target object
target
tracking
feature point
Prior art date
Legal status
Expired - Fee Related
Application number
CN201810249416.5A
Other languages
Chinese (zh)
Other versions
CN108470354A
Inventor
周浩
高赟
张晋
袁国武
普园媛
杜欣悦
Current Assignee
Yunnan University YNU
Original Assignee
Yunnan University YNU
Priority date
Filing date
Publication date
Application filed by Yunnan University YNU
Priority to CN201810249416.5A
Publication of CN108470354A
Application granted
Publication of CN108470354B
Expired - Fee Related
Anticipated expiration

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 — Image analysis
    • G06T7/20 — Analysis of motion
    • G06T7/246 — Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06T — IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 — Indexing scheme for image analysis or image enhancement
    • G06T2207/10 — Image acquisition modality
    • G06T2207/10016 — Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention provides a video target tracking method, a video target tracking device and an implementation device. The method comprises the following steps: detecting a feature point set of the current frame within a set image range, and screening the feature point set according to preset screening conditions; carrying out feature point matching, motion estimation and tracking-condition analysis for the target object according to the screened feature point set; and updating the feature point sets of the target object and the neighborhood background, their apparent features, and their inter-frame motion parameters according to the matching result, the motion estimation result and the tracking-condition analysis result, so as to update the tracking strategy of the target object. The tracking result of the invention not only reflects the position of the target object in time, but also accurately reflects the range and the rotation angle of the target object, so that tracking of the target object across video frames has better robustness and stability; at the same time the computational complexity is low, balancing tracking robustness against computation speed.

Description

Video target tracking method and device and implementation device
Technical Field
The invention relates to the technical field of video target tracking, in particular to a video target tracking method, a video target tracking device and a video target tracking implementation device.
Background
Motion tracking means detecting a target of interest in a continuous image sequence to obtain information such as its position, range and shape, so as to establish the correspondence of the target across the continuous video sequence and provide reliable data for subsequent video understanding and analysis. A traditional tracking method builds a model of the target and, when a new frame arrives, tracks the target by searching for the best likelihood of the target model. Because of computational complexity, usually only the position of the tracked target is returned, without information such as its imaging range or rotation in the video, and influences such as cluttered background, occlusion and sudden motion changes easily cause tracking drift or even tracking failure. Therefore, existing traditional tracking algorithms may perform well in terms of computational complexity but sacrifice robustness to some extent, or emphasize robustness but sacrifice computation speed; it is usually difficult to balance both.
Disclosure of Invention
In view of this, the present invention provides a video target tracking method, apparatus and implementation apparatus, so that tracking of the target object across video frames has better robustness and stability while the computational complexity remains low, balancing tracking robustness against computation speed.
In a first aspect, an embodiment of the present invention provides a video target tracking method, including: initializing tracking parameters, the tracking parameters at least comprising the position and range of the target object, the inter-frame motion parameters of the target object and the neighborhood background, the feature point sets of the target object and the neighborhood background, and a plurality of apparent features of the target object and the neighborhood background; detecting a feature point set of the current frame within a set image range and screening the feature point set according to preset screening conditions, the feature point set comprising feature points and the feature vectors corresponding to the feature points; matching the screened feature point set respectively with the feature point sets of the target object and of the neighborhood background corresponding to the previous frame; carrying out motion estimation for the target object according to the screened feature points; analyzing the tracking condition of the target object in the current frame according to the distances between the screened feature points and the center position of the target object and according to the apparent features of the target object; and updating the feature point sets of the target object and the neighborhood background, their apparent features and their inter-frame motion parameters according to the matching result, the motion estimation result and the tracking-condition analysis result, so as to update the tracking strategy of the target object.
In a second aspect, an embodiment of the present invention provides a video target tracking apparatus, including: an initialization module for initializing tracking parameters, the tracking parameters at least comprising the position and range of the target object, the inter-frame motion parameters of the target object and the neighborhood background, the feature point sets of the target object and the neighborhood background, and a plurality of apparent features of the target object and the neighborhood background; a screening module for detecting a feature point set of the current frame within a set image range and screening the feature point set according to preset screening conditions, the feature point set comprising feature points and the feature vectors corresponding to the feature points; a feature point matching module for matching the screened feature point set respectively with the feature point sets of the target object and of the neighborhood background corresponding to the previous frame; a motion estimation module for carrying out motion estimation for the target object according to the screened feature points; a tracking-condition analysis module for analyzing the tracking condition of the target object in the current frame according to the distances between the screened feature points and the center position of the target object and according to the apparent features of the target object; and an updating module for updating the feature point sets of the target object and the neighborhood background, their apparent features and their inter-frame motion parameters according to the matching result, the motion estimation result and the tracking-condition analysis result, so as to update the tracking strategy of the target object.
In a third aspect, an embodiment of the present invention provides a video target tracking implementation apparatus, including a processor and a machine-readable storage medium, where the machine-readable storage medium stores machine-executable instructions capable of being executed by the processor, and the processor executes the machine-executable instructions to implement the video target tracking method.
The embodiment of the invention has the following beneficial effects:
according to the video target tracking method, apparatus and implementation apparatus of the embodiments of the invention, after the tracking parameters are initialized, the feature point set of the current frame is detected within a set image range and screened according to preset screening conditions; the screened feature point set is matched respectively with the feature point sets of the target object and of the neighborhood background corresponding to the previous frame; motion estimation is then carried out for the target object according to the screened feature points, and the tracking condition of the target object in the current frame is analyzed according to the distances between the screened feature points and the center position of the target object and according to the apparent features of the target object; finally, the feature point sets of the target object and the neighborhood background, their apparent features and their inter-frame motion parameters are updated according to the matching result, the motion estimation result and the tracking-condition analysis result, so as to update the tracking strategy of the target object. In this way the tracking result not only reflects the position of the target object in time but also accurately reflects its range and rotation angle, so that tracking of the target object across video frames has better robustness and stability; at the same time the computational complexity is low, balancing tracking robustness against computation speed.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by the practice of the invention as set forth above.
In order to make the aforementioned and other objects, features and advantages of the present invention comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
FIG. 1 is a flow chart of an algorithm for video target tracking according to an embodiment of the present invention;
fig. 2 is a flowchart of a video target tracking method according to an embodiment of the present invention;
FIG. 3 is a flowchart of initializing tracking parameters according to an embodiment of the present invention;
fig. 4 is a flowchart for matching a feature point set of a target object and a feature point set of a neighborhood background respectively corresponding to a previous frame according to a filtered feature point set according to the embodiment of the present invention;
fig. 5 is a flowchart of analyzing a tracking status of a target object in a current frame according to an embodiment of the present invention;
fig. 6 is a schematic diagram of analyzing the tracking status according to the feature point matching situation according to an embodiment of the present invention;
FIG. 7 is a diagram illustrating a process of tracking and locating an object according to an embodiment of the present invention;
fig. 8 is a schematic diagram of updating a feature point set of a target object and a neighborhood background, an apparent feature of the target object and the neighborhood background, and an inter-frame motion parameter of the target object and the neighborhood background according to an embodiment of the present invention;
fig. 9 is a flowchart of updating feature point sets of a target object and a neighborhood background according to an embodiment of the present invention;
FIG. 10 is a flowchart of another video target tracking method according to an embodiment of the present invention;
fig. 11 is a schematic structural diagram of a video target tracking apparatus according to an embodiment of the present invention;
fig. 12 is a schematic structural diagram of an apparatus for tracking a video target according to an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, a flow chart of an algorithm for video target tracking is shown. After target initialization, the initial target state X_0 is obtained, the target appearance model A_0 is initialized, and the tracking stage is entered. After video frame I_t arrives, the target is located in the current frame according to the previous target state and the target model to obtain the state X_t of the target in the current frame, and the appearance model A_t is updated according to the apparent features of the target in the current frame. Occlusion and tracking drift are generally unavoidable during tracking, so to achieve robust tracking the current tracking state is analyzed and the tracking strategy is adjusted accordingly. In addition, to realize robust and stable tracking in a complex scene, a feature model is often established by fusing multiple features, so multi-feature fusion is also a problem that a robust tracking algorithm has to consider.
A typical target tracking system mainly includes the following three steps:
(1) Establishing a target model: whatever the tracking strategy, the tracking algorithm needs to establish an apparent model describing the target and search for the position of the target in the current frame according to this target model.
(2) Searching for and locating the target: video target tracking solutions can generally be divided into stochastic algorithms and deterministic algorithms according to the underlying tracking idea. A stochastic method treats tracking as estimating the optimal state of the target in the current frame given the observation data and the known target state, while a deterministic method reduces tracking to solving an optimal cost function.
(3) Updating the target model: the tracking algorithm compares and analyzes the observation of the feature information in the current frame against the prior knowledge of that feature information (i.e. the target model) to obtain the tracking result of the current frame. In the actual tracking process, however, the apparent features of the tracked target are not constant, and the apparent changes of the target fall into two cases. In one case the appearance of the target in the image frame actually changes, due to factors such as illumination change, deformation and non-planar rotation; here the appearance model should adapt in time to follow the change. In the other case the apparent change is caused by occlusion, noise and the like; here the appearance model should not follow the change in the current frame. The requirements for updating the appearance model are thus quite different in the two cases, so how to deal with changes of the target's apparent features is an important challenge for robust target tracking.
Methods for searching and positioning targets can be divided into stochastic algorithms and deterministic algorithms. The stochastic algorithm converts the target tracking problem into an optimal state estimation problem under a Bayesian framework, wherein the state is a target tracking result and comprises parameters such as the position range of a target in the current frame. The stochastic tracking algorithm is divided into two steps of prediction and observation vector updating, under the condition that prior knowledge of targets such as target representation, initial state and the like is known, the current state of the targets is predicted according to a target motion model, then the maximum posterior probability of the target state is solved through observation data to obtain the optimal estimation of the targets, and the classical stochastic tracking algorithm comprises Kalman filtering (Kalman filter), Particle filtering (Particle filter) and an improved algorithm thereof.
Deterministic algorithms enable tracking by measuring the similarity of a current frame candidate target region to a known target model, often by matching algorithms such as: the Mean-shift algorithm uses the gradient of the non-parameter probability density, and searches an image area which is most similar to the density estimation of the target color kernel in the neighborhood of a previous frame target as a reference in a current frame as the position of the current frame target. The Mean-shift and Cam-shift algorithms are based on the idea to track the target. To improve the robustness of tracking, it is usually necessary to preprocess the image frame sequence, improve the image quality, and build and update the target model.
Whatever target positioning strategy needs to establish a target model and search the optimal matching of the target in the current frame according to the target model. Therefore, establishing a model for describing the appearance of the target is an important factor for determining the robustness of the tracking algorithm, and the primary problem of establishing the apparent modeling of the target is to select apparent features capable of effectively describing the target, and the methods for establishing the apparent model can be divided into the following methods according to the image features used for establishing the apparent model of the target:
(1) apparent features described based on pixel values: directly using pixel values to create target features can be divided into vector-based methods, which directly convert image regions into a high-dimensional vector, and matrix-based methods, which generally create target features directly using a two-dimensional matrix. After the apparent characteristics of the target are established by the method, the target is tracked by calculating the correlation between the current frame image area and the target template, and the target characteristics are updated by using the tracking result in the current frame image.
(2) Apparent features described based on the optical flow method: the optical flow method takes a space-time displacement density field of each pixel in a target image area as a target feature, and generally comprises two types of optical flow calculation methods based on a brightness constant constraint and a non-brightness constant constraint. The non-luminance-invariant constraint method is to geometrically constrain the optical flow field by introducing the spatial context of the pixels. In general, the optical flow method has a high computational complexity.
(3) Apparent features described based on probability density: an image histogram is the most common gray level probability distribution description method, such as Mean-shift and Camshift tracking algorithms, and establishing target features by using the histogram is the most common method in the target tracking algorithm at present.
(4) Apparent features based on covariance description: the target model established based on the covariance can describe the interrelation of all parts in the target.
(5) Apparent features based on profile description: describing the tracked target by using a closed contour curve of the target object boundary, and establishing the apparent characteristic of the target; and the contour features can be continuously updated in a self-adaptive manner along with the scaling, rotation and deformation of the target, so that the method is suitable for occasions of tracking non-rigid targets.
(6) Apparent features based on local feature description: the target is described using only some of its local characteristics, such as distinctive points, lines or local areas; a target model is established from these local features and matched against the local features detected in the current frame, so that even if the target is partially occluded, effective tracking is still possible as long as some local feature points can be detected. Local features commonly used in target tracking include corner features (such as Harris corners), Gabor features, SIFT (Scale-Invariant Feature Transform) features and SURF (Speeded-Up Robust Features) features; a brief detection sketch is given after this list.
(7) Apparent features based on compressed sensing: target tracking can be seen as a problem of finding a sparse representation of a tracked target based on a dynamically constructed and updated sample set. And performing sparse representation on the tracking target by utilizing a norm minimization method according to a target sample set, and evaluating the tracking target based on the sparse representation of the sample under the framework of a Kalman filter. The target tracking can also be regarded as a sparse approximation problem under a particle filter framework, the target is sparsely represented by a regularization least square method, and a candidate target with the minimum error with the target sparsely represented in a new image frame is a tracking target.
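As a concrete illustration of the local-feature-based description in item (6), the sketch below detects keypoints and descriptors in a grayscale frame with OpenCV. ORB is used purely as a freely available stand-in for SURF (which sits in OpenCV's non-free xfeatures2d contrib module); the synthetic frame and parameter values are illustrative assumptions.

```python
import cv2
import numpy as np

# Detect local feature points and their descriptors in a grayscale frame.
# ORB stands in for SURF here; with opencv-contrib and the non-free modules
# enabled, cv2.xfeatures2d.SURF_create() could be used instead.
frame = np.zeros((240, 320), dtype=np.uint8)
cv2.rectangle(frame, (100, 80), (180, 160), 255, -1)   # a synthetic bright "target"

detector = cv2.ORB_create(nfeatures=200)
keypoints, descriptors = detector.detectAndCompute(frame, None)

for kp in keypoints[:5]:
    # each keypoint carries a position, a scale and an orientation; the descriptor
    # row is the local feature vector used for inter-frame matching
    print(kp.pt, kp.size, kp.angle)
print(None if descriptors is None else descriptors.shape)
```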
In the actual tracking process, the apparent features of the tracked target are not invariant, being influenced by factors such as occlusion, noise, illumination change and the changing distance between the target and the detector. Current online adaptive appearance-updating algorithms can be divided into two types: generative methods and discriminative methods. Generative algorithms model only the target appearance, without considering how well the target model discriminates against the background or other targets; such a method first establishes a target appearance model and then searches for and tracks the target by maximizing the likelihood or the posterior probability. Discriminative algorithms treat target tracking as a detection problem: the target is separated from the local area of its neighborhood background by a classifier that is trained and updated online. In the initial frame the user first specifies the target, which yields a feature set describing the target and a feature set describing its neighborhood background; in subsequent frames the target is separated from the background by a binary classifier, and the classifier has to be updated in time to cope with apparent changes.
Existing tracking algorithms always make a trade-off among robustness, tracking accuracy, stability and computational complexity, with the following specific disadvantages:
(1) The tracking result generally only comprises the position of the target, not its range. A traditional tracking algorithm obtains the current target position by establishing a target model and searching for the best match in the current frame. Considering the computational-complexity requirements of tracking applications, the tracking result usually does not include the range of the target, let alone its rotation angle, because locating the tracked target is then only an optimal search over the two-dimensional image; extending the optimal-match search space to three or even four dimensions in order to obtain the range or rotation angle of the target greatly increases the computational complexity. In many applications, however, accurate knowledge of the range and rotation angle of the target is important for further processing.
(2) The tracking robustness under conditions such as occlusion, tracking drift and complex background needs further improvement. Traditional tracking methods are very sensitive to occlusion of the target, even partial occlusion; in addition, tracking drift and tracking loss lack accurate analysis and judgment, so background information is easily introduced into the tracking model of the target, which makes it difficult to handle abnormal situations in the tracking process in time and leads to tracking failure.
(3) Computational complexity is always a key factor of a tracking algorithm, and it is difficult to balance all aspects of performance. An excellent tracker or target tracking method should balance robustness, stability and computational complexity, and is a complete system requiring the cooperation of multiple links; a traditional tracking method may perform well in terms of computational complexity but sacrifices robustness to some extent, or emphasizes robustness but sacrifices computation speed, and it is generally difficult to balance both.
Based on this, the embodiment of the invention provides a video target tracking method, a video target tracking device and a video target tracking implementation device; the technology can be applied to the target tracking process among continuous video frames; the techniques may be implemented in associated software or hardware, as described by way of example below.
Referring to fig. 2, a flow chart of a video target tracking method is shown; the method comprises the following steps:
step S202, initializing tracking parameters; the tracking parameters at least comprise the position and the range of the target object, the interframe motion parameters of the target object and the neighborhood background, and the feature point set of the target object and the neighborhood background; a plurality of apparent features of the target object and the neighborhood background; the target object may also be referred to as a target.
This step S202 may be specifically implemented by: (1) extracting the apparent characteristics of a target object and a neighborhood background in a current frame; the apparent features at least comprise a plurality of feature descriptor vectors, scale factor feature information, color features, texture features and edge features; (2) determining the center position of the target object and the length and width of the target rectangular frame; (3) initializing the inter-frame motion parameters of the target object and the neighborhood background into the difference of corresponding transformation parameters between the current frame and the previous frame; (4) initializing a feature point set of a target object into a rectangular frame of the target object, and detecting the feature point set; initializing a feature point set of a neighborhood background into a feature point set of the neighborhood background detected in a neighborhood region in a preset range outside a target object; (5) and initializing the apparent features of the target object and the neighborhood background into feature vectors of the extracted apparent features.
Considering that video target tracking must find the target within its neighborhood background region in each frame in order to locate it accurately, and that distinguishing the target from the neighborhood background is usually based on the difference between their apparent features, establishing a model only for the target as the basis for tracking makes it difficult to obtain a robust and stable tracking result. Therefore, the model established here includes a model of the target and a model of its neighborhood background.
In a video frame, the inter-frame motion of the background is caused by the motion of the detector, while the change of the position and range of an object between image frames is caused by both the motion of the object and the motion of the detector; the motion law of the object therefore differs from that of the neighborhood background. Correspondingly, the change of the positions of the feature points of the object or of the neighborhood background region between frames actually reflects their inter-frame motion. In practice, neither the object nor the background moves abruptly between frames; especially for applications with high frame-rate sampling, the change of position between frames is always continuous. The inter-frame change of the corresponding feature point positions therefore also has corresponding continuity and does not change abruptly. Observing the inter-frame displacement of a feature point (x, y) in successive video frames, the course of its motion over time can be expressed as
{U_i(1), …, U_i(t)} = { u_i(x, y, t_0) : 1 ≤ t_0 ≤ t }   (1)
where u_i(x, y, t_0) is the inter-frame displacement observed at feature point i at position (x, y) at time t_0. Within a certain time range this displacement can be modeled as uniform inter-frame motion with a certain amount of Gaussian noise superimposed. Therefore the "feature point inter-frame displacement process" is also described with a single-Gaussian model, i.e. a Gaussian distribution N(μ_u, σ_u) is used to simulate {U_i(1), …, U_i(t)}.
On the other hand, the object and its neighborhood background usually have different apparent features such as color, texture and edges, so their corresponding appearance models should not be the same. Also, because of noise, illumination variations, motion of the object and the detector, background changes and so on, different image frames acquired by the same detector at different times will not be identical even if the entire scene is stationary. Therefore, even if a feature point is stable in the video, the image information in the local region at its position (x, y), including the gray values, changes with time. At any time t the position of feature point i is (x, y), and feature_i(x, y, t_0) is the feature value observed at the feature point (x, y) at time t_0; the "feature information process" (the change of the observed feature information over time) in the neighborhood of the feature point can be expressed as
{Feat_i(1), …, Feat_i(t)} = { feature_i(x, y, t_0) : 1 ≤ t_0 ≤ t }   (2)
See the flowchart of initializing tracking parameters shown in fig. 3. The video between successive frames is always relatively stable and does not change abruptly. Even when occlusion occurs, the image area of the occluding object that covers the occluded object is itself relatively stable; on the other hand, the occluded object usually reappears after a period of time, and the previously observed prior knowledge still plays an important role in recognizing it. Correspondingly, the various feature vectors extracted based on SURF feature point detection, including the SURF feature descriptor, the scale factor and other feature information, change relatively slowly over time in the video. This "feature information process" can therefore also be described with a single-Gaussian model, i.e. a Gaussian distribution N(μ_feat, σ_feat) is used to simulate {Feat_i(1), …, Feat_i(t)}. The appearance model of the target is established from the Gaussian distribution models of the feature vectors of the feature points on the target; similarly, the appearance model of the target neighborhood background is formed by the Gaussian distribution models of the feature vectors of the feature points in the target neighborhood background area.
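Both the "feature point inter-frame displacement process" and the "feature information process" above are modelled as single Gaussians maintained online. The following minimal sketch keeps a running mean and variance per observed quantity; the exponential-forgetting update, the learning rate and the large initial variance value are illustrative assumptions rather than the patent's exact update rule.

```python
import numpy as np

class RunningGaussian:
    """Single-Gaussian model N(mu, sigma^2) of an observation process,
    updated online as new per-frame observations arrive."""

    def __init__(self, first_observation, init_var=1e3, alpha=0.05):
        x = np.asarray(first_observation, dtype=float)
        self.mu = x.copy()                     # mean initialised to the first observation
        self.var = np.full_like(x, init_var)   # "comparatively large" initial variance
        self.alpha = alpha                     # forgetting factor (illustrative)

    def update(self, observation):
        x = np.asarray(observation, dtype=float)
        diff = x - self.mu
        self.mu += self.alpha * diff
        self.var = (1 - self.alpha) * self.var + self.alpha * diff * diff

    def mahalanobis(self, observation):
        x = np.asarray(observation, dtype=float)
        return np.sqrt(np.sum((x - self.mu) ** 2 / np.maximum(self.var, 1e-12)))

# Example: model the inter-frame displacement process {U_i(1), ..., U_i(t)}
disp_model = RunningGaussian(first_observation=[1.2, -0.4])
for u in ([1.1, -0.5], [1.3, -0.3], [1.0, -0.4]):
    disp_model.update(u)
print(disp_model.mu, np.sqrt(disp_model.var))
```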
In fig. 3, model initialization is performed at the first frame and includes initialization of the following parameters:
(1) The target is represented by a rectangular frame. Its position and range are initialized as X_0^obj = (cx_0, cy_0, h_0, w_0), where (cx_0, cy_0) are the coordinates of the center of the target rectangular box and (h_0, w_0) are the height and width of the target rectangular box. The target neighborhood background region is initialized as a proportionally larger rectangular box with the same center (cx_0, cy_0), from which the area of the target itself is removed.
(2) The inter-frame motion parameters of the target are initialized as Φ_0^obj, i.e. no translation, no rotation and no scaling; the inter-frame motion parameters of the target neighborhood background are initialized as Φ_0^bg, likewise no translation, no rotation and no scaling. The subscript t denotes the frame ordinal, and t = 0 at the first frame. The mean of the Gaussian model of the target's inter-frame motion is initialized as μ_u^obj; after the second frame arrives, the variance σ_u^obj of this Gaussian model is initialized to the difference between the transformation parameters detected in the first and second frames. The mean of the Gaussian model of the background's inter-frame motion is initialized as μ_u^bg; after the second frame arrives, the variance σ_u^bg of the background motion Gaussian model is likewise initialized to the difference between the transformation parameters detected in the first and second frames.
(3) The SURF feature point sets of the target and the neighborhood background are initialized. SURF feature points are detected in a rectangular area centered at (cx_0, cy_0) and covering the target and its neighborhood background, yielding the SURF feature point set Pg_0. The feature points located inside the target rectangular frame are initialized as the target feature point set Pg_0^obj, and the feature points located in the target neighborhood background region as the background feature point set Pg_0^bg, where Pg_0 = Pg_0^obj ∪ Pg_0^bg.
(4) The appearance models of the target and of the neighborhood background area are initialized. For each feature point, the feature descriptor vector d_i^t of the i-th feature point at coordinates (x, y) in frame t is extracted according to the SURF feature point detection algorithm, and the corresponding scale factor feature information s_i^t is obtained at the same time; depending on the tracked object and the application, further vectors o_i^t such as texture, gradient and mean gray value in the neighborhood of the feature point may also be extracted. Considering that every feature vector selected for each SURF feature point follows a Gaussian distribution as the video frames change, at the first frame (t = 0) the mean μ_feat,i of the corresponding Gaussian component of each feature point is initialized to the observed value feature_i(x, y, 0) of that feature point, and the corresponding Gaussian model variance σ_feat,i of each feature vector is initialized to a comparatively large initial value.
Tracking then starts. After initialization, the established model includes: (1) the position and bounding rectangle of the target, described by the parameters X_0^obj; (2) the motion model, i.e. the Gaussian model of the target motion described by the parameters (μ_u^obj, σ_u^obj) and the Gaussian model of the background motion described by the parameters (μ_u^bg, σ_u^bg); (3) the detected feature point set Pg_0, divided between the target and the neighborhood background into the target feature point set Pg_0^obj and the background feature point set Pg_0^bg; and (4) the feature vector (d_i, s_i, o_i) corresponding to each feature point, together with its Gaussian model parameters (μ_feat,i, σ_feat,i).
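For illustration, the initialized model of the first frame can be held in a few small containers: the target box, the motion parameters of the target and background, and the two feature point sets. The sketch below is one possible layout under the notation used above; the field names, the neighborhood scale factor and the ORB detector (standing in for SURF) are assumptions.

```python
from dataclasses import dataclass, field
import numpy as np
import cv2

@dataclass
class FeaturePoint:
    xy: np.ndarray          # (x, y) position in the frame
    trace_sign: int         # sign of the Hessian trace (+1 used for the ORB stand-in)
    descriptor: np.ndarray  # feature descriptor vector d_i
    scale: float            # scale factor information s_i

@dataclass
class TrackingModel:
    box: tuple                                 # (cx, cy, h, w) of the target rectangle
    obj_motion: tuple = (0.0, 0.0, 0.0, 1.0)   # (ux, uy, theta, rho): no translation/rotation/scaling
    bg_motion: tuple = (0.0, 0.0, 0.0, 1.0)
    obj_points: list = field(default_factory=list)  # Pg_0^obj
    bg_points: list = field(default_factory=list)   # Pg_0^bg

def initialize(frame_gray, box, neighborhood_scale=2.0):
    cx, cy, h, w = box
    model = TrackingModel(box=box)
    detector = cv2.ORB_create(500)             # stand-in for SURF feature detection
    kps, descs = detector.detectAndCompute(frame_gray, None)
    if descs is None:
        return model
    for kp, d in zip(kps, descs):
        x, y = kp.pt
        p = FeaturePoint(np.array([x, y]), +1, d.astype(float), kp.size)
        inside_target = abs(x - cx) <= w / 2 and abs(y - cy) <= h / 2
        inside_region = (abs(x - cx) <= neighborhood_scale * w / 2 and
                         abs(y - cy) <= neighborhood_scale * h / 2)
        if inside_target:
            model.obj_points.append(p)         # target feature point set
        elif inside_region:
            model.bg_points.append(p)          # neighborhood-background feature point set
    return model

demo = np.zeros((240, 320), np.uint8)
cv2.rectangle(demo, (140, 100), (180, 140), 255, -1)
m = initialize(demo, box=(160.0, 120.0, 40.0, 40.0))
print(len(m.obj_points), len(m.bg_points))
```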
Step S204, detecting a feature point set in the current frame in a set image range, and screening the feature point set according to a preset screening condition; the feature point set comprises feature points and feature vectors corresponding to the feature points;
The step S204 may be implemented in the following manner: (1) determining the coordinates of the upper-left and lower-right corners of the image rectangle of the image range to be detected; (2) detecting the feature points inside the image rectangle to obtain the feature point coordinates; (3) calculating the trace of the Hessian matrix of each feature point and the feature vector corresponding to each feature point, the feature vector comprising a feature descriptor vector, scale factor feature information, and color, texture and edge vectors; (4) screening the feature points in the feature point set according to the following conditions: the trace of the Hessian matrix of the feature point has the same sign as the Hessian matrix trace of the corresponding feature point in the previous video frame; the distance between the feature point and the feature point in the previous video frame is smaller than a preset distance threshold; the Euclidean distance between the corresponding feature vectors of the feature point and of the feature point in the previous video frame satisfies a preset feature vector threshold; the displacement length, displacement direction and relative position relation between the feature point and the feature point in the previous video frame satisfy a preset displacement consistency threshold; and, when several feature points are in a many-to-one matching relationship with a feature point of the previous video frame, the feature point with the smallest Euclidean distance is selected from among them.
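The per-pair screening conditions (same Hessian-trace sign, bounded displacement, bounded feature-vector distance) amount to a simple gate applied to each candidate pairing before the finer matching stages. A minimal sketch, with illustrative threshold values:

```python
import numpy as np

def passes_screening(prev_pt, cur_pt, prev_desc, cur_desc,
                     prev_sign, cur_sign,
                     max_displacement=30.0, max_desc_dist=0.5):
    """Gate a candidate pairing (previous-frame point, current-frame point)."""
    # (1) the Hessian traces must have the same sign (both bright or both dark blobs)
    if prev_sign * cur_sign <= 0:
        return False
    # (2) the spatial displacement between frames must stay below a distance threshold
    if np.linalg.norm(np.asarray(cur_pt) - np.asarray(prev_pt)) > max_displacement:
        return False
    # (3) the Euclidean distance between the feature descriptors must stay below a threshold
    if np.linalg.norm(np.asarray(cur_desc) - np.asarray(prev_desc)) > max_desc_dist:
        return False
    return True

# Example gate on a single candidate pair
print(passes_screening((10, 12), (13, 14), np.zeros(64), np.zeros(64) + 0.01, +1, +1))
```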
Step S206, respectively matching the screened feature point set with the feature point set of the target object and the neighborhood background corresponding to the previous frame;
In consideration of computational complexity, feature point detection and matching are usually not performed over the full image; the position and range of the target are determined by inter-frame matching of feature points in each frame. The target localization and tracking condition of the current frame is evaluated in combination with the previously established target motion model, and the evaluation result determines the local image area in which feature points are detected and matched in the new frame. On the basis of evaluating the tracking accuracy, the image range for feature point detection in the next frame is determined from the tracking result of the current frame: the detection rectangle is centered on the center of the target in the current frame, with its height and width obtained by scaling the current target height and width by the threshold constant thrdU. Here (cx_t, cy_t, h_t, w_t) denote the center coordinates and the height and width of the target in the current frame, thrdU is a threshold constant that usually takes a value of 2.4–3, and (LTx, LTy) and (RBx, RBy) respectively denote the coordinates of the top-left and bottom-right corners of the image rectangle used for feature point detection in the next frame.
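A sketch of this search-region rule, under the assumption that the detection rectangle is symmetric about the current target center with height and width scaled by thrdU:

```python
def detection_rectangle(cx, cy, h, w, thrdU=2.7):
    """Image rectangle in which feature points are detected in the next frame.

    (cx, cy, h, w): center coordinates, height and width of the target in the
    current frame; thrdU: threshold constant, typically 2.4-3.
    """
    LTx, LTy = cx - thrdU * w / 2.0, cy - thrdU * h / 2.0   # top-left corner
    RBx, RBy = cx + thrdU * w / 2.0, cy + thrdU * h / 2.0   # bottom-right corner
    return (LTx, LTy), (RBx, RBy)

print(detection_rectangle(320.0, 240.0, 80.0, 40.0))
```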
SURF feature point detection is performed within the rectangular image block defined by the coordinates (LTx, LTy) and (RBx, RBy); the coordinates (x_i, y_i) of each detected feature point in the image are obtained, the trace of its Hessian matrix is calculated, and the feature vector feat_i^t corresponding to each feature point is computed.
The feature points belonging to the target differ from those belonging to the neighborhood background region in the motion law they obey and in apparent characteristics such as color and shape, so the feature point set Pg_{t-1} is divided into two categories: the feature point set Pg_{t-1}^obj located in the target area and the feature point set Pg_{t-1}^bg located in the background area. After the feature point set Pg_t of the current frame is detected, it is matched respectively with the target feature point set Pg_{t-1}^obj and the background feature point set Pg_{t-1}^bg, where TN(t-1) is the number of target feature points at time t-1 and BN(t-1) is the number of background feature points at time t-1. The matching result between feature point sets can be represented by a binary vector over the pairing space, Matched = {0, 1}^M; each entry matched_ij of the vector Matched represents a pairing response, matched_ij = 1 meaning that feature point i of the previous frame is matched with feature point j of the current frame, and otherwise the pairing fails. M denotes the matching space formed by the feature point sets of the two frames and can be described by a two-dimensional matrix of size N(t-1) × N(t), where N(t-1) and N(t) are the numbers of feature points of the previous and current frames participating in matching. A feature point of the previous frame is either matched successfully with exactly one feature point of the current frame or has no matching feature point; that is, a matching is valid when it satisfies the constraint Rstr:
Σ_j matched_ij ≤ 1  for every feature point i of the previous frame   (7)
referring to fig. 4, a flowchart for matching the feature point set after screening with the feature point set of the target object and the neighborhood background corresponding to the previous frame respectively includes the following steps:
(1) Matching based on the Hessian matrix trace. SURF feature points are local extreme points in the image and can be divided into two types according to the kind of extremum: the central gray value of the feature point is either a local minimum or a local maximum within its neighborhood, and obviously no matching should occur between these two types of feature points. Whether the central gray value of a SURF feature point is a local maximum or minimum can be judged from the trace of its Hessian matrix (i.e. the sum of the diagonal elements of the Hessian matrix), denoted Trace: if the trace is positive, the center of the feature point is brighter than the neighborhood pixels; if the trace is negative, the center of the feature point is darker than the neighborhood pixels. The Hessian matrix traces of two feature points i and j to be matched in the pairing space M are compared, and the pair is considered a possible match, i.e. matched_ij = 1, only if the traces of feature point i and feature point j have the same sign; this yields the candidate matching feature point set candidate_matchpair0.
(2) Matching based on the feature point displacement magnitude constraint. Since the inter-frame motion of a feature point does not change abruptly, the feature point j of the current frame that can match the feature point i of the previous frame must lie within a certain range centered on feature point i, and feature points beyond that range cannot match i. That is, the pairs (i, j) whose inter-frame distance Dist_m_ij is greater than the prescribed threshold thre·σ_m are removed from the candidate matching feature point set candidate_matchpair0, giving the new candidate matching feature point set candidate_matchpair1:
candidate_matchpair1 = { (i, j) ∈ candidate_matchpair0 : Dist_m_ij ≤ thre·σ_m }   (8)
(3) Matching based on the feature vector constraint. For the target feature point set Pg_{t-1}^obj and the background feature point set Pg_{t-1}^bg of the previous frame, the distances Dist_d_ij, Dist_s_ij and Dist_o_ij between their feature vectors (d, s, o) and those of the feature point set Pg_t detected in the current frame are calculated respectively. According to the established feature point appearance model, each distance is compared with the variance of the corresponding feature model: if Dist_d_ij, Dist_s_ij and Dist_o_ij are all less than the corresponding thresholds, the pair is considered a match and the pairing response matched_ij is set to 1; otherwise it is considered a mismatch and matched_ij is set to 0:
match_d = 1 if Dist_d_ij < thre·σ_d, else 0   (9)
match_s = 1 if Dist_s_ij < thre·σ_s, else 0   (10)
match_o = 1 if Dist_o_ij < thre·σ_o, else 0   (11)
matched_ij = match_d & match_s & match_o   (12)
In this way a new candidate matching feature point set candidate_matchpair2 is further selected from candidate_matchpair1, where thre is a threshold value, typically set to 2.4–3.
(4) Matching based on the feature point displacement consistency constraint. The inter-frame movement of the feature points in the target area is caused by the change of the target's position between frames, and likewise the inter-frame displacement of the feature points in the background area is caused by the movement of the detector. Therefore the inter-frame position changes of the feature points belonging to the target, Pg_{t-1}^obj, should satisfy one common motion constraint, and similarly the feature points belonging to the background, Pg_{t-1}^bg, should also satisfy one common motion constraint. We generalize such motion constraints into three conditions: the inter-frame displacements of feature points of the same kind should have similar magnitudes, i.e. the lengths of the inter-frame displacement vectors of correctly paired feature points should be consistent; the inter-frame displacements of feature points of the same kind should have similar directions, i.e. the directions of the inter-frame displacement vectors of correctly paired feature points should also be consistent; and, in most cases, the relative position relationship among correctly matched feature points should remain basically unchanged before and after the inter-frame displacement.
Using the idea of the RANSAC algorithm, the matching feature points satisfying the above three conditions are selected from the set candidate_matchpair2. The process can be divided into three steps:
(1) Two pairs of inter-frame paired feature points (i_1, j_1) and (i_2, j_2) satisfying the conditions are chosen and the model parameters are estimated. Feature points i_1 and i_2 are feature points of the previous frame, and j_1 and j_2 are feature points of the current frame. The inter-frame vector a from i_1 to j_1 with length |a| = |i_1, j_1| and the inter-frame vector b from i_2 to j_2 with length |b| = |i_2, j_2| are computed, together with the angle θ_ab between vector a and vector b; the intra-frame vector c from i_1 to i_2 with length |c| = |i_1, i_2| and the intra-frame vector d from j_1 to j_2 with length |d| = |j_1, j_2| are also computed. The mean and variance of the inter-frame vector lengths |a| and |b| are calculated, as are the mean and variance of the intra-frame vector lengths |c| and |d|. The ratios of variance to mean, Par1 for the inter-frame lengths and Par2 for the intra-frame lengths, represent how much the vector lengths of different candidate matching feature points vary between frames and within a frame. Because the motion of a feature point cannot change abruptly and is subject to the motion of the whole target or background area to which it belongs, these two ratios should not be too large, and the angle θ_ab should not be too large either. If the inter-frame displacement variance-to-mean ratio Par1 is less than 0.24, the intra-frame variance-to-mean ratio Par2 is less than 0.2, and the angle θ_ab between the two pairs of feature points (i_1, j_1) and (i_2, j_2) is less than 0.15 rad, then the mean of the inter-frame vector lengths |a| and |b| and the mean of the phase angles of vectors a and b are taken as the model parameters estimated from this pair of feature point pairs and the next step is carried out; otherwise, feature point pairs are reselected.
(2) Thresholds are set using the estimated model parameters, and for each candidate matching feature point pair (i_n, j_n) of the set candidate_matchpair2 the inter-frame displacement length |i_n, j_n| and direction are calculated, together with the mean lengths of the vectors between feature points in the previous frame and in the current frame and the variance of these intra-frame vector lengths. If the deviation of the pair's inter-frame displacement from the estimated model parameters and the intra-frame length variation stay below the corresponding thresholds (0.1 and 0.3 respectively in this embodiment), the feature point pair (i_n, j_n) is regarded as an inlier; otherwise it is an outlier. The inliers in the set candidate_matchpair2 are found and the corresponding number of inliers is recorded.
(3) The estimate with the most inliers is found; if the ratio of that largest inlier count to the total number of pairings in the set is greater than a threshold, or the inlier count itself is greater than a specified threshold, the inliers judged under that estimate are taken as the new candidate pairing feature point set candidate_matchpair3; otherwise the above steps are repeated.
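The displacement-consistency stage is a RANSAC-style loop: hypothesize a common inter-frame displacement from two candidate pairs, count how many other pairs agree, and keep the largest consensus set. The sketch below follows that structure but simplifies the agreement test to displacement length and direction (the intra-frame relative-position checks are omitted), and its tolerances are illustrative:

```python
import random
import numpy as np

def _angle_between(a, b):
    na, nb = np.linalg.norm(a), np.linalg.norm(b)
    if na < 1e-9 or nb < 1e-9:
        return 0.0
    return float(np.arccos(np.clip(np.dot(a, b) / (na * nb), -1.0, 1.0)))

def consensus_pairs(pairs, n_iter=200, len_tol=0.3, ang_tol=0.15, min_inlier_ratio=0.5):
    """pairs: list of (p_prev, p_cur) 2D points matched between consecutive frames.
    Keeps the largest subset whose inter-frame displacements agree in length and direction."""
    disps = [np.asarray(pc, float) - np.asarray(pp, float) for pp, pc in pairs]
    best = []
    for _ in range(n_iter):
        i1, i2 = random.sample(range(len(pairs)), 2)
        a, b = disps[i1], disps[i2]
        la, lb = np.linalg.norm(a), np.linalg.norm(b)
        # the two hypothesis displacements must already agree with each other
        if abs(la - lb) > len_tol * max((la + lb) / 2.0, 1e-6):
            continue
        if _angle_between(a, b) > ang_tol:
            continue
        mean_vec, mean_len = (a + b) / 2.0, (la + lb) / 2.0
        inliers = [pairs[k] for k, d in enumerate(disps)
                   if abs(np.linalg.norm(d) - mean_len) <= len_tol * max(mean_len, 1e-6)
                   and _angle_between(d, mean_vec) <= ang_tol]
        if len(inliers) > len(best):
            best = inliers
        if len(best) >= min_inlier_ratio * len(pairs):
            break
    return best

demo = [((0, 0), (5, 1)), ((10, 0), (15, 1)), ((0, 10), (5, 11)), ((3, 3), (-4, 9))]
print(len(consensus_pairs(demo)))  # the inconsistent last pair should be rejected
```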
(5) Matching based on the feature point pairing uniqueness constraint. In the new candidate paired feature point set candidate_matchpair3 there may be several feature points matched to the same feature point, which is obviously incorrect. All feature point pairings in candidate_matchpair3 that do not satisfy the one-to-one correspondence constraint are detected, and among them only the pairing with the minimum fused distance Dist_integral_ij between the apparent feature vectors is retained as the matching result, the others being deleted; as shown in fig. 4, this further yields the new pairing relation set candidate_matchpair4. The fused distance Dist_integral_ij is obtained by weighted fusion of the distances between the individual feature vectors:
Dist_integral_ij = Σ_n weight_n · Dist_n_ij,  n ∈ {d, s, o}   (13)
where weight_n is the normalized fusion weight of the n-th kind of feature information, n ∈ {d, s, o} being one of the features described above, and Dist_n_ij denotes the distance between the feature vectors selected according to the actual situation of the video. The variance σ_n of each feature distance over time is computed by online learning, and the fusion weight weight_n is defined from these variances in normalized form (equation (14)).
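A sketch of the fused distance and the uniqueness resolution, under the assumption that equation (14) normalizes inverse-variance weights so that larger online-learned variances give smaller weights; the symmetric case (one previous-frame point claimed by several current-frame points) is handled analogously:

```python
import numpy as np

def fusion_weights(variances):
    """Normalized fusion weights from the online-learned variances of each
    feature distance (assumed inverse-variance weighting)."""
    inv = 1.0 / np.maximum(np.asarray(variances, float), 1e-12)
    return inv / inv.sum()

def fused_distance(dists, weights):
    # Dist_integral_ij = sum_n weight_n * Dist_n_ij  (equation (13))
    return float(np.dot(weights, dists))

def enforce_uniqueness(candidates):
    """candidates: list of (i_prev, j_cur, fused_dist).
    Keep, for every current-frame point j, only the candidate with minimal fused distance."""
    best = {}
    for i, j, d in candidates:
        if j not in best or d < best[j][2]:
            best[j] = (i, j, d)
    return list(best.values())

w = fusion_weights([0.04, 0.25, 1.0])          # descriptor / scale / other-feature variances
cands = [(0, 5, fused_distance([0.2, 0.1, 0.3], w)),
         (1, 5, fused_distance([0.5, 0.2, 0.1], w)),
         (2, 7, fused_distance([0.1, 0.1, 0.1], w))]
print(enforce_uniqueness(cands))
```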
step S208, estimating the motion of the target object according to the screened feature points;
Usually a rectangular frame is used to represent the tracking area of the target; the center of the target rectangle in the previous frame is xc_{t-1} = (center_x_{t-1}, center_y_{t-1}), and h_{t-1} and w_{t-1} denote its height and width. The change of the inter-frame position of the target and of its neighborhood background region can be regarded as the superposition of a translation along the horizontal and vertical directions, a scaling about the geometric center and a rotation, described by the transformation parameters Φ_t^obj = (u_t, θ_t, ρ_t) for the target, or Φ_t^bg for the neighborhood background, where u_t = (ux_t, uy_t) is the translation parameter, ρ_t is the scaling parameter and θ_t is the rotation parameter. The transformation between target region frames, equation (15), composes these three components (translation u_t, rotation θ_t and scaling ρ_t).
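One plausible concrete form of the inter-frame transform (15) — scale by ρ_t and rotate by θ_t about the previous target center, then translate by u_t — is sketched below; this composition order is an assumption consistent with the parameter definitions above:

```python
import numpy as np

def transform_point(p, center, ux, uy, theta, rho):
    """Map a previous-frame point to the current frame under the motion
    parameters (u, theta, rho), rotating/scaling about the previous target center."""
    p = np.asarray(p, float)
    c = np.asarray(center, float)
    R = np.array([[np.cos(theta), -np.sin(theta)],
                  [np.sin(theta),  np.cos(theta)]])
    return rho * (R @ (p - c)) + c + np.array([ux, uy])

# A feature point 10 px right of the center, rotated 90 degrees and shifted by (5, 0)
print(transform_point((110, 100), (100, 100), 5, 0, np.pi / 2, 1.0))  # ~[105, 110]
```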
Ideally, the feature points on the target should follow the target and move consistently with it. Let a feature point be at position p_i^{t-1} at time t-1 and at position p_i^t at time t; applying equation (15) to p_i^{t-1} gives the position estimate p̂_i^t of the feature point at time t, which should coincide with the observed position p_i^t of that feature point. In practice, because of noise and changes of the observation angle, the estimate p̂_i^t and the observation p_i^t are not completely identical; the observed value p_i^t can be regarded as the estimate p̂_i^t with Gaussian noise superimposed. After the pairing relation set candidate_matchpair4 of the feature points of the previous and current frames has been obtained, the observation error is defined from the estimates p̂_i^t of the previous-frame feature points in the current frame image and the observations p_i^t of the matched current-frame feature points as a weighted sum of the deviations between them (equation (16)).
solving motion equation parameter meeting minimum observation error by using nonlinear least square curve fitting method
Figure BDA0001607039920000115
And
Figure BDA0001607039920000116
here the weights
Figure BDA0001607039920000117
The robustness of the feature points is used for determining, and the feature points with good robustness are endowed with larger weights.
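For illustration, a minimal Python sketch of this weighted nonlinear least-squares fit is given below (not from the patent; it assumes the similarity-transform model written out above and uses scipy's general-purpose solver):

```python
import numpy as np
from scipy.optimize import least_squares

def fit_interframe_motion(prev_pts, cur_pts, weights, center):
    """Estimate (ux, uy, rho, theta) minimizing the weighted observation error
    between predicted and observed feature positions (cf. formulas (15)-(16))."""
    prev_pts = np.asarray(prev_pts, float)
    cur_pts = np.asarray(cur_pts, float)
    w = np.sqrt(np.asarray(weights, float))
    center = np.asarray(center, float)

    def residuals(par):
        ux, uy, rho, theta = par
        c, s = np.cos(theta), np.sin(theta)
        R = np.array([[c, -s], [s, c]])
        # assumed model: p_t = rho * R(theta) @ (p_{t-1} - center) + center + (ux, uy)
        pred = rho * (prev_pts - center) @ R.T + center + np.array([ux, uy])
        return (w[:, None] * (pred - cur_pts)).ravel()

    sol = least_squares(residuals, x0=[0.0, 0.0, 1.0, 0.0])
    return sol.x  # ux, uy, rho, theta
```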
Step S210, analyzing the tracking condition of the target object in the current frame according to the distance between the screened feature point and the center position of the target object and the apparent feature of the target object;
Step S210 may be specifically implemented as follows: (1) detecting the wrongly classified feature points according to the distance between each feature point and the center position of the target object, removing them, and generating a first feature point set; (2) analyzing whether tracking drift of the target object has occurred in the current video frame according to the apparent features of each feature point in the first feature point set.
In the process of tracking a target object, tracking drift, occlusion (including partial and complete occlusion) and tracking loss are inevitable. To achieve robust tracking, the current tracking result must be analyzed to judge whether tracking is accurate or whether drift, occlusion, loss or other situations have occurred, and the tracking strategy must be adjusted in time accordingly.
Referring to fig. 5, which shows a flowchart for analyzing the tracking status of the target object in the current frame: the pairing relation set candidate_matchpair4 of the feature points of the current and previous frames is obtained, and the inter-frame motion parameters Par_t^O of the target and Par_t^B of its neighborhood background region are estimated with the least-squares method; then whether tracking drift has occurred is analyzed, and whether tracking is normal or occlusion and tracking loss have occurred is analyzed from the feature point matching situation.
Tracking loss usually starts from tracking drift, so accurately judging whether tracking drift has occurred is of great significance for improving the performance of the tracker. In the embodiment of the invention, the feature points PgD_t detected in the current frame are matched against the target feature point set Pg_{t-1}^O and the background-region feature point set Pg_{t-1}^B respectively, the pairing relation set candidate_matchpair4 is found through the multistage cascade of multi-condition constraints, and the inter-frame motion parameters Par_t^O of the target and Par_t^B of the neighborhood background are estimated from it. The rectangular frame of the target in the current frame is then calculated from the target's inter-frame motion parameters; the feature points detected in the current frame that lie inside this target rectangle are classified as target feature points, and those outside it as background feature points.
However, in practical applications, feature points appearing around the target and in the adjacent background area are easily misclassified. If feature points belonging to the background are misclassified as target points, they will be matched again when the feature points of subsequent frames are matched; such misclassified feature points may even be matched successfully between frames and participate in the calculation of the motion model parameters, which can cause tracking drift and even tracking loss in subsequent frames. In addition, owing to noise, similar local image features are also prone to cause tracking drift.
In practical applications, the tracked target is a rigid body, or at least the shape of the target does not change abruptly between frames. Therefore, the relative position of a background feature point within the background does not change abruptly between frames, nor does the relative position of a target feature point on the target; for a rigid target in particular, the relative position of a target feature point on the target changes very little.
On this basis, assuming that the relative position of a target feature point with respect to the target's geometric center does not change abruptly between frames, the misclassified feature points are detected. First, the distance from a target feature point to the geometric center, normalized by the width and height of the target rectangle, is taken as the relative position of that feature point. The relative position rp_t^i of feature point i with coordinates pos_t^i in frame t is then calculated and compared with its relative position rp_{t-1}^i in the previous frame. If the change exceeds 0.25, the feature point is considered misclassified and a cause of tracking drift; it is eliminated from the target feature point set, the pairing relation set candidate_matchpair4 is updated to candidate_matchpair5, and the target inter-frame motion parameters Par_t^O are re-estimated.
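The relative-position check described above can be sketched as follows (illustrative Python only; the data structures and names are assumptions):

```python
import numpy as np

def remove_misclassified(points_prev, points_cur, center_prev, center_cur,
                         size_prev, size_cur, thresh=0.25):
    """Drop target feature points whose center-relative position (normalized by the
    rectangle width/height) jumps by more than `thresh` between frames.

    points_* are dicts {feature_id: (x, y)}, size_* are (w, h)."""
    kept = {}
    for fid, p_cur in points_cur.items():
        if fid not in points_prev:
            continue
        rel_prev = (np.asarray(points_prev[fid], float) - center_prev) / np.asarray(size_prev, float)
        rel_cur = (np.asarray(p_cur, float) - center_cur) / np.asarray(size_cur, float)
        if np.linalg.norm(rel_cur - rel_prev) <= thresh:
            kept[fid] = p_cur        # consistent: keep in candidate_matchpair5
    return kept
```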
Owing to noise and to similar local apparent features in the image space, feature point matching errors may still occur and lead to tracking drift. For this kind of tracking drift, it is assumed that the apparent information of the target does not change abruptly between frames. If tracking drift occurs, part of the detected target range is actually the neighborhood background, so the apparent information extracted in that range is mixed with background information; compared with the prior knowledge of the target's apparent features, the apparent features extracted in that range differ considerably from those extracted under accurate tracking, i.e. a sudden change of the apparent information occurs.
From the previously estimated target inter-frame motion parameters Par_t^O and the positions of the four vertices of the target rectangle in the previous frame, the rectangular region representing the target in the current frame is calculated; the apparent feature vector is extracted in this rectangular region and compared with the historical experience of the apparent feature vector to judge whether a sudden change, and hence drift, has occurred. Whether the tracking of the current frame has drifted is thus converted into the problem of solving a likelihood probability. Since the currently estimated target motion parameters Par_t^O are themselves obtained by comparing apparent feature vectors between frames, judging whether tracking is accurate by analyzing those same apparent feature vectors would not be reliable.
However, a tracking algorithm must on the one hand be robust and on the other hand remain computationally efficient. Compressed sensing theory holds that a signal can be projected onto a suitable transform domain to obtain sparse transform coefficients, and that an efficient observation matrix can then be designed to obtain, from a small number of observations, the useful information hidden in the sparse signal; those few observations can be associated with the original signal. For the video tracking problem, what matters is the effectiveness of the feature vector for the tracking decision, so the target features are converted into a limited number of observations, i.e. a compressed vector, and the dimension-reduced compressed vector is used directly to describe the target and obtain its apparent features. Compressed sensing theory guarantees that the information of the original signal is preserved almost losslessly by the small number of compressed measurements, and the computational complexity of the algorithm is greatly reduced. According to the sparsity assumption, a high-dimensional Haar-like feature vector x ∈ R^N is extracted from the candidate target region; the signal x yields K sparse transform coefficients under an orthogonal transform, so a Gaussian random measurement matrix R ∈ R^{m×N} satisfying the restricted isometry property can be adopted directly to measure and compress it, giving the compressed measurement vector y ∈ R^m. N may be set to 10^6, K to 10 and the compressed measurement vector dimension m to 50. The i-th element of the compressed measurement vector y is the inner product of the i-th row vector of the measurement matrix and the Haar-like feature vector, namely:

y_i = R_i · x,   i.e.  y = R x
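A minimal sketch of the random-projection step is shown below, assuming a dense Gaussian measurement matrix that is generated once and reused for every frame (in practice a sparse measurement matrix is often preferred for speed; nothing here is prescribed by the patent):

```python
import numpy as np

def compress_haar_features(x, m=50, seed=0):
    """Project a high-dimensional Haar-like feature vector x (length N) onto an
    m-dimensional compressed measurement vector y = R @ x with a Gaussian random
    measurement matrix. The fixed seed keeps R identical across frames."""
    x = np.asarray(x, float)
    rng = np.random.default_rng(seed)
    R = rng.normal(0.0, 1.0, size=(m, x.size)) / np.sqrt(m)
    return R @ x
```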
After the target position and range have been determined by SURF feature point matching in the current frame, image blocks of the same size as the target rectangle, centered in the neighborhood of radius smaller than α around that position, are sampled as positive samples; α may be set to 3. In the neighborhood of radius smaller than β and larger than ξ around the current target position (ξ < β; β may be set to the length of the rectangular frame and ξ to 6), 60 image blocks of the same size as the target rectangle are randomly sampled, centered in that neighborhood, as negative samples. The compressed measurement vector y is extracted from the image blocks of the positive and negative samples, and, under the condition that tracking is accurate, the parameters (μ^1, σ^1) and (μ^0, σ^0) of the compressed measurement vectors of the positive and negative samples are calculated and updated with an EM algorithm, where μ^1, σ^1 and μ^0, σ^0 are the mean and standard deviation of the real target samples and of the candidate background samples respectively.
Whether or not a candidate region is the target can be regarded as a two-class problem with result v ∈ {0, 1}, where p(v=1) and p(v=0) denote the probabilities that the candidate region is the target and a non-target respectively, and both priors are taken as 0.5. The conditional distribution p(y_i | v=1) is assumed to obey the Gaussian distribution N(μ_i^1, σ_i^1), and the conditional distribution p(y_i | v=0) the Gaussian distribution N(μ_i^0, σ_i^0). After the m-dimensional compressed measurements of the positive and negative samples have been obtained, the score value of a sample can be calculated:

H(y) = Σ_{i=1}^m log( ( p(y_i | v=1) p(v=1) ) / ( p(y_i | v=0) p(v=0) ) )

which, with equal priors, reduces to the sum of log likelihood ratios.
Because the target's apparent features do not change abruptly between frames, the corresponding score value does not change abruptly between frames either, so the change of the score value also obeys a Gaussian distribution N(μ_{H_T}, σ_{H_T}); the mean and variance of the target score are updated with an EM algorithm after the tracking of each frame is finished. Taking the current tracking result obtained by SURF feature point matching as the sample to be evaluated, the evaluation value H_T(y) of the currently tracked image rectangle is calculated and the target tracking state is judged:

Drift = 1  if | H_T(y) − μ_{H_T} | > thred · σ_{H_T},   otherwise Drift = 0

Drift ∈ {0, 1}, where 1 and 0 denote the presence and absence of tracking drift respectively, and thred is a predefined threshold constant that may be set to 2.4–3.
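For illustration, a small Python sketch of the score computation and the drift decision under the Gaussian assumptions above (the function names, the equal-prior cancellation and the numeric guards are assumptions, not the patent's exact formulas):

```python
import numpy as np

def gaussian_pdf(y, mu, sigma):
    sigma = np.maximum(sigma, 1e-6)
    return np.exp(-0.5 * ((y - mu) / sigma) ** 2) / (np.sqrt(2 * np.pi) * sigma)

def sample_score(y, mu1, sig1, mu0, sig0):
    """Naive-Bayes style score of a compressed measurement vector y:
    sum of log likelihood ratios between the target and background Gaussians
    (the equal priors p(v=1) = p(v=0) = 0.5 cancel out)."""
    return float(np.sum(np.log(gaussian_pdf(y, mu1, sig1) + 1e-12)
                        - np.log(gaussian_pdf(y, mu0, sig0) + 1e-12)))

def drift_detected(score, score_mean, score_std, thred=2.5):
    """Flag tracking drift when the current score deviates from its Gaussian
    history by more than thred standard deviations (thred ~ 2.4-3)."""
    return abs(score - score_mean) > thred * max(score_std, 1e-6)
```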
Before the current frame arrives, the known feature point set Pg_{t-1} comprises the target feature point set Pg_{t-1}^O and the target neighborhood background feature point set Pg_{t-1}^B, which are matched respectively against the feature point set PgD_t detected in the current frame. Part of the feature points can be matched: the matched target feature points and the matched background feature points form two sets, while the remaining feature points that cannot be matched form the unmatched target feature point set and the unmatched background feature point set respectively.
Referring to fig. 6, which shows a schematic diagram of analyzing the tracking status from the feature point matching situation, the current tracking situation can be analyzed preliminarily from the spatial distribution of the matched target and background feature point sets. In fig. 6 (a), both the matched target feature point set and the matched background feature point set are non-empty and lie in their respective regions: tracking is normal. In (b), both sets are non-empty, but some matched background feature points lie inside the target region of the current frame: the target may be partially occluded. In (c), the matched target feature point set is empty while the matched background feature point set is not, i.e. no feature point belonging to the target has been matched successfully, which often corresponds to tracking loss or complete occlusion of the target. In (d), both sets are empty, i.e. no feature point of the previous frame has been matched, which corresponds to tracking loss.
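The four cases of fig. 6 can be summarized in a simple decision helper such as the following Python sketch (illustrative only; the data structures are assumed):

```python
def analyse_matching(matched_target, matched_background, target_rect):
    """Preliminary tracking-state analysis from the spatial distribution of the
    matched feature point sets (cf. fig. 6 (a)-(d)).

    matched_target / matched_background: lists of (x, y); target_rect: (x0, y0, x1, y1)."""
    def inside(p):
        x, y = p
        x0, y0, x1, y1 = target_rect
        return x0 <= x <= x1 and y0 <= y <= y1

    if not matched_target:
        return "tracking lost" if not matched_background else "lost or fully occluded"
    if any(inside(p) for p in matched_background):
        return "possible partial occlusion"
    return "normal tracking"
```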
The above process may also be referred to as the target tracking and positioning process. As shown in fig. 7, inter-frame matching of SURF feature points is used to calculate the inter-frame displacement parameters of the target and of its neighboring background. After the t-th frame arrives, the region where the target may appear in the new frame is determined from the historical knowledge of the target's motion, SURF feature points are detected in that region, and the detected SURF feature points are matched respectively against the target feature point set and the background feature point set of the previous frame. To find as many correctly matched feature point pairs between frames as possible while avoiding wrong matches as far as possible, a cascade of multiple constraint conditions can be adopted, gradually eliminating wrong matches from the candidate matched feature point set until the correct matches are obtained (see the sketch after this paragraph). Specifically, the correct matches between the feature point set of the current frame and that of the previous frame can be found under constraint conditions such as: the inter-frame displacement of a feature point does not change abruptly, the apparent features of a feature point do not change abruptly, and the inter-frame displacement of feature points belonging to the target stays consistent with the overall target motion. The inter-frame motion parameters of the target are then estimated from the matched feature points, thereby achieving target tracking.
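The cascaded elimination of wrong matches mentioned above can be sketched as a staged filter, for example (illustrative Python; the thresholds, field names and the prediction callback are assumptions):

```python
import math

def cascade_match(candidates, max_disp, max_appearance_change, motion_tol, predict):
    """Multi-stage, multi-constraint elimination of wrong matches: each stage removes
    candidate pairs violating one constraint, so the candidate set shrinks toward the
    final matches.

    candidates: list of dicts with keys 'prev_pos', 'cur_pos', 'appearance_dist'.
    predict(prev_pos) -> expected current position under the global motion estimate."""
    def dist(a, b):
        return math.hypot(a[0] - b[0], a[1] - b[1])

    # stage 1: inter-frame displacement must not change abruptly
    stage1 = [c for c in candidates if dist(c['prev_pos'], c['cur_pos']) <= max_disp]
    # stage 2: the apparent feature of the point must not change abruptly
    stage2 = [c for c in stage1 if c['appearance_dist'] <= max_appearance_change]
    # stage 3: displacement must stay consistent with the overall target motion
    stage3 = [c for c in stage2 if dist(predict(c['prev_pos']), c['cur_pos']) <= motion_tol]
    return stage3
```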
Step S212, updating the feature point set of the target object and the neighborhood background, the apparent features of the target object and the neighborhood background, and the interframe motion parameters of the target object and the neighborhood background according to the matching result, the motion estimation result and the tracking condition analysis result, so as to update the tracking strategy of the target object.
Referring to fig. 8, which shows a schematic diagram of updating the feature point sets, the apparent features and the inter-frame motion parameters of the target object and the neighborhood background: the embodiment of the invention divides the detected feature points into the target feature point set and the background feature point set, and performs target tracking through matching of the SURF feature points detected in the current frame. In the tracking process, owing to factors such as noise, illumination change and background change, the tracking model and the tracking strategy need to be adjusted in time according to the changes in the video in order to achieve stable tracking. In practical applications, not all feature points in a feature point set can be matched: some either disappear or go unmatched for a long time, new feature points keep appearing, and the number of feature points and their matching situation change, so the feature point set needs to be updated. The appearance information corresponding to the feature points changes over time, and the corresponding appearance model should reflect this change in time; the inter-frame motion laws of the target and the neighborhood background also change, so the corresponding motion model should likewise be updated in time.
Specifically, the step of updating the feature point sets of the target object and the neighborhood background includes: (1) classifying the feature points in the feature point set according to the matching result to obtain several subsets of feature points, comprising the successfully matched subset and the unmatched subset, each of which in turn contains feature points on the target object and feature points on the neighborhood background; (2) deleting, from the feature point set corresponding to the previous frame, the unmatched feature points that have not been matched within the recent frames and whose matching-failure measure exceeds a set threshold, where the recent frames are a set number of consecutive video frames before the previous frame; (3) adding the feature points in the feature point set of the current frame to the feature point set corresponding to the previous frame according to the tracking state of the current frame; (4) updating the position coordinates of the feature points in the feature point set corresponding to the previous frame to the position coordinates of the corresponding feature points in the current frame.
Referring to fig. 9, which shows a flowchart of updating the feature point sets of the target object and the neighborhood background: before frame t arrives, the tracker has established the feature point set Pg_{t-1}, comprising the target feature point set Pg_{t-1}^O and the background feature point set Pg_{t-1}^B, which are matched respectively against the feature point set PgD_t detected in frame t. After matching, the feature points in PgD_t should be classified into target feature points and background feature points and, together with Pg_{t-1}, form the new feature point set Pg_t; some feature points in Pg_{t-1} should be eliminated, and the retained ones are merged into Pg_t after their coordinate positions have been updated.

The feature point set Pg_{t-1} established at the end of frame t-1 comprises the feature point set located on the target, Pg_{t-1}^O, and the feature point set located on the target's neighborhood background, Pg_{t-1}^B. The two sets are matched separately, and the classification attribute of a feature point is not changed regardless of whether it is matched. After the feature point set PgD_t detected in frame t has been matched against Pg_{t-1}^O and Pg_{t-1}^B, the successfully matched feature points are classified as target feature points and background feature points respectively; the feature points that are not matched successfully are denoted Pg_new_t, namely Pg_new_t = PgD_t − (matched target feature points ∪ matched background feature points).
Therefore, the feature points whose type still needs to be determined are the feature points Pg_new_t that are detected in the current frame but not matched successfully. According to the position pos_t^i of each feature point i in the set Pg_new_t, the position and range of the currently tracked target, and the tracking state, the unmatched feature points Pg_new_t are classified into the two classes target and background, giving the newly added target and background feature point sets. These two sets are merged respectively with the feature point sets Pg_{t-1}^O and Pg_{t-1}^B of the previous frame to obtain the feature point set Pg_t of frame t.
The newly appearing feature points in Pg_new_t, detected in the current frame but not matched, are usually added to the corresponding feature point set. However, it is not feasible to let the feature point set grow without bound as the video frames accumulate, so the inter-frame matching situation of the feature points must generally be analyzed and the number of feature points kept relatively stable.
The number of times each feature point has been matched in the recent period reflects the robustness of the local image region corresponding to that feature point in the recent video: the more matches recently, the more stable the image information of that local region; conversely, if it has gone unmatched for a long time, the image information of that local region is considered easily affected by noise and other factors, and therefore relatively fragile. As mentioned above, robust feature points should be given larger weights w_i in the least-squares estimation of the motion model parameters in formula (16), since their reliability is higher; conversely, fragile feature points are given smaller weights. A parameter M_t^i is set to describe the reliability of feature point i at time t. After the inter-frame matching of the feature points is finished, the parameter M_t^i of each feature point is updated. For a matched feature point i, the update is:

M_t^i = M_{t-1}^i + Inc

For an unmatched feature point i, the coefficient M_t^i is updated as follows (formula (23)):

M_t^i = M_{t-1}^i − Dec

where Inc and Dec are constants; M_t^i is an important basis for deleting feature points, and Inc may be set to 1 and Dec to 0.5.
For an unmatched feature point i, if the corresponding M_t^i becomes too small, the feature point has not appeared in the video for a long time; the local image information it represents may no longer appear in the video image owing to factors such as occlusion and out-of-plane rotation, so there is almost no "evidence" that this local image information will appear again. When M_t^i falls below 0, the feature point is deleted from the feature point set.
When the feature point set PgD_t detected in the current frame is matched against the feature point set Pg_{t-1}, some feature points Pg_new_t are not matched successfully; these are newly added feature points. According to whether their positions lie in the target or background region and whether the current tracking state is normal tracking, suspected partial occlusion, or tracking loss (complete occlusion), they are added to the target feature point set or the background feature point set respectively.
(a) Classification of newly added feature points under normal tracking. Let the target position and range obtained by the current tracking be given by the target rectangle. If a new feature point lies within the target range, it is added to the target feature point set; otherwise it is classified into the background feature point set.
(b) Classification of newly added feature points under partial occlusion. As shown in fig. 6, under partial occlusion some of the matched background feature points appear inside the target range of the current frame. The feature points inside the current target range that can be matched with the previous frame therefore include both target feature points and background feature points, so a new feature point cannot simply be added to the target feature point set just because it lies within the target range. In this case a nearest-neighbor rule may be used to classify the feature points newly appearing inside the target range: an unmatched feature point i newly appearing in the target range is assigned to the class whose feature points are spatially closest to it, i.e. it is classified as a target point if G_dis(i, Pg^O) ≤ G_dis(i, Pg^B) and as a background point otherwise (a small sketch of this rule is shown below). Here the function G_dis(i, Pg) denotes the closest spatial distance in the image from feature point i to the feature points in the feature point set Pg. Newly added feature points appearing in the background region are all classified into the background feature point set.
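A minimal Python sketch of the nearest-neighbor rule for newly appearing points inside the target range (illustrative only; G_dis is implemented here as a plain Euclidean nearest distance):

```python
import math

def g_dis(point, point_set):
    """Closest spatial distance from `point` to any feature point in `point_set`."""
    return min(math.hypot(point[0] - q[0], point[1] - q[1]) for q in point_set)

def classify_new_point(point, target_points, background_points):
    """Assign a new, unmatched feature point inside the target rectangle to the
    class (target or background) whose matched points are spatially closest."""
    if g_dis(point, target_points) <= g_dis(point, background_points):
        return "target"
    return "background"
```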
(c) Classification of newly added feature points under tracking loss (complete occlusion). In this case the set of feature points detected in the current frame that can be matched with the target feature points of the previous frame is empty, and all feature points that can be matched with the previous frame belong to the background; all newly appearing feature points are likewise classified into the background feature point set. For each new feature point appearing in the current frame, the corresponding reliability parameter M_t^i is given an initial value Initial_M, which may be set to 1.
The coordinate positions of the feature points in the previous frame's set Pg_{t-1} are updated to their coordinates in the current frame. As described above, Pg_{t-1} can be divided into the target and background feature points that can be matched and the feature points that cannot be matched with the current frame. For the matched feature point sets, the positions of the feature points are those of the matched feature points in the current frame. The unmatched feature points have their reliability M_t^i decreased according to formula (23); some of them are eliminated once M_t^i falls below the specified threshold, but other unmatched feature points are retained, and their coordinate positions in the new frame are updated according to the motion equation estimated by formula (15). The set Pg_{t-1}, after part of its feature points have been eliminated and the coordinate positions of the unmatched ones have been updated, is combined with the matched and newly classified feature points to obtain the new feature point set Pg_t.
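The bookkeeping of the reliability parameter M described above (increment on a match, decrement on a miss, deletion below 0, initialization of new points) can be sketched as follows; the Python structure is illustrative and not taken from the patent:

```python
def update_reliability(feature_points, matched_ids, new_ids=(),
                       inc=1.0, dec=0.5, initial_m=1.0):
    """feature_points maps feature_id -> M. Increment M for matched points,
    decrement for unmatched ones, drop points whose M falls below 0, and give
    newly appeared points the initial value Initial_M."""
    updated = {}
    for fid, m in feature_points.items():
        m = m + inc if fid in matched_ids else m - dec
        if m >= 0:                      # points with M < 0 are eliminated
            updated[fid] = m
    for fid in new_ids:                  # newly detected, unmatched points
        updated[fid] = initial_m
    return updated
```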
The step of updating the apparent features of the target object and the neighborhood background includes: updating the mean and variance of the Gaussian components of the feature points according to the feature descriptor vectors, the scale factor feature information, and the color, texture and edge vectors of the feature points in the successfully matched feature point subset.
As described above, the embodiment of the present invention uses a Gaussian distribution model, described by the mean μ and the variance σ, to describe the temporal course of the feature point appearance. Assigning initial values to the mean and variance corresponding to each feature vector initializes the model, and updating the mean and variance of the appearance model according to the feature point matching situation updates the model. In practical applications, experimental analysis within a small target image range shows that the variation of the additive noise at different positions can be considered consistent over a period of time, i.e. the noise variance at different image positions can be regarded as the same or approximately the same. Therefore, the changes of the feature vectors of the feature points detected on the target and in its neighborhood in the video frames are approximately regarded as obeying Gaussian distributions with the same variance. The initialization and update strategies of the Gaussian model are based on this assumption: the feature vectors of the feature points within the target range share the same variance value, and likewise the feature vectors of the feature points within the neighborhood background range share the same variance value.
When the first frame arrives or a new feature point is detected, as shown in equation (4), the mean of the newly detected feature point's model is initialized to the feature vector of that detected feature point. At the first frame, the variance of each feature vector of the appearance model may be initialized to a larger initial value, such as 0.9. During tracking, a newly detected feature point is assumed to share the variance of the feature-vector processes of the existing feature points: the mean of its appearance model is initialized to the detected feature vector value of that feature point, and its variance is initialized to the variance value of the corresponding feature vector of the current target or background feature points.
After the initialization of the appearance model is completed, the SURF feature points are matched between frames in each new image frame, the tracking state is analyzed, and the Gaussian models of the feature vectors are updated on that basis. The model may be trained with an online EM approximation method based on autoregressive filtering. For feature vector j at time t, the mean μ_{j,t} and variance σ_{j,t} of the Gaussian components of the unmatched feature points are kept constant, while the mean and variance of the matched Gaussian components are updated from the new observations f_{j,t}^i (formulas (26) and (27)):

μ_{j,t} = (1 − η_μ) μ_{j,t-1} + η_μ f_{j,t}^i

σ_{j,t}² = (1 − η_σ) σ_{j,t-1}² + η_σ · (1/N) Σ_{i=1}^N ( f_{j,t}^i − μ_{j,t} )²
Here the parameter i denotes the index of a matched feature point and N the total number of matched feature points, which indicates that the variance calculated here is the average variance of the corresponding feature vectors over all matched feature points. The parameters η_μ and η_σ are the learning factors for the mean and variance updates, typically distributed between 0 and 1; they determine the rate at which the mean and variance of the Gaussian change over time, so the update of the Gaussian mean and variance can be regarded as causal low-pass filtering of the past parameters. Generally, when the model is first being built it should be established and converge as soon as possible, so a large learning factor is chosen; afterwards the model should be stable, so that earlier image data retain a certain influence on it and the established model reflects the history of the feature vector's changes within a certain time, and a smaller learning factor should therefore be chosen to improve the model's robustness to noise.
Thus the learning parameter η_μ of the model mean is set as follows:

η_μ = max( 1 / Ck_μ , thrd_μ )

Similarly, the update parameter η_σ of the model variance is:

η_σ = max( 1 / Ck_σ , thrd_σ )

where Ck_μ counts the number of times each feature point has been matched, and Ck_σ counts the number of image frames in which matched feature points exist. In the model initialization phase Ck_μ and Ck_σ are small and the convergence rate is high: after the first match the parameter η_μ is such that the model mean is set to the current observation, and after the second match the parameter η_σ is such that the model variance is set from the difference of the feature vectors at the first and second matches. As time passes, Ck_μ and Ck_σ increase and the contribution of the current observation to the model update gradually decreases; but if the learning factor approached zero the model would become abnormally stable and could not reflect normal changes of the image information in time, so minimum values thrd_μ and thrd_σ of the weight update coefficients are set, and thrd_μ and thrd_σ may be set to 0.2.
In addition, if the variance σ² of a Gaussian component becomes too small, feature points that should be matched may easily fail to match correctly during inter-frame matching because the model is too sensitive to noise. A lower limit, e.g. T_σ = 0.05, is therefore imposed on the variances of all Gaussian components to enhance the robustness of the system.
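For illustration, a compact Python sketch of one appearance-model update step under the assumptions above (the max(1/count, threshold) learning-factor schedule is an interpretation of the description, not a verbatim formula from the patent):

```python
def update_gaussian(mean, var, observation, match_count, thrd=0.2, var_floor=0.05):
    """Online EM-style autoregressive update of one appearance-model Gaussian:
    the learning factor decays with the match count but is lower-bounded by thrd,
    and the variance is floored at var_floor (cf. formulas (26)-(27))."""
    eta = max(1.0 / max(match_count, 1), thrd)
    new_mean = (1.0 - eta) * mean + eta * observation
    new_var = (1.0 - eta) * var + eta * (observation - new_mean) ** 2
    return new_mean, max(new_var, var_floor)
```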
The step of updating the inter-frame motion parameters of the target object and the neighborhood background includes: updating the mean and variance of the motion parameters according to the estimated values of the motion transformation parameters between the current frame and the previous frame.
To describe the motion of the target object over the recent period, the current inter-frame motion transformation parameters Par_t = (ux_t, uy_t, ρ_t, θ_t), estimated in the minimum-mean-square-error sense from the feature point matching alone, are not sufficient; a corresponding motion model needs to be established for the motion of the target and of its neighborhood. The inter-frame motion process can also be described with a Gaussian distribution. Since the inter-frame deformation of the target is assumed to be small, the motion of the feature points is highly consistent with the motion of the target, and the inter-frame motion of every feature point can be approximately regarded as obeying the same motion parameters. To reduce the computational complexity, the motion models of the individual feature points in the target and background feature point sets are simplified into Gaussian models of the motion of the target and of the neighborhood background respectively, i.e. motion transformation models are established separately for the target and for the neighborhood background region.
The model is updated with the online EM approximation method. At time t, based on the estimate Par_t = {m_t}, m ∈ {ux, uy, ρ, θ}, of the current inter-frame motion transformation parameters, the mean and variance of each motion parameter m are updated:

μ_{m,t} = (1 − η_1) μ_{m,t-1} + η_1 m_t

σ_{m,t}² = (1 − η_1) σ_{m,t-1}² + η_1 ( m_t − μ_{m,t} )²
The learning factor η_1 for the model update is set similarly:

η_1 = max( 1 / Ck_m , thrd_m )
Likewise, Ck_m is a count of the image frames in which matched feature points exist. The mean parameters of the model are initialized to (0, 0, 1, 0), i.e. the target and the neighborhood background are initially considered stationary, with no spatial change. After the first frame arrives, the mean of the model is initialized to the motion parameters Par_t = (ux_t, uy_t, ρ_t, θ_t) detected in the current frame; after the second frame arrives, the variance σ_m² of the model is initialized to the difference of the transformation parameters detected in the first and second frames. In the initial phase Ck_m is small so that the model converges as quickly as possible; thereafter η_1 is kept constant, so thrd_m may be set to 0.1, allowing the model to be updated at a steady rate. Similarly, if the inter-frame motion is very uniform over a period of time, the variance update formula drives the variance of the Gaussian components towards zero; in that case, once the inter-frame motion changes even slightly, feature points that should be matched can no longer be matched correctly during inter-frame matching. A lower limit T_σ1 = (1, 1, 0.01, 0.01) is therefore also imposed on the variance σ_m² to enhance the robustness of the system.
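A similar sketch for the motion-model update, with the per-parameter variance floor T_σ1 (again an illustrative interpretation, not the patent's exact formulas):

```python
import numpy as np

def update_motion_model(mean, var, par_t, frame_count, thrd_m=0.1,
                        var_floor=(1.0, 1.0, 0.01, 0.01)):
    """Gaussian motion-model update for the parameters (ux, uy, rho, theta):
    the same online EM-style rule as the appearance model, with a per-parameter
    lower bound on the variance."""
    mean, var, par_t = (np.asarray(a, float) for a in (mean, var, par_t))
    eta = max(1.0 / max(frame_count, 1), thrd_m)
    new_mean = (1.0 - eta) * mean + eta * par_t
    new_var = (1.0 - eta) * var + eta * (par_t - new_mean) ** 2
    return new_mean, np.maximum(new_var, np.asarray(var_floor, float))
```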
In the tracking process, besides normal tracking, different tracking states such as drift, loss and occlusion are inevitable; for different tracking states, correspondingly different tracking strategies are adopted to ensure the robustness and stability of the algorithm. The target model, comprising the appearance model and the motion model, and the estimation of the range in which the target may appear in the next frame are important tracking strategies and key factors affecting the robustness of the tracking algorithm. Under normal tracking, the target appearance model and the motion model do not change abruptly, so the models are updated, and the next-frame target range is estimated, according to the model update methods described above in the embodiment of the invention. Under tracking drift, loss or occlusion, however, the position and range of the target cannot be determined accurately, or the target cannot be observed accurately at all; in such abnormal tracking states, the tracking strategies such as the appearance model and the motion model of the target should be adjusted in time.
Therefore, the step of updating the tracking policy of the target object may be specifically implemented by:
(1) Target occlusion handling. When the target is partially or completely occluded, observation of the target's apparent feature information is affected. Within the feature point set, the parameters of the feature models corresponding to the feature points that can be matched are updated from the observations matched in the current frame according to formulas (26) and (27), while the model parameters (mean and variance) corresponding to the feature points that cannot be matched are kept unchanged, and their corresponding importance parameters M_t^i also remain unchanged. Under complete occlusion, the apparent feature model parameters of the target are unchanged; in particular, the importance parameters M_t^i corresponding to the target's feature points remain unchanged. Under partial occlusion, it is still possible to locate the position and range of the target by matching local feature points. Under complete occlusion or tracking loss, no feature point on the target can be matched, the target cannot be observed, and consequently the motion model transformation parameters Par_t = (ux_t, uy_t, ρ_t, θ_t) cannot be observed either, so the target cannot be accurately positioned. In this case the tracker estimates the position and range of the target in the current frame from the prior knowledge of the target's motion in the video frames: the model is considered to keep moving at a constant velocity, and the mean parameters of the motion model remain unchanged. Under partial occlusion, the position and range of the target can still be located through the matched local feature points, and the range of target detection in the next frame is determined according to formulas (5) and (6). Under complete occlusion or tracking loss, a larger thrdU value is used in formulas (5) and (6), so that SURF feature points are detected in a larger range to track the target.
(2) Target drift processing. When tracking drift occurs, the determined position and range of the target are not very accurate, so if the appearance model and the motion model were fully updated according to the current tracking result, a large error might be introduced into the models; this would affect subsequent tracking results, errors would accumulate gradually and the drift would grow, and indeed most tracking failures develop gradually out of tracking drift. Therefore, when it is judged that tracking drift has occurred, updating of the parameters of the appearance and motion models is usually stopped, and the state of the target in the current frame is calculated from the historical experience represented by the motion model. The feature point detection range for the next frame can still be determined according to formulas (5) and (6), but the parameter thrdU should take a larger value: when the target is judged to be tracked correctly, feature point detection can be carried out in a relatively small range in the next frame, otherwise it is carried out in a large range.
According to the video target tracking method provided by the embodiment of the invention, after the tracking parameters are initialized, the feature point set in the current frame is detected within the set image range and screened according to the preset screening conditions; the screened feature point set is matched respectively against the feature point sets of the target object and of the neighborhood background corresponding to the previous frame; then motion estimation is performed on the target object according to the screened feature points, and the tracking situation of the target object in the current frame is analyzed according to the distance between the screened feature points and the center position of the target object and the apparent features of the target object; finally, the feature point sets of the target object and the neighborhood background, their apparent features, and their inter-frame motion parameters are updated according to the matching result, the motion estimation result and the tracking situation analysis result, so that the tracking strategy of the target object is updated. In this way, the tracking result not only reflects the position of the target object in time but also accurately reflects its range and rotation angle, so that the tracking of the target object in the video frames has better robustness and stability; at the same time the computational complexity is low, and both tracking robustness and computation speed are taken into account.
Target tracking is a key core technology for intelligent video devices in applications such as video behavior analysis and human-computer interaction. Local features, as one kind of image feature, are naturally robust to partial occlusion of the target, and stable local features can serve as the basis for robust target tracking. SURF feature points are a fast, optimized development of SIFT feature points: the computation speed is greatly improved while the advantages of SIFT are retained, namely accurate localization, insensitivity to illumination change and rotation invariance. Stable local extreme points in the image are obtained through SURF feature point detection and serve as the basis for accurately locating the target, thereby achieving efficient video target tracking.
Based on this, another video target tracking method is provided in the embodiments of the present invention. As shown in fig. 10, this method may also be referred to as a video target tracking method based on local feature point matching, and it includes the following steps: 1. an initialization stage, establishing a model of the target and its neighborhood background; 2. positioning the target in a new frame, obtaining the state of the target in the current frame (target position, range and rotation angle) through inter-frame feature point matching to obtain the tracking result; 3. updating the model according to the tracking result. The method thus comprises an initialization stage and a target tracking and model updating stage.
In the initialization stage, firstly, the state of the target, namely the position, the range and the angle of the target in the current frame are initialized, the position and the range of the target are represented by a rectangular frame, and the range of a neighborhood background area is further initialized; then, detecting SURF characteristic points of the target and the neighborhood thereof on the basis, respectively initializing and establishing a model of the target and the neighborhood background thereof according to the detected characteristic points, and establishing an initial model of the target and the neighborhood background region thereof; we consider that the inter-frame motion of the object can be described by translation, rotation around the geometric center of the object, and scaling, and initialize the inter-frame motion parameters of the object and its neighborhood background.
In the target tracking and positioning stage, after a new frame comes, SURF feature points are detected in a certain area of a new frame of image according to historical knowledge of target motion, SURF feature points are matched according to an established target model and a neighborhood background model thereof, feature point pairs capable of being correctly matched are searched, interframe motion parameters of the target and the neighborhood background thereof are calculated according to the feature point pairs, so that the position, the range and the rotation angle of the target in the new frame are determined, the current obtained target state is analyzed on the basis, and whether tracking loss, drifting and other conditions occur or not is judged to obtain a final tracking result. In the updating stage of the model, different strategies are adopted to update the model of the target and the neighborhood background thereof according to the tracking result and the analysis of the tracking state (whether the tracking is accurate, drifting, losing or being shielded).
The video target tracking method improves the robustness and stability of tracking and has stronger resistance to occlusion, noise and cluttered backgrounds; the tracking result not only reflects the position of the target in time but also reflects the imaging range and rotation change of the target; and since tracking is performed by feature point matching, the search for the best likelihood of a target model is avoided and the computational complexity is reduced.
Corresponding to the above method embodiment, refer to a schematic structural diagram of a video target tracking apparatus shown in fig. 11; the device includes: an initialization module 110, configured to initialize tracking parameters; the tracking parameters at least comprise the position and the range of the target object, the interframe motion parameters of the target object and the neighborhood background, and the feature point set of the target object and the neighborhood background; a plurality of apparent features of the target object and the neighborhood background; the screening module 111 is configured to detect a feature point set in a current frame within a set image range, and screen the feature point set according to a preset screening condition; the feature point set comprises feature points and feature vectors corresponding to the feature points; a feature point matching module 112, configured to match, according to the filtered feature point set, the feature point set of the target object and the feature point set of the neighborhood background corresponding to the previous frame, respectively; a motion estimation module 113, configured to perform motion estimation on the target object according to the filtered feature points; a tracking condition analysis module 114, configured to analyze a tracking condition of the target object in the current frame according to a distance between the screened feature point and the center position of the target object and an apparent feature of the target object; and the updating module 115 is configured to update the feature point set of the target object and the neighborhood background, the apparent features of the target object and the neighborhood background, and the inter-frame motion parameters of the target object and the neighborhood background according to the matching result, the motion estimation result, and the tracking condition analysis result, so as to update the tracking policy of the target object.
The initialization module is further configured to: extracting the apparent characteristics of a target object and a neighborhood background in a current frame; the apparent features at least comprise a plurality of feature descriptor vectors, scale factor feature information, color features, texture features and edge features; determining the center position of the target object and the length and width of the target rectangular frame; initializing the inter-frame motion parameters of the target object and the neighborhood background into the difference of corresponding transformation parameters between the current frame and the previous frame; initializing a feature point set of a target object into a feature point set detected in a rectangular frame of the target object; initializing a feature point set of a neighborhood background into a feature point set of the neighborhood background detected in a neighborhood region in a preset range outside a target object; and initializing the apparent features of the target object and the neighborhood background into feature vectors of the extracted apparent features.
The embodiment also provides a video target tracking implementation device corresponding to the method embodiment. FIG. 12 is a schematic structural diagram of the video object tracking device; the apparatus comprises a memory 100 and a processor 101; the memory 100 is used to store one or more computer instructions that are executed by the processor to implement the above-described video target tracking method, which may include one or more of the above methods.
Further, the apparatus for implementing video object tracking shown in fig. 12 further includes a bus 102 and a communication interface 103, and the processor 101, the communication interface 103 and the memory 100 are connected via the bus 102. The Memory 100 may include a high-speed Random Access Memory (RAM) and may further include a non-volatile Memory (non-volatile Memory), such as at least one disk Memory. The communication connection between the network element of the system and at least one other network element is realized through at least one communication interface 103 (which may be wired or wireless), and the internet, a wide area network, a local network, a metropolitan area network, and the like can be used. The bus 102 may be an ISA bus, PCI bus, EISA bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one double-headed arrow is shown in FIG. 12, but that does not indicate only one bus or one type of bus.
The processor 101 may be an integrated circuit chip having signal processing capabilities. In implementation, the steps of the above method may be performed by integrated logic circuits of hardware or instructions in the form of software in the processor 101. The Processor 101 may be a general-purpose Processor, and includes a Central Processing Unit (CPU), a Network Processor (NP), and the like; the device can also be a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA), or other Programmable logic devices, discrete Gate or transistor logic devices, discrete hardware components. The various methods, steps, and logic blocks disclosed in the embodiments of the present disclosure may be implemented or performed. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of the method disclosed in connection with the embodiments of the present disclosure may be directly implemented by a hardware decoding processor, or implemented by a combination of hardware and software modules in the decoding processor. The software module may be located in ram, flash memory, rom, prom, or eprom, registers, etc. storage media as is well known in the art. The storage medium is located in the memory 100, and the processor 101 reads the information in the memory 100, and completes the steps of the method of the foregoing embodiment in combination with the hardware thereof.
The embodiment of the present invention further provides a machine-readable storage medium, where the machine-readable storage medium stores machine-executable instructions, and when the machine-executable instructions are called and executed by a processor, the machine-executable instructions cause the processor to implement the video target tracking method.
The embodiment of the invention provides a video target tracking method, a video target tracking device and an implementation device, and provides a target tracking and positioning system based on SURF inter-frame matching, including a multi-feature information extraction and adaptive fusion technique and a feature information update technique; the feature information update technique comprises feature point set updating, appearance model updating, motion model updating and tracking strategy adjustment. The scheme has the following advantages: (1) Under the SURF feature point detection and inter-frame matching framework, the method considers in depth and organically combines several key links, such as multi-feature fusion, target and neighborhood background modeling, target tracking and positioning, model updating and tracking state detection, so that it forms a complete tracking system and achieves robust continuous tracking of a specified target in the video. (2) The tracker designed by the invention can accurately estimate the motion parameters of the target in the current frame from the inter-frame matching of the SURF feature points, accurately estimating the displacement, range and rotation angle of the target, avoiding the complicated search process of traditional tracking algorithms and reducing the computational complexity. (3) The robustness of the tracker is improved through the combination of links such as multi-feature fusion, feature point classification, the design of the hierarchical cascaded feature point matching method, tracking state analysis and model updating, so that the tracker achieves robust and stable tracking in complex scenes such as occlusion, cluttered backgrounds and low signal-to-noise ratios.
The computer program product of the video target tracking method, apparatus, and implementation device provided in the embodiments of the present invention includes a computer-readable storage medium storing program code; the instructions included in the program code may be used to execute the method described in the foregoing method embodiments. For specific implementations, reference may be made to the method embodiments, which are not repeated here.
The functions, if implemented in the form of software functional units and sold or used as a standalone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied in the form of a software product; the software product is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that the above embodiments are only specific embodiments of the present invention, used to illustrate the technical solutions of the present invention rather than to limit them, and the protection scope of the present invention is not limited thereto. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that anyone familiar with the art may still modify the technical solutions described in the foregoing embodiments, readily conceive of changes, or make equivalent substitutions for some technical features within the technical scope disclosed by the present invention; such modifications, changes, or substitutions do not depart from the spirit and scope of the embodiments of the present invention and shall be covered by the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (9)

1. A video target tracking method is characterized by comprising the following steps:
initializing tracking parameters; wherein the tracking parameters at least comprise the position and range of a target object, inter-frame motion parameters of the target object and a neighborhood background, feature point sets of the target object and the neighborhood background, and a plurality of apparent features of the target object and the neighborhood background;
detecting a feature point set in a current frame in a set image range, and screening the feature point set according to a preset screening condition; the feature point set comprises feature points and feature vectors corresponding to the feature points;
matching the feature point set after screening with the feature point set of the target object and the neighborhood background corresponding to the previous frame respectively;
according to the screened feature points, carrying out motion estimation on the target object;
analyzing the tracking condition of the target object in the current frame according to the distance between the screened feature point and the center position of the target object and the apparent feature of the target object;
updating the feature point sets of the target object and the neighborhood background, the apparent features of the target object and the neighborhood background, and the interframe motion parameters of the target object and the neighborhood background according to the matching result, the motion estimation result and the tracking condition analysis result, so as to update the tracking strategy of the target object;
wherein the step of detecting a feature point set in a current frame in a set image range and screening the feature point set according to a preset screening condition comprises:
determining the coordinates of the upper left corner and the lower right corner of an image rectangular frame of the image range to be detected;
detecting feature points in the image rectangular frame to obtain coordinates of the feature points;
calculating the trace of the Hessian matrix of each feature point and the feature vector corresponding to the feature point; the feature vector comprises a feature descriptor vector, scale factor feature information, and color, texture and edge vectors;
and screening the feature points in the feature point set according to the following screening conditions:
the trace of the Hessian matrix of the feature point has the same sign as the trace of the Hessian matrix of the corresponding feature point in the previous video frame;
the distance between the feature point and the corresponding feature point in the previous video frame is smaller than a preset distance threshold;
the Euclidean distance between the feature vector of the feature point and that of the corresponding feature point in the previous video frame meets a preset feature vector threshold;
the displacement length, the displacement direction and the relative position relationship between the feature point and the corresponding feature point in the previous video frame meet a preset displacement consistency threshold; the relative position is determined by the distance between the feature point of the target object and the center position of the target object, normalized by the length and the width of the target rectangular frame of the target object;
and when a plurality of feature points of the current frame match the same feature point of the previous video frame, retaining the feature point with the minimum Euclidean distance among the plurality of feature points.
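For illustration only (not part of the claims), the screening conditions above might be checked as in the following sketch; the record layout (position, Hessian-trace sign, descriptor as a numpy array) and the threshold values are assumptions, and the displacement-consistency condition is omitted for brevity.

```python
# Illustrative sketch of the screening conditions (assumed record layout;
# example thresholds; displacement-consistency check omitted for brevity).
import numpy as np

def screen_candidates(candidates, prev_points, dist_thresh=30.0, desc_thresh=0.5):
    """Keep candidates that agree with a previous-frame point in Hessian-trace
    sign, spatial distance and descriptor distance; when several candidates
    match the same previous point, keep the one with minimum descriptor distance."""
    best = {}  # previous-point index -> (descriptor distance, candidate)
    for cand in candidates:
        for j, prev in enumerate(prev_points):
            if np.sign(cand["hessian_trace"]) != np.sign(prev["hessian_trace"]):
                continue  # condition: same sign of the Hessian-matrix trace
            if np.linalg.norm(np.subtract(cand["pt"], prev["pt"])) > dist_thresh:
                continue  # condition: spatial distance below threshold
            d = float(np.linalg.norm(cand["desc"] - prev["desc"]))
            if d > desc_thresh:
                continue  # condition: descriptor Euclidean distance below threshold
            if j not in best or d < best[j][0]:
                best[j] = (d, cand)  # many-to-one matches: keep minimum distance
    return [(j, cand) for j, (d, cand) in best.items()]
```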
2. The method of claim 1, wherein the step of initializing tracking parameters comprises:
extracting the apparent features of the target object and the neighborhood background in the current frame; the apparent features at least comprise a plurality of feature descriptor vectors, scale factor feature information, color features, texture features and edge features;
determining the center position of the target object and the length and width of a target rectangular frame;
initializing the inter-frame motion parameters of the target object and the neighborhood background to be the difference of corresponding transformation parameters between the current frame and the previous frame;
initializing the feature point set of the target object into the detected feature point set in the rectangular frame of the target object; initializing the feature point set of the neighborhood background into a feature point set of the neighborhood background detected in a neighborhood region in a preset range outside the target object;
initializing the apparent features of the target object and the neighborhood background to the extracted feature vector of the apparent features.
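As a non-authoritative illustration of the state such an initialization produces, the sketch below collects the tracking parameters into a single structure; all field names, the four-parameter motion vector (dx, dy, scale, angle) and the zero/one initial values are assumptions, not taken from the patent.

```python
# Illustrative sketch of an initialized tracker state (field names, the
# 4-parameter motion model and initial values are assumptions).
from dataclasses import dataclass, field
import numpy as np

@dataclass
class TrackerState:
    center: tuple                 # (cx, cy) center position of the target object
    box: tuple                    # (width, height) of the target rectangular frame
    motion_mean: np.ndarray       # inter-frame motion parameters (dx, dy, scale, angle)
    motion_var: np.ndarray
    target_points: list = field(default_factory=list)      # feature points on the target
    background_points: list = field(default_factory=list)  # feature points in the neighborhood background
    target_appearance: dict = field(default_factory=dict)  # per-point Gaussian mean/variance
    background_appearance: dict = field(default_factory=dict)

def init_state(center, box, target_points, background_points, appearance):
    """Initialize with the detected point sets and a neutral motion model."""
    return TrackerState(center=center, box=box,
                        motion_mean=np.zeros(4), motion_var=np.ones(4),
                        target_points=list(target_points),
                        background_points=list(background_points),
                        target_appearance=dict(appearance.get("target", {})),
                        background_appearance=dict(appearance.get("background", {})))
```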
3. The method according to claim 1, wherein the step of analyzing the tracking condition of the target object in the current frame according to the distance between the screened feature point and the center position of the target object and the apparent feature of the target object comprises:
detecting wrongly classified feature points according to the distance between the feature points and the center position of the target object, and eliminating the wrongly classified feature points to generate a first feature point set;
and analyzing whether the target object in the current video frame has tracking drift or not according to the apparent characteristics of each characteristic point in the first characteristic point set.
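A minimal sketch of this two-stage check follows, assuming each point carries its position, an identifier and a descriptor, and assuming the appearance model stores a Gaussian (mean, variance) per point; the gating radius, the 3-sigma test and the drift ratio are illustrative values only, not part of the claim.

```python
# Illustrative sketch of the tracking-condition analysis: distance gating of
# mis-classified points, then a drift decision from appearance consistency.
import numpy as np

def analyze_tracking(points, center, box, appearance_model,
                     radius_factor=0.75, drift_ratio=0.5):
    """Return the gated point list and a drift flag."""
    w, h = box
    radius = radius_factor * np.hypot(w, h) / 2.0   # assumed gating radius
    inliers = [p for p in points
               if np.hypot(p["pt"][0] - center[0], p["pt"][1] - center[1]) <= radius]

    def fits_appearance(p):
        mean, var = appearance_model[p["id"]]
        z = (p["desc"] - mean) ** 2 / np.maximum(var, 1e-6)
        return float(np.mean(z)) < 9.0              # roughly a 3-sigma gate (assumed)

    scored = [p for p in inliers if p["id"] in appearance_model]
    consistent = [p for p in scored if fits_appearance(p)]
    drift = bool(scored) and len(consistent) / len(scored) < drift_ratio
    return inliers, drift
```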
4. The method of claim 1, wherein the step of updating the set of feature points of the target object and the neighborhood background comprises:
classifying the feature points in the feature point set according to the matching result to obtain a plurality of feature point subsets; wherein the subsets comprise a successfully matched feature point subset and an unsuccessfully matched feature point subset; the successfully matched feature point subset is further divided into feature points on the target object and feature points on the neighborhood background, and the unsuccessfully matched feature point subset is likewise divided into feature points on the target object and feature points on the neighborhood background;
deleting, from the feature point set corresponding to the previous frame, feature points in the unsuccessfully matched feature point subset whose number of matching failures within the recent frames is higher than a set threshold; wherein the recent frames are a set number of consecutive video frames preceding the previous frame;
adding feature points in the feature point set of the current frame to the feature point set corresponding to the previous frame according to the tracking state of the current frame;
and updating the position coordinates of the feature points in the feature point set corresponding to the previous frame to the position coordinates of the corresponding feature points in the current frame.
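The point set update could be organized as in the sketch below; the per-point failure counter, the failure budget and the "add only while tracking is stable" policy are assumptions used for illustration.

```python
# Illustrative sketch of the feature point set update (failure counters,
# failure budget and add-when-stable policy are assumptions).
def update_point_set(point_set, matched_ids, current_positions, new_points,
                     max_failures=5, tracking_ok=True):
    """Refresh matched points, drop points that keep failing, add new points."""
    kept = []
    for p in point_set:
        if p["id"] in matched_ids:
            p["failures"] = 0
            p["pt"] = current_positions[p["id"]]    # update to current-frame coordinates
            kept.append(p)
        else:
            p["failures"] = p.get("failures", 0) + 1
            if p["failures"] <= max_failures:        # delete once the budget is exceeded
                kept.append(p)
    if tracking_ok:                                   # grow the model only when tracking is stable
        kept.extend(new_points)
    return kept
```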
5. The method of claim 4, wherein the step of updating the apparent features of the target object and the neighborhood background comprises:
and updating the mean and variance of the Gaussian components of the feature points according to the feature descriptor vectors, the scale factor feature information, and the color, texture and edge vectors of the feature points in the successfully matched feature point subset.
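One common way to realize such an update is a running (exponentially forgetting) estimate of each Gaussian component, sketched below; the learning rate and the per-point dictionary layout are assumptions, not taken from the patent.

```python
# Illustrative sketch of a running Gaussian update of the appearance model
# (learning rate and data layout are assumptions).
import numpy as np

def update_gaussian(mean, var, observation, rate=0.1):
    """Blend a new feature vector into a Gaussian component's mean and variance."""
    new_mean = (1.0 - rate) * mean + rate * observation
    new_var = (1.0 - rate) * var + rate * (observation - new_mean) ** 2
    return new_mean, new_var

def update_appearance(appearance_model, matched_points, rate=0.1):
    """Update only the components of points that matched successfully."""
    for p in matched_points:
        if p["id"] in appearance_model:
            mean, var = appearance_model[p["id"]]
            appearance_model[p["id"]] = update_gaussian(mean, var, p["desc"], rate)
    return appearance_model
```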
6. The method of claim 1, wherein the step of updating the inter-frame motion parameters of the target object and the neighborhood background comprises:
and updating the mean value and the variance of the motion parameters according to the estimated values of the motion transformation parameters between the current frame and the previous frame.
7. A video object tracking apparatus, comprising:
the initialization module is used for initializing tracking parameters; wherein the tracking parameters at least comprise the position and range of a target object, inter-frame motion parameters of the target object and a neighborhood background, feature point sets of the target object and the neighborhood background, and a plurality of apparent features of the target object and the neighborhood background;
the screening module is used for detecting a feature point set in a current frame in a set image range and screening the feature point set according to a preset screening condition; the feature point set comprises feature points and feature vectors corresponding to the feature points;
the feature point matching module is used for respectively matching the feature point set after screening with the feature point set of the target object and the neighborhood background corresponding to the previous frame;
the motion estimation module is used for carrying out motion estimation on the target object according to the screened feature points;
the tracking condition analysis module is used for analyzing the tracking condition of the target object in the current frame according to the distance between the screened feature point and the center position of the target object and the apparent feature of the target object;
the updating module is used for updating the feature point set of the target object and the neighborhood background, the apparent features of the target object and the neighborhood background, and the interframe motion parameters of the target object and the neighborhood background according to the matching result, the motion estimation result and the tracking condition analysis result, so as to update the tracking strategy of the target object;
the screening module is further configured to: determine the coordinates of the upper left corner and the lower right corner of an image rectangular frame of the image range to be detected; detect feature points in the image rectangular frame to obtain coordinates of the feature points; calculate the trace of the Hessian matrix of each feature point and the feature vector corresponding to the feature point, wherein the feature vector comprises a feature descriptor vector, scale factor feature information, and color, texture and edge vectors; and screen the feature points in the feature point set according to the following screening conditions: the trace of the Hessian matrix of the feature point has the same sign as the trace of the Hessian matrix of the corresponding feature point in the previous video frame; the distance between the feature point and the corresponding feature point in the previous video frame is smaller than a preset distance threshold; the Euclidean distance between the feature vector of the feature point and that of the corresponding feature point in the previous video frame meets a preset feature vector threshold; the displacement length, the displacement direction and the relative position relationship between the feature point and the corresponding feature point in the previous video frame meet a preset displacement consistency threshold, wherein the relative position is determined by the distance between the feature point of the target object and the center position of the target object, normalized by the length and the width of the target rectangular frame of the target object; and when a plurality of feature points of the current frame match the same feature point of the previous video frame, the feature point with the minimum Euclidean distance is retained.
8. The apparatus of claim 7, wherein the initialization module is further configured to:
extracting the apparent features of the target object and the neighborhood background in the current frame; the apparent features at least comprise a plurality of feature descriptor vectors, scale factor feature information, color features, texture features and edge features;
determining the center position of the target object and the length and width of a target rectangular frame;
initializing the inter-frame motion parameters of the target object and the neighborhood background to be the difference of corresponding transformation parameters between the current frame and the previous frame;
initializing the feature point set of the target object to the feature point set detected in the rectangular frame of the target object; initializing the feature point set of the neighborhood background into a feature point set of the neighborhood background detected in a neighborhood region in a preset range outside the target object;
initializing the apparent features of the target object and the neighborhood background to the extracted feature vector of the apparent features.
9. A video object tracking implementation apparatus comprising a processor and a machine-readable storage medium storing machine-executable instructions executable by the processor to implement the method of any one of claims 1 to 6.
CN201810249416.5A 2018-03-23 2018-03-23 Video target tracking method and device and implementation device Expired - Fee Related CN108470354B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810249416.5A CN108470354B (en) 2018-03-23 2018-03-23 Video target tracking method and device and implementation device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810249416.5A CN108470354B (en) 2018-03-23 2018-03-23 Video target tracking method and device and implementation device

Publications (2)

Publication Number Publication Date
CN108470354A CN108470354A (en) 2018-08-31
CN108470354B true CN108470354B (en) 2021-04-27

Family

ID=63264696

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810249416.5A Expired - Fee Related CN108470354B (en) 2018-03-23 2018-03-23 Video target tracking method and device and implementation device

Country Status (1)

Country Link
CN (1) CN108470354B (en)

Families Citing this family (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10460453B2 (en) * 2015-12-30 2019-10-29 Texas Instruments Incorporated Feature point identification in sparse optical flow based tracking in a computer vision system
CN108898615B (en) * 2018-06-15 2021-09-24 阿依瓦(北京)技术有限公司 Block matching method for high frequency information image
CN109255337B (en) * 2018-09-29 2020-04-28 北京字节跳动网络技术有限公司 Face key point detection method and device
CN109323697B (en) * 2018-11-13 2022-02-15 大连理工大学 Method for rapidly converging particles during starting of indoor robot at any point
CN111385490B (en) * 2018-12-28 2021-07-13 清华大学 Video splicing method and device
CN109827578B (en) * 2019-02-25 2019-11-22 中国人民解放军军事科学院国防科技创新研究院 Satellite relative attitude estimation method based on profile similitude
CN110111361B (en) * 2019-04-22 2021-05-18 湖北工业大学 Moving object detection method based on multi-threshold self-optimization background modeling
CN110415275B (en) * 2019-04-29 2022-05-13 北京佳讯飞鸿电气股份有限公司 Point-to-point-based moving target detection and tracking method
CN112085025B (en) * 2019-06-14 2024-01-16 阿里巴巴集团控股有限公司 Object segmentation method, device and equipment
CN110660090B (en) * 2019-09-29 2022-10-25 Oppo广东移动通信有限公司 Subject detection method and apparatus, electronic device, and computer-readable storage medium
CN113012216B (en) * 2019-12-20 2023-07-07 舜宇光学(浙江)研究院有限公司 Feature classification optimization method, SLAM positioning method, system and electronic equipment
CN111144483B (en) * 2019-12-26 2023-10-17 歌尔股份有限公司 Image feature point filtering method and terminal
CN111160266B (en) * 2019-12-30 2023-04-18 三一重工股份有限公司 Object tracking method and device
CN111382309B (en) * 2020-03-10 2023-04-18 深圳大学 Short video recommendation method based on graph model, intelligent terminal and storage medium
CN111652263B (en) * 2020-03-30 2021-12-28 西北工业大学 Self-adaptive target tracking method based on multi-filter information fusion
CN112053381A (en) * 2020-07-13 2020-12-08 北京迈格威科技有限公司 Image processing method, image processing device, electronic equipment and storage medium
CN111882583B (en) * 2020-07-29 2023-11-14 成都英飞睿技术有限公司 Moving object detection method, device, equipment and medium
CN112184766B (en) * 2020-09-21 2023-11-17 广州视源电子科技股份有限公司 Object tracking method and device, computer equipment and storage medium
CN112184769B (en) * 2020-09-27 2023-05-02 上海高德威智能交通系统有限公司 Method, device and equipment for identifying tracking abnormality
CN112200126A (en) * 2020-10-26 2021-01-08 上海盛奕数字科技有限公司 Method for identifying limb shielding gesture based on artificial intelligence running
CN112215205B (en) * 2020-11-06 2022-10-18 腾讯科技(深圳)有限公司 Target identification method and device, computer equipment and storage medium
CN113450578B (en) * 2021-06-25 2022-08-12 北京市商汤科技开发有限公司 Traffic violation event evidence obtaining method, device, equipment and system
CN113822279B (en) * 2021-11-22 2022-02-11 中国空气动力研究与发展中心计算空气动力研究所 Infrared target detection method, device, equipment and medium based on multi-feature fusion
CN118015501B (en) * 2024-04-08 2024-06-11 中国人民解放军陆军步兵学院 Medium-low altitude low-speed target identification method based on computer vision

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7480079B2 (en) * 2003-09-09 2009-01-20 Siemens Corporate Research, Inc. System and method for sequential kernel density approximation through mode propagation
US8897528B2 (en) * 2006-06-26 2014-11-25 General Electric Company System and method for iterative image reconstruction
CN103400395A (en) * 2013-07-24 2013-11-20 佳都新太科技股份有限公司 Light stream tracking method based on HAAR feature detection
CN103985136A (en) * 2014-03-21 2014-08-13 南京大学 Target tracking method based on local feature point feature flow pattern
CN105046717B (en) * 2015-05-25 2019-03-19 浙江师范大学 A kind of video object method for tracing object of robustness

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102999920A (en) * 2012-10-25 2013-03-27 西安电子科技大学 Target tracking method based on nearest neighbor classifier and mean shift
CN103870839A (en) * 2014-03-06 2014-06-18 江南大学 Online video target multi-feature tracking method
CN103886611A (en) * 2014-04-08 2014-06-25 西安煤航信息产业有限公司 Image matching method suitable for automatically detecting flight quality of aerial photography

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Multiple Feature Fusion for Tracking of Moving Objects in Video Surveillance; Huibin Wang et al.; 2008 International Conference on Computational Intelligence and Security; 20081231; pp. 553-559 *

Also Published As

Publication number Publication date
CN108470354A (en) 2018-08-31

Similar Documents

Publication Publication Date Title
CN108470354B (en) Video target tracking method and device and implementation device
Kristan et al. The visual object tracking vot2015 challenge results
CN107633226B (en) Human body motion tracking feature processing method
CN108399627B (en) Video inter-frame target motion estimation method and device and implementation device
KR20160096460A (en) Recognition system based on deep learning including a plurality of classfier and control method thereof
JP2006209755A (en) Method for tracing moving object inside frame sequence acquired from scene
CN110349188B (en) Multi-target tracking method, device and storage medium based on TSK fuzzy model
Roy et al. Foreground segmentation using adaptive 3 phase background model
Gündoğdu et al. The visual object tracking VOT2016 challenge results
Zhang et al. Visual saliency based object tracking
Li et al. Robust object tracking via multi-feature adaptive fusion based on stability: contrast analysis
Afonso et al. Automatic estimation of multiple motion fields from video sequences using a region matching based approach
Sahoo et al. Adaptive feature fusion and spatio-temporal background modeling in KDE framework for object detection and shadow removal
Zhang et al. Weighted smallest deformation similarity for NN-based template matching
Kim et al. Simultaneous foreground detection and classification with hybrid features
CN107665495B (en) Object tracking method and object tracking device
CN108492328B (en) Video inter-frame target matching method and device and implementation device
Dai et al. Robust and accurate moving shadow detection based on multiple features fusion
CN113129332A (en) Method and apparatus for performing target object tracking
CN117011346A (en) Blower image registration algorithm
Li et al. Research on hybrid information recognition algorithm and quality of golf swing
Ding et al. Robust tracking with adaptive appearance learning and occlusion detection
Maia et al. Visual object tracking by an evolutionary self-organizing neural network
Nithin et al. Multi-camera tracklet association and fusion using ensemble of visual and geometric cues
Tsin et al. Learn to track edges

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20210427