CN112116634A - Multi-target tracking method with a semi-online mechanism - Google Patents

Multi-target tracking method with a semi-online mechanism Download PDF

Info

Publication number
CN112116634A
CN112116634A (publication) · CN112116634B (grant) · CN202010754142.2A (application)
Authority
CN
China
Prior art keywords
frame
detection
target
trajectory
kalman
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010754142.2A
Other languages
Chinese (zh)
Other versions
CN112116634B (en)
Inventor
刘龙军
金焰明
孙宏滨
郑南宁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University filed Critical Xian Jiaotong University
Priority to CN202010754142.2A priority Critical patent/CN112116634B/en
Publication of CN112116634A publication Critical patent/CN112116634A/en
Application granted granted Critical
Publication of CN112116634B publication Critical patent/CN112116634B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/277 Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30196 Human being; Person

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • General Engineering & Computer Science (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

A multi-target tracking method with a semi-online mechanism. Detection boxes of pedestrians or moving targets are obtained from a pedestrian or moving-target video; a Kalman family map is built from the position-change information between the detection boxes within a time window; a pair of Kalman heads is then found from the Kalman family map, and the detection box of the target or moving object to be tracked in the next frame is obtained through the similarity of an appearance model, a motion model and a size-change model, so that the target or moving object always lies in a detection box in that frame; otherwise, the target is reported lost. Detection boxes with similarity higher than a threshold are spliced into the Kalman family map, the motion model and appearance model in the Kalman family map are updated, and the pedestrian or moving-object target is tracked in the next frame. The method is applicable to any trajectory-splicing multi-target tracking algorithm, i.e., it is not constrained by the different trajectories generated by multiple targets such as pedestrians and moving objects; it can effectively improve tracking accuracy and reduce the number of identity switches.

Description

Multi-target tracking method with a semi-online mechanism
Technical Field
The invention relates to a tracking method, in particular to a multi-target tracking method with a semi-online mechanism.
Background
The multi-target tracking method is mainly applied to trajectory tracking of multiple persons or moving objects in a video sequence shot by a camera. In an autonomous-driving scene, the pedestrians or other vehicles captured by an on-board camera can be tracked in real time and their motion trajectories predicted, so that the unmanned vehicle can make effective avoidance or driving decisions according to their motion. In multi-camera surveillance scenes, multiple pedestrians can be tracked across cameras as required, and the walking trajectories and positions of multiple pedestrian targets can be monitored through the videos captured by different cameras. In a sports scene shot by a camera, such as a basketball game, the moving trajectories of multiple athletes can be tracked separately by a multi-target tracking method, and on-court actions and behaviors can be analyzed based on the tracked trajectories. The multi-target tracking method can also be applied to tracking multiple targets such as enemy ships and vehicles in military scenes. Current tracking methods are numerous, but for efficient tracking the multi-target tracking method still needs improvement and optimization in real-time performance, accuracy, and the like.
MOT (multi-target tracking) can be mainly divided into online MOT and offline MOT. The difference is as follows: the former advances with the incoming frames and outputs tracking results promptly, so its real-time performance is higher overall, but its accuracy metrics are relatively low; the latter must wait until the forward pass over the whole video sequence is complete, and tracks only after the detection boxes and other information of all video frames are available, so it is difficult for it to meet real-time requirements, but its accuracy is generally higher because global information is better exploited. Online tracking requires that trajectory tracking be completed immediately after the detection of each new frame; the online tracking algorithm therefore intuitively has better real-time performance, but it cannot effectively use the global information of the video, which may reduce accuracy. In contrast, offline tracking tracks trajectories after all frames of a given video sequence have been detected; this mode makes good use of global information and yields relatively accurate tracking results, but cannot meet real-time requirements. The temporal receptive fields of online tracking, semi-online tracking and offline tracking are, respectively, the current frame, a time window, and the whole sequence, increasing in that order, while their real-time performance decreases in the same order.
Occlusion has long been one of the difficulties in MOT. Although algorithms iterate and update rapidly, the performance of most algorithms is still difficult to keep robust under severe occlusion. Whether online MOT, offline MOT, or MOT built with deep learning methods, various approaches have been tried to solve the occlusion problem, but essentially at the cost of real-time performance. Both real-time performance and accuracy are very important in practical tracking applications. For example, poor real-time performance of a tracking algorithm in an unmanned vehicle delays the vehicle's judgment, leading to misjudged or delayed decisions and unnecessary traffic accidents; poor accuracy causes multiple targets to be tracked chaotically and the tracking to fail, for example, when tracking a criminal suspect across a city's many smart cameras, the tracker may lose the suspect or follow a non-suspect, allowing the real suspect to escape.
Disclosure of Invention
The invention aims to provide a semi-online multi-target tracking method.
In order to achieve the purpose, the invention is realized by the following technical scheme:
a multi-target tracking method of a semi-online mechanism is characterized in that detection frames of pedestrians or moving targets are obtained through a YOLO-V3 detector according to videos of the pedestrians or the moving targets, a Kalman sequence spectrum is obtained according to position change information among the detection frames in a period of time window, then a pair of Kalman heads are found according to the Kalman sequence spectrum, detection frames of targets or moving objects to be tracked in the next frame are obtained through similarity of an appearance model, a movement model and a size change model, the targets or the moving objects are enabled to be located in the detection frames in the frame, and otherwise, the targets are indicated to be lost; and splicing the detection frames with the similarity higher than the threshold value into the Kalman sequence spectrum, updating a motion model and an appearance model in the Kalman sequence spectrum, and tracking the pedestrian or moving object target in the next frame.
The invention is further improved in that the similarity of the appearance models is obtained by the following process:
in the nth frame of the video, the patch size is fixed to [64, 128]; there are D detection boxes and D patches, the Xth detection box is denoted d_X^n, and the patch corresponding to the Xth detection box is p_X^n;

in the nth frame, crop and resize operations are carried out on the region where each detection box is located, obtaining D patches of the fixed detection-box size; the pixels of the D patches are then each divided into several groups by color interval, and the matrix result obtained by grouping is reshaped into a one-dimensional vector Tsr_X, which is taken as the representation vector of p_X^n;

an appearance model is thereby obtained, and the appearance models of the Xth detection box and the Yth trajectory are expressed as f(X) and f(Y); finally, the appearance model is updated by vector fusion of f(X) and f(Y);

the similarity of the appearance model is given by formula (3-1) [formula image in the original], in which Λ_A(X, Y) denotes the similarity of the appearance model.
The invention is further improved in that the similarity between the motion model and the size change model is obtained through the following process: the time difference between adjacent frames is Δt and the kth target in the nth frame is d_k^n; its position center coordinate c_k^n is (x, y), the velocity vector corresponding to the coordinate is (v_x, v_y), the acceleration vector corresponding to the coordinate is (a_x, a_y), the size of the target corresponding to the detection box is (w, h), the corresponding size-change speed is (v_w, v_h), the driving force of the change is u, and the detector influence factor is α;

the motion state s_k^n and the size state r_k^n of the kth target in the nth frame are respectively s_k^n = [x, y, v_x, v_y, a_x, a_y]^T and r_k^n = [w, h, v_w, v_h]^T; the covariance matrix between the element factors of the motion state is P_k^n, and the covariance matrix between the element factors of the size state is Q_k^n;

according to the laws of physical motion, the position prediction equation and the size prediction equation for the next frame are obtained as follows:

x_{n+1} = x_n + v_x Δt + (1/2) a_x Δt², y_{n+1} = y_n + v_y Δt + (1/2) a_y Δt²,
v_{x,n+1} = v_{x,n} + a_x Δt, v_{y,n+1} = v_{y,n} + a_y Δt,

i.e. s_k^{n+1} = F_s s_k^n, and

w_{n+1} = w_n + v_w Δt, h_{n+1} = h_n + v_h Δt,

i.e. r_k^{n+1} = F_r r_k^n, where F_s and F_r are the corresponding state transition matrices;

letting F_s and F_r be so defined, the two iterative state transfer equations and the covariance matrix updating equation are simplified as:

s_k^{n+1} = F_s s_k^n, r_k^{n+1} = F_r r_k^n    (3-8)
P_k^{n+1} = F_s P_k^n F_s^T + U, Q_k^{n+1} = F_r Q_k^n F_r^T + U    (3-9)

where U is the process noise contributed by the driving force u;

taking (3-8) and (3-9) as the iterative equations of the motion model and the size model, Kalman filter prediction based on the normal distribution is carried out for each, obtaining the position prediction information and the size prediction information of the (n+1)th frame;

for any first-segment trajectory X and second-segment trajectory Y, the forward velocity vector points from the head to the tail of the first trajectory X and the reverse velocity vector points from the tail to the head of the second trajectory Y; Γ represents the motion process simulated by the Kalman filter; F(X, Y) is the forward similarity score pointing from the tail of trajectory X to the head of trajectory Y, and the reverse similarity score points from the head of trajectory Y to the tail of trajectory X;

the forward similarity score, the reverse similarity score and the overall similarity are given by formulas (3-10), (3-11) and (3-12) [formula images in the original], wherein Λ_M(X, Y) denotes the similarity between the first-segment trajectory X and the second-segment trajectory Y.
A further development of the invention consists in defining the length of the time window as N and the minimum instantiation length of a short trajectory as T_m; the Kalman family map is denoted KFM, and the kth detection box in the nth frame is denoted d_k^n; KFM(d_k^n) represents the order of detection box d_k^n in its corresponding fragment trajectory in the KFM; if KFM(d_k^n) = -1, the detection box has not yet been cascaded with any fragment trajectory in the KFM, and KFM(d_k^n) = x means that d_k^n is the (x+1)th member of some fragment trajectory in the KFM; the ith fragment trajectory in the KFM is defined as TK_i; if the length of the ith fragment trajectory is greater than T_m and its motion model, appearance model and size model are not updated in the nth frame, the ith fragment trajectory is instantiated as a reliable short trajectory ST_j; otherwise, the ith fragment trajectory is disassembled.
the further improvement of the invention is that the specific process of splicing the detection frames with similarity higher than the threshold value into the Kalman sequence spectrum is as follows:
first, finding out the detection frame KH of the n-th frame of pedestrian pictures: detection frame in n-th frame and n +1 frame pictures
Figure BDA0002610979240000056
In (1),
Figure BDA0002610979240000057
for the detection frame in the image of the nth frame,
Figure BDA0002610979240000058
finding out each pair of detection frames which possibly belong to the same target and are close in the IOU relation for the detection frames in the (n +1) th frame picture; if it is not
Figure BDA0002610979240000059
Then will be
Figure BDA00026109792400000510
And
Figure BDA00026109792400000511
respectively labeled as 0 and 1, and will
Figure BDA00026109792400000512
And
Figure BDA00026109792400000513
referred to as a pair detection frame KH;
there will be several pairs KH in the nth and n +1 frames (e.g.,
Figure BDA00026109792400000514
and
Figure BDA00026109792400000515
);
Figure BDA00026109792400000516
representing detection boxes in KFM
Figure BDA00026109792400000517
Representing order in corresponding patch trajectoriesThe ith detection box in the nth frame is represented as
Figure BDA00026109792400000518
And step two, predicting:
predicting the position of the pedestrian target in the next frame of picture according to the motion module of each pedestrian target in the current (n +1) th frame of picture
Figure BDA00026109792400000519
Step three, track growth: according to the formula (3), selecting and
Figure BDA00026109792400000520
the detection frame with the most similar position is
Figure BDA00026109792400000521
Then, order
Figure BDA00026109792400000522
And updating the instable TKiThe motion model and appearance model of (1);
Figure BDA00026109792400000523
using updated unstable trajectory TKiPredicting a position in a next frame using the motion model and the appearance model of (a);
the fourth step: repeating the process from the first step to the third step for tracking each frame;
fifthly, instantiation or backtracking: instantiating or backtracking the short track in KFM in the current frame according to the following conditions:
a) instantiation: if the unstable track TK in the Kalman sequence spectrumiLength, if not updated in the last frame, the unstable trajectory TKiInstantiated as a new reliable trajectory STj
b) Backtracking: if the unstable track TK in the Kalman sequence spectrumiIs less than a threshold value TmAnd there is no update in the last frame, the unstable trajectory TK will be deleted in the Kalman family diagram KFMiAnd is provided with
Figure BDA0002610979240000061
And marking the fragment track in the Kalman sequence spectrum as a forbidden route.
The invention has the further improvement that the specific process of the second step is as follows: according to the Kalman head pair d_i^n and d_j^{n+1}, the motion model of the unstable trajectory TK_i is established; according to this motion model, the position in the (n+2)th frame of the target belonging to trajectory TK_i is predicted, and this position is defined as the predicted position p̂^{n+2}.
Compared with the prior art, the invention has the following beneficial effects:
First: the method is applicable to any trajectory-splicing multi-target tracking algorithm; that is, it is not constrained by the different trajectories generated by multiple targets such as pedestrians and moving objects, and it can effectively improve tracking accuracy and reduce the number of identity switches;

Second: the generated tracking results can be checked and erroneous tracking results corrected, which makes the algorithm more robust; for example, when a target is mislocalized in the current video frame during pedestrian tracking, i.e., when the tracking result of the online multi-target tracking algorithm is wrong, the error can be detected by the backtracking module within the time window of the method so as to correct the tracking trajectory;

Third: by masking the intersection-over-union (IOU) regions between targets, the discrimination between multiple targets is effectively improved at the cost of a very small amount of computation, which effectively alleviates the disappearance of target features caused by severe occlusion in crowded places such as shopping malls and station intersections, and effectively improves the feature discrimination between incompletely occluded targets and their occluders;

Fourth: under real-time constraints, the invention can use the global information within the current time window to check and correct erroneous tracking results within a certain time; the method is very robust in various extreme scenes and adapts well to other algorithms based on short-trajectory splicing.
Drawings
FIG. 1 is a flowchart of a backtracking algorithm of the present invention.
FIG. 2 is an overall algorithm flow diagram of the present invention.
FIG. 3 is a schematic diagram of an IOU mask module according to an embodiment of the present invention.
Fig. 4 is a schematic diagram of appearance model establishment according to an embodiment of the present invention.
FIG. 5 shows a performance comparison of algorithms on MOT2015.
FIG. 6 shows a comparison of the FPS for the MOT2015 algorithms.
Detailed Description
The present invention will be described in detail with reference to the accompanying drawings.
The invention adopts an MOT with a semi-online mechanism, which achieves a good compromise and optimization between real-time performance and accuracy.
Referring to FIG. 1, the specific process of the invention is as follows: a video of pedestrians or moving objects shot by a camera is passed through a YOLO-V3 detector to obtain the detection boxes of the pedestrians or moving objects, i.e., each pedestrian or moving object is framed so as to exclude other objects and background. From the video over a period of time, a Kalman family map is built according to the position-change information between the detection boxes within the time window; a pair of Kalman heads (Kalman Head, KH) is then found from the Kalman family map, and the detection box of the target or moving object to be tracked in the next frame is obtained according to the similarity of the appearance model, the motion model and the size-change model, so that the target or moving object always lies in a detection box in that frame; otherwise, the target is reported lost. Detection boxes with similarity higher than the threshold are spliced into the Kalman family map, and the motion model and appearance model in the Kalman family map are updated for tracking the pedestrian or moving-object target in the next frame.
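For concreteness, the following minimal Python sketch shows one way the detection stage described above could be run with OpenCV's DNN module to produce the [x, y, w, h, conf] detection boxes the tracker consumes; the yolov3.cfg/yolov3.weights file names and both threshold values are assumptions, since the patent does not specify them.

    import cv2
    import numpy as np

    net = cv2.dnn.readNetFromDarknet("yolov3.cfg", "yolov3.weights")
    out_names = net.getUnconnectedOutLayersNames()

    def detect(frame, conf_thr=0.5, nms_thr=0.4):
        """Run YOLO-V3 on one frame; return pedestrian boxes as [x, y, w, h, conf]."""
        H, W = frame.shape[:2]
        blob = cv2.dnn.blobFromImage(frame, 1 / 255.0, (416, 416), swapRB=True, crop=False)
        net.setInput(blob)
        boxes, confs = [], []
        for out in net.forward(out_names):
            for row in out:                 # row = [cx, cy, w, h, obj, class scores...]
                scores = row[5:]
                conf = float(row[4] * scores.max())
                if conf > conf_thr and scores.argmax() == 0:   # COCO class 0: person
                    cx, cy, w, h = row[:4] * np.array([W, H, W, H])
                    boxes.append([int(cx - w / 2), int(cy - h / 2), int(w), int(h)])
                    confs.append(conf)
        keep = cv2.dnn.NMSBoxes(boxes, confs, conf_thr, nms_thr)   # non-maximum suppression
        return [boxes[i] + [confs[i]] for i in np.array(keep).flatten()]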
Wherein, the similarity of the appearance model is obtained through the following process:

In the nth frame, the patch size is fixed to [64, 128] and the number of pixel-histogram groups is 64. There are D detection boxes and D patches; the Xth detection box is denoted d_X^n, and the patch corresponding to the Xth detection box is p_X^n. Then, in the nth frame, the invention performs crop and resize operations on the region where each detection box is located (i.e., each patch is cropped and resized to a tensor of shape [64, 128]). After these operations, the invention obtains D patches of the fixed detection-box size. The invention then divides the pixels of the D patches into groups (e.g., 64 groups) by color interval, and the matrix result obtained by grouping is reshaped into a one-dimensional vector Tsr_X; that is, the invention obtains a 1 × 192 tensor reshaped from the 3 × 64 tensor produced by color-interval grouping. The invention then takes Tsr_X as the representation vector of p_X^n. Combining reference [12], the invention obtains an appearance model and expresses the appearance models of the Xth detection box and the Yth trajectory as f(X) and f(Y). Finally, the invention updates the appearance model through vector fusion of f(X) and f(Y), and the appearance similarity is obtained as formula (3-1) [formula image in the original], in which Λ_A(X, Y) denotes the similarity of the appearance model. This is an effective way to enhance the discrimination between targets when the relation between trajectories and detection boxes is complex and a single physical motion model is insufficient.
The similarity between the motion model and the size change model is obtained through the following process: the time difference between adjacent frames is Δt and the kth target in the nth frame is d_k^n; its position center coordinate c_k^n is (x, y), the velocity vector corresponding to the coordinate is (v_x, v_y), the acceleration vector corresponding to the coordinate is (a_x, a_y), the size of the target corresponding to the detection box is (w, h), the corresponding size-change speed is (v_w, v_h), and the driving force of the change is u. The detector influence factor is α (the higher the mIOU of the detector, the higher the value; the default is 0.7).

The motion state s_k^n and the size state r_k^n of the kth target in the nth frame are respectively s_k^n = [x, y, v_x, v_y, a_x, a_y]^T and r_k^n = [w, h, v_w, v_h]^T; the covariance matrix between the element factors of the motion state is P_k^n, and the covariance matrix between the element factors of the size state is Q_k^n.

According to the laws of physical motion, the position prediction equation and the size prediction equation for the next frame are obtained as follows:

x_{n+1} = x_n + v_x Δt + (1/2) a_x Δt², y_{n+1} = y_n + v_y Δt + (1/2) a_y Δt²,
v_{x,n+1} = v_{x,n} + a_x Δt, v_{y,n+1} = v_{y,n} + a_y Δt,

i.e. s_k^{n+1} = F_s s_k^n, and

w_{n+1} = w_n + v_w Δt, h_{n+1} = h_n + v_h Δt,

i.e. r_k^{n+1} = F_r r_k^n, where F_s and F_r are the corresponding state transition matrices.

Letting F_s and F_r be so defined, the two iterative state transfer equations and the covariance matrix updating equation are simplified as:

s_k^{n+1} = F_s s_k^n, r_k^{n+1} = F_r r_k^n    (3-8)
P_k^{n+1} = F_s P_k^n F_s^T + U, Q_k^{n+1} = F_r Q_k^n F_r^T + U    (3-9)

where U is the process noise contributed by the driving force u.

Taking (3-8) and (3-9) as the iterative equations of the motion model and the size model, Kalman filter prediction based on the normal distribution is carried out for each, obtaining the position prediction information and the size prediction information of the (n+1)th frame.

For any two trajectory segments X and Y, the forward velocity vector points from the head to the tail of trajectory X and the reverse velocity vector points from the tail to the head of trajectory Y (as derived from formulas (3-10) and (3-11)). Γ represents the motion process simulated by the Kalman filter. F(X, Y) is the forward similarity score pointing from the tail of trajectory X to the head of trajectory Y, and the reverse similarity score points from the head of trajectory Y to the tail of trajectory X. The forward similarity score, the reverse similarity score and the overall similarity are given by formulas (3-10), (3-11) and (3-12) [formula images in the original].

The overall similarity can be expressed by (3-12); Λ_M(X, Y) denotes the similarity between the first-segment trajectory X and the second-segment trajectory Y calculated by formulas (3-10) and (3-11). The value range of Λ_M(X, Y) is [0, 1]; the closer Λ_M(X, Y) is to 1, the more likely it is that the first-segment trajectory X and the second-segment trajectory Y belong to the same target under the model's simulation of physical motion, which is an important basis for judging the relation between fragment trajectories.
Trajectory confidence can be intuitively understood as the degree of match between the constructed trajectory and the true trajectory of the object. The confidence conf(T_i) of a trajectory can be expressed by formula (3-13) [formula image in the original], in which the first term represents the average similarity between the detections already in the trajectory, Λ(d_a^m, d_b^n) denotes the similarity between two detection boxes d_a^m and d_b^n in trajectory T_i, the second term represents the continuity of the trajectory, α is the number of frames in which the target is missing, and β is a control parameter related to the accuracy of the detector (default 0.4).
On the time axis, the video sequence selected by the semi-online mechanism lies between those of the online and offline mechanisms, and its performance is a good compromise between the two; moreover, the semi-online tracking mechanism can be further improved in real-time performance and accuracy through algorithmic optimization, such as occlusion optimization and semantic-segmentation optimization.
The present invention defines the length of the time window as N and the minimum instantiation length of a short trajectory as T_m. The Kalman family map is denoted KFM and records the detection relations between the motion models and the appearance models. The kth detection box in the nth frame is denoted d_k^n; it also contains the detected coordinates and reliability in the list [x, y, w, h, conf]. KFM(d_k^n) represents the order of detection box d_k^n in its corresponding fragment trajectory. If KFM(d_k^n) = -1, the detection box has not yet been cascaded with any fragment trajectory in the KFM; KFM(d_k^n) = x means that d_k^n is the (x+1)th member of some fragment trajectory in the KFM. The ith fragment trajectory in the KFM is defined as TK_i; if its length is greater than T_m and its motion model, appearance model and size model are not updated in the nth frame, it is instantiated as a reliable short trajectory ST_j; otherwise, the trajectory is disassembled.
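The KFM bookkeeping just described can be sketched as follows; the Detection/FragmentTrajectory names and the -1 sentinel are assumptions chosen to match the (x+1)th-member convention above.

    from dataclasses import dataclass, field

    @dataclass
    class Detection:
        frame: int
        box: tuple          # (x, y, w, h)
        conf: float
        order: int = -1     # KFM(d) = -1: not yet cascaded to any fragment trajectory

    @dataclass
    class FragmentTrajectory:
        members: list = field(default_factory=list)
        last_update: int = -1

        def append(self, det, frame):
            det.order = len(self.members)   # det becomes the (x+1)th member
            self.members.append(det)
            self.last_update = frame

    def instantiate_or_disassemble(tk, frame, t_min=5):
        """Promote a long, no-longer-updated fragment trajectory to a reliable
        short trajectory; otherwise disassemble it (members return to the pool)."""
        if len(tk.members) > t_min and tk.last_update < frame:
            return "instantiate"
        for det in tk.members:
            det.order = -1                  # reset the KFM order of each member
        return "disassemble"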
Taking the nth frame of pedestrian pictures as an example, the short trajectory tracking process and the trajectory backtracking strategy of the invention are introduced below, as shown in FIG. 2 (a code sketch of this loop follows the five steps below):

The first step, finding the Kalman head detection boxes KH of the nth frame of pedestrian pictures: among the detection boxes of the nth frame and the (n+1)th frame pictures, where d_i^n is a detection box in the nth frame picture and d_j^{n+1} is a detection box in the (n+1)th frame picture, each pair of detection boxes that are close in the IOU relation and may belong to the same target is found. If IOU(d_i^n, d_j^{n+1}) exceeds the threshold, d_i^n and d_j^{n+1} are labeled 0 and 1 respectively, and d_i^n and d_j^{n+1} are called a pair of Kalman head detection boxes KH.

After this step, there will be several pairs of KH between the nth and (n+1)th frames. KFM(d_k^n) represents the order of detection box d_k^n in its corresponding fragment trajectory in the KFM, and the ith detection box in the nth frame is denoted d_i^n.

The second step, prediction:

The position of each pedestrian target in the next frame picture is predicted according to the motion model of that target in the current (n+1)th frame picture. The specific process is as follows: according to the Kalman head pair d_i^n and d_j^{n+1}, the motion model of an unstable trajectory TK_i is established. According to this motion model, the position in the (n+2)th frame of the target belonging to trajectory TK_i is predicted, and this position is defined as the predicted position p̂^{n+2}.

The third step, trajectory growth: according to the matching strategy of formula (3), the detection box whose position is most similar to p̂^{n+2} is selected and denoted d*^{n+2}; then d*^{n+2} is appended to the unstable trajectory TK_i (its order in the KFM being incremented by one), and the motion model and appearance model of the unstable trajectory TK_i are updated. The updated motion model and appearance model of the unstable trajectory TK_i are used to predict the position in the next frame.

The fourth step: the processes of the first to third steps are repeated for the tracking of each frame.

The fifth step, instantiation or backtracking: the short trajectories in the KFM (e.g., TK_0, TK_1, ..., TK_i) in the current frame are instantiated or backtracked according to the following conditions:

a) instantiation: if the length of an unstable trajectory TK_i in the Kalman family map is greater than or equal to the threshold T_m and it is not updated in the last frame, the unstable trajectory TK_i is instantiated as a new reliable trajectory ST_j. That is, an unstable trajectory TK_i whose length reaches the threshold T_m is a reliable trajectory. The threshold T_m is chosen according to the actual situation; 5 is generally adopted.

b) backtracking: if the length of an unstable trajectory TK_i in the Kalman family map is less than the threshold T_m and there is no update in the last frame, the unstable trajectory TK_i is deleted from the Kalman family map KFM, the KFM order of its member detection boxes is reset to -1, and the fragment trajectory is marked in the Kalman family map as a forbidden route, so that later exploration does not reproduce this path.
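A sketch tying the five steps together under the assumptions of the earlier sketches; the IOU threshold of 0.5 is assumed (the patent does not state a value), and the second and third steps are only stubbed since their matching formulas are images in the original. Detection, FragmentTrajectory and instantiate_or_disassemble come from the KFM sketch above.

    def iou(a, b):
        """IOU of two (x, y, w, h) boxes."""
        ix, iy = max(a[0], b[0]), max(a[1], b[1])
        ix2 = min(a[0] + a[2], b[0] + b[2])
        iy2 = min(a[1] + a[3], b[1] + b[3])
        inter = max(0, ix2 - ix) * max(0, iy2 - iy)
        union = a[2] * a[3] + b[2] * b[3] - inter
        return inter / union if union > 0 else 0.0

    def kalman_heads(dets_n, dets_n1, thr=0.5):
        """Step 1: pair detections of frames n and n+1 that are close in IOU."""
        return [(a, b) for a in dets_n for b in dets_n1 if iou(a.box, b.box) > thr]

    def semi_online_step(kfm, dets_n, dets_n1, frame_idx, t_min=5):
        """One pass: start fragment trajectories from Kalman head pairs (Step 1),
        then instantiate or backtrack (Step 5). Steps 2-3 (prediction and growth)
        would call kalman_predict and appearance_similarity from the sketches
        above and are omitted here."""
        for a, b in kalman_heads(dets_n, dets_n1):
            if a.order == -1 and b.order == -1:   # neither box cascaded yet
                tk = FragmentTrajectory()
                tk.append(a, frame_idx)
                tk.append(b, frame_idx + 1)
                kfm.append(tk)
        for tk in list(kfm):
            if instantiate_or_disassemble(tk, frame_idx + 1, t_min) == "disassemble":
                kfm.remove(tk)                    # backtracking: forbidden route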
The invention adopts an IOU mask module to handle the case where two or more targets occlude each other; the process is as follows. FIG. 3 shows a scene with mutually occluded targets. When target A and target B occlude each other, before features are extracted from the detection-box region, the IOU region between A and B is used as a mask to cover the pixel information of that region, which prevents the related targets from sharing the feature information of the IOU region and effectively improves the discrimination between the appearance models of different targets. However, when several targets occlude each other, the detection region of a target can easily be almost completely covered by multiple IOU masks, so that the appearance features of the occluded target are masked out entirely. To avoid this, the present invention sets a threshold Thres_IM to exclude the worst case.

Referring to FIG. 3, in the nth frame the kth detection box is denoted d_k^n, and the set of IOU masks between the detection boxes is M^n. For the kth detection box, assume there is a set S_k of detection boxes, all of which occlude the region where detection box d_k^n lies. For the detection boxes in set S_k, the union of their IOU masks with d_k^n is recorded as the total occlusion merge area A_k; the remaining area of d_k^n after being covered by A_k is denoted R_k; the process of covering d_k^n by A_k is expressed by formulas (4) and (5) [formula images in the original].

If the obtained ratio of R_k to the area of d_k^n is less than the preset threshold Thres_IM, the appearance features of the target would be difficult to express in the appearance model. Therefore, the invention sorts the detection boxes in set S_k by the area of the region in which they occlude d_k^n, and then removes, one by one, the detection box with the smallest occluded area from set S_k to obtain a new detection-box set S_k'; the new set S_k' is substituted into formulas (4) and (5) again and the calculation repeated until the ratio is no less than Thres_IM, whereupon the final IOU mask is obtained. When the IOU mask module is used for appearance-feature extraction, the pixel values of the original image region where the mask lies are set to zero: by masking the intersection region between the occluding and occluded targets, the feature discrimination between targets is increased.
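The IOU mask procedure can be sketched as follows; the visible-fraction threshold thres_im = 0.3 is an assumed value, since the patent does not state Thres_IM, and dropping the smallest overlaps first follows the sorting rule above.

    import numpy as np

    def apply_iou_masks(patch_region, boxes, k, thres_im=0.3):
        """Zero out the pairwise intersection regions between box k and the
        boxes occluding it, keeping at least a thres_im fraction of box k
        visible by dropping the smallest overlaps first."""
        x, y, w, h = boxes[k]
        overlaps = []
        for j, (bx, by, bw, bh) in enumerate(boxes):
            if j == k:
                continue
            ix, iy = max(x, bx), max(y, by)
            ix2, iy2 = min(x + w, bx + bw), min(y + h, by + bh)
            if ix < ix2 and iy < iy2:
                overlaps.append(((ix2 - ix) * (iy2 - iy), (ix, iy, ix2, iy2)))
        overlaps.sort()                       # smallest occluded area first
        mask = np.ones((h, w), dtype=bool)    # True = visible pixel of box k
        while overlaps:
            trial = mask.copy()
            for _, (ix, iy, ix2, iy2) in overlaps:
                trial[iy - y:iy2 - y, ix - x:ix2 - x] = False
            if trial.mean() >= thres_im:      # enough of box k remains visible
                mask = trial
                break
            overlaps.pop(0)                   # drop the smallest overlap, retry
        out = patch_region.copy()
        out[~mask] = 0                        # cover masked pixels with zeros
        return out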
The following is a specific embodiment.

The time window length is first set to 40 frames. The video and the detection boxes of each frame are taken as input for feature extraction; the patch corresponding to each detection box, obtained after cropping and resizing, is given an appearance representation by pixel-histogram grouping, as shown in FIG. 4, and an appearance model is established. The appearance model is built by the following process: in the nth frame, the patch size is fixed to [64, 128], the number of pixel-histogram groups is 64, and there are D detection boxes and D patches in the frame; the Xth detection box is d_X^n, and the patch corresponding to the Xth detection box is p_X^n. In the nth frame, crop and resize operations are performed on the region of each detection box (each patch is cropped and resized to a tensor of shape [64, 128]); after these operations, D patches of the fixed detection-box size are obtained.

Then the pixels of the D patches are divided into groups (e.g., 64 groups) by color interval, and the matrix result obtained by grouping is reshaped into a one-dimensional vector Tsr_X; that is, a 1 × 192 tensor is obtained by reshaping the 3 × 64 tensor produced by color-interval grouping. The one-dimensional vector Tsr_X is then taken as the appearance model of the patch p_X^n corresponding to the Xth detection box.

Combining the appearance models, those of the Xth detection box and the Yth trajectory are expressed as f(X) and f(Y). Finally, the appearance model is updated through vector fusion, and the appearance similarity is obtained as formula (7) [formula image in the original], in which Λ_A(X, Y) denotes the similarity of the appearance model.

Through the above steps, fragment trajectories of greatly improved reliability are obtained, and noise appearing in the detections can be effectively checked, as shown in Table 1.

TABLE 1 Comparison of algorithm results on MOT15 [table rendered as an image in the original]
Referring to FIG. 5 and FIG. 6, the invention combines the advantages of online and offline MOT at the expense of a small amount of real-time performance, improves the MOTA, MOTP, IDS, ML, MT and FM indicators well, and achieves a proper balance between real-time performance and accuracy.

Compared with the baseline on the MOT2015 dataset, almost all indicators except FPS are better: MOTA and MOTP are improved by 12.6% and 6.3% respectively, which shows that, compared with the baseline, the algorithm of the invention greatly improves the continuous tracking ability for targets as a whole and that the short trajectory fragments it generates are more accurate and robust. The algorithm of the invention reduces IDS by 82, a smaller improvement in the overall number of identity switches. The algorithm has a higher MT value and a lower ML value, which shows that, to a certain extent, more robust short trajectories can reduce the number of frames missed between partial fragment trajectories. In terms of overall performance among the algorithms, the MOTA of the proposed algorithm ranks first, and other indicators such as MOTP, Recall and IDS are far better than the average overall, which means the algorithm framework has strong stability and generalization ability. It is particularly noteworthy that the backtracking mechanism contains no module involving complex computation and relies only on the simple appearance model and motion model to carry out the tracking process in a state between online and offline, so the algorithm of the invention has an extremely significant FPS advantage over the other algorithms in the table.

The invention adopts a semi-online mechanism to optimize the real-time performance and accuracy of the multi-target tracking method. The method can detect and correct errors in the established tracking results, effectively improves the appearance-feature discrimination of targets, is fast, and has low computing-resource requirements; it can run on embedded platforms such as the NVIDIA TX2 in scenes such as autonomous driving and pedestrian tracking, and it effectively addresses the problem that existing multi-target tracking algorithms find it difficult to achieve optimal real-time performance and algorithm accuracy (e.g., the MOTA, multiple object tracking accuracy, indicator) at the same time.

Claims (6)

1. A multi-target tracking method with a semi-online mechanism, characterized in that detection boxes of pedestrians or moving targets are obtained from a pedestrian or moving-target video by a YOLO-V3 detector; a Kalman family map is built according to the position-change information between the detection boxes within a time window; a pair of Kalman heads is then found from the Kalman family map, and the detection box of the target or moving object to be tracked in the next frame is obtained through the similarity of an appearance model, a motion model and a size-change model, so that the target or moving object always lies in a detection box in that frame; otherwise, the target is reported lost; and detection boxes with similarity higher than a threshold are spliced into the Kalman family map, the motion model and appearance model in the Kalman family map are updated, and the pedestrian or moving-object target is tracked in the next frame.
2. The multi-target tracking method with a semi-online mechanism according to claim 1, wherein the similarity of the appearance model is obtained through the following process:

in the nth frame of the video, the patch size is fixed to [64, 128]; there are D detection boxes and D patches, the Xth detection box is denoted d_X^n, and the patch corresponding to the Xth detection box is p_X^n;

in the nth frame, crop and resize operations are carried out on the region where each detection box is located, obtaining D patches of the fixed detection-box size; the pixels of the D patches are then each divided into several groups by color interval, and the matrix result obtained by grouping is reshaped into a one-dimensional vector Tsr_X, which is taken as the representation vector of p_X^n;

an appearance model is thereby obtained, and the appearance models of the Xth detection box and the Yth trajectory are expressed as f(X) and f(Y); finally, the appearance model is updated by vector fusion of f(X) and f(Y);

the similarity of the appearance model is given by formula (3-1) [formula image in the original], in which Λ_A(X, Y) denotes the similarity of the appearance model.
3. The multi-target tracking method with a semi-online mechanism according to claim 1, wherein the similarity between the motion model and the size change model is obtained through the following process: the time difference between adjacent frames is Δt and the kth target in the nth frame is d_k^n; its position center coordinate c_k^n is (x, y), the velocity vector corresponding to the coordinate is (v_x, v_y), the acceleration vector corresponding to the coordinate is (a_x, a_y), the size of the target corresponding to the detection box is (w, h), the corresponding size-change speed is (v_w, v_h), the driving force of the change is u, and the detector influence factor is α;

the motion state s_k^n and the size state r_k^n of the kth target in the nth frame are respectively s_k^n = [x, y, v_x, v_y, a_x, a_y]^T and r_k^n = [w, h, v_w, v_h]^T; the covariance matrix between the element factors of the motion state is P_k^n, and the covariance matrix between the element factors of the size state is Q_k^n;

according to the laws of physical motion, the position prediction equation and the size prediction equation for the next frame are obtained as follows:

x_{n+1} = x_n + v_x Δt + (1/2) a_x Δt², y_{n+1} = y_n + v_y Δt + (1/2) a_y Δt²,
v_{x,n+1} = v_{x,n} + a_x Δt, v_{y,n+1} = v_{y,n} + a_y Δt,

i.e. s_k^{n+1} = F_s s_k^n, and

w_{n+1} = w_n + v_w Δt, h_{n+1} = h_n + v_h Δt,

i.e. r_k^{n+1} = F_r r_k^n, where F_s and F_r are the corresponding state transition matrices;

letting F_s and F_r be so defined, the two iterative state transfer equations and the covariance matrix updating equation are simplified as:

s_k^{n+1} = F_s s_k^n, r_k^{n+1} = F_r r_k^n    (3-8)
P_k^{n+1} = F_s P_k^n F_s^T + U, Q_k^{n+1} = F_r Q_k^n F_r^T + U    (3-9)

where U is the process noise contributed by the driving force u;

taking (3-8) and (3-9) as the iterative equations of the motion model and the size model, Kalman filter prediction based on the normal distribution is carried out for each, obtaining the position prediction information and the size prediction information of the (n+1)th frame;

for any first-segment trajectory X and second-segment trajectory Y, the forward velocity vector points from the head to the tail of the first trajectory X and the reverse velocity vector points from the tail to the head of the second trajectory Y; Γ represents the motion process simulated by the Kalman filter; F(X, Y) is the forward similarity score pointing from the tail of trajectory X to the head of trajectory Y, and the reverse similarity score points from the head of trajectory Y to the tail of trajectory X;

the forward similarity score, the reverse similarity score and the overall similarity are given by formulas (3-10), (3-11) and (3-12) [formula images in the original], wherein Λ_M(X, Y) denotes the similarity between the first-segment trajectory X and the second-segment trajectory Y.
4. The multi-target tracking method with a semi-online mechanism according to claim 1, wherein the length of the time window is defined as N and the minimum instantiation length of a short trajectory as T_m; the Kalman family map is denoted KFM, and the kth detection box in the nth frame is denoted d_k^n; KFM(d_k^n) represents the order of detection box d_k^n in its corresponding fragment trajectory; if KFM(d_k^n) = -1, the detection box has not yet been cascaded with any fragment trajectory in the KFM, and KFM(d_k^n) = x means that d_k^n is the (x+1)th member of some fragment trajectory in the KFM; the ith fragment trajectory in the KFM is defined as TK_i; if the length of the ith fragment trajectory is greater than T_m and its motion model, appearance model and size model are not updated in the nth frame, the ith fragment trajectory is instantiated as a reliable short trajectory ST_j; otherwise, the ith fragment trajectory is disassembled.
5. The multi-target tracking method with a semi-online mechanism according to claim 1, wherein the specific process of splicing the detection boxes with similarity higher than the threshold into the Kalman family map is as follows:

the first step, finding the Kalman head detection boxes KH of the nth frame of pedestrian pictures: among the detection boxes of the nth frame and the (n+1)th frame pictures, where d_i^n is a detection box in the nth frame picture and d_j^{n+1} is a detection box in the (n+1)th frame picture, each pair of detection boxes that are close in the IOU relation and may belong to the same target is found; if IOU(d_i^n, d_j^{n+1}) exceeds the threshold, d_i^n and d_j^{n+1} are labeled 0 and 1 respectively, and d_i^n and d_j^{n+1} are called a pair of Kalman head detection boxes KH; there will be several pairs of KH between the nth and (n+1)th frames; KFM(d_k^n) represents the order of detection box d_k^n in its corresponding fragment trajectory in the KFM, and the ith detection box in the nth frame is denoted d_i^n;

the second step, prediction: according to the motion model of each pedestrian target in the current (n+1)th frame picture, the position of the pedestrian target in the next frame picture is predicted, and this predicted position is denoted p̂^{n+2};

the third step, trajectory growth: according to formula (3), the detection box whose position is most similar to p̂^{n+2} is selected and denoted d*^{n+2}; then d*^{n+2} is appended to the unstable trajectory TK_i (its order in the KFM being incremented by one), and the motion model and appearance model of the unstable trajectory TK_i are updated; the updated motion model and appearance model of the unstable trajectory TK_i are used to predict the position in the next frame;

the fourth step: the processes of the first to third steps are repeated for the tracking of each frame;

the fifth step, instantiation or backtracking: the short trajectories in the KFM in the current frame are instantiated or backtracked according to the following conditions:

a) instantiation: if the length of an unstable trajectory TK_i in the Kalman family map is greater than or equal to the threshold T_m and it is not updated in the last frame, the unstable trajectory TK_i is instantiated as a new reliable trajectory ST_j;

b) backtracking: if the length of an unstable trajectory TK_i in the Kalman family map is less than the threshold T_m and there is no update in the last frame, the unstable trajectory TK_i is deleted from the Kalman family map KFM, the KFM order of its member detection boxes is reset to -1, and the fragment trajectory is marked in the Kalman family map as a forbidden route.
6. The multi-target tracking method with a semi-online mechanism according to claim 5, wherein the specific process of the second step is as follows: according to the Kalman head pair d_i^n and d_j^{n+1}, the motion model of the unstable trajectory TK_i is established; according to this motion model, the position in the (n+2)th frame of the target belonging to trajectory TK_i is predicted, and this position is defined as the predicted position p̂^{n+2}.
CN202010754142.2A 2020-07-30 2020-07-30 Multi-target tracking method with a semi-online mechanism Active CN112116634B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010754142.2A CN112116634B (en) 2020-07-30 2020-07-30 Multi-target tracking method with a semi-online mechanism

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010754142.2A CN112116634B (en) 2020-07-30 2020-07-30 Multi-target tracking method with a semi-online mechanism

Publications (2)

Publication Number Publication Date
CN112116634A true CN112116634A (en) 2020-12-22
CN112116634B CN112116634B (en) 2024-05-07

Family

ID=73799581

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010754142.2A Active CN112116634B (en) 2020-07-30 2020-07-30 Multi-target tracking method with a semi-online mechanism

Country Status (1)

Country Link
CN (1) CN112116634B (en)


Patent Citations (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101141633A (en) * 2007-08-28 2008-03-12 湖南大学 Moving object detecting and tracing method in complex scene
CN103530894A (en) * 2013-10-25 2014-01-22 合肥工业大学 Video target tracking method based on multi-scale block sparse representation and system thereof
CN103632376A (en) * 2013-12-12 2014-03-12 江苏大学 Method for suppressing partial occlusion of vehicles by aid of double-level frames
CN104915970A (en) * 2015-06-12 2015-09-16 南京邮电大学 Multi-target tracking method based on track association
CN105809714A (en) * 2016-03-07 2016-07-27 广东顺德中山大学卡内基梅隆大学国际联合研究院 Track confidence coefficient based multi-object tracking method
WO2017185688A1 (en) * 2016-04-26 2017-11-02 深圳大学 Method and apparatus for tracking on-line target
CN106096645A (en) * 2016-06-07 2016-11-09 上海瑞孚电子科技有限公司 Resist and repeatedly block and the recognition and tracking method and system of color interference
US20180232891A1 (en) * 2017-02-13 2018-08-16 Electronics And Telecommunications Research Institute System and method for tracking multiple objects
CN108447080A (en) * 2018-03-02 2018-08-24 哈尔滨工业大学深圳研究生院 Method for tracking target, system and storage medium based on individual-layer data association and convolutional neural networks
US20190295313A1 (en) * 2018-03-21 2019-09-26 Leigh Davies Method and apparatus for masked occlusion culling
CN109191497A (en) * 2018-08-15 2019-01-11 南京理工大学 A kind of real-time online multi-object tracking method based on much information fusion
KR20200039043A (en) * 2018-09-28 2020-04-16 한국전자통신연구원 Object recognition device and operating method for the same
US20200126241A1 (en) * 2018-10-18 2020-04-23 Deepnorth Inc. Multi-Object Tracking using Online Metric Learning with Long Short-Term Memory
KR20200061118A (en) * 2018-11-23 2020-06-02 인하대학교 산학협력단 Tracking method and system multi-object in video
CN109919981A (en) * 2019-03-11 2019-06-21 南京邮电大学 A kind of multi-object tracking method of the multiple features fusion based on Kalman filtering auxiliary
CN110135314A (en) * 2019-05-07 2019-08-16 电子科技大学 A kind of multi-object tracking method based on depth Trajectory prediction
CN110362715A (en) * 2019-06-28 2019-10-22 西安交通大学 A kind of non-editing video actions timing localization method based on figure convolutional network
CN111242985A (en) * 2020-02-14 2020-06-05 电子科技大学 Video multi-pedestrian tracking method based on Markov model

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party

Title
SEUNG-HWAN BAE et al.: "Confidence-Based Data Association and Discriminative Deep Appearance Learning for Robust Online Multi-Object Tracking", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 40, no. 3, 31 March 2018 (2018-03-31), pages 595-610 *
STRONGERHUANG: "In-Depth Analysis of the Kalman Filter Algorithm Principle" (深度分析卡尔曼滤波算法原理), HTTPS://MP.WEIXIN.QQ.COM/S/OSTYC-NA-GFJNCZ2XQQTDQ, 24 June 2020 (2020-06-24), pages 1-18 *
嵌入式ARM: "In-Depth Interpretation: the Kalman Filter, a Tool So Powerful It Is Worth Understanding!" (深度解读:卡尔曼滤波,如此强大的工具 你值得弄懂!), 8 September 2019 (2019-09-08), pages 1-21 *
慧天地: "A Detailed Explanation of the Kalman Filter Principle" (详解卡尔曼滤波原理), HTTPS://WWW.SOHU.COM/A/332038419_650579, 7 August 2019 (2019-08-07), pages 1-24 *
李明华 et al.: "Online Multi-Target Tracking Algorithm Based on Hierarchical Data Association" (基于分层数据关联的在线多目标跟踪算法), 《现代计算机》 (Modern Computer), vol. 2018, no. 5, 15 February 2018 (2018-02-15), pages 25-29 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112906533A (en) * 2021-02-07 2021-06-04 成都睿码科技有限责任公司 Safety helmet wearing detection method based on self-adaptive detection area
CN112906533B (en) * 2021-02-07 2023-03-24 成都睿码科技有限责任公司 Safety helmet wearing detection method based on self-adaptive detection area

Also Published As

Publication number Publication date
CN112116634B (en) 2024-05-07

Similar Documents

Publication Publication Date Title
CN109800689B (en) Target tracking method based on space-time feature fusion learning
WO2017150032A1 (en) Method and system for detecting actions of object in scene
CN109919981A (en) A kind of multi-object tracking method of the multiple features fusion based on Kalman filtering auxiliary
CN106709436A (en) Cross-camera suspicious pedestrian target tracking system for rail transit panoramic monitoring
CN111311666A (en) Monocular vision odometer method integrating edge features and deep learning
US20190378283A1 (en) System and method for transforming video data into directional object count
CN108416780B (en) Object detection and matching method based on twin-region-of-interest pooling model
CN109389086A (en) Detect the method and system of unmanned plane silhouette target
Cheng et al. A self-constructing cascade classifier with AdaBoost and SVM for pedestriandetection
CN111881749B (en) Bidirectional people flow statistics method based on RGB-D multi-mode data
CN111862145A (en) Target tracking method based on multi-scale pedestrian detection
CN112651995A (en) On-line multi-target tracking method based on multifunctional aggregation and tracking simulation training
CN114220061B (en) Multi-target tracking method based on deep learning
CN115841649A (en) Multi-scale people counting method for urban complex scene
CN115731266A (en) Cross-camera multi-target tracking method, device and equipment and readable storage medium
Han et al. A method based on multi-convolution layers joint and generative adversarial networks for vehicle detection
CN111062971A (en) Cross-camera mud head vehicle tracking method based on deep learning multi-mode
CN116152297A (en) Multi-target tracking method suitable for vehicle movement characteristics
Wei et al. Traffic sign detection and recognition using novel center-point estimation and local features
CN113763427A (en) Multi-target tracking method based on coarse-fine shielding processing
CN115346155A (en) Ship image track extraction method for visual feature discontinuous interference
CN114926859A (en) Pedestrian multi-target tracking method in dense scene combined with head tracking
CN111882581A (en) Multi-target tracking method for depth feature association
CN112116634A (en) Multi-target tracking method with a semi-online mechanism
CN111862147B (en) Tracking method for multiple vehicles and multiple lines of human targets in video

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant