CN111080673B - Anti-occlusion target tracking method

Anti-occlusion target tracking method

Info

Publication number: CN111080673B
Application number: CN201911261618.2A
Authority: CN (China)
Prior art keywords: target, tracking, candidate, detection
Other languages: Chinese (zh)
Other versions: CN111080673A (application publication)
Inventors: 张盛, 易梦云, 徐赫
Original and current assignee: Shenzhen International Graduate School of Tsinghua University
Application filed by Shenzhen International Graduate School of Tsinghua University
Legal status: Active (granted)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00 Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07 Target detection
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Abstract

The invention provides an anti-occlusion target tracking method. First, for an input video or image sequence, a target detector is applied to each frame to obtain detection-based candidates, and, from the target detection result of the current frame, a Kalman filter predicts the position of each target in the next frame to obtain tracking-based candidates. The confidence of each candidate is calculated with a confidence score formula, and a non-maximum suppression algorithm yields the final candidates. Candidates of adjacent frames are then fed into a feature matching network, and the matching degree between targets is computed by a cascade matching algorithm: the detection-based candidates are matched by the similarity of features extracted with a deep neural network, while the tracking-based candidates are matched by bounding-box intersection-over-union (IoU) overlap. Finally, the position of the target in the current frame is determined from the target matching result of adjacent frames, and the target motion track is output. The method detects and tracks targets even when they are occluded, improving tracking precision and performance.

Description

Anti-occlusion target tracking method
Technical Field
The invention relates to the technical field of target tracking, in particular to an anti-occlusion target tracking method.
Background
In recent years, with the continuous development of deep neural networks and the continuous improvement of GPU computing power, deep-learning-based methods have made breakthrough progress on computer vision tasks. Computer vision technologies such as target detection, target recognition, target tracking, and pedestrian re-identification have developed rapidly and are widely applied in industries and fields such as intelligent monitoring, human-computer interaction, virtual and augmented reality, and medical image analysis.
Multi-object tracking is a classic computer vision task; the regions of interest obtained by target tracking are the basis for further high-level vision analysis, and the accuracy of target tracking directly affects the performance of a computer vision system. Most existing multi-target tracking methods adopt the tracking-by-detection paradigm: given the output of a target detector, detections belonging to the same target are associated into motion tracks across frames. Such methods depend heavily on the detection results. However, in many practical applications, especially in crowded scenes, the detector output is usually not accurate enough because of interactions between targets, appearance similarity, and frequent occlusion, which seriously degrades tracking accuracy and performance.
Some existing multi-target tracking algorithms retrain the target detector on a large-scale dataset to obtain more accurate detections; however, they ignore the motion information in the video images and are not efficient enough. Other methods extract features with deeper neural networks to obtain more robust target representations; however, appearance-based features alone are ill-suited to the appearance-similarity problem, and the real-time performance of such algorithms is hard to guarantee. In view of the above, it is desirable to provide a new anti-occlusion target tracking method to handle target occlusion and interaction.
Disclosure of Invention
The invention provides an anti-occlusion target tracking method to solve the above problems.
In order to solve the above problems, the technical solution adopted by the present invention is as follows:
An anti-occlusion target tracking method comprises the following steps:
S1: inputting a video or an image sequence frame by frame into a target detector to obtain a target detection result, wherein the target detection result constitutes the detection-based candidates and comprises the bounding boxes and detection confidences of all targets in each frame of image;
S2: generating tracking-based candidates for each frame of image from the target detection result by using a joint detection-and-tracking framework, wherein the framework performs tracking motion estimation on the detection result through a Kalman filter and camera motion compensation to obtain the tracking-based candidates;
S3: screening the detection-based candidates and the tracking-based candidates with a non-maximum suppression algorithm according to their confidences, to obtain the screened detection-based candidates and the screened tracking-based candidates;
S4: extracting the apparent features of all screened detection-based and tracking-based candidates of the current frame with a pre-trained deep neural network;
S5: calculating the target matching degree between adjacent frames with a cascade matching algorithm, wherein the screened detection-based candidates are matched to the existing tracks of adjacent frames by apparent-feature similarity, and the screened tracking-based candidates are matched to the target bounding boxes of the existing tracks of adjacent frames by bounding-box intersection-over-union (IoU);
S6: determining the position of each target in the current frame from the target matching degree of adjacent frames, thereby outputting the target motion track.
Preferably, the target detector is an SDP target detector.
Preferably, the confidence is given by the following confidence score formula:

s_trk^t = max(0, s_det^{t-1} · (1 − log(1 + α · N_trk))) · 1(N_det ≥ 2)

where s_det^{t-1} is the detection confidence of the (t−1)-th frame, s_trk^t is the tracking confidence of the t-th frame, N_det is the number of detection-based candidates in the track to be associated, N_trk is the number of tracking-based candidates since the track was last associated, 1(·) is a binary indicator function that takes the value 1 when its condition is true and 0 otherwise, and the parameter α is a constant.
Preferably, screening the detection-based candidates and the tracking-based candidates with a non-maximum suppression algorithm to obtain the screened detection-based candidates and the screened tracking-based candidates comprises the following steps: S21: sorting all detection-based and tracking-based candidates by confidence score to obtain a candidate list; S22: moving the detection-based or tracking-based candidate with the highest confidence from the candidate list to a final output list; S23: computing the bounding-box intersection-over-union (IoU) between that highest-confidence candidate and the other candidates, and deleting the candidates whose bounding-box IoU exceeds a preset threshold; S24: repeating the process until the candidate list is empty, the final output list then containing the screened detection-based and tracking-based candidates.
Preferably, the preset threshold is 0.3-0.5.
Preferably, the deep neural network is a GoogLeNet-based network comprising the layers from the input layer through the inception_4e layer, followed by a 1×1 convolutional layer.
Preferably, the loss function used to train the neural network is:

l_triplet(I_i, I_j, I_k) = m + d(I_i, I_j) − d(I_i, I_k)

where I_i and I_j are pictures of the same identity, I_i and I_k are pictures of different identities, d denotes the Euclidean distance, and m is a constant margin.
Preferably, the step of calculating the target matching degree of the adjacent frames by using the cascade matching algorithm comprises the following steps: S51: obtaining the target detection result of the first frame and generating a track for each target, so as to obtain the initial track set T, the candidate set C consisting of the screened detection-based candidates and the screened tracking-based candidates, and their apparent features F, and constructing the set C_m of all matched candidates and the set C_u of all unmatched candidates; S52: performing feature-similarity calculation between the screened detection-based candidates C_det and the initial track set T, and updating the matched candidate set C_m, the unmatched candidate set C_u, and the initial track set T according to the matching result; S53: performing bounding-box intersection-over-union (IoU) matching between the screened tracking-based candidates C_trk and the target bounding boxes of the updated initial track set T, and updating the matched candidate set C_m and the unmatched candidate set C_u according to the matching result.
Preferably, each candidate bounding box in the matched candidate set C_m is connected to the corresponding track segment in the initial track set T; each candidate in the unmatched candidate set C_u is initialized as a new track; an unmatched track in the initial track set T is marked as a temporary track segment, and if it is not matched in the following N consecutive frames, the temporary track segment is considered to have ended and is deleted from the initial track set T. The value of N is 5-8.
The invention also provides a computer-readable storage medium in which a computer program is stored; when executed by a processor, the program implements the steps of any of the methods described above.
The beneficial effects of the invention are as follows: an anti-occlusion target tracking method is provided in which, through the combined action of the joint detection-and-tracking framework and the cascade matching algorithm, better candidates can be generated for target cascade matching even when targets occlude each other interactively and the detector output is inaccurate. This alleviates inaccurate detection during target interaction and occlusion and reduces the influence of occlusion on the tracking result, thereby achieving accurate tracking under occlusion.
Furthermore, the method is simple to implement and computationally cheap: the algorithm reaches a running speed of 30 frames per second on a GPU, enabling real-time tracking. Compared with traditional target tracking methods, it requires low computational cost, has strong anti-occlusion capability, and runs in real time.
Drawings
FIG. 1 is a schematic diagram of an anti-occlusion target tracking method in an embodiment of the present invention.
FIG. 2 is a schematic diagram of a method for obtaining the screened detection-based candidates and the screened tracking-based candidates according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of a method for calculating the target matching degree of adjacent frames with the cascade matching algorithm in an embodiment of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantageous effects to be solved by the embodiments of the present invention more clearly apparent, the present invention is further described in detail below with reference to the accompanying drawings and the embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
It will be understood that when an element is referred to as being "secured to" or "disposed on" another element, it can be directly on the other element or be indirectly on the other element. When an element is referred to as being "connected to" another element, it can be directly connected to the other element or be indirectly connected to the other element. In addition, the connection may be for either a fixing function or a circuit connection function.
It is to be understood that the terms "length," "width," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like indicate orientations or positional relationships based on those shown in the drawings, are used merely for convenience and simplicity of description of the embodiments of the present invention, do not indicate or imply that the referenced device or element must have a particular orientation or be constructed and operated in a particular orientation, and are not to be construed as limiting the present invention.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include one or more of that feature. In the description of the embodiments of the present invention, "a plurality" means two or more unless specifically limited otherwise.
As shown in FIG. 1, the present invention provides an anti-occlusion target tracking method, which comprises the following steps:
S1: inputting a video or an image sequence frame by frame into a target detector to obtain a target detection result, wherein the target detection result constitutes the detection-based candidates and comprises the bounding boxes and detection confidences of all targets in each frame of image;
S2: generating tracking-based candidates for each frame of image from the target detection result by using a joint detection-and-tracking framework, wherein the framework performs tracking motion estimation on the detection result through a Kalman filter and camera motion compensation to obtain the tracking-based candidates;
Taking the N-th frame as an example, the target bounding-box positions output by the SDP target detector for the current frame are taken as the detection-based candidates of the N-th frame. Meanwhile, the same bounding-box positions are input into the Kalman filter, which estimates the bounding-box positions in the next frame as the tracking-based candidates of the (N+1)-th frame.
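As an illustration of this step, the following is a minimal Python sketch of a constant-velocity Kalman filter over the bounding box; the state layout, noise settings, and box parameterization are illustrative assumptions rather than values taken from the patent, and the camera motion compensation mentioned above is omitted.

```python
import numpy as np

class BoxKalmanFilter:
    """Constant-velocity Kalman filter over a box (cx, cy, w, h).

    A minimal sketch: the predicted box for the next frame serves as the
    tracking-based candidate; the detector box associated to the track
    is fed back through update().
    """

    def __init__(self, box, dt=1.0):
        # State: [cx, cy, w, h, vcx, vcy, vw, vh]
        self.x = np.concatenate([np.asarray(box, dtype=float), np.zeros(4)])
        self.P = np.eye(8) * 10.0                          # state covariance
        self.F = np.eye(8)                                 # transition matrix
        self.F[:4, 4:] = np.eye(4) * dt                    # position += velocity * dt
        self.H = np.hstack([np.eye(4), np.zeros((4, 4))])  # only the box is observed
        self.Q = np.eye(8) * 1e-2                          # process noise (assumed)
        self.R = np.eye(4) * 1.0                           # measurement noise (assumed)

    def predict(self):
        """Propagate one frame ahead; returns the tracking-based candidate box."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:4]

    def update(self, box):
        """Correct the state with the detection associated to this track."""
        z = np.asarray(box, dtype=float)
        y = z - self.H @ self.x                            # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)           # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(8) - K @ self.H) @ self.P
```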
S3: screening the detection-based candidates and the tracking-based candidates with a non-maximum suppression algorithm according to their confidences, to obtain the screened detection-based candidates and the screened tracking-based candidates;
S4: extracting the apparent features of all screened detection-based and tracking-based candidates of the current frame with a pre-trained deep neural network;
In one embodiment of the invention, the apparent feature is a 512-dimensional depth feature.
S5: calculating the target matching degree between adjacent frames with a cascade matching algorithm, wherein the screened detection-based candidates are matched to the existing tracks of adjacent frames by apparent-feature similarity, and the screened tracking-based candidates are matched to the target bounding boxes of the existing tracks of adjacent frames by bounding-box intersection-over-union (IoU);
S6: determining the position of each target in the current frame from the target matching degree of adjacent frames, thereby outputting the target motion track.
In one embodiment of the invention, the object detector is an SDP object detector.
The confidence is given by the following confidence score formula:

s_trk^t = max(0, s_det^{t-1} · (1 − log(1 + α · N_trk))) · 1(N_det ≥ 2)

where s_det^{t-1} is the detection confidence of the (t−1)-th frame, s_trk^t is the tracking confidence of the t-th frame, N_det is the number of detection-based candidates in the track to be associated, N_trk is the number of tracking-based candidates since the track was last associated, 1(·) is a binary indicator function that takes the value 1 when its condition is true and 0 otherwise, and the parameter α is a constant.
In one embodiment of the invention, the α value is 0.05.
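For illustration, the scoring of a tracking-based candidate can be sketched as below. The original equation appears in the source only as an image; the functional form used here follows the reconstruction given above and should be read as an assumption, not as the patent's literal formula.

```python
import math

def tracklet_confidence(s_det_prev, n_det, n_trk, alpha=0.05):
    """Tracking confidence of a candidate predicted from an existing track.

    s_det_prev: detection confidence at the last frame in which the track
                was associated with a detection (s_det^{t-1}).
    n_det:      number of detection-based candidates in the track (N_det).
    n_trk:      number of tracking-based candidates since the last
                association (N_trk).
    The logarithmic decay and the N_det >= 2 gate follow the reconstruction
    above and are assumptions.
    """
    gate = 1.0 if n_det >= 2 else 0.0              # the indicator 1(.)
    decay = 1.0 - math.log(1.0 + alpha * n_trk)    # confidence decays while the track is unmatched
    return max(0.0, s_det_prev * decay) * gate
```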
As shown in FIG. 2, screening the detection-based candidates and the tracking-based candidates with the non-maximum suppression algorithm to obtain the screened detection-based candidates and the screened tracking-based candidates includes the following steps:
S21 (not shown in the figure): sorting all detection-based and tracking-based candidates by confidence score to obtain a candidate list;
S22 (not shown in the figure): moving the detection-based or tracking-based candidate with the highest confidence from the candidate list to a final output list;
S23 (not shown in the figure): computing the bounding-box IoU between that highest-confidence candidate and the other candidates, and deleting the candidates whose bounding-box IoU exceeds the preset threshold;
S24 (not shown in the figure): repeating the process until the candidate list is empty, the final output list then containing the screened detection-based and tracking-based candidates.
In one embodiment of the invention, the predetermined threshold is 0.3-0.5.
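For illustration, steps S21 to S24 can be sketched as a standard confidence-sorted non-maximum suppression over the pooled detection-based and tracking-based candidates; the helper names and the 0.4 threshold (within the stated 0.3-0.5 range) are assumptions.

```python
def iou(a, b):
    """Bounding-box intersection-over-union for boxes in (x1, y1, x2, y2) form."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def screen_candidates(candidates, iou_threshold=0.4):
    """candidates: list of (box, confidence) pooled from detection and tracking.
    Returns the screened list (steps S21-S24)."""
    pending = sorted(candidates, key=lambda c: c[1], reverse=True)       # S21
    kept = []
    while pending:                                                       # S24
        best = pending.pop(0)                                            # S22
        kept.append(best)
        pending = [c for c in pending
                   if iou(best[0], c[0]) <= iou_threshold]               # S23
    return kept
```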
The deep neural network is a GoogLeNet-based network comprising the layers from the input layer through the inception_4e layer, followed by a 1×1 convolutional layer. The network input picture size is 160×80, and the output target feature is 512-dimensional. The network is pre-trained on a large-scale pedestrian re-identification dataset with the loss function:

l_triplet(I_i, I_j, I_k) = m + d(I_i, I_j) − d(I_i, I_k)

where I_i and I_j are pictures of the same identity, I_i and I_k are pictures of different identities, d denotes the Euclidean distance, and m is a constant margin.
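For illustration, the stated loss can be evaluated directly on extracted feature vectors as in the sketch below; clamping at zero and the margin value 0.2 are conventional choices assumed here, not spelled out in the text.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """l_triplet = max(0, m + d(anchor, positive) - d(anchor, negative)).

    anchor and positive are 512-d features of the same identity, negative is
    a feature of a different identity; d is the Euclidean distance.
    """
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, margin + d_pos - d_neg)
```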
As shown in FIG. 3, the step of calculating the target matching degree of adjacent frames with the cascade matching algorithm includes the following steps:
S51 (not shown in the figure): obtaining the target detection result of the first frame and generating a track for each target, so as to obtain the initial track set T, the candidate set C consisting of the screened detection-based candidates and the screened tracking-based candidates, and their apparent features F; and constructing the set C_m of all matched candidates and the set C_u of all unmatched candidates.
S52 (not shown in the figure): performing feature-similarity calculation between the screened detection-based candidates C_det and the initial track set T, and updating the matched candidate set C_m, the unmatched candidate set C_u, and the initial track set T according to the matching result.
In an embodiment of the invention, a Hungarian algorithm is used for feature similarity matching.
S53 (not shown in the figure): selecting the selected candidate item based on tracking
Figure SMS_36
And the updated initial trajectory set->
Figure SMS_37
The target bounding box carries out bounding box intersection and matching of the matching degree of gravity, and the matched candidate item set is updated according to the matching result>
Figure SMS_38
The unmatched candidate set->
Figure SMS_39
In one embodiment of the invention, the Hungarian algorithm is used for the bounding-box IoU matching.
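A compact sketch of the two-stage cascade (S52, then S53) using SciPy's Hungarian solver is given below, reusing the iou helper from the non-maximum suppression sketch above; the track/candidate attribute names and the cost thresholds are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def assign(cost, max_cost):
    """Hungarian assignment; pairs whose cost exceeds max_cost are discarded.
    Returns matched (row, col) pairs and the unmatched row indices."""
    rows, cols = linear_sum_assignment(cost)
    pairs = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_cost]
    matched_rows = {r for r, _ in pairs}
    unmatched_rows = [r for r in range(cost.shape[0]) if r not in matched_rows]
    return pairs, unmatched_rows

def cascade_match(tracks, det_candidates, trk_candidates):
    """Stage 1 (S52): appearance matching of detection-based candidates;
    stage 2 (S53): IoU matching of tracking-based candidates against the
    tracks left unmatched by stage 1."""
    # S52: cosine distance between (assumed L2-normalized) apparent features.
    feat_cost = np.array([[1.0 - float(np.dot(t.feature, c.feature))
                           for c in det_candidates] for t in tracks])
    pairs_det, leftover = assign(feat_cost, max_cost=0.4)
    # S53: 1 - IoU against the tracks that stage 1 left unmatched.
    remaining = [tracks[i] for i in leftover]
    iou_cost = np.array([[1.0 - iou(t.box, c.box)
                          for c in trk_candidates] for t in remaining])
    pairs_trk, _ = assign(iou_cost, max_cost=0.7)
    return pairs_det, pairs_trk
```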
Further, each candidate bounding box in the matched candidate set C_m is connected to the corresponding track segment in the initial track set T; each candidate in the unmatched candidate set C_u is initialized as a new track; an unmatched track in the initial track set T is marked as a temporary track segment, and if it is not matched in the following N consecutive frames, the temporary track segment is considered to have ended and is deleted from the initial track set T, where N is generally 5-8.
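The track bookkeeping described above can be sketched as follows; the Track fields and the max_lost counter standing in for N are illustrative assumptions.

```python
class Track:
    """Minimal track record for the lifecycle rules above."""
    _next_id = 0

    def __init__(self, box, feature):
        self.box, self.feature = box, feature
        self.id = Track._next_id
        Track._next_id += 1
        self.lost_frames = 0          # consecutive frames without a match

def update_tracks(tracks, matched, unmatched_candidates, max_lost=6):
    """matched: (track, candidate) pairs from the cascade matcher;
    max_lost plays the role of N (5-8 in the text)."""
    matched_ids = {id(t) for t, _ in matched}
    for track, cand in matched:       # connect each matched candidate to its track
        track.box, track.feature = cand.box, cand.feature
        track.lost_frames = 0
    for track in tracks:              # unmatched tracks become temporary segments
        if id(track) not in matched_ids:
            track.lost_frames += 1
    for cand in unmatched_candidates: # unmatched candidates start new tracks
        tracks.append(Track(cand.box, cand.feature))
    # a temporary segment unmatched for N consecutive frames has ended
    tracks[:] = [t for t in tracks if t.lost_frames <= max_lost]
```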
On the MOT17 public multi-target pedestrian tracking dataset, the tracking results of the present invention are shown in the table below. The method is superior to the other listed techniques on most metrics, particularly F1 score, tracking rate, number of ID switches, and accuracy, and it runs at real-time speed. The reduction in ID switches shows that the extracted apparent features strengthen the tracker's recognition ability and reduce tracking errors when targets interact and occlude each other. The improvements in false positives and tracking rate indicate the effectiveness of the proposed anti-occlusion target tracking method.
TABLE 1 Test results

Method         Accuracy  F1 score  Tracking rate  Loss rate  False positives  False negatives  ID switches  Speed (fps)
HISP           44.6      38.8      15.1%          38.8%      25,478           276,395          10,617       4.7
SORT           43.1      39.8      12.5%          42.3%      28,398           287,582          4,852        143.3
FPSN           44.9      48.4      16.5%          35.8%      33,757           269,952          7,136        10.1
MASS           46.9      46.0      16.9%          36.3%      25,773           269,116          4,478        17.1
OTCD           44.6      38.8      15.1%          38.8%      25,478           276,359          3,573        46.5
The invention  47.4      50.1      16.8%          37.2%      26,910           267,331          2,760        35.7
All or part of the flow of the methods of the embodiments may be implemented by a computer program that instructs related hardware; the program may be stored in a computer-readable storage medium and, when executed by a processor, implements the steps of the method embodiments. The computer program comprises computer program code, which may be in source code, object code, executable-file, or some intermediate form. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB disk, a removable hard disk, a magnetic disk, an optical disk, a computer memory, a read-only memory (ROM), a random access memory (RAM), an electrical carrier signal, a telecommunications signal, a software distribution medium, and the like. It should be noted that the content of the computer-readable medium may be increased or decreased as required by legislation and patent practice in a jurisdiction; for example, in some jurisdictions, computer-readable media exclude electrical carrier signals and telecommunications signals.
The foregoing is a detailed description of the invention with reference to specific preferred embodiments, and it is not intended that the invention be limited to these specific details. For those skilled in the art to which the invention pertains, several equivalent substitutions or obvious modifications can be made without departing from the spirit of the invention, and all such variants are considered to fall within the scope of the invention.

Claims (10)

1. An anti-occlusion target tracking method is characterized by comprising the following steps:
S1: inputting a video or an image sequence frame by frame into a target detector to obtain a target detection result, wherein the target detection result constitutes the detection-based candidates and comprises the bounding boxes and detection confidences of all targets in each frame of image;
S2: generating tracking-based candidates for each frame of image from the target detection result by using a joint detection-and-tracking framework, wherein the framework performs tracking motion estimation on the target detection result through a Kalman filter and camera motion compensation to obtain the tracking-based candidates;
S3: screening the detection-based candidates and the tracking-based candidates with a non-maximum suppression algorithm according to their confidences, to obtain the screened detection-based candidates and the screened tracking-based candidates;
S4: extracting the apparent features of all screened detection-based and tracking-based candidates of the current frame with a pre-trained deep neural network;
S5: calculating the target matching degree between adjacent frames with a cascade matching algorithm, wherein the screened detection-based candidates are matched to the existing tracks of adjacent frames by apparent-feature similarity, and the screened tracking-based candidates are matched to the target bounding boxes of the existing tracks of adjacent frames by bounding-box intersection-over-union (IoU);
S6: determining the position of each target in the current frame from the target matching degree of adjacent frames, thereby outputting the target motion track.
2. The anti-occlusion target tracking method of claim 1, wherein the target detector is an SDP target detector.
3. The anti-occlusion target tracking method of claim 1, wherein the confidence is given by the following confidence score formula:

s_trk^t = max(0, s_det^{t-1} · (1 − log(1 + α · N_trk))) · 1(N_det ≥ 2)

where s_det^{t-1} is the detection confidence of the (t−1)-th frame, s_trk^t is the tracking confidence of the t-th frame, N_det is the number of detection-based candidates in the track to be associated, N_trk is the number of tracking-based candidates since the track was last associated, 1(·) is a binary indicator function that takes the value 1 when its condition is true and 0 otherwise, and the parameter α is a constant.
4. The anti-occlusion target tracking method of claim 1, wherein the step of screening the detection-based candidate and the tracking-based candidate using a non-maximum suppression algorithm to obtain the screened detection-based candidate and the screened tracking-based candidate comprises the steps of:
S21: sorting all detection-based and tracking-based candidates by confidence score to obtain a candidate list;
S22: moving the detection-based or tracking-based candidate with the highest confidence from the candidate list to a final output list;
S23: computing the bounding-box intersection-over-union (IoU) between that highest-confidence candidate and the other candidates, and deleting the candidates whose bounding-box IoU exceeds a preset threshold;
S24: repeating steps S21 to S23 until the candidate list is empty, the final output list then containing the screened detection-based and tracking-based candidates.
5. The anti-occlusion target tracking method of claim 4, wherein the preset threshold is 0.3-0.5.
6. The anti-occlusion target tracking method of claim 1, wherein the deep neural network is a GoogLeNet-based network comprising the layers from the input layer through the inception_4e layer, followed by a 1×1 convolutional layer.
7. The anti-occlusion target tracking method of claim 6, wherein the loss function used to train the neural network is:

l_triplet(I_i, I_j, I_k) = m + d(I_i, I_j) − d(I_i, I_k)

where I_i and I_j are pictures of the same identity, I_i and I_k are pictures of different identities, d denotes the Euclidean distance, and m is a constant margin.
8. The anti-occlusion target tracking method of claim 1, wherein calculating the target matching degree of adjacent frames using the cascade matching algorithm comprises the following steps:
S51: obtaining the target detection result of the first frame and generating a track for each target, so as to obtain the initial track set T, the candidate set C consisting of the screened detection-based candidates and the screened tracking-based candidates, and their apparent features F, and constructing the set C_m of all matched candidates and the set C_u of all unmatched candidates;
S52: performing feature-similarity calculation between the screened detection-based candidates C_det and the initial track set T, and updating the matched candidate set C_m, the unmatched candidate set C_u, and the initial track set T according to the matching result;
S53: performing bounding-box intersection-over-union (IoU) matching between the screened tracking-based candidates C_trk and the target bounding boxes of the updated initial track set T, and updating the matched candidate set C_m and the unmatched candidate set C_u according to the matching result.
9. The anti-occlusion target tracking method of claim 8, wherein each candidate bounding box in the matched candidate set C_m is connected to the corresponding track segment in the initial track set T; each candidate in the unmatched candidate set C_u is initialized as a new track; an unmatched track in the initial track set T is marked as a temporary track segment, and if it is not matched in the following N consecutive frames, the temporary track segment is considered to have ended and is deleted from the initial track set T; the value of N is 5-8.
10. A computer-readable storage medium in which a computer program is stored, wherein the computer program, when executed by a processor, carries out the steps of the method according to any one of claims 1 to 9.
CN201911261618.2A (priority date 2019-12-10, filing date 2019-12-10), Anti-occlusion target tracking method, Active, granted as CN111080673B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911261618.2A 2019-12-10 2019-12-10 Anti-occlusion target tracking method (granted as CN111080673B)

Publications (2)

Publication Number Publication Date
CN111080673A CN111080673A (en) 2020-04-28
CN111080673B (en) 2023-04-18

Family

ID=70313832

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911261618.2A Anti-occlusion target tracking method 2019-12-10 2019-12-10 (Active; granted as CN111080673B)

Country Status (1)

Country Link
CN (1) CN111080673B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113689462A (en) * 2020-05-19 2021-11-23 深圳绿米联创科技有限公司 Target processing method and device and electronic equipment
CN112016445B (en) * 2020-08-27 2022-04-19 重庆科技学院 Monitoring video-based remnant detection method
CN112509338B (en) * 2020-09-11 2022-02-22 博云视觉(北京)科技有限公司 Method for detecting traffic jam event through silent low-point video monitoring
CN112734800A (en) * 2020-12-18 2021-04-30 上海交通大学 Multi-target tracking system and method based on joint detection and characterization extraction
CN112883819B (en) * 2021-01-26 2023-12-08 恒睿(重庆)人工智能技术研究院有限公司 Multi-target tracking method, device, system and computer readable storage medium
CN112990072A (en) * 2021-03-31 2021-06-18 广州敏视数码科技有限公司 Target detection and tracking method based on high and low dual thresholds
CN113223051A (en) * 2021-05-12 2021-08-06 北京百度网讯科技有限公司 Trajectory optimization method, apparatus, device, storage medium, and program product
CN114972418B (en) * 2022-03-30 2023-11-21 北京航空航天大学 Maneuvering multi-target tracking method based on combination of kernel adaptive filtering and YOLOX detection

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109919981A (en) * 2019-03-11 2019-06-21 南京邮电大学 A kind of multi-object tracking method of the multiple features fusion based on Kalman filtering auxiliary

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11144761B2 (en) * 2016-04-04 2021-10-12 Xerox Corporation Deep data association for online multi-class multi-object tracking

Similar Documents

Publication Publication Date Title
CN111080673B (en) Anti-occlusion target tracking method
CN110516556B (en) Multi-target tracking detection method and device based on Darkflow-deep Sort and storage medium
US11094070B2 (en) Visual multi-object tracking based on multi-Bernoulli filter with YOLOv3 detection
CN109360226B (en) Multi-target tracking method based on time series multi-feature fusion
Elhoseny Multi-object detection and tracking (MODT) machine learning model for real-time video surveillance systems
Fernando et al. Tracking by prediction: A deep generative model for mutli-person localisation and tracking
US10402627B2 (en) Method and apparatus for determining identity identifier of face in face image, and terminal
Huang et al. Robust object tracking by hierarchical association of detection responses
Zhou et al. Improving video saliency detection via localized estimation and spatiotemporal refinement
Rezatofighi et al. Joint probabilistic data association revisited
CN106778712B (en) Multi-target detection and tracking method
US20050216274A1 (en) Object tracking method and apparatus using stereo images
CN113191180B (en) Target tracking method, device, electronic equipment and storage medium
Mukherjee et al. Gaussian mixture model with advanced distance measure based on support weights and histogram of gradients for background suppression
Gan et al. Online CNN-based multiple object tracking with enhanced model updates and identity association
CN110991397B (en) Travel direction determining method and related equipment
CN110349188B (en) Multi-target tracking method, device and storage medium based on TSK fuzzy model
Kim et al. Multiple player tracking in soccer videos: an adaptive multiscale sampling approach
Cai et al. A real-time visual object tracking system based on Kalman filter and MB-LBP feature matching
Al-Shakarji et al. Robust multi-object tracking with semantic color correlation
CN111161325A (en) Three-dimensional multi-target tracking method based on Kalman filtering and LSTM
Iraei et al. Object tracking with occlusion handling using mean shift, Kalman filter and edge histogram
Idan et al. Fast shot boundary detection based on separable moments and support vector machine
Poiesi et al. Tracking multiple high-density homogeneous targets
Lit et al. Multiple object tracking with gru association and kalman prediction

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant