CN112734800A - Multi-target tracking system and method based on joint detection and characterization extraction - Google Patents

Multi-target tracking system and method based on joint detection and characterization extraction

Info

Publication number
CN112734800A
CN112734800A
Authority
CN
China
Prior art keywords
target
frame
candidate
joint detection
characterization
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011510839.1A
Other languages
Chinese (zh)
Inventor
邓国伟
陈彩莲
涂静正
关新平
杨博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Jiaotong University
Original Assignee
Shanghai Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Jiaotong University filed Critical Shanghai Jiaotong University
Priority to CN202011510839.1A
Publication of CN112734800A
Legal status: Pending (current)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/23 Clustering techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/70 Determining position or orientation of objects or cameras
    • G06T7/73 Determining position or orientation of objects or cameras using feature-based methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/20 Image preprocessing
    • G06V10/25 Determination of region of interest [ROI] or a volume of interest [VOI]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20092 Interactive image processing based on input by user
    • G06T2207/20104 Interactive definition of region of interest [ROI]

Abstract

The invention discloses a multi-target tracking system and method based on joint detection and characterization extraction, relating to the field of computer vision tracking. The technical scheme reduces the number of network parameters to be trained and the computational cost, and improves both algorithm efficiency and multi-target tracking accuracy.

Description

Multi-target tracking system and method based on joint detection and characterization extraction
Technical Field
The invention relates to the field of computer vision tracking, in particular to a multi-target tracking system and method based on joint detection and characterization extraction.
Background
With the rapid development of internet technology, the continuous improvement of devices such as smartphones and computers, and ever-falling manufacturing costs, vast amounts of image and video data are generated every moment. As the saying goes, "a picture is worth a thousand words": images and videos contain huge amounts of valuable information. How to use these data quickly and accurately has become a problem in urgent need of solution. Computer vision technology, now developing rapidly, can use the powerful computing capability of computers to process image data in place of human eyes, and has become a core technology in many fields.
Multi-object tracking (MOT) is an important research direction in the field of computer vision. Its task is to continuously track and locate multiple targets in a video sequence, such as pedestrians on a street or vehicles on a road, while keeping their identity information unchanged, and then derive the motion trajectory of each target. Multi-target tracking not only detects the spatio-temporal information of targets in a video accurately, but also provides a wealth of valuable information for pose prediction, action recognition, behavior analysis and the like. Multi-target tracking algorithms are widely applied in intelligent video surveillance, autonomous driving, intelligent robots, intelligent human-computer interaction, intelligent transportation, sports video analysis and other fields, and have become a popular research direction in recent years.
The multi-target tracking problem is an extension of the single-target tracking problem. Given a particular target, the task of single-target tracking is to continuously track that target through the scene. The task of multi-target tracking is to track a series of targets of interest in a scene, such as the pedestrians and vehicles in it. Compared with single-target tracking, multi-target tracking therefore has to complete two additional tasks:
(1) determining changes in the number of targets in the scene, and handling the initialization of new tracks and the termination of old ones;
(2) maintaining the identity information of each tracked target.
Currently, detection-based tracking is the mainstream paradigm for multi-target tracking. It can be divided into the following two independent subtasks:
target detection, which detects target positions in the current image;
data association, which associates the detection results with existing tracks.
Researchers often use pre-trained target detection models directly, so that the video multi-target tracking problem reduces to a data association problem over detection results. To obtain an optimal association result, the two key links of data association, namely the association cost and the optimization algorithm, have become the research focus of detection-based tracking algorithms.
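As a concrete illustration of these two links, the following minimal sketch pairs an IoU-based association cost with the Hungarian algorithm (scipy.optimize.linear_sum_assignment) as the optimizer; both are common textbook choices used here for illustration, not the association cost or optimizer of the present invention.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def associate(track_boxes, det_boxes, iou_min=0.3):
    """Match tracks to detections by minimizing the total (1 - IoU) cost."""
    cost = np.array([[1.0 - iou(t, d) for d in det_boxes] for t in track_boxes])
    rows, cols = linear_sum_assignment(cost)
    # Discard assignments whose overlap falls below the acceptance threshold.
    return [(r, c) for r, c in zip(rows, cols) if 1.0 - cost[r, c] >= iou_min]
```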
A preliminary multi-target tracking method is designed in the domestic patent "multi-target tracking method, device, electronic equipment and storage medium" (application number 202010573301.9), but it does not consider the frequent occlusion between targets in a scene, so tracks break frequently. The domestic patent with application number 202010605987.5, "an integrated target detection and association pedestrian multi-target tracking method", provides a model that performs target detection and target feature extraction simultaneously, but its target association step only adopts a simple threshold discrimination method, so the method cannot obtain the optimal matching result when many similar targets appear in the scene at the same time.
The domestic patent "a vehicle multi-target tracking method based on a target center point" (application number 20201059041.1) integrates the vehicle detection model and the tracking model in one network, greatly reducing the computation and running time and simplifying detection-based tracking.
Accordingly, those skilled in the art are devoted to developing a multi-target tracking system and method based on joint detection and characterization extraction.
Disclosure of Invention
In view of the above drawbacks of the prior art, the technical problems to be solved by the present invention are: 1. how to improve the running speed of the algorithm while maintaining its accuracy; 2. how to organically combine the two major links of target detection and data association, and to improve tracking precision by comprehensively using their information.
In order to achieve the above purpose, the present invention provides a multi-target tracking system based on joint detection and characterization extraction, which is characterized by comprising a joint detection and characterization extraction module, a trajectory prediction module and a candidate frame screening module, wherein the joint detection and characterization extraction module is composed of a backbone network, a region selection network, a target boundary frame regressor and a characterization extractor.
Furthermore, the track prediction module adopts a linear motion model, infers the possible position of the tracked target in the current video frame from the motion information of the track, and corrects the existing track to reduce errors.
Furthermore, the candidate frame screening module adopts a non-maximum suppression algorithm with identity transfer, which screens out the optimal candidate boxes by confidence and simultaneously completes the data association of detection candidate boxes and tracks through identity transfer.
Further, the backbone network adopts a backbone network capable of extracting image features, and a feature pyramid network is established on the basis of the backbone network.
Further, the target bounding box regressor and the characterization extractor both adopt deep neural network structures, and the target bounding box regressor uses a fully connected network.
A multi-target tracking method based on joint detection and characterization extraction comprises the following steps:
step 1, initializing the active track set and the inactive track set as empty sets; inputting the video frame sequence frame by frame into the backbone network to obtain the feature table of the current frame image;
step 2, generating candidate boxes from the information in the feature table, using the track prediction module together with the RPN, the bounding box regressor and the characterization extractor in the joint detection and characterization extraction module;
step 3, screening the optimal candidate boxes out of the candidate boxes using a non-maximum suppression method with identity transfer;
step 4, updating the tracks according to the screening results, including track generation, extension and deletion;
step 5, if the current frame is not the last frame of the video, returning to step 1; otherwise, ending. A high-level sketch of this loop follows below.
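A high-level sketch of the loop formed by steps 1 to 5, under stated assumptions: the function names backbone, generate_candidates, nms_with_identity and update_tracks are hypothetical placeholders standing in for the modules this method describes.

```python
def track_video(frames, backbone, generate_candidates, nms_with_identity,
                update_tracks):
    active_tracks, inactive_tracks = [], []              # step 1: empty track sets
    for frame in frames:                                 # frame-by-frame processing
        feature_table = backbone(frame)                  # step 1: feature table of frame
        candidates = generate_candidates(feature_table, active_tracks)  # step 2
        best = nms_with_identity(candidates)             # step 3: screen candidates
        update_tracks(best, active_tracks, inactive_tracks)  # step 4: update tracks
    return active_tracks, inactive_tracks                # step 5: ends at last frame
```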
Further, the step 2 further comprises:
step 2.1, detecting a target in the image;
step 2.2, predicting the possible positions of the tracks;
step 2.3, generating candidate boxes;
step 2.4, extracting the characterization vectors.
Further, step 3 adopts a non-maximum suppression method with identity transfer, specifically comprising:
step 3.1, clustering the input candidate boxes according to the intersection over union (IoU) between target bounding boxes: candidate boxes belonging to the same target are grouped into one class using the spatial relationships among them, and candidate boxes belonging to different targets are separated;
step 3.2, if a cluster in the clustering result contains a candidate box with an identity label, transferring that identity label to all candidate boxes in the cluster;
step 3.3, deleting the candidate boxes with non-maximum confidence in each cluster, keeping only the candidate box with the maximum confidence in the cluster.
Further, the step 4 further includes:
step 4.1, updating the track in the active track set;
step 4.2, comparing the characterizations between the inactive track set and the screening results, and performing the re-identification operation;
step 4.3, updating the tracks in the inactive track set that are successfully re-identified and moving them into the active track set; treating the screening results that fail re-identification as new targets, creating a track for each and adding it to the active track set.
Further, the adopted re-identification method is a short-term method based on Euclidean distance between the characterization vectors.
Technical effects
1. A joint detection and characterization extraction module is provided. It can both detect target positions in the image and extract the target appearance characterizations used for subsequent re-identification, greatly reducing the number of network parameters to be trained and the computational cost.
2. A candidate box generation module is designed. The module generates detection candidate boxes by searching for target positions in the current image, generates track candidate boxes corresponding to the existing tracks in a directed manner, and extracts target characterizations within the candidate boxes, so that target positions in the image are detected accurately, the subsequent data association step is greatly facilitated, and algorithm efficiency is markedly improved.
3. A candidate box screening module is designed. The module adopts a non-maximum suppression algorithm with identity transfer, which screens out the most accurate target bounding boxes under a unified standard and markedly improves tracking precision; the association of candidate boxes with existing tracks is completed efficiently through the identity transfer operation.
The conception, the specific structure and the technical effects of the present invention will be further described with reference to the accompanying drawings to fully understand the objects, the features and the effects of the present invention.
Drawings
FIG. 1 is a system flow diagram of a preferred embodiment of the present invention;
FIG. 2 is a block diagram of a joint detection and characterization extraction model according to a preferred embodiment of the present invention;
FIG. 3 is a schematic diagram of a regression network for the target bounding box according to a preferred embodiment of the present invention;
FIG. 4 is a schematic diagram of a token extraction network in accordance with a preferred embodiment of the present invention.
Detailed Description
The technical contents of the preferred embodiments of the present invention will be more clearly and easily understood by referring to the drawings attached to the specification. The present invention may be embodied in many different forms of embodiments and the scope of the invention is not limited to the embodiments set forth herein.
As shown in fig. 1, a target tracking method based on a joint detection and characterization extraction module includes the following steps:
the first step is as follows: movement track set
Figure BDA0002846360000000041
Set of inactivation trajectories
Figure BDA0002846360000000042
Converting the video frame sequence I to { I ═ I0,i1,...,iT-1Inputting the frames into the main network of the module to obtain the characteristic table F of the current frame imaget
The second step: based on the feature table F_t, generate the candidate box set C_t through the following four sub-steps:
2.1 Detecting targets in the image. The RPN generates reference bounding boxes at each pixel of the image and, based on the feature table F_t, finds from them the regions that may contain targets, denoted D̂_t.
2.2 Predicting the possible positions of the tracks. The track prediction module infers, from the motion information of each track, the possible position B̂_t of the tracked target in the current video frame. Unlike the RPN output D̂_t, the prediction module output B̂_t carries the identity information of the corresponding track, which facilitates the subsequent association between candidate boxes and tracks.
2.3 Generating candidate boxes. The low-precision bounding boxes D̂_t and B̂_t, together with the feature table F_t, are input into the target bounding box regressor to obtain the candidate box set C_t = D_t + B_t, where D_t are called the detection candidate boxes and B_t the track candidate boxes. In this step, the identity information of B̂_t is automatically passed on to B_t.
2.4 Extracting the characterization vectors. This step prepares for the subsequent pedestrian re-identification link. The algorithm inputs the candidate boxes C_t and the feature table F_t into the characterization extractor of the module and computes a characterization vector for each candidate box.
The third step: screening candidate boxes.
The candidate box set C_t generated in the second step comprises two parts: 1) the detection candidate boxes D_t from the RPN; 2) the track candidate boxes B_t from the prediction module. Neither can be taken directly as the tracking result of the current frame: first, the detection candidate boxes are not yet associated with any track and therefore carry no identity information; second, the prediction accuracy of the prediction module is limited, so using the track candidate boxes directly would yield low track accuracy. The method therefore adopts a non-maximum suppression method with identity transfer to screen the optimal candidate box set C′_t out of C_t, through the following steps:
3.1 Clustering. According to the intersection over union (IoU) between target bounding boxes, the candidate box set C_t is clustered: candidate boxes belonging to the same target are grouped into one class using the spatial relationships among them, and candidate boxes belonging to different targets are separated.
3.2 Identity transfer. If a cluster in the clustering result contains a candidate box with an identity label, that identity label is transferred to all candidate boxes in the cluster.
3.3 Suppression. The candidate boxes with non-maximum confidence in each cluster are deleted, keeping only the candidate box with the maximum confidence in the cluster.
This step screens C_t down to the optimal candidate box set C′_t = D′_t + B′_t, where D′_t denotes the screened bounding boxes that do not yet carry identity information and B′_t denotes those that do. This screening procedure is sketched below.
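A runnable sketch of this suppression with identity transfer, assuming each candidate is a dict with keys 'box', 'score' and an optional 'id', and reusing the iou() helper from the association sketch in the background section; the greedy IoU grouping is one simple way to realize step 3.1, not necessarily the patent's.

```python
def nms_with_identity(candidates, iou_cluster=0.5):
    # 3.1 Clustering: greedily group boxes whose IoU with a cluster seed
    # exceeds the threshold; distinct targets end up in distinct clusters.
    clusters = []
    for cand in sorted(candidates, key=lambda c: c['score'], reverse=True):
        for cluster in clusters:
            if iou(cand['box'], cluster[0]['box']) >= iou_cluster:
                cluster.append(cand)
                break
        else:
            clusters.append([cand])
    survivors = []
    for cluster in clusters:
        # 3.2 Identity transfer: propagate an existing track id, if any,
        # to the box of the cluster that will survive suppression.
        ids = [c['id'] for c in cluster if c.get('id') is not None]
        # 3.3 Suppression: keep only the highest-confidence box per cluster.
        best = max(cluster, key=lambda c: c['score'])
        if ids:
            best['id'] = ids[0]
        survivors.append(best)
    return survivors
```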
The fourth step: track processing, including track generation, extension and deletion.
4.1 Updating the tracks. The corresponding tracks in the active track set T are updated according to the identity and position information in B′_t; the tracks in T that are not associated with any box in B′_t are removed from T and added to the inactive track set T′.
4.2 re-identification.
Due to frequent occlusion between targets in the scene, a candidate box in D′_t that carries no identity label may be a newly appeared target, but may also belong to part of the trajectory of an occluded target. To reduce track fragmentation while keeping the algorithm online and real-time, a short-term pedestrian re-identification method is adopted to judge whether such a box in D′_t is an occluded target: first, the tracks in the inactive track set T′ are retained for an additional T_s frames, during which the trajectory prediction module continues to predict their positions; whether a candidate box and a track are the same target is then judged from the distance between the characterization vector of the box in D′_t and that of the track in T′. To reduce the false re-identification rate, the following criteria are set: first, the distance between the two characterization vectors must be smaller than a certain threshold; second, the intersection over union between the two bounding boxes must be greater than a certain threshold.
After the re-identification step, the successfully re-identified tracks in the inactive track set T′ are updated and added back into the active track set T. The candidate boxes in D′_t that fail re-identification are newly appeared targets; a new trajectory is created for each of them and added to the active track set T. The re-identification test is sketched below.
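A sketch of this short-term re-identification test, assuming each inactive track stores its characterization vector ('embedding', a NumPy array) and the box predicted by the trajectory prediction module ('predicted_box'); dist_max and iou_min stand in for the two unspecified thresholds, and iou() is the helper defined in the earlier association sketch.

```python
import numpy as np

def match_occluded(candidate, inactive_tracks, dist_max=0.6, iou_min=0.3):
    """Return the inactive track the candidate most plausibly continues, or None."""
    best_track, best_dist = None, dist_max
    for track in inactive_tracks:
        dist = np.linalg.norm(candidate['embedding'] - track['embedding'])
        # Both criteria must hold: small Euclidean distance between the
        # characterization vectors AND sufficient IoU with the predicted box.
        if dist < best_dist and iou(candidate['box'],
                                    track['predicted_box']) >= iou_min:
            best_track, best_dist = track, dist
    return best_track
```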
The fifth step: if the current frame is not the last frame of the video, returning to the first step; otherwise, ending.
As shown in fig. 2, the multi-target tracking system based on joint detection and characterization extraction uses the designed joint detection and characterization extraction module as its core skeleton, supplemented by the trajectory prediction module and the candidate box screening module, to complete the multi-target tracking task. The joint detection and characterization extraction module consists of a backbone network, a region proposal network (RPN), a target bounding box regressor and a characterization extractor. The module can both detect target positions in the image and extract characterization vectors of the targets.
The backbone network can be any backbone capable of extracting image features, such as AlexNet, VGG, the ResNet series, the Inception series, the DenseNet series, the ResNeXt series, etc. In addition, a Feature Pyramid Network (FPN) is built on top of the backbone, so that target positions can be detected accurately from feature tables of different scales.
The region proposal network adopts the RPN structure of Faster R-CNN, which searches the image for regions containing objects. The RPN first generates a large number of reference bounding boxes (anchors) at each pixel of the image. It then looks up the features corresponding to each reference bounding box in the feature table and judges whether a target exists within it; meanwhile, target bounding box regression is applied to make the reference bounding box fit the actual position of the target as closely as possible. Typically, the RPN generates reference bounding boxes with aspect ratios of {1:2, 1:1, 2:1}. In practical applications, appropriate aspect ratios can be selected according to the characteristics of the targets of interest to improve accuracy and efficiency.
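The anchor generation just described can be sketched as follows, with the {1:2, 1:1, 2:1} aspect ratios from the text; the stride and base scale are illustrative assumptions, not values given by the patent.

```python
import numpy as np

def generate_anchors(feat_h, feat_w, stride=16, scale=128,
                     ratios=(0.5, 1.0, 2.0)):
    """One anchor per aspect ratio, centred on every feature-map cell."""
    anchors = []
    for y in range(feat_h):
        for x in range(feat_w):
            cx, cy = (x + 0.5) * stride, (y + 0.5) * stride  # cell centre in pixels
            for r in ratios:
                w, h = scale * np.sqrt(r), scale / np.sqrt(r)  # keeps area = scale^2
                anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return np.array(anchors)  # (feat_h * feat_w * len(ratios), 4) boxes
```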
As shown in fig. 3 and 4, both the target bounding box regressor and the characterization extractor adopt deep neural network structures. Deep neural networks have excellent fitting and feature representation capabilities, which effectively improve the accuracy of the algorithm. The target bounding box regressor shown in fig. 3 uses a 4-layer fully connected network (layers numbered 1 to 4). From the feature table and a bounding box of lower positioning precision, this module produces a more precisely positioned bounding box and a corresponding confidence. The characterization extractor shown in fig. 4 uses a 3-layer fully connected network (layers numbered 5 to 7). This module extracts a characterization vector for the target from the feature table and the target bounding box. The generated characterization vectors satisfy the following property: given a distance metric, the distance between characterization vectors of the same target across video frames is sufficiently small, while the distance between characterization vectors of different targets is sufficiently large.
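A minimal PyTorch sketch of these two heads, assuming ROI features flattened to a vector of size in_dim; only the depths (a 4-layer and a 3-layer fully connected network) follow the description above, while the layer widths and output dimensions are illustrative assumptions.

```python
import torch.nn as nn

class BBoxRegressor(nn.Module):
    """4 fully connected layers -> refined box offsets (4) plus confidence (1)."""
    def __init__(self, in_dim=1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, 128), nn.ReLU(),
            nn.Linear(128, 5),   # (dx, dy, dw, dh, confidence)
        )

    def forward(self, x):
        return self.net(x)

class CharacterizationExtractor(nn.Module):
    """3 fully connected layers -> characterization (embedding) vector."""
    def __init__(self, in_dim=1024, embed_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, 512), nn.ReLU(),
            nn.Linear(512, 256), nn.ReLU(),
            nn.Linear(256, embed_dim),
        )

    def forward(self, x):
        return self.net(x)
```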
The trajectory prediction module infers the possible position of a tracked target in the current video frame from the motion information of its track and corrects the existing track to reduce errors. This effectively reduces the search space and improves tracking precision. The module predicts the most likely position of a track at the current time based on a linear motion model.
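One simple constant-velocity realization of such a linear motion model is sketched below; the corner-wise extrapolation of the last two boxes is an assumption about the state representation, not necessarily the patent's.

```python
import numpy as np

def predict_next_box(track_boxes):
    """Extrapolate a track's next bounding box from its last two boxes."""
    if len(track_boxes) < 2:
        return np.asarray(track_boxes[-1])   # no motion history yet
    prev, last = np.asarray(track_boxes[-2]), np.asarray(track_boxes[-1])
    velocity = last - prev                   # per-frame displacement of the corners
    return last + velocity                   # constant-velocity extrapolation
```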
The candidate box screening module employs a non-maximum suppression algorithm with identity transfer. Unlike ordinary non-maximum suppression, the suppression here is performed after clustering: if a cluster in the clustering result contains a candidate box with an identity label, that identity label is transferred to all candidate boxes in the cluster. The module screens out the optimal candidate boxes by confidence and simultaneously completes the data association of detection candidate boxes and tracks through identity transfer, thereby avoiding complex similarity-measure computation and bipartite graph assignment.
An embodiment of the present application further provides an electronic device comprising a processor and a memory.
The memory is used for storing a computer program;
the processor is used for implementing any one of the above multi-target tracking methods when executing the program stored in the memory.
Embodiments of the present application may also provide a computer-readable storage medium, in which a computer program is stored, and when the computer program is executed by a processor, the computer program implements any one of the multi-target tracking methods described above.
The foregoing detailed description of the preferred embodiments of the invention has been presented. It should be understood that numerous modifications and variations could be devised by those skilled in the art in light of the present teachings without departing from the inventive concepts. Therefore, the technical solutions available to those skilled in the art through logic analysis, reasoning and limited experiments based on the prior art according to the concept of the present invention should be within the scope of protection defined by the claims.

Claims (10)

1. The multi-target tracking system based on joint detection and characterization extraction is characterized by comprising a joint detection and characterization extraction module, a trajectory prediction module and a candidate frame screening module, wherein the joint detection and characterization extraction module is composed of a backbone network, a region selection network, a target boundary frame regressor and a characterization extractor.
2. The multi-target tracking system based on joint detection and feature extraction as claimed in claim 1, wherein the trajectory prediction module uses a linear motion model to infer the possible position of the tracked target in the current video frame according to the motion information of the trajectory, and corrects the existing trajectory to reduce the error.
3. The multi-target tracking system based on joint detection and characterization extraction according to claim 1, wherein the candidate frame screening module employs a non-maximum suppression algorithm with identity transfer, screens out the optimal candidate boxes by confidence, and simultaneously completes the data association of detection candidate boxes and tracks through identity transfer.
4. The multi-target tracking system based on joint detection and feature extraction as claimed in claim 1, wherein the backbone network adopts a backbone network capable of extracting image features, and a feature pyramid network is established on the basis of the backbone network.
5. The joint detection and characterization extraction based multi-target tracking system of claim 1, wherein the target bounding box regressor and the characterization extractor both employ a deep neural network structure, the target bounding box regressor using a fully connected network.
6. A multi-target tracking method based on joint detection and characterization extraction is characterized by comprising the following steps:
step 1, initializing the active track set and the inactive track set as empty sets; inputting the video frame sequence frame by frame into the backbone network to obtain the feature table of the current frame image;
step 2, generating candidate boxes from the information in the feature table, using the track prediction module together with the RPN, the bounding box regressor and the characterization extractor in the joint detection and characterization extraction module;
step 3, screening the optimal candidate boxes out of the candidate boxes using a non-maximum suppression method with identity transfer;
step 4, updating the tracks according to the screening results, including track generation, extension and deletion;
step 5, if the current frame is not the last frame of the video, returning to step 1; otherwise, ending.
7. The multi-target tracking method based on joint detection and characterization extraction as claimed in claim 6, wherein said step 2 further comprises:
step 2.1, detecting a target in the image;
step 2.2, predicting the possible positions of the tracks;
step 2.3, generating candidate boxes;
step 2.4, extracting the characterization vectors.
8. The multi-target tracking method based on joint detection and characterization extraction as claimed in claim 6, wherein the step 3 adopts a non-maximum suppression method with identity transfer, specifically comprising:
step 3.1, clustering the input candidate boxes according to the intersection over union (IoU) between target bounding boxes: candidate boxes belonging to the same target are grouped into one class using the spatial relationships among them, and candidate boxes belonging to different targets are separated;
step 3.2, if a cluster in the clustering result contains a candidate box with an identity label, transferring that identity label to all candidate boxes in the cluster;
step 3.3, deleting the candidate boxes with non-maximum confidence in each cluster, keeping only the candidate box with the maximum confidence in the cluster.
9. The multi-target tracking method based on joint detection and characterization extraction as claimed in claim 6, wherein said step 4 further comprises:
step 4.1, updating the tracks in the active track set;
step 4.2, comparing the characterizations between the inactive track set and the screening results, and performing the re-identification operation;
step 4.3, updating the tracks in the inactive track set that are successfully re-identified and moving them into the active track set; treating the screening results that fail re-identification as new targets, creating a track for each and adding it to the active track set.
10. The multi-target tracking method based on joint detection and characterization extraction as claimed in claim 9, wherein the re-identification method is a short-term method based on the Euclidean distance between characterization vectors.
CN202011510839.1A 2020-12-18 2020-12-18 Multi-target tracking system and method based on joint detection and characterization extraction Pending CN112734800A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011510839.1A CN112734800A (en) 2020-12-18 2020-12-18 Multi-target tracking system and method based on joint detection and characterization extraction

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011510839.1A CN112734800A (en) 2020-12-18 2020-12-18 Multi-target tracking system and method based on joint detection and characterization extraction

Publications (1)

Publication Number Publication Date
CN112734800A true CN112734800A (en) 2021-04-30

Family

ID=75603418

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011510839.1A Pending CN112734800A (en) 2020-12-18 2020-12-18 Multi-target tracking system and method based on joint detection and characterization extraction

Country Status (1)

Country Link
CN (1) CN112734800A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109919974A (en) * 2019-02-21 2019-06-21 上海理工大学 Online multi-object tracking method based on the more candidate associations of R-FCN frame
CN110991272A (en) * 2019-11-18 2020-04-10 东北大学 Multi-target vehicle track identification method based on video tracking
CN111126152A (en) * 2019-11-25 2020-05-08 国网信通亿力科技有限责任公司 Video-based multi-target pedestrian detection and tracking method
CN111080673A (en) * 2019-12-10 2020-04-28 清华大学深圳国际研究生院 Anti-occlusion target tracking method
CN111476116A (en) * 2020-03-24 2020-07-31 南京新一代人工智能研究院有限公司 Rotor unmanned aerial vehicle system for vehicle detection and tracking and detection and tracking method
CN111639551A (en) * 2020-05-12 2020-09-08 华中科技大学 Online multi-target tracking method and system based on twin network and long-short term clues

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
裴明涛 (Pei Mingtao): 视频事件分析与理解 [Video Event Analysis and Understanding], Beijing Institute of Technology Press

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113469118A (en) * 2021-07-20 2021-10-01 京东科技控股股份有限公司 Multi-target pedestrian tracking method and device, electronic equipment and storage medium

Similar Documents

Publication Publication Date Title
Marvasti-Zadeh et al. Deep learning for visual tracking: A comprehensive survey
CN109858390B (en) Human skeleton behavior identification method based on end-to-end space-time diagram learning neural network
US20220383535A1 (en) Object Tracking Method and Device, Electronic Device, and Computer-Readable Storage Medium
Han et al. Adaptive discriminative deep correlation filter for visual object tracking
US20220254157A1 (en) Video 2D Multi-Person Pose Estimation Using Multi-Frame Refinement and Optimization
Kim et al. CDT: Cooperative detection and tracking for tracing multiple objects in video sequences
Shuai et al. Multi-object tracking with siamese track-rcnn
Li et al. When object detection meets knowledge distillation: A survey
Chang et al. Fast Random‐Forest‐Based Human Pose Estimation Using a Multi‐scale and Cascade Approach
CN114898403A (en) Pedestrian multi-target tracking method based on Attention-JDE network
Wu et al. FSANet: Feature-and-spatial-aligned network for tiny object detection in remote sensing images
Urdiales et al. An improved deep learning architecture for multi-object tracking systems
Pang et al. Analysis of computer vision applied in martial arts
CN114926859A (en) Pedestrian multi-target tracking method in dense scene combined with head tracking
Song et al. Detection and tracking of safety helmet based on DeepSort and YOLOv5
Zhang et al. Residual memory inference network for regression tracking with weighted gradient harmonized loss
Gu et al. Real-time streaming perception system for autonomous driving
CN112734800A (en) Multi-target tracking system and method based on joint detection and characterization extraction
Zhang et al. Action detection with two-stream enhanced detector
Yang et al. Explorations on visual localization from active to passive
CN116245913A (en) Multi-target tracking method based on hierarchical context guidance
Yi et al. Single online visual object tracking with enhanced tracking and detection learning
Nalaie et al. AttTrack: Online deep attention transfer for multi-object tracking
Narmadha et al. Robust Deep Transfer Learning Based Object Detection and Tracking Approach.
Dou et al. Boosting cnn-based pedestrian detection via 3d lidar fusion in autonomous driving

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination