CN114897939A - Multi-target tracking method and system based on deep path aggregation network

Multi-target tracking method and system based on deep path aggregation network

Info

Publication number
CN114897939A
Authority
CN
China
Prior art keywords
target
network
frame
path aggregation
aggregation network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210599934.6A
Other languages
Chinese (zh)
Inventor
张毅锋
陈曦
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southeast University
Original Assignee
Southeast University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southeast University
Priority to CN202210599934.6A
Publication of CN114897939A
Legal status: Pending

Classifications

    • G06T 7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06F 30/18: Network design, e.g. design based on topological or interconnect aspects of utility systems, piping, heating ventilation air conditioning [HVAC] or cabling
    • G06F 30/27: Design optimisation, verification or simulation using machine learning, e.g. artificial intelligence, neural networks, support vector machines [SVM] or training a model
    • G06N 3/04: Neural network architecture, e.g. interconnection topology
    • G06N 3/084: Backpropagation, e.g. using gradient descent
    • G06T 5/50: Image enhancement or restoration using two or more images, e.g. averaging or subtraction
    • G06T 7/277: Analysis of motion involving stochastic approaches, e.g. using Kalman filters
    • G06T 2207/10016: Video; image sequence
    • G06T 2207/20081: Training; learning
    • G06T 2207/20084: Artificial neural networks [ANN]
    • G06T 2207/20221: Image fusion; image merging

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Software Systems (AREA)
  • Geometry (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Hardware Design (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Medical Informatics (AREA)
  • Pure & Applied Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-target tracking method and system based on a deep path aggregation network. Taking the deep path aggregation network as the framework, a large amount of external video data is first input to train the network, which generates a target center heat map, target center offsets and predicted box sizes while simultaneously extracting re-ID features of each target. Prediction boxes are then generated from the target position information, and each detected object is marked with a rectangular box. Next, for every detected object, the cosine distance between its re-ID feature vectors in a given frame and the previous frame is computed together with the IoU of the prediction boxes, and all objects in the current frame are linked to existing trajectories. Finally, the position of every target in the current frame is further estimated with a Kalman filtering algorithm. The invention adopts a bottom-up feature fusion layer to extract the targets' spatial feature information, shortening the information path between low-level and high-level features, so that the tracker achieves both high tracking accuracy and real-time tracking speed.

Description

Multi-target tracking method and system based on deep path aggregation network
Technical Field
The invention belongs to the fields of computer vision, deep learning technology and multi-target tracking, and particularly relates to a multi-target tracking method and system based on a deep path aggregation network.
Background
Multi-target tracking is a popular research area in computer vision, with important applications in autonomous driving, intelligent transportation, intelligent surveillance and other fields. With the rapid development of artificial intelligence, more and more deep learning algorithms are being applied in many aspects of daily life. In single-target tracking, the appearance of the object is known in advance; in multi-target tracking, a tracker must estimate the trajectories of multiple targets in a video and detect targets leaving or entering the scene. Multiple objects in a video may occlude one another or have similar appearances, and external factors such as lighting, weather and video quality can make tracking difficult. In the multi-target tracking field, deep-learning-based trackers such as JDE and FairMOT perform relatively well, but they usually fail to achieve a good balance between tracking accuracy and tracking speed, and they do not further extract the spatial feature information of targets, so the accurate positions of targets cannot be inferred well.
Disclosure of Invention
The invention aims to provide a multi-target tracking method and system based on a deep path aggregation network, so as to solve two technical problems of existing deep-learning-based trackers: they cannot balance tracking accuracy against tracking speed well, and they cannot infer the accurate positions of targets well.
In order to solve the technical problems, the specific technical scheme of the invention is as follows:
a multi-target tracking method based on a deep path aggregation network comprises the following steps:
Step 1, data preprocessing: for each frame of every video in the training set, performing data enhancement with rotation, scaling and color jitter to obtain the network's input data set;
step 2, constructing a deep path aggregation network, comprising the following substeps:
step 2.1, designing a network structure of the deep path aggregation network;
Step 2.2, constructing training samples: selecting images from the input data set and inputting them into the deep path aggregation network;
step 2.3, designing an error function to perform back propagation, and optimizing parameters of the network until convergence;
step 3, performing multi-target tracking on each detected object in the video: extracting a detection area and re-ID characteristics of the target based on the trained depth path aggregation network, calculating the cosine distance of the re-ID characteristics of each target in a certain frame and the previous frame of the video and IoU of the detection frame, and performing tracking prediction by using a Kalman filter to obtain the position of the target in the current frame.
Further, the data preprocessing step in step 1 is specifically as follows:
For each frame of every video in the training set, an angle between -10 and 10 degrees is randomly selected and the image is rotated by that angle; an image scaling operation with a ratio of 4 is then applied; finally the image color depth and brightness are increased by 0.5 times, yielding the network's input data set.
Further, designing a network structure of the deep path aggregation network in step 2.1 specifically includes the following steps:
Step 201, on the basis of the DLA network, adding three top-down feature map layers at the final stage to perform down-sampling and aggregation operations on the DLA network's output feature map, thereby obtaining three feature maps of different resolutions;
Step 202, for the three output feature maps of different resolutions, performing multi-scale aggregation of the medium- and small-resolution feature maps with the large-resolution feature map to obtain a high-resolution feature map representation. This high-resolution representation is the output feature map of the deep path aggregation network, from which the target's center heat map, center offset and re-ID feature vector can be output.
Further, the step 2.2 of constructing the training sample specifically includes the following steps:
Four images are selected from the input data set and input into the deep path aggregation network.
Further, the error function is designed in step 2.3 to perform back propagation, and the parameters of the network are optimized until convergence, specifically:
Calculating the center heat map of each target in the image with the Focal Loss function, calculating the center offset and predicted box size of each target with the L1 loss function, and calculating the re-ID embedding loss value of each target with the ID loss function; then assigning corresponding weights to the three loss values to form a total loss value. After training, the center position, predicted box size and re-ID features of each target in the image can be obtained. The total loss value is:
L_total = (1/2) (e^(-w1) · L_detection + e^(-w2) · L_identity + w1 + w2)

L_detection = L_heat + L_box
where w1 is a learnable parameter that adjusts the weight of target detection in the total loss function; w2 is a learnable parameter that adjusts the re-ID task in the total loss function, balancing target detection against re-ID; e is the natural constant; L_identity denotes the loss function of the re-ID task; and L_detection denotes the loss function of the detection task, which comprises the heat map loss and the prediction box size and offset losses of each target in the image.
Further, the step 3 of performing multi-target tracking in the video specifically comprises the following steps:
Step 301, within a video, inputting a frame image into the trained deep path aggregation network to obtain its convolution feature map, and then generating the target heat map, prediction box size, prediction box offset and re-ID embedding feature maps respectively;
Step 302, in subsequent frames, tracking each target online according to all target positions and re-ID features inferred from the previous frame, as follows:
Step 3021, computing the detection boxes and re-ID features of all targets in the current frame with the trained deep path aggregation network;
Step 3022, predicting each target's position in the current frame with a Kalman filter from its motion trajectory in the previous frame;
Step 3023, computing the cosine distances between the re-ID features of all targets in the previous frame and all targets in the current frame, together with the IoU of the detection boxes; if the cosine distance score is greater than 0.4 and the IoU score is greater than 0.5, the tracking is deemed successful, and the successfully tracked target is linked to the existing motion trajectory;
Step 3024, for all successfully tracked targets, performing a state update with the Kalman filter to obtain the optimal estimate in the current frame.
The invention also provides a multi-target tracking system based on the deep path aggregation network, which comprises a data preprocessing unit, a deep path aggregation network training unit and a video multi-target tracking unit;
the data preprocessing unit is used for inputting an image sequence, and performing data enhancement on the image sequence by using rotation, scaling and color dithering;
the deep path aggregation network training unit is used for training a designed deep path aggregation network and is configured to execute the following steps:
step A, designing a network structure of a deep path aggregation network;
step B, constructing a training sample as the input of the network;
step C, designing an error function to perform back propagation, and optimizing parameters of the network until convergence;
the video multi-target tracking unit is configured to execute the following actions: extracting a detection area and re-ID characteristics of the targets based on the trained depth path aggregation network, and performing tracking prediction by using a Kalman filter according to the cosine distance of the re-ID characteristics of each target and IoU of the detection frame to obtain the positions of the targets in the current frame.
The multi-target tracking method and system based on the deep path aggregation network have the following advantages:
1. the multi-target tracking method based on the depth path aggregation network can be used for tracking all targets of each frame in any video;
2. the method can output the characteristics of different levels of the target by using the trained deep path aggregation network and by multi-scale aggregation of the cross-resolution ratio, so that the robustness on the appearance change of the target is stronger.
3. The invention adopts the feature fusion layer from bottom to top to extract the target space feature information, thereby shortening the information path between the bottom layer feature and the top layer feature, leading the tracker to have higher tracking precision and simultaneously ensuring the real-time performance of tracking.
Drawings
Fig. 1 is a schematic diagram of a multi-target tracking method based on a deep path aggregation network according to the present invention.
Detailed Description
In order to better understand the purpose, structure and function of the present invention, a multi-target tracking method and system based on a deep path aggregation network according to the present invention are described in further detail below with reference to the accompanying drawings.
It will be understood by those skilled in the art that, unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the prior art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein.
The invention firstly provides a multi-target tracking method based on a deep path aggregation network, which is shown by referring to fig. 1 and comprises the following steps:
Step 1, data preprocessing: performing data enhancement on the input image sequence using rotation, scaling and color jitter to obtain the network's input data set.
For each frame of every video in the training set, an angle between -10 and 10 degrees is randomly selected and the image is rotated by that angle; an image scaling operation with a ratio of 4 is then applied; finally the image color depth and brightness are increased by 0.5 times, yielding the network's input data set.
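As a concrete illustration, the augmentation described above can be sketched in Python. The brightness multiplier of 1.5 (reading "increasing by 0.5 times" as a 1.5x multiplication) and the use of NumPy are assumptions; the patent does not fix an implementation, and a real pipeline would also apply the rotation and scaling with an image library:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def sample_rotation_angle():
    # Rotation angle drawn uniformly from [-10, 10] degrees, as the text specifies.
    return rng.uniform(-10.0, 10.0)

def brightness_jitter(img, factor=1.5):
    # Reading "increasing ... brightness by 0.5 times" as multiplying pixel
    # values by 1.5 (an assumption); clip back to the valid 8-bit range.
    out = img.astype(np.float32) * factor
    return np.clip(out, 0, 255).astype(np.uint8)
```

The sampled angle would then be passed to a rotation routine (e.g. an affine warp) before scaling and color jitter are applied.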
Step 2, constructing a deep path aggregation network, comprising the following substeps:
step 2.1, designing a network structure of the deep path aggregation network, which specifically comprises the following steps:
Step 201, on the basis of the DLA network, adding three top-down feature map layers at the final stage to perform down-sampling and aggregation operations on the DLA network's output feature map, thereby obtaining three feature maps of different resolutions. Deep layer aggregation (DLA) networks are neural networks that extract features and perform multi-scale aggregation; the deep path aggregation network proposed here is an improvement built on the DLA network.
Step 202, for the three output feature maps of different resolutions, performing multi-scale aggregation of the medium- and small-resolution feature maps with the large-resolution feature map to obtain a high-resolution feature map representation; this representation is the output feature map of the deep path aggregation network, from which the target's center heat map, center offset and re-ID feature vector can be output.
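The multi-scale aggregation of step 202 can be illustrated with a minimal NumPy sketch. The nearest-neighbour upsampling and the plain element-wise summation are stand-ins for the learned upsampling and aggregation layers a real network would use:

```python
import numpy as np

def upsample2x(fmap):
    # Nearest-neighbour 2x upsampling of a (C, H, W) feature map; a stand-in
    # for the learned upsampling of the actual network.
    return fmap.repeat(2, axis=1).repeat(2, axis=2)

def aggregate_multiscale(p_large, p_mid, p_small):
    # p_large: (C, H, W), p_mid: (C, H/2, W/2), p_small: (C, H/4, W/4).
    # Bring the two coarser maps up to the large map's resolution and sum,
    # yielding the high-resolution fused representation.
    return p_large + upsample2x(p_mid) + upsample2x(upsample2x(p_small))
```

In the actual architecture the three maps would come from the added top-down layers, and the fused map feeds the heat map, offset, box size and re-ID heads.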
Step 2.2, constructing training samples: four images are selected from the input data set and input into the deep path aggregation network as the network input;
step 2.3, designing an error function to perform back propagation, and optimizing parameters of the network until convergence, wherein the method specifically comprises the following steps:
The Focal Loss function is used to calculate the center heat map of each target in the image, the L1 loss to calculate each target's center offset and predicted box size, and the ID loss to calculate each target's re-ID embedding loss; corresponding weights are then assigned to the three loss values to form a total loss value. After training, the center position, predicted box size and re-ID features of each target in the image can be obtained. Focal Loss, L1 loss and ID loss are loss functions commonly used in deep learning; they can be understood as computing, respectively, the target center heat map loss, the center offset and bounding box size loss, and the re-ID embedding loss. re-ID is short for re-identification: the deep path aggregation network finally outputs re-ID feature vectors for all targets in every frame of the video, and in the tracking stage the positions of targets from the previous frame are re-identified in the next frame by computing cosine distances between the re-ID feature vectors of adjacent frames. The total loss value is:
L_total = (1/2) (e^(-w1) · L_detection + e^(-w2) · L_identity + w1 + w2)

L_detection = L_heat + L_box
where w1 is a learnable parameter that adjusts the weight of target detection in the total loss function; w2 is a learnable parameter that adjusts the re-ID task in the total loss function, balancing target detection against re-ID; e is the natural constant; L_identity denotes the loss function of the re-ID task; and L_detection denotes the loss function of the detection task, which comprises the heat map loss and the prediction box size and offset losses of each target in the image.
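The combination of the two task losses with the learnable parameters w1 and w2 can be sketched as follows. The uncertainty-style weighting 0.5·(e^(-w)·L + w) is an assumption reconstructed from the surrounding definitions, since the original equation appears only as an image placeholder:

```python
import math

def total_loss(l_detection, l_identity, w1, w2):
    # Uncertainty-weighted combination of the detection and re-ID losses
    # (an assumed reconstruction; w1 and w2 are learnable scalars).
    # The trailing +w1 +w2 terms keep the weights from drifting unboundedly.
    return 0.5 * (math.exp(-w1) * l_detection
                  + math.exp(-w2) * l_identity
                  + w1 + w2)
```

Raising w1 down-weights the detection loss relative to the re-ID loss, which is how the network can learn to balance the two tasks during training.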
Step 3, performing multi-target tracking on every detected object in the video: extracting each target's detection region and re-ID features with the trained deep path aggregation network, and computing, for each target, the cosine distance between its re-ID features in a given frame and in the previous frame, together with the IoU of the detection boxes. IoU measures the degree of overlap of two boxes; in object detection it is commonly used to judge the overlap between a predicted box and the ground-truth box, so that during training the predicted box is continually adjusted to approach the true box's position and size. A Kalman filter is then used for tracking prediction to obtain each target's position in the current frame, with the following specific steps:
Step 301, within a video, inputting a frame image into the trained deep path aggregation network to obtain its convolution feature map, and then generating the target heat map, prediction box size, prediction box offset and re-ID embedding feature maps respectively;
Step 302, in subsequent frames, tracking each target online according to all target positions and re-ID features inferred from the previous frame, as follows:
Step 3021, computing the detection boxes and re-ID features of all targets in the current frame with the trained deep path aggregation network;
Step 3022, predicting each target's position in the current frame with a Kalman filter from its motion trajectory in the previous frame;
Step 3023, computing the cosine distances between the re-ID features of all targets in the previous frame and all targets in the current frame, together with the IoU of the detection boxes; if the cosine distance score is greater than 0.4 and the IoU score is greater than 0.5, the tracking is deemed successful, and the successfully tracked target is linked to the existing motion trajectory;
Step 3024, for all successfully tracked targets, performing a state update with the Kalman filter to obtain the optimal estimate in the current frame.
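The association and filtering logic of steps 3021-3024 can be sketched as follows. The gating of cosine similarity against 0.4 (reading the text's "cosine distance greater than 0.4" as a similarity threshold) and the 1-D constant-velocity Kalman step are simplified illustrations under those assumptions, not the patent's exact implementation:

```python
import numpy as np

def cosine_similarity(a, b):
    # Similarity of two re-ID feature vectors (small epsilon avoids /0).
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def iou(box_a, box_b):
    # Intersection-over-union of two (x1, y1, x2, y2) boxes.
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-12)

def is_match(prev_feat, cur_feat, prev_box, cur_box):
    # Both gates must pass (thresholds 0.4 and 0.5 per the text) before a
    # detection is linked to an existing trajectory.
    return (cosine_similarity(prev_feat, cur_feat) > 0.4
            and iou(prev_box, cur_box) > 0.5)

def kalman_step(x, v, p, z, q=1e-2, r=1.0):
    # Minimal 1-D constant-velocity Kalman predict/update: predict the
    # position from the motion model, then correct it with measurement z.
    x_pred, p_pred = x + v, p + q
    k = p_pred / (p_pred + r)              # Kalman gain
    x_new = x_pred + k * (z - x_pred)
    return x_new, v, (1.0 - k) * p_pred
```

A full tracker would run this per box coordinate (or on a multi-dimensional state) and feed matched detections back as the Kalman measurements.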
The invention also provides a multi-target tracking system based on the deep path aggregation network, which comprises a data preprocessing unit, a deep path aggregation network training unit and a video multi-target tracking unit;
the data preprocessing unit is used for inputting an image sequence, and performing data enhancement on the image sequence by using rotation, scaling and color dithering;
the deep path aggregation network training unit is used for training a designed deep path aggregation network and is configured to execute the following steps:
step A, designing a network structure of a deep path aggregation network;
step B, constructing a training sample as the input of the network;
step C, designing an error function to perform back propagation, and optimizing parameters of the network until convergence;
the video multi-target tracking unit is configured to execute the following actions: extracting a detection area and re-ID characteristics of the targets based on the trained depth path aggregation network, and performing tracking prediction by using a Kalman filter according to the cosine distance of the re-ID characteristics of each target and IoU of the detection frame to obtain the positions of the targets in the current frame.
It will be understood by those within the art that each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the methods specified in the block or blocks of the block diagrams and/or flowchart block or blocks.
It is to be understood that the present invention has been described with reference to certain embodiments, and that various changes in the features and embodiments, or equivalent substitutions may be made therein by those skilled in the art without departing from the spirit and scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from the essential scope thereof. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed, but that the invention will include all embodiments falling within the scope of the appended claims.

Claims (7)

1. A multi-target tracking method based on a deep path aggregation network is characterized by comprising the following steps:
step 1, data preprocessing: aiming at each frame image in each section of video of the training set, performing data enhancement by using rotation, scaling and color dithering to obtain an input data set of the network;
step 2, constructing a deep path aggregation network, comprising the following substeps:
step 2.1, designing a network structure of the deep path aggregation network;
2.2, constructing a training sample, selecting images from the input data set, and inputting the images into the depth path aggregation network as the input of the network;
step 2.3, designing an error function to perform back propagation, and optimizing parameters of the network until convergence;
step 3, performing multi-target tracking on each detected object in the video: extracting a detection area and re-ID characteristics of the target based on the trained depth path aggregation network, calculating the cosine distance of the re-ID characteristics of each target in a certain frame and the previous frame of the video and IoU of the detection frame, and performing tracking prediction by using a Kalman filter to obtain the position of the target in the current frame.
2. The multi-target tracking method based on the deep path aggregation network as claimed in claim 1, wherein the data preprocessing step in the step 1 is as follows:
and randomly selecting an angle of-10 to 10 for each frame of image in each section of video of the training set, rotating, then carrying out image scaling operation with the proportion of 4, and finally increasing the image color depth and the image brightness by 0.5 time to obtain an input data set of the network.
3. The multi-target tracking method based on the deep path aggregation network according to claim 1, wherein the step 2.1 of designing the network structure of the deep path aggregation network specifically comprises the following steps:
step 201, based on the DLA network, adding three feature map layers from top to bottom at the final stage for performing down-sampling and aggregation operations on the output feature map of the DLA network, thereby obtaining three feature maps with different resolutions;
and 202, performing multi-scale aggregation on two medium and small resolution feature maps and another large resolution feature map to obtain a high resolution feature map representation, aiming at the output three feature maps with different resolutions.
4. The multi-target tracking method based on the deep path aggregation network as claimed in claim 3, wherein the constructing of the training samples in the step 2.2 specifically includes the following steps:
and 4 images are selected from the input data set and input into the depth path aggregation network.
5. The multi-target tracking method based on the deep path aggregation network as claimed in claim 4, wherein the designing of the error function in the step 2.3 is performed with back propagation to optimize the parameters of the network until convergence, and specifically includes:
calculating the center heat map of each target in the image using the Focal Loss function, calculating the center offset and predicted box size of each target using the L1 loss function, and calculating the re-ID embedding loss value of each target using the ID loss function; then assigning corresponding weights to the three loss values to form a total loss value; after training, obtaining the center position, predicted box size and re-ID features of each target in the image; the total loss value is:
L_total = (1/2) (e^(-w1) · L_detection + e^(-w2) · L_identity + w1 + w2)

L_detection = L_heat + L_box
where w1 is a learnable parameter that adjusts the weight of target detection in the total loss function; w2 is a learnable parameter that adjusts the re-ID task in the total loss function, balancing target detection against re-ID; e is the natural constant; L_identity denotes the loss function of the re-ID task; and L_detection denotes the loss function of the detection task, which comprises the heat map loss and the prediction box size and center offset losses of each target in the image.
6. The multi-target tracking method based on the deep path aggregation network as claimed in claim 1, wherein performing multi-target tracking in the video in step 3 specifically comprises the following steps:
step 301, inputting a frame of the video into the trained deep path aggregation network to obtain its convolution feature map, and then generating the target heat map, the prediction-frame size map, the prediction-frame offset map and the re-ID embedding map respectively;
step 302, in each subsequent frame, tracking every target online according to the target positions and re-ID features inferred from the previous frame, as follows:
step 3021, calculating the detection frames and re-ID features of all targets in the current frame with the trained deep path aggregation network;
step 3022, predicting each target's position in the current frame with a Kalman filter, based on its motion track in the previous frames;
step 3023, calculating the cosine distance between the re-ID features of every target in the previous frame and every target in the current frame, together with the IoU of their detection frames; if the cosine score is greater than 0.4 and the IoU score is greater than 0.5, the target is deemed successfully tracked and is linked to its existing motion track;
step 3024, updating the state of every successfully tracked target with the Kalman filter to obtain its optimal estimate in the current frame.
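The gating test of step 3023 can be sketched as below. The thresholds 0.4 and 0.5 come from the claim; the (x1, y1, x2, y2) box format and the interpretation of the cosine score as a similarity (higher = more alike) are illustrative assumptions, not details fixed by the patent:

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two re-ID feature vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb + 1e-12)

def iou(box_a, box_b):
    # Intersection-over-union of two boxes given as (x1, y1, x2, y2).
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-12)

def match(track, detection, cos_thresh=0.4, iou_thresh=0.5):
    # A detection extends an existing track only if both the appearance
    # cue (re-ID cosine score) and the spatial cue (IoU) pass their gates.
    return (cosine_similarity(track["feat"], detection["feat"]) > cos_thresh
            and iou(track["box"], detection["box"]) > iou_thresh)
```

Requiring both cues to agree is what lets the method survive the failure mode of either one alone: appearance matching bridges short occlusions, while the IoU gate rejects look-alike targets far from the predicted position.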
7. A multi-target tracking system based on the deep path aggregation network, characterized by comprising a data preprocessing unit, a deep path aggregation network training unit and a video multi-target tracking unit;
the data preprocessing unit is used for inputting an image sequence and performing data enhancement on it by rotation, scaling and color jittering;
the deep path aggregation network training unit is used for training a designed deep path aggregation network and is configured to execute the following steps:
step A, designing a network structure of a deep path aggregation network;
step B, constructing a training sample as the input of the network;
step C, designing an error function to perform back propagation, and optimizing parameters of the network until convergence;
the video multi-target tracking unit is configured to execute the following actions: extracting the detection areas and re-ID features of the targets with the trained deep path aggregation network, and performing tracking prediction with a Kalman filter according to the cosine distance of each target's re-ID features and the IoU of the detection frames, to obtain the position of each target in the current frame.
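The Kalman prediction and update used by the tracking unit (steps 3022 and 3024 of claim 6) can be sketched per coordinate. This is a minimal constant-velocity filter with illustrative process and measurement noise values (q, r), not the patent's exact parameterization; a full tracker would run such filters over the box center and size jointly:

```python
class ScalarCVKalman:
    """Constant-velocity Kalman filter for one coordinate.

    State is (position x, velocity v); each frame observes position only.
    A tracker runs one filter per box coordinate; q and r are illustrative.
    """

    def __init__(self, x0, q=0.01, r=0.1):
        self.x, self.v = x0, 0.0
        self.p = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.q, self.r = q, r

    def predict(self):
        # Step 3022: project the track one frame ahead (dt = 1),
        # P <- F P F^T + Q with F = [[1, 1], [0, 1]].
        self.x += self.v
        p = self.p
        self.p = [[p[0][0] + p[0][1] + p[1][0] + p[1][1] + self.q,
                   p[0][1] + p[1][1]],
                  [p[1][0] + p[1][1],
                   p[1][1] + self.q]]
        return self.x

    def update(self, z):
        # Step 3024: fuse the matched detection's position into the state
        # via the Kalman gain K = P H^T / (H P H^T + R), H = [1, 0].
        s = self.p[0][0] + self.r
        k0 = self.p[0][0] / s
        k1 = self.p[1][0] / s
        y = z - self.x  # innovation
        self.x += k0 * y
        self.v += k1 * y
        p = self.p
        self.p = [[(1 - k0) * p[0][0], (1 - k0) * p[0][1]],
                  [p[1][0] - k1 * p[0][0], p[1][1] - k1 * p[0][1]]]
        return self.x
```

The updated estimate always lands between the prediction and the new measurement, weighted by the gain; this is the "optimal estimation in the current frame" that step 3024 refers to.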
CN202210599934.6A 2022-05-26 2022-05-26 Multi-target tracking method and system based on deep path aggregation network Pending CN114897939A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210599934.6A CN114897939A (en) 2022-05-26 2022-05-26 Multi-target tracking method and system based on deep path aggregation network


Publications (1)

Publication Number Publication Date
CN114897939A true CN114897939A (en) 2022-08-12

Family

ID=82725901

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210599934.6A Pending CN114897939A (en) 2022-05-26 2022-05-26 Multi-target tracking method and system based on deep path aggregation network

Country Status (1)

Country Link
CN (1) CN114897939A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116258608A (en) * 2023-05-15 2023-06-13 中铁水利信息科技有限公司 Water conservancy real-time monitoring information management system integrating GIS and BIM three-dimensional technology
CN116258608B (en) * 2023-05-15 2023-08-11 中铁水利信息科技有限公司 Water conservancy real-time monitoring information management system integrating GIS and BIM three-dimensional technology

Similar Documents

Publication Publication Date Title
CN113506317B (en) Multi-target tracking method based on Mask R-CNN and apparent feature fusion
CN109800689B (en) Target tracking method based on space-time feature fusion learning
CN107452015B (en) Target tracking system with re-detection mechanism
CN111814621A (en) Multi-scale vehicle and pedestrian detection method and device based on attention mechanism
CN113674416B (en) Three-dimensional map construction method and device, electronic equipment and storage medium
CN113409361B (en) Multi-target tracking method and device, computer and storage medium
CN104820997B (en) A kind of method for tracking target based on piecemeal sparse expression Yu HSV Feature Fusion
CN112668483B (en) Single-target person tracking method integrating pedestrian re-identification and face detection
CN104517275A (en) Object detection method and system
CN110827320B (en) Target tracking method and device based on time sequence prediction
CN110009060A (en) A kind of robustness long-term follow method based on correlation filtering and target detection
CN106780567B (en) Immune particle filter extension target tracking method fusing color histogram and gradient histogram
CN111161325A (en) Three-dimensional multi-target tracking method based on Kalman filtering and LSTM
CN111027505A (en) Hierarchical multi-target tracking method based on significance detection
CN105809718A (en) Object tracking method with minimum trajectory entropy
CN113763427A (en) Multi-target tracking method based on coarse-fine shielding processing
CN115063447A (en) Target animal motion tracking method based on video sequence and related equipment
CN115690545B (en) Method and device for training target tracking model and target tracking
Foresti Object detection and tracking in time-varying and badly illuminated outdoor environments
CN114897939A (en) Multi-target tracking method and system based on deep path aggregation network
CN113223064A (en) Method and device for estimating scale of visual inertial odometer
CN102800105B (en) Target detection method based on motion vector
Lee et al. An edge detection–based eGAN model for connectivity in ambient intelligence environments
CN116861262B (en) Perception model training method and device, electronic equipment and storage medium
CN113129336A (en) End-to-end multi-vehicle tracking method, system and computer readable medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination