CN117670938A - Multi-target spatio-temporal tracking method based on an overload-control robot - Google Patents

Multi-target spatio-temporal tracking method based on an overload-control robot

Info

Publication number
CN117670938A
Authority
CN
China
Prior art keywords
target
network
traffic
dimensional
vehicle
Prior art date
Legal status
Granted
Application number
CN202410125476.1A
Other languages
Chinese (zh)
Other versions
CN117670938B (en)
Inventor
罗江
喻恺
陈震
陈昭彰
吴传洁
王新官
陈鹏
黄涛
李欣
于大龙
Current Assignee
Jiangxi Fangxing Technology Co., Ltd.
Original Assignee
Jiangxi Fangxing Technology Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Jiangxi Fangxing Technology Co., Ltd.
Priority to CN202410125476.1A
Priority claimed from CN202410125476.1A
Publication of CN117670938A
Application granted
Publication of CN117670938B
Legal status: Active
Anticipated expiration


Abstract

The invention discloses a multi-target spatio-temporal tracking method based on an overload-control robot. An integrated multi-level description network is constructed so that vehicle targets of different scales are well described by static features in large-scale traffic scenes; combined with a traffic-scene target feature set and a key-frame screening network, static attributes such as the position, category and size of each traffic target are obtained efficiently, and a feature model belonging to each traffic target is constructed to describe that target uniquely. For traffic monitoring cameras, the CARLA simulation platform is used to generate a large-scale traffic-monitoring scene dataset covering multiple scenes and traffic targets, providing complete target and scene feature information for subsequent modeling. For video segments, a multi-target tracking network is designed, a trajectory model is established, and the spatio-temporal graph under a single camera is generated and updated. Finally, considering that cameras cannot fully cover all road sections, the spatio-temporal graphs from multiple cameras are fused using the feature models of individual vehicles and of the vehicle topology on the covered road sections.

Description

Multi-target spatio-temporal tracking method based on an overload-control robot
Technical Field
The invention relates to the technical field of highways, in particular to a multi-target spatio-temporal tracking method based on an overload-control robot.
Background
Overload control aims to curb the phenomenon of illegal vehicle overloading. Illegal overloaded transport induces a large number of road traffic safety accidents: it damages roads and transport vehicles, endangers drivers, and harms the normal competitive business environment of the transport market. Traditional detection methods rely mainly on hand-designed features; although the algorithms are simple and computationally light, they suffer from incomplete feature extraction, high false-detection rates and poor robustness when facing factors such as camera shake, target occlusion, illumination changes and weather changes.
A deep learning method therefore needs to be introduced. Deep learning has strong self-learning ability and can model complex tasks; compared with traditional methods, its large number of model parameters can approximate complex nonlinear relations, giving the model stronger expressive power and the algorithm higher accuracy, while distributed storage and parallel computing make the algorithm run faster. However, deep learning methods still face the following problems.
(1) Fully automatic camera calibration based on deep learning. Existing methods often focus on calibration in generic scenes and do not consider the particularities of traffic scenes, which contain lane lines, vehicles and the like; moreover, a dataset for automatic camera calibration in traffic scenes is still lacking.
(2) Convolutional neural networks suffer from convolutional features that are sensitive to scale changes, region-of-interest pooling that damages the feature structure of small objects, and back-propagation errors that accumulate during network training; these problems leave the final feature extraction incomplete. Continuous target detection, feature extraction and modeling in traffic scenes still need further study, especially to improve the detection accuracy of small targets.
(3) Existing multi-target tracking methods are usually divided into three stages, namely detection, feature modeling and association matching; the three stages depend heavily on one another, and few methods achieve integrated multi-target tracking by predicting the motion parameters of targets.
(4) Target detection networks usually focus on localization and recognition and mostly cannot complete several tasks at once, whereas the static description of traffic includes not only the positions and classes of vehicle targets but also axles, vehicle colors and the like; static information extraction based on a multi-task network therefore urgently needs study.
For these reasons, the invention discloses a multi-target spatio-temporal tracking method based on an overload-control robot.
Disclosure of Invention
The invention aims to provide a multi-target spatio-temporal tracking method based on an overload-control robot.
The problems the invention aims to solve are as follows. First, obtain more accurate camera calibration parameters. Then redesign the deep neural network structure to improve its scale sensitivity, and construct a two-dimensional and three-dimensional detection-integrated multi-level description network, so that vehicle targets of different scales can be better described by static features in large-scale traffic scenes. Combined with the traffic-scene target feature set and the key-frame screening network, efficiently obtain static attributes such as the position, category, size, number of axles, license plate and vehicle color of each traffic target, construct a feature model belonging to that target, and use the feature model to describe it uniquely. For traffic monitoring cameras, generate through the CARLA simulation platform a large-scale traffic-monitoring scene dataset covering various climate conditions, traffic scenes and traffic targets, providing complete target and scene feature information for subsequent research. For video segments, design a graph-network-based multi-target tracking algorithm, establish a trajectory feature model, and generate and update the spatio-temporal graph under a single camera. Finally, considering that cameras cannot fully cover all road sections, fuse the spatio-temporal graphs from multiple cameras using the feature models of individual vehicles and of the vehicle topology on the covered road sections.
A multi-target spatio-temporal tracking method based on an overload-control robot comprises the following steps:
s1, establishing a traffic scene target feature set;
s2, screening key frames by a key frame screening network;
s3, constructing an integrated detection network, and integrally describing two-dimensional and three-dimensional attributes of traffic targets in multiple layers;
s4, modeling traffic target characteristics;
s5, constructing a multi-target tracking network to track the multi-target motion parameters;
s6, modeling the dynamic information characteristics of the vehicle.
Further, the traffic-scene target feature set established in S1 includes an actual-scene image feature set and a virtual traffic-scene feature set: the actual-scene images are obtained by collecting high-definition images from actual road, bridge and tunnel monitoring cameras at home and abroad, while the virtual traffic-scene feature set is produced by simulation with the autonomous-driving platform CARLA, in which a virtual camera placed in the virtual traffic scene records virtual traffic-scene videos.
Further, the key-frame screening network in S2 builds a large-scale video-frame screening image library. The frame to be screened and the previous frame are fed into a convolutional network to obtain a one-dimensional vector containing the features of both images; a fully connected layer then combines the image features, classifies them with softmax, and outputs the key-frame result. The video-frame screening image library covers two classes of images through manual labeling: useless frames (label 0) and key frames (label 1). Useless frames include frames with few or no traffic targets and frames in which the road surface occupies a small proportion of the image; key frames include frames with many traffic targets and frames in which the road surface occupies a large proportion of the image.
Further, the integrated detection network in S3 includes:
s31, inputting a traffic scene video frame, and generating a convolution feature map through a convolution layer;
s32, generating a group of suggestions based on the convolution feature map by using the regional proposal network RPN;
s33, predicting a boundary box possibly containing an object by using an area proposal network RPN, pooling by utilizing deconvolution and bilinear kernels, expanding small proposal area characteristics, avoiding the problem of insensitivity of a small target caused by representing the small traffic target by repeated values, and applying pooling operation to serially fusing pooling elements positioned at different convolution layers in a plurality of layers of a convolution neural network to fuse low-layer detailed information and high-layer semantic information;
s34, dividing the network into a plurality of branches according to the size of the proposal area, reducing the training burden of traffic targets with different scales and dimensions, and improving the detection precision of large objects and small objects;
s35, after combining the characteristics of the ROI area, predicting targets with different scales by adopting three prediction branches, wherein the three prediction branches respectively correspond to a small target, a medium target and a large target;
s36, adding a full-connection layer into the three prediction branches, adding a three-dimensional prediction branch into the multi-level prediction branch, which is responsible for detecting key points of a vehicle, adding a pyramid mechanism to adapt to a multi-scale vehicle target, and simultaneously introducing the multi-level prediction branches of the license plate, the vehicle color and the vehicle axle number of the detection target to obtain different vehicle static characteristics;
s37, fusing all prediction results of the branches to obtain a final detection result.
Further, the loss function of the integrated detection network is:

$$L = L_{2D} + L_{3D} + L_{tpl} + L_{part}$$

$$L_{2D} = \frac{1}{N_{cls}}\sum_{i} L_{cls}(p_i, p_i^{*}) + \lambda_{2D}\,\frac{1}{N_{reg}}\sum_{i} p_i^{*}\, L_{reg}(t_i, t_i^{*})$$

$$L_{3D} = \lambda_{3D}\,\frac{1}{N_{reg}}\sum_{i} p_i^{*}\, L_{reg}(d_i, d_i^{*}),\qquad L_{tpl} = \lambda_{tpl}\,\frac{1}{N_{reg}}\sum_{i} p_i^{*}\, L_{reg}(m_i, m_i^{*}),\qquad L_{part} = \lambda_{part}\,\frac{1}{N_{reg}}\sum_{i} p_i^{*}\, L_{reg}(c_i, c_i^{*})$$

where the overall loss function $L$ consists of four parts: two-dimensional detection, three-dimensional detection, template similarity and component detection. $L_{cls}$ is the standard softmax loss; $p_i$ denotes a proposal in the batch, with $p_i^{*}=1$ when the proposal is positive; $L_{reg}$ is the smooth L1 loss. In $L_{2D}$, $\lambda_{2D}$ is the regularization constant for category and regression, and $t_i$, $t_i^{*}$ are the two-dimensional rectangular-box regression vector predicted by the integrated network and the ground-truth two-dimensional box regression vector. In $L_{3D}$, $\lambda_{3D}$ is the regularization constant of the three-dimensional detection branch, and $d_i$, $d_i^{*}$ are the three-dimensional regression vector obtained by network prediction and the ground-truth three-dimensional regression vector. In $L_{tpl}$, $\lambda_{tpl}$ is the regularization parameter for template similarity, and $m_i$, $m_i^{*}$ are the predicted template vector and the ground-truth template vector. In $L_{part}$, $\lambda_{part}$ is the regularization parameter for component detection, and $c_i$, $c_i^{*}$ are the predicted target-component vector and the ground-truth component vector positions.
Further, the traffic-target feature model in S4 is built from the target static attributes acquired in S3: a feature model belonging to the corresponding target is constructed, and the acquired target static attributes are stored in binary with a unified information-coding format.
Further, the multi-target tracking network is constructed based on the deep motion modeling network DMM-Net, with the following specific structure: frames between time stamps $t_0$ and $t_N$ are input; the video-frame sequence is convolved by a feature extractor, after which 6 feature layers are selected and fed respectively into a motion-information network, a classification network and a visibility network, whose loss functions are $L_{motion}$, $L_{cls}$ and $L_{vis}$. The loss function of the whole network is:

$$L = \frac{1}{N}\left(L_{motion} + \alpha L_{cls} + \beta L_{vis}\right)$$

where $N$ is the number of positive anchor tubes, and $\alpha$ and $\beta$ are hyper-parameters for manually adjusting the weighting of the terms.
Further, the vehicle dynamic-information feature modeling in S6 constructs an information feature model from the three-dimensional motion trajectories and timestamp information of video vehicles acquired by the multi-target tracking network; the information features include timestamp, image position, real position, average speed, category and color.
The invention has the following beneficial effects. The invention uses three-dimensional machine vision to extract truck-underbody information during overload control; based on the constructed two-dimensional and three-dimensional detection-integrated multi-level description network for traffic targets, static attribute information such as position, category, size, number of axles, license plate and vehicle color is obtained, a feature model belonging to the vehicle target is constructed, and the acquired static attributes are uniformly encoded; an information feature model is built from the three-dimensional motion trajectory and timestamp information of vehicle targets in the video;
the application innovation of multi-source data fusion is realized, and traffic volume information such as vehicle speed, vehicle type, flow, vehicle head distance, vehicle head time interval, vehicle following percentage and the like is analyzed and obtained on the basis of vehicle basic information acquisition. When the vehicles are queued to pass through the detection area, the car queue is inserted, no license plate or license plate identification error exists, the vehicles are backed up and the like, the acquired positions and the traffic flow states of the vehicles ensure that the license plate, weight, wheel axle, outline, photo, video and other data of the same vehicle are matched and summarized into a driving record, and the problems of multi-source data matching and vehicle queue information uploading are solved;
the whole process monitoring of truck weighing is realized, the information including the speed of the truck at the place of loading, the weight balance, the sliding edge, the abnormal acceleration, the dynamic vehicle separation and the like can be monitored and provided, the single truck weighing data strip is realized, the data and the weighing interference are avoided, and more complete evidence is provided for the super business.
Drawings
FIG. 1 is a schematic diagram of a key frame screening network architecture according to the present invention;
FIG. 2 is a schematic diagram of an integrated detection network architecture according to the present invention;
FIG. 3 is a schematic diagram of a modeling and encoding flow of a traffic target feature model according to the present invention;
FIG. 4 is a schematic diagram of a DMM-Net network architecture according to the present invention.
Detailed Description
The present invention will be further described more fully hereinafter, but the scope of the invention is not limited thereto.
A multi-target spatio-temporal tracking method based on an overload-control robot comprises the following steps:
s1, establishing a traffic scene target feature set;
s2, screening key frames by a key frame screening network;
s3, constructing an integrated detection network, and integrally describing two-dimensional and three-dimensional attributes of traffic targets in multiple layers;
s4, modeling traffic target characteristics;
s5, constructing a multi-target tracking network to track the multi-target motion parameters;
s6, modeling the dynamic information characteristics of the vehicle.
Further, the large-scale target datasets published so far, such as COCO, Pascal VOC and the BIT vehicle dataset, contain many common-object features but do not satisfy traffic scenes and the present research problem. Because the research problem must consider large-scale traffic scenes, the camera's field of view is wide, and a target on the road deforms severely as it drives toward and away from the camera; at the same time, to cover the diversity of traffic scenes and account for traffic targets in complex traffic environments, several traffic-scene-oriented datasets must be constructed. The traffic-scene target feature set established in S1 therefore includes an actual-scene image feature set and a virtual traffic-scene feature set. The actual-scene images, collected as high-definition images from actual road, bridge and tunnel monitoring cameras at home and abroad, cover traffic-target features with various scale and shape changes and account for insufficient light caused by severe weather such as overcast and rainy days. The virtual traffic-scene feature set is generated by simulation with the autonomous-driving platform CARLA: virtual cameras placed in virtual traffic scenes record virtual traffic-scene videos, from which abundant scene description information can be obtained (camera position and angle, camera intrinsic and extrinsic parameter matrices, weather in the scene, degree of vehicle congestion, etc.) as well as abundant traffic-target description information (position, speed, type, number, etc.). The simulated traffic-scene target feature set collects more than 300 traffic scenes and more than 4000 traffic videos comprising over 20 million video frames. The actual scenes and the simulated feature set complement each other's strengths: the real operating state of traffic targets is considered, the diversity of traffic scenes is enriched, and complete target and scene feature information is provided for subsequent research.
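As an illustration of how such a virtual feature set can be captured, the following minimal sketch uses the public CARLA Python API to place a roadside camera and record frames; the pose, resolution and output path are assumptions for illustration, not the configuration used by the invention:

```python
# Minimal sketch: record a virtual monitoring camera in CARLA.
# Assumes a CARLA server on localhost:2000; resolution, pose and
# output path are illustrative, not the settings of the invention.
import carla

client = carla.Client('localhost', 2000)
client.set_timeout(10.0)
world = client.get_world()

# RGB camera blueprint configured like a roadside monitoring camera.
bp = world.get_blueprint_library().find('sensor.camera.rgb')
bp.set_attribute('image_size_x', '1920')
bp.set_attribute('image_size_y', '1080')

# Hypothetical gantry pose overlooking the carriageway.
pose = carla.Transform(carla.Location(x=0.0, y=0.0, z=8.0),
                       carla.Rotation(pitch=-25.0))
camera = world.spawn_actor(bp, pose)

# Save each frame; ground truth (weather, vehicle positions, types)
# can be queried from `world` alongside every recorded frame.
camera.listen(lambda image: image.save_to_disk('out/%08d.png' % image.frame))
```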
Further, traffic video sequences often contain a large number of redundant video frames, and analyzing every frame would seriously slow computation. For the original video sequence, a key-frame screening network is therefore applied first to extract the valuable video frames for subsequent analysis, improving processing efficiency. As shown in fig. 1, the key-frame screening network in S2 builds a large-scale video-frame screening image library; the frame to be screened and the previous frame are fed into a convolutional network to obtain a one-dimensional vector containing the features of both images; a fully connected layer then combines the image features, classifies them with softmax, and outputs the key-frame result. The video-frame screening image library covers two classes of images through manual labeling: useless frames (label 0) and key frames (label 1). Useless frames include frames with few or no traffic targets and frames in which the road surface occupies a small proportion of the image; key frames include frames with many traffic targets and frames in which the road surface occupies a large proportion of the image.
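A minimal sketch of such a screening network, assuming a PyTorch implementation in which the current and previous frames are concatenated along the channel axis (layer sizes are illustrative; the invention does not specify them):

```python
# Key-frame screening sketch (PyTorch): the current and previous RGB
# frames are stacked to 6 channels; output is softmax over
# {0: useless frame, 1: key frame}. Layer sizes are assumptions.
import torch
import torch.nn as nn

class KeyFrameScreeningNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(             # convolutional network
            nn.Conv2d(6, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),               # -> one-dimensional vector
        )
        self.classifier = nn.Linear(64, 2)         # fully connected layer

    def forward(self, current, previous):
        x = torch.cat([current, previous], dim=1)  # features of both images
        x = self.features(x).flatten(1)
        return self.classifier(x)                  # logits; softmax follows

net = KeyFrameScreeningNet()
logits = net(torch.randn(1, 3, 224, 224), torch.randn(1, 3, 224, 224))
is_key_frame = logits.softmax(dim=1)[0, 1] > 0.5
```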
Further, as shown in fig. 2, the integrated detection network in S3 includes:
s31, inputting a traffic scene video frame, and generating a convolution feature map through a convolution layer;
s32, generating a group of suggestions based on the convolution feature map by using the regional proposal network RPN;
s33, regional proposal network RPN, namely regional generation network prediction, possibly contains a boundary box of an object, and pools by utilizing deconvolution and bilinear kernels, so that small proposal area characteristics are enlarged, the problem of insensitivity of a small target caused by representing a small traffic target by a repeated value is avoided, and pooling operation is applied to serially fusing pooling elements positioned in different convolution layers in a plurality of layers of a convolution neural network with low-layer detailed information and high-layer semantic information;
s34, dividing the network into a plurality of branches according to the size of the proposal area, reducing the training burden of traffic targets with different scales and dimensions, and improving the detection precision of large objects and small objects;
s35, after combining the characteristics of the ROI region, namely the region of interest, for targets with different scales, predicting by adopting three prediction branches, wherein the three prediction branches respectively correspond to a small target, a medium target and a large target;
s36, adding a full-connection layer into the three prediction branches, adding a three-dimensional prediction branch into the multi-level prediction branch, which is responsible for detecting key points of a vehicle, adding a pyramid mechanism to adapt to a multi-scale vehicle target, and simultaneously introducing the multi-level prediction branches of the license plate, the vehicle color and the vehicle axle number of the detection target to obtain different vehicle static characteristics;
s37, fusing all prediction results of the branches to obtain a final detection result.
Further, the loss function of the integrated detection network is:

$$L = L_{2D} + L_{3D} + L_{tpl} + L_{part}$$

$$L_{2D} = \frac{1}{N_{cls}}\sum_{i} L_{cls}(p_i, p_i^{*}) + \lambda_{2D}\,\frac{1}{N_{reg}}\sum_{i} p_i^{*}\, L_{reg}(t_i, t_i^{*})$$

$$L_{3D} = \lambda_{3D}\,\frac{1}{N_{reg}}\sum_{i} p_i^{*}\, L_{reg}(d_i, d_i^{*}),\qquad L_{tpl} = \lambda_{tpl}\,\frac{1}{N_{reg}}\sum_{i} p_i^{*}\, L_{reg}(m_i, m_i^{*}),\qquad L_{part} = \lambda_{part}\,\frac{1}{N_{reg}}\sum_{i} p_i^{*}\, L_{reg}(c_i, c_i^{*})$$

where the overall loss function $L$ consists of four parts: two-dimensional detection, three-dimensional detection, template similarity and component detection. $L_{cls}$ is the standard softmax loss; $p_i$ denotes a proposal in the batch, with $p_i^{*}=1$ when the proposal is positive; $L_{reg}$ is the smooth L1 loss. In $L_{2D}$, $\lambda_{2D}$ is the regularization constant for category and regression, and $t_i$, $t_i^{*}$ are the two-dimensional rectangular-box regression vector predicted by the integrated network and the ground-truth two-dimensional box regression vector. In $L_{3D}$, $\lambda_{3D}$ is the regularization constant of the three-dimensional detection branch, and $d_i$, $d_i^{*}$ are the three-dimensional regression vector obtained by network prediction and the ground-truth three-dimensional regression vector. In $L_{tpl}$, $\lambda_{tpl}$ is the regularization parameter for template similarity, and $m_i$, $m_i^{*}$ are the predicted template vector and the ground-truth template vector. In $L_{part}$, $\lambda_{part}$ is the regularization parameter for component detection, and $c_i$, $c_i^{*}$ are the predicted target-component vector and the ground-truth component vector positions.
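Under the reconstruction above, the four-part loss can be sketched as follows, assuming batched ROI tensors and treating the λ weights as configurable constants:

```python
# Four-part integrated loss sketch: 2-D box, 3-D box, template
# similarity and component detection. Tensor shapes are assumptions.
import torch
import torch.nn.functional as F

def integrated_loss(cls_logits, labels, positive_mask,
                    t, t_gt, d, d_gt, m, m_gt, c, c_gt,
                    lam_2d=1.0, lam_3d=1.0, lam_tpl=1.0, lam_part=1.0):
    pos = positive_mask.float()
    n_pos = pos.sum().clamp(min=1.0)

    # Standard softmax (cross-entropy) classification loss over proposals.
    l_cls = F.cross_entropy(cls_logits, labels)

    def reg(pred, gt):
        # Smooth L1, counted only for positive proposals (p_i* = 1).
        per_roi = F.smooth_l1_loss(pred, gt, reduction='none').sum(dim=1)
        return (pos * per_roi).sum() / n_pos

    return (l_cls + lam_2d * reg(t, t_gt)      # 2-D rectangular-box term
            + lam_3d * reg(d, d_gt)            # 3-D detection branch
            + lam_tpl * reg(m, m_gt)           # template-similarity term
            + lam_part * reg(c, c_gt))         # component-detection term

# Example with 12 proposals and 8 classes; positives are non-background.
logits, labels = torch.randn(12, 8), torch.randint(0, 8, (12,))
loss = integrated_loss(logits, labels, labels > 0,
                       torch.randn(12, 4), torch.randn(12, 4),
                       torch.randn(12, 7), torch.randn(12, 7),
                       torch.randn(12, 32), torch.randn(12, 32),
                       torch.randn(12, 10), torch.randn(12, 10))
```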
Further, the traffic-target feature model in S4 is built from the target static attributes acquired in S3: a feature model belonging to the corresponding target is constructed, and the acquired static attributes are stored in binary with a unified information-coding format, the specific coding format being shown in fig. 3. As can be seen from fig. 3, the traffic target to be modeled is a red vehicle; its image position, category, size, number of axles, license plate and color are obtained with the two-dimensional and three-dimensional detection-integrated multi-level description network, the information is converted by an encoder and stored, completing the modeling of the traffic-target feature model, and when the target's specific traffic information is needed it can be recovered by a decoder. The traffic-target feature model describes the target's static attributes uniquely, so it can be used to identify a target uniquely in multi-camera linked traffic scenes, and through encoding and decoding the target information recorded by the feature model can be used in various environments.
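A minimal sketch of such binary encoding and decoding with Python's struct module; the field layout is an assumption for illustration, since the actual coding format is the one defined in fig. 3:

```python
# Binary encoding/decoding sketch for the static attributes.
import struct

# (x, y, w, h) image box | class id | length, width, height | axle
# count | license plate (16 bytes, UTF-8, zero-padded) | color id
RECORD = struct.Struct('<4f B 3f B 16s B')

def encode_target(box, cls_id, size, axles, plate, color_id):
    return RECORD.pack(*box, cls_id, *size, axles,
                       plate.encode('utf-8')[:16], color_id)

def decode_target(blob):
    x, y, w, h, cls_id, L, W, H, axles, plate, color_id = RECORD.unpack(blob)
    return {'box': (x, y, w, h), 'class': cls_id, 'size': (L, W, H),
            'axles': axles, 'plate': plate.rstrip(b'\0').decode('utf-8'),
            'color': color_id}

blob = encode_target((10.0, 20.0, 120.0, 80.0), 2, (4.5, 1.8, 1.5),
                     2, '赣A12345', 3)
assert decode_target(blob)['plate'] == '赣A12345'
```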
Further, as shown in fig. 4, the multi-target tracking network is constructed based on the deep motion modeling network DMM-Net, with the following specific structure: frames between time stamps $t_0$ and $t_N$ are input; the video-frame sequence is convolved by a feature extractor, after which 6 feature layers are selected and fed respectively into a motion-information network, a classification network and a visibility network, whose loss functions are $L_{motion}$, $L_{cls}$ and $L_{vis}$. The loss function of the whole network is:

$$L = \frac{1}{N}\left(L_{motion} + \alpha L_{cls} + \beta L_{vis}\right)$$

where $N$ is the number of positive anchor tubes, and $\alpha$ and $\beta$ are hyper-parameters for manually adjusting the weighting of the terms. By constructing this multi-target tracking network, the multi-target tracking task in the video is realized.
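A minimal sketch of this total loss, assuming the three sub-network losses have already been summed over the clip:

```python
# DMM-Net-style total loss: motion, classification and visibility
# terms averaged over positive anchor tubes; alpha/beta defaults
# are illustrative assumptions.
import torch

def dmm_total_loss(l_motion, l_cls, l_vis, num_pos_tubes,
                   alpha=1.0, beta=1.0):
    n = max(num_pos_tubes, 1)  # guard against clips with no positive tube
    return (l_motion + alpha * l_cls + beta * l_vis) / n

loss = dmm_total_loss(torch.tensor(3.2), torch.tensor(1.1),
                      torch.tensor(0.7), num_pos_tubes=12)
```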
In S6, the vehicle dynamic-information feature modeling constructs an information feature model from the three-dimensional motion trajectories and timestamp information of video vehicles acquired by the multi-target tracking network; the information features include timestamp, image position, real position, average speed, category and color, and through these feature vectors the dynamic information of vehicles in the video scene can be expressed comprehensively and in detail.
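A minimal sketch of such an information feature record; the field names and types are illustrative assumptions based on the list above:

```python
# Per-vehicle dynamic-information feature record (sketch).
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class VehicleDynamicFeature:
    timestamps: List[float]                        # one entry per tracked frame
    image_track: List[Tuple[float, float]]         # (u, v) pixel positions
    world_track: List[Tuple[float, float, float]]  # real 3-D positions
    avg_speed: float                               # e.g. metres per second
    category: str                                  # e.g. 'truck'
    color: str                                     # e.g. 'red'

    @property
    def duration(self) -> float:
        # Length of the observed trajectory in seconds.
        return self.timestamps[-1] - self.timestamps[0] if self.timestamps else 0.0
```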
The embodiments of the present invention are disclosed as preferred embodiments, but the invention is not limited thereto; those skilled in the art will readily appreciate from the foregoing description that various extensions and modifications can be made without departing from the spirit of the invention.

Claims (8)

1. A multi-target spatio-temporal tracking method based on an overload-control robot, characterized by comprising the following steps:
s1, establishing a traffic scene target feature set;
s2, screening key frames by a key frame screening network;
s3, constructing an integrated detection network, and integrally describing two-dimensional and three-dimensional attributes of traffic targets in multiple layers;
s4, modeling traffic target characteristics;
s5, constructing a multi-target tracking network to track the multi-target motion parameters;
s6, modeling the dynamic information characteristics of the vehicle.
2. The multi-target spatio-temporal tracking method based on an overload-control robot according to claim 1, wherein the traffic-scene target feature set established in S1 comprises an actual-scene image feature set and a virtual traffic-scene feature set, the actual-scene images being obtained by collecting high-definition images from actual road, bridge and tunnel monitoring cameras at home and abroad, and the virtual traffic-scene feature set being produced by simulation with the autonomous-driving platform CARLA, in which a virtual camera is placed in the virtual traffic scene so that virtual traffic-scene videos are generated by recording from the virtual camera.
3. The multi-target spatio-temporal tracking method based on an overload-control robot according to claim 1, wherein the key-frame screening network in S2 builds a large-scale video-frame screening image library; the frame to be screened and the previous frame are fed into a convolutional network to obtain a one-dimensional vector containing the features of both images; a fully connected layer combines the image features and classifies them with softmax, outputting the key-frame result; and the video-frame screening image library covers two classes of images through manual labeling: useless frames (label 0) and key frames (label 1), wherein useless frames include frames with few or no traffic targets and frames in which the road surface occupies a small proportion of the image, and key frames include frames with many traffic targets and frames in which the road surface occupies a large proportion of the image.
4. The multi-target spatio-temporal tracking method based on an overload-control robot according to claim 1, wherein the integrated detection network in S3 comprises:
s31, inputting a traffic scene video frame, and generating a convolution feature map through a convolution layer;
s32, generating a group of suggestions based on the convolution feature map by using the regional proposal network RPN;
s33, predicting a boundary box possibly containing an object by using an area proposal network RPN, pooling by utilizing deconvolution and bilinear kernels, expanding small proposal area characteristics, avoiding the problem of insensitivity of a small target caused by representing the small traffic target by repeated values, and applying pooling operation to serially fusing pooling elements positioned at different convolution layers in a plurality of layers of a convolution neural network to fuse low-layer detailed information and high-layer semantic information;
s34, dividing the network into a plurality of branches according to the size of the proposal area, reducing the training burden of traffic targets with different scales and dimensions, and improving the detection precision of large objects and small objects;
s35, after combining the characteristics of the ROI area, predicting targets with different scales by adopting three prediction branches, wherein the three prediction branches respectively correspond to a small target, a medium target and a large target;
s36, adding a full-connection layer into the three prediction branches, adding a three-dimensional prediction branch into the multi-level prediction branch, which is responsible for detecting key points of a vehicle, adding a pyramid mechanism to adapt to a multi-scale vehicle target, and simultaneously introducing the multi-level prediction branches of the license plate, the vehicle color and the vehicle axle number of the detection target to obtain different vehicle static characteristics;
s37, fusing all prediction results of the branches to obtain a final detection result.
5. The multi-target spatio-temporal tracking method based on an overload-control robot according to claim 4, wherein the loss function of the integrated detection network is:

$$L = L_{2D} + L_{3D} + L_{tpl} + L_{part}$$

$$L_{2D} = \frac{1}{N_{cls}}\sum_{i} L_{cls}(p_i, p_i^{*}) + \lambda_{2D}\,\frac{1}{N_{reg}}\sum_{i} p_i^{*}\, L_{reg}(t_i, t_i^{*})$$

$$L_{3D} = \lambda_{3D}\,\frac{1}{N_{reg}}\sum_{i} p_i^{*}\, L_{reg}(d_i, d_i^{*}),\qquad L_{tpl} = \lambda_{tpl}\,\frac{1}{N_{reg}}\sum_{i} p_i^{*}\, L_{reg}(m_i, m_i^{*}),\qquad L_{part} = \lambda_{part}\,\frac{1}{N_{reg}}\sum_{i} p_i^{*}\, L_{reg}(c_i, c_i^{*})$$

wherein the overall loss function $L$ consists of four parts: two-dimensional detection, three-dimensional detection, template similarity and component detection; $L_{cls}$ is the standard softmax loss; $p_i$ denotes a proposal in the batch, with $p_i^{*}=1$ when the proposal is positive; $L_{reg}$ is the smooth L1 loss; in $L_{2D}$, $\lambda_{2D}$ is the regularization constant for category and regression, and $t_i$, $t_i^{*}$ are the two-dimensional rectangular-box regression vector predicted by the integrated network and the ground-truth two-dimensional box regression vector; in $L_{3D}$, $\lambda_{3D}$ is the regularization constant of the three-dimensional detection branch, and $d_i$, $d_i^{*}$ are the predicted and ground-truth three-dimensional regression vectors; in $L_{tpl}$, $\lambda_{tpl}$ is the regularization parameter for template similarity, and $m_i$, $m_i^{*}$ are the predicted and ground-truth template vectors; in $L_{part}$, $\lambda_{part}$ is the regularization parameter for component detection, and $c_i$, $c_i^{*}$ are the predicted target-component vector and the ground-truth component vector positions.
6. The multi-target spatio-temporal tracking method based on an overload-control robot according to claim 1, wherein the traffic-target feature model in S4 is built from the target static attributes acquired in S3, a feature model belonging to the corresponding target is constructed, and the acquired target static attributes are stored in binary with a unified information-coding format.
7. The multi-target spatio-temporal tracking method based on an overload-control robot according to claim 1, wherein the multi-target tracking network is constructed based on the deep motion modeling network DMM-Net, with the following specific structure:
frames between time stamps $t_0$ and $t_N$ are input; the video-frame sequence is convolved by a feature extractor, after which 6 feature layers are selected and fed respectively into a motion-information network, a classification network and a visibility network, whose loss functions are $L_{motion}$, $L_{cls}$ and $L_{vis}$; the loss function of the whole network is

$$L = \frac{1}{N}\left(L_{motion} + \alpha L_{cls} + \beta L_{vis}\right)$$

wherein $N$ is the number of positive anchor tubes, and $\alpha$ and $\beta$ are hyper-parameters for manually adjusting the weighting of the terms.
8. The multi-target spatio-temporal tracking method based on an overload-control robot according to claim 1, wherein the vehicle dynamic-information feature modeling in S6 constructs an information feature model from the three-dimensional motion trajectories and timestamp information of video vehicles acquired by the multi-target tracking network, the information features including timestamp, image position, real position, average speed, category and color.
CN202410125476.1A 2024-01-30 Multi-target spatio-temporal tracking method based on an overload-control robot Active CN117670938B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410125476.1A CN117670938B (en) 2024-01-30 Multi-target spatio-temporal tracking method based on an overload-control robot

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410125476.1A CN117670938B (en) 2024-01-30 Multi-target spatio-temporal tracking method based on an overload-control robot

Publications (2)

Publication Number Publication Date
CN117670938A 2024-03-08
CN117670938B (en) 2024-05-10



Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111968123A (en) * 2020-08-28 2020-11-20 Beijing Jiaotong University Semi-supervised video target segmentation method
CN112307921A (en) * 2020-10-22 2021-02-02 Guilin University of Electronic Technology Vehicle-mounted end multi-target identification tracking prediction method
KR20220080631A (en) * 2020-12-07 2022-06-14 Pukyong National University Industry-University Cooperation Foundation Apparatus and method for tracking multi-object in real time
CN113139620A (en) * 2021-05-14 2021-07-20 Chongqing University of Technology End-to-end multi-target detection and tracking joint method based on target association learning
CN114550023A (en) * 2021-12-31 2022-05-27 Wuhan Zhongjiao Traffic Engineering Co., Ltd. Traffic target static information extraction device
CN114387265A (en) * 2022-01-19 2022-04-22 Civil Aviation University of China Anchor-frame-free detection and tracking unified method based on attention module addition
CN114627447A (en) * 2022-03-10 2022-06-14 Shandong University Road vehicle tracking method and system based on attention mechanism and multi-target tracking
CN115861884A (en) * 2022-12-06 2023-03-28 Central South University Video multi-target tracking method, system, device and medium in complex scene
CN116229112A (en) * 2022-12-06 2023-06-06 Chongqing University of Posts and Telecommunications Twin network target tracking method based on multiple attentions
CN116189116A (en) * 2023-04-24 2023-05-30 Jiangxi Fangxing Technology Co., Ltd. Traffic state sensing method and system
CN116912804A (en) * 2023-07-31 2023-10-20 Jiangsu University Efficient anchor-frame-free 3-D target detection and tracking method and model

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ZHOU, Yan; CHEN, Junyu; WANG, Dongli; ZHU, Xiaolin: "Multi-object tracking using context-sensitive enhancement via feature fusion", Multimedia Tools and Applications, vol. 83, no. 7, 10 August 2023, pp. 19465-19484 *
LIU, Wenqiang et al.: "A survey of deep online multi-object tracking algorithms", Journal of Frontiers of Computer Science and Technology, vol. 16, no. 12, 31 December 2022, pp. 2718-2733 *
XU, Xiaowei; CHEN, Qiankun; QIAN, Feng; LI, Haodong; TANG, Zhipeng: "Real-time vehicle detection and tracking algorithm based on miniaturized YOLOv3", Journal of Highway and Transportation Research and Development, no. 08, 15 August 2020 *

Similar Documents

Publication Publication Date Title
CN108694386B (en) Lane line detection method based on parallel convolution neural network
Geiger et al. Vision meets robotics: The kitti dataset
CN111429484B (en) Multi-target vehicle track real-time construction method based on traffic monitoring video
Yang et al. Video scene understanding using multi-scale analysis
CN111598030A (en) Method and system for detecting and segmenting vehicle in aerial image
CN114170580A (en) Highway-oriented abnormal event detection method
Sellat et al. Intelligent semantic segmentation for self-driving vehicles using deep learning
Masihullah et al. Attention based coupled framework for road and pothole segmentation
CN109543520B (en) Lane line parameterization method for semantic segmentation result
Švorc et al. An infrared video detection and categorization system based on machine learning
CN117670938B (en) Multi-target spatio-temporal tracking method based on an overload-control robot
CN117670938A (en) Multi-target spatio-temporal tracking method based on an overload-control robot
CN111160282A (en) Traffic light detection method based on binary Yolov3 network
Yi et al. End-to-end neural network for autonomous steering using lidar point cloud data
CN114550023A (en) Traffic target static information extraction device
Khosravian et al. Multi‐domain autonomous driving dataset: Towards enhancing the generalization of the convolutional neural networks in new environments
CN116310970A (en) Automatic driving scene classification algorithm based on deep learning
Kheder et al. Transfer Learning Based Traffic Light Detection and Recognition Using CNN Inception-V3 Model
CN113298781B (en) Mars surface three-dimensional terrain detection method based on image and point cloud fusion
US20230084761A1 (en) Automated identification of training data candidates for perception systems
Krajewski et al. VeGAN: Using GANs for augmentation in latent space to improve the semantic segmentation of vehicles in images from an aerial perspective
Zou et al. HFT: Lifting Perspective Representations via Hybrid Feature Transformation for BEV Perception
CN114511740A (en) Vehicle image classification method, vehicle track restoration method, device and equipment
CN114255450A (en) Near-field vehicle jamming behavior prediction method based on forward panoramic image
Hadzic et al. Rasternet: Modeling free-flow speed using lidar and overhead imagery

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant