CN113744316A - Multi-target tracking method based on deep neural network - Google Patents

Multi-target tracking method based on deep neural network

Info

Publication number
CN113744316A
CN113744316A (application CN202111048838.4A)
Authority
CN
China
Prior art keywords
target
frame
video
target detection
original image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111048838.4A
Other languages
Chinese (zh)
Inventor
邢建川
蒋芷昕
孔渝峰
张栋
卢胜
陈洋
周春文
杨明兴
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China filed Critical University of Electronic Science and Technology of China
Priority to CN202111048838.4A priority Critical patent/CN113744316A/en
Publication of CN113744316A publication Critical patent/CN113744316A/en
Pending legal-status Critical Current

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06T - IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 - Image analysis
    • G06T 7/20 - Analysis of motion
    • G06T 7/246 - Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06N - COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 - Computing arrangements based on biological models
    • G06N 3/02 - Neural networks
    • G06N 3/04 - Architecture, e.g. interconnection topology
    • G06N 3/045 - Combinations of networks
    • G06N 3/08 - Learning methods
    • G06T 2207/00 - Indexing scheme for image analysis or image enhancement
    • G06T 2207/10 - Image acquisition modality
    • G06T 2207/10016 - Video; Image sequence
    • G06T 2207/20 - Special algorithmic details
    • G06T 2207/20081 - Training; Learning
    • G06T 2207/20084 - Artificial neural networks [ANN]

Abstract

The invention discloses a multi-target tracking method based on a deep neural network, which comprises the following steps: collecting a video to be tested, preprocessing it, and extracting its original image frames; performing target detection on each original image frame, identifying the targets to be tracked, and obtaining the target detection frames of each original image frame; matching the target detection frames of two consecutive frames on the time axis, calculating the similarity of the targets to be tracked in the target detection frames, comparing the similarity of the targets in the two consecutive frames, and judging whether they are the same target to be tracked; if so, assigning an ID (identity) and outputting the tracking result; if not, performing matching and judgment again; and achieving continuous multi-target tracking of the video based on the IDs and the tracking results. The invention integrates motion features and appearance features into the computation of the loss matrix, improves the accuracy of next-frame target prediction, and reduces the ID-switch index, thereby truly achieving continuous tracking of targets.

Description

Multi-target tracking method based on deep neural network
Technical Field
The invention relates to the technical field of machine vision, in particular to a multi-target tracking method based on a deep neural network.
Background
Early image recognition and detection relied mainly on manually designed visual feature descriptors (such as color, shape and edges). However, such traditional hand-crafted methods depend on prior knowledge of existing data sets, are strongly limited, cover real-world objects only to a small degree, are insufficient to find salient objects in complex scenes or to delineate object boundaries accurately, and therefore struggle to reach satisfactory performance.
The multi-target tracking problem first appeared in radar signal detection. With deepening research in computer vision and the continuous improvement of target detection algorithms, detection-based multi-target tracking algorithms have also developed considerably. In addition, with the intensive research on deep learning and its vigorous development, deep neural networks trained on large amounts of data have been applied to object recognition and detection. Compared with traditional methods, replacing manual design with deep neural networks markedly improves robustness to the various interference factors encountered in image processing and has allowed object recognition and detection tasks to advance rapidly. However, although deep neural networks have improved the performance of target detection and recognition, they suffer from large parameter scales and complex computation and place high demands on computing and storage resources, which makes them difficult to deploy effectively and widely on resource-constrained mobile devices such as smartphones and vehicle-mounted equipment. In addition, when multi-target recognition and tracking is performed without integrating the appearance information of the tracked object into the association and matching stage, false detections and frequent ID (identity) switches easily occur when the tracked object is occluded, so continuous tracking of the object cannot truly be achieved.
In the future, detection-based multi-target tracking algorithms will continue to focus on accuracy and efficiency, aiming to improve precision markedly while maintaining running speed.
Disclosure of Invention
The invention aims to provide a multi-target tracking method based on a deep neural network that solves the problems in the prior art. While using the motion state information of a target, it additionally considers the apparent information of the target and fuses motion features and appearance features into the computation of the loss matrix, thereby improving the accuracy of next-frame target prediction, allowing the program to better cope with interference caused by target occlusion, and reducing the ID-switch index, so that continuous tracking of targets is truly achieved.
In order to achieve the purpose, the invention provides the following scheme: the invention provides a multi-target tracking method based on a deep neural network, which comprises the following steps:
collecting a video to be tested, preprocessing the video to be tested, and extracting an original image frame of the video to be tested;
carrying out target detection on each original image frame, identifying a target to be tracked, and acquiring a target detection frame of each original image frame;
matching the target detection frames in two consecutive frames of images on the time axis, calculating the similarity of the targets to be tracked in the target detection frames, comparing the similarity of the targets to be tracked in the two consecutive frames of images, and judging whether they are the same target to be tracked; if so, allocating an ID (identity) and outputting a tracking result; if not, matching and judging again;
and realizing continuous tracking of multiple targets in the video based on the ID and the tracking result.
Preferably, the preprocessing the video to be tested, and the extracting the original image frame of the video to be tested includes:
reading the video to be tested by adopting an OpenCV library; acquiring the frame rate and the total frame number of the video to be tested by a get method; and extracting the original image frame of the video to be tested frame by frame or frame skipping based on the frame rate and the total frame number combined with image frame acquisition requirements.
Preferably, the target detection frame is obtained by adopting a target detector, wherein the target detector is built by adopting a YOLO network.
Preferably, before matching the target detection frame in two consecutive images on the time axis, kalman filtering is performed on the target detection frame.
Preferably, the kalman filtering process includes:
in the moving process of the target to be tracked, calculating an initial predicted value of the target detection frame in the current original image frame based on the target detection frame in the previous original image frame, wherein the initial predicted value is a vector; and acquiring a true value of the target detection frame in the current original image frame, calculating a linear weighted value of the initial predicted value and the true value, and acquiring a position predicted value of the target detection frame in the current original image frame.
Preferably, matching the target detection frames in two consecutive frames of images on the time axis comprises:
calculating a geometric distance d^{(1)}(i,j) between the predicted value and the true value:
d^{(1)}(i,j) = (d_j - y_i)^T \Sigma^{-1} (d_j - y_i)
in the formula, y_i is the predicted value of the i-th target tracker and d_j is the true (detected) value of the j-th detection;
adopting a CNN network to extract the apparent information of the target detection frame, storing the apparent information as an apparent information matrix, and calculating the minimum cosine distance of the apparent information matrix to obtain an apparent distance d^{(2)}(i,j):
d^{(2)}(i,j) = \min\{ 1 - r_j^T r_k^{(i)} \mid r_k^{(i)} \in R_i \}
in the formula, r_j represents the j-th appearance vector extracted by the CNN network and r_k^{(i)} represents the k-th appearance vector in the i-th target tracker;
calculating a linear weighted value of the geometric distance and the apparent distance based on the geometric distance and the apparent distance of the two consecutive frames of images to obtain a loss matrix c_{i,j}:
c_{i,j} = \lambda d^{(1)}(i,j) + (1 - \lambda) d^{(2)}(i,j);
in the formula, i is the i-th tracking result, j is the j-th detection result, and \lambda is a manually set weight;
the target detection frames in the two consecutive frames of images on the time axis are associated only when c_{i,j} falls within the thresholds set by the two constraints of the geometric distance and the apparent distance.
Preferably, a convolutional neural network is adopted to perform multi-layer processing on the target detection frame of the original image frame.
Preferably, the tracking result includes the item type of the target to be tracked and the time period from appearance to disappearance.
The invention discloses the following technical effects:
the invention provides a multi-target tracking method based on a deep neural network, which increases the consideration of apparent information of a target while using the motion state information of the target, integrates motion characteristics and appearance characteristics into the calculation process of a loss matrix, improves the accuracy of target prediction of a next frame, enables a program to better cope with the interference caused by the problem of target shielding, reduces ID Switch indexes, thereby really realizing continuous tracking of the target, supports a multi-target tracking software program of mixed type tracking, can detect and track different types of moving objects in the same video without mutual interference, and simultaneously outputs the respective motion state information of the moving objects, thereby providing early-stage data support and basis for a machine vision task of the next step.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed to be used in the embodiments will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and it is obvious for those skilled in the art to obtain other drawings without inventive exercise.
FIG. 1 is a flow chart of a multi-target tracking method according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of capturing each frame of image of an input video according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating the detection of an object on all frame images according to an embodiment of the present invention;
FIG. 4 is a schematic diagram illustrating a target detection frame marked on a frame image according to an embodiment of the present invention;
fig. 5 is a schematic image diagram of a video outputting a detection result according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In order to make the aforementioned objects, features and advantages of the present invention comprehensible, embodiments accompanied with figures are described in further detail below.
The invention provides a multi-target tracking method based on a deep neural network, which comprises the following steps of:
s101, collecting a video to be tested, preprocessing the video to be tested, and extracting an original image frame of the video to be tested.
In this embodiment, an intersection with heavy traffic is selected as the test video scene. The video to be tested lasts 11.205 seconds, its size is 2.91 MB, and it contains 336 frames in total; the targets to be tracked are set to "car" and "truck", i.e., different types of vehicles travelling on the road are tracked. To preprocess the video, the existing OpenCV library is called in Python to read the complete video data, and the frame rate and total frame count of the original video are obtained with the get method. When the video contains many frames, acquiring images frame by frame takes a long time; in that case, whether to acquire the video images with frame skipping can be decided according to the requirements of the task at hand. The original image frames of the video to be tested are thus extracted, as shown in FIG. 2.
In this step, the original picture frames are extracted from the video so that subsequent target detection can be carried out with image-processing methods, and each processed frame is finally written back into the result video. Converting the multi-target tracking and recognition of a video into the processing of pictures is the key link that decomposes the video problem into a picture problem.
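As an illustration of this preprocessing step, the following is a minimal sketch using the OpenCV calls referred to above (cv2.VideoCapture and its get method); the file name and the frame-skipping interval are placeholders chosen for the example, not values fixed by the embodiment.

```python
import cv2

def extract_frames(video_path, frame_step=1):
    """Read a video and return its frames, optionally skipping frames.

    frame_step=1 extracts every frame; frame_step=3 keeps every third frame.
    """
    cap = cv2.VideoCapture(video_path)
    fps = cap.get(cv2.CAP_PROP_FPS)                 # frame rate of the video
    total = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))  # total number of frames
    print(f"fps={fps}, total_frames={total}")

    frames = []
    index = 0
    while True:
        ok, frame = cap.read()      # frame is a BGR numpy array
        if not ok:                  # end of video reached
            break
        if index % frame_step == 0:
            frames.append(frame)
        index += 1
    cap.release()
    return frames

# Example usage (the path is a placeholder):
# frames = extract_frames("test_video.mp4", frame_step=1)
```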
S102, carrying out target detection on each frame of original image, identifying a target to be tracked, and obtaining a target detection frame of each frame of original image.
A target detector based on the YOLOv5m model is constructed: the detector is built on the YOLOv5m network and trained with the MS COCO (Microsoft Common Objects in Context) data set.
The target detector is used to recognize and detect the targets "car" and "truck" in every original image frame; each recognized target object is marked with a rectangular frame, yielding a target detection frame that represents the position information of the target object in the different image frames, as shown in FIGS. 3-4. Through these steps the target detector is constructed and the target detection frames are added to the images.
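The following sketch illustrates one possible way to obtain such a detector: it loads a COCO-pretrained YOLOv5m model through the public ultralytics/yolov5 torch.hub interface and keeps only the "car" and "truck" detections. The hub interface and the confidence threshold are assumptions of this illustration; the embodiment itself only specifies that a YOLOv5m network trained on MS COCO is used.

```python
import torch

# Load a YOLOv5m model pretrained on MS COCO (weights are downloaded on first use).
model = torch.hub.load('ultralytics/yolov5', 'yolov5m', pretrained=True)
model.conf = 0.4  # assumed confidence threshold for this illustration

TRACKED_CLASSES = {'car', 'truck'}

def detect_targets(frame):
    """Run the detector on one BGR frame; return [x1, y1, x2, y2, conf, name] per target."""
    results = model(frame[..., ::-1])   # the hub model expects RGB input
    boxes = []
    for *xyxy, conf, cls in results.xyxy[0].tolist():
        name = model.names[int(cls)]
        if name in TRACKED_CLASSES:     # keep only the classes to be tracked
            boxes.append([*xyxy, conf, name])
    return boxes
```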
S103, matching target detection frames in two continuous frames of images on a time axis, calculating the similarity of the target to be tracked according to the target detection frames, comparing the similarity of the target to be tracked in the two continuous frames of images on the time axis, judging whether the target is the same target to be tracked, if so, distributing an ID (identity) and outputting a tracking result; and if not, performing matching and judgment again.
Although the target detection frames produced by the target detector give the position of an object fairly accurately, interference terms are inevitably introduced during extraction, so the position coordinates of the target frames are not accurate enough for practical applications and a "denoising" step is still required.
In this embodiment, to facilitate later storage and use of the object information, the motion state of a target is defined as a vector of eight normal-distribution parameters:
x = [u, v, \gamma, h, \dot{u}, \dot{v}, \dot{\gamma}, \dot{h}]^T
where (u, v) is the center coordinate of the candidate frame, \gamma is its aspect ratio, h is its height, and the remaining four parameters describe their rates of change, all initialized to 0. Predicting the next image frame amounts to predicting the four variables (u, v, \gamma, h).
When designing the filter, two models are referred to: a Constant Velocity Motion Model and a Linear Observation Model, and the algorithm is implemented in two steps: a uniform-velocity hypothesis and a linear update. During the movement of the target, the position coordinates, velocity and other parameter values of the rectangular frame in the (T_{n-1})-th image frame, together with prior knowledge and experience, are used to obtain a hypothesis (prediction) for the T_n-th image frame. After the hypothesized value and the true (detected) value are obtained, the linear weighted value of the two vectors is taken as the final estimate of the current target.
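A minimal sketch of such a constant-velocity Kalman filter over the eight-dimensional state [u, v, γ, h, u̇, v̇, γ̇, ḣ] is given below; the process- and measurement-noise covariances are illustrative placeholder values, not parameters stated in the embodiment.

```python
import numpy as np

class ConstantVelocityKalman:
    """Constant-velocity Kalman filter over x = [u, v, gamma, h, du, dv, dgamma, dh]."""

    def __init__(self, measurement):
        # State: measured box parameters plus velocities initialised to zero.
        self.x = np.r_[measurement, np.zeros(4)].astype(float)
        self.P = np.eye(8)              # state covariance (placeholder)
        self.F = np.eye(8)              # constant-velocity transition matrix
        self.F[:4, 4:] = np.eye(4)      # position += velocity at each step
        self.H = np.eye(4, 8)           # only (u, v, gamma, h) are observed
        self.Q = 0.01 * np.eye(8)       # process noise (assumed value)
        self.R = 0.1 * np.eye(4)        # measurement noise (assumed value)

    def predict(self):
        """Uniform-velocity hypothesis for the next frame."""
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:4]               # predicted (u, v, gamma, h)

    def update(self, z):
        """Linear update: blend the prediction with the detected box z."""
        y = z - self.H @ self.x                     # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)    # Kalman gain
        self.x = self.x + K @ y
        self.P = (np.eye(8) - K @ self.H) @ self.P
        return self.x[:4]
```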
In this embodiment, the Hungarian algorithm is used to match the target frames of the T_n-th frame with the target frames of the (T_{n-1})-th frame pairwise.
The matching process usually has to follow certain rules and custom constraints to ensure that the hypotheses made are sufficiently reasonable and efficient. A loss matrix is defined to describe the cost required to match two elements from the two sets. Defining the loss matrix requires computing the geometric distance and the apparent distance between the target frames of the T_n-th frame and those of the (T_{n-1})-th frame.
1. Calculating geometric distance
The geometric distance is defined as the Mahalanobis distance between the vector parameter y_i predicted by the Kalman filter and the vector parameter d_j obtained directly from detection, i.e., the standard-deviation-normalized distance between the detected position and the average tracked position, as shown in equation (1):
d^{(1)}(i,j) = (d_j - y_i)^T \Sigma^{-1} (d_j - y_i)    (1)
where \Sigma^{-1} is the inverse of the covariance matrix of the multidimensional vector and the superscript T denotes matrix transposition.
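For illustration, equation (1) can be evaluated as in the following sketch, where the covariance matrix S passed in plays the role of \Sigma; the identity matrix in the usage comment is only a placeholder.

```python
import numpy as np

def mahalanobis_distance(y_pred, d_det, S):
    """Squared Mahalanobis distance between a predicted box y_pred and a detection d_det.

    y_pred, d_det: length-4 vectors (u, v, gamma, h); S: 4x4 covariance matrix.
    """
    diff = np.asarray(d_det, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(diff @ np.linalg.inv(S) @ diff)

# Illustrative usage with a placeholder covariance:
# d1 = mahalanobis_distance(track.predict(), detection_box, S=np.eye(4))
```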
2. Calculating an apparent distance
The Mahalanobis distance works well when the reliability and predictability of the object's motion are relatively high. However, in image or video scenes with strong interference and heavy noise (e.g., distorted raw data, or a fast-moving or severely shaking camera), associating targets with the Mahalanobis distance alone becomes much less effective and causes more frequent ID jumps.
To address this problem, this embodiment uses a CNN network to extract the object inside each target frame and represent every pixel of the image: the convolutional neural network applies convolution kernels to the target frame of each frame through multiple layers of local convolution and pooling operations, combining low-level features into a higher-level feature map. In this embodiment the high-level feature map is stored as a 128-dimensional apparent information matrix, the matrix is normalized, and its minimum Cosine Distance, i.e., the apparent distance, is calculated as shown in equation (2):
d^{(2)}(i,j) = \min\{ 1 - r_j^T r_k^{(i)} \mid r_k^{(i)} \in R_i \}    (2)
where r_j denotes the j-th appearance vector extracted by the CNN network, r_k^{(i)} denotes the k-th appearance vector in the i-th target tracker, and R_i represents the appearance of that object in different frames.
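A small sketch of the minimum-cosine-distance computation of equation (2) follows; it assumes the 128-dimensional appearance vectors have already been extracted by the CNN and only normalizes and compares them.

```python
import numpy as np

def min_cosine_distance(gallery, query):
    """Minimum cosine distance between a detection's appearance vector and a track's gallery.

    gallery: array of shape (K, 128), the stored appearance vectors r_k of one tracker.
    query:   array of shape (128,), the appearance vector r_j of the current detection.
    """
    gallery = gallery / np.linalg.norm(gallery, axis=1, keepdims=True)
    query = query / np.linalg.norm(query)
    return float(np.min(1.0 - gallery @ query))
```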
3. Constructing a loss matrix
The geometric distance and the apparent distance of a target in two consecutive frames are obtained from equations (1) and (2); their linear weighted value is used as the final metric that defines the loss. The loss matrix is constructed as shown in equation (3):
c_{i,j} = \lambda d^{(1)}(i,j) + (1 - \lambda) d^{(2)}(i,j)    (3)
where i is the i-th tracking result, j is the j-th detection result, and \lambda is a manually set weight.
It follows from equation (3) that the data association between the targets of two consecutive frames is considered successful only when c_{i,j} falls within the thresholds set by both the geometric-distance and apparent-distance constraints.
The two distance values d^{(1)}(i,j) and d^{(2)}(i,j) must be normalized; when a value is much larger than 1, it is regarded as an error and discarded.
In addition, in this embodiment the similarity is obtained indirectly from the loss matrix: a large loss corresponds to a small similarity and a small loss to a large similarity, and the similarity is used to compare the targets in two consecutive frames.
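For illustration, the sketch below builds the loss matrix of equation (3) from precomputed, normalized geometric- and apparent-distance matrices (e.g. from the helpers sketched above) and solves the assignment with the Hungarian algorithm via scipy.optimize.linear_sum_assignment; the weight value lam=0.5 is an assumed example, not a value fixed by the embodiment.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def associate(d1, d2, lam=0.5):
    """Hungarian assignment over the loss matrix c = lam*d1 + (1-lam)*d2.

    d1, d2: (num_tracks, num_detections) matrices of normalized geometric and
            apparent distances.
    Returns the (track_index, detection_index) matches and the loss matrix.
    """
    cost = lam * np.asarray(d1) + (1.0 - lam) * np.asarray(d2)   # equation (3)
    rows, cols = linear_sum_assignment(cost)                     # Hungarian algorithm
    return list(zip(rows.tolist(), cols.tolist())), cost
```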
In this embodiment, before calculating the loss matrix, the detection frames are filtered using the confidence matrix, and the detection frames and features with insufficient reliability are deleted. The confidence matrix is used for expressing the credibility of the observed value and also comprises a geometric part and an apparent part.
1. Geometric confidence
The geometric confidence is defined as shown in equation (4):
b_{i,j}^{(1)} = \mathbb{1}[\, d^{(1)}(i,j) \le t^{(1)} \,]    (4)
where t^{(1)} is a constant threshold obtained with a standard hypothesis-testing method from probability statistics. Equation (4) states that b_{i,j}^{(1)} = 1 when d^{(1)}(i,j) \le t^{(1)}, and b_{i,j}^{(1)} = 0 otherwise.
2. Apparent confidence
The apparent confidence is defined in the same way as the geometric confidence, as shown in equation (5):
b_{i,j}^{(2)} = \mathbb{1}[\, d^{(2)}(i,j) \le t^{(2)} \,]    (5)
3. final confidence
Combining the geometric confidence and the apparent confidence yields the confidence matrix, i.e., the final confidence of the target, as shown in equation (6):
b_{i,j} = \prod_{m=1}^{2} b_{i,j}^{(m)}    (6)
where b_{i,j}^{(1)} and b_{i,j}^{(2)} constrain the result with equal importance. Clearly, a detection result can be considered reasonable only when both the geometric and the apparent measures of the detected value are reasonable.
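A minimal sketch of the gating of equations (4)-(6) follows; the thresholds t1 and t2 are assumed example values (t1 is set to the 0.95 chi-square quantile for four degrees of freedom, a common choice, while the embodiment only states that the constants come from hypothesis-testing methods).

```python
import numpy as np

def confidence_matrix(d1, d2, t1=9.4877, t2=0.3):
    """Element-wise gate: admissible only if both distances fall under their thresholds.

    d1, d2: (num_tracks, num_detections) matrices of geometric and apparent distances.
    t1, t2: assumed thresholds for this illustration.
    """
    b1 = (np.asarray(d1) <= t1).astype(float)   # geometric confidence, equation (4)
    b2 = (np.asarray(d2) <= t2).astype(float)   # apparent confidence, equation (5)
    return b1 * b2                              # final confidence, equation (6)
```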
If the detection result is reasonable, the target objects in the two consecutive target frames are the same target to be tracked; an ID is then assigned and the tracking result is output. If the detection result is unreasonable, the target objects in the two frames are not the same target to be tracked, and matching and judgment are performed again.
In this embodiment, the detection frames are first filtered with the confidence matrix and the detection frames and features with insufficient reliability are deleted, so that detection frames with large deviations are filtered out in advance and do not affect the subsequent similarity calculation. The loss matrix is then computed and its result is used as the input of the Hungarian algorithm. Finally, the confidence matrix is used once more to filter the similarity results: among results with similar similarity, the one with the higher confidence is kept, which further improves accuracy.
S104, outputting and storing the multi-target tracking results, integrating the output object state information, and visualizing the results.
After the basic multi-target tracking of the video content is completed, the result video with the continuously tracked target frames and the corresponding IDs is output, as shown in FIG. 5. At the same time, the textual information of the targets is output, including the target types, IDs, appearance times and target-frame position coordinates, as shown in Table 1. By reading the text content, the time period from appearance to disappearance of each target is obtained and stored in plain-text format.
TABLE 1
[Table 1 is reproduced as an image in the original publication; it lists, for each tracked target, the target type, ID, appearance time and target-frame position coordinates.]
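As an illustration of this output step, the sketch below draws each tracked target frame and its ID onto the image, writes the annotated frames into a result video with cv2.VideoWriter, and logs the textual state information; the file names, codec and track-list format are assumptions of the example.

```python
import cv2

def write_results(frames, tracks_per_frame, fps, out_video="result.mp4", out_txt="result.txt"):
    """Write annotated frames to a video and the per-frame target states to a text file.

    tracks_per_frame: for each frame, a list of (track_id, class_name, x1, y1, x2, y2).
    """
    h, w = frames[0].shape[:2]
    writer = cv2.VideoWriter(out_video, cv2.VideoWriter_fourcc(*"mp4v"), fps, (w, h))
    with open(out_txt, "w") as log:
        for idx, (frame, tracks) in enumerate(zip(frames, tracks_per_frame)):
            for tid, name, x1, y1, x2, y2 in tracks:
                # Draw the target frame and its ID on the image.
                cv2.rectangle(frame, (int(x1), int(y1)), (int(x2), int(y2)), (0, 255, 0), 2)
                cv2.putText(frame, f"{name}#{tid}", (int(x1), int(y1) - 5),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
                log.write(f"frame={idx} id={tid} class={name} box=({x1},{y1},{x2},{y2})\n")
            writer.write(frame)
    writer.release()
```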
The multi-target tracking method of this embodiment was tested, and the results are shown in Table 2. They indicate that keeping the tracking of a single target continuous over a long time can be difficult (the Frag index is high), but this does not seriously affect the overall tracking of the same target (ID-SW stays at a low level). In addition, when the image quality is good, the false-detection rate of the program stays low and most target types are correctly recognized and detected (the FN and FP indices are low), and the MOTA result shows that the program performs well both in object recognition and detection and in the continuity of the tracking trajectories. This step processes and visualizes the output information so that it is displayed more intuitively, which is more favorable for practical application.
TABLE 2
Measure GT ID_SW FN FP Frag MOTA
Result 7 1 0 1 11 71.429%
The above-described embodiments are merely illustrative of the preferred embodiments of the present invention, and do not limit the scope of the present invention, and various modifications and improvements of the technical solutions of the present invention can be made by those skilled in the art without departing from the spirit of the present invention, and the technical solutions of the present invention are within the scope of the present invention defined by the claims.

Claims (8)

1. A multi-target tracking method based on a deep neural network is characterized in that: the method comprises the following steps:
collecting a video to be tested, preprocessing the video to be tested, and extracting an original image frame of the video to be tested;
carrying out target detection on each original image frame, identifying a target to be tracked, and acquiring a target detection frame of each original image frame;
matching the target detection frames in two consecutive frames of images on the time axis, calculating the similarity of the targets to be tracked in the target detection frames, comparing the similarity of the targets to be tracked in the two consecutive frames of images, and judging whether they are the same target to be tracked; if so, allocating an ID (identity) and outputting a tracking result; if not, matching and judging again;
and realizing continuous tracking of multiple targets in the video based on the ID and the tracking result.
2. The deep neural network-based multi-target tracking method according to claim 1, wherein: preprocessing the video to be tested, and extracting the original image frame of the video to be tested comprises the following steps:
reading the video to be tested by adopting an OpenCV library; acquiring the frame rate and the total frame number of the video to be tested by a get method; and extracting the original image frame of the video to be tested frame by frame or frame skipping based on the frame rate and the total frame number combined with image frame acquisition requirements.
3. The deep neural network-based multi-target tracking method according to claim 1, wherein: and acquiring a target detection frame by adopting a target detector, wherein the target detector is built by adopting a YOLO network.
4. The deep neural network-based multi-target tracking method according to claim 1, wherein: and performing Kalman filtering on the target detection frame before matching the target detection frame in two continuous frames of images on a time axis.
5. The deep neural network-based multi-target tracking method according to claim 4, wherein: the Kalman filtering process comprises the following steps:
in the moving process of the target to be tracked, calculating an initial predicted value of the target detection frame in the current original image frame based on the target detection frame in the previous original image frame, wherein the initial predicted value is a vector; and acquiring a true value of the target detection frame in the current original image frame, calculating a linear weighted value of the initial predicted value and the true value, and acquiring a position predicted value of the target detection frame in the current original image frame.
6. The deep neural network-based multi-target tracking method according to claim 5, wherein: the target detection frame in two continuous frames of images on the matching time axis comprises the following steps:
calculating a geometric distance d between the predicted value and the real value based on the predicted value and the real value(1)(i,j):
Figure FDA0003252073190000021
In the formula, yiTo predict value, diIs the true value;
adopting a CNN network to extract the apparent information of the target detection frame, storing the apparent information as an apparent information matrix, calculating the minimum cosine distance of the apparent information matrix, and obtaining an apparent distance d(2)(i,j):
Figure FDA0003252073190000022
In the formula, rjRepresenting the extraction of the j-th appearance vector via the CNN network,
Figure FDA0003252073190000023
is shown in the ith orderA kth apparent vector in the target tracker;
calculating linear weighted values of the geometric distance and the apparent distance based on the geometric distance and the apparent distance of two continuous frames of images to obtain a loss matrix ci,j
ci,j=λd(1)(i,j)+(1-λ)d(2)(i,j);
In the formula, i is the ith tracking result, j is the jth detection result, and lambda is a weight value set manually;
when c is going toi,jAnd simultaneously, the target detection frames in two continuous frames of images on the time axis are associated when the target detection frames fall within the threshold set by the two constraints of the geometric distance and the apparent distance.
7. The deep neural network-based multi-target tracking method according to claim 1, wherein: and performing multilayer processing on the target detection frame of the original image frame by adopting a convolutional neural network.
8. The deep neural network-based multi-target tracking method according to claim 1, wherein: the tracking result comprises the item type of the target to be tracked and the time period from appearance to disappearance.
CN202111048838.4A 2021-09-08 2021-09-08 Multi-target tracking method based on deep neural network Pending CN113744316A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111048838.4A CN113744316A (en) 2021-09-08 2021-09-08 Multi-target tracking method based on deep neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111048838.4A CN113744316A (en) 2021-09-08 2021-09-08 Multi-target tracking method based on deep neural network

Publications (1)

Publication Number Publication Date
CN113744316A true CN113744316A (en) 2021-12-03

Family

ID=78737056

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111048838.4A Pending CN113744316A (en) 2021-09-08 2021-09-08 Multi-target tracking method based on deep neural network

Country Status (1)

Country Link
CN (1) CN113744316A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114972418A (en) * 2022-03-30 2022-08-30 北京航空航天大学 Maneuvering multi-target tracking method based on combination of nuclear adaptive filtering and YOLOX detection
CN115690163A (en) * 2023-01-04 2023-02-03 中译文娱科技(青岛)有限公司 Target tracking method, system and storage medium based on image content
CN116665177A (en) * 2023-07-31 2023-08-29 福思(杭州)智能科技有限公司 Data processing method, device, electronic device and storage medium

Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522843A (en) * 2018-11-16 2019-03-26 北京市商汤科技开发有限公司 A kind of multi-object tracking method and device, equipment and storage medium
CN109816690A (en) * 2018-12-25 2019-05-28 北京飞搜科技有限公司 Multi-target tracking method and system based on depth characteristic
CN110378259A (en) * 2019-07-05 2019-10-25 桂林电子科技大学 A kind of multiple target Activity recognition method and system towards monitor video
CN111127513A (en) * 2019-12-02 2020-05-08 北京交通大学 Multi-target tracking method
CN111882580A (en) * 2020-07-17 2020-11-03 元神科技(杭州)有限公司 Video multi-target tracking method and system
CN112001252A (en) * 2020-07-22 2020-11-27 北京交通大学 Multi-target tracking method based on heteromorphic graph network
CN112241969A (en) * 2020-04-28 2021-01-19 北京新能源汽车技术创新中心有限公司 Target detection tracking method and device based on traffic monitoring video and storage medium
CN112288773A (en) * 2020-10-19 2021-01-29 慧视江山科技(北京)有限公司 Multi-scale human body tracking method and device based on Soft-NMS
CN112308881A (en) * 2020-11-02 2021-02-02 西安电子科技大学 Ship multi-target tracking method based on remote sensing image
US20210065384A1 (en) * 2019-08-29 2021-03-04 Boe Technology Group Co., Ltd. Target tracking method, device, system and non-transitory computer readable storage medium
CN113034548A (en) * 2021-04-25 2021-06-25 安徽科大擎天科技有限公司 Multi-target tracking method and system suitable for embedded terminal

Patent Citations (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109522843A (en) * 2018-11-16 2019-03-26 北京市商汤科技开发有限公司 A kind of multi-object tracking method and device, equipment and storage medium
CN109816690A (en) * 2018-12-25 2019-05-28 北京飞搜科技有限公司 Multi-target tracking method and system based on depth characteristic
CN110378259A (en) * 2019-07-05 2019-10-25 桂林电子科技大学 A kind of multiple target Activity recognition method and system towards monitor video
US20210065384A1 (en) * 2019-08-29 2021-03-04 Boe Technology Group Co., Ltd. Target tracking method, device, system and non-transitory computer readable storage medium
CN111127513A (en) * 2019-12-02 2020-05-08 北京交通大学 Multi-target tracking method
CN112241969A (en) * 2020-04-28 2021-01-19 北京新能源汽车技术创新中心有限公司 Target detection tracking method and device based on traffic monitoring video and storage medium
CN111882580A (en) * 2020-07-17 2020-11-03 元神科技(杭州)有限公司 Video multi-target tracking method and system
CN112001252A (en) * 2020-07-22 2020-11-27 北京交通大学 Multi-target tracking method based on heteromorphic graph network
CN112288773A (en) * 2020-10-19 2021-01-29 慧视江山科技(北京)有限公司 Multi-scale human body tracking method and device based on Soft-NMS
CN112308881A (en) * 2020-11-02 2021-02-02 西安电子科技大学 Ship multi-target tracking method based on remote sensing image
CN113034548A (en) * 2021-04-25 2021-06-25 安徽科大擎天科技有限公司 Multi-target tracking method and system suitable for embedded terminal

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
HASITH KARUNASEKERA et al.: "Multiple Object Tracking With Attention to Appearance, Structure, Motion and Size", vol. 7, pages 104423 - 104434, XP011739318, DOI: 10.1109/ACCESS.2019.2932301 *
师燕妮: "Research on video-based human target detection and tracking technology" (基于视频的人体目标检测与跟踪技术研究), no. 3, pages 138 - 1388 *
武玉伟 et al.: "Fundamentals and Applications of Deep Learning" (《深度学习基础与应用》), vol. 1, Xidian University Press, pages 74 - 76 *
王溜: "Research on pedestrian multi-target tracking algorithms based on deep learning" (基于深度学习的行人多目标跟踪算法研究), no. 8, pages 138 - 453 *

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114972418A (en) * 2022-03-30 2022-08-30 北京航空航天大学 Maneuvering multi-target tracking method based on combination of nuclear adaptive filtering and YOLOX detection
CN114972418B (en) * 2022-03-30 2023-11-21 北京航空航天大学 Maneuvering multi-target tracking method based on combination of kernel adaptive filtering and YOLOX detection
CN115690163A (en) * 2023-01-04 2023-02-03 中译文娱科技(青岛)有限公司 Target tracking method, system and storage medium based on image content
CN115690163B (en) * 2023-01-04 2023-05-09 中译文娱科技(青岛)有限公司 Target tracking method, system and storage medium based on image content
CN116665177A (en) * 2023-07-31 2023-08-29 福思(杭州)智能科技有限公司 Data processing method, device, electronic device and storage medium
CN116665177B (en) * 2023-07-31 2023-10-13 福思(杭州)智能科技有限公司 Data processing method, device, electronic device and storage medium

Similar Documents

Publication Publication Date Title
CN107424171B (en) Block-based anti-occlusion target tracking method
CN113744316A (en) Multi-target tracking method based on deep neural network
CN112132119B (en) Passenger flow statistical method and device, electronic equipment and storage medium
US20230289979A1 (en) A method for video moving object detection based on relative statistical characteristics of image pixels
CN112528878A (en) Method and device for detecting lane line, terminal device and readable storage medium
CN107808138B (en) Communication signal identification method based on FasterR-CNN
EP2034426A1 (en) Moving image analyzing, method and system
CN108198201A (en) A kind of multi-object tracking method, terminal device and storage medium
CN106910204B (en) A kind of method and system to the automatic Tracking Recognition of sea ship
CN107452015A (en) A kind of Target Tracking System with re-detection mechanism
CN102915545A (en) OpenCV(open source computer vision library)-based video target tracking algorithm
Hadi et al. A computationally economic novel approach for real-time moving multi-vehicle detection and tracking toward efficient traffic surveillance
Lian et al. A novel method on moving-objects detection based on background subtraction and three frames differencing
CN109740609A (en) A kind of gauge detection method and device
CN113255444A (en) Training method of image recognition model, image recognition method and device
CN112686122B (en) Human body and shadow detection method and device, electronic equipment and storage medium
CN109978916B (en) Vibe moving target detection method based on gray level image feature matching
CN105208402B (en) A kind of frame of video complexity measure method based on Moving Objects and graphical analysis
CN112348011B (en) Vehicle damage assessment method and device and storage medium
Yu et al. Length-based vehicle classification in multi-lane traffic flow
Palaio et al. Multi-object tracking using an adaptive transition model particle filter with region covariance data association
CN112348112B (en) Training method and training device for image recognition model and terminal equipment
de Oliveira et al. Vehicle counting and trajectory detection based on particle filtering
CN114782500A (en) Kart race behavior analysis method based on multi-target tracking
CN114743257A (en) Method for detecting and identifying image target behaviors

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination