CN112883928A - Multi-target tracking algorithm based on deep neural network - Google Patents

Multi-target tracking algorithm based on deep neural network

Info

Publication number
CN112883928A
CN112883928A (application CN202110325552.XA)
Authority
CN
China
Prior art keywords
target
network
image
tracking
formula
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110325552.XA
Other languages
Chinese (zh)
Inventor
邵叶秦
吕昌
唐宇亮
蒋雯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nantong University
Original Assignee
Nantong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nantong University
Priority to CN202110325552.XA
Publication of CN112883928A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/20 Movements or behaviour, e.g. gesture recognition
    • G06V40/23 Recognition of whole body movements, e.g. for sport training
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Social Psychology (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-target tracking algorithm based on a deep neural network. First, a single-target tracker built on a Siamese (twin) network is designed; the network measures the similarity of each identified target within the current frame image, yielding a predicted position in this frame for every target from the previous frame. Then, target detection is performed on the frame image with a deep convolutional neural network. Finally, the tracking predictions are associated with the detected pedestrian images, using cosine similarity and area overlap to improve tracking accuracy. The invention introduces a single-target tracker designed around a fully convolutional Siamese network, using the similarity measure of the Siamese network to find the most similar candidate for each target within the region to be searched. A matching algorithm then pairs each detected target with a tracking candidate. Finally, experiments demonstrate the feasibility of the approach and the experimental results are presented.

Description

Multi-target tracking algorithm based on deep neural network
Technical Field
The invention particularly relates to a multi-target tracking algorithm based on a deep neural network.
Background
To identify pedestrians reliably, complete pedestrian features must be extracted, so a tracking module is added to acquire a video sequence of each pedestrian. With the continuous development of computer vision, multi-target tracking algorithms are being applied in more and more scenarios. Multi-target tracking can be broadly divided into online tracking and offline tracking. Online tracking processes the video sequentially, frame by frame, whereas offline tracking first estimates the state of every target and then enforces consistency constraints on the overall state. Offline tracking can be summarized as: obtain the detection results of each frame image and associate them with the existing tracks to produce the multi-target trajectories.
The tracking problem is to identify an object in the first frame of a video and then track it in subsequent frames. Since targets are chosen arbitrarily, a tracker cannot be trained in advance for a specific target; a typical object tracker therefore learns an appearance model of the object online, as in TLD, Struck, KCF, MIL and MOSSE. Such tracking algorithms generally do not take target detection into account.
To address these problems, the invention provides a single-target tracker based on a fully convolutional Siamese network together with a target detector based on a deep convolutional neural network, with the aim of matching the detected targets against the tracking-prediction candidates by cosine similarity and related measures, thereby fusing target detection into the tracking process.
Disclosure of Invention
Purpose of the invention: to overcome the shortcomings of the prior art, the invention provides a multi-target tracking algorithm based on a deep neural network.
Technical scheme: a multi-target tracking algorithm based on a deep neural network, comprising the following operations. First, a single-target tracker based on a Siamese network is designed; the network measures the similarity of each identified target in the next frame image, yielding a predicted position in this frame for every target from the previous frame image. Then, target detection is performed on the frame image with a deep convolutional neural network. Finally, the tracking predictions are associated with the detected pedestrian images, with cosine similarity and area overlap used to improve tracking accuracy.
As an optimization: the Siamese network is a "conjoined" neural network whose two branches share weights, and it is used to measure the similarity of two inputs. The Siamese network maps the two inputs into a new space, forming a representation of each input there, and the similarity of the two inputs is evaluated through the loss computation.
Siamese-network-based single-target tracking finds the regions most similar to the template image z among the search regions A, B of the current frame image x (these search regions lie near the target's position in the previous frame image). The template image and the search regions are mapped into feature space by an embedding function φ, a feature mapping implemented with a neural network. After mapping, the input target image yields a 6 × 6 × 128 feature, and likewise the lower branch of the Siamese network yields a 22 × 22 × 128 feature. To localize within the search area, the 6 × 6 × 128 feature obtained from the upper branch is used as a convolution kernel over the 22 × 22 × 128 feature of the lower branch. The result is a 17 × 17 × 1 score map whose entries are the similarity scores between each position and the template. The algorithm compares the similarity between the search area and the target; the approach is similar to the correlation filtering idea, in that the Siamese network uses features as a convolution kernel and finds the position of maximum similarity in the convolution result.
the full convolution network is independent of the size of the candidate image, it will calculate the similarity of all the transformation sub-windows x and z, the algorithm uses convolution embedding function
Figure BDA0002994515510000023
And cross-correlation to obtain the result, the formula is as follows:
Figure BDA0002994515510000024
in the formula (1), b represents a value at each position;
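As an illustration of formula (1), the cross-correlation head can be sketched in a few lines of PyTorch. This is a minimal sketch, not the definitive implementation: the embedding network phi is assumed given, and the feature shapes in the comments are the 6 × 6 × 128 and 22 × 22 × 128 sizes quoted above.

    import torch
    import torch.nn.functional as F

    def score_map(phi, z, x, b=0.0):
        # Formula (1): f(z, x) = phi(z) * phi(x) + b, with * denoting
        # cross-correlation over all translated sub-windows.
        kz = phi(z)                  # template features, e.g. (1, 128, 6, 6)
        fx = phi(x)                  # search features,  e.g. (1, 128, 22, 22)
        # The template features act as a convolution kernel over the search
        # features; each output cell is a similarity score at that position.
        return F.conv2d(fx, kz) + b  # score map, e.g. (1, 1, 17, 17)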
The input image is resized to 127 × 127 and the candidate image to 255 × 255; with context margin p and a scale factor s chosen to keep the padded area constant, the image size satisfies:

s(w + 2p) × s(h + 2p) = A (2)

In formula (2), A = 127², p = (w + h)/4, and w, h are the width and height of the candidate box.
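As a worked example of formula (2), the scale factor s follows directly from the box size. The helper below is a sketch; the function name and the default A = 127² are illustrative assumptions.

    import math

    def exemplar_scale(w, h, A=127 ** 2):
        # Formula (2): choose s so that s(w + 2p) x s(h + 2p) = A,
        # with context margin p = (w + h) / 4 around the target box.
        p = (w + h) / 4.0
        s = math.sqrt(A / ((w + 2 * p) * (h + 2 * p)))
        return s, p

    # e.g. a 60 x 120 pedestrian box: s is the scale applied before
    # cropping the 127 x 127 exemplar image
    s, p = exemplar_scale(60, 120)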
When the network is trained, l(y, v) is the loss applied to the positive and negative samples:

l(y, v) = log(1 + exp(−yv)) (3)

In formula (3), v is the real-valued similarity score of a single template-candidate pair, and y takes only the values +1 or −1.
The label y at each score-map position u is defined as follows:

y[u] = +1 if k‖u − c‖ ≤ R, and −1 otherwise (4)

In formula (4), k is the stride of the network, c is the centre of the score map, and R is the radius around the centre within which positions are labelled positive.
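The label map of formula (4) can be precomputed once per score-map size. In the sketch below the default stride and radius are assumptions in the spirit of SiamFC, not values taken from the text.

    import torch

    def label_map(size=17, stride=8, radius=16):
        # Formula (4): y[u] = +1 if k * ||u - c|| <= R, else -1, where k is
        # the network stride and c the centre of the score map.
        c = (size - 1) / 2.0
        coords = torch.arange(size, dtype=torch.float32)
        ys, xs = coords.view(-1, 1), coords.view(1, -1)   # broadcast grid
        dist = stride * torch.sqrt((xs - c) ** 2 + (ys - c) ** 2)
        labels = torch.full((size, size), -1.0)
        labels[dist <= radius] = 1.0
        return labels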
During training, the network convolution is applied to pairs consisting of an example image and a larger candidate image, and the loss of the score map is defined as the mean of the individual losses:

L(y, v) = (1/|D|) Σu∈D l(y[u], v[u]) (5)

In formula (5), D ⊂ ℤ² is the finite grid of score-map positions.
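Formulas (3) and (5) transcribe directly; in this sketch v is the raw score map produced by the network and y is the label map above.

    import torch.nn.functional as F

    def score_loss(v, y):
        # Formula (3) per position: l(y, v) = log(1 + exp(-y v)), written as
        # softplus(-y v) for numerical stability; formula (5) then takes the
        # mean over the finite grid D of score-map positions.
        return F.softplus(-y * v).mean()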
When the fully convolutional Siamese network is trained, each input pair produces a similarity score map. The maximum score on the map is found, and the corresponding position on the original image is taken as the final predicted position. Using linear interpolation, the 17 × 17 score map is expanded to 272 × 272, and the maximally responding point of the original score map is mapped onto the 272 × 272 map as the target position:

target displacement = (maximum-score position − score-map centre) × grid block size (6)
Next, over the positions of the score map, the convolution parameters are obtained by stochastic gradient descent (SGD):

arg minθ E L(y, f(z, x; θ)) (7)

In formula (7), θ denotes the network parameters.
The detection network yields m detected targets {1, 2, …, m} and the tracking network yields n tracking-prediction candidates {1, 2, …, n}; a matching algorithm pairs the detected targets with the tracking candidates. First, an area-overlap method compares a detection box d(x1, y1, x2, y2) with a tracking box s(x1′, y1′, x2′, y2′):

xmin = max(x1, x1′), xmax = min(x2, x2′) (8)
ymin = max(y1, y1′), ymax = min(y2, y2′) (9)
w = xmax − xmin (10)
h = ymax − ymin (11)
s = w × h (12)

In formula (12), w is the width of the overlapping region, h is its height, and s is its area.
If the area overlap ratio is inconclusive, the cosine matching method is applied for verification:

cos θ = Σi xi·yi / (√(Σi xi²) × √(Σi yi²)) (13)

In formula (13), xi and yi (i = 1, 2, …, n) are the components of the two feature vectors being compared.
Beneficial effects: the invention introduces a single-target tracker designed around a fully convolutional Siamese network, which mainly uses the similarity measure of the Siamese network to find the most similar target within the region to be tracked. A matching algorithm then pairs the detected targets with the tracking candidates. Finally, experiments demonstrate the feasibility of the approach and the experimental results are presented.
Drawings
FIG. 1 is a framework diagram of the Siamese-concept-based multi-target tracking of the present invention;
FIG. 2 is a schematic diagram of the fully-convolutional-Siamese-network-based tracker structure of the present invention;
FIG. 3 is a sample schematic of the algorithm data of the present invention;
FIG. 4 is a score map visualization image of the present invention;
FIG. 5 is a schematic representation of the results of the single person tracking of the present invention;
FIG. 6 is a diagram illustrating the result of multi-person tracking according to the present invention;
FIG. 7 is a schematic diagram of the accuracy of the algorithm of the present invention;
FIG. 8 is a graphical representation of the efficiency of the algorithm of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below so that those skilled in the art can better understand the advantages and features of the present invention, and thus the scope of the present invention will be more clearly defined. The embodiments described herein are only a few embodiments of the present invention, rather than all embodiments, and all other embodiments that can be derived by one of ordinary skill in the art without inventive faculty based on the embodiments described herein are intended to fall within the scope of the present invention.
Examples
1. Problem definition and system framework
Following this tracking idea, the invention designs a deep convolutional neural network to track targets. First, a single-target tracker based on a Siamese network is designed; the network measures the similarity of each identified target in the next frame image, yielding a predicted position in this frame for every target from the previous frame image. Then, target detection is performed on the frame image with a deep convolutional neural network. Finally, the tracking predictions are associated with the detected pedestrian images, with cosine similarity and area overlap used to improve tracking accuracy. FIG. 1 is a framework diagram of the multi-target tracking.
As the figure shows, the system consists of a Siamese-network-based tracker and a deep-convolutional-neural-network-based detector; their outputs are matched by a matching algorithm, thereby tracking the motion trajectories of pedestrians in a natural walking state.
2. Tracking algorithm based on a fully convolutional Siamese network
The Siamese network (twin network) is a "conjoined" neural network whose branches share weights. The Siamese network measures the similarity of two inputs: it maps the two inputs into new spaces, forming a representation of each input there, and the similarity of the two inputs is evaluated through the loss computation. FIG. 2 shows the tracking structure based on the fully convolutional Siamese network.
Siamese-network-based single-target tracking finds the regions most similar to the template image z among the search regions A, B of the current frame image x (these search regions lie near the target's position in the previous frame image). The template image and the search regions are mapped into feature space by an embedding function φ, a feature mapping implemented with a neural network. After mapping, the input target image yields a 6 × 6 × 128 feature, and likewise the lower branch of the Siamese network yields a 22 × 22 × 128 feature. To localize within the search area, the 6 × 6 × 128 feature obtained from the upper branch is used as a convolution kernel over the 22 × 22 × 128 feature of the lower branch. The result is a 17 × 17 × 1 score map whose entries are the similarity scores between each position and the template. The algorithm compares the similarity between the search area and the target; the approach is similar to the correlation filtering idea, in that the Siamese network uses features as a convolution kernel and finds the position of maximum similarity in the convolution result.
the full convolution network is independent of the size of the candidate image and it will compute the similarity of all the transformation sub-windows x and z. The algorithm uses a convolution embedding function
Figure BDA0002994515510000063
And cross-correlation are combined to obtain the result, and the formula is as follows.
Figure BDA0002994515510000064
In the formula (1), b represents a value at each position.
The input image is resized to 127 × 127 and the candidate image to 255 × 255; with context margin p and a scale factor s chosen to keep the padded area constant, the image size satisfies:

s(w + 2p) × s(h + 2p) = A (2)

In formula (2), A = 127², p = (w + h)/4, and w, h are the width and height of the candidate box.
When the network is trained, l(y, v) is the loss applied to the positive and negative samples:

l(y, v) = log(1 + exp(−yv)) (3)

In formula (3), v is the real-valued similarity score of a single template-candidate pair, and y takes only the values +1 or −1.
The label y at each score-map position u is defined as follows:

y[u] = +1 if k‖u − c‖ ≤ R, and −1 otherwise (4)

In formula (4), k is the stride of the network, c is the centre of the score map, and R is the radius around the centre within which positions are labelled positive.
During training, the network convolution is applied to pairs consisting of an example image and a larger candidate image, and the loss of the score map is defined as the mean of the individual losses:

L(y, v) = (1/|D|) Σu∈D l(y[u], v[u]) (5)

In formula (5), D ⊂ ℤ² is the finite grid of score-map positions.
When the fully convolutional Siamese network is trained, each input pair produces a similarity score map. The maximum score on the map is found, and the corresponding position on the original image is taken as the final predicted position. Using linear interpolation, the 17 × 17 score map is expanded to 272 × 272, and the maximally responding point of the original score map is mapped onto the 272 × 272 map as the target position:

target displacement = (maximum-score position − score-map centre) × grid block size (6)
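The localization rule of formula (6) amounts to upsampling the score map and reading off the argmax displacement. The sketch below assumes bilinear interpolation and an upscale factor of 16 (272 = 17 × 16); the stride default is an assumption.

    import torch
    import torch.nn.functional as F

    def locate_target(score, upscale=16, stride=8):
        # score: (1, 1, 17, 17) map, upsampled to (1, 1, 272, 272).
        up = F.interpolate(score, scale_factor=upscale, mode="bilinear",
                           align_corners=False)
        idx = int(torch.argmax(up.flatten()))
        row, col = divmod(idx, up.shape[-1])
        centre = (up.shape[-1] - 1) / 2.0
        # Formula (6): displacement from the score-map centre times the
        # grid-block size (stride / upscale input pixels per upsampled cell).
        cell = stride / upscale
        return (row - centre) * cell, (col - centre) * cell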
Next, over the positions of the score map, the convolution parameters are obtained by stochastic gradient descent (SGD):

arg minθ E L(y, f(z, x; θ)) (7)

In formula (7), θ denotes the network parameters.
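Formula (7) is the standard SGD objective. Below is a minimal training step, assuming a model(z, x) that returns the score map and reusing the hypothetical score_loss and label_map sketches given earlier:

    import torch

    def train_step(model, optimizer, z, x, y):
        # Formula (7): minimize E[L(y, f(z, x; theta))] over theta by SGD.
        v = model(z, x)              # predicted score map f(z, x; theta)
        loss = score_loss(v, y)      # mean logistic loss, formula (5)
        optimizer.zero_grad()
        loss.backward()              # stochastic gradient of the expected loss
        optimizer.step()
        return loss.item()

    # e.g. optimizer = torch.optim.SGD(model.parameters(), lr=1e-2, momentum=0.9)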
The detection network yields m detected targets {1, 2, …, m} and the tracking network yields n tracking-prediction candidates {1, 2, …, n}; a matching algorithm pairs the detected targets with the tracking candidates. First, an area-overlap method compares a detection box d(x1, y1, x2, y2) with a tracking box s(x1′, y1′, x2′, y2′):

xmin = max(x1, x1′), xmax = min(x2, x2′) (8)
ymin = max(y1, y1′), ymax = min(y2, y2′) (9)
w = xmax − xmin (10)
h = ymax − ymin (11)
s = w × h (12)

In formula (12), w is the width of the overlapping region, h is its height, and s is its area.
If the area overlap ratio is inconclusive, the cosine matching method is applied for verification:

cos θ = Σi xi·yi / (√(Σi xi²) × √(Σi yi²)) (13)

In formula (13), xi and yi (i = 1, 2, …, n) are the components of the two feature vectors being compared.
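The matching stage of formulas (8) to (13) reduces to an intersection area plus a cosine check. The sketch below is self-contained; thresholding and the assignment of detections to tracks are left to the caller.

    import math

    def overlap_area(d, s):
        # Formulas (8)-(12): intersection of detection box d = (x1, y1, x2, y2)
        # and tracking box s = (x1', y1', x2', y2').
        xmin, xmax = max(d[0], s[0]), min(d[2], s[2])
        ymin, ymax = max(d[1], s[1]), min(d[3], s[3])
        w, h = max(0.0, xmax - xmin), max(0.0, ymax - ymin)
        return w * h

    def cosine_similarity(x, y):
        # Formula (13): cosine of the angle between two feature vectors.
        num = sum(a * b for a, b in zip(x, y))
        den = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
        return num / den if den else 0.0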
3. Results and analysis of the experiments
3.1 Experimental data
In this experiment, the model was trained on data in the ILSVRC2015 dataset format. ILSVRC contains three main folders, ImageSets, Data and Annotations: ImageSets holds the descriptions of the dataset splits, Data stores all the data, including pictures and video clips, and the annotations in Annotations correspond to the pictures in Data. FIG. 3 shows training samples for the algorithm model.
3.2 Experimental results
The input image is resized to 127 × 127 and the candidate image to 255 × 255 as inputs to the feature-extraction network. The first layer is a convolutional layer: passing the template image and the search-area image through 96 convolution kernels of size 11 × 11 × 3 yields 59 × 59 and 123 × 123 feature maps. A pooling layer follows, downsampling the first-layer feature maps with 3 × 3 max-pooling to obtain 29 × 29 and 61 × 61 feature maps. The third layer convolves with 256 kernels of size 5 × 5 × 48, giving 25 × 25 and 57 × 57 feature maps, which a 3 × 3 pooling layer reduces to 12 × 12 and 28 × 28. After the fourth, fifth and sixth convolutional layers, a 6 × 6 template feature map and a 22 × 22 search-area feature map are finally obtained. To reduce the risk of overfitting, a ReLU nonlinear activation layer follows each convolutional layer. The 6 × 6 template feature map is then used as a convolution kernel over the 22 × 22 search-area feature map to obtain the score matrix; FIG. 4 visualizes this score matrix, where the bright spot marks the tracking prediction, and the large and small images are the original image and the target image respectively. Finally, the score map is upsampled from 17 × 17 to 272 × 272 for target localization.
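The layer-by-layer sizes above correspond to an AlexNet-style backbone. The sketch below reproduces those sizes; the grouped 5 × 5 × 48 convolution and the 128-channel output follow the text, while the intermediate channel widths are assumptions.

    import torch
    import torch.nn as nn

    class Backbone(nn.Module):
        # Embedding phi; the comments give exemplar / search feature sizes.
        def __init__(self):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv2d(3, 96, 11, stride=2), nn.ReLU(),   # 127->59, 255->123
                nn.MaxPool2d(3, stride=2),                   # 59->29,  123->61
                nn.Conv2d(96, 256, 5, groups=2), nn.ReLU(),  # 29->25,  61->57
                nn.MaxPool2d(3, stride=2),                   # 25->12,  57->28
                nn.Conv2d(256, 384, 3), nn.ReLU(),           # 12->10,  28->26
                nn.Conv2d(384, 384, 3), nn.ReLU(),           # 10->8,   26->24
                nn.Conv2d(384, 128, 3),                      # 8->6,    24->22
            )

        def forward(self, x):
            return self.features(x)

    # One shared network embeds both inputs (the weight sharing that makes
    # the network "Siamese"):
    net = Backbone()
    z_feat = net(torch.zeros(1, 3, 127, 127))   # (1, 128, 6, 6)
    x_feat = net(torch.zeros(1, 3, 255, 255))   # (1, 128, 22, 22)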
In this experiment, single-person and multi-person scenarios were tested. The processing speed on the GPU server is 86 frames per second; FIGS. 5 and 6 show the tracking results.
Experiments show that the per-frame processing speed meets the real-time requirement and that accuracy reaches more than 95%. To further highlight the superiority of the algorithm, its accuracy and efficiency are compared with the BOOSTING, MIL, KCF, TLD and MOSSE algorithms in FIG. 7 and FIG. 8.
The fully convolutional Siamese network mainly addresses the similarity problem, which is solved by training the Siamese network to perform similarity evaluation. Experiments show that the network not only performs well on the ILSVRC2015 video dataset but also achieves real-time tracking in practical applications.

Claims (3)

1. A multi-target tracking algorithm based on a deep neural network, characterized in that it comprises the following operations: first, a single-target tracker based on a Siamese network is designed, and the network measures the similarity of each identified target in the next frame image, yielding a predicted position in this frame for every target from the previous frame image; then, target detection is performed on the frame image with a deep convolutional neural network; finally, the obtained tracking predictions are associated with the detected pedestrian images, with cosine similarity and area overlap used to improve the tracking accuracy.
2. The deep-neural-network-based multi-target tracking algorithm of claim 1, characterized in that: a Siamese network predicts, from each tracked target in the previous frame image, the position of that target in the current frame image, and these predictions are compared with the multiple target-detection results of the current frame to find the most similar target for each tracked target, thereby realizing multi-target tracking.
3. The deep-neural-network-based multi-target tracking algorithm of claim 1, characterized in that: the Siamese network is a "conjoined" neural network whose two branches share weights, and it is used to measure the similarity of two inputs; the Siamese network maps the two inputs into a new space to obtain a representation of each input there, and the similarity of the two inputs is evaluated through the loss computation;
Siamese-network-based single-target tracking finds the regions most similar to the template image z among the search regions A, B of the current frame image x (these search regions lie near the target's position in the previous frame image); the template image and the search regions are mapped into feature space by an embedding function φ, a feature mapping implemented with a neural network; after mapping, the input target image yields a 6 × 6 × 128 feature, and likewise the lower branch of the Siamese network yields a 22 × 22 × 128 feature; to localize within the search area, the 6 × 6 × 128 feature obtained from the upper branch is used as a convolution kernel over the 22 × 22 × 128 feature of the lower branch; the result is a 17 × 17 × 1 score map whose entries are the similarity scores between each position and the template; the algorithm compares the similarity between the search area and the target; this class of methods is similar to the correlation filtering idea, in that the Siamese network uses features as a convolution kernel and finds the position of maximum similarity in the convolution result;
the fully convolutional network is independent of the candidate image size and computes the similarity of all search windows of x against z; the algorithm combines the convolutional embedding function φ with cross-correlation:

f(z, x) = φ(z) * φ(x) + b·1 (1)

in formula (1), b·1 denotes a bias term taking the value b at every position;
the input image is resized to 127 × 127 and the candidate image to 255 × 255; with context margin p and a scale factor s keeping the padded area constant, the image size satisfies:

s(w + 2p) × s(h + 2p) = A (2)

in formula (2), A = 127², p = (w + h)/4, and w, h are the width and height of the candidate box;
when the network is trained, l(y, v) is the loss applied to the positive and negative samples:

l(y, v) = log(1 + exp(−yv)) (3)

in formula (3), v is the real-valued similarity score of a single template-candidate pair, and y takes only the values +1 or −1;
the label y at each score-map position u is defined as follows:

y[u] = +1 if k‖u − c‖ ≤ R, and −1 otherwise (4)

in formula (4), k is the stride of the network, c is the centre of the score map, and R is the radius around the centre within which positions are labelled positive;
during training, the network convolution is applied to pairs consisting of an example image and a larger candidate image, and the loss of the score map is defined as the mean of the individual losses:

L(y, v) = (1/|D|) Σu∈D l(y[u], v[u]) (5)

in formula (5), D ⊂ ℤ² is the finite grid of score-map positions;
when the fully convolutional Siamese network is trained, each input pair produces a similarity score map; the maximum score on the map is found, and the corresponding position on the original image is taken as the final predicted position; using linear interpolation, the 17 × 17 score map is expanded to 272 × 272, and the maximally responding point of the original score map is mapped onto the 272 × 272 map as the target position:

target displacement = (maximum-score position − score-map centre) × grid block size (6)
next, over the positions of the score map, the convolution parameters are obtained by stochastic gradient descent (SGD):

arg minθ E L(y, f(z, x; θ)) (7)

in formula (7), θ denotes the network parameters;
m detected targets {1, 2, …, m} are obtained through the detection network, n tracking-prediction candidates {1, 2, …, n} are obtained through the tracking network, and a matching algorithm pairs the detected targets with the tracking candidates; first, an area-overlap method compares a detection box d(x1, y1, x2, y2) with a tracking box s(x1′, y1′, x2′, y2′):

xmin = max(x1, x1′), xmax = min(x2, x2′) (8)
ymin = max(y1, y1′), ymax = min(y2, y2′) (9)
w = xmax − xmin (10)
h = ymax − ymin (11)
s = w × h (12)

in formula (12), w is the width of the overlapping region, h is its height, and s is its area;
if the area overlap ratio is inconclusive, the cosine matching method is applied for verification:

cos θ = Σi xi·yi / (√(Σi xi²) × √(Σi yi²)) (13)

in formula (13), xi and yi (i = 1, 2, …, n) are the components of the two feature vectors being compared.
CN202110325552.XA 2021-03-26 2021-03-26 Multi-target tracking algorithm based on deep neural network Pending CN112883928A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110325552.XA CN112883928A (en) 2021-03-26 2021-03-26 Multi-target tracking algorithm based on deep neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110325552.XA CN112883928A (en) 2021-03-26 2021-03-26 Multi-target tracking algorithm based on deep neural network

Publications (1)

Publication Number Publication Date
CN112883928A (en) 2021-06-01

Family

ID=76042435

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110325552.XA Pending CN112883928A (en) 2021-03-26 2021-03-26 Multi-target tracking algorithm based on deep neural network

Country Status (1)

Country Link
CN (1) CN112883928A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114240996A (en) * 2021-11-16 2022-03-25 灵译脑科技(上海)有限公司 Multi-target tracking method based on target motion prediction
CN114862914A (en) * 2022-05-26 2022-08-05 淮阴工学院 Pedestrian tracking method based on detection and tracking integration
CN116188804A (en) * 2023-04-25 2023-05-30 山东大学 Twin network target search system based on transformer
CN118397068A (en) * 2024-07-01 2024-07-26 杭州师范大学 Monocular depth estimation method based on evolutionary neural network architecture search

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2014013975A (en) * 2012-07-03 2014-01-23 Sharp Corp Image decoding device, data structure of encoded data, and image encoding device
CN111105436A (en) * 2018-10-26 2020-05-05 曜科智能科技(上海)有限公司 Target tracking method, computer device, and storage medium
CN111260688A (en) * 2020-01-13 2020-06-09 深圳大学 Twin double-path target tracking method
CN112184752A (en) * 2020-09-08 2021-01-05 北京工业大学 Video target tracking method based on pyramid convolution

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2014013975A (en) * 2012-07-03 2014-01-23 Sharp Corp Image decoding device, data structure of encoded data, and image encoding device
CN111105436A (en) * 2018-10-26 2020-05-05 曜科智能科技(上海)有限公司 Target tracking method, computer device, and storage medium
CN111260688A (en) * 2020-01-13 2020-06-09 深圳大学 Twin double-path target tracking method
CN112184752A (en) * 2020-09-08 2021-01-05 北京工业大学 Video target tracking method based on pyramid convolution

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
吴凌燕: "Pedestrian Detection and Pedestrian Recognition System Based on Deep Convolutional Neural Networks", China Master's Theses Full-text Database, Information Science and Technology, pages 23-26 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114240996A (en) * 2021-11-16 2022-03-25 灵译脑科技(上海)有限公司 Multi-target tracking method based on target motion prediction
CN114240996B (en) * 2021-11-16 2024-05-07 灵译脑科技(上海)有限公司 Multi-target tracking method based on target motion prediction
CN114862914A (en) * 2022-05-26 2022-08-05 淮阴工学院 Pedestrian tracking method based on detection and tracking integration
CN116188804A (en) * 2023-04-25 2023-05-30 山东大学 Twin network target search system based on transformer
CN118397068A (en) * 2024-07-01 2024-07-26 杭州师范大学 Monocular depth estimation method based on evolutionary neural network architecture search

Similar Documents

Publication Publication Date Title
CN112883928A (en) Multi-target tracking algorithm based on deep neural network
Luo et al. Thermal infrared and visible sequences fusion tracking based on a hybrid tracking framework with adaptive weighting scheme
CN111476817A (en) Multi-target pedestrian detection tracking method based on yolov3
CN111161317A (en) Single-target tracking method based on multiple networks
CN111292355A (en) Nuclear correlation filtering multi-target tracking method fusing motion information
CN113706581B (en) Target tracking method based on residual channel attention and multi-level classification regression
CN107067410B (en) Manifold regularization related filtering target tracking method based on augmented samples
CN113240716B (en) Twin network target tracking method and system with multi-feature fusion
Xu et al. Hierarchical convolution fusion-based adaptive Siamese network for infrared target tracking
Li et al. Model-based temporal object verification using video
CN116402858B (en) Transformer-based space-time information fusion infrared target tracking method
CN113378675A (en) Face recognition method for simultaneous detection and feature extraction
CN117252904B (en) Target tracking method and system based on long-range space perception and channel enhancement
CN113129345A (en) Target tracking method based on multi-feature map fusion and multi-scale expansion convolution
Kuai et al. Masked and dynamic Siamese network for robust visual tracking
CN113298850A (en) Target tracking method and system based on attention mechanism and feature fusion
CN116381672A (en) X-band multi-expansion target self-adaptive tracking method based on twin network radar
CN110688512A (en) Pedestrian image search algorithm based on PTGAN region gap and depth neural network
CN114707604A (en) Twin network tracking system and method based on space-time attention mechanism
Gao et al. Occluded person re-identification based on feature fusion and sparse reconstruction
CN113011359A (en) Method for simultaneously detecting plane structure and generating plane description based on image and application
Gong et al. Research on an improved KCF target tracking algorithm based on CNN feature extraction
CN116309715A (en) Multi-target detection and tracking method, device, computer equipment and storage medium
Tan et al. Online visual tracking via background-aware Siamese networks
Dai et al. Long-term object tracking based on siamese network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination