CN110458864A - Target tracking method and target tracker based on integrated semantic knowledge and instance features - Google Patents

Target tracking method and target tracker based on integrated semantic knowledge and instance features

Info

Publication number
CN110458864A
CN110458864A
Authority
CN
China
Prior art keywords
target
frame
network
semantic knowledge
tracking method
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910590225.XA
Other languages
Chinese (zh)
Inventor
张索非 (Zhang Suofei)
冯烨 (Feng Ye)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Post and Telecommunication University
Original Assignee
Nanjing Post and Telecommunication University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Post and Telecommunication University filed Critical Nanjing Post and Telecommunication University
Priority to CN201910590225.XA priority Critical patent/CN110458864A/en
Publication of CN110458864A publication Critical patent/CN110458864A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/10Segmentation; Edge detection
    • G06T7/11Region-based segmentation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/246Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/251Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2210/00Indexing scheme for image generation or computer graphics
    • G06T2210/22Cropping

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention provides a target tracking method and a target tracker based on integrated semantic knowledge and instance features. The method comprises the following steps: extracting the pictures of the 1st, (t-1)-th and t-th frames; cropping the pictures of the 1st, (t-1)-th and t-th frames from step 1, and taking the cropped pictures as the input of a convolutional neural network; constructing a neural network model based on Darknet-19, and making slight modifications to its backbone network; training the entire tracker convolutional neural network; and finally, evaluating the performance of the trained model. The invention proposes a new network architecture model based on Darknet-19, models the target tracking problem as a regression problem, and directly predicts the target position coordinates of the incoming frame. For specific object classes, the trained model achieves state-of-the-art performance at high speed.

Description

Target tracking method and target tracker based on integrated semantic knowledge and instance features
Technical Field
The invention belongs to the technical field of image processing, and particularly relates to a target tracking method and a target tracker based on integrated semantic knowledge and instance features.
Background
As an important component of a large number of computer vision systems, target tracking technology has attracted the interest of many researchers. In the last decade, deep learning based methods have shown great power in the field of target tracking. Typical deep network structures, such as convolutional neural networks (CNNs), can extract representative visual features through end-to-end training. Unlike traditional hand-crafted feature representations, this description of image data preserves rich knowledge in the model for tracking targets through drastic appearance changes. Therefore, the best-performing target trackers on benchmarks such as Visual Object Tracking (VOT) and the Object Tracking Benchmark (OTB) are all based on deep learning.
Unlike object detection or recognition, current research in object tracking focuses primarily on instance features of the object rather than semantic knowledge. The human eye, however, acts as a high-performance tracker that captures both low-level visual features and high-level semantic knowledge. When the human eye attempts to track a car, the observed features are always interpreted as part of a typical car. When detailed instance features are unavailable (e.g., under jitter, occlusion, or viewpoint change), this prior knowledge plays a key role in challenging conditions.
When processing a family of targets such as pedestrians and vehicles, target positions can be predicted directly by adopting a Region Proposal Network (RPN) structure, but such structures only perform regression over different anchors without any semantic assumptions.
In view of the above, there is a need to design a target tracking method based on integrated semantic knowledge and instance features to solve the above problems.
Disclosure of Invention
The invention provides a target tracking method based on integrated semantic knowledge and instance features, aiming to solve the problem that general target trackers focus only on instance features of the target while ignoring semantic prior knowledge. The method proposes a new network architecture model based on Darknet-19, models the target tracking problem as a regression problem, and directly predicts the target position coordinates of an incoming frame.
To achieve the above object, the present invention provides a method comprising the steps of:
step 1: extracting the pictures of the 1st, (t-1)-th and t-th frames;
step 2: cropping the pictures of the 1st, (t-1)-th and t-th frames from step 1, and taking the cropped pictures as the input of the convolutional neural network;
step 3: constructing a neural network model based on Darknet-19, and slightly modifying its backbone network;
step 4: training the whole tracker convolutional neural network;
step 5: evaluating the performance of the trained model.
In a further improvement of the invention, before step 4 the method further comprises a step 3.1 of designing the network output, which comprises a classification branch and a regression branch; before step 5 the method further comprises a step 4.1 of designing the network loss function.
A further improvement of the invention is that, in step 1, the 1st frame is selected as a standard template containing the target, and the t-th frame is selected as a candidate area where the target may appear.
A further refinement of the invention is that, in step 2, a standard template containing the target is extracted for initialization.
A further improvement of the present invention is that, assuming the real bounding box has size (w, h), the input 1st frame is cropped around the target center with side length S to obtain an example image, which serves as the standard template throughout the tracking process, where the margin information satisfies the following relationship:
S² = (3w) × (3h) (1).
A further improvement of the invention is that, in step 3, based on the original structure of Darknet-19, three convolutional layers and two fully-connected layers are used in place of global pooling, for classification and localization respectively.
In a further development of the invention, in the t-th frame the network outputs from the fully-connected layer a score vector w_t ∈ R^K as the classification result for the target, where K is the number of classes; this vector reflects the likelihood that the corresponding object appears in the field of view. At the same time, the network outputs a deformation prediction d_t^k for each class of target. Assuming the bounding box output at frame t-1 is p_{t-1} = (x_{t-1}, y_{t-1}, w_{t-1}, h_{t-1}), where x, y are the center coordinates of the box and w and h are its width and height, the deformation regression d_t^k for class k consists of four coordinates (Δx_t^k, Δy_t^k, Δw_t^k, Δh_t^k), representing the deformation of the target under different semantic assumptions. The final result p_t of the t-th frame can then be calculated from p_{t-1} and the predicted deformation of the selected class.
the invention is further improved in that a cross-entropy loss function is adopted for the classification loss w, and an L1 loss function is adopted for the bounding box regression loss d:
wherein,being the true deformation of the second input to the third input, the L1 penalty is higher for slight errors in the predicted bounding box and the true bounding box. The trained model thus has a more stable bounding box.
A further development of the invention is that step 4 comprises a first stage: pre-training the backbone network for 10 epochs on the ImageNet classification dataset, taking the original image as the first input and standard random contrast-enhanced and color-changed images as the second and third inputs; and a second stage: training the whole target tracking network to obtain the trained model.
To achieve the above object, the invention also provides a target tracker implementing the method.
The invention has the following beneficial effects: it integrates semantic knowledge and instance features for target tracking, proposes a new network architecture model based on Darknet-19, models the target tracking problem as a regression problem, directly predicts the target position coordinates of an incoming frame, and achieves high accuracy and execution efficiency in daily tracking tasks.
Drawings
FIG. 1 shows the extracted pictures of the 1st, (t-1)-th and t-th frames.
FIG. 2 shows the three pictures of FIG. 1 after cropping so that each contains the target.
Fig. 3 is a convolutional neural network model.
Fig. 4 shows two output branches of a convolutional neural network.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in detail with reference to the accompanying drawings and specific embodiments.
It should be emphasized that, in describing the present invention, formulas and constraints are identified with consistent labels; the use of different labels for the same formula and/or constraint is not precluded, and labels are used only to illustrate the features of the invention more clearly.
The CNN target tracking model provided by the invention is trained on a mixed dataset (ImageNet VID and ALOV300++). ImageNet VID contains 30 different target classes, from which we pick 8 commonly used ones: airplane, bicycle, bird, bus, car, cat, horse, and motorcycle. Since this dataset contains no pedestrians, pedestrian sequences are selected from ALOV300++, finally forming a mixed dataset of 9 classes.
As shown in fig. 1, the invention first extracts three pictures from the video as the input of the network, and then obtains three target pictures for the CNN by cropping, as shown in fig. 2. Features are extracted from the three target pictures by the CNN model (fig. 3), and two branches are finally output (fig. 4): a classification branch, used to distinguish the category of the target, and a regression branch, used for bounding-box regression.
Table 1 shows detailed parameters of the CNN network structure designed by the present invention.
As shown in table 1, the invention fine-tunes the Darknet-19 network model, using three convolutional layers and two fully-connected layers in place of global pooling for classification and localization respectively, and fine-tunes on the above-mentioned hybrid video dataset. The 1st and (t-1)-th frames are extracted every 100 frames in a video sequence. For data augmentation, the ground-truth bounding box of the t-th frame is perturbed using a Gaussian distribution, as sketched below. The model is trained for 50 epochs on four NVIDIA Tesla P40 GPUs, with 800 batches (512 samples each) per epoch.
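The perturbation parameters are not given in this text, so the following Python sketch uses illustrative standard deviations; jitter_bbox and its sigma arguments are hypothetical names, not from the patent.

```python
import numpy as np

def jitter_bbox(box, sigma_center=0.1, sigma_scale=0.1, rng=None):
    """Perturb a ground-truth box (x, y, w, h) with Gaussian noise to
    synthesize training targets for frame t."""
    rng = rng or np.random.default_rng()
    x, y, w, h = box
    x += rng.normal(0.0, sigma_center * w)     # shift center by a fraction of width
    y += rng.normal(0.0, sigma_center * h)     # shift center by a fraction of height
    w *= np.exp(rng.normal(0.0, sigma_scale))  # jitter width on a log scale
    h *= np.exp(rng.normal(0.0, sigma_scale))  # jitter height on a log scale
    return (x, y, w, h)

# e.g. jitter_bbox((640.0, 360.0, 80.0, 120.0))
```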
Specifically, the target tracking method based on integrated semantic knowledge and instance features comprises the following steps:
Step 1: extracting the pictures of the 1st, (t-1)-th and t-th frames as input:
The first picture is the 1st frame, used as the standard template of the target; the second picture is taken from the (t-1)-th frame; and the last picture is taken from a candidate area where the target may appear in the current frame.
Step 2: cropping the three pictures input in step 1 so that each contains the target:
in the first frame, a standard template of the extraction target is initialized. Assuming that the size of the real bounding box is (w, h), the picture is cropped around the center of the target by the size S, the square region provides an example image and the contextual edge distance information satisfies the following relationship:
s2=(3w)*(3h) (1)
this example image is the first input to the CNN network and is 288 × 288 in size. This example image is used as a template throughout the tracking process. The hyperparameter 3 in equation (1) is retained from the video statistics in the VID dataset. This configuration contains the motion of almost all objects in adjacent frames while ensuring an acceptable resolution value after scaling.
Assume that the tracking result of the t-1 th frame is pt-1In the t-1 th frame and the t-th frame with (x)t-1,yt-1) Cutting at the center to obtain a cut size of (3 w)t-1,3ht-1) The picture after cropping is also 288 × 288 and serves as both the second and third inputs to the CNN network. Note that the ratio of objects in the first frame is to be preserved because it encodes the features of the template object. Conversely, the target of the t-1 frame is expanded to a size of 96 pixels to facilitate the CNN network to learn the bounding box regression more efficiently by normalizing the deformation between frames.
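As a concrete illustration of this cropping scheme, here is a minimal Python sketch using OpenCV and NumPy; crop_and_resize, the mean-pixel padding, and the placeholder frames and boxes are assumptions for illustration, not the patent's implementation.

```python
import cv2
import numpy as np

def crop_and_resize(img, center, size_wh, out=288):
    """Crop a (cw x ch) window around center (x, y), padding with the mean
    pixel where the window leaves the image, then resize to out x out."""
    x, y = center
    cw, ch = size_wh
    x0, y0 = int(round(x - cw / 2)), int(round(y - ch / 2))
    x1, y1 = x0 + int(round(cw)), y0 + int(round(ch))
    pad = max(0, -x0, -y0, x1 - img.shape[1], y1 - img.shape[0])
    if pad:
        mean = img.mean(axis=(0, 1)).tolist()
        img = cv2.copyMakeBorder(img, pad, pad, pad, pad,
                                 cv2.BORDER_CONSTANT, value=mean)
        x0, y0, x1, y1 = x0 + pad, y0 + pad, x1 + pad, y1 + pad
    return cv2.resize(img[y0:y1, x0:x1], (out, out))

frame1 = np.zeros((720, 1280, 3), np.uint8)                  # placeholder frames
frame_prev = frame_cur = np.zeros((720, 1280, 3), np.uint8)
x, y, w, h = 640.0, 360.0, 80.0, 120.0                       # box in frame 1
x_prev, y_prev, w_prev, h_prev = 640.0, 360.0, 80.0, 120.0   # result p_{t-1}

# Template (frame 1): square crop with S = 3 * sqrt(w * h), from S^2 = (3w)(3h).
s = 3.0 * np.sqrt(w * h)
template = crop_and_resize(frame1, (x, y), (s, s))
# Search regions (frames t-1 and t): 3*w_{t-1} x 3*h_{t-1} around (x_{t-1}, y_{t-1}).
search_prev = crop_and_resize(frame_prev, (x_prev, y_prev), (3 * w_prev, 3 * h_prev))
search_cur = crop_and_resize(frame_cur, (x_prev, y_prev), (3 * w_prev, 3 * h_prev))
```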
Step 3: constructing a neural network model based on Darknet-19, and slightly modifying its backbone network:
to balance model capacity and efficiency, Darknet-19 was developed as a backbone network. Darknet-19 has been shown to enable high performance in relevant target detection tasks. The model consists of convolution filters of 3 x 3 and 1 x 1, connected between different scales using maximal pooling, doubling the number of channels per scale. The model performs very well in tasks such as object classification and localization and uses relatively few parameters. Based on the original structure of Darknet-19, the present invention uses three convolutional layers and two fully-connected layers instead of global pooling for classification and localization, respectively. Table 1 lists the detailed network architecture.
Step 4: designing the network output, including the classification and regression branches:
In the t-th frame, the network outputs from the fully-connected layer a score vector w_t ∈ R^K as the classification result for the target, where K is the number of classes; this vector reflects the likelihood that the corresponding object appears in the field of view. At the same time, the network outputs a deformation prediction d_t^k for each class of target. Assuming the bounding box output at frame t-1 is p_{t-1} = (x_{t-1}, y_{t-1}, w_{t-1}, h_{t-1}), where x, y are the center coordinates of the box and w and h are its width and height, the deformation regression d_t^k for class k consists of four coordinates (Δx_t^k, Δy_t^k, Δw_t^k, Δh_t^k), representing the deformation of the target under different semantic assumptions. The final result p_t of the t-th frame can then be calculated from p_{t-1} and the predicted deformation of the selected class.
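The patent's update equations appear only as figures, so the following Python sketch assumes a Faster R-CNN-style parameterization for applying the per-class deformation; decode and its argument names are illustrative.

```python
import numpy as np

def decode(p_prev, scores, deformations):
    """Pick the highest-scoring class k* and apply its predicted deformation
    d_t^{k*} = (dx, dy, dw, dh) to the previous box p_{t-1} = (x, y, w, h)."""
    x, y, w, h = p_prev
    k = int(np.argmax(scores))        # class with the highest score in w_t
    dx, dy, dw, dh = deformations[k]
    # Assumed parameterization: relative center offsets, log-space scales.
    return (x + dx * w, y + dy * h, w * np.exp(dw), h * np.exp(dh))

# e.g. p_t = decode((640, 360, 80, 120), scores=np.random.rand(9),
#                   deformations=np.zeros((9, 4)))
```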
and 5: designing a network loss function:
A cross-entropy loss function is used for the classification loss on w, and an L1 loss function for the bounding-box regression loss on d, where the regression target is the true deformation between the second and third inputs. The L1 loss penalizes small discrepancies between the predicted and ground-truth bounding boxes more strongly, so the trained model produces more stable bounding boxes.
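A minimal PyTorch sketch of this composite loss, assuming the two terms are simply summed with a balancing weight lam (the combined formula is not reproduced in this text):

```python
import torch
import torch.nn.functional as F

def tracking_loss(cls_logits, reg, labels, d_true, lam=1.0):
    """Cross-entropy on the class scores w plus L1 on the deformation d of
    the ground-truth class; lam is an assumed balancing weight."""
    ce = F.cross_entropy(cls_logits, labels)         # classification loss on w
    d_pred = reg[torch.arange(reg.size(0)), labels]  # select d_t^k of the true class
    l1 = F.l1_loss(d_pred, d_true)                   # bounding-box regression loss
    return ce + lam * l1
```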
Step 6: training the convolutional neural network model of the tracker:
the first stage is as follows: the backbone network was pre-trained on ImageNet classification datasets for 10 epochs, using the original image as the first input, and standard random contrast enhanced images and color variations as the second and third inputs. The network achieves 72.5% top-1 accuracy and 91.0% top-2 accuracy in ImageNet.
The second stage: the whole target tracking network is trained to obtain the trained model.
Step 7: evaluating the performance of the trained model:
the trained model is evaluated on a sub data set of the VOT 2016, which has 15 video sequences.
The invention integrates semantic knowledge and instance features for target tracking, proposes a new network architecture model based on Darknet-19, models the target tracking problem as a regression problem, directly predicts the target position coordinates of an incoming frame, and achieves high accuracy and execution efficiency in daily tracking tasks.
Although the present invention has been described in detail with reference to the preferred embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the spirit and scope of the present invention.

Claims (10)

1. A target tracking method based on integrated semantic knowledge and instance features is characterized by comprising the following steps:
step 1: extracting the pictures of the 1st, (t-1)-th and t-th frames;
step 2: cropping the pictures of the 1st, (t-1)-th and t-th frames from step 1, and taking the cropped pictures as the input of the convolutional neural network;
step 3: constructing a neural network model based on Darknet-19, and slightly modifying its backbone network;
step 4: training the whole tracker convolutional neural network;
step 5: evaluating the performance of the trained model.
2. The integrated semantic knowledge and instance feature based target tracking method of claim 1, characterized in that: before step 4, the method further comprises a step 3.1 of designing the network output, which comprises a classification branch and a regression branch; before step 5, the method further comprises a step 4.1 of designing the network loss function.
3. The integrated semantic knowledge and instance feature based target tracking method of claim 1, characterized in that: in step 1, the 1st frame is selected as a standard template containing the target, and the t-th frame is selected as a candidate area where the target may appear.
4. The integrated semantic knowledge and instance feature based target tracking method of claim 3, wherein: in step 2, a standard template including the target is extracted for initialization.
5. The integrated semantic knowledge and instance feature based target tracking method of claim 4, wherein: assuming the real bounding box has size (w, h), the input 1st frame is cropped around the target center with side length S to obtain an example image, which serves as the standard template throughout the tracking process, and the margin information satisfies the following relation:
S² = (3w) × (3h) (1).
6. The integrated semantic knowledge and instance feature based target tracking method of claim 1, characterized in that: in step 3, based on the original structure of Darknet-19, three convolutional layers and two fully-connected layers are used in place of global pooling, for classification and localization respectively.
7. The integrated semantic knowledge and instance feature based target tracking method of claim 2, wherein: in the t-th frame, the network outputs from the fully-connected layer a score vector w_t ∈ R^K as the classification result for the target, where K is the number of classes; this vector reflects the probability that the corresponding object appears in the field of view; at the same time, the network outputs a deformation prediction d_t^k for each class of target; assuming the bounding box output at frame t-1 is p_{t-1} = (x_{t-1}, y_{t-1}, w_{t-1}, h_{t-1}), where x, y are the center coordinates of the box and w and h are its width and height, the deformation regression d_t^k for class k consists of four coordinates (Δx_t^k, Δy_t^k, Δw_t^k, Δh_t^k), representing the deformation of the target under different semantic assumptions, and the final result p_t of the t-th frame can be calculated from p_{t-1} and the predicted deformation of the selected class.
8. The integrated semantic knowledge and instance feature based target tracking method of claim 7, wherein: a cross-entropy loss function is used for the classification loss on w, and an L1 loss function for the bounding-box regression loss on d, where the regression target is the true deformation between the second and third inputs; the L1 loss penalizes small discrepancies between the predicted and ground-truth bounding boxes more strongly, so the trained model has a more stable bounding box.
9. The integrated semantic knowledge and instance feature based target tracking method of claim 8, wherein step 4 comprises two stages: the first stage is to pre-train the backbone network for 10 epochs on the ImageNet classification dataset, using the original image as the first input and standard random contrast-enhanced and color-changed images as the second and third inputs; the second stage is to train the whole target tracking network to obtain the trained model.
10. An object tracker implementing the method of any one of claims 1-9.
CN201910590225.XA 2019-07-02 2019-07-02 Target tracking method and target tracker based on integrated semantic knowledge and instance features Pending CN110458864A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910590225.XA CN110458864A (en) 2019-07-02 2019-07-02 Target tracking method and target tracker based on integrated semantic knowledge and instance features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910590225.XA CN110458864A (en) 2019-07-02 2019-07-02 Target tracking method and target tracker based on integrated semantic knowledge and instance features

Publications (1)

Publication Number Publication Date
CN110458864A true CN110458864A (en) 2019-11-15

Family

ID=68482051

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910590225.XA Pending CN110458864A (en) 2019-07-02 2019-07-02 Target tracking method and target tracker based on integrated semantic knowledge and instance features

Country Status (1)

Country Link
CN (1) CN110458864A (en)

Cited By (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111105442A (en) * 2019-12-23 2020-05-05 中国科学技术大学 Switching type target tracking method
CN111428567A (en) * 2020-02-26 2020-07-17 沈阳大学 Pedestrian tracking system and method based on affine multi-task regression
CN112053384A (en) * 2020-08-28 2020-12-08 西安电子科技大学 Target tracking method based on bounding box regression model
CN112232359A (en) * 2020-09-29 2021-01-15 中国人民解放军陆军炮兵防空兵学院 Visual tracking method based on mixed level filtering and complementary characteristics
CN112861652A (en) * 2021-01-20 2021-05-28 中国科学院自动化研究所 Method and system for tracking and segmenting video target based on convolutional neural network
CN112966581A (en) * 2021-02-25 2021-06-15 厦门大学 Video target detection method based on internal and external semantic aggregation
CN113298142A (en) * 2021-05-24 2021-08-24 南京邮电大学 Target tracking method based on deep space-time twin network
CN117237402A (en) * 2023-11-15 2023-12-15 北京中兵天工防务技术有限公司 Target motion prediction method and system based on semantic information understanding

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106709936A (en) * 2016-12-14 2017-05-24 北京工业大学 Single target tracking method based on convolution neural network
CN106845430A (en) * 2017-02-06 2017-06-13 东华大学 Pedestrian detection and tracking based on acceleration region convolutional neural networks
CN108027972A (en) * 2015-07-30 2018-05-11 北京市商汤科技开发有限公司 System and method for Object tracking
CN109255351A (en) * 2018-09-05 2019-01-22 华南理工大学 Bounding box homing method, system, equipment and medium based on Three dimensional convolution neural network
CN109543754A (en) * 2018-11-23 2019-03-29 中山大学 The parallel method of target detection and semantic segmentation based on end-to-end deep learning

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108027972A (en) * 2015-07-30 2018-05-11 北京市商汤科技开发有限公司 System and method for Object tracking
CN106709936A (en) * 2016-12-14 2017-05-24 北京工业大学 Single target tracking method based on convolution neural network
CN106845430A (en) * 2017-02-06 2017-06-13 东华大学 Pedestrian detection and tracking based on acceleration region convolutional neural networks
CN109255351A (en) * 2018-09-05 2019-01-22 华南理工大学 Bounding box homing method, system, equipment and medium based on Three dimensional convolution neural network
CN109543754A (en) * 2018-11-23 2019-03-29 中山大学 The parallel method of target detection and semantic segmentation based on end-to-end deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SHAOQING REN et al.: "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 39, No. 6, 30 June 2017, pages 1137-1149, XP055705510, DOI: 10.1109/TPAMI.2016.2577031 *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111105442B (en) * 2019-12-23 2022-07-15 中国科学技术大学 Switching type target tracking method
CN111105442A (en) * 2019-12-23 2020-05-05 中国科学技术大学 Switching type target tracking method
CN111428567A (en) * 2020-02-26 2020-07-17 沈阳大学 Pedestrian tracking system and method based on affine multi-task regression
CN111428567B (en) * 2020-02-26 2024-02-02 沈阳大学 Pedestrian tracking system and method based on affine multitask regression
CN112053384A (en) * 2020-08-28 2020-12-08 西安电子科技大学 Target tracking method based on bounding box regression model
CN112053384B (en) * 2020-08-28 2022-12-02 西安电子科技大学 Target tracking method based on bounding box regression model
CN112232359B (en) * 2020-09-29 2022-10-21 中国人民解放军陆军炮兵防空兵学院 Visual tracking method based on mixed level filtering and complementary characteristics
CN112232359A (en) * 2020-09-29 2021-01-15 中国人民解放军陆军炮兵防空兵学院 Visual tracking method based on mixed level filtering and complementary characteristics
CN112861652A (en) * 2021-01-20 2021-05-28 中国科学院自动化研究所 Method and system for tracking and segmenting video target based on convolutional neural network
CN112966581B (en) * 2021-02-25 2022-05-27 厦门大学 Video target detection method based on internal and external semantic aggregation
CN112966581A (en) * 2021-02-25 2021-06-15 厦门大学 Video target detection method based on internal and external semantic aggregation
CN113298142A (en) * 2021-05-24 2021-08-24 南京邮电大学 Target tracking method based on deep space-time twin network
CN113298142B (en) * 2021-05-24 2023-11-17 南京邮电大学 Target tracking method based on depth space-time twin network
CN117237402A (en) * 2023-11-15 2023-12-15 北京中兵天工防务技术有限公司 Target motion prediction method and system based on semantic information understanding
CN117237402B (en) * 2023-11-15 2024-02-20 北京中兵天工防务技术有限公司 Target motion prediction method and system based on semantic information understanding

Similar Documents

Publication Publication Date Title
CN110458864A (en) Target tracking method and target tracker based on integrated semantic knowledge and instance features
CN111210443B (en) Deformable convolution mixing task cascading semantic segmentation method based on embedding balance
Gu et al. A review on 2D instance segmentation based on deep neural networks
Garcia-Garcia et al. A survey on deep learning techniques for image and video semantic segmentation
Zhou et al. Contextual ensemble network for semantic segmentation
CN112446398B (en) Image classification method and device
Xiong et al. DP-LinkNet: A convolutional network for historical document image binarization
CN107967484B (en) Image classification method based on multi-resolution
CN112507777A (en) Optical remote sensing image ship detection and segmentation method based on deep learning
Guo et al. A survey on deep learning based approaches for scene understanding in autonomous driving
CN110738207A (en) character detection method for fusing character area edge information in character image
CN108898145A (en) A kind of image well-marked target detection method of combination deep learning
Girisha et al. Performance analysis of semantic segmentation algorithms for finely annotated new uav aerial video dataset (manipaluavid)
Wulamu et al. Multiscale road extraction in remote sensing images
CN111612008A (en) Image segmentation method based on convolution network
Hara et al. Towards good practice for action recognition with spatiotemporal 3d convolutions
Chen et al. Corse-to-fine road extraction based on local Dirichlet mixture models and multiscale-high-order deep learning
CN112489050A (en) Semi-supervised instance segmentation algorithm based on feature migration
CN115984172A (en) Small target detection method based on enhanced feature extraction
Behera et al. Superpixel-based multiscale CNN approach toward multiclass object segmentation from UAV-captured aerial images
Tao et al. Exploiting web images for weakly supervised object detection
CN113177503A (en) Arbitrary orientation target twelve parameter detection method based on YOLOV5
CN113297959A (en) Target tracking method and system based on corner attention twin network
CN116596966A (en) Segmentation and tracking method based on attention and feature fusion
US12033307B2 (en) System and methods for multiple instance segmentation and tracking

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20191115