CN110458864A - Target tracking method and target tracker based on integrating semantic knowledge and instance features - Google Patents

Target tracking method and target tracker based on integrating semantic knowledge and instance features

Info

Publication number
CN110458864A
CN110458864A
Authority
CN
China
Prior art keywords
target
frame
instance features
semantic knowledge
picture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910590225.XA
Other languages
Chinese (zh)
Inventor
张索非 (Zhang Suofei)
冯烨 (Feng Ye)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing Post and Telecommunication University
Nanjing University of Posts and Telecommunications
Original Assignee
Nanjing Post and Telecommunication University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Nanjing Post and Telecommunication University
Priority to CN201910590225.XA priority Critical patent/CN110458864A/en
Publication of CN110458864A publication Critical patent/CN110458864A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/11 Region-based segmentation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T7/251 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments involving models
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2210/00 Indexing scheme for image generation or computer graphics
    • G06T2210/22 Cropping

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The present invention provides a target tracking method and a target tracker based on integrating semantic knowledge and instance features. The method comprises the following steps: extract the pictures of the 1st, (t-1)-th and t-th frames; crop the pictures of the 1st, (t-1)-th and t-th frames from step 1, and take the cropped pictures as the input of a convolutional neural network; build a neural network model based on Darknet-19, with minor modifications on its backbone network; train the entire tracker convolutional neural network; finally, evaluate the performance of the trained model. Based on Darknet-19, the present invention proposes a new network architecture model, and formulates the target tracking problem as a regression problem that directly predicts the target location coordinates in the incoming frame. For specific object classes, the trained model achieves state-of-the-art performance at high speed.

Description

Target tracking method and target tracker based on integrating semantic knowledge and instance features
Technical field
The invention belongs to the technical field of image processing, and in particular relates to a target tracking method and a target tracker based on integrating semantic knowledge and instance features.
Background technique
As an important component of many computer vision systems, target tracking technology has attracted the research interest of numerous researchers. Over the past decade, deep-learning-based methods have demonstrated powerful capability in the target tracking domain. Typical deep network structures, such as convolutional neural networks (CNNs), can extract representative visual features through end-to-end training. Unlike traditional hand-crafted feature representations, this rich description of image data can be stored in the model to track drastic changes of the target. Consequently, the best-performing target trackers on benchmarks such as Visual Object Tracking (VOT) and the Object Tracking Benchmark (OTB) are all based on deep learning.
Unlike target detection or recognition, current target tracking research focuses mainly on the instance features of the target rather than on semantic knowledge. However, the human eye, as a high-performance tracker, can capture both low-level visual features and high-level semantic knowledge. When the human eye attempts to track a car, it always treats the features it sees as part of a general vehicle. When detailed instance features are unavailable (e.g. under jitter, occlusion, or perspective change), this prior knowledge plays a key role in challenging conditions.
When handling a range of targets such as pedestrians and vehicles, region proposal network (RPN) structures can directly predict the target position, but they only perform regression over different anchors without any semantic hypothesis.
In view of this, it is necessary to design a target tracking method based on integrating semantic knowledge and instance features to solve the above problems.
Summary of the invention
To solve the problem that general target trackers focus only on the instance features of the target while ignoring semantic prior knowledge, the present invention proposes a target tracking method based on integrating semantic knowledge and instance features. Based on Darknet-19, the method proposes a new network architecture model, and formulates the target tracking problem as a regression problem that directly predicts the target location coordinates of the incoming frame.
To achieve the above object, the present invention provides a method comprising the following steps:
Step 1: extract the pictures of the 1st, (t-1)-th and t-th frames;
Step 2: crop the pictures of the 1st, (t-1)-th and t-th frames from step 1, and take the cropped pictures as the input of a convolutional neural network;
Step 3: build a neural network model based on Darknet-19, with minor modifications on its backbone network;
Step 4: train the entire tracker convolutional neural network;
Step 5: evaluate the performance of the trained model.
A further improvement of the present invention is that a step 3.1 is included before step 4, the step 3.1 being to design the network output, which comprises a classification branch and a regression branch; and a step 4.1 is included before step 5, the step 4.1 being to design the network loss function.
A further improvement of the present invention is that, in step 1, the 1st-frame picture is chosen as the standard template containing the target, and the t-th-frame picture is chosen as the candidate region where the target is likely to appear.
A further improvement of the present invention is that, in step 2, the standard template containing the target is extracted for initialization.
A further improvement of the present invention is that, assuming the size of the ground-truth bounding box is (w, h), the input 1st-frame picture is cropped with side length s around the centre of the target to obtain an exemplar image, which is used as the standard template throughout tracking; the margin information then satisfies the following relation:
s^2 = (3w) * (3h)   (1).
A further improvement of the present invention is that, in step 3, based on the original Darknet-19 structure, the global pooling is replaced by three convolutional layers and two fully connected layers, used for classification and localization respectively.
A further improvement of the present invention is that, in frame t, the network outputs through a fully connected layer a score vector w_t ∈ R^K as the classification result of the target, where K is the number of classes; this vector reflects the likelihood that the corresponding object appears in view. Meanwhile, the network outputs d_t ∈ R^{4K} as the deformation prediction for each target class. Assuming the bounding box output at frame t-1 is p_{t-1} = (x_{t-1}, y_{t-1}, w_{t-1}, h_{t-1}), where (x, y) is the centre coordinate of the box and w and h are its width and height, the deformation regression d_t^k for class k consists of four coordinates:
d_t^k = (Δx_t^k, Δy_t^k, Δw_t^k, Δh_t^k)   (2)
which represent the deformation of the target under the different semantic hypotheses. The final result p_t of frame t can then be computed as:
p_t = p_{t-1} + d_t^k   (3)
A further improvement of the present invention is that a cross-entropy loss is used for the classification output w, and an L1 loss is used for the bounding-box regression output d:
L = CE(w_t) + ||d_t - d̂_t||_1   (4)
where d̂_t is the ground-truth deformation from the second input to the third input. The L1 loss penalizes small errors between the predicted and ground-truth bounding boxes more heavily, so the trained model produces more stable bounding boxes.
A further improvement of the present invention is that step 4 comprises a first stage, pre-training the backbone network on the ImageNet classification dataset for 10 epochs, using the original image as the first input and images with standard random contrast and color variations as the second and third inputs; and a second stage, training the entire target tracking network to obtain the trained model.
To achieve the object of the invention, the present invention also provides a target tracker implementing the foregoing method.
Beneficial effects of the present invention: the invention integrates semantic knowledge and instance features for target tracking, proposes a new network architecture model based on Darknet-19, formulates the tracking problem as a regression problem that directly predicts the target location coordinates of the incoming frame, and achieves high accuracy and execution efficiency in everyday tracking tasks.
Detailed description of the invention
Fig. 1 shows the extraction of the pictures of the 1st, (t-1)-th and t-th frames.
Fig. 2 shows the three pictures of Fig. 1 after cropping, so that each contains the target.
Fig. 3 shows the convolutional neural network model.
Fig. 4 shows the two output branches of the convolutional neural network.
Specific embodiment
To make the objectives, technical solutions and advantages of the present invention clearer, the present invention is described in detail below with reference to the drawings and specific embodiments.
It should be emphasized that, in the description of the present invention, the various formulas and constraint conditions are distinguished by self-consistent labels, although the use of different labels for identical formulas and/or constraint conditions is not excluded; this arrangement is intended to illustrate the features of the present invention more clearly.
The CNN target tracking model proposed by the present invention is trained on a mixed dataset (ImageNet VID and ALOV300++). ImageNet VID contains 30 different target classes, from which we choose 8 common ones: aircraft, bicycle, bird, bus, car, cat, horse and motorcycle. Because this dataset contains no pedestrians, pedestrians are selected from ALOV300++, finally forming a mixed dataset containing 9 classes.
As shown in Fig. 1, the present invention first extracts three video-frame pictures as the input of the network, then obtains by cropping the three target pictures fed into the CNN. As shown in Fig. 2, these three target pictures pass through the CNN model to extract features (Fig. 3), and the final output consists of two branches, as shown in Fig. 4: one is the classification branch, used to discriminate the class of the target; the other is the regression branch, used for bounding-box regression.
Table 1 lists the detailed parameters of the CNN network structure designed by the present invention.
As shown in Table 1, the present invention fine-tunes the Darknet-19 network model, replacing the global pooling with three convolutional layers and two fully connected layers for classification and localization respectively, and fine-tunes on the above mixed video dataset. In each video sequence, a first frame and a (t-1)-th frame are extracted every 100 frames. For data augmentation, the ground-truth bounding box in frame t is perturbed using a Gaussian distribution. The model was trained for more than 50 iterations on 4 NVIDIA Tesla P40 GPUs, each iteration comprising 800 batches of 512 samples.
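As an illustration, the Gaussian box jitter used for data augmentation can be sketched as follows. The jitter scale `sigma_frac` and the seeded `random.Random` generator are assumptions made here for reproducibility; the patent states only that the ground-truth box is perturbed with a Gaussian distribution.

```python
import random

def jitter_box(x, y, w, h, sigma_frac=0.1, rng=None):
    """Gaussian jitter of a ground-truth box (x, y, w, h).

    The perturbation magnitude is proportional to the box size;
    sigma_frac is a hypothetical hyper-parameter, not from the patent.
    """
    rng = rng or random.Random(0)
    jx = x + rng.gauss(0.0, sigma_frac * w)   # jitter the centre
    jy = y + rng.gauss(0.0, sigma_frac * h)
    jw = max(1.0, w + rng.gauss(0.0, sigma_frac * w))  # keep sizes positive
    jh = max(1.0, h + rng.gauss(0.0, sigma_frac * h))
    return jx, jy, jw, jh
```

With `sigma_frac=0.0` the box is returned unchanged, which makes the augmentation easy to disable for debugging.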
Specifically, the target tracking method of the present invention, based on integrating semantic knowledge and instance features, comprises the following steps:
Step 1: extract the pictures of the 1st, (t-1)-th and t-th frames as input.
The first picture is the first frame, chosen as the standard template of the target; the second picture is selected from frame t-1; and the last picture is selected from the candidate region of the current frame where the target is likely to appear.
Step 2: crop the three pictures input in step 1 so that each contains the target.
In the first frame, the standard template of the target is extracted for initialization. Assuming the size of the ground-truth bounding box is (w, h), the picture is cropped with side length s around the centre of the target; this square region provides the exemplar image, and its context margin satisfies the following relation:
s^2 = (3w) * (3h)   (1)
This exemplar image, of size 288×288, serves as the first input of the CNN and is used as the template throughout tracking. The hyper-parameter 3 in equation (1) is retained from the video statistics of the VID dataset. This configuration covers the motion of almost all targets between consecutive frames while ensuring an acceptable resolution after scaling.
Assuming the tracking result of frame t-1 is p_{t-1}, frames t-1 and t are cropped around the centre (x_{t-1}, y_{t-1}) with crop size (3w_{t-1}, 3h_{t-1}); the cropped pictures are likewise 288×288 and serve as the second and third inputs of the CNN. Note that the scale of the target in the first frame is preserved, because it encodes the features of the template target. In contrast, the target of frame t-1 is rescaled to a size of 96 pixels, so that the CNN can learn the bounding-box regression more effectively from the normalized inter-frame deformation.
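The crop geometry of equation (1) can be sketched as follows; the function name and the centre-based box convention are assumptions made for illustration.

```python
import math

def crop_box(cx, cy, w, h):
    """Square crop of side s around the target centre (cx, cy),
    where s^2 = (3w) * (3h) as in equation (1).

    (w, h) is the ground-truth box size; the factor 3 is the
    context hyper-parameter retained from the VID statistics.
    """
    s = math.sqrt((3.0 * w) * (3.0 * h))  # s = 3 * sqrt(w * h)
    x0 = cx - s / 2.0                     # top-left corner of the crop
    y0 = cy - s / 2.0
    return x0, y0, s, s

# Example: a 60x40 target centred at (160, 120)
x0, y0, sw, sh = crop_box(160.0, 120.0, 60.0, 40.0)
```

In practice the crop would then be resized to the 288×288 network input; boundary clamping and padding are omitted here for brevity.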
Step 3: build a neural network model based on Darknet-19, with minor modifications on its backbone network.
To balance model capacity and efficiency, Darknet-19 is adopted as the backbone network. Darknet-19 has been proven to achieve high performance in related object detection tasks. The model consists of 3×3 and 1×1 convolutional filters, with max pooling between different scales, and the number of channels doubles at each scale. It performs very well in tasks such as object classification and localization while using relatively few parameters. Based on the original Darknet-19 structure, the present invention replaces the global pooling with three convolutional layers and two fully connected layers, used for classification and localization respectively. Table 1 lists the detailed network architecture.
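A rough bookkeeping sketch of the modified head follows, under the assumptions of a 288×288 input, five stride-2 max-pools in the Darknet-19 backbone, padded 3×3 convolutions in the added head, and K = 9 classes; the exact layer parameters follow Table 1.

```python
def head_shapes(input_size=288, num_pools=5, K=9):
    """Spatial-size bookkeeping for the modified Darknet-19 tracker head.

    Assumptions (not specified in the text above): five stride-2
    max-pools in the backbone and 3x3 convs with padding 1 in the head.
    """
    s = input_size
    for _ in range(num_pools):  # each max-pool halves the spatial size
        s //= 2
    conv_s = s                  # padded 3x3 convs keep the spatial size
    cls_out = K                 # classification branch: one score per class
    reg_out = 4 * K             # regression branch: (dx, dy, dw, dh) per class
    return conv_s, cls_out, reg_out
```

For a 288×288 input this yields a 9×9 feature map before the fully connected layers, with K scores and 4K deformation outputs.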
Step 4: design the network output, comprising a classification branch and a regression branch.
In frame t, the network outputs through a fully connected layer a score vector w_t ∈ R^K as the classification result of the target, where K is the number of classes; this vector reflects the likelihood that the corresponding object appears in view. Meanwhile, the network outputs d_t ∈ R^{4K} as the deformation prediction for each target class. Assuming the bounding box output at frame t-1 is p_{t-1} = (x_{t-1}, y_{t-1}, w_{t-1}, h_{t-1}), where (x, y) is the centre coordinate of the box and w and h are its width and height, the deformation regression d_t^k for class k consists of four coordinates:
d_t^k = (Δx_t^k, Δy_t^k, Δw_t^k, Δh_t^k)   (2)
which represent the deformation of the target under the different semantic hypotheses. The final result p_t of frame t can then be computed as:
p_t = p_{t-1} + d_t^k   (3)
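The combination of the two output branches can be illustrated with a short sketch. Choosing the semantic class by the argmax of the score vector and applying an additive box update are assumptions about how the branches are combined; they are not spelled out explicitly in the text above.

```python
def apply_deformation(p_prev, scores, deformations):
    """Pick the highest-scoring semantic class k and update the box.

    p_prev       : (x, y, w, h) from frame t-1 (centre coordinates).
    scores       : length-K class-score vector w_t.
    deformations : K rows of (dx, dy, dw, dh), the per-class regression d_t^k.
    """
    k = max(range(len(scores)), key=lambda i: scores[i])  # argmax class
    dx, dy, dw, dh = deformations[k]
    x, y, w, h = p_prev
    return (x + dx, y + dy, w + dw, h + dh)
```

With two classes and scores (0.1, 0.9), for example, the deformation of the second class is selected and added to the previous box.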
Step 5: design the network loss function.
A cross-entropy loss is used for the classification output w, and an L1 loss is used for the bounding-box regression output d:
L = CE(w_t) + ||d_t - d̂_t||_1   (4)
where d̂_t is the ground-truth deformation from the second input to the third input. The L1 loss penalizes small errors between the predicted and ground-truth bounding boxes more heavily, so the trained model produces more stable bounding boxes.
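A minimal sketch of the joint loss described above, assuming a softmax normalisation of the score vector; the relative weight `lam` between the two terms is an assumption, not a value given in the text.

```python
import math

def tracking_loss(scores, true_class, d_pred, d_true, lam=1.0):
    """Cross-entropy on the class scores plus an L1 penalty on the
    predicted deformation, combined as a weighted sum."""
    # softmax cross-entropy on the score vector w_t (max-shifted for stability)
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    ce = -math.log(exps[true_class] / z)
    # L1 loss between predicted and ground-truth deformation
    l1 = sum(abs(a - b) for a, b in zip(d_pred, d_true))
    return ce + lam * l1
```

With uniform scores over two classes and a perfect deformation prediction, the loss reduces to ln 2, which is a convenient sanity check.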
Step 6: train the convolutional neural network model of the tracker.
First stage: pre-train the backbone network on the ImageNet classification dataset for 10 epochs, using the original image as the first input and images with standard random contrast and color changes as the second and third inputs. The network achieves 72.5% top-1 accuracy and 91.0% top-2 accuracy on ImageNet.
Second stage: train the entire target tracking network to obtain the trained model.
Step 7: evaluate the performance of the trained model.
The trained model is evaluated on a subset of VOT 2016 containing 15 video sequences.
The present invention integrates semantic knowledge and instance features for target tracking, proposes a new network architecture model based on Darknet-19, formulates the tracking problem as a regression problem that directly predicts the target location coordinates of the incoming frame, and achieves high accuracy and execution efficiency in everyday tracking tasks.
The above embodiments are intended only to illustrate, not to limit, the technical solution of the present invention. Although the present invention has been described in detail with reference to preferred embodiments, those skilled in the art should understand that the technical solution of the present invention may be modified or equivalently replaced without departing from its spirit and scope.

Claims (10)

1. A target tracking method based on integrating semantic knowledge and instance features, characterized in that the method comprises the following steps:
Step 1: extract the pictures of the 1st, (t-1)-th and t-th frames;
Step 2: crop the pictures of the 1st, (t-1)-th and t-th frames from step 1, and take the cropped pictures as the input of a convolutional neural network;
Step 3: build a neural network model based on Darknet-19, with minor modifications on its backbone network;
Step 4: train the entire tracker convolutional neural network;
Step 5: evaluate the performance of the trained model.
2. The target tracking method based on integrating semantic knowledge and instance features according to claim 1, characterized in that: a step 3.1 is further included before step 4, the step 3.1 being to design the network output, which comprises a classification branch and a regression branch; and a step 4.1 is further included before step 5, the step 4.1 being to design the network loss function.
3. The target tracking method based on integrating semantic knowledge and instance features according to claim 1, characterized in that: in step 1, the 1st-frame picture is chosen as the standard template containing the target, and the t-th-frame picture is chosen as the candidate region where the target is likely to appear.
4. The target tracking method based on integrating semantic knowledge and instance features according to claim 3, characterized in that: in step 2, the standard template containing the target is extracted for initialization.
5. The target tracking method based on integrating semantic knowledge and instance features according to claim 4, characterized in that: assuming the size of the ground-truth bounding box is (w, h), the input 1st-frame picture is cropped with side length s around the centre of the target to obtain an exemplar image, which is used as the standard template throughout tracking, the margin information satisfying the following relation:
s^2 = (3w) * (3h)   (1).
6. The target tracking method based on integrating semantic knowledge and instance features according to claim 1, characterized in that: in step 3, based on the original Darknet-19 structure, the global pooling is replaced by three convolutional layers and two fully connected layers, used for classification and localization respectively.
7. The target tracking method based on integrating semantic knowledge and instance features according to claim 2, characterized in that: in frame t, the network outputs through a fully connected layer a score vector w_t ∈ R^K as the classification result of the target, K being the number of classes, this vector reflecting the likelihood that the corresponding object appears in view; meanwhile, the network outputs d_t ∈ R^{4K} as the deformation prediction for each target class; assuming the bounding box output at frame t-1 is p_{t-1} = (x_{t-1}, y_{t-1}, w_{t-1}, h_{t-1}), where (x, y) is the centre coordinate of the box and w and h are its width and height, the deformation regression d_t^k for class k consists of four coordinates:
d_t^k = (Δx_t^k, Δy_t^k, Δw_t^k, Δh_t^k)   (2)
which represent the deformation of the target under the different semantic hypotheses; the final result p_t of frame t can be computed as:
p_t = p_{t-1} + d_t^k   (3).
8. The target tracking method based on integrating semantic knowledge and instance features according to claim 7, characterized in that: a cross-entropy loss is used for the classification output w, and an L1 loss is used for the bounding-box regression output d:
L = CE(w_t) + ||d_t - d̂_t||_1   (4)
where d̂_t is the ground-truth deformation from the second input to the third input; the L1 loss penalizes small errors between the predicted and ground-truth bounding boxes more heavily, so the trained model produces more stable bounding boxes.
9. The target tracking method based on integrating semantic knowledge and instance features according to claim 8, characterized in that step 4 comprises two stages: in the first stage, the backbone network is pre-trained on the ImageNet classification dataset for 10 epochs, using the original image as the first input and images with standard random contrast and color changes as the second and third inputs; in the second stage, the entire target tracking network is trained to obtain the trained model.
10. A target tracker implementing the method according to any one of claims 1 to 9.
CN201910590225.XA 2019-07-02 2019-07-02 Target tracking method and target tracker based on integrating semantic knowledge and instance features Pending CN110458864A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910590225.XA CN110458864A (en) 2019-07-02 2019-07-02 Target tracking method and target tracker based on integrating semantic knowledge and instance features

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910590225.XA CN110458864A (en) 2019-07-02 2019-07-02 Target tracking method and target tracker based on integrating semantic knowledge and instance features

Publications (1)

Publication Number Publication Date
CN110458864A true CN110458864A (en) 2019-11-15

Family

ID=68482051

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910590225.XA Pending CN110458864A (en) 2019-07-02 2019-07-02 Target tracking method and target tracker based on integrating semantic knowledge and instance features

Country Status (1)

Country Link
CN (1) CN110458864A (en)



Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108027972A (en) * 2015-07-30 2018-05-11 北京市商汤科技开发有限公司 System and method for Object tracking
CN106709936A (en) * 2016-12-14 2017-05-24 北京工业大学 Single target tracking method based on convolution neural network
CN106845430A (en) * 2017-02-06 2017-06-13 东华大学 Pedestrian detection and tracking based on acceleration region convolutional neural networks
CN109255351A (en) * 2018-09-05 2019-01-22 华南理工大学 Bounding box homing method, system, equipment and medium based on Three dimensional convolution neural network
CN109543754A (en) * 2018-11-23 2019-03-29 中山大学 The parallel method of target detection and semantic segmentation based on end-to-end deep learning

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
SHAOQING REN et al.: "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 39, no. 6, 30 June 2017, pages 1137-1149, XP055705510, DOI: 10.1109/TPAMI.2016.2577031 *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111105442B (en) * 2019-12-23 2022-07-15 中国科学技术大学 Switching type target tracking method
CN111105442A (en) * 2019-12-23 2020-05-05 中国科学技术大学 Switching type target tracking method
CN111428567A (en) * 2020-02-26 2020-07-17 沈阳大学 Pedestrian tracking system and method based on affine multi-task regression
CN111428567B (en) * 2020-02-26 2024-02-02 沈阳大学 Pedestrian tracking system and method based on affine multitask regression
CN112053384A (en) * 2020-08-28 2020-12-08 西安电子科技大学 Target tracking method based on bounding box regression model
CN112053384B (en) * 2020-08-28 2022-12-02 西安电子科技大学 Target tracking method based on bounding box regression model
CN112232359B (en) * 2020-09-29 2022-10-21 中国人民解放军陆军炮兵防空兵学院 Visual tracking method based on mixed level filtering and complementary characteristics
CN112232359A (en) * 2020-09-29 2021-01-15 中国人民解放军陆军炮兵防空兵学院 Visual tracking method based on mixed level filtering and complementary characteristics
CN112861652A (en) * 2021-01-20 2021-05-28 中国科学院自动化研究所 Method and system for tracking and segmenting video target based on convolutional neural network
CN112966581B (en) * 2021-02-25 2022-05-27 厦门大学 Video target detection method based on internal and external semantic aggregation
CN112966581A (en) * 2021-02-25 2021-06-15 厦门大学 Video target detection method based on internal and external semantic aggregation
CN113298142A (en) * 2021-05-24 2021-08-24 南京邮电大学 Target tracking method based on deep space-time twin network
CN113298142B (en) * 2021-05-24 2023-11-17 南京邮电大学 Target tracking method based on depth space-time twin network
CN117237402A (en) * 2023-11-15 2023-12-15 北京中兵天工防务技术有限公司 Target motion prediction method and system based on semantic information understanding
CN117237402B (en) * 2023-11-15 2024-02-20 北京中兵天工防务技术有限公司 Target motion prediction method and system based on semantic information understanding

Similar Documents

Publication Publication Date Title
CN110458864A (en) Target tracking method and target tracker based on integrating semantic knowledge and instance features
Liu et al. ABNet: Adaptive balanced network for multiscale object detection in remote sensing imagery
Baheti et al. Eff-unet: A novel architecture for semantic segmentation in unstructured environment
Oršić et al. Efficient semantic segmentation with pyramidal fusion
Yang et al. Deeperlab: Single-shot image parser
Si et al. Real-time semantic segmentation via multiply spatial fusion network
Raza et al. Appearance based pedestrians’ head pose and body orientation estimation using deep learning
CN111598030A (en) Method and system for detecting and segmenting vehicle in aerial image
CN108537824B (en) Feature map enhanced network structure optimization method based on alternating deconvolution and convolution
Zhang et al. Domain adaptive yolo for one-stage cross-domain detection
CN107463892A (en) Pedestrian detection method in a kind of image of combination contextual information and multi-stage characteristics
CN111611895B (en) OpenPose-based multi-view human skeleton automatic labeling method
Weng et al. Deep multi-branch aggregation network for real-time semantic segmentation in street scenes
Lu et al. A cnn-transformer hybrid model based on cswin transformer for uav image object detection
Weidmann et al. A closer look at seagrass meadows: Semantic segmentation for visual coverage estimation
CN112733590A (en) Pedestrian re-identification method based on second-order mixed attention
CN110517270A (en) A kind of indoor scene semantic segmentation method based on super-pixel depth network
CN112288776A (en) Target tracking method based on multi-time step pyramid codec
Safavi et al. Comparative study of real-time semantic segmentation networks in aerial images during flooding events
Yu et al. Frequency feature pyramid network with global-local consistency loss for crowd-and-vehicle counting in congested scenes
Zhang et al. From Coarse Attention to Fine-Grained Gaze: A Two-stage 3D Fully Convolutional Network for Predicting Eye Gaze in First Person Video.
Sun et al. An integration–competition network for bridge crack segmentation under complex scenes
Noman et al. ELGC-Net: Efficient Local-Global Context Aggregation for Remote Sensing Change Detection
CN105956607B (en) A kind of improved hyperspectral image classification method
CN117576149A (en) Single-target tracking method based on attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination