CN110223316A - Fast target tracking method based on cyclic regression network - Google Patents

Fast target tracking method based on cyclic regression network

Info

Publication number
CN110223316A
CN110223316A (application CN201910512271.8A)
Authority
CN
China
Prior art keywords
regression network
cyclic
target
network
fast
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910512271.8A
Other languages
Chinese (zh)
Other versions
CN110223316B (en)
Inventor
邬向前
卜巍
马丁
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin Institute of Technology
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology filed Critical Harbin Institute of Technology
Priority to CN201910512271.8A priority Critical patent/CN110223316B/en
Publication of CN110223316A publication Critical patent/CN110223316A/en
Application granted granted Critical
Publication of CN110223316B publication Critical patent/CN110223316B/en
Legal status: Active

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a fast target tracking method based on a cyclic regression network, comprising the following steps: Step 1: use a ResNet50 network as the base network of the regression network; Step 2: after the regression network has been trained, introduce an LSTM network on top of it to form the final cyclic regression network, which captures the various appearance variations of the target during tracking; Step 3: train the cyclic regression network with the Smooth-L1 loss function. The method performs target tracking end to end with a single neural network, regresses the target-box coordinates on features at different scales under deep supervision, and uses a long short-term memory network to capture the appearance variations of the target during tracking. Compared with existing target tracking methods, it can locate the target accurately without online updating and has good robustness.

Description

Fast target tracking method based on cyclic regression network
Technical field
The present invention relates to target tracking methods, and more particularly to a fast target tracking method based on a cyclic regression network.
Background art
The purpose of target tracking is to automatically mark the target in subsequent frames, given its bounding box in an initial frame. As target tracking technology has advanced, it has come to play a role in more and more fields, such as video surveillance, human-computer interaction, and action recognition. However, poor tracking results directly degrade the performance of these applications built on target tracking, which to some extent limits the scope and effectiveness of tracking methods. In recent years, with the application of convolutional neural networks in computer vision, target tracking has achieved great success.
Summary of the invention
In order to better perform target tracking, the present invention provides a fast target tracking method based on a cyclic regression network. The method proposes a regression network to obtain a richer representation of the target, and on this basis incorporates temporal information from the tracking process, so that the whole tracking pipeline adapts to the appearance variations of the target and obtains accurate target localization without frequent updates. The method of the invention performs target tracking well and obtains competitive results on multiple databases.
The purpose of the present invention is achieved through the following technical solution:
A fast target tracking method based on a cyclic regression network comprises the following steps:
Step 1: use a ResNet50 network as the base network of the regression network. It comprises six convolutional modules (Pool5, Rec5b, Rec4f, Rec3d, Rec2c, and Pool1), six add-on modules (Pool5_A, Rec5b_A, Rec4f_A, Rec3d_A, Rec2c_A, and Pool1_A) attached after the respective convolutional modules, and 3 fully connected layers connected after all convolutional modules. Pool5_A, Rec5b_A, Rec4f_A, Rec3d_A, Rec2c_A, and Pool1_A share the same structure: 3 convolutional layers, 1 Concat layer, 1 Correlation layer, 1 sigmoid layer, and 3 fully connected layers. The input of the regression network comprises two kinds of information: first, a full-size image pair of two consecutive frames (the previous frame and the current frame); second, the rectangular-box coordinates of the target in the previous frame. Benefiting from the feature cascade of the base network itself, the coordinates of the target in the current frame are predicted jointly by the 6 convolutional modules with their add-on modules and the 3 fully connected layers.
Step 2: after the regression network has been trained, introduce a long short-term memory (LSTM) network on top of it to form the final cyclic regression network, which captures the various appearance variations of the target during tracking. The LSTM network is embedded after the 2nd fully connected layer of the base network, and its output has 4 units (the horizontal and vertical coordinates of the upper-left and lower-right corners of the predicted rectangular box); the output of these 4 units is then fed as input to the 3rd fully connected layer to predict the final rectangular-box coordinates.
Step 3: train the cyclic regression network with the Smooth-L1 loss function.
Compared with the prior art, the present invention has the following advantages:
The method performs target tracking end to end with a single neural network, regresses the target-box coordinates on features at different scales under deep supervision, and uses a long short-term memory network to capture the various appearance variations of the target during tracking. Compared with existing target tracking methods, it can locate the target accurately without online updating and has good robustness.
Detailed description of the invention
Fig. 1 is a flow diagram of the fast target tracking method based on a cyclic regression network according to the present invention;
Fig. 2 is a structural diagram of the add-on module;
Fig. 3 is a visual comparison between different tracking methods and the method of the invention in different challenging scenes;
Fig. 4 is a comparison between the method of the invention and 12 tracking methods on the TC128 dataset;
Fig. 5 is a comparison between the method of the invention and 7 tracking methods on the VOT2017 dataset.
Specific embodiment
The technical solution of the present invention is further described below with reference to the accompanying drawings; however, the invention is not limited thereto. Any modification or equivalent replacement of the technical solution of the invention that does not depart from its spirit and scope shall be covered by the protection scope of the present invention.
The present invention provides a fast target tracking method based on a cyclic regression network. Fig. 1 shows the overall structure of the whole network, which can be divided into three parts, described as follows:
The first part is the base network. Most current tracking algorithms adopt a lightweight network such as VGG-M as the base network. The reason for choosing such a lightweight network is that, when the appearance of the target changes drastically, the network can be fine-tuned online in time, so that the whole network again produces a high response when the same appearance variation recurs and thus locates the target accurately. However, such frequent updates greatly increase the complexity of the algorithm and severely slow down the tracker. For this reason, the present invention uses the larger-capacity ResNet50 as the base network for feature extraction and, on top of it, designs and proposes 6 add-on modules to regress the position of the target box from features at different scales.
The second part is the specific structure of the add-on module, shown in Fig. 2. A CNN model abstracts features by cascading multiple convolution and pooling layers. Shallow CNN features focus more on portraying the details of the target (edges, corners, etc.), and these shallow features can be used to locate the target precisely; deep CNN features focus more on the semantic information of the target and tend to distinguish the target from the background. To make full use of the advantages of features at different depths, we attach add-on modules Pool5_A, Rec5b_A, Rec4f_A, Rec3d_A, Rec2c_A, and Pool1_A to the 6 convolutional modules Pool5, Rec5b, Rec4f, Rec3d, Rec2c, and Pool1 of ResNet50, respectively. Each add-on module can predict the coordinates of the target box from the features at its own scale, and every add-on module uses the same structure. As shown in Fig. 2, taking Rec5b_A as an example: the features of Pool5_A are first up-sampled so that they match the feature size of Rec5b; a Correlation layer then computes the correlation between the two groups of features; the output of the Correlation layer is concatenated with the features of Rec5b by a Concat layer; and this is followed by two convolutional layers, one sigmoid layer, and three fully connected layers.
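The up-sample, Correlation, and Concat steps of the add-on module can be sketched as follows. This is a minimal NumPy illustration of the data flow only, assuming nearest-neighbour up-sampling and a channel-wise inner product as the correlation; the actual layer implementations, channel counts, and learned convolutions of the patent are not reproduced here.

```python
import numpy as np

def upsample2x(feat):
    """Nearest-neighbour 2x up-sampling of a (C, H, W) feature map
    (a stand-in for the up-sampling step in the add-on module)."""
    return feat.repeat(2, axis=1).repeat(2, axis=2)

def correlation(a, b):
    """Per-position correlation of two (C, H, W) feature maps: the
    channel-wise inner product at each spatial location, giving a
    single-channel (1, H, W) response map."""
    return np.sum(a * b, axis=0, keepdims=True)

def addon_module(deep_feat, shallow_feat):
    """Sketch of one add-on module (e.g. Rec5b_A): up-sample the deeper
    features, correlate them with the shallower ones, then concatenate
    the correlation response with the shallow features (Concat layer)."""
    up = upsample2x(deep_feat)                            # match shallow size
    corr = correlation(up, shallow_feat)                  # Correlation layer
    fused = np.concatenate([shallow_feat, corr], axis=0)  # Concat layer
    return fused

# Toy shapes: a 256-channel 4x4 deep map and a 256-channel 8x8 shallow map.
deep = np.random.rand(256, 4, 4).astype(np.float32)
shallow = np.random.rand(256, 8, 8).astype(np.float32)
out = addon_module(deep, shallow)
print(out.shape)  # (257, 8, 8)
```

The Concat output has 257 channels: the 256 shallow channels plus the single-channel correlation response, which the subsequent convolutional, sigmoid, and fully connected layers would then consume.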
The third part introduces a long short-term memory (LSTM) network on top of the regression network to form the final cyclic regression network, whose structure is shown in Fig. 3. Speed is often an important criterion for judging tracker quality, and with the development of CNNs more and more CNN-based tracking algorithms have been proposed. Most of them share the following characteristic: the CNN model is designed to be shallow, so that when the target deforms the network can be fine-tuned in time and the whole network can capture the deformation. However, frequent online fine-tuning greatly increases computational complexity. Therefore, the present invention combines the previously designed regression network with a long short-term memory (LSTM) network, so that the whole network can capture the appearance variations of the target without fine-tuning, using the information stored in the LSTM. To this end, we embed the LSTM between the 2nd and 3rd fully connected layers of the regression network. The input and output of the LSTM can be expressed as:
Z_t = T(A·X_t + B·Y_{t-1} + b) (1);
Y_t = O_t ⊙ T(C_t) (2);
where t is the frame index, Z_t is the output vector of the current frame, X_t is the input vector of the current frame, Y_{t-1} is the recurrent vector, b is the bias, T is the tanh function, O_t is the output gate at frame t, C_t is the memory cell, A and B are weight matrices, and ⊙ denotes the element-wise product.
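A minimal NumPy sketch of one step of Eqs. (1)-(2) follows. For illustration it treats the output gate O_t and memory cell C_t as given inputs; in a full LSTM they have their own learned update rules, which the patent does not spell out.

```python
import numpy as np

def recurrent_step(x_t, y_prev, A, B, b, o_t, c_t):
    """One step of the recurrent unit as written in Eqs. (1)-(2):
    Z_t = T(A X_t + B Y_{t-1} + b) and Y_t = O_t * T(C_t),
    with T = tanh and * an element-wise product."""
    z_t = np.tanh(A @ x_t + B @ y_prev + b)  # Eq. (1): output vector
    y_t = o_t * np.tanh(c_t)                 # Eq. (2): recurrent vector
    return z_t, y_t

# Toy dimensions: a 4-unit state (the box-corner coordinates) and an
# 8-dimensional input vector; all values here are illustrative.
rng = np.random.default_rng(0)
A = rng.standard_normal((4, 8))
B = rng.standard_normal((4, 4))
b = np.zeros(4)
x_t = rng.standard_normal(8)
y_prev = np.zeros(4)       # recurrent vector from the previous frame
o_t = rng.random(4)        # output gate (given here, learned in practice)
c_t = rng.standard_normal(4)  # memory cell (given here, learned in practice)
z_t, y_t = recurrent_step(x_t, y_prev, A, B, b, o_t, c_t)
print(z_t.shape, y_t.shape)  # (4,) (4,)
```

Because T is tanh, every component of Z_t stays in [-1, 1], which matches its role as a normalized 4-unit output feeding the 3rd fully connected layer.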
The objective function for network training is the Smooth-L1 loss.
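As a reference for this objective, a minimal NumPy version of the Smooth-L1 loss over box-corner coordinates might look like the following; the threshold beta = 1 is the common default and an assumption, since the patent does not state it.

```python
import numpy as np

def smooth_l1(pred, target, beta=1.0):
    """Smooth-L1 loss: quadratic for small residuals (|d| < beta),
    linear for large ones, keeping box regression robust to outliers."""
    d = np.abs(pred - target)
    per_coord = np.where(d < beta, 0.5 * d**2 / beta, d - 0.5 * beta)
    return per_coord.sum()

# Box corners (x1, y1, x2, y2): a prediction close to the ground truth.
pred = np.array([10.0, 12.0, 50.0, 60.0])
gt   = np.array([10.5, 12.0, 52.0, 60.0])
print(smooth_l1(pred, gt))  # 0.125 + 0 + 1.5 + 0 = 1.625
```

The small 0.5-pixel error contributes quadratically (0.125) while the large 2-pixel error contributes only linearly (1.5), which is the property that makes this loss well suited to box-coordinate regression.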
Experimental results
The database used for network training is the YouTube-BoundingBoxes database, which contains 380,000 videos. The present invention trains the regression network and the cyclic regression network with different training strategies. The regression network is trained in two stages: in the first stage, two discrete (non-consecutive) frames are used as input, in order to obtain rich location information; in the second stage, two consecutive frames are used to supplement the spatio-temporal information of the network. Afterwards, the parameters of the regression network are fixed and the cyclic regression network is trained.
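The difference between the two training stages lies only in how frame pairs are drawn. A hypothetical sampler illustrating that difference is sketched below; the function name and sampling ranges are assumptions for illustration, not taken from the patent.

```python
import random

def sample_pair(num_frames, stage):
    """Illustrative frame-pair sampler for the two training stages:
    stage 1 draws two discrete (possibly far-apart) frames, for rich
    location information; stage 2 draws two consecutive frames, to
    supply spatio-temporal information."""
    if stage == 1:
        i, j = sorted(random.sample(range(num_frames), 2))  # discrete pair
    else:
        i = random.randrange(num_frames - 1)
        j = i + 1                                           # consecutive pair
    return i, j

random.seed(0)
print(sample_pair(100, stage=1))  # two distinct frame indices, i < j
print(sample_pair(100, stage=2))  # consecutive indices (i, i + 1)
```

After both stages, the regression network's weights would be frozen and only the LSTM of the cyclic regression network trained on top, as described above.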
To verify the performance of the invention, the proposed target tracking method is evaluated on three standard public databases: OTB100, TC128, and VOT2017. OTB100 and TC128 use the same evaluation criteria, namely precision and success rate. Unlike OTB100 and TC128, the evaluation criteria of VOT2017 are the accuracy-robustness curve (Accuracy-Robustness rank, denoted A-R rank) and the expected average overlap rate (EAO).
Fig. 3 shows a visual comparison between the method of the invention and other current state-of-the-art methods. As can be seen from Fig. 3, the method of the invention obtains accurate tracking results in various challenging scenes. Some specific analyses follow:
(1) The method of the invention uses deep supervision so that the regression network can predict the position of the target from features at different scales, and incorporates a long short-term memory (LSTM) network to capture the various deformations of the target during tracking, so that the position of the target can be predicted accurately.
(2) With the long short-term memory network incorporated, the method of the invention obtains accurate target localization even when the target deforms (e.g., rows 2 and 4 of Fig. 3).
(3) When the target undergoes scale variation (e.g., rows 3, 5, and 6 of Fig. 3), the localization results of the method of the invention are much better than those of other methods.
Table 1 shows the quantitative evaluation of the method of the invention against 19 of the best target tracking methods on the OTB dataset, in terms of precision, success rate, and speed (FPS).
Table 1: comparison of the experimental results with the current best tracking results on the OTB100 database in terms of precision, success rate, and speed (FPS)
As can be seen from Table 1, on the OTB100 test set the method of the invention achieves competitive results across all three evaluation criteria (precision, success rate, and FPS), which demonstrates its effectiveness.
As can be seen from Figs. 4 and 5, the precision and success rate of the method of the invention are higher than those of other methods, which means that the method of the invention is more robust than other methods even on challenging datasets. In particular, the method also obtains good results on the VOT2017 dataset, a difficult and challenging tracking dataset whose disturbing factors include fast motion, occlusion, and scale variation. The method of the invention can locate the target accurately, which is attributable to its powerful feature extraction ability and the extension of its features along the time dimension: the learned features can locate the target against complex backgrounds, yielding good tracking results.

Claims (6)

1. A fast target tracking method based on a cyclic regression network, characterized in that the method comprises the following steps:
Step 1: using a ResNet50 network as the base network of the regression network;
Step 2: after the regression network has been trained, introducing an LSTM network on top of it to form the final cyclic regression network, which captures the various appearance variations of the target during tracking;
Step 3: training the cyclic regression network with the Smooth-L1 loss function.
2. The fast target tracking method based on a cyclic regression network according to claim 1, characterized in that the base network comprises six convolutional modules (Pool5, Rec5b, Rec4f, Rec3d, Rec2c, and Pool1), six add-on modules (Pool5_A, Rec5b_A, Rec4f_A, Rec3d_A, Rec2c_A, and Pool1_A) attached after the respective convolutional modules, and 3 fully connected layers connected after all convolutional modules.
3. The fast target tracking method based on a cyclic regression network according to claim 2, characterized in that Pool5_A, Rec5b_A, Rec4f_A, Rec3d_A, Rec2c_A, and Pool1_A share the same structure: 3 convolutional layers, 1 Concat layer, 1 Correlation layer, 1 sigmoid layer, and 3 fully connected layers.
4. The fast target tracking method based on a cyclic regression network according to claim 1, characterized in that the input of the regression network comprises two kinds of information: first, a full-size image pair of two consecutive frames; second, the rectangular-box coordinates of the target in the previous frame.
5. The fast target tracking method based on a cyclic regression network according to claim 1, characterized in that the LSTM network is embedded after the 2nd fully connected layer of the base network, its output has 4 units giving the horizontal and vertical coordinates of the upper-left and lower-right corners of the predicted rectangular box, and the output of these 4 units is then fed as input to the 3rd fully connected layer to predict the final rectangular-box coordinates.
6. The fast target tracking method based on a cyclic regression network according to claim 1, characterized in that the input and output of the LSTM are expressed as:
Z_t = T(A·X_t + B·Y_{t-1} + b) (1);
Y_t = O_t ⊙ T(C_t) (2);
where t is the frame index, Z_t is the output vector of the current frame, X_t is the input vector of the current frame, Y_{t-1} is the recurrent vector, b is the bias, T is the tanh function, O_t is the output gate at frame t, C_t is the memory cell, A and B are weight matrices, and ⊙ denotes the element-wise product.
CN201910512271.8A 2019-06-13 2019-06-13 Rapid target tracking method based on cyclic regression network Active CN110223316B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910512271.8A CN110223316B (en) 2019-06-13 2019-06-13 Rapid target tracking method based on cyclic regression network


Publications (2)

Publication Number Publication Date
CN110223316A true CN110223316A (en) 2019-09-10
CN110223316B CN110223316B (en) 2021-01-29

Family

ID=67817080

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910512271.8A Active CN110223316B (en) 2019-06-13 2019-06-13 Rapid target tracking method based on cyclic regression network

Country Status (1)

Country Link
CN (1) CN110223316B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113947618A (en) * 2021-10-20 2022-01-18 哈尔滨工业大学 Adaptive regression tracking method based on modulator

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170161607A1 (en) * 2015-12-04 2017-06-08 Pilot Ai Labs, Inc. System and method for improved gesture recognition using neural networks
US20170255832A1 (en) * 2016-03-02 2017-09-07 Mitsubishi Electric Research Laboratories, Inc. Method and System for Detecting Actions in Videos
CN108305275A (en) * 2017-08-25 2018-07-20 深圳市腾讯计算机系统有限公司 Active tracking method, apparatus and system
CN108320297A (en) * 2018-03-09 2018-07-24 湖北工业大学 A kind of video object method for real time tracking and system
CN108520530A (en) * 2018-04-12 2018-09-11 厦门大学 Method for tracking target based on long memory network in short-term
CN109344725A (en) * 2018-09-04 2019-02-15 上海交通大学 A kind of online tracking of multirow people based on space-time attention rate mechanism
CN109360226A (en) * 2018-10-17 2019-02-19 武汉大学 A kind of multi-object tracking method based on time series multiple features fusion
CN109829391A (en) * 2019-01-10 2019-05-31 哈尔滨工业大学 Conspicuousness object detection method based on concatenated convolutional network and confrontation study


Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
CHANHO KIM et al.: "Multi-object Tracking with Neural Gating Using Bilinear LSTM", European Conference on Computer Vision *
KWANG-EUN KO et al.: "Deep convolutional framework for abnormal behavior detection in a smart surveillance system", Engineering Applications of Artificial Intelligence *
YUNHUA ZHANG et al.: "Structured Siamese Network for Real-Time Visual Tracking", European Conference on Computer Vision *
戴蔼佳 (DAI Aijia): "Deep regression network visual tracking algorithm based on saliency correction", China Masters' Theses Full-text Database, Information Science and Technology *
李林泽 (LI Linze): "Research on human detection and tracking methods based on metric learning and system implementation", China Masters' Theses Full-text Database, Information Science and Technology *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113947618A (en) * 2021-10-20 2022-01-18 哈尔滨工业大学 Adaptive regression tracking method based on modulator
CN113947618B (en) * 2021-10-20 2023-08-29 哈尔滨工业大学 Self-adaptive regression tracking method based on modulator

Also Published As

Publication number Publication date
CN110223316B (en) 2021-01-29

Similar Documents

Publication Publication Date Title
CN112784798B (en) Multi-modal emotion recognition method based on feature-time attention mechanism
JP7147078B2 (en) Video frame information labeling method, apparatus, apparatus and computer program
CN111652903B (en) Pedestrian target tracking method based on convolution association network in automatic driving scene
CN106845430A (en) Pedestrian detection and tracking based on acceleration region convolutional neural networks
CN108509976A (en) The identification device and method of animal
CN112396027A (en) Vehicle weight recognition method based on graph convolution neural network
CN112258554A (en) Double-current hierarchical twin network target tracking method based on attention mechanism
CN109389035A (en) Low latency video actions detection method based on multiple features and frame confidence score
CN109671102A (en) A kind of composite type method for tracking target based on depth characteristic fusion convolutional neural networks
CN113706581B (en) Target tracking method based on residual channel attention and multi-level classification regression
CN110889450B (en) Super-parameter tuning and model construction method and device
CN112348849A (en) Twin network video target tracking method and device
CN110110602A (en) A kind of dynamic sign Language Recognition Method based on three-dimensional residual error neural network and video sequence
CN110120065A (en) A kind of method for tracking target and system based on layering convolution feature and dimension self-adaption core correlation filtering
CN108830170A (en) A kind of end-to-end method for tracking target indicated based on layered characteristic
CN111915644A (en) Real-time target tracking method of twin guiding anchor frame RPN network
CN111862145A (en) Target tracking method based on multi-scale pedestrian detection
CN110110663A (en) A kind of age recognition methods and system based on face character
CN109598742A (en) A kind of method for tracking target and system based on SSD algorithm
CN112149665A (en) High-performance multi-scale target detection method based on deep learning
CN107633196A (en) A kind of eyeball moving projection scheme based on convolutional neural networks
Li et al. Hierarchical knowledge squeezed adversarial network compression
CN110223316A (en) Fast target tracking method based on cyclic regression network
CN111144497B (en) Image significance prediction method under multitasking depth network based on aesthetic analysis
CN112150504A (en) Visual tracking method based on attention mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant