CN109801310A - Target tracking method based on an orientation- and scale-discriminating deep network - Google Patents

Target tracking method based on an orientation- and scale-discriminating deep network

Info

Publication number
CN109801310A
Authority
CN
China
Prior art keywords: network, target, sample, tracking, scale
Prior art date: 2018-11-23
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811403020.8A
Other languages
Chinese (zh)
Inventor
胡昭华 (Hu Zhaohua)
侍孝义 (Shi Xiaoyi)
陈慧 (Chen Hui)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Nanjing University of Information Science and Technology
Original Assignee
Nanjing University of Information Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.): 2018-11-23
Filing date: 2018-11-23
Publication date: 2019-05-24
Application filed by Nanjing University of Information Science and Technology
Priority to CN201811403020.8A
Publication of CN109801310A
Legal status: Pending (current)

Landscapes

  • Image Analysis (AREA)

Abstract

The invention discloses a target tracking method based on an orientation- and scale-discriminating deep network, comprising the following steps: step (1), pre-training the network; step (2), orientation-information classification; step (3), sliding-window operation; step (4), online fine-tuning; step (5), re-detection after loss. Under a deep-learning network framework, the invention adds orientation-information classification and positive/negative sample classification, so that the computational load of the deep network is low and the speed is high. At the same time, by fine-tuning the network parameters online, the network adapts to the target's deformation at different times when deformation occurs, and the tracking task can still be completed well. In addition, a re-detection strategy is introduced during tracking, so that the case where the target is lost during tracking can also be handled well.

Description

Target tracking method based on an orientation- and scale-discriminating deep network
Technical field
The present invention relates to the fields of image processing and computer vision in human-computer interaction and video surveillance, and in particular to a target tracking method based on an orientation- and scale-discriminating deep network, which realizes target tracking by having a deep network learn the target's orientation information.
Background art
In order to allow machines to understand the real world, computer vision has always been at the research frontier, and target tracking has always been a core topic of computer vision, so target tracking technology has been a research hotspot in recent years. Its purpose is, given the size and location of the target marked in the first frame, to learn the target's features with an algorithm and predict the size and location of the target in subsequent frames. In the field of single-target tracking, recent tracking algorithms are broadly divided into two general directions: correlation filtering and deep learning.
Deep learning has been widely applied in the tracking field in recent years. Naiyan Wang et al. (Wang N, Yeung D Y. Learning a deep compact image representation for visual tracking[C]//Advances in neural information processing systems. 2013: 809-817.) proposed the DLT algorithm, which introduced deep learning into the target tracking domain for the first time. DLT first performs unsupervised offline pre-training with a stacked denoising autoencoder on a large-scale natural image data set such as the Tiny Images dataset, so as to obtain a general object representation ability. At the beginning of the tracking phase, the network has not yet acquired a specific representation of the object currently being tracked; at this time positive and negative samples are obtained from the first frame, and the classification network is fine-tuned so that it becomes more specific to the current tracking target and background. During tracking, a batch of candidate samples is extracted from the current frame by particle filtering; these samples are fed into the classification network, and the one with the highest confidence becomes the final predicted target. The data set used for DLT's offline pre-training contains only 32*32 images, whose resolution is obviously low, so it is difficult to learn a sufficiently strong feature representation. The fully connected network structure of DLT is also not outstanding at describing target features; although a 4-layer deep model is used, its performance is still below that of some traditional trackers using hand-crafted features.
With the wide application of deep learning, CNN networks, which are good at handling images, were introduced into the tracking field, and a large batch of outstanding algorithms was born. Nam et al. (Nam H, Han B. Learning multi-domain convolutional neural networks for visual tracking[C]//Computer Vision and Pattern Recognition (CVPR), 2016 IEEE Conference on. IEEE, 2016: 4293-4302.) proposed the MDNet algorithm. Nam et al. realized that there is a huge difference between the image classification task and tracking, and MDNet therefore proposed to pre-train the CNN directly on tracking videos to obtain a general target representation ability. MDNet treats each training sequence as an individual domain; each domain has its own binary classification layer for distinguishing the foreground and background of the current sequence, while all preceding layers of the network are shared across sequences. In this way the shared layers achieve the goal of learning a feature representation of the target from the tracking sequences. In the tracking phase, the data of the first frame are used to train a bounding-box regression model for the sequence, and positive and negative samples are extracted from the first frame to update the network weights. Afterwards, 256 candidate samples are generated per frame, the one with the highest confidence is selected, and bounding-box regression is then applied to obtain the final result. Although MDNet performs well, it still has some problems: hundreds of samples have to be forwarded through the network, so although the network structure is small, the speed is still relatively slow; and the bounding-box regression also requires separate training.
Summary of the invention
Goal of the invention: in order to overcome the shortcomings of the above techniques and enable the tracker to still track the target accurately under complex situations such as illumination change, scale change, occlusion, deformation, motion blur, fast motion and background clutter, the present invention proposes a target tracking algorithm in which a deep network learns the target's orientation information; it adds sliding-window movement and single-sample tracking, accelerates the deep-network computation, and is a simple and robust tracking method.
Technical solution: in order to solve the problem of target tracking failure caused by target deformation, occlusion, illumination, rotation and similar situations, as well as the problem of slow deep-network speed, the invention proposes a single-sample target tracking method in which a deep network learns the target's orientation information; it maintains the stability of target tracking under complex scenes, improves the precision of the tracker, and increases the speed of the deep network.
The specific steps of the target tracking method of the present invention, in which a deep network learns the target's orientation information, are as follows:
(1) Step 1: pre-training the network. The present invention uses a network framework of three convolutional layers followed by three fully connected layers, where the third fully connected layer is a two-branch classification network: the orientation-information classification layer judges the target's position information, and the positive/negative classification layer distinguishes target and background. The first three convolutional layers of the network are initialized from VGG-M; a convolutional network trained on a large-scale data set has good generalization and transfer-learning ability. Using the weights of a trained network as the initialization of the training network greatly improves the network's ability to learn for tracking.
The training output of each classification branch uses a softmax cross-entropy loss function, and the network weights are updated by stochastic gradient descent with back-propagation, with 100 training cycles. The training data set consists of 58 video sequences from VOT2013, VOT2014 and VOT2015. Unlike past pre-training on large-scale classification data, pre-training on video sequences makes the network more specific to tracking and better suited for tracking training.
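As a concrete illustration of this training setup, the following is a minimal PyTorch sketch of one update step. It assumes a hypothetical `net` that returns the two classification outputs described above (an architecture sketch is given in the specific embodiment below); the mini-batching and any optimizer settings beyond plain stochastic gradient descent are illustrative assumptions, not taken from the patent.

```python
# Minimal sketch of one training update, assuming a two-head `net` as described above.
# The softmax cross-entropy loss and SGD with back-propagation follow the text;
# everything else is an illustrative assumption.
import torch
import torch.nn as nn

criterion = nn.CrossEntropyLoss()  # softmax cross-entropy, used for both heads

def train_step(net, optimizer, patches, dir_labels, pn_labels):
    """One stochastic-gradient-descent update with back-propagation."""
    optimizer.zero_grad()
    dir_scores, pn_scores = net(patches)          # orientation head, positive/negative head
    loss = criterion(dir_scores, dir_labels) + criterion(pn_scores, pn_labels)
    loss.backward()                               # back-propagation
    optimizer.step()                              # weight update
    return loss.item()

# optimizer = torch.optim.SGD(net.parameters(), lr=1e-3)
# (the per-layer rates beta1/beta2 mentioned below would be set via SGD parameter groups)
```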
Samples in the training stage are selected by computing the overlap ratio. In order to make the network learn more target orientation information, the overlap-ratio threshold dividing positive and negative samples in the training stage is denoted by α. As in formula (1), l denotes the positive/negative label of a sample and IoU denotes the overlap ratio; if the overlap ratio of gs and gt is greater than α the sample is judged to be positive, otherwise negative. The convolutional-layer learning rate is denoted β1 and the fully-connected-layer learning rate β2.
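The body of formula (1) is not reproduced in the text, so the sketch below simply follows the standard intersection-over-union definition together with the threshold rule just described. Boxes are taken as [x, y, w, h] with (x, y) the box centre, which is an assumption; the default threshold 0.6 comes from the specific embodiment below.

```python
def iou(box_a, box_b):
    """Overlap ratio (IoU) of two [x, y, w, h] boxes, with (x, y) the box centre."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    iw = max(0.0, min(ax + aw / 2, bx + bw / 2) - max(ax - aw / 2, bx - bw / 2))
    ih = max(0.0, min(ay + ah / 2, by + bh / 2) - max(ay - ah / 2, by - bh / 2))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0

def pos_neg_label(g_s, g_t, alpha=0.6):
    """Formula (1), sketched: label l is positive when IoU(g_s, g_t) > alpha, else negative."""
    return 1 if iou(g_s, g_t) > alpha else 0
```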
(2) Step 2: orientation-information classification. The present invention has the deep network learn eleven orientation classes for positive samples, mainly divided into upper-left (D1), left (D2), lower-left (D3), lower (D4), lower-right (D5), right (D6), upper-right (D7), upper (D8), small scale (S1), large scale (S2) and true sample (T).
[xl, xr, yu, yd] respectively denote the left, right, upper and lower boundaries of the constraint frame used for orientation division. The boundaries are determined by the original target size of the sample, as shown in formulas (2) and (3). Let gt = [xt, yt, wt, ht] be the ground truth of the current frame, where xt is the x-coordinate of the target position, yt is the y-coordinate of the target position, wt is the target width along the x-axis and ht is the target height along the y-axis.
In formulas (2) and (3), α is the scaling factor of the boundary constraint. The class of a sample is determined by the boundaries defined in formulas (2) and (3).
C=ρ (gs,[xl,xr,yu,yd],[wt,ht]) (4)
Formula (4) describes the decision mechanism for classifying a sample. gs = [xs, ys, ws, hs] is the position information of the sample, and c is the output class of the sample. ρ denotes the decision rule of the invention: the constraint frame [xl, xr, yu, yd] and the sample position information gs are used to judge the sample's position class, and the scale decision mechanism together with the width and height [wt, ht] of the ground truth is used to judge the scale-change class.
The class of a sample is judged from the region into which its x- and y-coordinates fall, as described by formula (5). If the sample falls into the central region of the constraint frame, its scale change is judged instead, by comparing the sample's ws and hs with the original ground-truth wt and ht. Unlike the multi-scale sample input adopted by most traditional algorithms to judge the target's scale change, the present invention fuses the scale change into the sample classification and places the scale decision directly in the classification layer of the deep network output, making full use of the powerful learning ability of the deep convolutional neural network. The precondition for judging a sample's scale change is that the sample point falls in the middle region of the constraint frame, i.e. xl < xs < xr and yu < ys < yd. The scale of a sample is judged mainly by formula (6), where γ and λ denote the scale-change factors. A sample point that falls inside the constraint frame and has no size change is judged to be of class T.
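Formulas (2), (3), (5) and (6) are referenced but not reproduced in the text, so the following sketch of the decision rule ρ is only an interpretation of the description: the constraint-frame construction, the region layout (image coordinates, y increasing downward, (xt, yt) treated as the target centre) and the values of α, γ and λ are assumptions.

```python
def direction_scale_class(g_s, g_t, alpha=0.2, gamma=0.9, lam=1.1):
    """Hedged sketch of c = rho(g_s, [xl, xr, yu, yd], [wt, ht]) (formula (4))."""
    x_s, y_s, w_s, h_s = g_s
    x_t, y_t, w_t, h_t = g_t
    # Constraint frame around the ground-truth centre, scaled by alpha
    # (assumed form of formulas (2) and (3)).
    x_l, x_r = x_t - alpha * w_t, x_t + alpha * w_t
    y_u, y_d = y_t - alpha * h_t, y_t + alpha * h_t
    left, right = x_s < x_l, x_s > x_r
    up, down = y_s < y_u, y_s > y_d          # y grows downward (image coordinates)
    if left and up:     return "D1"          # upper-left
    if left and down:   return "D3"          # lower-left
    if right and down:  return "D5"          # lower-right
    if right and up:    return "D7"          # upper-right
    if left:            return "D2"          # left
    if down:            return "D4"          # lower
    if right:           return "D6"          # right
    if up:              return "D8"          # upper
    # Inside the constraint frame: decide the scale change (assumed form of formula (6)).
    if w_s < gamma * w_t and h_s < gamma * h_t:
        return "S1"                          # sample smaller than the target -> small scale
    if w_s > lam * w_t and h_s > lam * h_t:
        return "S2"                          # sample larger than the target -> large scale
    return "T"                               # true sample: on the target, no size change
```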
(3) Step 3: sliding-window operation. The class of a target sample obtained from the classification network is not yet the result that target tracking requires; a processing mechanism is needed that gradually approaches the true position of the target while obtaining sample information. The present invention gradually approaches the true position of the target using a sliding-window method.
The sample at the current location is fed into the orientation-information classification network to obtain the specific orientation of the current location relative to the target, and the current sample window is slid toward the target accordingly. With the sliding-window operation the network receives only one sample at a time, realizing single-sample network computation and improving the computation speed.
Each orientation class has a corresponding sliding-window strategy; the specific implementation is shown in formula (7).
In the formula, θ denotes the number of pixels moved by the sliding-window operation, and gs+1 denotes the next sample position to be taken. When the sample gs+1 fed into the network yields class T as output, i.e. it is closest to the target, the sliding-window operation stops and the tracking result of the current frame is obtained.
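Formula (7) itself is not reproduced, so the sketch below shows one plausible form of the update it describes. The mapping from each orientation class to a pixel offset, the treatment of S1/S2 as window resizing, and the default step θ are assumptions consistent with the class definitions above (image coordinates, y increasing downward).

```python
# Assumed per-class offsets (dx, dy): the window moves toward the target,
# i.e. opposite to where the sample lies relative to it.
STEP = {
    "D1": (+1, +1), "D2": (+1, 0), "D3": (+1, -1), "D4": (0, -1),
    "D5": (-1, -1), "D6": (-1, 0), "D7": (-1, +1), "D8": (0, +1),
}

def slide(g_s, cls, theta=2):
    """Formula (7), sketched: next sample g_{s+1} from the current sample and its class."""
    x, y, w, h = g_s
    if cls in STEP:
        dx, dy = STEP[cls]
        return [x + theta * dx, y + theta * dy, w, h]
    if cls == "S1":                  # sample smaller than the target: enlarge the window
        return [x, y, w + theta, h + theta]
    if cls == "S2":                  # sample larger than the target: shrink the window
        return [x, y, w - theta, h - theta]
    return list(g_s)                 # class T: already on the target, no move
```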
(4) Step 4: online fine-tuning. The pre-trained network uses a large number of video sequences and has good generalization ability, but applied directly to a specific video sequence without adaptation it still cannot achieve a good tracking effect. Therefore, in the first frame the network weights are fine-tuned with the same strategy as in pre-training, so that the network becomes specific to the video sequence.
The biggest challenge in target tracking is the deformation of the target: after the target has moved for a period of time, deformation more or less always appears, and at that point the information from the first frame can no longer fully describe the target itself. Therefore updating the target online, i.e. fine-tuning the network online, is vital.
(5) Step 5: re-detection after loss. The dual classification network is designed precisely to handle the case where the target is lost during tracking. If the positive-sample score of the positive/negative classification layer is less than a certain threshold, the sample is judged to be lost and the tracking result is entirely background. At this time a particle-sampling strategy is adopted: assuming that the target only moves on a small scale between consecutive frames, the current frame is sampled according to a Gaussian distribution centered at the previous target position; the samples are fed into the network, and the sample with the highest score in the positive/negative classification layer is taken as the target position of the current frame.
Working principle: the present invention feeds only a single sample into the network at a time for target tracking, which reduces the computational burden of the network in the tracking phase and improves the tracking speed. The algorithm classifies the target information and trains the network to judge the target's orientation information, thereby obtaining the true position of the target. Using a deep network, the target can be described better and the robustness of tracking is improved. Scale change is placed in the classification layer, so there is no need to train a separate regression model, which speeds up the algorithm.
Advantages: under a deep-learning network framework, the present invention adds orientation-information classification and positive/negative sample classification, so that the computational load of the deep network is low and the speed is high. At the same time, through the strategy of fine-tuning the network parameters online, the network adapts to the target's deformation at different times when deformation occurs, and the invention can still complete the tracking task well. Finally, a re-detection strategy is introduced during tracking, so the case where the target is lost during tracking can also be handled well. The main innovations of the present invention are: (1) the proposed orientation-information learning network adapts well to target tracking; (2) single-sample tracking with the sliding-window strategy reduces the computational burden of the network and increases the tracking speed of the deep network; (3) scale change is handled as one of the classification outputs of the network, so that sequences with scale change are also tracked robustly; (4) the positive/negative sample re-detection strategy guarantees that the target is not lost.
Description of the drawings
Fig. 1 is the system flow chart of the deep-learning orientation-information target tracking of the present invention;
Fig. 2 is the schematic diagram of the deep network architecture of the present invention;
Fig. 3 is the schematic diagram of the orientation-information division of the present invention;
Fig. 4 is the sample classification effect diagram of the present invention;
Fig. 5 is the schematic diagram of the orientation-information sliding-window tracking of the present invention;
Fig. 6 shows sample frames of the tracking results of the present invention on 6 test videos;
Fig. 7 compares the overall tracking performance of the present invention with that of 8 trackers under the OPE evaluation mode;
Fig. 8 compares the overall tracking performance of the present invention with that of 8 trackers under the OPE evaluation mode for three challenge factors.
Specific embodiment
The deep-learning orientation-information target tracking method provided by the invention, whose flow chart is shown in Fig. 1, specifically includes the following operating steps:
(1) step 1: pre-training network.The present invention is such as schemed using three-layer coil product three layers of fully-connected network frame of neural network Shown in 2.But third layer fully-connected network uses two-way sorter network.Directional information classification layer FC6, carries out target position letter The judgement of breath;Positive and negative classification layer FC7 carries out the judgement of target and background.The initialization of three-layer coil lamination uses before present networks It is VGG-M, the convolutional network using large-scale dataset training has good Generalization Capability and transfer learning ability.The present invention adopts Trained network weight is used to improve a lot as the initialization of training network to the tracking learning ability of network.
The end of the network framework uses a dual classification network: one classification branch learns the orientation information and the other learns the positive/negative information of the sample. As shown in Fig. 2, FC6 is the eleven-class orientation-information classification branch and FC7 is the positive/negative classification branch, which classifies target and background information. The training output of each classification branch uses a softmax cross-entropy loss function, and the network weights are updated by stochastic gradient descent with back-propagation, with 100 training cycles. The training data set consists of 58 video sequences from VOT2013, VOT2014 and VOT2015. Unlike past pre-training on large-scale classification data, pre-training on video sequences makes the network more specific to tracking and better suited for tracking training.
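A minimal PyTorch sketch of the network of Fig. 2 follows. Only the overall shape comes from the text: three convolutional layers initialized from VGG-M, fully connected layers, an 11-way orientation head FC6 and a 2-way positive/negative head FC7. The filter sizes, the 107x107 input crop and the fc4/fc5 widths are MDNet/VGG-M-style assumptions, and copying the VGG-M weights into conv1-3 is not shown.

```python
import torch
import torch.nn as nn

class DirectionScaleNet(nn.Module):
    """Sketch of the two-head network: conv1-3 and fc4-5 shared, FC6/FC7 heads."""
    def __init__(self):
        super().__init__()
        self.conv = nn.Sequential(                    # conv1-3, to be initialised from VGG-M
            nn.Conv2d(3, 96, 7, stride=2), nn.ReLU(), nn.LocalResponseNorm(5), nn.MaxPool2d(3, 2),
            nn.Conv2d(96, 256, 5, stride=2), nn.ReLU(), nn.LocalResponseNorm(5), nn.MaxPool2d(3, 2),
            nn.Conv2d(256, 512, 3), nn.ReLU(),
        )
        self.fc = nn.Sequential(                      # fc4, fc5 (shared by both heads)
            nn.Flatten(), nn.Linear(512 * 3 * 3, 512), nn.ReLU(),
            nn.Linear(512, 512), nn.ReLU(),
        )
        self.fc6 = nn.Linear(512, 11)                 # orientation head: D1-D8, S1, S2, T
        self.fc7 = nn.Linear(512, 2)                  # positive/negative head

    def forward(self, x):                             # x: [N, 3, 107, 107] crops (assumed size)
        feat = self.fc(self.conv(x))
        return self.fc6(feat), self.fc7(feat)

# Example: dir_scores, pn_scores = DirectionScaleNet()(torch.zeros(1, 3, 107, 107))
```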
Samples in the training stage are selected by computing the overlap ratio. In order to make the network learn more target orientation information, the overlap-ratio threshold α dividing positive and negative samples in the training stage uses the relatively small value 0.6. As in formula (1), l denotes the positive/negative label of a sample and IoU denotes the overlap ratio; if the overlap ratio of gs and gt is greater than 0.6 the sample is judged to be positive, otherwise negative. The number of samples per frame is set to 200 positive samples and 50 negative samples. The convolutional-layer learning rate β1 is set to 0.0001 and the fully-connected-layer learning rate β2 is set to 0.001.
(2) step 2: azimuth information classification.The present invention proposes ten one kind orientation such as Fig. 3 of depth e-learning positive sample, It is broadly divided into upper left (D1), a left side (D2), lower-left (D3), under (D4), bottom right (D5), the right side (D6), upper right (D7), upper (D8), small scale (S1), large scale (S2), true sample (T).
In Fig. 3, [xl, xr, yu, yd] respectively denote the left, right, upper and lower boundaries of the constraint frame used for orientation division. The boundaries are determined by the original target size of the sample, as shown in formulas (2) and (3). Let gt = [xt, yt, wt, ht] be the ground truth of the current frame, where xt is the x-coordinate of the target position, yt is the y-coordinate of the target position, wt is the target width along the x-axis and ht is the target height along the y-axis.
In formula (2), (3), α is the scaling factor of boundary limitation.The affiliated class of sample such as Fig. 3 will by formula (2), (3) restricted boundary determines.
C=ρ (gs,[xl,xr,yu,yd],[wt,ht]) (4)
Formula (4) describes the decision mechanism for classifying a sample. gs = [xs, ys, ws, hs] is the position information of the sample, and c is the output class of the sample. ρ denotes the decision rule of the invention: the constraint frame [xl, xr, yu, yd] and the sample position information gs are used to judge the sample's position class, as shown in formula (5); the scale decision mechanism together with the width and height [wt, ht] of the ground truth is used to judge the scale-change class, as shown in formula (6).
The class of a sample is judged from the region into which its x- and y-coordinates fall, as described by formula (5). If the sample falls into the central region of the constraint frame, its scale change is judged instead, by comparing the sample's ws and hs with the original ground-truth wt and ht. Unlike the multi-scale sample input adopted by most traditional algorithms to judge the target's scale change, the present invention fuses the scale change into the sample classification and places the scale decision directly in the classification layer of the deep network output, making full use of the powerful learning ability of the deep convolutional neural network. The precondition for judging a sample's scale change is that the sample point falls in the middle region of the constraint frame, i.e. xl < xs < xr and yu < ys < yd. The scale of a sample is judged mainly by formula (6), where γ and λ denote the scale-change factors. A sample point that falls inside the constraint frame and has no size change is judged to be of class T.
The final classification effect on samples is shown in Fig. 4, where two samples of each class are shown. The original frame is shown as Frame1; the samples come from the birds2 video sequence of the VOT2015 data set. The true sample of the target is shown as Truth, and the class of each sample is indicated by its subscript in Fig. 4. It can be seen that the positive samples are divided into 11 classes, the final effect is limited by the constraint factor of the constraint frame, and the gap between the samples of each class is very small. For example, in S1-1 and S1-2 it is clearly visible that the birds2 target is larger than the sample, so a size-increasing sliding-window operation is needed on the sample, while S2-1 and S2-2 show that the target is smaller than the sample, so a size-reducing sliding-window operation is needed. The differences between the samples of each class are almost at the pixel level, but each sample carries the orientation tendency of its class. The final results show that a powerful neural network can fully learn such small changes and make reasonable judgements; the final, excellent tracking effect is obtained precisely through the small pixel-level change of each sliding-window step.
(3) Step 3: sliding-window operation. The class of a target sample obtained from the classification network is not yet the result that target tracking requires; a processing mechanism is needed that gradually approaches the true position of the target while obtaining sample information. The present invention gradually approaches the true position of the target using a sliding-window method.
The sample at the current location is fed into the orientation-information classification network to obtain the specific orientation of the current location relative to the target, and the current sample window is slid toward the target accordingly. With the sliding-window operation the network receives only one sample at a time, realizing single-sample network computation and improving the computation speed.
Each orientation class has a corresponding sliding-window strategy; the specific implementation is shown in formula (7).
In the formula, θ denotes the number of pixels moved by the sliding-window operation, and gs+1 denotes the next sample position to be taken. When the sample gs+1 fed into the network yields class T as output, i.e. it is closest to the target, the sliding-window operation stops and the tracking result of the current frame is obtained.
The sliding-window strategy is executed at most 15 times per frame; if the target is not found within 15 slides, the loss mechanism is used. Experiments show that for most frames the target can be found within about 10 slides. It is precisely because the sliding-window mechanism is added that a large number of samples does not have to be taken for each tracking step, which reduces the computational burden of the deep network and speeds up tracking.
Fig. 5 is the specific schematic diagram of the sliding window; the video sequence is Bird2 from OTB100, and the first, second, third and fourth rows respectively show the tracking of frames 3, 25, 37 and 73. The subscript of each picture label indicates the orientation class of the current sample. Taking the picture series in the first row of Fig. 5 as an example: the first picture of the row shows the current frame; the second is the sample taken at the result of the previous frame, which, fed into the network of the invention, is classified as D1; sliding according to D1 gives the second sample, which the network classifies as D2; sliding according to D2 gives the third sample, and so on, until the output of the sixth sample is judged to be class T, which becomes the tracking result of this frame. The sliding-window sampling of the other rows proceeds in the same way. As can be seen from the figure, the algorithm proposed herein generally obtains an accurate tracking result within about 10 slides.
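The per-frame procedure just described can be summarized in the sketch below. `crop` (which would cut and resize the image patch at a box) is a hypothetical helper, `slide` is the assumed update sketched earlier, and the class index order is an assumption; the 15-slide limit and the stop on class T come from the text.

```python
CLASSES = ["D1", "D2", "D3", "D4", "D5", "D6", "D7", "D8", "S1", "S2", "T"]

def track_frame(net, frame, g_prev, max_slides=15):
    """Single-sample tracking of one frame: slide until class T or the limit is hit."""
    g = list(g_prev)                                  # start from the previous frame's result
    for _ in range(max_slides):
        dir_scores, _ = net(crop(frame, g))           # one sample per forward pass
        cls = CLASSES[int(dir_scores.argmax())]
        if cls == "T":                                # closest to the target: stop sliding
            return g, True
        g = slide(g, cls)                             # move/rescale the window toward the target
    return g, False                                   # not found within 15 slides: go to re-detection
```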
(4) step 4: on-line fine.The network of pre-training utilizes a large amount of video sequence, there is good Generalization Capability, but It is that still cannot obtain tracking effect well directly to specific video sequence using no specific aim.So in first frame Network weight is finely adjusted using the strategy as pre-training, so that network is targeted to the video sequence, the present invention 500 positive samples are chosen in first frame, 200 negative samples initialize network, and Duplication is set as 0.7.
The biggest challenge in target tracking is the deformation of the target: after the target has moved for a period of time, deformation more or less always appears, and at that point the information from the first frame can no longer fully describe the target itself, so updating the target online, i.e. fine-tuning the network online, is vital. In the online tracking phase of the present invention, whenever 20 frames have been continuously tracked, the target is resampled, taking 200 positive samples and 50 negative samples with the overlap ratio unchanged, and the network is fine-tuned; in this way the network's ability to handle various deformations is strengthened.
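A sketch of this update schedule is given below. `sample_around` (drawing labelled samples near the current result) is a hypothetical helper, and the number of fine-tuning iterations is an assumption; only the 20-frame interval, the 200/50 sample counts and the unchanged overlap threshold come from the text. `train_step` is the update sketched earlier.

```python
def maybe_finetune(net, optimizer, frame, g, frame_idx, interval=20, iters=10):
    """Every `interval` tracked frames, resample the target and fine-tune the network."""
    if frame_idx == 0 or frame_idx % interval != 0:
        return
    patches, dir_labels, pn_labels = sample_around(   # hypothetical sampler
        frame, g, n_pos=200, n_neg=50, iou_thr=0.7)   # overlap threshold as set in the first frame
    for _ in range(iters):                            # number of iterations is an assumption
        train_step(net, optimizer, patches, dir_labels, pn_labels)
```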
(5) step 5: loss detects again.Double sorter networks are exactly that the case where losing is tracked for processing target, positive and negative Sample classification layer positive sample score is less than certain threshold value and then judges that the positive sample has been lost, and tracking result is entirely background, this Shi Caiyong particle sampler strategy, default objects are all small-scale movements in consecutive frame, then utilize height in former frame target position The rule of this distribution samples present frame, and taking sample number is 300, and input network chooses positive sample classification layer score most High sample is the target position of present frame.
Evaluation criteria: the present invention measures the performance of the tracker by the OPE (one-pass evaluation) criterion, in which the tracker evaluates each video sequence in a single pass and the traditional one-pass accuracy and success rate are assessed. 58 video sequences with different attributes are chosen to test the target tracking method of the invention, and it is compared with 8 other trackers (CNT, CSK, HCF, KCF, RPT, SAMF, SRDCF and Staple) under different challenge factors such as out-of-plane rotation, scale change, illumination change, fast motion and occlusion. Fig. 6 shows sample frames of the tracking results of the present invention and the 8 trackers on the 6 test videos (a) Bird2, (b) Bolt, (c) Couple, (d) Football1, (e) Jogging and (f) MotorRolling. Fig. 7 gives the performance comparison between the present invention and the 8 other trackers in terms of precision and success rate. Fig. 8 compares the overall tracking performance of the present invention with the 8 trackers under OPE for the three challenge factors out-of-plane rotation, scale change and illumination change; from both precision and success rate it can be seen that the proposed algorithm performs very well. It can be seen that, compared with existing algorithms, the target tracking method provided by the invention significantly improves accuracy and gives a more stable tracking result.
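For reference, the OPE quantities plotted in Figs. 7 and 8 follow the standard OTB-style definitions, which the patent does not spell out: precision is the fraction of frames whose centre-location error is within a pixel threshold, and success is the fraction of frames whose overlap with the ground truth exceeds an overlap threshold (the success plot sweeps that threshold). The sketch below assumes the same [x, y, w, h] centre-based boxes and the `iou` helper from the earlier sketch.

```python
def precision(pred_boxes, gt_boxes, thr=20.0):
    """Fraction of frames with centre-location error <= thr pixels."""
    hits = sum(((p[0] - g[0]) ** 2 + (p[1] - g[1]) ** 2) ** 0.5 <= thr
               for p, g in zip(pred_boxes, gt_boxes))
    return hits / len(gt_boxes)

def success(pred_boxes, gt_boxes, thr=0.5):
    """Fraction of frames with IoU overlap > thr."""
    return sum(iou(p, g) > thr for p, g in zip(pred_boxes, gt_boxes)) / len(gt_boxes)
```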

Claims (7)

1. A target tracking method based on an orientation- and scale-discriminating deep network, characterized in that it comprises the following steps:
Step (1), pre-training the network;
Step (2), orientation-information classification;
Step (3), sliding-window operation;
Step (4), online fine-tuning;
Step (5), re-detection after loss.
2. The target tracking method based on an orientation- and scale-discriminating deep network according to claim 1, characterized in that: in step (1), a network framework of three convolutional layers and three fully connected layers is used, and the third fully connected layer uses a two-branch classification network.
3. The target tracking method based on an orientation- and scale-discriminating deep network according to claim 2, characterized in that: the training output of the classification network uses a softmax cross-entropy loss function, and the network weights are updated by stochastic gradient descent and back-propagation.
4. The target tracking method based on an orientation- and scale-discriminating deep network according to claim 3, characterized in that: the training samples are selected by computing the overlap ratio.
5. The target tracking method based on an orientation- and scale-discriminating deep network according to claim 1, characterized in that: in step (2), the class of a sample is judged according to its orientation relative to the target and its scale.
6. The target tracking method based on an orientation- and scale-discriminating deep network according to claim 1, characterized in that: in step (3), the true position of the target is gradually approached by a sliding window.
7. The target tracking method based on an orientation- and scale-discriminating deep network according to claim 1, characterized in that: in step (4), after a number of frames have been continuously tracked, the target is resampled and the network is fine-tuned.
CN201811403020.8A 2018-11-23 2018-11-23 Target tracking method based on an orientation- and scale-discriminating deep network Pending CN109801310A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811403020.8A CN109801310A (en) 2018-11-23 2018-11-23 Target tracking method based on an orientation- and scale-discriminating deep network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811403020.8A CN109801310A (en) 2018-11-23 2018-11-23 Target tracking method based on an orientation- and scale-discriminating deep network

Publications (1)

Publication Number Publication Date
CN109801310A true CN109801310A (en) 2019-05-24

Family

ID=66556371

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811403020.8A Pending CN109801310A (en) 2018-11-23 2018-11-23 Target tracking method based on an orientation- and scale-discriminating deep network

Country Status (1)

Country Link
CN (1) CN109801310A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111209813A (en) * 2019-12-27 2020-05-29 南京航空航天大学 Remote sensing image semantic segmentation method based on transfer learning
CN111274917A (en) * 2020-01-17 2020-06-12 江南大学 Long-term target tracking method based on depth detection
CN112150510A (en) * 2020-09-29 2020-12-29 中国人民解放军63875部队 Stepping target tracking method based on double-depth enhanced network
CN114613004A (en) * 2022-02-28 2022-06-10 电子科技大学 Lightweight online detection method for human body actions

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103295242A (en) * 2013-06-18 2013-09-11 南京信息工程大学 Multi-feature united sparse represented target tracking method
CN105069488A (en) * 2015-09-25 2015-11-18 南京信息工程大学 Tracking method based on template on-line clustering
CN106651921A (en) * 2016-11-23 2017-05-10 中国科学院自动化研究所 Motion detection method and moving target avoiding and tracking method

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103295242A (en) * 2013-06-18 2013-09-11 南京信息工程大学 Multi-feature united sparse represented target tracking method
CN105069488A (en) * 2015-09-25 2015-11-18 南京信息工程大学 Tracking method based on template on-line clustering
CN106651921A (en) * 2016-11-23 2017-05-10 中国科学院自动化研究所 Motion detection method and moving target avoiding and tracking method

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
ZHAOHUA HU ET AL.: "Deep Directional Network for Object Tracking", 《ALGORITHMS》 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111209813A (en) * 2019-12-27 2020-05-29 南京航空航天大学 Remote sensing image semantic segmentation method based on transfer learning
CN111274917A (en) * 2020-01-17 2020-06-12 江南大学 Long-term target tracking method based on depth detection
CN112150510A (en) * 2020-09-29 2020-12-29 中国人民解放军63875部队 Stepping target tracking method based on double-depth enhanced network
CN112150510B (en) * 2020-09-29 2024-03-26 中国人民解放军63875部队 Stepping target tracking method based on dual-depth enhancement network
CN114613004A (en) * 2022-02-28 2022-06-10 电子科技大学 Lightweight online detection method for human body actions

Similar Documents

Publication Publication Date Title
CN109543606B (en) Human face recognition method with attention mechanism
WO2020108362A1 (en) Body posture detection method, apparatus and device, and storage medium
CN109801310A (en) Target tracking method based on an orientation- and scale-discriminating deep network
CN109816689A (en) A kind of motion target tracking method that multilayer convolution feature adaptively merges
CN107481264A (en) A kind of video target tracking method of adaptive scale
Xiao et al. Robust multipose face detection in images
CN110458059B (en) Gesture recognition method and device based on computer vision
CN108171112A (en) Vehicle identification and tracking based on convolutional neural networks
CN109858406A (en) A kind of extraction method of key frame based on artis information
CN107767405A (en) A kind of nuclear phase for merging convolutional neural networks closes filtered target tracking
CN106960206A (en) Character identifying method and character recognition system
CN108510521A (en) A kind of dimension self-adaption method for tracking target of multiple features fusion
CN109859241B (en) Adaptive feature selection and time consistency robust correlation filtering visual tracking method
CN107103326A (en) The collaboration conspicuousness detection method clustered based on super-pixel
CN108647694A (en) Correlation filtering method for tracking target based on context-aware and automated response
CN108647654A (en) The gesture video image identification system and method for view-based access control model
CN110263712A (en) A kind of coarse-fine pedestrian detection method based on region candidate
CN113763424B (en) Real-time intelligent target detection method and system based on embedded platform
Nguyen et al. Yolo based real-time human detection for smart video surveillance at the edge
CN109087337B (en) Long-time target tracking method and system based on hierarchical convolution characteristics
CN108921011A (en) A kind of dynamic hand gesture recognition system and method based on hidden Markov model
CN108564598A (en) A kind of improved online Boosting method for tracking target
Putro et al. High performance and efficient real-time face detector on central processing unit based on convolutional neural network
Sheng et al. Robust visual tracking via an improved background aware correlation filter
Engoor et al. Occlusion-aware dynamic human emotion recognition using landmark detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
Address after: 210044 No. 219 Ningliu Road, Jiangbei New District, Nanjing City, Jiangsu Province
Applicant after: Nanjing University of Information Science and Technology
Address before: 211500 Yuting Square, 59 Wangqiao Road, Liuhe District, Nanjing City, Jiangsu Province
Applicant before: Nanjing University of Information Science and Technology
RJ01 Rejection of invention patent application after publication
Application publication date: 20190524