CN108460790A - A visual tracking method based on a conformal predictor model - Google Patents
A visual tracking method based on a conformal predictor model — Download PDF / Info
- Publication number: CN108460790A
- Application number: CN201810270188.XA
- Authority
- CN
- China
- Prior art keywords
- target
- sample
- consistency
- value
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS › G06—COMPUTING; CALCULATING OR COUNTING › G06T—IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
- G06T7/00—Image analysis › G06T7/20—Analysis of motion › G06T7/246—Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
- G06T2207/00—Indexing scheme for image analysis or image enhancement › G06T2207/10—Image acquisition modality › G06T2207/10016—Video; Image sequence
- G06T2207/00 › G06T2207/20—Special algorithmic details › G06T2207/20081—Training; Learning
- G06T2207/00 › G06T2207/20—Special algorithmic details › G06T2207/20084—Artificial neural networks [ANN]
- G06T2207/00 › G06T2207/30—Subject of image; Context of image processing › G06T2207/30241—Trajectory
Abstract
The invention belongs to the technical field of data processing and discloses a visual tracking method based on a conformal predictor model. The method comprises: first, building a dual-input convolutional neural network (CNN) model that simultaneously extracts high-level features from the sampled regions of a video frame and from a target template, and distinguishes target from background regions by logistic regression; then, embedding the CNN into a conformal predictor framework, assessing the reliability of the classification results by an algorithmic randomness test, and, at a specified risk level, outputting classification results with confidence measures in the form of region predictions; finally, selecting high-confidence regions as candidate target regions and obtaining the target trajectory by optimizing a spatio-temporal global energy function. The present invention adapts to complex situations such as target occlusion, appearance variation and background clutter, and is more robust and accurate than a variety of currently popular tracking algorithms.
Description
Technical field
The invention belongs to the technical field of data processing, and more particularly relates to a visual tracking method based on a conformal predictor model.
Background technology
Visual target tracking is a fundamental problem in computer vision; its task is to determine the motion state of a target in a video, including its position, velocity and trajectory. Although visual tracking technology has made great progress in recent years, realizing robust tracking under complex conditions such as target occlusion, pose variation and cluttered background still faces enormous challenges.
In visual tracking, the feature representation of the target is one of the important factors affecting tracking performance. Features used to express the target should adapt to changes in target appearance while discriminating well against the background. A large number of feature extraction methods, such as Haar and HOG, have been applied to visual tracking; most of these are hand-designed low-level features that are relatively task-specific and not robust to target variation. In recent years, the convolutional neural network (CNN) in deep learning has been widely used in target detection, image classification, semantic segmentation and other fields. Compared with traditional hand-crafted features, features learned automatically by a CNN can capture the high-level semantic information of the target and are more robust to appearance changes, so CNNs have gradually been introduced into the solution of target tracking problems. However, when tracking with deep features, a common problem is that a large number of samples are needed to train and update the CNN parameters, while for a visual tracking task it is usually difficult to obtain many training samples of the tracked target in advance. Therefore, the effective training and updating of CNN parameters is the main problem faced when applying CNNs to tracking.
On the other hand, in CNN-based target tracking methods, after the target features are extracted by the CNN, tracking is typically realized by a discriminative method [7-8]. The basic idea is to treat target tracking as a binary classification problem over image regions: a classifier divides image regions into target and background, and the final trajectory is obtained from the per-frame classification results. The reliability of the classification results is the key to the success or failure of tracking; however, most current classification algorithms lack a reliability analysis of their output, i.e., a quantified confidence measure of the extent to which the result is correct. If the classification result at each moment could be effectively assessed, providing a reliable information basis for target state estimation and for updating the feature model parameters, the accuracy and robustness of tracking would be greatly improved.
In summary, the problems existing in the prior art are: existing visual tracking methods track video sequences with poor robustness; they cannot adapt to complex situations such as target occlusion, appearance variation and background clutter; and many existing tracking algorithms have poor accuracy.
Summary of the invention
In view of the problems in the prior art, the present invention provides a visual tracking method based on a conformal predictor model.
The invention is realized as follows. A visual tracking method based on a conformal predictor model comprises:
first, building a dual-input convolutional neural network model that simultaneously extracts high-level features from the sampled regions of a video frame and from a target template, and distinguishes target from background regions by logistic regression;
then, embedding the convolutional neural network into a conformal predictor framework, assessing the reliability of the classification results by an algorithmic randomness test, and, at a specified risk level, outputting classification results with confidence measures in the form of region predictions;
finally, selecting high-confidence regions as candidate target regions and obtaining the target trajectory by optimizing a spatio-temporal global energy function.
Further, the visual tracking method based on the conformal predictor model specifically includes:
Input: the initial target state x0, a pre-trained CNN, and an image sequence of length N;
Output: the target trajectory T.
Initialization phase, including:
taking the image region corresponding to x0 as the input template of the CNN;
collecting positive and negative samples around x0, establishing the training set Τ, and dividing it into a proper training set Τa and a calibration set Τb;
using Τa to train and adjust the fully connected layer and output layer of the CNN.
Tracking phase, including:
dividing the image sequence into K segments and processing segments k = 1, ..., K in turn;
estimating the target trajectory of the k-th segment;
updating the training set Τ: selecting highly credible tracking results according to their p-values to update the training set, and mining hard negative samples to add to Τ;
linking the target trajectory T ← T ∪ Tk; if the last segment has been processed, outputting the trajectory T, otherwise setting k = k + 1 and returning to the step of estimating the target trajectory of the k-th segment.
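The segment-wise tracking loop above can be sketched as follows. This is a minimal sketch; `estimate_segment` and `update_training_set` are hypothetical names standing in for the CNN/ICP machinery described below:

```python
import numpy as np

def track_sequence(frames, n_segments, estimate_segment, update_training_set):
    """Semi-offline tracking loop: split the sequence into K segments,
    estimate each segment's trajectory, update the model/training set
    from high-confidence results, and link the segment trajectories."""
    segments = np.array_split(np.arange(len(frames)), n_segments)
    trajectory = []
    for seg in segments:
        seg_track = estimate_segment(seg)   # trajectory T_k of segment k
        update_training_set(seg_track)      # keep highly credible results
        trajectory.extend(seg_track)        # T <- T ∪ T_k
    return trajectory
```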
Further, in estimating the target trajectory of the k-th segment, the processing includes establishing the candidate target sets of all frames:
(1) letting the current time be t and Ot = φ; centering on the target state with the highest p-value in the image at time t-1, performing Gaussian random sampling over position and scale to obtain M samples, where the Gaussian covariance is the diagonal matrix Diag(0.1r², 0.1r², 0.2) and r is the average of the length and width of the previous target box;
(2) calculating the regression values of the samples with the CNN;
(3) calculating the credibility (p-value) of each sample from the calibration set Τb using formula (3);
(4) according to the risk threshold ε, obtaining the region prediction result of each sample using formula (4); choosing the samples whose output result is {C+} or {C+, C-} and whose credibility p(C+) ranks in the top Nc, and adding them to the candidate target set Ot;
(5) letting t = t + 1; if t > nl the processing ends, otherwise going to (1).
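Step (1), the Gaussian sampling around the previous best state, can be sketched as follows. The state layout (x, y, log-scale) is an assumption for illustration:

```python
import numpy as np

def sample_candidates(state, r, M=100, rng=None):
    """Draw M candidate states around the previous best state.
    Covariance is the diagonal matrix Diag(0.1 r^2, 0.1 r^2, 0.2),
    where r is the mean of the previous box's width and height."""
    rng = np.random.default_rng() if rng is None else rng
    cov = np.diag([0.1 * r**2, 0.1 * r**2, 0.2])
    return rng.multivariate_normal(np.asarray(state, float), cov, size=M)
```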
Further, in estimating the target trajectory of the k-th segment, the processing further includes obtaining the target trajectory of the k-th segment by optimizing the energy function ETrack.
Further, the dual-input convolutional neural network model includes the following CNN network structure: the target template and the image to be identified enter the network simultaneously as two inputs; after features are extracted by the convolutional layers, they are merged in the fully connected layers to form discriminative features, and finally logistic regression is performed in the output layer to realize classification. The target template is obtained manually in the first frame of the image sequence, while the image to be identified is a local region sampled from the sequence images. The CNN structure contains two independent sets of convolutional layers, which share the same structure and parameters. The two inputs are mapped to high-level features by the convolutional layers, then merged in the fully connected layers and further mapped to features that discriminate between target and background. The output layer is a logistic regression classifier that predicts, by logistic regression, whether the input sample belongs to the target or the background class.
Further, the dual-input convolutional neural network model further includes network parameter training:
The CNN convolutional layers are trained offline on a data set in advance so that they can extract general target features. In pre-training, the CNN structure is a single-input structure, and the trained parameters are shared by the two sets of convolutional layers.
The output layer of the CNN is set to 10 units; after pre-training, the output layer is replaced by 1 unit, corresponding to the binary classification of the tracking task. The pre-trained CNN is then fine-tuned according to the actual tracking task. During tracking, the pre-trained convolutional layer parameters are fixed, and only the fully connected layer and output layer parameters are updated online to adapt to changes of target and background.
For the establishment of the training set, in the tracking initialization phase the target region in the first frame is chosen manually, and positive and negative training samples are sampled around it; the positive or negative attribute of a sample is judged by its overlap with the target region, with the overlap threshold set to 0.5.
Data augmentation is realized by applying random scale and rotation transformations to the samples. In subsequent tracking, through the risk assessment of the classification results, training samples are drawn centered on tracking results that satisfy the credibility condition. Let the training set be T = {(x(1), y(1)), ..., (x(n), y(n))}, where y(i) ∈ {C- = 0, C+ = 1}, the class label C- denotes background and C+ denotes target, and x(i) ∈ Zd is the target state vector, including position and scale. In the output layer, logistic regression is used to calculate the probability that a sample belongs to the target or the background:
R(y|x; θ) = hθ(x)^y (1 - hθ(x))^(1-y)    (1);
where hθ(x) = 1/(1 + e^(-θᵀx)) is the logistic function output and θ denotes the network model parameters. The model is trained with the training set T so that the log-likelihood loss function L(θ) reaches its minimum:
L(θ) = -Σi [ y(i) log hθ(x(i)) + (1 - y(i)) log(1 - hθ(x(i))) ]    (2);
The network weights and biases are adjusted along the negative gradient direction of L(θ) by stochastic gradient descent, and the parameters of each layer above the convolutional layers are updated iteratively by back-propagation.
Further, the conformal predictor uses an improved CP algorithm to predict sample classes. In the improved algorithm, it is first assumed that the samples in the training set are independently and identically distributed, and the training set T = {(x(1), y(1)), ..., (x(n), y(n))} is divided into two parts: the first m samples form the proper training set Ta = {(x(1), y(1)), ..., (x(m), y(m))}, and the remaining q samples form the calibration set Tb = {(x(m+1), y(m+1)), ..., (x(m+q), y(m+q))}, with n = m + q. The proper training set Ta is used to update the CNN parameters, while the calibration set Tb, together with the sample to be identified, constitutes a test sequence whose algorithmic randomness test determines the sample class.
The algorithmic randomness test of the sequence is as follows. First a mapping function A: Z(q-1) × Z → R is defined, which maps each sample of the calibration set Tb to the nonconformity (singular value) space one by one, giving the score sequence αm+1, ..., αm+q.
Let the target state of the sample to be identified be xs. Assigning xs the class labels C- and C+ respectively constitutes two test samples (xs, yi), i = 0, 1. After computing the nonconformity score of each test sample, it is combined with the scores of the calibration set Tb to form two test sequences. By calculating the test statistic, the p-value, the algorithmic randomness level of each sequence is obtained:
ps(yi) = ( #{ j = m+1, ..., m+q : αj ≥ αs(yi) } + 1 ) / (q + 1)    (3);
where ps(yi) denotes the p-value when the target state xs is labeled yi, i.e., the credibility that xs belongs to class yi.
Given the algorithm risk level threshold ε, the hypotheses whose p-values exceed ε are output as the ICP prediction region:
Γε = { yi : ps(yi) > ε }    (4);
When the true class ys of xs is not in Γε, a prediction error is deemed to have occurred. According to the validity theorem of conformal predictors, the error rate is not greater than the algorithm risk level ε, i.e.:
P{ps(ys) ≤ ε} ≤ ε    (5);
In the algorithmic randomness test of the sequence, the nonconformity (singular value) mapping function is first defined to measure the degree to which the sample to be tested conforms to the overall sample distribution. Conformity is analyzed according to the regression value output by the CNN: the larger the regression value of the sample's features for its true class, the stronger the conformity of the sample with the calibration sequence; the nonconformity function is defined accordingly, where Ry(x(i)) is the regression value of x(i) for class y calculated by formula (1), and the parameter γ adjusts the sensitivity of the score ai to changes in the regression value: the smaller γ is, the more sensitive ai is to changes in Ry(x(i)).
Further, the result output by the improved CP algorithm may include multiple classes. For the binary classification of the sample to be identified, the improved algorithm outputs one of four results: φ, {C-}, {C+}, {C+, C-}. Each output result is accompanied by a credibility p-value in addition to the class information. According to the region prediction results of all samples, the highly credible samples of each frame are selected as candidate targets.
Specifically, for the image frame at time t, the samples in the frame whose output is {C+} or {C+, C-} are sorted by the credibility p(C+), and the Nc samples with the largest values are chosen to establish the candidate target set Ot, where |Ot| ≤ Nc.
The candidate target set Ot contains several possible states of the target at time t; the target transfers from some state in Ot to some state in the candidate target set Ot+1 at the next time.
Target tracking thus amounts to finding an optimal path. A spatio-temporal energy function ETrack is defined to characterize the target trajectory, which is obtained by optimizing the energy function.
ETrack consists of two parts, a local cost term ELocal and a pairwise cost term EPairwise.
ELocal is defined from the CNN output values of the target state xt at each moment corresponding to the background. Since the local cost term can be degraded under partial target occlusion, a robust estimator is introduced to reduce the influence of outliers on the function optimization; in the definition of ELocal, a Huber operator ρ(·) is applied to the regression value of the target state x for the background, which enhances the reliability of the local cost term.
EPairwise describes the degree of variation of the target state. When target occlusion, cluttered background or target appearance changes occur in the sequence, the target state may jump abruptly because the estimation error is large. Assuming that the motion of the target is coherent, the role of EPairwise is to penalize mutation points in the trajectory when the energy function is optimized, so that the trajectory has a certain smoothness.
The energy function in formula (7) is optimized by dynamic programming to obtain the optimal trajectory.
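Under the assumption that ELocal contributes one cost per candidate per frame and EPairwise a transition cost between candidates of consecutive frames, the dynamic-programming optimization of the energy function can be sketched as a Viterbi-style recursion:

```python
import numpy as np

def optimal_track(local_costs, pair_costs):
    """Minimise E = Σ_t E_local[t][i_t] + Σ_t E_pair[t][i_t, i_{t+1}]
    over one candidate index per frame by dynamic programming.
    local_costs: list of T 1-D arrays (one cost per candidate in frame t);
    pair_costs:  list of T-1 2-D arrays (transition cost between candidates).
    Returns the index of the chosen candidate in every frame."""
    D = np.asarray(local_costs[0], float)      # best cost ending at each candidate
    back = []
    for t in range(1, len(local_costs)):
        trans = D[:, None] + np.asarray(pair_costs[t - 1], float)
        back.append(np.argmin(trans, axis=0))  # best predecessor per candidate
        D = np.asarray(local_costs[t], float) + np.min(trans, axis=0)
    path = [int(np.argmin(D))]                 # trace the optimal path backwards
    for b in reversed(back):
        path.append(int(b[path[-1]]))
    return path[::-1]
```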
Further, training sample updating includes: during tracking, the CNN model parameters are updated with the tracking results of the previous sequence segment, and then the next segment is processed. The tracking result at time t is selected according to its credibility p-value: if p exceeds the set threshold α, positive and negative training samples are sampled around it; otherwise the judgment is deferred to the next time step.
Another object of the present invention is to provide a robust visual tracking system applying the visual tracking method based on the conformal predictor model.
The present invention extracts high-level image features with a convolutional neural network for the feature representation of the target, overcoming the sensitivity of low-level features to target appearance changes. To adapt to different types of tracked targets, a dual-input network structure is designed that, in combination with the target template, distinguishes target from background regions by logistic regression. To further improve tracking robustness, a conformal predictor is introduced to perform a reliability analysis of the classification results; the classification results that satisfy the credibility condition are selected as candidate target regions, and the final target trajectory is obtained by spatio-temporal global energy function optimization. Contrast experiments with a variety of currently popular tracking algorithms on public data sets show that the present invention adapts to complex situations such as target occlusion, appearance variation and background clutter, and that the algorithm of the invention has better tracking robustness and accuracy.
Description of the drawings
Fig. 1 is a flow chart of the visual tracking method based on the conformal predictor model provided by an embodiment of the present invention.
Fig. 2 is a schematic diagram of the visual tracking method based on the conformal predictor model provided by an embodiment of the present invention.
Fig. 3 is a schematic diagram of the dual-input CNN network structure provided by an embodiment of the present invention.
Fig. 4 shows target tracking results provided by an embodiment of the present invention;
in the figure: (a) FaceOcc1; (b) Bolt; (c) Football; (d) CarDark.
Fig. 5 shows the center position errors of the tracking results provided by an embodiment of the present invention;
in the figure: (a) FaceOcc1; (b) Bolt; (c) Football; (d) CarDark.
Fig. 6 shows the overlap rates of the tracking results provided by an embodiment of the present invention;
in the figure: (a) FaceOcc1; (b) Bolt; (c) Football; (d) CarDark.
Fig. 7 shows the one-pass evaluation (OPE) results provided by an embodiment of the present invention;
in the figure: (a) position precision plot; (b) overlap success plot.
Fig. 8 shows the one-pass evaluation results of the algorithm on all test sequences provided by an embodiment of the present invention;
in the figure: (a) position precision plot; (b) overlap success plot.
Detailed description of the embodiments
In order to make the purpose, technical scheme and advantages of the present invention clearer, the present invention is further elaborated below with reference to the embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit it.
At present, tracking of video sequences is insufficiently robust: existing methods cannot adapt to complex situations such as target occlusion, appearance variation and background clutter, and many existing tracking algorithms have poor accuracy.
The application principle of the present invention is described in detail below in conjunction with the accompanying drawings.
As shown in Fig. 1, the visual tracking method based on the conformal predictor model provided by an embodiment of the present invention includes:
S101: first, building a dual-input convolutional neural network model that simultaneously extracts high-level features from the sampled regions of a video frame and from a target template, and distinguishes target from background regions by logistic regression.
S102: then, embedding the convolutional neural network into a conformal predictor framework, assessing the reliability of the classification results by an algorithmic randomness test, and, at a specified risk level, outputting classification results with confidence measures in the form of region predictions.
S103: finally, selecting high-confidence regions as candidate target regions and obtaining the target trajectory by optimizing a spatio-temporal global energy function. Experimental results show that the algorithm adapts to complex situations such as target occlusion, appearance variation and background clutter, and is more robust and accurate than a variety of currently popular tracking algorithms.
The application principle of the present invention is further described below with reference to specific embodiments.
The schematic diagram of the visual tracking method based on the conformal predictor model provided by an embodiment of the present invention is shown in Fig. 2. The method is divided into two stages. The first stage is initialization: a dual-input CNN is built, whose convolutional layer parameters are trained in advance on a conventional image data set, while the other layers are trained with samples drawn manually in the first frame, giving the initial parameters of the model.
The second stage is tracking. Regions are sampled frame by frame from the image sequence; the CNN extracts the high-level features of each sample, logistic regression computes the regression value of the sample belonging to target or background, and CP then yields the sample class at the specified risk level. Highly credible target samples are selected to establish the candidate target set, and the final target trajectory is obtained by optimizing the spatio-temporal energy function defined on the candidate target sets. Long sequences are handled in a semi-offline manner: the whole video sequence is segmented, the tracking of each segment is processed in turn, the trajectories of the segments are linked, and the model parameters of the CNN are updated online segment by segment during tracking.
1) CNN target feature extraction and classification:
A CNN is a multilayer neural network specialized for lattice-structured data. Local image features are implicitly extracted by convolution kernels, with good invariance to displacement, scaling and other types of deformation. For the tracking problem, the network structure and the parameter training scheme are the key factors affecting CNN performance; the design of these two parts in the algorithm of the invention is explained below.
1.1) CNN network structure:
In target recognition applications, a CNN usually needs to be trained on massive data before it can accurately express target features, while for a specific tracking task it is often difficult to obtain sufficient training data in advance. A CNN designed for target recognition is therefore difficult to apply directly to target tracking and needs to be adjusted and improved.
Unlike target recognition, tracking does not need to attend to the specific type of the target, as long as it can be distinguished from the background. A dual-input CNN structure (shown in Fig. 3) is therefore adopted: the target template and the image to be identified enter the network simultaneously as two inputs; after features are extracted by the convolutional layers, they are merged in the fully connected layers to form discriminative features, and finally logistic regression is performed in the output layer to realize classification. The target template is obtained manually in the first frame of the sequence, while the image to be identified is a local region sampled from the sequence images. The network contains two independent sets of convolutional layers; to simplify the model, the two sets share the same structure and parameters. The two inputs are mapped to high-level features by the convolutional layers, then merged in the fully connected layers and further mapped to features that discriminate between target and background. The output layer is a logistic regression classifier that predicts the class of the input sample, i.e., target or background.
1.2) Network parameter training:
The CNN convolutional layers of the present invention are trained offline in advance on the CIFAR-10 data set, enabling them to extract general target features. In pre-training, the CNN structure is reduced to a single-input structure, and the trained parameters are then shared by the two sets of convolutional layers.
In addition, for the 10-class problem of the CIFAR-10 data, the output layer of the CNN is set to 10 units; after pre-training, the output layer is replaced by 1 unit, corresponding to the binary classification of the tracking task. The pre-trained CNN is fine-tuned according to the actual tracking task. During tracking, to improve the efficiency of parameter adjustment, the pre-trained convolutional layer parameters are fixed, and only the fully connected layer and output layer parameters are updated online to adapt to changes of target and background.
For the establishment of the training set, in the tracking initialization phase the target region in the first frame is chosen manually, and positive and negative training samples are sampled around it; the positive or negative attribute of a sample is judged by its overlap with the target region (threshold set to 0.5). To increase the number of training samples, data augmentation is realized by applying random scale and rotation transformations to the samples. In subsequent tracking, through the risk assessment of the classification results, training samples are drawn centered on tracking results that satisfy the credibility condition.
Let the training set be T = {(x(1), y(1)), ..., (x(n), y(n))}, where y(i) ∈ {C- = 0, C+ = 1}, the class label C- denotes background and C+ denotes target, and x(i) ∈ Zd is the target state vector, including position and scale. In the output layer, logistic regression is used to calculate the probability that a sample belongs to the target or the background:
R(y|x; θ) = hθ(x)^y (1 - hθ(x))^(1-y)    (1);
where hθ(x) = 1/(1 + e^(-θᵀx)) is the logistic function output and θ denotes the network model parameters. The model is trained with the training set T so that the log-likelihood loss function L(θ) reaches its minimum:
L(θ) = -Σi [ y(i) log hθ(x(i)) + (1 - y(i)) log(1 - hθ(x(i))) ]    (2);
The network weights and biases are adjusted along the negative gradient direction of L(θ) by stochastic gradient descent, and the parameters of each layer above the convolutional layers are updated iteratively by back-propagation.
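Formulas (1)-(2) and the gradient step can be sketched for a single logistic output unit as follows. This is a simplified sketch on raw feature vectors, not the full CNN:

```python
import numpy as np

def sigmoid(z):
    """Logistic function h_θ(x) = 1 / (1 + e^(-θᵀx)) applied to z = θᵀx."""
    return 1.0 / (1.0 + np.exp(-z))

def log_likelihood_loss(theta, X, y):
    """L(θ) = -Σ_i [ y_i log h(x_i) + (1 - y_i) log(1 - h(x_i)) ] (formula (2))."""
    h = sigmoid(X @ theta)
    return -np.sum(y * np.log(h) + (1 - y) * np.log(1 - h))

def sgd_epoch(theta, X, y, lr=0.1):
    """One pass of stochastic gradient descent along the negative gradient of L(θ)."""
    for xi, yi in zip(X, y):
        theta = theta - lr * (sigmoid(xi @ theta) - yi) * xi
    return theta
```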
2) the candidate target selection based on consistency fallout predictor:
Logistic regressand values provide foundation for sample class prediction, but Logistic regressand values itself can not be in theory
On the risk of prediction error is assessed.In order to realize the fail-safe analysis of prediction result, by CNN model insertions to CP frames
In frame, according to algorithmic theory of randomness level calculation sample class confidence level, and then candidate target is selected.
2.1) consistency fallout predictor:
Current machine learning algorithm mostly lacks prediction result effective fail-safe analysis, that is, passes through one
It is correct that the confidence level of quantization, which carrys out evaluation and foreca result to what extent, and the measurement standard of effectively confidence level is adjustable
Property.CP is a kind of machine learning normal form that can effectively export confidence level, is predicted using hypothesis testing method, and to pre-
It surveys result and reliability assessment is provided.
Traditional CP algorithm calculation amounts are very big, and to improve operation efficiency, the present invention uses the innovatory algorithm forecast sample of CP
Classification, i.e. consilience of induction fallout predictor (Inductive Conformal Predictor, ICP).In ICP algorithms, first
It is assumed that the sample in training set obeys independent same distribution, by training set T={ (x(1),y(1)),...,(x(n),y(n)) be divided into
Two parts:Preceding m sample forms normal training set Ta={ (x(1),y(1)),...,(x(m),y(m)), behind q sample group
At calibration set Tb={ (x(m+1),y(m+1)),...,(x(m+q),y(m+q)), n=m+q.Normal training set TaFor updating CNN ginsengs
Number, and calibration set TbChecking sequence is constituted with together with sample to be identified, sample class is determined using algorithmic theory of randomness inspection.
The algorithmic theory of randomness method of inspection is:Mapping function A is defined first:Z(q-1)× Z → R, by calibration set TbEach of
Sample is mapped to singular value space one by one, obtains unusual value sequence αm+1,...,αm+q.It is whole with sample that singular value reflects the sample
The inconsistency of body distribution.It is x to enable the dbjective state of sample to be identifieds, x is assigned respectivelysClass label C-And C+, to constitute
Two test samples (xs,yi), i=0,1.Calculate the singular value of test samplesAfterwards, with calibration set TbCorresponding singular value one
It rises and constitutes two checking sequencesBy calculating test statistics p value, the algorithm of sequence is obtained
Randomness is horizontal:
where ps(yi) denotes the p-value when the target state xs is labelled yi, i.e. the confidence that xs belongs to class yi.
Given an algorithm risk level threshold ε, ICP outputs the set of hypotheses whose p-value exceeds ε:
Γs(ε) = { yi : ps(yi) > ε }   (4)
If the true class ys of xs is not in Γs(ε), a prediction error has occurred; by the validity theorem of conformal predictors [10], the error rate is not greater than the algorithm risk level ε, i.e.:
P{ps(ys)≤ε}≤ε (5);
The size of the prediction region of ICP is therefore adjustable through ε.
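The ICP steps above — calibration nonconformity scores, the p-value of formula (3), and the ε-thresholded prediction region of formula (4) — can be sketched as follows. The scores and labels are illustrative toy values, not outputs of the CNN described in this invention:

```python
def icp_p_value(calib_scores, test_score):
    """p-value of a test sample: fraction of calibration nonconformity
    scores (counting the test sample itself) at least as large as the
    test sample's score -- the randomness level of formula (3)."""
    q = len(calib_scores)
    ge = sum(1 for a in calib_scores if a >= test_score)
    return (ge + 1) / (q + 1)

def icp_region(calib_scores, test_scores_by_label, epsilon):
    """Prediction region of formula (4): keep every label whose
    p-value exceeds the risk level epsilon."""
    p = {y: icp_p_value(calib_scores, s)
         for y, s in test_scores_by_label.items()}
    return {y for y, py in p.items() if py > epsilon}, p

# Toy example with labels C- (background) and C+ (target); a high
# nonconformity score for C- yields a low p-value for that label.
calib = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
region, p = icp_region(calib, {"C-": 0.85, "C+": 0.15}, epsilon=0.4)
# region == {"C+"}
```

With ε = 0.4 as in the experiments below, only the target label survives the threshold in this toy case, so the region prediction is {C+}.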
2.2) Sample nonconformity (singular value) function:
The algorithmic randomness test of a sequence first requires a nonconformity mapping function, which measures the degree to which a test sample conforms to the overall sample distribution. Conformity is analysed from the regression values output by the CNN: the larger the regression value of a sample's features for the true class, the stronger the sample's conformity with the calibration sequence. The nonconformity function is defined as:
where Ry(x(i)) is the regression value of x(i) for class y computed by formula (1), and the parameter γ adjusts the sensitivity of the nonconformity value ai to changes in the regression value; the smaller γ is, the more sensitive ai is to changes in Ry(x(i)).
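One commonly used nonconformity form consistent with this description — a larger regression value for the true class gives a smaller score, and a smaller γ increases sensitivity — is the exponential α = exp(−Ry(x)/γ). The sketch below assumes this form purely as a stand-in; it may differ from the exact function the invention defines:

```python
import math

def nonconformity(r_true, gamma=0.5):
    """Assumed nonconformity score (exp(-R/gamma)): decreases as the
    regression value for the true class grows; a smaller gamma makes
    the score react more sharply to changes in r_true."""
    return math.exp(-r_true / gamma)

# A larger regression value for the true class gives a smaller
# (more conforming) score:
assert nonconformity(0.9) < nonconformity(0.1)

# Smaller gamma -> larger change in the score for the same change
# in the regression value:
d_small = nonconformity(0.1, 0.2) - nonconformity(0.9, 0.2)
d_large = nonconformity(0.1, 1.0) - nonconformity(0.9, 1.0)
assert d_small > d_large
```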
2.3) Candidate target selection:
The output of ICP is a set that may contain multiple classes. For the two-class problem of the samples to be identified, the ICP output has 4 possibilities: φ, {C-}, {C+}, {C+,C-}. Besides the class information, each output result carries a confidence p-value. From the region predictions of all samples, the most credible samples in each frame are selected as candidate targets. Specifically, for the image frame at time t, the samples whose output is {C+} or {C+,C-} are sorted by the confidence value p(C+), and the Nc samples with the largest values form the candidate target set Ot, so that |Ot| ≤ Nc.
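The per-frame selection rule can be sketched directly; the sample records and field names below are illustrative, not part of the invention's data structures:

```python
def select_candidates(predictions, n_c=20):
    """Per-frame candidate selection: keep samples whose ICP region
    contains C+ (i.e. {C+} or {C+, C-}) and take the n_c samples with
    the highest confidence p(C+)."""
    keep = [s for s in predictions if "C+" in s["region"]]
    keep.sort(key=lambda s: s["p_plus"], reverse=True)
    return keep[:n_c]

samples = [
    {"id": 0, "region": {"C+"},       "p_plus": 0.9},
    {"id": 1, "region": {"C-"},       "p_plus": 0.2},  # background only: dropped
    {"id": 2, "region": {"C+", "C-"}, "p_plus": 0.7},
    {"id": 3, "region": {"C+"},       "p_plus": 0.5},
]
cands = select_candidates(samples, n_c=2)
# [s["id"] for s in cands] == [0, 2]
```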
3) Target tracking algorithm:
3.1) Spatio-temporal energy function:
The candidate target set Ot contains several possible states of the target at time t. The target passes from some state in Ot to some state in the next candidate set Ot+1, so target tracking can be viewed as finding an optimal path. To obtain the optimal path, a spatio-temporal energy function ETrack is defined to describe the target trajectory, and the trajectory is obtained by optimizing the energy function:
where ETrack consists of two parts, a local cost term ELocal and a pairwise cost term EPairwise.
ELocal is defined as the sum, over all times, of the CNN output values of the target state xt for the background class. Because partial occlusion of the target reduces the reliability of the local cost term, a robust estimator is introduced to reduce the influence of outliers on the function optimization. ELocal is defined as:
where the argument of ρ(·) is the regression value of the target state x for the background class, and ρ(·) is the Huber function, introduced to enhance the reliability of the local cost term, defined as:
EPairwise describes the degree of change of the target state. When occlusion, cluttered background or target pose changes occur in the sequence, large estimation errors may cause abrupt jumps of the target state. Assuming the target motion is coherent, EPairwise penalizes abrupt points in the trajectory during energy optimization, so that the trajectory has a certain smoothness. EPairwise is defined as:
The energy function in formula (7) is optimized by dynamic programming to obtain the optimal motion trajectory.
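A minimal dynamic-programming sketch of the trajectory optimization follows. Since the exact ELocal and EPairwise formulas are given above only by description, this sketch assumes a Huber-robustified background regression value as the local cost and a squared displacement as the pairwise cost; both are illustrative stand-ins:

```python
def huber(r, delta=0.4):
    """Huber function: quadratic near zero, linear in the tails,
    limiting the influence of outlier local costs."""
    return 0.5 * r * r if abs(r) <= delta else delta * (abs(r) - 0.5 * delta)

def track_dp(candidates, local_cost, pair_cost):
    """Dynamic programming over per-frame candidate sets: minimizes
    sum_t local(x_t) + sum_t pair(x_t, x_{t+1}) and returns the
    optimal path as one candidate index per frame."""
    T = len(candidates)
    cost = [[local_cost(c) for c in candidates[0]]]
    back = []
    for t in range(1, T):
        row, ptr = [], []
        for cj in candidates[t]:
            best_i = min(range(len(candidates[t - 1])),
                         key=lambda i: cost[t - 1][i]
                                       + pair_cost(candidates[t - 1][i], cj))
            ptr.append(best_i)
            row.append(cost[t - 1][best_i]
                       + pair_cost(candidates[t - 1][best_i], cj)
                       + local_cost(cj))
        cost.append(row)
        back.append(ptr)
    # Backtrack from the cheapest terminal state.
    j = min(range(len(cost[-1])), key=lambda k: cost[-1][k])
    path = [j]
    for t in range(T - 2, -1, -1):
        j = back[t][j]
        path.append(j)
    return path[::-1]

# Toy run: each candidate is (position, background_regression_value).
# The smooth low-cost track is preferred over the jumping one.
cands = [[(0.0, 0.1), (5.0, 0.05)],
         [(0.2, 0.2), (5.1, 0.9)],
         [(0.4, 0.1), (9.0, 0.05)]]
path = track_dp(cands,
                local_cost=lambda c: huber(c[1]),
                pair_cost=lambda a, b: (a[0] - b[0]) ** 2)
# path == [0, 0, 0]
```

The pairwise term makes a jump from position ~0 to position 5 or 9 prohibitively expensive, so the optimizer keeps the coherent trajectory even though one jumping candidate has a slightly lower local cost.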
3.2) Training-sample update:
During tracking, the CNN model parameters are updated with the tracking results of the previous sequence segment before the next segment is processed. To avoid model drift, training samples are collected only from highly reliable tracking results. The tracking result at time t is screened by its confidence p-value: if p exceeds the set threshold α, positive and negative training samples are sampled around it; otherwise the decision is deferred to the next time step.
Negative samples in the training set are generally redundant; redundant negatives contribute very little to model training and waste computing resources. The training set is therefore refined by mining hard negative samples (Hard Negative Sample), which improves training efficiency. It is observed that samples whose region prediction is {C+,C-} usually correspond to background objects that are easily confused with the target, so hard negatives can be selected from this kind of sample. A simple selection rule is to check whether such a sample's region overlaps the current tracking result; samples with no overlap are added to the training set as negative samples.
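The hard-negative rule can be sketched as follows, assuming axis-aligned (x, y, w, h) boxes and reading "no region overlap with the current tracking result" as the condition for keeping a sample as a hard negative:

```python
def overlaps(a, b):
    """Axis-aligned overlap test for boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def mine_hard_negatives(ambiguous, tracked_box):
    """Samples whose ICP region was {C+, C-} are background objects
    easily confused with the target; those whose box does not overlap
    the current tracking result are kept as hard negatives."""
    return [s for s in ambiguous if not overlaps(s, tracked_box)]

track = (10, 10, 20, 20)
ambiguous = [(15, 15, 10, 10),   # overlaps the tracked target -> skipped
             (50, 50, 10, 10)]   # confusable background -> hard negative
hard = mine_hard_negatives(ambiguous, track)
# hard == [(50, 50, 10, 10)]
```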
3.3) Tracking algorithm steps:
The proposed visual tracking algorithm is as follows:
The present invention proposes a target tracking algorithm based on a convolutional neural network and a conformal predictor. The algorithm uses a convolutional neural network to extract high-level image features for target representation, overcoming the sensitivity of low-level features to target appearance changes. To adapt to different types of tracking targets, a dual-input network structure is designed that combines the target template and distinguishes target from background region by logistic regression. To further improve tracking robustness, a conformal predictor performs reliability analysis on the classification results, and the classification results that satisfy the confidence condition are selected as candidate target regions; the final target trajectory is obtained by optimizing a global spatio-temporal energy function. Comparative experiments against several currently popular tracking algorithms on public datasets show that the proposed algorithm achieves better tracking robustness and accuracy.
The application effect of the present invention is explained in detail below in combination with experiments.
To verify the effectiveness of the algorithm, simulation experiments were carried out in Matlab; the hardware platform is a 3.4 GHz Intel i7-6700 with 8 GB of memory. The algorithm parameters in the experiments are: proper training set Ta size m=300, calibration set Tb size q=30, algorithm risk level ε=0.4, sample nonconformity parameter γ=0.5, candidate target set size limit Nc=20, robust function parameter δ=0.4, training-sample update parameter α=0.6. The parameters remain unchanged throughout the experiments, and the average processing speed of the algorithm is about 8 frames per second.
Video sequences from the public TOP100 dataset [14] were selected as test subjects, and the algorithm was compared with several current mainstream tracking algorithms, including VTS [15], LOT [16], STRUCK [1], MIL [17] and KCF [2]. To verify the effectiveness of CP, a simplified version of the proposed algorithm was also tested, in which CP is not introduced; instead, the Nc samples with the largest CNN regression values are selected directly as candidate regions. Two standard metrics were used to compare the algorithms [18]: coverage rate and center location error. The coverage rate is defined as Cr=(Rs∩Rt)/(Rs∪Rt), where Rs and Rt are the tracking result region and the ground-truth target region, respectively. The center location error is the Euclidean distance between the center of the tracking result and the ground-truth center. Fig. 4 shows part of the test results; the selected video sequences contain typical complex situations, such as target occlusion, appearance change, illumination variation and cluttered background.
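The two evaluation metrics transcribe directly from their definitions, assuming axis-aligned boxes given as (x, y, w, h):

```python
import math

def coverage_rate(rs, rt):
    """Cr = area(Rs ∩ Rt) / area(Rs ∪ Rt) for boxes (x, y, w, h)."""
    ix = max(0.0, min(rs[0] + rs[2], rt[0] + rt[2]) - max(rs[0], rt[0]))
    iy = max(0.0, min(rs[1] + rs[3], rt[1] + rt[3]) - max(rs[1], rt[1]))
    inter = ix * iy
    union = rs[2] * rs[3] + rt[2] * rt[3] - inter
    return inter / union if union > 0 else 0.0

def center_error(rs, rt):
    """Euclidean distance between the centers of two boxes."""
    return math.hypot((rs[0] + rs[2] / 2) - (rt[0] + rt[2] / 2),
                      (rs[1] + rs[3] / 2) - (rt[1] + rt[3] / 2))

# A tracking result shifted by (10, 0) against a 20x20 ground truth:
cr = coverage_rate((10, 0, 20, 20), (0, 0, 20, 20))   # 200 / 600 = 1/3
ce = center_error((10, 0, 20, 20), (0, 0, 20, 20))    # 10.0
```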
Fig. 4(a) shows part of the tracking results on the FaceOcc1 video sequence, where the tracked target is a woman's face. The characteristic of this sequence is that the face is repeatedly occluded by a book, at varying positions and degrees. As can be seen, under occlusion the tracking result of LOT is limited to the unoccluded part, with a large scale error, and MIL drifts considerably when the target is heavily occluded, as in frame 834. The center location error in Fig. 5(a) and the coverage rate data in Fig. 6(a) show that the proposed algorithm Ours, Ours (no CP) and KCF locate the target accurately throughout the sequence and are robust to occlusion.
The Bolt video sequence is a sprint race scene; the task is to track one of the athletes. The challenge of this sequence is that the target's pose changes constantly, and as the camera rotates the athlete gradually turns from front view to back view, so the target's appearance change is large. In the results shown in Fig. 4(b), VTS, STRUCK and MIL drift shortly after the start of the sequence and lose the target by frame 48; KCF, LOT, Ours and Ours (no CP) keep up with the target, but LOT and Ours (no CP) show large scale errors when the target deforms, as in frame 222. The proposed algorithm uses high-level features, is little affected by the appearance change, and avoids drift by updating the model through reliability analysis; the error analysis in Fig. 5(b) and Fig. 6(b) shows that among the compared algorithms the proposed algorithm has the smallest tracking error.
The Football video sequence is a rugby match scene; the tracked target is a player's head, and Fig. 4(c) shows part of the tracking results of this sequence. The difficulty of the sequence is that the background contains many players of very similar appearance, and their frequent crossing motions interfere with target tracking. VTS, MIL, KCF, STRUCK and Ours (no CP) repeatedly drift considerably, and by frame 360 VTS, MIL and KCF have switched to tracking other players. The proposed algorithm Ours keeps the trajectory smooth through spatio-temporal trajectory optimization, reducing the interference of similar targets; the statistics in Fig. 5(c) and Fig. 6(c) show that Ours maintains the lowest tracking error on this sequence.
The Cardark video sequence tracks the rear of a car; its characteristics are drastic illumination changes, cluttered background and low image resolution. In the results shown in Fig. 4(d), LOT is disturbed by the light appearing to the left of the target at frame 58, deviating substantially from the target with a large scale error; MIL and VTS also drift to some degree. As the interference of the light on the left continues, MIL and LOT lose the target by frame 208, and at frame 315 a bright reflection patch on the road surface causes VTS to lose the target. STRUCK, KCF, Ours (no CP) and the proposed algorithm Ours remain stable while tracking the rear of the car, but Ours (no CP) generally shows a larger scale error. The tracking error analysis of this sequence, shown in Fig. 5(d) and Fig. 6(d), indicates that KCF and STRUCK are slightly less accurate than Ours.
To compare the overall performance of the 7 algorithms, Fig. 7 gives their one-pass evaluation (OPE) results [14] on all test sequences, including the location precision plot (Fig. 7(a)) and the overlap success plot (Fig. 7(b)). The algorithms can be ranked by the area under the curve (AUC) given in Fig. 6. The proposed algorithm Ours is higher than the other algorithms in both location precision and overlap success rate, with KCF closest to Ours in performance, while the performance of Ours (no CP) degrades without CP, most visibly in the overlap success rate (see Fig. 7(b)).
Table 1 lists the mean center location error and mean coverage rate of the 7 algorithms. The performance of the proposed algorithm is superior to the other algorithms on both metrics, showing that the deep features extracted by the CNN in the proposed algorithm distinguish target from background well, and that the confidence evaluation of the classification results by ICP effectively ensures the reliability of tracking, giving good performance on video sequences with a variety of typical complex situations.
Table 1. Mean center location error and coverage rate
The foregoing is merely a description of preferred embodiments of the present invention and is not intended to limit the invention; any modification, equivalent replacement and improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.
Claims (10)
1. A visual tracking method based on a conformal predictor model, characterized in that the visual tracking method based on the conformal predictor model comprises:
first building a dual-input convolutional neural network model that synchronously extracts the high-level features of the video-frame sampling region and of the target template, and distinguishing the target from the background region by logistic regression;
then embedding the convolutional neural network into a conformal predictor framework, assessing the reliability of the classification results by an algorithmic randomness test, and outputting, at a specified risk level, classification results with confidence in the form of prediction regions;
finally selecting high-confidence regions as candidate target regions and obtaining the target trajectory by optimizing a global spatio-temporal energy function.
2. The visual tracking method based on a conformal predictor model according to claim 1, characterized in that the visual tracking method based on the conformal predictor model specifically comprises:
Input: initial target state x0, a pre-trained CNN, an image sequence of length N;
Output: target trajectory;
the initialization phase, including:
taking the image region corresponding to x0 as the input template of the CNN;
collecting positive and negative samples at x0, establishing the training set T, and dividing it into a proper training set Ta and a calibration set Tb;
using Ta to train and adjust the fully connected layer and the output layer of the CNN;
the tracking phase, including:
dividing the image sequence into K segments and processing segments k=1,...,K in turn;
estimating the target trajectory of the k-th segment;
updating the training set T: selecting highly credible tracking results according to the p-value to update the training set, and mining hard negative samples to add to T;
concatenating the target trajectory T ← T ∪ Tk; if the last segment has been processed, outputting the trajectory T; otherwise letting k=k+1 and returning to the step of estimating the target trajectory of the k-th segment.
3. The visual tracking method based on a conformal predictor model according to claim 2, characterized in that, in estimating the target trajectory of the k-th segment, the processing comprises:
establishing the candidate target sets of all frames:
(1) letting the current time be t and Ot=φ; centering on the target state with the highest p-value in the image at time t-1, performing Gaussian random sampling in position and scale to obtain M samples, j=1,...,M, the Gaussian covariance being the diagonal matrix Diag(0.1r2, 0.1r2, 0.2), where r is the average of the length and width of that target state;
(2) computing the regression value of each sample with the CNN, j=1,...,M;
(3) computing the confidence of each sample from the calibration set Tb using formula (3), j=1,...,M;
(4) obtaining the region prediction result of each sample using formula (4) with the risk threshold ε, j=1,...,M; selecting the samples whose output result is {C+} or {C+,C-} and whose confidence p(C+) ranks in the top Nc, and adding them to the candidate target set Ot;
(5) letting t=t+1; if t > nl, the processing ends; otherwise going to (1).
4. The visual tracking method based on a conformal predictor model according to claim 2, characterized in that estimating the target trajectory of the k-th segment further comprises:
obtaining the target trajectory of the k-th segment by optimizing the energy function ETrack.
5. The visual tracking method based on a conformal predictor model according to claim 1, characterized in that the dual-input convolutional neural network model comprises the CNN network structure: the target template and the image to be recognized enter the network simultaneously as two inputs; after features are extracted by the convolutional layers, they are merged in the fully connected layer to form discriminative features, and finally classification is realized by logistic regression in the output layer; the target template is obtained manually from the first frame of the sequence, while the image to be recognized is a local region sampled from the sequence images; the CNN structure contains two independent sets of convolutional layers, and the two sets share the same structure and parameters; the two inputs are mapped to high-level features by the convolutional layers, then merged in the fully connected layer and further mapped to features that discriminate target from background; the output layer is a logistic regression classifier that predicts, by logistic regression, whether the input sample belongs to the target or the background class.
6. The visual tracking method based on a conformal predictor model according to claim 5, characterized in that the dual-input convolutional neural network model further comprises network parameter training:
the CNN convolutional layers are trained offline on a dataset in advance, so that general target features can be extracted; during pre-training the CNN has a single-input structure, and the trained parameters are shared by the two sets of convolutional layers;
the output layer of the CNN is set to 10 units and replaced by 1 unit after pre-training, so that the output layer corresponds to the two-class tracking task; the pre-trained CNN is fine-tuned according to the actual tracking task; during tracking, the pre-trained convolutional layer parameters are fixed, and only the fully connected layer and output layer parameters are updated online, adapting to changes of the target and background;
to build the training set, in the tracking initialization phase the target region in the first frame is selected by hand, and positive and negative training samples are sampled according to the target region; the positive/negative attribute of a sample is judged by its coverage rate with the target region, with a coverage-rate threshold of 0.5; data augmentation is realized by random scale and rotation transformations of the samples; in subsequent tracking, training samples are drawn centered on tracking results that satisfy the confidence condition, chosen through the risk assessment of the classification results; let the training set be T={(x(1),y(1)),...,(x(n),y(n))}, where y(i)∈{C-=0, C+=1}, the class label C- denotes background and C+ denotes target, and x(i)∈Zd is the target state vector, including position and scale; the output layer uses logistic regression to compute the probability that a sample belongs to the target or the background:
R(y|x;θ) = hθ(x)^y (1 − hθ(x))^(1−y)   (1);
where θ is the network model parameter; the model is trained with the training set T so that the log-likelihood loss function L(θ) reaches its minimum:
the network weights and biases are adjusted along the negative gradient direction of L(θ) by stochastic gradient descent, and the parameters of each layer above the convolutional layers are iteratively updated by back-propagation.
7. The visual tracking method based on a conformal predictor model according to claim 1, characterized in that:
the conformal predictor predicts sample classes with the improved CP algorithm; the improved CP algorithm first assumes that the samples in the training set obey an independent identical distribution, and divides the training set T={(x(1),y(1)),...,(x(n),y(n))} into two parts: the first m samples form the proper training set Ta={(x(1),y(1)),...,(x(m),y(m))}, and the remaining q samples form the calibration set Tb={(x(m+1),y(m+1)),...,(x(m+q),y(m+q))}, n=m+q; the proper training set Ta is used to update the CNN parameters, while the calibration set Tb, together with the sample to be identified, constitutes the test sequence, and the sample class is determined by the algorithmic randomness test of the sequence;
the algorithmic randomness test of the sequence is: first defining a mapping function A: Z(q-1)×Z→R that maps each sample in the calibration set Tb to a nonconformity (singular) value one by one, obtaining the sequence αm+1,...,αm+q;
letting the target state of the sample to be identified be xs, and assigning xs the class labels C- and C+ respectively to constitute two test samples (xs,yi), i=0,1; after computing the nonconformity values of the test samples, constituting two test sequences together with the corresponding calibration nonconformity values, i=0,1; computing the p-value test statistic to obtain the randomness level of each sequence:
where ps(yi) denotes the p-value when the target state xs is labelled yi, i.e. the confidence that xs belongs to class yi; given the algorithm risk level threshold ε, the hypotheses whose p-value exceeds ε are taken as the output of ICP:
if the true class ys of xs is not in the prediction region, a prediction error has occurred; by the validity theorem of conformal predictors, the error rate is not greater than the algorithm risk level ε, i.e.:
P{ps(ys)≤ε}≤ε (5);
in the algorithmic randomness test of the sequence, the nonconformity mapping function is defined first, to measure the degree to which a test sample conforms to the overall sample distribution; conformity is analysed from the regression values output by the CNN: the larger the regression value of a sample's features for the true class, the stronger the sample's conformity with the calibration sequence, and the nonconformity function is defined as:
where Ry(x(i)) is the regression value of x(i) for class y computed by formula (1), and γ adjusts the sensitivity of the nonconformity value ai to changes in the regression value; the smaller γ is, the more sensitive ai is to changes in Ry(x(i)).
8. The visual tracking method based on a conformal predictor model according to claim 7, characterized in that the result output by the improved CP algorithm may include multiple classes; for the two classes of the sample to be identified, the result output by the improved CP algorithm is one of the 4 results φ, {C-}, {C+}, {C+,C-}; besides the class information, each output result carries a confidence p-value; according to the region prediction results of all samples, the most credible samples are selected as the candidate targets of each frame;
specifically including: for the image frame at time t, sorting the samples whose output is {C+} or {C+,C-} by the confidence value p(C+), and choosing the largest Nc samples to establish the candidate target set Ot, where |Ot|≤Nc;
the candidate target set Ot includes several possible states of the target at time t; the target passes from some state in Ot to some state in the next candidate target set Ot+1;
target tracking is realized by obtaining an optimal path: a spatio-temporal energy function ETrack is defined to describe the target trajectory, and the target trajectory is obtained by optimizing the energy function;
where ETrack includes two parts, the local cost term ELocal and the pairwise cost term EPairwise;
ELocal is defined as the sum, over all times, of the CNN output values of the target state xt for the background class;
because partial occlusion of the target reduces the reliability of the local cost term, a robust estimator is introduced to reduce the influence of outliers on the function optimization; ELocal is defined as:
where the argument of ρ(·) is the regression value of the target state x for the background class, and ρ(·) is the Huber function that enhances the reliability of the local cost term, defined as:
EPairwise describes the degree of change of the target state; when occlusion, cluttered background or target pose changes occur in the sequence, large estimation errors can cause abrupt jumps of the target state; assuming the target motion is coherent, EPairwise penalizes the abrupt points in the trajectory during the energy function optimization, so that the trajectory has a certain smoothness; EPairwise is defined as:
the energy function in formula (7) is optimized by dynamic programming to obtain the optimal motion trajectory.
9. The visual tracking method based on a conformal predictor model as claimed in claim 2, characterized in that the training-sample update includes:
during tracking, the CNN model parameters are updated using the tracking result of the previous sequence segment before the next segment is processed; the tracking result at time t is accepted according to its confidence value p: if p exceeds a preset threshold α, positive and negative training samples are drawn based on this result; otherwise the decision is deferred to the next time step.
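This update rule can be sketched as follows (the value of ALPHA is illustrative, since the patent does not fix the threshold, and the sampler is a hypothetical callback):

```python
ALPHA = 0.8  # hypothetical confidence threshold; the patent leaves alpha unspecified

def maybe_collect_samples(result, p, sampler):
    """If the tracking result's confidence p exceeds the threshold, draw
    positive/negative training samples around it for the next CNN update;
    otherwise defer the decision to the next time step."""
    if p > ALPHA:
        return sampler(result)  # e.g. (positives, negatives)
    return None                 # postpone: judge again at time t+1
```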
10. A robust visual tracking system based on a convolutional neural network and a conformal predictor, applying the visual tracking method based on the conformal predictor model according to claim 1.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810270188.XA CN108460790A (en) | 2018-03-29 | 2018-03-29 | A kind of visual tracking method based on consistency fallout predictor model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810270188.XA CN108460790A (en) | 2018-03-29 | 2018-03-29 | A kind of visual tracking method based on consistency fallout predictor model |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108460790A true CN108460790A (en) | 2018-08-28 |
Family
ID=63237253
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810270188.XA Pending CN108460790A (en) | 2018-03-29 | 2018-03-29 | A kind of visual tracking method based on consistency fallout predictor model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108460790A (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109815988A (en) * | 2018-12-27 | 2019-05-28 | 北京奇艺世纪科技有限公司 | Model generating method, classification method, device and computer readable storage medium |
CN109829936A (en) * | 2019-01-29 | 2019-05-31 | 青岛海信网络科技股份有限公司 | A kind of method and apparatus of target tracking |
CN110956058A (en) * | 2018-09-26 | 2020-04-03 | 北京嘀嘀无限科技发展有限公司 | Image recognition method and device and electronic equipment |
CN111612823A (en) * | 2020-05-21 | 2020-09-01 | 云南电网有限责任公司昭通供电局 | Robot autonomous tracking method based on vision |
CN111651149A (en) * | 2020-07-03 | 2020-09-11 | 大连东软教育科技集团有限公司 | Machine learning model system convenient to deploy and calling method thereof |
CN112233147A (en) * | 2020-12-21 | 2021-01-15 | 江苏移动信息系统集成有限公司 | Video moving target tracking method and device based on two-way twin network |
CN112348810A (en) * | 2020-08-20 | 2021-02-09 | 湖南大学 | In-service electronic system reliability assessment method |
CN115086718A (en) * | 2022-07-19 | 2022-09-20 | 广州万协通信息技术有限公司 | Video stream encryption method and device |
US11889227B2 (en) | 2020-10-05 | 2024-01-30 | Samsung Electronics Co., Ltd. | Occlusion processing for frame rate conversion using deep learning |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104951793A (en) * | 2015-05-14 | 2015-09-30 | 西南科技大学 | STDF (standard test data format) feature based human behavior recognition algorithm |
CN106709936A (en) * | 2016-12-14 | 2017-05-24 | 北京工业大学 | Single target tracking method based on convolution neural network |
CN106920250A (en) * | 2017-02-14 | 2017-07-04 | 华中科技大学 | Robot target identification and localization method and system based on RGB D videos |
CN107590821A (en) * | 2017-09-25 | 2018-01-16 | 武汉大学 | A kind of method for tracking target and system based on track optimizing |
US20180047164A1 (en) * | 2014-09-16 | 2018-02-15 | Samsung Electronics Co., Ltd. | Computer aided diagnosis apparatus and method based on size model of region of interest |
- 2018-03-29: Application CN201810270188.XA filed in China (CN), published as CN108460790A (en); status: Pending
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180047164A1 (en) * | 2014-09-16 | 2018-02-15 | Samsung Electronics Co., Ltd. | Computer aided diagnosis apparatus and method based on size model of region of interest |
CN104951793A (en) * | 2015-05-14 | 2015-09-30 | 西南科技大学 | STDF (standard test data format) feature based human behavior recognition algorithm |
CN106709936A (en) * | 2016-12-14 | 2017-05-24 | 北京工业大学 | Single target tracking method based on convolution neural network |
CN106920250A (en) * | 2017-02-14 | 2017-07-04 | 华中科技大学 | Robot target identification and localization method and system based on RGB D videos |
CN107590821A (en) * | 2017-09-25 | 2018-01-16 | 武汉大学 | A kind of method for tracking target and system based on track optimizing |
Non-Patent Citations (1)
Title |
---|
GAO Lin et al.: "Robust Visual Tracking Based on Convolutional Neural Network and Conformal Predictor", Acta Optica Sinica * |
Cited By (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110956058A (en) * | 2018-09-26 | 2020-04-03 | 北京嘀嘀无限科技发展有限公司 | Image recognition method and device and electronic equipment |
CN110956058B (en) * | 2018-09-26 | 2023-10-24 | 北京嘀嘀无限科技发展有限公司 | Image recognition method and device and electronic equipment |
CN109815988A (en) * | 2018-12-27 | 2019-05-28 | 北京奇艺世纪科技有限公司 | Model generating method, classification method, device and computer readable storage medium |
CN109815988B (en) * | 2018-12-27 | 2021-08-20 | 北京奇艺世纪科技有限公司 | Model generation method, classification method, device and computer-readable storage medium |
CN109829936A (en) * | 2019-01-29 | 2019-05-31 | 青岛海信网络科技股份有限公司 | A kind of method and apparatus of target tracking |
CN109829936B (en) * | 2019-01-29 | 2021-12-24 | 青岛海信网络科技股份有限公司 | Target tracking method and device |
CN111612823A (en) * | 2020-05-21 | 2020-09-01 | 云南电网有限责任公司昭通供电局 | Robot autonomous tracking method based on vision |
CN111651149B (en) * | 2020-07-03 | 2022-11-22 | 东软教育科技集团有限公司 | Machine learning model system convenient to deploy and calling method thereof |
CN111651149A (en) * | 2020-07-03 | 2020-09-11 | 大连东软教育科技集团有限公司 | Machine learning model system convenient to deploy and calling method thereof |
CN112348810A (en) * | 2020-08-20 | 2021-02-09 | 湖南大学 | In-service electronic system reliability assessment method |
CN112348810B (en) * | 2020-08-20 | 2024-04-12 | 湖南大学 | Reliability assessment method for in-service electronic system |
US11889227B2 (en) | 2020-10-05 | 2024-01-30 | Samsung Electronics Co., Ltd. | Occlusion processing for frame rate conversion using deep learning |
CN112233147A (en) * | 2020-12-21 | 2021-01-15 | 江苏移动信息系统集成有限公司 | Video moving target tracking method and device based on two-way twin network |
CN115086718A (en) * | 2022-07-19 | 2022-09-20 | 广州万协通信息技术有限公司 | Video stream encryption method and device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108460790A (en) | A kind of visual tracking method based on consistency fallout predictor model | |
CN106127204B (en) | A kind of multi-direction meter reading Region detection algorithms of full convolutional neural networks | |
CN110084831A (en) | Based on the more Bernoulli Jacob's video multi-target detecting and tracking methods of YOLOv3 | |
CN107067413B (en) | A kind of moving target detecting method of time-space domain statistical match local feature | |
CN109816689A (en) | A kind of motion target tracking method that multilayer convolution feature adaptively merges | |
CN104346802B (en) | A kind of personnel leave the post monitoring method and equipment | |
CN109919974A (en) | Online multi-object tracking method based on the more candidate associations of R-FCN frame | |
CN109613006A (en) | A kind of fabric defect detection method based on end-to-end neural network | |
CN105139015B (en) | A kind of remote sensing images Clean water withdraw method | |
CN108647654A (en) | The gesture video image identification system and method for view-based access control model | |
CN108447080A (en) | Method for tracking target, system and storage medium based on individual-layer data association and convolutional neural networks | |
CN108154159B (en) | A kind of method for tracking target with automatic recovery ability based on Multistage Detector | |
CN110033473A (en) | Motion target tracking method based on template matching and depth sorting network | |
CN107784663A (en) | Correlation filtering tracking and device based on depth information | |
CN109543688A (en) | A kind of novel meter reading detection and knowledge method for distinguishing based on multilayer convolutional neural networks | |
CN103902960A (en) | Real-time face recognition system and method thereof | |
CN109671102A (en) | A kind of composite type method for tracking target based on depth characteristic fusion convolutional neural networks | |
CN107145862A (en) | A kind of multiple features matching multi-object tracking method based on Hough forest | |
CN109543632A (en) | A kind of deep layer network pedestrian detection method based on the guidance of shallow-layer Fusion Features | |
CN110263712A (en) | A kind of coarse-fine pedestrian detection method based on region candidate | |
CN109711416A (en) | Target identification method, device, computer equipment and storage medium | |
CN103761747B (en) | Target tracking method based on weighted distribution field | |
CN108921011A (en) | A kind of dynamic hand gesture recognition system and method based on hidden Markov model | |
CN106611158A (en) | Method and equipment for obtaining human body 3D characteristic information | |
CN108288020A (en) | Video shelter detecting system based on contextual information and method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2018-08-28 |