CN108460790A - Visual tracking method based on a conformal predictor model - Google Patents

Visual tracking method based on a conformal predictor model

Info

Publication number
CN108460790A
Authority
CN
China
Prior art keywords
target
sample
consistency
value
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810270188.XA
Other languages
Chinese (zh)
Inventor
Gao Lin (高琳)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Southwest University of Science and Technology
Original Assignee
Southwest University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Southwest University of Science and Technology
Priority to CN201810270188.XA
Publication of CN108460790A
Legal status: Pending


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G06T2207/20084 Artificial neural networks [ANN]
    • G06T2207/30 Subject of image; Context of image processing
    • G06T2207/30241 Trajectory

Abstract

The invention belongs to the technical field of data processing and discloses a visual tracking method based on a conformal predictor model. The method comprises: first building a dual-input convolutional neural network (CNN) model that extracts high-level features from the sampled regions of a video frame and from the target template simultaneously, and distinguishes the target from the background using logistic regression; then embedding the CNN into a conformal predictor framework, where an algorithmic randomness test assesses the reliability of the classification results and, at a specified risk level, outputs classification results with confidence in the form of prediction regions; and finally selecting high-confidence regions as candidate target regions and obtaining the target trajectory by optimizing a spatio-temporal global energy function. The present invention adapts to complex situations such as target occlusion, appearance variation and background interference, and is more robust and accurate than a variety of currently popular tracking algorithms.

Description

Visual tracking method based on a conformal predictor model
Technical field
The invention belongs to the technical field of data processing, and more particularly to a visual tracking method based on a conformal predictor model.
Background technology
Visual target tracking is a fundamental problem in the field of computer vision; its task is to determine the state of a target in a video, including its position, speed and motion trajectory. Although visual tracking technology has made great progress in recent years, realizing robust tracking under complex conditions such as target occlusion, pose variation and cluttered backgrounds remains a huge challenge.
In visual tracking, the feature representation of the target is one of the important factors affecting tracking performance. The features used to represent the target should adapt to changes in target appearance while discriminating well against the background. A large number of feature extraction methods, such as HAAR and HOG, have been applied to visual tracking; most of these are hand-designed low-level features, which are rather task-specific and not robust to target variation. In recent years, the convolutional neural network (CNN) in deep learning has been widely used in target detection, image classification, semantic segmentation and so on. Compared with traditional hand-crafted features, features learned automatically by a CNN can capture high-level semantic information about the target and are more robust to appearance changes, so they have gradually been introduced into the solution of target tracking problems. However, when depth features are used for tracking, a common problem is that a large number of samples are needed to train and update the CNN parameters, whereas for a visual tracking task it is usually difficult to obtain a large number of training samples of the tracked target in advance. Therefore, effective training and updating of the CNN parameters is the main problem faced when applying a CNN to tracking.
On the other hand, in CNN-based target tracking methods, after the target features are extracted with the CNN, tracking is typically realized with a discriminative method [7-8]. The basic idea is to treat target tracking as a binary classification problem over image regions: a classifier divides image regions into target and background, and the final trajectory is obtained from the classification results of each frame. The reliability of the classification results is the key to the success or failure of tracking; however, most current classification algorithms lack a reliability analysis of the output, that is, a quantified confidence that evaluates to what extent the result is correct. If the classification result at each moment could be evaluated effectively, providing a reliable basis for target state estimation and for updating the parameters of the feature model, the accuracy and robustness of tracking would be greatly improved.
In summary, the problems existing in the prior art are as follows:
Existing visual tracking methods are not robust and give poor tracking results on video sequences; they cannot adapt to complex situations such as target occlusion, appearance variation and background interference, and the accuracy of many existing tracking algorithms is poor.
Invention content
In view of the problems of the existing technology, the present invention provides a visual tracking method based on a conformal predictor model.
The invention is realized in this way: a visual tracking method based on a conformal predictor model includes:
first building a dual-input convolutional neural network model that extracts high-level features from the sampled regions of a video frame and from the target template simultaneously, and distinguishes the target from the background using logistic regression;
then embedding the convolutional neural network into a conformal predictor framework, assessing the reliability of the classification results with an algorithmic randomness test and, at a specified risk level, outputting classification results with confidence in the form of prediction regions;
finally selecting high-confidence regions as candidate target regions and obtaining the target trajectory by optimizing a spatio-temporal global energy function.
Further, the visual tracking method based on the conformal predictor model specifically includes:
Input: target initial state x0, the pre-trained CNN, and a sequence of N images;
Output: target trajectory T;
Initialization phase, including:
taking the image region corresponding to x0 as the input template of the CNN;
collecting positive and negative samples around x0, establishing the training set T, and dividing it into a proper training set Ta and a calibration set Tb;
using Ta to train and adjust the fully connected layer and the output layer of the CNN;
Tracking phase, including:
dividing the image sequence into K segments and processing the segments k = 1, ..., K in turn;
estimating the target trajectory of the k-th segment;
updating the training set T: selecting tracking results with high confidence according to their p-values to update the training set, and mining hard negative samples to add to T;
concatenating the target trajectory, T ← T ∪ Tk; if the last segment has been processed, outputting the trajectory T; otherwise setting k = k + 1 and returning to the step of estimating the target trajectory of the k-th segment (a sketch of this loop is given after this list).
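As an illustration only, the following is a minimal Python sketch of this half-offline, segment-wise loop. The names track_sequence, estimate_segment, update_model and segment_len are hypothetical and not part of the original disclosure; the per-segment estimation and the model update are passed in as callables and correspond to the steps detailed in the following paragraphs.

```python
# Minimal sketch of the half-offline tracking loop: the sequence is split into K
# segments, each segment's trajectory T_k is estimated, concatenated into T, and the
# model is updated from confident results before the next segment is processed.
def track_sequence(n_frames, x0, estimate_segment, update_model, segment_len=20):
    trajectory = [x0]                                    # T, initialised with x0
    for k, start in enumerate(range(1, n_frames, segment_len), start=1):
        end = min(start + segment_len, n_frames)
        seg_traj = estimate_segment(k, start, end, trajectory[-1])   # T_k
        trajectory.extend(seg_traj)                      # T <- T ∪ T_k
        update_model(seg_traj)    # confident results + hard negatives -> retrain
    return trajectory

# Toy usage with dummy callables that keep the target static.
if __name__ == "__main__":
    traj = track_sequence(
        n_frames=100,
        x0=(50.0, 50.0, 1.0),                            # (cx, cy, scale)
        estimate_segment=lambda k, s, e, x: [x] * (e - s),
        update_model=lambda seg: None,
    )
    print(len(traj))                                     # 100
```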
Further, in estimating the target trajectory of the k-th segment, the processing procedure includes:
establishing the candidate target sets of all frames;
(1) letting the current time be t and O_t = φ; centred on the target state with the highest p-value in the image at time t-1, performing Gaussian random sampling over position and scale to obtain M samples x_t^(j), j = 1, ..., M, where the Gaussian covariance is the diagonal matrix Diag(0.1r², 0.1r², 0.2) and r is the average of the length and width of the previous target state;
(2) using the CNN to compute the regression values of the samples;
(3) according to the calibration set Tb, computing the confidence of each sample with formula (3);
(4) according to the risk threshold ε, obtaining the region prediction result of each sample with formula (4); choosing the samples whose output result is {C+} or {C+, C-} and whose confidence p(C+) ranks in the top Nc, and adding them to the candidate target set O_t;
(5) letting t = t + 1; if t > nl the processing ends, otherwise going to (1). A sketch of one iteration of this candidate-generation step follows this list.
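A minimal sketch of one frame of the candidate-generation step above, under stated assumptions: cnn_regress stands in for the regression value R(C+ | x) produced by the network and conformal_p_value for the p-value of formula (3); both are hypothetical placeholders, the third sampling dimension is treated as a log-scale perturbation, and only the covariance Diag(0.1r², 0.1r², 0.2) and the top-Nc selection follow the text.

```python
import numpy as np

def generate_candidates(prev_state, cnn_regress, conformal_p_value,
                        M=256, Nc=20, eps=0.4, rng=np.random.default_rng(0)):
    """prev_state: (cx, cy, w, h) of the highest-p-value state at time t-1.
    Returns up to Nc candidate states O_t ranked by confidence p(C+)."""
    cx, cy, w, h = prev_state
    r = 0.5 * (w + h)                                   # average of length and width
    cov = np.diag([0.1 * r**2, 0.1 * r**2, 0.2])
    # Gaussian random sampling over position and (log-)scale.
    draws = rng.multivariate_normal([cx, cy, 0.0], cov, size=M)
    candidates = []
    for sx, sy, ds in draws:
        state = (sx, sy, w * np.exp(ds), h * np.exp(ds))
        reg = cnn_regress(state)                        # regression value for class C+
        p_pos = conformal_p_value(reg, label=1)         # p(C+), formula (3)
        p_neg = conformal_p_value(reg, label=0)         # p(C-)
        region = {lab for lab, p in ((1, p_pos), (0, p_neg)) if p > eps}  # formula (4)
        if 1 in region:                                 # keep {C+} or {C+, C-} outputs
            candidates.append((p_pos, state))
    candidates.sort(key=lambda t: t[0], reverse=True)
    return [s for _, s in candidates[:Nc]]

# Toy usage with stand-in callables.
cands = generate_candidates((50.0, 50.0, 20.0, 40.0),
                            cnn_regress=lambda s: 0.8,
                            conformal_p_value=lambda r, label: r if label else 1 - r)
print(len(cands))   # 20
```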
Further, in estimating the target trajectory of the k-th segment, the processing procedure further includes:
obtaining the target trajectory of the k-th segment by optimizing the energy function E_Track.
Further, the dual-input convolutional neural network model includes a CNN network structure: the target template and the image to be recognized are fed into the network as two inputs at the same time; after features are extracted by the convolutional layers, they are merged in the fully connected layer to form discriminative features, and finally logistic regression is carried out in the output layer to realize classification. The target template is obtained manually from the first frame of the image sequence, while the image to be recognized is a local region sampled from the sequence images. The CNN structure contains two independent sets of convolutional layers that share the same structure and parameters; the two inputs are mapped to high-level features by the convolutional layers, then merged in the fully connected layer and further mapped to features that discriminate between target and background. The output layer is a logistic regression classifier that predicts, via logistic regression, whether the input sample belongs to the target class or the background class.
Further, the dual-input convolutional neural network model further includes network parameter training:
the convolutional layers of the CNN are trained offline on a data set in advance so that general target features can be extracted; during pre-training the CNN is a single-input structure, and the trained parameters are shared by the two sets of convolutional layers;
the output layer of the CNN is set to 10 units for pre-training and replaced with 1 unit afterwards, corresponding to the binary classification of the tracking task; the pre-trained CNN is fine-tuned according to the actual tracking task; during tracking, the pre-trained convolutional layer parameters are fixed and only the fully connected layer and output layer parameters are updated online, to adapt to changes of the target and background;
to establish the training set, the target region in the first frame is chosen manually in the tracking initialization phase, positive and negative training samples are sampled according to the target region, and the positive or negative attribute of a sample is judged by its coverage rate with the target region, with the coverage threshold set to 0.5;
data augmentation is realized by applying random scaling and rotation transformations to the samples; in subsequent tracking, through the risk assessment of the classification results, training samples are drawn centred on tracking results that satisfy the confidence condition; the training set is denoted T = {(x^(1), y^(1)), ..., (x^(n), y^(n))}, where y^(i) ∈ {C- = 0, C+ = 1}, the class label C- is the background, C+ is the target, and x^(i) ∈ Z^d is the target state vector, including position and scale; the probability that a sample belongs to the target or the background is computed with logistic regression in the output layer:
R(y | x; θ) = h_θ(x)^y (1 − h_θ(x))^(1−y)    (1);
where h_θ(x) is the logistic (sigmoid) output of the network and θ denotes the network model parameters; the model is trained with the training set T so that the log-likelihood loss function L(θ) is minimized:
L(θ) = −Σ_{i=1}^{n} [ y^(i) ln h_θ(x^(i)) + (1 − y^(i)) ln(1 − h_θ(x^(i))) ]    (2);
the network weights and biases are adjusted along the negative gradient direction of L(θ) using stochastic gradient descent, and the back-propagation method iteratively updates the parameters of each layer above the convolutional layers.
Further, the conformal predictor uses an improved CP algorithm to predict the sample class. In the improved CP algorithm it is first assumed that the samples in the training set are independent and identically distributed, and the training set T = {(x^(1), y^(1)), ..., (x^(n), y^(n))} is divided into two parts: the first m samples form the proper training set Ta = {(x^(1), y^(1)), ..., (x^(m), y^(m))}, and the last q samples form the calibration set Tb = {(x^(m+1), y^(m+1)), ..., (x^(m+q), y^(m+q))}, with n = m + q. The proper training set Ta is used to update the CNN parameters, while the calibration set Tb, together with the sample to be recognized, constitutes a checking sequence, and the algorithmic randomness test of this sequence is used to determine the sample class.
The algorithmic randomness test of the sequence is as follows: first a mapping function A: Z^(q-1) × Z → R is defined, and each sample in the calibration set Tb is mapped one by one to the nonconformity-score space, giving the nonconformity score sequence α_{m+1}, ..., α_{m+q}.
Let the target state of the sample to be recognized be x_s; x_s is assigned the class labels C- and C+ respectively, forming two test samples (x_s, y_i), i = 0, 1. After the nonconformity score α_s^(y_i) of each test sample is computed, it forms, together with the nonconformity scores of the calibration set Tb, two checking sequences. The p-value test statistic then gives the algorithmic randomness level of the sequence:
p_s(y_i) = #{ α in the checking sequence : α ≥ α_s^(y_i) } / (q + 1)    (3);
where p_s(y_i) denotes the p-value when the target state x_s is labelled y_i, i.e. the confidence that x_s belongs to class y_i. Given the algorithm risk-level threshold ε, the hypotheses whose p-values exceed ε form the output of the ICP:
Γ_s^ε = { y_i : p_s(y_i) > ε }    (4);
When the true class y_s of x_s is not in the prediction set, a prediction error is considered to have occurred; according to the validity theorem of conformal predictors, the error rate is no greater than the algorithm risk level ε, i.e.:
P{p_s(y_s) ≤ ε} ≤ ε    (5);
In the algorithmic randomness test of the sequence, a nonconformity-score mapping function is first defined to measure the degree to which the sample to be tested conforms to the overall sample distribution. Conformity is analysed from the regression values output by the CNN: the larger the regression value of the sample features for the true class, the stronger the conformity of the sample with the calibration-set sequence. The nonconformity-score function is defined in formula (6),
where R_y(x^(i)) is the regression value of x^(i) for class y computed by formula (1), and the parameter γ adjusts the sensitivity of the nonconformity score α_i to changes in the regression value: the smaller γ is, the more sensitive α_i is to changes in R_y(x^(i)).
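A minimal, self-contained sketch of the inductive conformal step under stated assumptions: since the exact nonconformity function of formula (6) is not reproduced in this text, an assumed form α = exp(−R_y(x)/γ) is used, which decreases with the regression value of the true class and becomes more sensitive as γ shrinks, consistent with the description above; the p-value follows the standard ICP definition of formula (3).

```python
import numpy as np

def nonconformity(reg_value_true_class, gamma=0.5):
    # Assumed form of formula (6): decreasing in the regression value of the true
    # class; smaller gamma -> more sensitive to changes in the regression value.
    return np.exp(-reg_value_true_class / gamma)

def icp_predict(reg_pos, calib_scores, eps=0.4, gamma=0.5):
    """reg_pos: CNN regression value R(C+ | x) of the sample to be recognised.
    calib_scores: nonconformity scores of the calibration set T_b.
    Returns (prediction_set, p_values) at risk level eps."""
    q = len(calib_scores)
    p_values, prediction_set = {}, set()
    for label in (1, 0):                                   # C+ = 1, C- = 0
        reg_true = reg_pos if label == 1 else 1.0 - reg_pos
        alpha_s = nonconformity(reg_true, gamma)
        # Formula (3): fraction of the checking sequence with score >= alpha_s.
        p = (np.sum(np.asarray(calib_scores) >= alpha_s) + 1) / (q + 1)
        p_values[label] = p
        if p > eps:                                        # formula (4)
            prediction_set.add(label)
    return prediction_set, p_values

# Toy usage: calibration scores from regression values of known-target samples.
calib = [nonconformity(r) for r in np.random.default_rng(1).uniform(0.6, 0.95, 30)]
print(icp_predict(reg_pos=0.9, calib_scores=calib))
```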
Further, the result output by the improved CP algorithm may include multiple classes. For the binary classification of the sample to be recognized, the improved CP algorithm outputs one of four possible results: φ, {C-}, {C+}, {C+, C-}. Each output result is accompanied, in addition to the class information, by a confidence p-value. According to the region prediction results of all samples, the samples with high confidence are selected from each frame as candidate targets.
Specifically, for the image frame at time t, the samples whose output is {C+} or {C+, C-} are sorted by the confidence value p(C+), and the largest Nc samples are chosen to establish the candidate target set O_t, where |O_t| ≤ Nc.
The candidate target set O_t contains several possible states of the target at time t; the target will transit from some state in O_t to some state in the candidate target set O_{t+1} at the next moment.
To obtain the optimal path for target tracking, a spatio-temporal energy function E_Track is defined to characterize the target trajectory, and the target trajectory is obtained by optimizing this energy function (formula (7)).
E_Track consists of two parts, a local cost term E_Local and a pairwise cost term E_Pairwise.
E_Local is defined as the sum, over all moments, of the CNN output values of the target state x_t for the background class.
Since partial occlusion of the target can reduce the reliability of the local cost term, a robust estimator is introduced to reduce the influence of outliers on the function optimization; E_Local is defined in formula (8),
where R_{C-}(x) is the regression value of the target state x for the background class, and ρ(·) is the Huber operator (formula (9)), which enhances the reliability of the local cost term.
E_Pairwise describes the degree of change of the target state. When target occlusion, cluttered background or target pose changes occur in the sequence, the target state may change abruptly because of large estimation errors. Assuming that the motion of the target is coherent, the role of E_Pairwise is to penalize abrupt change points in the trajectory when the energy function is optimized, so that the trajectory has a certain smoothness; E_Pairwise is defined in formula (10).
The energy function in formula (7) is optimized with a dynamic programming method to obtain the optimal motion trajectory.
Further, training sample updating includes:
during tracking, the CNN model parameters are updated with the tracking results of the previous sequence segment before the next sequence segment is processed; the tracking result at moment t is selected according to its confidence p-value: if p is greater than the set threshold α, positive and negative training samples are sampled around it; otherwise the judgement is deferred to the next moment.
Another object of the present invention is to provide a robust visual tracking system applying the visual tracking method based on a conformal predictor model.
The present invention uses a convolutional neural network to extract high-level image features for representing the target, overcoming the sensitivity of low-level features to changes in target appearance. In order to adapt to different types of tracking targets, a dual-input network structure is designed which, combined with the target template, distinguishes the target from the background using logistic regression. To further improve tracking robustness, a conformal predictor is introduced to perform a reliability analysis of the classification results, and the classification results that satisfy the confidence condition are selected as candidate target regions; the final target trajectory is obtained by optimizing a spatio-temporal global energy function. Comparative experiments with a variety of currently popular tracking algorithms on public data sets show that the present invention adapts to complex situations such as target occlusion, appearance variation and background interference, and that the algorithm of the invention has better tracking robustness and accuracy.
Description of the drawings
Fig. 1 is a flowchart of the visual tracking method based on a conformal predictor model provided by an embodiment of the present invention.
Fig. 2 is a schematic diagram of the visual tracking method based on a conformal predictor model provided by an embodiment of the present invention.
Fig. 3 is a schematic diagram of the dual-input CNN network structure provided by an embodiment of the present invention.
Fig. 4 shows target tracking results provided by an embodiment of the present invention;
in the figure: (a) FaceOcc1; (b) Bolt; (c) Football; (d) CarDark.
Fig. 5 shows the center location errors of the tracking results provided by an embodiment of the present invention;
in the figure: (a) FaceOcc1; (b) Bolt; (c) Football; (d) CarDark.
Fig. 6 shows the coverage rates of the tracking results provided by an embodiment of the present invention;
in the figure: (a) FaceOcc1; (b) Bolt; (c) Football; (d) CarDark.
Fig. 7 shows the one-pass evaluation results of the algorithms on all test sequences provided by an embodiment of the present invention;
in the figure: (a) position precision plot; (b) coverage success-rate plot.
Specific embodiments
In order to make the purpose, technical scheme and advantages of the present invention clearer, the present invention is further elaborated below with reference to the embodiments. It should be appreciated that the specific embodiments described here are only used to explain the present invention and are not intended to limit it.
At present, the robustness of target tracking on video sequences is poor; existing methods cannot adapt to complex situations such as target occlusion, appearance variation and background interference, and the accuracy of many existing tracking algorithms is poor.
The application principle of the present invention is described in detail below in conjunction with the accompanying drawings.
As shown in Fig. 1, the visual tracking method based on a conformal predictor model provided by an embodiment of the present invention includes:
S101: first building a dual-input convolutional neural network model that extracts high-level features from the sampled regions of a video frame and from the target template simultaneously, and distinguishes the target from the background using logistic regression.
S102: then embedding the convolutional neural network into a conformal predictor framework, assessing the reliability of the classification results with an algorithmic randomness test and, at a specified risk level, outputting classification results with confidence in the form of prediction regions.
S103: finally selecting high-confidence regions as candidate target regions and obtaining the target trajectory by optimizing a spatio-temporal global energy function. Experimental results show that the algorithm adapts to complex situations such as target occlusion, appearance variation and background interference, and is more robust and accurate than a variety of currently popular tracking algorithms.
The application principle of the present invention is further described below with reference to specific embodiments.
A schematic diagram of the visual tracking method based on a conformal predictor model provided by an embodiment of the present invention is shown in Fig. 2.
The method is broadly divided into two stages. The first stage is the initialization phase: a dual-input CNN is built, in which the convolutional layer parameters are trained in advance on a conventional image data set, while the other layers are trained with samples collected manually in the first frame to obtain the initial parameters of the model.
The second stage is the tracking phase: regions are sampled frame by frame from the image sequence, the high-level features of the samples are extracted with the CNN, the regression value of each sample belonging to the target or the background is computed by logistic regression, and the sample class at the specified risk level is then obtained with the CP; target samples with high confidence are selected to establish the candidate target set, and the final target trajectory is obtained by optimizing a spatio-temporal energy function defined on the candidate target sets. The tracking of long sequences is handled in a half-offline manner: the whole video sequence is segmented, the tracking of each sequence segment is processed in turn, the trajectories of the segments are concatenated, and the model parameters of the CNN are updated online segment by segment during tracking.
1) CNN target feature extraction and classification:
A CNN is a multilayer neural network specialized for processing grid-structured data; it extracts local image features implicitly through convolution kernels and has good invariance to translation, scaling and other types of deformation. For the tracking problem, the network structure and the parameter-training scheme are the key factors affecting CNN performance, and the design of these two parts in the algorithm of the invention is explained separately below.
1.1) CNN network structure:
In target recognition applications, a CNN usually needs to be trained on massive data before it can express target features accurately, while for a specific tracking task it is often difficult to obtain sufficient training data in advance; therefore a CNN built for target recognition is difficult to apply directly to target tracking and needs to be adjusted and improved.
Unlike target recognition, tracking does not need to pay attention to the specific type of the target, as long as it can be distinguished from the background. A dual-input CNN structure is therefore adopted (as shown in Fig. 3): the target template and the image to be recognized are fed into the network as two inputs at the same time; after features are extracted by the convolutional layers, they are merged in the fully connected layer to form discriminative features, and finally logistic regression is carried out in the output layer to realize classification. The target template can be obtained manually from the first frame of the image sequence, while the image to be recognized is a local region sampled from the sequence images. The network contains two independent sets of convolutional layers; to simplify the model, these two sets share the same structure and parameters. The two inputs are mapped to high-level features by the convolutional layers, then merged in the fully connected layer and further mapped to features that discriminate between target and background. The output layer is a logistic regression classifier that predicts, via logistic regression, the class of the input sample, i.e. target or background.
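For illustration, a minimal PyTorch sketch of such a dual-input network under stated assumptions: the layer sizes, kernel sizes and the 32x32 input resolution are illustrative choices not specified by the patent; only the weight sharing between the two convolutional branches, the fusion in the fully connected layer and the single logistic (sigmoid) output follow the description above.

```python
import torch
import torch.nn as nn

class DualInputCNN(nn.Module):
    """Two inputs (template, search region) pass through the same convolutional
    branch (shared weights), are fused in the fully connected layer, and a
    logistic output predicts target (1) vs. background (0)."""
    def __init__(self):
        super().__init__()
        # Shared convolutional branch: one module used for both inputs.
        self.conv = nn.Sequential(
            nn.Conv2d(3, 32, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=5, padding=2), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.fc = nn.Sequential(
            nn.Linear(2 * 64 * 8 * 8, 256), nn.ReLU(),   # fusion of both branches
            nn.Linear(256, 1),                           # single output unit
        )

    def forward(self, template, search):
        f_t = self.conv(template).flatten(1)   # high-level features of the template
        f_s = self.conv(search).flatten(1)     # high-level features of the region
        fused = torch.cat([f_t, f_s], dim=1)   # merge in the fully connected layer
        return torch.sigmoid(self.fc(fused))   # regression value R(C+ | x)

# Toy usage with 32x32 RGB crops.
net = DualInputCNN()
r = net(torch.rand(4, 3, 32, 32), torch.rand(4, 3, 32, 32))
print(r.shape)   # torch.Size([4, 1])
```

Freezing self.conv and optimizing only self.fc would correspond to the online update scheme described in section 1.2) below.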
1.2) Network parameter training:
The CNN convolutional layers of the present invention are trained offline on the CIFAR-10 data set in advance so that general target features can be extracted. During pre-training the CNN is reduced to a single-input structure, and the trained parameters are then shared by the two sets of convolutional layers.
In addition, for the 10-class classification problem of the CIFAR-10 data, the output layer of the CNN is set to 10 units; after pre-training, the output layer is replaced with 1 unit, corresponding to the binary classification of the tracking task. The pre-trained CNN is fine-tuned according to the actual tracking task. During tracking, in order to improve the efficiency of parameter adjustment, the pre-trained convolutional layer parameters are fixed and only the fully connected layer and output layer parameters are updated online, to adapt to changes of the target and background.
To establish the training set, the target region in the first frame is chosen manually in the tracking initialization phase, positive and negative training samples are sampled according to the target region, and the positive or negative attribute of a sample is judged by its coverage rate with the target region (threshold set to 0.5). To increase the number of training samples, data augmentation is realized by applying random scaling and rotation transformations to the samples. In subsequent tracking, through the risk assessment of the classification results, training samples are drawn centred on tracking results that satisfy the confidence condition. Let the training set be T = {(x^(1), y^(1)), ..., (x^(n), y^(n))}, where y^(i) ∈ {C- = 0, C+ = 1}, the class label C- is the background, C+ is the target, and x^(i) ∈ Z^d is the target state vector, including position and scale. The probability that a sample belongs to the target or the background is computed with logistic regression in the output layer:
R(y | x; θ) = h_θ(x)^y (1 − h_θ(x))^(1−y)    (1);
where h_θ(x) is the logistic (sigmoid) output of the network and θ denotes the network model parameters. The model is trained with the training set T so that the log-likelihood loss function L(θ) is minimized:
L(θ) = −Σ_{i=1}^{n} [ y^(i) ln h_θ(x^(i)) + (1 − y^(i)) ln(1 − h_θ(x^(i))) ]    (2);
the network weights and biases are adjusted along the negative gradient direction of L(θ) using stochastic gradient descent, and the back-propagation method iteratively updates the parameters of each layer above the convolutional layers.
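A minimal numpy sketch of this training step under stated assumptions: the frozen convolutional branches are replaced by fixed random features, full-batch gradient descent stands in for stochastic gradient descent for brevity, and the learning rate, feature dimension and sample counts are illustrative; formula (2) is taken as the negative log-likelihood of formula (1).

```python
import numpy as np

rng = np.random.default_rng(0)

def h_theta(theta, feats):
    return 1.0 / (1.0 + np.exp(-feats @ theta))          # logistic output, formula (1)

def train_output_layer(feats, labels, lr=0.1, epochs=200):
    """feats: (n, d) fused high-level features; labels: (n,) with C+ = 1, C- = 0.
    Minimises the log-likelihood loss of formula (2) by gradient descent."""
    n, d = feats.shape
    theta = np.zeros(d)
    for _ in range(epochs):
        p = h_theta(theta, feats)
        grad = feats.T @ (p - labels) / n                 # dL/dtheta
        theta -= lr * grad                                # step along -gradient
    p = h_theta(theta, feats)
    loss = -np.mean(labels * np.log(p) + (1 - labels) * np.log(1 - p))
    return theta, loss

# Toy usage: separable random "features" for 60 positive and 60 negative samples.
pos = rng.normal(+1.0, 1.0, size=(60, 16))
neg = rng.normal(-1.0, 1.0, size=(60, 16))
feats = np.vstack([pos, neg])
labels = np.concatenate([np.ones(60), np.zeros(60)])
theta, loss = train_output_layer(feats, labels)
print(round(loss, 3))
```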
2) Candidate target selection based on the conformal predictor:
The logistic regression value provides a basis for predicting the sample class, but the regression value itself cannot assess the risk of prediction error in theory. In order to realize a reliability analysis of the prediction results, the CNN model is embedded into the CP framework, the class confidence of a sample is computed according to its algorithmic randomness level, and the candidate targets are then selected.
2.1) Conformal predictor:
Most current machine learning algorithms lack an effective reliability analysis of their prediction results, that is, a quantified confidence that evaluates to what extent the prediction is correct, together with an adjustable standard for measuring that confidence. CP is a machine learning paradigm that can output confidence effectively; it makes predictions with a hypothesis-testing method and provides a reliability assessment of the prediction results.
The computational cost of the traditional CP algorithm is very high. To improve efficiency, the present invention uses an improved CP algorithm, the inductive conformal predictor (ICP), to predict the sample class. In the ICP algorithm it is first assumed that the samples in the training set are independent and identically distributed, and the training set T = {(x^(1), y^(1)), ..., (x^(n), y^(n))} is divided into two parts: the first m samples form the proper training set Ta = {(x^(1), y^(1)), ..., (x^(m), y^(m))}, and the last q samples form the calibration set Tb = {(x^(m+1), y^(m+1)), ..., (x^(m+q), y^(m+q))}, with n = m + q. The proper training set Ta is used to update the CNN parameters, while the calibration set Tb, together with the sample to be recognized, constitutes a checking sequence, and the algorithmic randomness test is used to determine the sample class.
The algorithmic randomness test is as follows: first a mapping function A: Z^(q-1) × Z → R is defined, and each sample in the calibration set Tb is mapped one by one to the nonconformity-score space, giving the nonconformity score sequence α_{m+1}, ..., α_{m+q}. The nonconformity score reflects the inconsistency of a sample with the overall sample distribution. Let the target state of the sample to be recognized be x_s; x_s is assigned the class labels C- and C+ respectively, forming two test samples (x_s, y_i), i = 0, 1. After the nonconformity score of each test sample is computed, it forms, together with the nonconformity scores of the calibration set Tb, two checking sequences. The p-value test statistic, computed as in formula (3), gives the algorithmic randomness level of each sequence,
where p_s(y_i) denotes the p-value when the target state x_s is labelled y_i, i.e. the confidence that x_s belongs to class y_i. Given the algorithm risk-level threshold ε, the hypotheses whose p-values exceed ε form the output of the ICP (formula (4)).
When the true class y_s of x_s is not in the prediction set, a prediction error is considered to have occurred; according to the validity theorem of conformal predictors [10], the error rate is no greater than the algorithm risk level ε, i.e.:
P{p_s(y_s) ≤ ε} ≤ ε    (5);
Therefore, the prediction region of the ICP is adjustable.
2.2) Sample nonconformity function:
The algorithmic randomness test of the sequence requires a nonconformity-score mapping function to be defined first, which measures the degree to which the sample to be tested conforms to the overall sample distribution. Conformity is analysed from the regression values output by the CNN: the larger the regression value of the sample features for the true class, the stronger the conformity of the sample with the calibration-set sequence. The nonconformity-score function is defined in formula (6),
where R_y(x^(i)) is the regression value of x^(i) for class y computed by formula (1), and the parameter γ adjusts the sensitivity of the nonconformity score α_i to changes in the regression value: the smaller γ is, the more sensitive α_i is to changes in R_y(x^(i)).
2.3) Candidate target selection:
The output of the ICP is a set that may contain multiple classes. For the binary classification of the sample to be recognized, the ICP output has four possibilities: φ, {C-}, {C+}, {C+, C-}. Each output result is accompanied, in addition to the class information, by a confidence p-value. According to the region prediction results of all samples, the samples with high confidence are selected from each frame as candidate targets. Specifically, for the image frame at time t, the samples whose output is {C+} or {C+, C-} are sorted by the confidence value p(C+), and the largest Nc samples are chosen to establish the candidate target set O_t, so that |O_t| ≤ Nc.
3) Target tracking algorithm:
3.1) Spatio-temporal energy function:
The candidate target set O_t contains several possible states of the target at time t, and the target will transit from some state in O_t to some state in the candidate target set O_{t+1} at the next moment; target tracking can therefore be regarded as the problem of finding an optimal path. To obtain the optimal path, a spatio-temporal energy function E_Track is defined to characterize the target trajectory, and the target trajectory is obtained by optimizing this energy function (formula (7)).
E_Track consists of two parts, a local cost term E_Local and a pairwise cost term E_Pairwise.
E_Local is defined as the sum, over all moments, of the CNN output values of the target state x_t for the background class. Since partial occlusion of the target can reduce the reliability of the local cost term, a robust estimator is introduced to reduce the influence of outliers on the function optimization; E_Local is defined in formula (8),
where R_{C-}(x) is the regression value of the target state x for the background class, and ρ(·) is the Huber operator (formula (9)), used to enhance the reliability of the local cost term.
E_Pairwise describes the degree of change of the target state. When target occlusion, cluttered background or target pose changes occur in the sequence, the target state may change abruptly because of large estimation errors. Assuming that the motion of the target is coherent, the role of E_Pairwise is to penalize abrupt change points in the trajectory when the energy function is optimized, so that the trajectory has a certain smoothness; E_Pairwise is defined in formula (10).
The energy function in formula (7) is optimized with a dynamic programming method to obtain the optimal motion trajectory.
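A minimal sketch of this dynamic programming step under stated assumptions: since formulas (7)-(10) are not reproduced in this text, the local cost is taken as a Huber-robustified background regression value and the pairwise cost as the squared distance between consecutive candidate states weighted by an illustrative factor lam; the Viterbi-style recursion itself is standard.

```python
import numpy as np

def huber(e, delta=0.4):
    # Robust operator: quadratic near zero, linear in the tails (assumed form of (9)).
    e = np.abs(e)
    return np.where(e <= delta, e**2, 2 * delta * e - delta**2)

def best_trajectory(candidates, bg_scores, lam=1.0, delta=0.4):
    """candidates[t]: (Nc, 2) candidate positions in frame t of the segment.
    bg_scores[t]: (Nc,) regression values R(C- | x) of those candidates.
    Minimises sum_t huber(bg) + lam * ||x_t - x_{t-1}||^2 by dynamic programming."""
    T = len(candidates)
    cost = huber(bg_scores[0], delta)            # accumulated cost per candidate
    back = []
    for t in range(1, T):
        # pairwise term between every state at t-1 and every state at t
        diff = candidates[t - 1][:, None, :] - candidates[t][None, :, :]
        pair = lam * np.sum(diff**2, axis=2)     # (Nc_prev, Nc_cur)
        total = cost[:, None] + pair
        back.append(np.argmin(total, axis=0))    # best predecessor for each state
        cost = total[back[-1], np.arange(total.shape[1])] + huber(bg_scores[t], delta)
    # Backtrack the optimal path.
    idx = int(np.argmin(cost))
    path = [idx]
    for b in reversed(back):
        idx = int(b[idx])
        path.append(idx)
    path.reverse()
    return [candidates[t][i] for t, i in enumerate(path)]

# Toy usage: 5 frames, 4 candidates each.
rng = np.random.default_rng(2)
cands = [rng.uniform(0, 10, size=(4, 2)) for _ in range(5)]
bgs = [rng.uniform(0, 1, size=4) for _ in range(5)]
print(len(best_trajectory(cands, bgs)))   # 5
```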
3.2) Training sample update:
During tracking, the CNN model parameters are updated with the tracking results of the previous sequence segment before the next sequence segment is processed. To avoid model drift, training samples are only collected around tracking results with high reliability. The tracking result at moment t is selected according to its confidence p-value: if p is greater than the set threshold α, positive and negative training samples are sampled around it; otherwise the judgement is deferred to the next moment.
Negative samples in the training set commonly suffer from redundancy; redundant negative samples contribute very little to model training and waste computing resources. For this reason, the training set is optimized and training efficiency improved by mining hard negative samples. It is observed that samples whose region prediction result is {C+, C-} usually appear when objects in the background are easily confused with the target, so hard negative samples can be selected from this kind of sample. A simple selection rule is to judge whether such a sample has region overlap with the current tracking result and, if it overlaps, to add it to the training set as a negative sample.
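For illustration, a minimal sketch of this update rule under stated assumptions: box overlap is measured with a simple IoU helper, the confidence threshold alpha follows the text, and a hard negative is taken to be an ambiguous sample that overlaps the tracking result but stays below the 0.5 positive-coverage threshold used earlier; this overlap range is an interpretation, and the names update_samples and iou are hypothetical.

```python
def iou(a, b):
    """Intersection-over-union of two boxes (x, y, w, h)."""
    ax1, ay1, ax2, ay2 = a[0], a[1], a[0] + a[2], a[1] + a[3]
    bx1, by1, bx2, by2 = b[0], b[1], b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def update_samples(result_box, p_value, ambiguous_boxes, alpha=0.6):
    """result_box: tracking result at time t; p_value: its confidence p(C+).
    ambiguous_boxes: samples whose prediction set was {C+, C-}.
    Returns (positives, negatives) to add to the training set, or ([], []) if the
    result is not confident enough (decision deferred to the next moment)."""
    if p_value <= alpha:
        return [], []
    positives = [result_box]                         # sample positives around the result
    negatives = [b for b in ambiguous_boxes          # hard negatives: ambiguous samples
                 if 0.0 < iou(b, result_box) < 0.5]  # overlapping below the 0.5 threshold
    return positives, negatives

# Toy usage.
pos, neg = update_samples((10, 10, 20, 20), 0.8,
                          [(25, 25, 20, 20), (80, 80, 20, 20)])
print(len(pos), len(neg))   # 1 1
```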
3.3) Tracking algorithm steps:
The proposed visual tracking algorithm is summarized as follows:
4) Experimental results and analysis:
To verify the validity of the algorithm, simulation experiments were carried out in Matlab on a hardware platform with a 3.4 GHz Intel i7-6700 CPU and 8 GB of memory. The parameters of the algorithm were set as follows: proper training set Ta size m = 300, calibration set Tb size q = 30, algorithm risk level ε = 0.4, sample nonconformity parameter γ = 0.5, candidate target set size limit Nc = 20, robust function parameter δ = 0.4, and training-sample update parameter α = 0.6. The algorithm parameters remained unchanged throughout the experiments, and the average processing speed of the algorithm was about 8 frames per second.
Video sequences from the public data set TOP100 were selected as experimental subjects, and the results were compared with a variety of current mainstream tracking algorithms, including VTS, LOT, STRUCK, MIL and KCF. In order to verify the validity of the CP, a simplified version of the algorithm of the invention was also tested, in which the CP is not introduced and the Nc samples with the largest regression values output directly by the CNN are selected as candidate regions. Two criteria, coverage rate and center location error, were used to compare the performance of the algorithms. The coverage rate is defined as Cr = (Rs ∩ Rt) / (Rs ∪ Rt), where Rs and Rt are the tracking-result region and the real target region respectively; the center location error is the Euclidean distance between the center point of the tracking result and the ground-truth center point. Part of the test results are shown; the selected video sequences contain some typical complex situations, such as target occlusion, appearance variation, illumination change and complex background.
The characteristic of the FaceOcc1 video sequence is that the face is occluded repeatedly, with varying positions and degrees of occlusion. It can be seen that under occlusion the tracking result of LOT is limited to the part that is not occluded, with a large scale error, while MIL drifts considerably when the target is heavily occluded, as in frame 834. The center error and coverage data show that the algorithm of the invention (Ours), Ours (no CP) and KCF can always locate the target accurately in this sequence and are more robust to occlusion.
The video sequence Bolt is a sprint-race scene, and the task is to track one of the athletes. The challenge of this sequence is that the pose of the target changes constantly and, as the camera rotates, the athlete gradually turns from the front to the back in the image, so the appearance variation of the target is very large.
The algorithm of the invention uses high-level features and is little affected by the appearance variation of the target, and it avoids drift by updating the model according to the reliability analysis; the error analysis shows that, among the compared algorithms, the tracking error of the algorithm of the invention is the smallest.
The Football video sequence is a rugby match scene, and the tracking target is the head of one athlete; part of the tracking results of this sequence is shown. The difficulty of the sequence is that there are many athletes in the background with very similar appearance, and their frequent movements back and forth interfere with target tracking. VTS, MIL, KCF, STRUCK and Ours (no CP) drift considerably several times, and at frame 360 VTS, MIL and KCF completely track other athletes instead. The algorithm of the invention (Ours) ensures a smooth trajectory through spatio-temporal trajectory optimization and reduces the influence of interference from similar targets. The data show that Ours maintains the smallest tracking error on this sequence.
The CarDark video sequence tracks the rear of a car; its characteristics are drastic illumination changes, a cluttered background and low image resolution. In the displayed tracking results, LOT is interfered with by the light appearing on the left of the target at frame 58, deviates substantially from the target and has a large scale error, while MIL and VTS also drift to some degree. With continuous interference from the light on the left, MIL and LOT lose the target at frame 208, and at frame 315 the bright reflection on the road surface also causes VTS to lose the target. STRUCK, KCF, Ours (no CP) and the algorithm of the invention (Ours) remain stable while tracking the rear of the car, but Ours (no CP) generally has a larger scale error. The tracking error analysis of this sequence shows that the tracking accuracy of KCF and STRUCK is slightly lower than that of Ours.
In order to compare the overall performance of the 7 algorithms, the present invention gives their one-pass evaluation (OPE) results on all test sequences; the performance of the algorithms can be ranked by the area under the curve (AUC). It can be seen that the algorithm of the invention (Ours) is higher than the other algorithms in both position precision and coverage success rate, with KCF and Ours being closest in performance, while the performance of Ours (no CP), without the CP, declines, especially in coverage success rate.
Table 1 reports the mean center location error and the average coverage rate of the 7 algorithms. The performance indicators of the algorithm of the invention are better than those of the other algorithms, showing that the depth features extracted by the CNN network in the algorithm of the invention can distinguish the target from the background well, and that using the ICP to evaluate the confidence of the classification results effectively ensures the reliability of tracking; good performance is shown on video sequences with a variety of typical complex situations.
The present invention proposes a target tracking algorithm based on a convolutional neural network and a conformal predictor. The algorithm uses a convolutional neural network to extract high-level image features for representing the target, overcoming the sensitivity of low-level features to changes in target appearance. In order to adapt to different types of tracking targets, a dual-input network structure is designed which, combined with the target template, distinguishes the target from the background using logistic regression. To further improve tracking robustness, a conformal predictor is introduced to perform a reliability analysis of the classification results, and the classification results that satisfy the confidence condition are selected as candidate target regions; the final target trajectory is obtained by optimizing a spatio-temporal global energy function. Comparative experiments with a variety of currently popular tracking algorithms on public data sets show that the algorithm of the invention has better tracking robustness and accuracy.
The application effect of the present invention is explained in detail below with reference to the experiments.
To verify the validity of the algorithm, simulation experiments were carried out in Matlab on a hardware platform with a 3.4 GHz Intel i7-6700 CPU and 8 GB of memory. The parameters of the algorithm were set as follows: proper training set Ta size m = 300, calibration set Tb size q = 30, algorithm risk level ε = 0.4, sample nonconformity parameter γ = 0.5, candidate target set size limit Nc = 20, robust function parameter δ = 0.4, and training-sample update parameter α = 0.6. The algorithm parameters remained unchanged throughout the experiments, and the average processing speed of the algorithm was about 8 frames per second.
Video sequences from the public data set TOP100 [14] were selected as experimental subjects, and the results were compared with a variety of current mainstream tracking algorithms, including VTS [15], LOT [16], STRUCK [1], MIL [17] and KCF [2]. In order to verify the validity of the CP, a simplified version of the algorithm of the invention was also tested, in which the CP is not introduced and the Nc samples with the largest regression values output directly by the CNN are selected as candidate regions. Two criteria, coverage rate and center location error [18], were used to compare the performance of the algorithms. The coverage rate is defined as Cr = (Rs ∩ Rt) / (Rs ∪ Rt), where Rs and Rt are the tracking-result region and the real target region respectively; the center location error is the Euclidean distance between the center point of the tracking result and the ground-truth center point. Fig. 4 shows part of the test results; the selected video sequences contain some typical complex situations, such as target occlusion, appearance variation, illumination change and complex background.
Part of the tracking results of the FaceOcc1 video sequence is shown in Fig. 4(a); the tracking target is the face of a woman. The characteristic of this sequence is that the face is repeatedly occluded by a book, with varying positions and degrees of occlusion. It can be seen from the figure that under occlusion the tracking result of LOT is limited to the part that is not occluded, with a large scale error, while MIL drifts considerably when the target is heavily occluded, as in frame 834. The center errors in Fig. 5(a) and the coverage rates in Fig. 6(a) show that the algorithm of the invention (Ours), Ours (no CP) and KCF can always locate the target accurately in this sequence and are more robust to occlusion.
The video sequence Bolt is a sprint-race scene, and the task is to track one of the athletes. The challenge of this sequence is that the pose of the target changes constantly and, as the camera rotates, the athlete gradually turns from the front to the back in the image, so the appearance variation of the target is very large. In the results shown in Fig. 4(b), VTS, STRUCK and MIL drift shortly after the sequence starts and have all left the target by frame 48; KCF, LOT, Ours and Ours (no CP) can keep up with the target, but LOT and Ours (no CP) show large scale errors when the target deforms, as in frame 222. The algorithm of the invention uses high-level features and is little affected by the appearance variation of the target, and it avoids drift by updating the model according to the reliability analysis; the error analysis in Fig. 5(b) and Fig. 6(b) shows that, among the compared algorithms, the tracking error of the algorithm of the invention is the smallest.
The Football video sequence is a rugby match scene, and the tracking target is the head of one athlete; Fig. 4(c) shows part of the tracking results of this sequence. The difficulty of the sequence is that there are many athletes in the background with very similar appearance, and their frequent movements back and forth interfere with target tracking. VTS, MIL, KCF, STRUCK and Ours (no CP) drift considerably several times, and at frame 360 VTS, MIL and KCF completely track other athletes instead. The algorithm of the invention (Ours) ensures a smooth trajectory through spatio-temporal trajectory optimization and reduces the influence of interference from similar targets. The data in Fig. 5(c) and Fig. 6(c) show that Ours maintains the smallest tracking error on this sequence.
The CarDark video sequence tracks the rear of a car; its characteristics are drastic illumination changes, a cluttered background and low image resolution. In the tracking results shown in Fig. 4(d), LOT is interfered with by the light appearing on the left of the target at frame 58, deviates substantially from the target and has a large scale error, while MIL and VTS also drift to some degree. With continuous interference from the light on the left, MIL and LOT lose the target at frame 208, and at frame 315 the bright reflection on the road surface also causes VTS to lose the target. STRUCK, KCF, Ours (no CP) and the algorithm of the invention (Ours) remain stable while tracking the rear of the car, but Ours (no CP) generally has a larger scale error. The tracking error analysis of this sequence in Fig. 5(d) and Fig. 6(d) shows that the tracking accuracy of KCF and STRUCK is slightly lower than that of Ours.
In order to compare the overall performance of the 7 algorithms, Fig. 7 gives their one-pass evaluation (OPE) results [14] on all test sequences, including the position precision plot (Fig. 7(a)) and the coverage success-rate plot (Fig. 7(b)). The performance of the algorithms can be ranked by the area under the curve (AUC); it can be seen that the algorithm of the invention (Ours) is higher than the other algorithms in both position precision and coverage success rate, with KCF and Ours being closest in performance, while the performance of Ours (no CP), without the CP, declines, especially in coverage success rate (see Fig. 7(b)).
Table 1 reports the mean center location error and the average coverage rate of the 7 algorithms. The performance indicators of the algorithm of the invention are better than those of the other algorithms, showing that the depth features extracted by the CNN network in the algorithm of the invention can distinguish the target from the background well, and that using the ICP to evaluate the confidence of the classification results effectively ensures the reliability of tracking; good performance is shown on video sequences with a variety of typical complex situations.
Table 1. Mean center location error and average coverage rate
The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention; any modification, equivalent replacement and improvement made within the spirit and principles of the present invention shall be included in the protection scope of the present invention.

Claims (10)

1. A visual tracking method based on a conformal predictor model, characterized in that the visual tracking method based on the conformal predictor model comprises:
first building a dual-input convolutional neural network model that extracts high-level features from the sampled regions of a video frame and from the target template simultaneously, and distinguishes the target from the background using logistic regression;
then embedding the convolutional neural network into a conformal predictor framework, assessing the reliability of the classification results with an algorithmic randomness test and, at a specified risk level, outputting classification results with confidence in the form of prediction regions;
finally selecting high-confidence regions as candidate target regions and obtaining the target trajectory by optimizing a spatio-temporal global energy function.
2. The visual tracking method based on a conformal predictor model according to claim 1, characterized in that the visual tracking method based on the conformal predictor model specifically comprises:
Input: target initial state x0, the pre-trained CNN, and a sequence of N images;
Output: target trajectory T;
an initialization phase, including:
taking the image region corresponding to x0 as the input template of the CNN;
collecting positive and negative samples around x0, establishing the training set T, and dividing it into a proper training set Ta and a calibration set Tb;
using Ta to train and adjust the fully connected layer and the output layer of the CNN;
a tracking phase, including:
dividing the image sequence into K segments and processing the segments k = 1, ..., K in turn;
estimating the target trajectory of the k-th segment;
updating the training set T: selecting tracking results with high confidence according to their p-values to update the training set, and mining hard negative samples to add to T;
concatenating the target trajectory, T ← T ∪ Tk; if the last segment has been processed, outputting the trajectory T; otherwise setting k = k + 1 and returning to the step of estimating the target trajectory of the k-th segment.
3. The visual tracking method based on a conformal predictor model according to claim 2, characterized in that, in estimating the target trajectory of the k-th segment, the processing procedure includes:
establishing the candidate target sets of all frames;
(1) letting the current time be t and O_t = φ; centred on the target state with the highest p-value in the image at time t-1, performing Gaussian random sampling over position and scale to obtain M samples x_t^(j), j = 1, ..., M, where the Gaussian covariance is the diagonal matrix Diag(0.1r², 0.1r², 0.2) and r is the average of the length and width of the previous target state;
(2) using the CNN to compute the regression values of the samples, j = 1, ..., M;
(3) according to the calibration set Tb, computing the confidence of each sample with formula (3), j = 1, ..., M;
(4) according to the risk threshold ε, obtaining the region prediction result of each sample with formula (4), j = 1, ..., M; choosing the samples whose output result is {C+} or {C+, C-} and whose confidence p(C+) ranks in the top Nc, and adding them to the candidate target set O_t;
(5) letting t = t + 1; if t > nl the processing ends, otherwise going to (1).
4. The visual tracking method based on a conformal predictor model according to claim 2, characterized in that, in estimating the target trajectory of the k-th segment, the processing procedure further includes:
obtaining the target trajectory of the k-th segment by optimizing the energy function E_Track.
5. The visual tracking method based on a conformal predictor model according to claim 1, characterized in that the dual-input convolutional neural network model includes a CNN network structure: the target template and the image to be recognized are fed into the network as two inputs at the same time; after features are extracted by the convolutional layers, they are merged in the fully connected layer to form discriminative features, and finally logistic regression is carried out in the output layer to realize classification; the target template is obtained manually from the first frame of the image sequence, while the image to be recognized is a local region sampled from the sequence images; the CNN structure contains two independent sets of convolutional layers that share the same structure and parameters; the two inputs are mapped to high-level features by the convolutional layers, then merged in the fully connected layer and further mapped to features that discriminate between target and background; the output layer is a logistic regression classifier that predicts, via logistic regression, whether the input sample belongs to the target class or the background class.
6. The visual tracking method based on the conformal predictor model as claimed in claim 5, characterized in that the dual-input convolutional neural network model further comprises network parameter training:
The convolutional layers of the CNN are pre-trained offline on a dataset so that they can extract generic target features; during pre-training the CNN has a single-input structure, and the trained parameters are shared by the two sets of convolutional layers;
During pre-training the output layer of the CNN has 10 units, which are replaced by a single unit afterwards so that the output layer corresponds to the binary classification of the tracking task; after pre-training, the CNN is fine-tuned for the actual tracking task; during tracking, the pre-trained convolutional layer parameters are kept fixed and only the fully connected layer and output layer parameters are updated online, to adapt to changes of the target and the background;
To build the training set, the target region is selected by hand in the first frame during the tracking initialization phase; positive and negative training samples are drawn around the target region, their positive/negative labels are determined by the overlap rate between a sample and the target region, and the overlap threshold is set to 0.5;
Data augmentation is realized by applying random scale and rotation transformations to the samples; in subsequent tracking, training samples are drawn centered on tracking results that satisfy the confidence condition given by the risk assessment of the classification results; let the training set be T = {(x^(1), y^(1)), ..., (x^(n), y^(n))}, where y^(i) ∈ {C− = 0, C+ = 1}, the class label C− denotes background and C+ denotes target, and x^(i) ∈ Z^d is the target state vector including position and scale; the output layer uses logistic regression to compute the probability that a sample belongs to the target or the background:
$$R(y \mid x; \theta) = h_\theta(x)^{y}\,\bigl(1 - h_\theta(x)\bigr)^{1-y} \qquad (1)$$
where h_θ(x) denotes the logistic (sigmoid) output of the network for input x and θ denotes the network model parameters; the model is trained on the training set T so that the log-likelihood loss function L(θ), of the standard negative log-likelihood form, is minimized:

$$L(\theta) = -\sum_{i=1}^{n} \Bigl[\, y^{(i)} \log h_\theta(x^{(i)}) + \bigl(1 - y^{(i)}\bigr) \log\bigl(1 - h_\theta(x^{(i)})\bigr) \,\Bigr] \qquad (2)$$
The network weights and biases are adjusted along the negative gradient direction of L(θ) by stochastic gradient descent, and the parameters of all layers above the convolutional layers are updated iteratively by backpropagation.
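Continuing the DualInputCNN sketch above, the following snippet illustrates the online update described in this claim: the convolutional parameters are frozen and only the fully connected and output layers are trained with the logistic (binary cross-entropy) loss by stochastic gradient descent; the batch construction is a toy assumption.

```python
# Online update sketch: freeze the pre-trained convolutional layers, train FC + output only.
import torch
import torch.nn as nn

model = DualInputCNN()
for p in model.conv.parameters():           # keep the pre-trained convolutional layers fixed
    p.requires_grad = False

trainable = list(model.fc.parameters()) + list(model.out.parameters())
optimizer = torch.optim.SGD(trainable, lr=1e-3)
bce = nn.BCELoss()                          # binary cross-entropy = negative log-likelihood L(theta)

# toy online step: templates, sampled regions, and 0/1 labels (background / target)
templates = torch.randn(8, 3, 64, 64)
regions = torch.randn(8, 3, 64, 64)
labels = torch.randint(0, 2, (8, 1)).float()

optimizer.zero_grad()
loss = bce(model(templates, regions), labels)
loss.backward()                             # backpropagation through FC and output layers only
optimizer.step()
print(loss.item())
```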
7. The visual tracking method based on the conformal predictor model as described in claim 1, characterized in that:
The conformal predictor uses an improved CP algorithm to predict the class of a sample; in the improved CP algorithm, the samples in the training set are first assumed to be independent and identically distributed, and the training set T = {(x^(1), y^(1)), ..., (x^(n), y^(n))} is divided into two parts: the first m samples form the proper training set Ta = {(x^(1), y^(1)), ..., (x^(m), y^(m))}, and the remaining q samples form the calibration set Tb = {(x^(m+1), y^(m+1)), ..., (x^(m+q), y^(m+q))}, with n = m + q; the proper training set Ta is used to update the CNN parameters, while the calibration set Tb, together with the sample to be identified, forms a test sequence whose algorithmic randomness is examined to determine the sample class;
The algorithmic randomness test of the sequence is as follows: first, a mapping function A: Z^(q−1) × Z → R is defined, and each sample in the calibration set Tb is mapped one by one into the singular value (strangeness/nonconformity score) space, yielding the singular value sequence α_(m+1), ..., α_(m+q);
Let the target state of the sample to be identified be x_s; assign x_s the class labels C− and C+ respectively to form two test samples (x_s, y_i), i = 0, 1; after the singular values of the test samples are computed, they are combined with the corresponding singular values of the calibration set Tb to form two test sequences, i = 0, 1; the algorithmic randomness level of each sequence is obtained by computing the test statistic, the p-value:

$$p_s(y_i) = \frac{\bigl|\{\, j = m+1, \dots, m+q : \alpha_j \ge \alpha_s^{(i)} \,\}\bigr| + 1}{q + 1} \qquad (3)$$
where p_s(y_i) denotes the p-value when the target state x_s is labeled y_i, i.e., the confidence that x_s belongs to class y_i; given the algorithm risk level threshold ε, the hypotheses whose p-values exceed ε are output as the prediction set of the ICP:

$$\Gamma_s^{\varepsilon} = \{\, y_i : p_s(y_i) > \varepsilon,\ i = 0, 1 \,\} \qquad (4)$$
When the true class y_s of x_s is not contained in Γ_s^ε, a prediction error is considered to have occurred; according to the validity theorem of the conformal predictor, the error rate is not greater than the algorithm risk level ε, i.e.:
$$P\{\, p_s(y_s) \le \varepsilon \,\} \le \varepsilon \qquad (5)$$
In the algorithmic randomness test of the sequence, the singular value mapping function is defined first; it measures the degree to which the sample under test conforms to the overall sample distribution; the conformity is analyzed according to the regression value output by the CNN: the larger the regression value of the sample's features for its true class, the stronger the conformity between the sample and the calibration sequence; the singular value function is defined as:
where R_y(x^(i)) is the regression value of x^(i) for class y computed by formula (1), and the parameter γ adjusts the sensitivity of the singular value α_i to changes in the regression value; the smaller γ is, the more sensitive α_i is to changes in R_y(x^(i)).
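The inductive conformal prediction step can be sketched as follows; the exponential score exp(−R/γ), chosen so that a smaller γ makes the score more sensitive to the regression value, is an assumption consistent with the description above rather than the patent's exact formula, and the function names are placeholders.

```python
# ICP sketch: nonconformity scores from CNN regression values, p-values against the
# calibration scores, and labels with p-value > epsilon kept as the region prediction.
import numpy as np

def nonconformity(regression_value, gamma=0.5):
    # assumed score form: larger regression value for the class -> smaller (more conforming) score
    return np.exp(-np.asarray(regression_value) / gamma)

def icp_predict(r_pos, r_neg, calib_scores, epsilon=0.1, gamma=0.5):
    """r_pos / r_neg: CNN regression values of the test sample for classes C+ / C-."""
    p_values = {}
    for label, r in (("C+", r_pos), ("C-", r_neg)):
        alpha_s = nonconformity(r, gamma)
        # fraction of the test sequence (calibration scores plus the test score)
        # that is at least as nonconforming as the test sample
        p_values[label] = (np.sum(calib_scores >= alpha_s) + 1) / (len(calib_scores) + 1)
    region = {label for label, p in p_values.items() if p > epsilon}
    return region, p_values

calib = nonconformity(np.random.default_rng(0).uniform(0.2, 0.9, size=200))
print(icp_predict(r_pos=0.85, r_neg=0.30, calib_scores=calib))
```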
8. The visual tracking method based on the conformal predictor model as claimed in claim 7, characterized in that the result output by the improved CP algorithm may contain multiple classes; for the binary classification of a sample to be identified, the improved CP algorithm outputs one of the four results ∅, {C−}, {C+}, {C+, C−}; besides the class information, each output result also carries a confidence p-value; according to the region prediction results of all samples, high-confidence samples are selected from them as the candidate targets of each frame;
Specifically: for the image frame at time t, the samples in the frame whose output is {C+} or {C+, C−} are sorted by their confidence p(C+), and the Nc samples with the largest values are chosen to build the candidate target set Ot, where |Ot| ≤ Nc;
The candidate target set Ot contains several possible states of the target at time t; the target moves from some state in Ot to some state in the candidate target set Ot+1 at the next time instant;
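A small sketch of building the candidate set Ot from the per-sample region predictions is given below; the data layout of the samples is an assumption made for illustration.

```python
# Build O_t: keep samples whose region prediction contains C+, rank by p(C+), take at most N_c.
def build_candidate_set(samples, n_c=5):
    """samples: list of (state, region, p_values) tuples produced for one frame."""
    kept = [(state, p["C+"]) for state, region, p in samples if "C+" in region]
    kept.sort(key=lambda item: item[1], reverse=True)
    return kept[:n_c]                      # |O_t| <= N_c

frame_samples = [((10, 20, 1.0), {"C+"}, {"C+": 0.80, "C-": 0.05}),
                 ((12, 19, 1.0), {"C+", "C-"}, {"C+": 0.60, "C-": 0.30}),
                 ((40, 70, 1.0), {"C-"}, {"C+": 0.02, "C-": 0.70})]
print(build_candidate_set(frame_samples, n_c=2))
```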
Target tracking is thereby cast as finding the optimal path: a spatio-temporal energy function E_Track is defined to characterize the target trajectory, and the target trajectory of the segment is obtained by optimizing this energy function (formula (7));
where E_Track consists of two parts, a local cost term E_Local and a pairwise cost term E_Pairwise;
E_Local is defined as the sum, over all time instants, of the CNN output values of the target states x_t for the background class;
The local cost term is designed to cope with partial occlusion of the target: a robust estimator is introduced to reduce the influence of outliers on the function optimization, and E_Local is defined as:
where R_(C−)(x_t) denotes the regression value of the target state x_t for the background class, and ρ(·) is the Huber operator, which enhances the reliability of the local cost term and is defined as:
E_Pairwise describes the degree of variation of the target state; when target occlusion, cluttered background, or target pose changes occur in the sequence, the target state may change abruptly because the estimation error becomes large; assuming that the motion of the target is coherent, the role of E_Pairwise is to penalize abrupt points in the trajectory during the optimization of the energy function, so that the trajectory has a certain smoothness; E_Pairwise is defined as:
The energy function in formula (7) is optimized by dynamic programming to obtain the optimal motion trajectory.
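The dynamic programming optimization can be illustrated with the following Viterbi-style sketch over the per-frame candidate sets; the concrete local and pairwise cost functions passed in the toy example are placeholders, not the patent's E_Local and E_Pairwise.

```python
# Viterbi-style dynamic programming over candidate sets O_t for an energy of the form
# sum_t local_cost(state_t) + sum_t pairwise_cost(state_t, state_{t+1}).
import numpy as np

def optimize_trajectory(candidate_sets, local_cost, pairwise_cost):
    n = len(candidate_sets)
    best = [np.array([local_cost(s) for s in candidate_sets[0]])]
    back = []
    for t in range(1, n):
        prev, cur = candidate_sets[t - 1], candidate_sets[t]
        trans = np.array([[pairwise_cost(p, c) for p in prev] for c in cur])
        total = best[-1][None, :] + trans            # shape (|O_t|, |O_{t-1}|)
        back.append(total.argmin(axis=1))            # best predecessor for each current state
        best.append(total.min(axis=1) + np.array([local_cost(c) for c in cur]))
    # backtrack the minimum-energy path
    idx = int(best[-1].argmin())
    path = [candidate_sets[-1][idx]]
    for t in range(n - 2, -1, -1):
        idx = int(back[t][idx])
        path.append(candidate_sets[t][idx])
    return path[::-1]

sets = [[np.array([0.0, 0.0]), np.array([5.0, 5.0])],
        [np.array([0.5, 0.2]), np.array([6.0, 6.0])],
        [np.array([1.0, 0.5])]]
traj = optimize_trajectory(sets,
                           local_cost=lambda s: float(s[0]) * 0.01,            # stand-in cost
                           pairwise_cost=lambda a, b: float(np.sum((a - b) ** 2)))
print(traj)
```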
9. The visual tracking method based on the conformal predictor model as claimed in claim 2, characterized in that the training sample update comprises:
during tracking, the CNN model parameters are updated with the tracking results of the previous segment, and the next segment is then processed; the tracking result at time t is selected according to its confidence p-value: if p is greater than the set threshold α, positive and negative training samples are drawn around it; otherwise the decision is deferred to the next time instant.
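The p-value-based update rule can be sketched as follows; the data layout and the function name select_update_frames are assumptions for illustration.

```python
# Keep only tracking results whose p-value exceeds alpha for resampling training data.
def select_update_frames(segment_results, alpha=0.5):
    """segment_results: list of (time, state, p_value) for one processed segment."""
    return [(t, state) for t, state, p in segment_results if p > alpha]

results = [(0, (10, 20, 1.0), 0.82), (1, (11, 21, 1.0), 0.35), (2, (12, 22, 1.0), 0.91)]
print(select_update_frames(results))   # frames 0 and 2 are used to draw new training samples
```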
10. A robust visual tracking system based on a convolutional neural network and a conformal predictor, applying the visual tracking method based on the conformal predictor model as described in claim 1.
CN201810270188.XA 2018-03-29 2018-03-29 A kind of visual tracking method based on consistency fallout predictor model Pending CN108460790A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810270188.XA CN108460790A (en) 2018-03-29 2018-03-29 A kind of visual tracking method based on consistency fallout predictor model

Publications (1)

Publication Number Publication Date
CN108460790A true CN108460790A (en) 2018-08-28

Family

ID=63237253

Country Status (1)

Country Link
CN (1) CN108460790A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20180828