CN110163892A - Learning rate progressive updating method and dynamic modeling system based on motion estimation interpolation - Google Patents

Learning rate progressive updating method and dynamic modeling system based on motion estimation interpolation

Info

Publication number
CN110163892A
Authority
CN
China
Prior art keywords
frame
module
learning rate
model
mdl
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910375212.0A
Other languages
Chinese (zh)
Other versions
CN110163892B (en)
Inventor
赖韵宇
孔熙雨
钟幼平
周其平
翁新林
诸建敏
何伟力
秦纪平
温舜茜
梅利奇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
State Grid Corp of China SGCC
State Grid Jiangxi Electric Power Co Ltd
Original Assignee
State Grid Corp of China SGCC
State Grid Jiangxi Electric Power Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by State Grid Corp of China SGCC, State Grid Jiangxi Electric Power Co Ltd filed Critical State Grid Corp of China SGCC
Priority to CN201910375212.0A priority Critical patent/CN110163892B/en
Publication of CN110163892A publication Critical patent/CN110163892A/en
Application granted granted Critical
Publication of CN110163892B publication Critical patent/CN110163892B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/20 Analysis of motion
    • G06T7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02T CLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00 Road transport of goods or passengers
    • Y02T10/10 Internal combustion engine [ICE] based vehicles
    • Y02T10/40 Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Television Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses a learning rate progressive updating method and dynamic modeling system based on motion estimation interpolation. The method uses the target motion information in video data to compute the target's motion vector field in real time, applies motion-estimation interpolation to the original video data to generate augmented data frames, and uses the augmented data frames to update the model smoothly, thereby obtaining an accurate prediction model. The prediction model is combined with a dynamic learning-rate adjustment module: when the target changes rapidly, the learning rate is increased so that the fast-changing target has more influence on the model, making the model more adaptive; when occlusion or other interference occurs, the learning rate is decreased so that the dynamic learning-rate adjustment reduces the interference of the background with the target model, making the model more robust. In these two ways the method mitigates the model-degradation problem in target dynamic modeling, so that the dynamic modeling of the target is more accurate.

Description

Learning rate progressive updating method and dynamic modeling system based on motion estimation interpolation
Technical field
The present invention relates to the field of target dynamic modeling, and more particularly to a learning rate progressive updating method and dynamic modeling system based on motion estimation interpolation.
Background technique
In recent years, in the field of target tracking, many algorithms have been closely tied to the dynamic modeling of the tracked target.
The main problems they face are:
1. Dynamic modeling methods with good real-time performance lack model accuracy. The tracking field has recently relied on discriminative correlation filters with features (Histograms of oriented gradients for human detection. Conference on Computer Vision and Pattern Recognition), starting from the MOSSE tracker (Visual object tracking using adaptive correlation filters. In: 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, pp. 2544-2550). The kernelized correlation filter (KCF) (High-speed tracking with kernelized correlation filters. IEEE Transactions on Pattern Analysis and Machine Intelligence 37(3), 583-596) further exploits the circulant structure to improve computational efficiency; however, the circulant structure also causes what is known as the boundary effect, which harms accuracy.
2. Dynamic modeling methods with good accuracy lack real-time performance:
The correlation filter with limited boundaries (CFLB) (Correlation filters with limited boundaries. In: 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 4630-4638), the spatially regularized correlation filter (SRDCF) (Learning spatially regularized correlation filters for visual tracking. In: 2015 IEEE International Conference on Computer Vision, pp. 4310-431) and the background-aware correlation filter (BACF) (Learning background-aware correlation filters for visual tracking. In: 2017 IEEE International Conference on Computer Vision (ICCV), pp. 1144-1152) all propose different ways of extracting more important information from the training samples in order to obtain a better tracking model. The CFLB method can learn a correlation filter with less boundary effect; however, the features it uses are pixel-based only, which has been shown to be insufficient to represent regularities in the image (Multi-channel correlation filters. In: 2013 IEEE International Conference on Computer Vision, pp. 3072-3079). The spatially regularized correlation filter (SRDCF) modifies the optimization objective of the tracker so that more information around the center of the image patch is preserved. Although this method performs well in terms of prediction, it has one main drawback: because the objective has no analytic solution, its optimization cost is very high. In addition, a group of hyperparameters used to form the regularization weights must be tuned carefully; failing to tune these hyperparameters may lead to poor tracking performance. The background-aware correlation filter (BACF) includes more background information from each frame and generates negative training samples, prompting the classifier to better separate background and foreground regions. Although BACF performs impressively, the modified training procedure has no analytic solution, and its computation speed does not meet real-time requirements.
Deep convolutional networks have recently made major progress in dynamic modeling, and more and more tracking algorithms developed on this basis have replaced conventional models based on hand-crafted features (Long-term correlation tracking. 2015 IEEE Conference on Computer Vision and Pattern Recognition, pp. 4847-4856). However, because of the heavy computation required during training, most deep-learning-based methods (Learning multi-domain convolutional neural networks for visual tracking. 2016 IEEE Conference on Computer Vision and Pattern Recognition, pp. 4293-4302) are limited by high computational cost, which makes them unsuitable for real-time target modeling. In addition, the need for a GPU increases the cost of the equipment, which limits the application of deep methods in real-time embedded systems.
A survey of related patents at home and abroad shows that there is currently no similar application that uses motion-estimation interpolation and progressive learning-rate updating to mitigate the target-model degradation problem.
Summary of the invention
In view of the scarcity of methods that generate intermediate data and dynamically update the learning rate, and the shortcomings of existing target dynamic modeling methods, the first object of the present invention is to provide a learning rate progressive updating method based on motion estimation interpolation that combines target motion information and performs dynamic modeling of the target from the video data, so as to reduce model degradation and improve the accuracy of dynamic modeling.
A second object of the present invention is to provide a dynamic modeling system for the learning rate progressive updating method.
The first object of the invention is achieved as follows:
A learning rate progressive updating method based on motion estimation interpolation, characterized in that the specific steps are as follows:
A. Take two adjacent frames of the original video data (frame t and frame t+1) as one group (f_t, f_{t+1}) and perform motion estimation: motion estimation (ME) is completed by matching the same entities in the two consecutive frames and computing the motion vector field (MVF). For the motion vector field u_t between frame f_t and frame f_{t+1}, let δt denote the time interval; the motion field generated at time t + λδt can then be expressed as u_λ(c + λ·u_t(c)) = u_t(c), where c is a pixel position. Each hole region of the interpolated motion field is filled with the nearest motion vector;
B. Take the two adjacent frames (f_t, f_{t+1}) of the original video data together with the motion vector field u_t computed in step A, and augment them into a group of samples {f_λ} carrying spatio-temporal information; the λ-interpolation result generated from frame f_t and frame f_{t+1} is expressed as f_λ(c) = (1-λ)·f_t(c - λ·u_λ(c)) + λ·f_{t+1}(c + (1-λ)·u_λ(c)) (a code sketch of this interpolation is given after these steps);
C. Use the augmented data as the input of the target-modeling task: from the given target response R_t, extract the existing model mdl_t of frame t;
D. With the target model, detect the next frame and compute the response R_t; the peak of the response, max(R), is taken as the predicted response R_{t+1};
E. Use the predicted response R_{t+1} to generate the frame-t prediction model Tmp_t;
F. Use the frame-t prediction model Tmp_t and the frame-t existing model mdl_t to assess the degeneration of the model, and dynamically select the learning rate α;
G. According to the frame-t prediction model Tmp_t and the selected learning rate α, update the frame-t existing model mdl_t and generate the frame-t target model mdl_{t+1}, expressed as:
mdl_{t+1} = α·Tmp_t + (1-α)·mdl_t;
H. Repeat steps C to G: according to the frame-t response R_t and the frame-t target model mdl_{t+1}, take the frame-t target model mdl_{t+1} as the existing model mdl_{t+1} of frame t+1, and generate the target modeling result p_t and the frame-(t+1) target model mdl_{t+2}.
From step A to step B: the original video data, grouped by adjacent frames (frame t, frame t+1) into (f_t, f_{t+1}), is used in two ways: first, in step A, the motion vector field u_t is computed from (f_t, f_{t+1}); second, in step B, the augmented data frames {f_λ} are computed from (f_t, f_{t+1}) and the u_t computed in A.
In step B: the motion vector field is used to augment the original data, so no additional annotation is needed; the generated data is also used as new sample input.
In step B: the definition of the augmented data frames {f_λ} includes the original video data as special cases: when λ = 0, f_λ = f_t, and when λ = 1, f_λ = f_{t+1}.
From step C to step G: the target model becomes the existing model as time advances; for example, the frame-t target model mdl_{t+1} serves as the existing model mdl_{t+1} at frame t+1.
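The following is a minimal sketch of the motion-compensated interpolation of steps A and B, not the patented implementation: it forward-projects u_t to the intermediate time, fills holes with the nearest motion vector, and blends the two warped frames. The array shapes, the nearest-rounding warp and the use of NumPy/SciPy are illustrative assumptions.

import numpy as np
from scipy import ndimage

def interpolate_frame(f_t, f_t1, u_t, lam):
    """Generate the augmented frame f_lambda from f_t, f_{t+1} and the MVF u_t.

    f_t, f_t1 : (H, W) grayscale frames
    u_t       : (H, W, 2) motion vectors (dy, dx) from frame t to frame t+1
    lam       : interpolation position lambda in [0, 1]
    """
    H, W = f_t.shape
    ys, xs = np.mgrid[0:H, 0:W]

    # 1) Forward-project u_t to time t + lam*dt:  u_lam(c + lam*u_t(c)) = u_t(c)
    u_lam = np.full((H, W, 2), np.nan)
    ty = np.clip(np.round(ys + lam * u_t[..., 0]).astype(int), 0, H - 1)
    tx = np.clip(np.round(xs + lam * u_t[..., 1]).astype(int), 0, W - 1)
    u_lam[ty, tx] = u_t[ys, xs]

    # 2) Fill each hole of the interpolated field with the nearest motion vector
    holes = np.isnan(u_lam[..., 0])
    if holes.any():
        _, idx = ndimage.distance_transform_edt(holes, return_indices=True)
        u_lam = u_lam[idx[0], idx[1]]

    # 3) Warp both frames toward t + lam*dt and blend:
    #    f_lam(c) = (1-lam)*f_t(c - lam*u_lam(c)) + lam*f_{t+1}(c + (1-lam)*u_lam(c))
    def sample(img, yy, xx):
        yy = np.clip(np.round(yy).astype(int), 0, H - 1)
        xx = np.clip(np.round(xx).astype(int), 0, W - 1)
        return img[yy, xx]

    back = sample(f_t, ys - lam * u_lam[..., 0], xs - lam * u_lam[..., 1])
    fwd = sample(f_t1, ys + (1 - lam) * u_lam[..., 0], xs + (1 - lam) * u_lam[..., 1])
    return (1 - lam) * back + lam * fwd

Setting lam = 0 or lam = 1 returns (up to rounding) the original frames f_t and f_{t+1}, which matches the special cases noted above.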
The second object of the present invention is achieved as follows:
A dynamic modeling system for the learning rate progressive updating method, characterized in that it comprises a motion-interpolation estimation module and a target dynamic modeling module with progressive learning rate updating, wherein:
the motion-interpolation estimation module comprises a motion estimation module and a data augmentation module;
the target dynamic modeling module with progressive learning rate updating comprises a target-model extraction module, a response computation module, a model-degradation evaluation module and a model update module;
the data generated by the motion-interpolation estimation module is fed into the target dynamic modeling module with progressive learning rate updating; that is:
A. From the input original video data, the motion-interpolation estimation module performs motion estimation on adjacent video frames and generates the motion vector field u_t, and then generates the augmented data frames {f_λ}; u_t and {f_λ} are fed into the target dynamic modeling module with progressive learning rate updating;
the motion estimation module: from two adjacent frames of the original video data (frame t, frame t+1), grouped as (f_t, f_{t+1}), the motion vector field u_t of all pixels is computed from (f_t, f_{t+1}); hole regions are filled using a nearest-neighbor algorithm, and the computed motion vector field u_t is fed into the data augmentation module;
the data augmentation module: from the two adjacent frames (f_t, f_{t+1}) of the original video data and the motion vector field u_t provided by the motion estimation module, data augmentation is performed on the video data (f_t, f_{t+1}); using frame f_t and frame f_{t+1}, the augmented data frames {f_λ} with the smallest interpolation error are generated, and the augmented data frames {f_λ} are passed to the target dynamic modeling module with progressive learning rate updating;
B. The target dynamic modeling module with progressive learning rate updating uses the augmented data frames {f_λ}, combined with the motion information of adjacent frames, to dynamically adjust the learning rate and generate the dynamic modeling result; wherein:
the target-model extraction module: when t = 0, the existing model mdl_0 is computed from the initially given response R_0; when t > 0, the updated existing model mdl_t of frame t is taken directly; the model is fed into the response computation module;
the response computation module: using the existing model mdl_t of frame t, correlation is computed over the region of frame f_{t+1} near the response R_t to obtain the response R_{t+1}, and from the response result R_{t+1} the prediction model tmp_t is computed;
the model-degradation evaluation module: the computed prediction model tmp_t is compared with the existing model mdl_t; if the model drifts away from the central region, the target is most likely a fast-moving target, and the learning rate α is increased according to the motion rate given by the motion vector field u_t; if the absolute value of the prediction model tmp_t drops markedly compared with the existing model mdl_t, the target has most likely deformed or become occluded, in which case the relevant region F of the augmented data frames {f_λ} is multiplied with the prediction model tmp_t in the frequency domain to obtain the response R_F; if the response drop computed from tmp_t and the evaluation region F is regional (localized), the cause is occlusion and the learning rate α is decreased according to the severity of the occlusion; if the drop is not regional, the cause is deformation and the learning rate α is increased according to the degree of deformation; the model-degradation evaluation module feeds α into the model update module (a code sketch of this adjustment follows these module descriptions);
the model update module: the model is updated with the learning rate α provided by the model-degradation evaluation module; the updated model is mdl_{t+1} = (1-α)·mdl_t + α·tmp_t.
After the original video data is input to the motion-interpolation estimation module, the motion-interpolation estimation module performs motion estimation on adjacent video frames and generates intermediate interpolated frames, which are fed together into the target dynamic modeling module with progressive learning rate updating; the target dynamic modeling module with progressive learning rate updating uses the input augmented video data, combined with the motion information of adjacent frames, to dynamically adjust the learning rate and generate the dynamic modeling result, that is:
the model update module: the model is updated with the learning rate α provided by the model-degradation evaluation module; the updated model is mdl_{t+1} = (1-α)·mdl_t + α·tmp_t.
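A minimal sketch of the degradation assessment and progressive learning-rate update performed by the model-degradation evaluation module and the model update module is given below; the base rate alpha0, the thresholds and the simple "regional drop" test are illustrative assumptions, not values taken from the patent.

import numpy as np

def select_learning_rate(tmp_t, mdl_t, u_t, f_lam, alpha0=0.02):
    """Dynamically choose the learning rate alpha for the current frame."""
    alpha = alpha0

    # Fast motion: a large mean motion magnitude suggests a fast-moving target,
    # so increase alpha to give the changing target more influence on the model.
    speed = np.linalg.norm(u_t, axis=-1).mean()
    if speed > 5.0:  # assumed threshold, pixels per frame
        alpha = min(1.0, alpha * (1.0 + 0.1 * speed))

    # A clear drop of the prediction model's magnitude relative to the existing
    # model suggests deformation or occlusion.
    drop = np.abs(mdl_t).sum() - np.abs(tmp_t).sum()
    if drop > 0.2 * np.abs(mdl_t).sum():  # assumed threshold
        # Correlate the relevant region of an augmented frame with tmp_t in the
        # frequency domain; a localized (regional) response peak is read here as
        # occlusion, a spread-out drop as deformation.
        resp = np.real(np.fft.ifft2(np.fft.fft2(f_lam) * np.conj(np.fft.fft2(tmp_t))))
        regional = resp.max() > 3.0 * (np.abs(resp).mean() + 1e-8)  # crude test (assumption)
        alpha = alpha * 0.5 if regional else min(1.0, alpha * 1.5)

    return alpha

def update_model(mdl_t, tmp_t, alpha):
    # Progressive update: mdl_{t+1} = alpha*tmp_t + (1 - alpha)*mdl_t
    return alpha * tmp_t + (1.0 - alpha) * mdl_t

Here f_lam and tmp_t are assumed to be real-valued arrays of the same size (the relevant region F cropped from an augmented frame and the prediction model, respectively).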
Compared with the prior art, the present invention has the following effects:
The present invention uses the target motion information in the video data to compute the target's motion vector field in real time, applies motion-estimation interpolation to the original video data to generate augmented data frames, and uses the augmented data frames to update the model smoothly, thereby obtaining an accurate prediction model. The prediction model is combined with a dynamic learning-rate adjustment module: when the target changes rapidly, the learning rate is increased so that the fast-changing target has more influence on the model, making the model more adaptive; when occlusion or other interference occurs, the learning rate is decreased, so that the dynamic learning-rate adjustment reduces the interference of the background with the target model, making the model more robust. In these two ways the invention mitigates the model degradation that occurs in target dynamic modeling, balances real-time performance and accuracy, and, at the cost of a slight reduction in modeling real-time performance, improves the accuracy of the dynamic modeling through the learning rate progressive updating method.
Brief description of the drawings
Fig. 1 is a structural schematic diagram of the relationship between the motion-interpolation estimation module and the target dynamic modeling module with progressive learning rate updating;
Fig. 2 is a schematic diagram of the motion-interpolation estimation module;
Fig. 3 is a structural schematic diagram of an example of the target dynamic modeling module with progressive learning rate updating.
Specific embodiment
The present invention is further described below with reference to embodiments and the accompanying drawings. The following embodiments will help those skilled in the art to further understand the present invention, but they do not limit the invention in any way. It should be pointed out that, for those of ordinary skill in the art, various modifications and improvements can be made without departing from the inventive concept, and all of these fall within the protection scope of the present invention.
A learning rate progressive updating method based on motion estimation interpolation, with the specific steps as follows:
Step 1: in order to generate the augmented data frames used to correct the learning rate, the proposed method uses motion-estimation interpolation to generate a group of samples carrying spatio-temporal information. Preferably, this is done by matching the same entities in two adjacent frames through block matching or feature-point matching and computing the motion vector field. The two consecutive frames are first divided into square blocks of fixed side length, and the MVF is computed by matching the blocks and recording the motion between the two frames. For a block with center c and size p×p, block matching is completed by minimizing the energy difference between the blocks from the two frames; the form of the energy difference depends on the cost function, preferably the mean absolute difference (MAD) or the mean squared error (MSE). By minimizing the cost, the estimated displacement δc of each block is obtained and recorded as the MVF u(c) (a block-matching sketch is given after these steps). To compute an interpolated frame, the interpolated motion field is computed first from the motion field u_t between frame f_t and frame f_{t+1}. Let δt denote the time interval; then the motion field generated at time t + λδt can be expressed as u_λ(c + λ·u_t(c)) = u_t(c), and each hole region of the interpolated motion field is filled with the nearest motion vector. The λ-interpolation result generated from frame f_t and frame f_{t+1} is expressed as f_λ(c) = (1-λ)·f_t(c - λ·u_λ(c)) + λ·f_{t+1}(c + (1-λ)·u_λ(c));
Step 2: use the augmented data frames as the input of the target-modeling task. From the given modeling target position p_0, extract the target model mdl_0; preferably, a correlation filter is chosen as the target model;
Step 3: detect the next frame with the target model, compute the response R_0, and use the response R_0 to generate the target model Tmp_1 at the predicted position;
Step 4: compare the computed prediction model tmp_t with the existing model mdl_t; if the model drifts away from the central region, the target is most likely a fast-moving target, and the learning rate α is increased according to the motion rate; if the absolute value of the prediction model drops markedly compared with the original model, the target has most likely deformed or become occluded; preferably, the relevant region F of the video is multiplied with the prediction model in the frequency domain to obtain the response R_F; if the response computed from the prediction model and the evaluation region drops in a regional (localized) way, the cause is occlusion, and the learning rate α is decreased according to the severity of the occlusion; if the drop is not regional, the cause is deformation, and the learning rate α is increased according to the degree of deformation;
Step 5: the prediction model tmp_t provides a new training sample for the new target model;
Step 6: according to the updated learning rate α, update the original target model with the target model at the predicted position, generating the target model mdl_{t+1}.
For frame t, steps 2 to 6 are repeated; from the frame-t response R_t and the frame-t existing model mdl_t, the response R_t and the target model mdl_{t+1} are generated frame by frame.
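A minimal block-matching sketch for the motion estimation of step 1 is shown below (exhaustive search with a mean-absolute-difference cost); the block size, search radius and the MAD cost are illustrative choices, and a real implementation would typically use a faster search pattern.

import numpy as np

def block_matching_mvf(f_t, f_t1, block=16, radius=8):
    """Estimate one motion vector (dy, dx) per block of f_t by matching into f_{t+1}."""
    f_t = f_t.astype(np.float32)
    f_t1 = f_t1.astype(np.float32)
    H, W = f_t.shape
    mvf = np.zeros((H // block, W // block, 2))
    for by in range(H // block):
        for bx in range(W // block):
            y0, x0 = by * block, bx * block
            ref = f_t[y0:y0 + block, x0:x0 + block]
            best, best_cost = (0, 0), np.inf
            # Find the displacement that minimizes the energy difference (MAD)
            # between the block in frame t and a candidate block in frame t+1.
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    y1, x1 = y0 + dy, x0 + dx
                    if y1 < 0 or x1 < 0 or y1 + block > H or x1 + block > W:
                        continue
                    cand = f_t1[y1:y1 + block, x1:x1 + block]
                    cost = np.abs(ref - cand).mean()  # MAD; MSE would also work
                    if cost < best_cost:
                        best_cost, best = cost, (dy, dx)
            mvf[by, bx] = best
    return mvf

The per-block vectors returned here correspond to the MVF u(c) recorded in step 1; each would then be assigned to the pixels of its block before the interpolation sketch given earlier is applied.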
The functions realized by the device of this embodiment are as follows:
(1) Using motion-information estimation, intermediate video data containing spatio-temporally consistent information is generated, to be used for the progressive updating of the learning rate.
(2) Combining the dynamic learning-rate adjustment with the augmented data, the model degradation is assessed and the learning rate is updated progressively, realizing the dynamic modeling of the target.
In summary, the target motion information in the video data is used to compute the target's motion vector field in real time; the original video data is interpolated to generate intermediate-state frames, which are used to generate the learning-rate adjustment strategy; the learning rate is then dynamically adjusted in real time: when the target changes rapidly, the learning rate is increased to make the model more adaptive, and when the target is disturbed, for example by occlusion, the learning rate is decreased to make the model more robust, thereby mitigating the model-degradation problem in target dynamic modeling.
Specific embodiments of the present invention have been described above. It should be understood that the present invention is not limited to the above specific embodiments; those skilled in the art can make various variations or modifications within the scope of the claims, and this does not affect the substantive content of the present invention.

Claims (6)

1. A learning rate progressive updating method based on motion estimation interpolation, characterized in that the specific steps are as follows:
A. Take two adjacent frames of the original video data (frame t and frame t+1) as one group (f_t, f_{t+1}) and perform motion estimation: motion estimation (ME) is completed by matching the same entities in the two consecutive frames and computing the motion vector field (MVF); for the motion vector field u_t between frame f_t and frame f_{t+1}, let δt denote the time interval; the motion field generated at time t + λδt can then be expressed as u_λ(c + λ·u_t(c)) = u_t(c), where c is a pixel position; each hole region of the interpolated motion field is filled with the nearest motion vector;
B. Take the two adjacent frames (f_t, f_{t+1}) of the original video data together with the motion vector field u_t computed in step A, and augment them into a group of samples {f_λ} carrying spatio-temporal information; the λ-interpolation result generated from frame f_t and frame f_{t+1} is expressed as f_λ(c) = (1-λ)·f_t(c - λ·u_λ(c)) + λ·f_{t+1}(c + (1-λ)·u_λ(c));
C. Use the augmented data as the input of the target-modeling task: from the given target response R_t, extract the existing model mdl_t of frame t;
D. With the target model, detect the next frame and compute the response R_t; the peak of the response, max(R), is taken as the predicted response R_{t+1};
E. Use the predicted response R_{t+1} to generate the frame-t prediction model Tmp_t;
F. Use the frame-t prediction model Tmp_t and the frame-t existing model mdl_t to assess the degeneration of the model, and dynamically select the learning rate α;
G. According to the frame-t prediction model Tmp_t and the selected learning rate α, update the frame-t existing model mdl_t and generate the frame-t target model mdl_{t+1}, expressed as:
mdl_{t+1} = α·Tmp_t + (1-α)·mdl_t;
H. Repeat steps C to G: according to the frame-t response R_t and the frame-t target model mdl_{t+1}, take the frame-t target model mdl_{t+1} as the existing model mdl_{t+1} of frame t+1, and generate the target modeling result p_t and the frame-(t+1) target model mdl_{t+2}.
2. The learning rate progressive updating method based on motion estimation interpolation according to claim 1, characterized in that, from step A to step B: the original video data, grouped by adjacent frames (frame t, frame t+1) into (f_t, f_{t+1}), is used in two ways: first, in step A, the motion vector field u_t is computed from (f_t, f_{t+1}); second, in step B, the augmented data frames {f_λ} are computed from (f_t, f_{t+1}) and the u_t computed in A.
3. The learning rate progressive updating method based on motion estimation interpolation according to claim 1, characterized in that, in step B: the motion vector field is used to augment the original data, so no additional annotation is needed; the generated data is also used as new sample input.
4. The learning rate progressive updating method based on motion estimation interpolation according to claim 1, characterized in that, in step B: the definition of the augmented data frames {f_λ} includes the original video data as special cases: when λ = 0, f_λ = f_t, and when λ = 1, f_λ = f_{t+1}.
5. The learning rate progressive updating method based on motion estimation interpolation according to claim 1, characterized in that, from step C to step G: the target model becomes the existing model as time advances; the frame-t target model mdl_{t+1} serves as the existing model mdl_{t+1} at frame t+1.
6. A dynamic modeling system for the learning rate progressive updating method, characterized in that it comprises a motion-interpolation estimation module and a target dynamic modeling module with progressive learning rate updating, wherein:
the motion-interpolation estimation module comprises a motion estimation module and a data augmentation module;
the target dynamic modeling module with progressive learning rate updating comprises a target-model extraction module, a response computation module, a model-degradation evaluation module and a model update module;
the data generated by the motion-interpolation estimation module is fed into the target dynamic modeling module with progressive learning rate updating; that is:
A. From the input original video data, the motion-interpolation estimation module performs motion estimation on adjacent video frames and generates the motion vector field u_t, and then generates the augmented data frames {f_λ}; u_t and {f_λ} are fed into the target dynamic modeling module with progressive learning rate updating;
the motion estimation module: from two adjacent frames of the original video data (frame t, frame t+1), grouped as (f_t, f_{t+1}), the motion vector field u_t of all pixels is computed from (f_t, f_{t+1}); hole regions are filled using a nearest-neighbor algorithm, and the computed motion vector field u_t is fed into the data augmentation module;
the data augmentation module: from the two adjacent frames (f_t, f_{t+1}) of the original video data and the motion vector field u_t provided by the motion estimation module, data augmentation is performed on the video data (f_t, f_{t+1}); using frame f_t and frame f_{t+1}, the augmented data frames {f_λ} with the smallest interpolation error are generated, and the augmented data frames {f_λ} are passed to the target dynamic modeling module with progressive learning rate updating;
B. The target dynamic modeling module with progressive learning rate updating uses the augmented data frames {f_λ}, combined with the motion information of adjacent frames, to dynamically adjust the learning rate and generate the dynamic modeling result; wherein:
the target-model extraction module: when t = 0, the existing model mdl_0 is computed from the initially given response R_0; when t > 0, the updated existing model mdl_t of frame t is taken directly; the model is fed into the response computation module;
the response computation module: using the existing model mdl_t of frame t, correlation is computed over the region of frame f_{t+1} near the response R_t to obtain the response R_{t+1}, and from the response result R_{t+1} the prediction model tmp_t is computed;
the model-degradation evaluation module: the computed prediction model tmp_t is compared with the existing model mdl_t; if the model drifts away from the central region, the target is most likely a fast-moving target, and the learning rate α is increased according to the motion rate given by the motion vector field u_t; if the absolute value of the prediction model tmp_t drops markedly compared with the existing model mdl_t, the target has most likely deformed or become occluded, in which case the relevant region F of the augmented data frames {f_λ} is multiplied with the prediction model tmp_t in the frequency domain to obtain the response R_F; if the response drop computed from tmp_t and the evaluation region F is regional (localized), the cause is occlusion and the learning rate α is decreased according to the severity of the occlusion; if the drop is not regional, the cause is deformation and the learning rate α is increased according to the degree of deformation; the model-degradation evaluation module feeds α into the model update module;
the model update module: the model is updated with the learning rate α provided by the model-degradation evaluation module; the updated model is mdl_{t+1} = (1-α)·mdl_t + α·tmp_t;
after the original video data is input to the motion-interpolation estimation module, the motion-interpolation estimation module performs motion estimation on adjacent video frames and generates intermediate interpolated frames, which are fed together into the target dynamic modeling module with progressive learning rate updating; the target dynamic modeling module with progressive learning rate updating uses the input augmented video data, combined with the motion information of adjacent frames, to dynamically adjust the learning rate and generate the dynamic modeling result, that is:
the model update module: the model is updated with the learning rate α provided by the model-degradation evaluation module; the updated model is mdl_{t+1} = (1-α)·mdl_t + α·tmp_t.
CN201910375212.0A 2019-05-07 2019-05-07 Learning rate progressive updating method based on motion estimation interpolation and dynamic modeling system Active CN110163892B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910375212.0A CN110163892B (en) 2019-05-07 2019-05-07 Learning rate progressive updating method based on motion estimation interpolation and dynamic modeling system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910375212.0A CN110163892B (en) 2019-05-07 2019-05-07 Learning rate progressive updating method based on motion estimation interpolation and dynamic modeling system

Publications (2)

Publication Number Publication Date
CN110163892A (en) 2019-08-23
CN110163892B (en) 2023-06-20

Family

ID=67633542

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910375212.0A Active CN110163892B (en) 2019-05-07 2019-05-07 Learning rate progressive updating method based on motion estimation interpolation and dynamic modeling system

Country Status (1)

Country Link
CN (1) CN110163892B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11889227B2 (en) 2020-10-05 2024-01-30 Samsung Electronics Co., Ltd. Occlusion processing for frame rate conversion using deep learning

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060285590A1 (en) * 2005-06-21 2006-12-21 Docomo Communications Laboratories Usa, Inc. Nonlinear, prediction filter for hybrid video compression
CN102868879A (en) * 2011-07-05 2013-01-09 北京大学 Method and system for converting video frame rate
CN103402098A (en) * 2013-08-19 2013-11-20 武汉大学 Video frame interpolation method based on image interpolation
CN103888767A (en) * 2014-03-31 2014-06-25 山东大学 Frame rate improving method with UMH block matching motion estimation and optical flow field motion estimation combined
EP2805306A1 (en) * 2012-01-19 2014-11-26 Thomson Licensing Method and device for generating a motion field for a video sequence
CN106056540A (en) * 2016-07-08 2016-10-26 北京邮电大学 Video time-space super-resolution reconstruction method based on robust optical flow and Zernike invariant moment
CN107396111A (en) * 2017-07-13 2017-11-24 河北中科恒运软件科技股份有限公司 The compensation method of automatic video frequency interleave and system in mediation reality
CN109151474A (en) * 2018-08-23 2019-01-04 复旦大学 A method of generating new video frame

Patent Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060285590A1 (en) * 2005-06-21 2006-12-21 Docomo Communications Laboratories Usa, Inc. Nonlinear, prediction filter for hybrid video compression
CN102868879A (en) * 2011-07-05 2013-01-09 北京大学 Method and system for converting video frame rate
EP2805306A1 (en) * 2012-01-19 2014-11-26 Thomson Licensing Method and device for generating a motion field for a video sequence
CN103402098A (en) * 2013-08-19 2013-11-20 武汉大学 Video frame interpolation method based on image interpolation
CN103888767A (en) * 2014-03-31 2014-06-25 山东大学 Frame rate improving method with UMH block matching motion estimation and optical flow field motion estimation combined
CN106056540A (en) * 2016-07-08 2016-10-26 北京邮电大学 Video time-space super-resolution reconstruction method based on robust optical flow and Zernike invariant moment
CN107396111A (en) * 2017-07-13 2017-11-24 河北中科恒运软件科技股份有限公司 The compensation method of automatic video frequency interleave and system in mediation reality
CN109151474A (en) * 2018-08-23 2019-01-04 复旦大学 A method of generating new video frame

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11889227B2 (en) 2020-10-05 2024-01-30 Samsung Electronics Co., Ltd. Occlusion processing for frame rate conversion using deep learning

Also Published As

Publication number Publication date
CN110163892B (en) 2023-06-20

Similar Documents

Publication Publication Date Title
CN110191299B (en) Multi-frame interpolation method based on convolutional neural network
CN102202164B (en) Motion-estimation-based road video stabilization method
CN110163887B (en) Video target tracking method based on combination of motion interpolation estimation and foreground segmentation
KR100985805B1 (en) Apparatus and method for image stabilization using adaptive Kalman filter
CN115210716A (en) System and method for multi-frame video frame interpolation
CN110796010A (en) Video image stabilization method combining optical flow method and Kalman filtering
CN113011329B (en) Multi-scale feature pyramid network-based and dense crowd counting method
CN104820997B (en) A kind of method for tracking target based on piecemeal sparse expression Yu HSV Feature Fusion
CN112801043A (en) Real-time video face key point detection method based on deep learning
CN102741884A (en) Mobile body detection device and mobile body detection method
CN103139568A (en) Video image stabilizing method based on sparseness and fidelity restraining
CN110210304A (en) Method and system for target detection and tracking
CN111161309B (en) Searching and positioning method for vehicle-mounted video dynamic target
CN102917220B (en) Dynamic background video object extraction based on hexagon search and three-frame background alignment
CN108615241A (en) A kind of quick estimation method of human posture based on light stream
Wang et al. Video stabilization: A comprehensive survey
Kong et al. Weakly supervised crowd-wise attention for robust crowd counting
CN110163892A (en) Learning rate Renewal step by step method and dynamic modeling system based on estimation interpolation
CN112270691A (en) Monocular video structure and motion prediction method based on dynamic filter network
Cai et al. Leveraging intra-domain knowledge to strengthen cross-domain crowd counting
CN102970527B (en) Video object extraction method based on hexagon search under five-frame-background aligned dynamic background
CN102917222B (en) Mobile background video object extraction method based on self-adaptive hexagonal search and five-frame background alignment
CN112819849B (en) Mark point-free visual motion capture method based on three eyes
CN110827324B (en) Video target tracking method
Li et al. Grid-based retargeting with transformation consistency smoothing

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant