CN106980843B - Method and device for target tracking - Google Patents


Info

Publication number: CN106980843B
Application number: CN201710217051.3A
Authority: CN (China)
Prior art keywords: particle, target, ScSPM, frame image, feature vector
Legal status: Active (granted)
Other languages: Chinese (zh)
Other versions: CN106980843A
Inventors: 丁萌, 王洁, 魏丽, 张天慈, 周子易
Current assignee: Nanjing University of Aeronautics and Astronautics
Original assignee: Nanjing University of Aeronautics and Astronautics
Application filed by Nanjing University of Aeronautics and Astronautics
Priority to CN201710217051.3A
Publication of application CN106980843A; application granted; publication of grant CN106980843B


Classifications

(all under G PHYSICS / G06 COMPUTING; CALCULATING OR COUNTING / G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING)

    • G06V20/41: Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/42: Higher-level, semantic clustering, classification or understanding of sport video content
    • G06V10/462: Salient features, e.g. scale invariant feature transforms [SIFT]
    • G06V10/50: Extraction of image or video features by performing operations within image blocks; by using histograms, e.g. histogram of oriented gradients [HoG]; by summing image-intensity values; projection analysis
    • G06V10/513: Sparse representations

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The present invention provides a target tracking method and device, belonging to the field of image processing. The method comprises: first obtaining, in the start frame image, the target image corresponding to the tracked target and the location point corresponding to the target image, taking the target image as the matching template, and obtaining the ScSPM feature vector of the matching template according to the sparse coding spatial pyramid matching algorithm; then, according to the location point corresponding to the target image in frame t-1 and the particle filter algorithm incorporating motion estimation, obtaining multiple particles in frame t, determining the sampling particle region corresponding to each particle according to the particle's position in frame t, and computing the ScSPM feature vector of each sampling particle region; and finally obtaining the optimal target location at the moment of frame t according to the ScSPM feature vectors of the sampling particle regions and of the matching template, thereby achieving sustained, robust tracking of a moving target under complex conditions.

Description

Method and device for target tracking
Technical field
The present invention relates to the field of image processing, and in particular to a target tracking method and device.
Background art
Target tracking has been studied extensively in recent years in fields such as intelligent surveillance, visual guidance and human-computer interaction. At present, work on target tracking focuses mainly on two aspects to improve the robustness of tracking algorithms: improving the target model representation, and improving the tracking and detection methods.
The essence of target model representation is feature extraction from the target image: the target is transformed from image space into feature space, which reduces the dimensionality of the target representation while improving its discriminability. Commonly used feature extraction methods mainly include color features, texture features, edge features, optical-flow features, and so on. However, all of these features lack robustness when used for target tracking.
Summary of the invention
In view of this, embodiments of the present invention aim to provide a target tracking method and device that achieve sustained, robust tracking of a moving target under complex conditions.
In a first aspect, an embodiment of the invention provides a target tracking method, the method comprising: obtaining, in the start frame image, the target image corresponding to the tracked target and the location point corresponding to the target image, with the target image in the start frame image as the matching template; obtaining the ScSPM feature vector of the matching template according to the sparse coding spatial pyramid matching algorithm; obtaining multiple particles in frame t according to the location point corresponding to the target image in frame t-1 and the particle filter algorithm incorporating motion estimation, where t is a positive integer greater than 1; determining the sampling particle region corresponding to each particle according to the particle's position in frame t and computing the ScSPM feature vector of each sampling particle region; and obtaining the optimal target location at the moment of frame t according to the ScSPM feature vectors of the sampling particle regions and of the matching template.
In a second aspect, an embodiment of the invention provides a target tracking device, the device comprising: an image obtaining module for obtaining, in the start frame image, the target image corresponding to the tracked target and the location point corresponding to the target image, with the target image in the start frame image as the matching template; a first feature vector obtaining module for obtaining the ScSPM feature vector of the matching template according to the sparse coding spatial pyramid matching algorithm; a particle obtaining module for obtaining multiple particles in frame t according to the location point corresponding to the target image in frame t-1 and the particle filter algorithm incorporating motion estimation, where t is a positive integer greater than 1; a second feature vector obtaining module for determining the sampling particle region corresponding to each particle according to the particle's position in frame t and computing the ScSPM feature vector of each sampling particle region; and an optimal location obtaining module for obtaining the optimal target location at the moment of frame t according to the ScSPM feature vectors of the sampling particle regions and of the matching template.
The target tracking method and device proposed by the embodiments of the invention first obtain, in the start frame image, the target image corresponding to the tracked target and the location point corresponding to the target image, take the target image in the start frame image as the matching template, and obtain the ScSPM feature vector of the matching template according to the sparse coding spatial pyramid matching algorithm; then, according to the location point corresponding to the target image in frame t-1 and the particle filter algorithm incorporating motion estimation, multiple particles in frame t are obtained, where t is a positive integer greater than 1, the sampling particle region corresponding to each particle is determined according to the particle's position in frame t, and the ScSPM feature vector of each sampling particle region is computed; finally, the optimal target location at the moment of frame t is obtained according to the ScSPM feature vectors of the sampling particle regions and of the matching template, thereby achieving sustained, robust tracking of a moving target under complex conditions.
To make the above objects, features and advantages of the invention clearer and easier to understand, preferred embodiments are described in detail below in conjunction with the accompanying drawings.
Brief description of the drawings
To illustrate the technical solutions of the embodiments of the invention more clearly, the drawings needed for the embodiments are briefly introduced below. It should be understood that the following drawings show only some embodiments of the invention and should therefore not be regarded as limiting its scope; those of ordinary skill in the art can obtain other related drawings from these drawings without creative effort.
Fig. 1 shows the structural block diagram of the electronic device provided by an embodiment of the invention;
Fig. 2 shows the flow chart of the target tracking method provided by the first embodiment of the invention;
Fig. 3 shows the start frame image and the target image of the corresponding tracked target provided by an embodiment of the invention;
Fig. 4 shows the flow chart of step S120 of the target tracking method of the first embodiment of the invention;
Fig. 5 shows a schematic diagram of the target image division scheme provided by an embodiment of the invention;
Fig. 6 shows a schematic diagram of the target image division result provided by an embodiment of the invention;
Fig. 7 shows a schematic diagram of the formation of the SIFT feature descriptor vector provided by an embodiment of the invention;
Fig. 8 shows the K-SVD dictionary learning flow chart provided by an embodiment of the invention;
Fig. 9 shows the spatial pyramid matching schematic diagram provided by an embodiment of the invention;
Fig. 10 shows the flow chart of step S150 of the target tracking method of the first embodiment of the invention;
Fig. 11 shows the structural block diagram of the target tracking device provided by the second embodiment of the invention.
Specific embodiment
The technical solutions in the embodiments of the invention are described below clearly and completely in conjunction with the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention. The components of the embodiments of the invention, as generally described and illustrated in the drawings, can be arranged and designed in many different configurations. The following detailed description of the embodiments of the invention provided in the drawings is therefore not intended to limit the scope of the claimed invention, but merely represents selected embodiments of the invention. All other embodiments obtained by those skilled in the art on the basis of the embodiments of the invention without creative effort fall within the protection scope of the invention.
It should also be noted that similar reference numerals and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined or explained in subsequent drawings. Meanwhile, in the description of the invention, the terms "first", "second" and the like are used only to distinguish the descriptions and should not be understood as indicating or implying relative importance.
As shown in Fig. 1, which is a structural block diagram of the electronic device 100, the electronic device 100 comprises: the target tracking device 200, a memory 110, a storage controller 120, a processor 130, a peripheral interface 140, an input-output unit 150, an audio unit 160 and a display unit 170.
The memory 110, storage controller 120, processor 130, peripheral interface 140, input-output unit 150, audio unit 160 and display unit 170 are electrically connected to one another, directly or indirectly, to enable data transmission or interaction; for example, these elements can be electrically connected through one or more communication buses or signal lines. The target tracking device includes at least one software function module that can be stored in the memory in the form of software or firmware or solidified in the operating system (OS) of the client device. The processor 130 executes the executable modules stored in the memory 110, such as the software function modules or computer programs included in the target tracking device.
The memory 110 may be, but is not limited to, random access memory (RAM), read-only memory (ROM), programmable read-only memory (PROM), erasable programmable read-only memory (EPROM), or electrically erasable programmable read-only memory (EEPROM). The memory 110 stores programs, and the processor 130 executes a program after receiving an execution instruction; the method performed by the server defined by the flow disclosed in any embodiment of the invention can be applied to the processor 130 or implemented by the processor 130.
The processor 130 may be an integrated circuit chip with signal processing capability. It may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and it can implement or execute the methods, steps and logic diagrams disclosed in the embodiments of the invention. The general-purpose processor may be a microprocessor or any conventional processor.
The peripheral interface 140 couples various input/output devices to the processor 130 and the memory 110. In some embodiments, the peripheral interface 140, processor 130 and storage controller 120 can be implemented in a single chip; in other examples they can each be implemented by an independent chip.
The input-output unit 150 provides the user with a way to input data, enabling interaction between the user and the electronic device 100; it may be, but is not limited to, a mouse, a keyboard, and the like.
The audio unit 160 provides the user with an audio interface and may include one or more microphones, one or more loudspeakers, and audio circuitry.
The display unit 170 provides an interactive interface (such as a user interface) between the electronic device 100 and the user, or displays image data to the user. In this embodiment, the display unit 170 may be a liquid crystal display or a touch display; a touch display may be a capacitive or resistive touch screen supporting single-point and multi-point touch operations, meaning that the touch display can sense touch operations generated simultaneously at one or more positions on it and pass the sensed touch operations to the processor 130 for calculation and processing.
It should be understood that the structure shown in Fig. 1 is only illustrative; the electronic device 100 may include more or fewer components than shown in Fig. 1, or have a configuration different from that shown in Fig. 1. Each component shown in Fig. 1 can be implemented in hardware, software or a combination thereof. In this embodiment, the electronic device 100 can be a computer, a server, or other equipment with image processing capability.
The embodiments of the invention perform target tracking on a video, or on another sequence of frames ordered in time, obtained by shooting the tracked target with an image acquisition device such as a camera or video camera. The target tracking process is described in detail through the embodiments below.
First embodiment
Referring to Fig. 2, which is a flow chart of the target tracking method provided by the first embodiment of the invention, the flow shown in Fig. 2 is described in detail below. The method comprises:
Step S110: obtain, in the start frame image, the target image corresponding to the tracked target and the location point corresponding to the target image, and take the target image in the start frame image as the matching template.
In this embodiment, the start frame image can be obtained from an image acquisition device such as a camera or video camera, or it can be start frame image data input by the user or by other equipment. It should be understood that the start frame image is the first frame, in time order, of the frame sequence used for target tracking.
Further, the target image corresponding to the tracked target is determined in the start frame image; the target image is the imaging region of the tracked target in the image, and it is taken as the matching template, as shown in Fig. 3, where the region P in the acquired image is the determined target image. The target image in the start frame image may be determined from a region manually calibrated by the user, or detected and determined by an object detector; the concrete manner is not limited in this embodiment.
The location point corresponding to the target image is then obtained. As a preferred manner, the center position (x0, y0) of the target image is taken as its location point, computed from the top-left corner coordinates (x1, y1) and bottom-right corner coordinates (x2, y2) of the target image as:

x0 = (x1 + x2) / 2, y0 = (y1 + y2) / 2
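As a minimal sketch of this preferred location-point rule (the function and variable names are ours, not from the patent):

```python
def target_center(x1, y1, x2, y2):
    """Location point of the target image: the center of the box with
    top-left corner (x1, y1) and bottom-right corner (x2, y2)."""
    return ((x1 + x2) / 2, (y1 + y2) / 2)

# a 40 x 20 target box with top-left corner at (10, 10)
center = target_center(10, 10, 50, 30)
print(center)  # (30.0, 20.0)
```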
Step S120: obtain the ScSPM feature vector of the matching template according to the sparse coding spatial pyramid matching algorithm.
Research has shown that the response of the primary visual cells of the human visual system is sparse; sparse coding can therefore characterize image features better while achieving a smaller reconstruction error. Accordingly, in this embodiment the ScSPM feature vector of the matching template is obtained according to the sparse coding spatial pyramid matching algorithm.
Specifically, referring to Fig. 4, which shows the flow chart of step S120 of the target tracking method of the first embodiment of the invention, the flow shown in Fig. 4 is explained in detail below. The method comprises:
Step S121: divide the target image into multiple sub-images.
To cope with occlusion during target motion, this embodiment divides the target image of the corresponding tracked target into multiple sub-images so as to extract local features of the target image. Specifically, as one implementation of this embodiment, the target image is divided into a rows and b columns, B = a*b sub-images in total, each of size block x block pixels, with a horizontal and vertical stride of step pixels. Preferably step < block, so that adjacent sub-images have a certain overlap and richer information is obtained. The division scheme is shown in Fig. 5; for the target image shown in Fig. 3, the sub-images obtained with this method are shown in Fig. 6.
As a preferred implementation of this embodiment, step is 6 pixels and block is 16 pixels. It can be understood that the values of step and block do not limit this scheme; it is only required that step < block.
In this embodiment, the values of a, b, step and block are preferably positive integers, and their concrete values are not limited as long as the above requirement is met.
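The overlapping division above can be sketched as follows (a hypothetical helper that only computes block coordinates; with the preferred step = 6 and block = 16, a 64 x 64 target image gives a = b = 9, i.e. B = 81 sub-images):

```python
def block_origins(width, height, block=16, step=6):
    """Top-left corners of the block x block sub-images tiled over a
    width x height target image with stride step; step < block makes
    adjacent sub-images overlap."""
    assert step < block, "overlap requires step < block"
    xs = range(0, width - block + 1, step)
    ys = range(0, height - block + 1, step)
    return [(x, y) for y in ys for x in xs]

origins = block_origins(64, 64)
print(len(origins))  # 9 rows * 9 columns = 81 overlapping sub-images
```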
Step S122: extract a SIFT feature descriptor vector from each sub-image.
A SIFT feature descriptor vector S_ij is extracted from each sub-image, where S_ij is the SIFT feature descriptor vector of the sub-image in row i, column j of the target image. SIFT features counter the degradation or even loss of tracking caused by rotation, deformation and scaling of the target.
Specifically, the Gaussian scale space L(x, y, σ) of a sub-image is first built by convolving the Gaussian function G(x, y, σ) with the sub-image I(x, y):

L(x, y, σ) = G(x, y, σ) * I(x, y)

where the Gaussian function G is defined as:

G(x, y, σ) = (1 / (2πσ^2)) exp(-(x^2 + y^2) / (2σ^2))

where (x, y) are the spatial coordinates of the image and σ is the scale-space factor.
The difference-of-Gaussian scale space D(x, y, σ) is then constructed with the difference-of-Gaussian (DoG) function:

D(x, y, σ) = (G(x, y, kσ) - G(x, y, σ)) * I(x, y) = L(x, y, kσ) - L(x, y, σ)

where k is a constant, the multiplicative factor between adjacent values of the scale-space factor σ.
The extreme points of the scale space are then detected. Specifically, each sample point is compared with all of its neighboring points to see whether it is larger or smaller than the neighboring points in its image domain and its scale domain; if the sample point is a maximum or minimum in both the spatial domain and the scale domain, it is taken as a candidate feature point at that scale.
As one approach, to improve the stability of the feature points and eliminate unstable edge response points, the scale-space DoG function is curve-fitted. The fitting function is the second-order Taylor expansion of the DoG function:

D(X) = D + (∂D/∂X)^T X + (1/2) X^T (∂^2 D/∂X^2) X

where X = (x, y, σ)^T. Differentiating the above with respect to X and setting the derivative equal to zero gives the offset of the extreme point:

X̂ = -(∂^2 D/∂X^2)^(-1) (∂D/∂X)

and the value of the function at the corresponding extreme point:

D(X̂) = D + (1/2) (∂D/∂X)^T X̂

where X̂ represents the offset relative to the interpolation center. When the offset in any dimension is greater than 0.5, the interpolation center has shifted to a neighboring point, so the position of the current feature point must be changed, and the interpolation is repeated at the new position until convergence, thereby selecting the more stable feature points from all the extreme points.
Further, the gradient direction distribution of the pixels in the neighborhood of each feature point is used to assign a direction parameter to each feature point. Specifically, the gradient magnitude m(x, y) and direction θ(x, y) at a feature point are computed as:

m(x, y) = sqrt((L(x+1, y) - L(x-1, y))^2 + (L(x, y+1) - L(x, y-1))^2)
θ(x, y) = arctan((L(x, y+1) - L(x, y-1)) / (L(x+1, y) - L(x-1, y)))

Then, in the Gaussian image corresponding to each feature point, the surrounding neighborhood is divided into d*d sub-regions, where the surrounding neighborhood is the region of radius r centered on the feature point; the size of the neighborhood is not limited in this embodiment. Each sub-region serves as a seed point, and an 8-bin gradient histogram is computed for each seed-point region: the 0-360 degree direction range is divided into 8 bins of 45 degrees each. The highest peak of the histogram gives the principal direction of the feature point; meanwhile, to give the descriptor a certain robustness, directions supported by few pixels are weeded out, and only directions whose peak exceeds 80% of the principal-direction peak are kept as auxiliary directions of the feature point.
All histogram vectors are concatenated into a row vector and normalized, forming the SIFT descriptor S_ij ∈ R^m of the corresponding sub-image, where m = d*d*8; in one implementation of this embodiment, d is 4. Specifically, as shown in Fig. 7, all histogram vectors are spliced into the vector H = (h_1, h_2, ..., h_m); to remove the influence of illumination changes they are normalized, giving the feature vector S = (s_1, s_2, ..., s_m), where

s_i = h_i / sqrt(h_1^2 + h_2^2 + ... + h_m^2)
Step S123: obtain the sparse vector of each sub-image according to its SIFT feature descriptor vector and the sparse coding algorithm.
In this embodiment, each SIFT feature descriptor vector is sparsely coded to obtain the sparse vector P_ij of each sub-image, where P_ij is the sparse vector of the sub-image in row i, column j. Specifically, an overcomplete dictionary D ∈ R^(m×n) (n >> m) is first obtained with the K-SVD algorithm; the SIFT feature vector of each sub-image is then sparsely coded as S_ij ≈ D P_ij, and the solution P_ij ∈ R^n is the sparse vector of the sub-image.
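How one sparse code P_ij could be computed against a given dictionary can be sketched with orthogonal matching pursuit (a NumPy illustration under our own assumptions: the dictionary here is random rather than K-SVD-trained, and the signal is built to be exactly one atom so the recovery is easy to check):

```python
import numpy as np

def omp(D, s, T):
    """Orthogonal matching pursuit: greedily approximate s ~ D @ p
    with at most T nonzero coefficients; columns of D are unit-norm."""
    residual = s.astype(float)
    support, p = [], np.zeros(D.shape[1])
    for _ in range(T):
        k = int(np.argmax(np.abs(D.T @ residual)))   # best-matching atom
        if k not in support:
            support.append(k)
        coef, *_ = np.linalg.lstsq(D[:, support], s, rcond=None)
        p[:] = 0.0
        p[support] = coef                            # re-fit on the support
        residual = s - D @ p
    return p

rng = np.random.default_rng(0)
D = rng.standard_normal((128, 1024))     # m = 128, n = 1024 atoms
D /= np.linalg.norm(D, axis=0)           # unit-norm atoms
s = 2.0 * D[:, 5]                        # a signal that is exactly one atom
p = omp(D, s, T=1)
print(int(np.count_nonzero(p)))  # 1
```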
Fig. 8 shows the K-SVD dictionary learning flow:
Input: the training sample set Y = {y_i}, the number of dictionary atoms n (each atom of dimension m) and the sparsity T (the upper bound on the number of nonzero coefficients). In the method provided by this embodiment the training sample set Y comes from natural images; of course, the source of the sample set does not limit the method.
Initialization: the dictionary D^(0) ∈ R^(m×n) (n >> m) is an arbitrary full-rank matrix with every column atom normalized; the iteration counter is J = 1.
First step: compute the sparse representation of each training sample using the dictionary D.
Specifically, the orthogonal matching pursuit (OMP) algorithm is used to solve the optimization problem:

min_X ||Y - D X||_F^2  subject to  ||x_i||_0 <= T for every column x_i of X
Second step: update each dictionary atom d_k (k = 1, 2, ..., n) column by column via singular value decomposition.
Specifically, let x_T^k be the k-th row of the coefficient matrix X, i.e. the coefficients multiplying atom d_k, and find the index set ω_k = {i : x_T^k(i) ≠ 0} of the training samples in Y whose representation uses atom d_k. Compute the quantization error matrix of the representation of Y with the influence of atom d_k removed:

E_k = Y - Σ_{j ≠ k} d_j x_T^j

Define the matrix Ω_k whose only nonzero entries are 1 at the positions (ω_k(i), i); Ω_k is a sparse selection matrix, and all its other entries are 0. Multiplying by it keeps only the error columns related to atom d_k, E_k^R = E_k Ω_k, and only the nonzero coefficients, x_R^k = x_T^k Ω_k. The SVD of the restricted error matrix is then computed, E_k^R = U Δ V^T; atom d_k is updated to the first column of U, and the new coefficient vector x_R^k is the first column of V multiplied by Δ(1, 1).
Third step: judge whether to continue updating.
Specifically, if the representation error ||Y - D X||_F^2 is still above the target and no other stopping condition (such as the iteration limit) is met, go to the first step and update the iteration counter J = J + 1; otherwise terminate.
Output: the dictionary D ∈ R^(m×n) (n >> m).
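The second step, the SVD-based atom update, can be sketched as follows (an illustrative NumPy fragment with our own variable names, run here on a dense random coefficient matrix rather than a real OMP output; the optimal rank-1 refit can never increase the representation error):

```python
import numpy as np

def ksvd_atom_update(D, X, Y, k):
    """K-SVD update of atom k: best rank-1 refit of the error matrix
    restricted to the samples whose code actually uses atom k."""
    omega = np.flatnonzero(X[k, :])          # samples using atom k
    if omega.size == 0:
        return
    # error on those samples with atom k's contribution removed
    E = Y[:, omega] - D @ X[:, omega] + np.outer(D[:, k], X[k, omega])
    U, svals, Vt = np.linalg.svd(E, full_matrices=False)
    D[:, k] = U[:, 0]                        # new unit-norm atom
    X[k, omega] = svals[0] * Vt[0, :]        # matching new coefficients

rng = np.random.default_rng(1)
Y = rng.standard_normal((16, 60))            # 60 training samples, m = 16
D = rng.standard_normal((16, 32))            # n = 32 atoms
D /= np.linalg.norm(D, axis=0)
X = rng.standard_normal((32, 60))
before = np.linalg.norm(Y - D @ X)
ksvd_atom_update(D, X, Y, k=3)
after = np.linalg.norm(Y - D @ X)
print(before, after)  # after is never larger than before
```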
Step S124: perform spatial pyramid matching on the sparse vectors of the sub-images to obtain the ScSPM feature vector of the matching template.
In one implementation of this embodiment, after the sparse vectors of the sub-images are obtained, spatial pyramid matching is performed to obtain the ScSPM feature vector of the matching template.
Specifically, the image (here the matching template) is first divided into l layers; layer i divides the image into 2^(i-1) × 2^(i-1) (i = 1, 2, ..., l) image blocks, where l is the number of pyramid layers; in this embodiment l is 3.
Maximum pooling is performed on the sparse vectors within each image block of every layer, so layer i produces 2^(i-1) × 2^(i-1), i.e. 2^(2(i-1)) (i = 1, 2, ..., l), pooled vectors.
Specifically, layer i produces a maximum pooling matrix of the form:

M_i = [M_pq], p, q = 1, 2, ..., 2^(i-1)

where M_pq ∈ R^n is the maximum pooling vector in row p, column q of M_i, i.e. of the image block in row p, column q of layer i. Its elements M_pq(j) (j = 1, 2, ..., n) are computed as:

M_pq(j) = max(P_{u,v}(j), P_{u,v+1}(j), P_{u+1,v}(j), ..., P_{U,V}(j))

i.e. the maximum of the j-th component over all sub-image sparse vectors falling inside the block, where u = (p-1)·unit_line + 1 and v = (q-1)·unit_row + 1. When p, q ≠ 2^(i-1), U = p·unit_line and V = q·unit_row; when p = 2^(i-1) and q ≠ 2^(i-1), U = a and V = q·unit_row; when p ≠ 2^(i-1) and q = 2^(i-1), U = p·unit_line and V = b; when p = q = 2^(i-1), U = a and V = b.
Here unit_line = floor(a / 2^(i-1)) and unit_row = floor(b / 2^(i-1)), so the last block in each direction absorbs the remainder.
Finally, the pooled vectors are concatenated end to end into a single column vector of dimension n·(4^l - 1)/3, i.e. n·(1 + 4 + ... + 2^(2(l-1))); this column vector is the ScSPM feature vector of the matching template.
As shown in Fig. 9, which shows the 3-layer spatial pyramid matching schematic provided by the embodiment.
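A sketch of the max pooling and concatenation above (NumPy; the sparse codes are random stand-ins, and the cell splitting here uses np.array_split, which distributes remainder rows slightly differently from the text's last-cell rule):

```python
import numpy as np

def scspm_pool(P, l=3):
    """Spatial-pyramid max pooling. P has shape (a, b, n): the sparse
    code of each of the a*b sub-images. Layer i cuts the a x b grid
    into 2**(i-1) x 2**(i-1) cells, max-pools the codes inside each
    cell, and all pooled vectors are concatenated end to end."""
    a, b, n = P.shape
    pooled = []
    for i in range(1, l + 1):
        g = 2 ** (i - 1)
        for rows in np.array_split(np.arange(a), g):
            for cols in np.array_split(np.arange(b), g):
                cell = P[np.ix_(rows, cols)]      # (|rows|, |cols|, n)
                pooled.append(cell.max(axis=(0, 1)))
    return np.concatenate(pooled)                 # length n*(4**l - 1)/3

P = np.random.default_rng(2).random((9, 9, 128))  # a = b = 9 sub-images
f = scspm_pool(P, l=3)
print(f.shape)  # (2688,) = 128 * (1 + 4 + 16)
```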
Step S130: obtain multiple particles in frame t according to the location point corresponding to the target image in frame t-1 and the particle filter algorithm incorporating motion estimation, where t is a positive integer greater than 1.
In one implementation of this embodiment, multiple particles in the second frame are obtained from the location point of the target image in the start frame through the particle filter algorithm incorporating motion estimation; the sampling particle region of each particle is determined from the particle's position in the second frame and the ScSPM feature vector of each sampling particle region is computed; the optimal target location at the moment of the second frame is then obtained from the ScSPM feature vectors of the sampling particle regions and of the matching template, giving the location point of the second frame. From the location point of the target image in the second frame, the particles of the third frame are obtained through the particle filter algorithm incorporating motion estimation, the sampling particle region of each particle in the third frame is determined and its ScSPM feature vector computed, and the optimal target location at the moment of the third frame is obtained from the ScSPM feature vectors of the sampling particle regions and of the matching template, giving the location point of the target image in the third frame. The above steps are repeated until the location point of the target image in frame t-1 is obtained; the particles in frame t are then determined from the location point of the target image in frame t-1 through the particle filter algorithm incorporating motion estimation, where t is a positive integer greater than 1.
Specifically, in the t-th frame the sampling particles are drawn from a Gaussian distribution with center (μ_x,t, μ_y,t) and variances σ_x,t, σ_y,t. The center point coordinate (μ_x,t, μ_y,t) of the particle sampling Gaussian distribution is computed from the average velocity of the target over the preceding M frames and the target position (x_t-1, y_t-1) in the previous frame; the variances σ_x,t, σ_y,t of the particle Gaussian distribution are computed from the average acceleration of the target over the preceding M frames.
Here M is the number of preceding frames that participate in the motion estimation; in a preferred implementation of this embodiment, M may take a value between 3 and 20, and it should be understood that this value range of M does not limit the present scheme. It should also be understood that the average velocity and average acceleration of the target over the preceding M frames can be determined from the time difference between every two frames in the preceding M frames and the corresponding change in the target image's location between every two frames.
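As a rough illustration of the motion-estimated sampling step above, the following Python sketch draws particles from a Gaussian centered at the previous target position advanced by the mean velocity of the preceding M frames, with a spread driven by the mean acceleration magnitude. The exact formulas appear only as images in the original patent, so the concrete expressions for the center and variance (and the `min_sigma` floor) are assumptions of this sketch, not the patent's definitive method.

```python
import numpy as np

def sample_particles(positions, n_particles, m=5, dt=1.0, min_sigma=2.0):
    """Sample particles for frame t from a Gaussian whose center is the
    previous target position advanced by the mean velocity of the last M
    frames, and whose spread grows with the mean acceleration magnitude.

    positions: list of (x, y) target positions for frames 1..t-1.
    """
    positions = np.asarray(positions, dtype=float)
    m = min(m, len(positions) - 1)
    if m >= 1:
        # per-frame velocities over the last m transitions
        velocities = np.diff(positions[-(m + 1):], axis=0) / dt
        v_mean = velocities.mean(axis=0)
    else:
        v_mean = np.zeros(2)
    if m >= 2:
        # per-frame accelerations; their mean magnitude widens the search
        accels = np.diff(velocities, axis=0) / dt
        a_mean = np.abs(accels).mean(axis=0)
    else:
        a_mean = np.zeros(2)
    center = positions[-1] + v_mean * dt              # (mu_x,t, mu_y,t)
    sigma = np.maximum(a_mean * dt ** 2, min_sigma)   # (sigma_x,t, sigma_y,t)
    return np.random.normal(center, sigma, size=(n_particles, 2))
```

With a constant-velocity history the acceleration term vanishes and the spread falls back to the `min_sigma` floor.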
Step S140: determine the sampling particle region corresponding to each particle according to the position of each particle in the t-th frame image, and calculate the ScSPM feature vector of each sampling particle region.
In the present embodiment, the sampling particle region corresponding to each particle is first determined according to the position of each particle in the t-th frame image. A particle sampling region is obtained by expanding outward from the particle as center until a region equal in size to the target image is obtained; this region is called the particle sampling region. Then, according to the sparse coding algorithm of the spatial pyramid matching technique, the ScSPM feature vector of each sampling particle region is calculated. The calculation method is the same as the ScSPM feature vector calculation method for the matching template, and details are not repeated here.
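A minimal sketch of how a target-sized sampling region could be cropped around each particle. The clamping-to-image-bounds behavior is an assumption of this sketch; the patent only states that the region is expanded outward from the particle until it equals the target image in size.

```python
import numpy as np

def crop_particle_region(image, center, target_h, target_w):
    """Crop a target-sized sampling region centered on a particle,
    clamping the window so it stays inside the image (assumed behavior)."""
    h, w = image.shape[:2]
    cx, cy = int(round(center[0])), int(round(center[1]))
    # clamp the top-left corner so the full window fits in the image
    x0 = min(max(cx - target_w // 2, 0), w - target_w)
    y0 = min(max(cy - target_h // 2, 0), h - target_h)
    return image[y0:y0 + target_h, x0:x0 + target_w]
```

Each cropped region is then fed to the same ScSPM pipeline used for the matching template.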
Step S150: obtain the optimal location of the target at the moment corresponding to the t-th frame image according to the ScSPM feature vectors of the multiple sampling particle regions and the ScSPM feature vector of the matching template.
In one embodiment of the invention, the similarity between the ScSPM feature vectors of the multiple sampling particle regions and the ScSPM feature vector of the matching template is calculated, and is used to determine the optimal location of the target at the moment corresponding to the t-th frame image.
Specifically, please refer to Figure 10, which shows a schematic flowchart of step S150 of the method of target following provided by the first embodiment of the invention. The process shown in Figure 10 is explained in detail below; the method comprises:
Step S151: calculate the Bhattacharyya distance between the ScSPM feature vector of each sampling particle region and the ScSPM feature vector of the matching template.
As a preferred mode of the present embodiment, the similarity between the ScSPM feature vector of each sampling particle region and the ScSPM feature vector of the matching template is measured by the Bhattacharyya distance, calculated by the following formula:

d_i = √(1 − Σ_{l=1}^{L} √(T_0(l) · T_t^i(l)))

where T_0 is the ScSPM feature vector of the matching template, T_t^i is the ScSPM feature vector of the i-th particle region in the t-th frame, and L is the length of the feature vectors.
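The Bhattacharyya distance between the template and a particle-region feature vector can be sketched as below. Treating the ScSPM vectors as discrete distributions by normalizing them to unit sum is an assumption of this sketch; the patent's exact formula appears only as an image in the original document.

```python
import numpy as np

def bhattacharyya_distance(template_vec, particle_vec, eps=1e-12):
    """Bhattacharyya distance between two ScSPM feature vectors.
    Both vectors are normalized to unit sum so they can be treated as
    discrete distributions (an assumption of this sketch)."""
    p = np.abs(template_vec) / (np.abs(template_vec).sum() + eps)
    q = np.abs(particle_vec) / (np.abs(particle_vec).sum() + eps)
    bc = np.sqrt(p * q).sum()                 # Bhattacharyya coefficient
    return float(np.sqrt(max(1.0 - bc, 0.0)))
```

Identical vectors give a distance near 0; vectors with disjoint support give a distance of 1.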
Step S152: calculate the likelihood function corresponding to each sampling particle region according to the Bhattacharyya distance corresponding to each sampling particle region.
After the Bhattacharyya distance corresponding to each sampling particle region is obtained, the likelihood function corresponding to each sampling particle region is calculated from that distance; the smaller the Bhattacharyya distance, the larger the likelihood.
Step S153: update the weight of the particle corresponding to each likelihood function according to the likelihood function, and normalize the weights.
Specifically, after the likelihood function corresponding to each sampling particle region is obtained, the weight of the corresponding particle is updated according to the likelihood function. The weight update for the particles in the t-th frame multiplies each particle's weight from the previous frame by its likelihood, w_t^i = w_{t-1}^i · p_t^i, and the updated weights are then normalized, w̃_t^i = w_t^i / Σ_{j=1}^{N} w_t^j.
Here N is the total number of particles; in a preferred implementation of the present embodiment, the value of N is between 50 and 150, and it should be understood that this value of N does not limit the present scheme. In the start frame image, the weight of each particle may be set to 1/N.
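The weight update and normalization of steps S152-S153 can be sketched as follows. The Gaussian-style likelihood `exp(-d**2 / (2*sigma**2))` and the value of `sigma` are assumptions, since the patent's likelihood formula is not reproduced in this text; the multiply-then-normalize structure follows the standard particle filter update.

```python
import numpy as np

def update_weights(prev_weights, distances, sigma=0.2):
    """Update particle weights from Bhattacharyya distances and normalize
    them to sum to 1. The exponential likelihood form is an assumption."""
    prev_weights = np.asarray(prev_weights, dtype=float)
    d = np.asarray(distances, dtype=float)
    likelihood = np.exp(-d ** 2 / (2.0 * sigma ** 2))
    w = prev_weights * likelihood          # w_t^i = w_{t-1}^i * p_t^i
    return w / w.sum()                     # normalization step
```

Particles close to the template (small distance) end up with the largest normalized weights.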
Step S154: calculate the weighted average of the positions of the multiple particles in the t-th frame image with their weights, and obtain the optimal location of the target at the moment corresponding to the t-th frame image.
Specifically, the optimal location of the target at the moment corresponding to the t-th frame image is obtained by the following calculation:

(x_t, y_t) = Σ_{i=1}^{N} w_t^i · (x_t^i, y_t^i)

where (x_t, y_t) is the tracked target position in the t-th frame and (x_t^i, y_t^i) are the coordinates of the i-th particle.
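Step S154 is a plain weighted mean of the particle positions; a minimal sketch:

```python
import numpy as np

def estimate_position(particles, weights):
    """Optimal target position as the weight-weighted average of the
    particle positions: (x_t, y_t) = sum_i w_i * (x_i, y_i)."""
    particles = np.asarray(particles, dtype=float)
    weights = np.asarray(weights, dtype=float)
    return weights @ particles / weights.sum()
```

Dividing by the weight sum makes the estimate robust even if the weights are not perfectly normalized.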
Step S160: judge whether the matching template needs to be updated.
During the tracking of the target image, the target is easily affected by factors such as scale change, illumination, noise and occlusion, and a fixed template usually cannot provide stable and effective tracking, so it is necessary to judge whether the matching template needs to be updated.
Specifically, in a currently preferred embodiment, a threshold method is used to determine whether the template needs to be updated. Let d_avr be the mean of the Bhattacharyya distances between the target obtained in the preceding num frames and the feature vector of the matching template; d_avr is calculated as the arithmetic mean of those distances over the preceding num frames.
d_avr is compared with the threshold th. If d_avr > th, it is determined that the matching template needs to be updated, and the matching template is updated to the template obtained by the weighted summation of the matching template corresponding to the t-th frame image and the matching template corresponding to the (t-1)-th frame image, calculated by the following formula:

T_new = α·T_t + (1 − α)·T_{t-1}
Here num denotes a preset number of frames, th denotes a preset threshold, and α denotes a preset weight. In a preferred implementation of the present embodiment, the value of num may be between 5 and 10, the value of th may be between 0.7 and 0.8, and the value of α may be between 0.05 and 0.95; it should be understood that these value ranges of num, th and α do not limit the present scheme.
If d_avr ≤ th, the current template is retained and continues to be used as the matching template.
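The threshold-based template update of step S160 can be sketched as below, using the weighted sum T_new = α·T_t + (1 − α)·T_{t-1}. The function and parameter names are illustrative, not from the patent.

```python
import numpy as np

def maybe_update_template(t_prev, t_curr, recent_distances, th=0.75, alpha=0.1):
    """Update the matching template when the mean Bhattacharyya distance
    over the last num frames exceeds th; otherwise keep the current one.
    New template: T_new = alpha * T_t + (1 - alpha) * T_{t-1}."""
    d_avr = float(np.mean(recent_distances))
    if d_avr > th:
        return alpha * np.asarray(t_curr) + (1.0 - alpha) * np.asarray(t_prev)
    return np.asarray(t_prev)  # retain the current template
```

A small alpha lets the template adapt slowly, limiting drift when individual frames are noisy.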
Step S170: judge whether the particles need to be reacquired.
In the present embodiment, as the tracking process advances, the particle weights become extremely uneven: the sum of the weights of a few particles approaches 1 while the weights of most particles approach 0, the so-called weight degeneracy phenomenon. As one way of overcoming the weight degeneracy problem, it is necessary to judge whether the particles need to be reacquired.
Specifically, whether the particles need to be reacquired can be judged from the effective particle number N_v, which is calculated by the following formula:

N_v = 1 / Σ_{i=1}^{N} (w_t^i)²
A threshold β is set; when N_v/N < β, it is determined that the particles need to be reacquired. In a preferred implementation of the embodiment, the value of β may be 1/2, and it should be understood that this value of β does not limit the present scheme.
When it is determined that the particles need to be reacquired, particles with too-low weights are eliminated and particles with large weights are replicated, so that the total number of particles remains unchanged. Specifically, a sequence of uniform random numbers U = {u_1, u_2, …, u_N} is first generated on [0, 1], where u_j ~ U(0, 1), j = 1, 2, …, N, and the accumulated weight probabilities P = {p_1, p_2, …, p_N} are calculated, where p_k = Σ_{i=1}^{k} w_t^i, k = 1, 2, …, N. Then, for each j, the index k satisfying p_{k-1} < u_j < p_k is found, and the k-th particle is replicated into the position of the j-th particle after reacquisition. As one implementation, the weights of the particles after replication are equal, namely 1/N; it should be understood that this weight value does not limit the present scheme, and the weights of some particles may differ.
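The degeneracy check and resampling of step S170 can be sketched as follows, using the effective-sample-size criterion N_v = 1/Σ(w_i²) and multinomial resampling via the accumulated weight probabilities, matching the procedure described above; the helper's interface is an assumption.

```python
import numpy as np

def resample_if_degenerate(particles, weights, beta=0.5):
    """Resample when the effective particle number N_v = 1/sum(w_i^2)
    falls below beta * N; resampled particles all get weight 1/N."""
    particles = np.asarray(particles, dtype=float)
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    n = len(weights)
    n_eff = 1.0 / np.sum(weights ** 2)       # effective particle number N_v
    if n_eff / n >= beta:
        return particles, weights            # no degeneracy: keep as-is
    cumulative = np.cumsum(weights)          # accumulated probabilities p_1..p_N
    u = np.random.uniform(0.0, 1.0, size=n)  # uniform random numbers u_j
    # for each u_j, find k with p_{k-1} < u_j <= p_k (clamped for safety)
    idx = np.minimum(np.searchsorted(cumulative, u), n - 1)
    return particles[idx], np.full(n, 1.0 / n)
```

High-weight particles are drawn many times and low-weight ones disappear, while the total particle count stays fixed.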
The method of target following provided by the embodiments of the present invention first constructs a target model using sparse coding after obtaining the target image, and processes the target image in blocks to obtain local information about the target, extracting a SIFT feature descriptor vector from each sub-image; this addresses the decline or even loss of tracking performance caused by rotation, deformation and scaling of the target. Further, the scheme introduces sparse coding into spatial pyramid matching to obtain a fixed-length output while retaining the global and local information of the target image, which addresses tracking failure caused by occlusion of the target during tracking. The method also introduces motion estimation to handle fast target motion, and updates the matching template to handle target pose changes, illumination variation and similar problems. Therefore, the method can still maintain good tracking robustness in complex situations.
Second embodiment
Please refer to Figure 11, which is a structural block diagram of a device for target following provided by the second embodiment of the invention. The block diagram shown in Figure 11 is described below. The device 200 comprises: an image collection module 210, a first eigenvector obtaining module 220, a particle acquisition module 230, a second feature vector obtaining module 240, an optimal location obtaining module 250, a matching template judgment module 260 and a particle judgment module 270, in which:
The image collection module 210 is configured to obtain the target image corresponding to the tracked target in the start frame image and the location point corresponding to the target image, taking the target image in the start frame image as the matching template.
The first eigenvector obtaining module 220 is configured to obtain the ScSPM feature vector of the matching template according to the sparse coding algorithm of the spatial pyramid matching technique. The first eigenvector obtaining module 220 further comprises: an image dividing submodule 222 for dividing the target image into multiple sub-images; a SIFT feature descriptor vector obtaining submodule 224 for extracting a SIFT feature descriptor vector from each sub-image; a sub-image sparse vector obtaining submodule 226 for obtaining the sparse vector of each sub-image according to the SIFT feature descriptor vectors and the sparse coding algorithm; and a first eigenvector obtaining submodule 228 for performing spatial pyramid matching on the sparse vectors of the sub-images to obtain the ScSPM feature vector of the matching template.
The particle acquisition module 230 is configured to obtain the multiple particles in the t-th frame image according to the location point corresponding to the target image in the (t-1)-th frame image before the t-th frame image and the particle filter algorithm incorporating motion estimation, wherein t is a positive integer greater than 1.
The second feature vector obtaining module 240 is configured to determine the sampling particle region corresponding to each particle according to the position of each particle in the t-th frame image, and to calculate the ScSPM feature vector of each sampling particle region.
The optimal location obtaining module 250 is configured to obtain the optimal location of the target at the moment corresponding to the t-th frame image according to the ScSPM feature vectors of the multiple sampling particle regions and the ScSPM feature vector of the matching template. The optimal location obtaining module 250 further comprises: a Bhattacharyya distance calculation submodule 252 for calculating the Bhattacharyya distance between the ScSPM feature vector of each sampling particle region and the ScSPM feature vector of the matching template; a likelihood function calculation submodule 254 for calculating the likelihood function corresponding to each sampling particle region according to the Bhattacharyya distance corresponding to each sampling particle region; a weight update submodule 256 for updating the weight of the particle corresponding to each likelihood function according to the likelihood function and normalizing the weights; and an optimal location obtaining submodule 258 for calculating the weighted average of the positions of the multiple particles in the t-th frame image with their weights to obtain the optimal location of the target at the moment corresponding to the t-th frame image.
The matching template judgment module 260 is configured to judge whether the matching template needs to be updated; when it is determined that the matching template needs to be updated, the matching template is updated to the weighted sum of the matching template corresponding to the t-th frame image and the matching template corresponding to the (t-1)-th frame image.
The particle judgment module 270 is configured to judge whether the particles need to be reacquired; when it is determined that the particles need to be reacquired, particles with low weights are eliminated and particles with large weights are replicated, keeping the total number of particles unchanged.
For the process by which each functional module of the device 200 of target following implements its respective function, the present embodiment refers to the content described in the embodiments shown in Figures 1 to 9, and details are not repeated here.
In conclusion the embodiment of the present invention propose target following method and device obtain first it is right in start frame image The target image and the corresponding location point of the target image that target should be tracked, using the target image in the start frame image as Matching template obtains the ScSPM feature vector of matching template according to the sparse coding algorithm of spatial pyramid matching technique;So Afterwards according to the particle filter algorithm of the corresponding location point of target image in t-1 frame image and involvement estimation, t frame is obtained Multiple particles in image, wherein t is the positive integer greater than 1, is determined according to the position of each particle in t frame image each The corresponding sampling particle region of particle and the ScSPM feature vector for calculating each particle region;Finally according to multiple sampling particles It is optimal at t frame image corresponding moment that the ScSPM feature vector of the ScSPM feature vector in region and matching template obtains target Position is to realize that moving target is tracked in the lasting robustness of complex condition.
In the several embodiments provided in the present application, it should be understood that the disclosed device and method may also be realized in other ways. The device embodiments described above are merely illustrative; for example, the flowcharts and block diagrams in the accompanying drawings show the possible architectures, functions and operations of the devices, methods and computer program products according to multiple embodiments of the present invention. In this regard, each box in a flowchart or block diagram may represent a module, a program segment, or part of code, and that module, segment, or part of code contains one or more executable instructions for implementing the specified logical function. It should also be noted that in some alternative implementations, the functions marked in the boxes may occur in an order different from that marked in the drawings. For example, two consecutive boxes may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes in the block diagrams and/or flowcharts, can be implemented by a dedicated hardware-based system that performs the specified functions or actions, or by a combination of dedicated hardware and computer instructions.
In addition, the functional modules in the embodiments of the present invention may be integrated together to form one independent part, or each module may exist separately, or two or more modules may be integrated to form one independent part.
If the functions are realized in the form of software functional modules and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product; the computer software product is stored in a storage medium and includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash disk, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disk. It should be noted that, herein, relational terms such as first and second are used merely to distinguish one entity or operation from another entity or operation, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any other variant thereof are intended to cover non-exclusive inclusion, so that a process, method, article or device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or device. In the absence of further limitation, an element defined by the phrase "including a ..." does not exclude the presence of other identical elements in the process, method, article or device that includes the element.
The foregoing is merely a preferred embodiment of the present invention and is not intended to limit the invention; for those skilled in the art, the invention may be variously modified and varied. Any modification, equivalent replacement, improvement and the like made within the spirit and principles of the present invention shall be included in the protection scope of the present invention. It should also be noted that similar labels and letters denote similar items in the following drawings; therefore, once an item is defined in one drawing, it need not be further defined and explained in subsequent drawings.
The above is merely a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto; any person familiar with the technical field can easily conceive of changes or replacements within the technical scope disclosed by the present invention, and these should all be covered within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (10)

1. A method of target following, characterized in that the method comprises:
obtaining the target image corresponding to a tracked target in a start frame image and the location point corresponding to the target image, and taking the target image in the start frame image as a matching template;
obtaining the ScSPM feature vector of the matching template according to the sparse coding algorithm of the spatial pyramid matching technique;
obtaining multiple particles in the t-th frame image according to the location point corresponding to the target image in the (t-1)-th frame image and a particle filter algorithm incorporating motion estimation, wherein t is a positive integer greater than 1;
determining the sampling particle region corresponding to each particle according to the position of each particle in the t-th frame image, and calculating the ScSPM feature vector of each sampling particle region;
obtaining the optimal location of the target at the moment corresponding to the t-th frame image according to the ScSPM feature vectors of the multiple sampling particle regions and the ScSPM feature vector of the matching template.
2. The method according to claim 1, wherein obtaining the ScSPM feature vector of the matching template according to the sparse coding algorithm of the spatial pyramid matching technique comprises:
dividing the target image into multiple sub-images;
extracting a SIFT feature descriptor vector from each sub-image;
obtaining the sparse vector of each sub-image according to the SIFT feature descriptor vectors and the sparse coding algorithm;
performing spatial pyramid matching on the sparse vectors of the sub-images to obtain the ScSPM feature vector of the matching template.
3. The method according to claim 1, wherein obtaining the optimal location of the target at the moment corresponding to the t-th frame image according to the ScSPM feature vectors of the multiple sampling particle regions and the ScSPM feature vector of the matching template comprises:
calculating the Bhattacharyya distance between the ScSPM feature vector of each sampling particle region and the ScSPM feature vector of the matching template;
calculating the likelihood function corresponding to each sampling particle region according to the Bhattacharyya distance corresponding to each sampling particle region;
updating the weight of the particle corresponding to each likelihood function according to the likelihood function, and normalizing the weights;
calculating the weighted average of the positions of the multiple particles in the t-th frame image with their weights to obtain the optimal location of the target at the moment corresponding to the t-th frame image.
4. The method according to claim 1, wherein after obtaining the optimal location of the target at the moment corresponding to the t-th frame image according to the ScSPM feature vectors of the multiple sampling particle regions and the ScSPM feature vector of the matching template, the method further comprises:
judging whether the matching template needs to be updated;
when it is determined that the matching template needs to be updated, updating the matching template to the weighted sum of the matching template corresponding to the t-th frame image and the matching template corresponding to the (t-1)-th frame image.
5. The method according to claim 1, wherein after obtaining the optimal location of the target at the moment corresponding to the t-th frame image according to the ScSPM feature vectors of the multiple sampling particle regions and the ScSPM feature vector of the matching template, the method further comprises:
judging whether the particles need to be reacquired;
when it is determined that the particles need to be reacquired, eliminating particles with low weights and replicating particles with large weights, the total number of particles remaining unchanged.
6. A device for target following, characterized in that the device comprises:
an image collection module for obtaining the target image corresponding to a tracked target in a start frame image and the location point corresponding to the target image, and taking the target image in the start frame image as a matching template;
a first eigenvector obtaining module for obtaining the ScSPM feature vector of the matching template according to the sparse coding algorithm of the spatial pyramid matching technique;
a particle acquisition module for obtaining multiple particles in the t-th frame image according to the location point corresponding to the target image in the (t-1)-th frame image before the t-th frame image and a particle filter algorithm incorporating motion estimation, wherein t is a positive integer greater than 1;
a second feature vector obtaining module for determining the sampling particle region corresponding to each particle according to the position of each particle in the t-th frame image and calculating the ScSPM feature vector of each sampling particle region;
an optimal location obtaining module for obtaining the optimal location of the target at the moment corresponding to the t-th frame image according to the ScSPM feature vectors of the multiple sampling particle regions and the ScSPM feature vector of the matching template.
7. The device according to claim 6, wherein the first eigenvector obtaining module comprises:
an image dividing submodule for dividing the target image into multiple sub-images;
a SIFT feature descriptor vector obtaining submodule for extracting a SIFT feature descriptor vector from each sub-image;
a sub-image sparse vector obtaining submodule for obtaining the sparse vector of each sub-image according to the SIFT feature descriptor vectors and the sparse coding algorithm;
a first eigenvector obtaining submodule for performing spatial pyramid matching on the sparse vectors of the sub-images to obtain the ScSPM feature vector of the matching template.
8. The device according to claim 6, wherein the optimal location obtaining module comprises:
a Bhattacharyya distance calculation submodule for calculating the Bhattacharyya distance between the ScSPM feature vector of each sampling particle region and the ScSPM feature vector of the matching template;
a likelihood function calculation submodule for calculating the likelihood function corresponding to each sampling particle region according to the Bhattacharyya distance corresponding to each sampling particle region;
a weight update submodule for updating the weight of the particle corresponding to each likelihood function according to the likelihood function and normalizing the weights;
an optimal location obtaining submodule for calculating the weighted average of the positions of the multiple particles in the t-th frame image with their weights to obtain the optimal location of the target at the moment corresponding to the t-th frame image.
9. The device according to claim 6, wherein the device further comprises:
a matching template judgment module for judging whether the matching template needs to be updated;
wherein, when it is determined that the matching template needs to be updated, the matching template is updated to the weighted sum of the matching template corresponding to the t-th frame image and the matching template corresponding to the (t-1)-th frame image.
10. The device according to claim 6, wherein the device further comprises:
a particle judgment module for judging whether the particles need to be reacquired;
wherein, when it is determined that the particles need to be reacquired, particles with low weights are eliminated and particles with large weights are replicated, the total number of particles remaining unchanged.
CN201710217051.3A 2017-04-05 2017-04-05 The method and device of target following Active CN106980843B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710217051.3A CN106980843B (en) 2017-04-05 2017-04-05 The method and device of target following

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201710217051.3A CN106980843B (en) 2017-04-05 2017-04-05 The method and device of target following

Publications (2)

Publication Number Publication Date
CN106980843A CN106980843A (en) 2017-07-25
CN106980843B true CN106980843B (en) 2019-09-06

Family

ID=59345640

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710217051.3A Active CN106980843B (en) 2017-04-05 2017-04-05 The method and device of target following

Country Status (1)

Country Link
CN (1) CN106980843B (en)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109840457B (en) * 2017-11-29 2021-05-18 深圳市掌网科技股份有限公司 Augmented reality registration method and augmented reality registration device
CN108986135B (en) * 2018-06-05 2021-12-14 南京航空航天大学 Target tracking method and device based on LLC and frequency domain residual error significance
CN109493370B (en) * 2018-10-12 2021-07-02 西南交通大学 Target tracking method based on space offset learning
CN111524159A (en) * 2019-02-01 2020-08-11 北京京东尚科信息技术有限公司 Image processing method and apparatus, storage medium, and processor
CN110516618B (en) * 2019-08-29 2022-04-12 苏州大学 Assembly robot and assembly method and system based on vision and force position hybrid control
CN110837791B (en) * 2019-11-02 2023-04-07 山东科技大学 Sound velocity profile inversion method based on over-complete dictionary
CN112365521B (en) * 2020-12-08 2021-08-27 萱闱(北京)生物科技有限公司 Speed monitoring method and device of terminal equipment, medium and computing equipment
CN113205145B (en) * 2021-05-18 2022-08-16 广州大学 Template matching method, system, device and medium based on normalized cross correlation

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011063616A1 (en) * 2009-11-24 2011-06-03 杭州海康威视软件有限公司 Method and apparatus for moving object autotracking

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP4988408B2 (en) * 2007-04-09 2012-08-01 株式会社デンソー Image recognition device

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011063616A1 (en) * 2009-11-24 2011-06-03 杭州海康威视软件有限公司 Method and apparatus for moving object autotracking

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
"Target Tracking Method Based on Particle Filter and Sparse Representation"; Yang Dawei et al.; Pattern Recognition and Artificial Intelligence; 2013-07-31; Vol. 26, No. 7; pp. 680-687
"Pedestrian Detection in Infrared Images Guided by Laser Radar"; Wei Li et al.; Computer Engineering and Applications; 2017-03-22; pp. 197-202
"Research on the Application of Particle Filter Algorithms in Video Tracking"; He Yongfeng; China Masters' Theses Full-text Database, Information Science and Technology; 2013-12-15; Chapters 1-4

Also Published As

Publication number Publication date
CN106980843A (en) 2017-07-25

Similar Documents

Publication Publication Date Title
CN106980843B (en) The method and device of target following
CN111795704B (en) Method and device for constructing visual point cloud map
CN109949341B (en) Pedestrian target tracking method based on human skeleton structural features
CN106203423B (en) Weak structure perception visual target tracking method fusing context detection
CN107633226B (en) Human body motion tracking feature processing method
CN112967341B (en) Indoor visual positioning method, system, equipment and storage medium based on live-action image
CN110334762B (en) Feature matching method based on quad tree combined with ORB and SIFT
CN104766343B (en) A kind of visual target tracking method based on rarefaction representation
Birchfield et al. Spatial histograms for region‐based tracking
CN106815842B (en) improved super-pixel-based image saliency detection method
CN105279769B (en) A kind of level particle filter tracking method for combining multiple features
Singh et al. A novel approach to combine features for salient object detection using constrained particle swarm optimization
CN109584302A (en) Camera pose optimization method, device, electronic equipment and computer-readable medium
CN106228121B (en) Gesture feature recognition method and device
CN107341815B (en) Violent motion detection method based on multi-view stereoscopic vision scene stream
CN106157330B (en) Visual tracking method based on target joint appearance model
CN110222572A (en) Tracking, device, electronic equipment and storage medium
CN113762009B (en) Crowd counting method based on multi-scale feature fusion and double-attention mechanism
Zhang et al. A swarm intelligence based searching strategy for articulated 3D human body tracking
CN110910421A (en) Weak and small moving object detection method based on block characterization and variable neighborhood clustering
Yung et al. Efficient feature-based image registration by mapping sparsified surfaces
CN109685830A (en) Method for tracking target, device and equipment and computer storage medium
CN113379789A (en) Moving target tracking method in complex environment
Hayman et al. Probabilistic and voting approaches to cue integration for figure-ground segmentation
CN111709317A (en) Pedestrian re-identification method based on multi-scale features under saliency model

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant