CN102682454B - Method and device for tracking region of interest in video

Info

Publication number: CN102682454B
Application number: CN 201210132913
Authority: CN
Inventors: 刘震; 张冬; 李厚强
Original assignee: University of Science and Technology of China USTC
Current assignee: University of Science and Technology of China USTC
Priority date: 2012-04-28
Filing date: 2012-04-28
Publication date: 2013-05-08
Anticipated expiration: 2032-04-28
Also published as: CN102682454A

Abstract

The invention discloses a method and a device for tracking a region of interest (ROI) in a video. The method comprises: obtaining the motion vectors of pixels or macroblocks in the current frame; determining the moving-speed distribution parameter of the ROI from those motion vectors; determining an ROI scaling parameter from the ROI state in a reference frame; using the moving-speed distribution parameter and the scaling parameter to perform state transition on the particles sampled in the current frame; and determining the ROI position and size in the current frame from the particles after the state transition. The method and device use motion vector information that already exists in the compressed bitstream or is generated during encoding to guide the particle state transition. This reduces the number of particles required during tracking while preserving the tracking effect, which in turn lowers the complexity of the tracking process.

Description

Method and device for tracking a region of interest in video

Technical field

The present invention relates to the technical field of video processing, and in particular to a method and a device for tracking a region of interest during video processing.

Background art

With the spread and development of communication technology, video services such as mobile TV, video conferencing and video surveillance have grown rapidly. Users access these services through a variety of terminals and access methods, so the diversity of user terminals and the complexity of network environments make the effective production and transmission of video content a major challenge in designing video service systems.

At present, SVC (scalable video coding) technology can transmit video content effectively to a certain extent. SVC encodes sub-bitstreams of several bit rates, resolutions and frame rates into a single bitstream, so a transmission node can generate a suitably adapted bitstream with a simple extraction operation, according to network conditions and the needs of the user or user equipment. Compared with single-bitstream technology, SVC provides a bitstream that is scalable in space, time and quality, that is, a bitstream from which several sub-bitstreams can be extracted. These sub-bitstreams can satisfy the requirements that network transmission speed and end users place on the video in terms of space, time and signal-to-noise ratio (SNR). SVC therefore lets a video stream adapt better to varied network environments and user terminals.

In SVC, the coding layer that provides the minimum quality is called the BL (base layer), and a coding layer that improves spatial resolution, temporal resolution or SNR is called an EL (enhancement layer). Spatial scalability uses layered coding and exploits inter-layer motion, texture and residual information; temporal scalability uses hierarchical bidirectionally predicted frame (Hierarchical B) coding; SNR scalability can use CGS (coarse-grain quality scalability) and MGS (medium-grain quality scalability).

SVC also supports ROI (region of interest) coding. An ROI usually refers to a region of a video frame that contains an object with clear high-level semantics for the viewer, such as a person or an object. While a user is browsing video, if the display of the device is small or the available bandwidth drops, the sharpness of the region of interest can be preserved so that the user's viewing experience is affected as little as possible. For example, when access bandwidth is insufficient, the non-ROI regions can be dropped to meet the bandwidth constraint while limiting the impact on subjective video quality. That is, when the bandwidth cannot carry the encoded bitstreams of both the base layer and the enhancement layer, the encoded bitstreams of the base layer and the ROI can be transmitted instead, making full use of the bandwidth, preserving video quality to a certain extent and protecting the user's subjective experience.

To apply ROI coding in these applications, the position and size of the ROI must be determined in every video frame; video tracking is usually used for this purpose.

The way current video tracking techniques determine the ROI in each video frame is described below.

One ROI tracking approach in current use is based on the particle filter algorithm. The tracked region is represented as a particle (a rectangle, an ellipse, and so on), and a particle can carry several state components, such as moving speed, direction and region size. During tracking, a certain number of particles are generated in the current frame by importance sampling; according to the correlation between these particles and the particle of the region to be tracked in the reference frame, the stable state of the particle in the current frame is obtained by weighting, which gives the region to be tracked in the current frame.

Specifically, obtaining the region to be tracked in the current frame with the particle filter algorithm comprises:

(1) Initialization: take k = 0 (the initial time) and draw N sample points x_{0}^{(i)}, i = 1, …, N, according to p (x_{0}), where p (x_{0}) is the initial probability distribution of the target. Specifically, it can be set to a Gaussian distribution whose mean is the position and size of the target in the first frame.

(2) Importance sampling:

Let

x_{k}^{(i)} ∼ q (x_{k} | x_{0 : k - 1}^{(i)}, z_{1 : k}),

where i = 1, …, N; x_{k}^{(i)} denotes the state of the i-th particle at time k; x_{0 : k}^{(i)} denotes the states of the particle from time 0 (the initial time) to time k; z_{1 : k} denotes the observations of the target (generally the color histogram of the tracked target) from time 1 (the time tracking starts) to time k; and q (x_{k} | x_{0 : k - 1}^{(i)}, z_{1 : k}) is the importance function, that is, the estimated distribution of the particle in frame k given the observations of frames 1 to k and the states of the i-th particle up to frame k-1.

(3) Compute the weights:

ω_{k}^{(i)} = ω_{k - 1}^{(i)} \frac{p (z_{k} | x_{k}^{(i)}) p (x_{k}^{(i)} | x_{k - 1}^{(i)})}{q (x_{k}^{(i)} | x_{0 : k - 1}^{(i)}, z_{1 : k})},

If the one-step transition distribution p (x_{k}^{(i)} | x_{k - 1}^{(i)}) is used as the importance function, this formula simplifies to:

ω_{k}^{(i)} = ω_{k - 1}^{(i)} p (z_{k} | x_{k}^{(i)}),

where p (z_{k} | x_{k}^{(i)}) is the observation model, that is, the probability that particle x_{k}^{(i)} is the tracked target, and p (x_{k}^{(i)} | x_{k - 1}^{(i)}) is the state transition model, that is, the probability distribution model of the target's motion from frame k-1 to frame k.

(4) Normalize the weights:

{\tilde{ω}}_{k}^{(i)} = \frac{ω_{k}^{(i)}}{Σ_{j = 1}^{N} ω_{k}^{(j)}} .

(5) Resampling: copy or discard the samples x_{0 : k}^{(i)} according to the size of their normalized weights {\tilde{ω}}_{k}^{(i)}, obtaining N samples that approximately follow the distribution p (x_{0 : k} | z_{1 : k}), and let

ω_{k}^{(i)} = {\tilde{ω}}_{k}^{(i)} = 1 / N, i = 1, …, N.

(6) Output: the output of the algorithm is the particle set \{x_{0 : k}^{(i)} : i = 1, …, N\}, which can be used to approximate the posterior probability and the expectation of the function x_{0 : k}, where:

Posterior probability:

\hat{p} (x_{0 : k} | z_{1 : k}) = \frac{1}{N} Σ_{i = 1}^{N} δ_{x_{0 : k}^{(i)}} ({dx}_{0 : k});

Expectation of the function x_{0 : k}:

E (x_{0 : k}) = \frac{1}{N} Σ_{i = 1}^{N} x_{0 : k}^{(i)} .

(7) Let k = k + 1 and repeat steps (2) to (6).
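Steps (1) to (7) can be illustrated with a minimal sketch on a one-dimensional toy state. It assumes the simplified weight update (the transition model is used as the importance function), a Gaussian random-walk transition, a Gaussian observation likelihood and multinomial resampling; the function names `sir_step` and `track` and all numeric settings are hypothetical and are not taken from the text.

```python
import numpy as np

def sir_step(particles, weights, likelihood, transition, rng):
    """One pass of steps (2) to (5) for a set of scalar particles."""
    # (2) importance sampling: propagate every particle through the transition model
    proposed = transition(particles, rng)
    # (3) simplified weight update: w_k = w_{k-1} * p(z_k | x_k)
    w = weights * likelihood(proposed) + 1e-300  # tiny offset avoids an all-zero sum
    # (4) normalise the weights
    w = w / w.sum()
    # (5) resample: copy or discard particles in proportion to their weights
    idx = rng.choice(len(proposed), size=len(proposed), p=w)
    uniform = np.full(len(proposed), 1.0 / len(proposed))
    return proposed[idx], uniform

def track(observations, n_particles=500, seed=0):
    """Run the filter over a sequence of scalar observations (toy example)."""
    rng = np.random.default_rng(seed)
    # (1) initialisation: Gaussian around the first observation (assumed prior)
    particles = rng.normal(observations[0], 1.0, size=n_particles)
    weights = np.full(n_particles, 1.0 / n_particles)
    estimates = []
    for z in observations[1:]:  # (7) advance k and repeat
        particles, weights = sir_step(
            particles, weights,
            lambda x, z=z: np.exp(-0.5 * (x - z) ** 2),          # assumed observation model
            lambda x, r: x + r.normal(0.0, 1.0, size=x.shape),   # assumed random-walk transition
            rng,
        )
        # (6) output: after resampling the expectation is the plain particle mean
        estimates.append(particles.mean())
    return np.array(estimates)
```

Because the random-walk transition carries no information about where the target is heading, many particles are needed to cover the possible moves, which is the cost the following paragraph refers to.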

In this implementation, a stable tracking effect requires a large number of particles, and the more particles are used, the greater the computation needed for tracking, so processing complexity rises sharply.

Summary of the invention

The purpose of the present invention is to provide a method and a device for tracking a region of interest in video, so as to reduce the processing complexity of tracking while preserving the tracking effect.

The purpose of the invention is achieved through the following technical solutions:

A method for tracking a region of interest in video comprises:

Obtaining the motion vectors of pixels or macroblocks in the current frame, determining the moving-speed distribution parameter of the region of interest (ROI) from those motion vectors, and determining the ROI scaling parameter from the state information of the ROI in the reference frame;

Wherein the step of determining the moving-speed distribution parameter of the ROI comprises: determining, from the motion vectors of the pixels or macroblocks in the current frame, the positions in the reference frame of the reference pixels or macroblocks that correspond to them; selecting, from the motion vectors of the pixels or macroblocks in the current frame, the motion vectors of those current-frame pixels or macroblocks whose reference pixels or macroblocks lie inside the ROI of the reference frame; and determining the moving-speed distribution parameter of the ROI from the selected motion vectors;

The step of determining the ROI scaling parameter comprises: determining the corresponding pixels or macroblocks in the current frame from the selected motion vectors, and determining the ROI scaling parameter from those current-frame pixels or macroblocks and the corresponding pixels or macroblocks in the reference frame;

Using the moving-speed distribution parameter and the scaling parameter of the ROI to perform state transition on the particles sampled in the current frame, and determining the ROI position and size in the current frame from the particles after the state transition.

Optionally, the moving-speed distribution parameter of the ROI comprises:

P ({MV}^{ROI}) = Σ_{i = 0}^{| G | - 1} δ ({MV}^{i} - {MV}^{ROI}) / | G |,

where δ is the Dirac function, G is the set of motion vectors of the current-frame pixels or macroblocks whose reference pixels or macroblocks lie inside the ROI of the reference frame, {MV}^{i} is the i-th motion vector in G, and {MV}^{ROI} is the moving speed of the ROI from the previous frame to the current frame. The ROI scaling parameter ρ is computed from a and b, where a and b are affine parameters solved by the least squares method with a four-parameter transformation model, and that model is built from the pixels or macroblocks in the current frame and the corresponding pixels or macroblocks in the reference frame.
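The distribution P ({MV}^{ROI}) places equal mass 1 / | G | on every element of G, so repeated motion vectors accumulate probability and drawing from it amounts to picking one element of G uniformly at random. The sketch below illustrates this reading; the helper names `mv_distribution` and `sample_mv` are hypothetical.

```python
import numpy as np
from collections import Counter

def mv_distribution(G):
    """Return {motion vector: probability} for the empirical distribution over G."""
    counts = Counter(tuple(mv) for mv in G)
    total = len(G)
    # Each occurrence contributes 1/|G|, so duplicates add up.
    return {mv: c / total for mv, c in counts.items()}

def sample_mv(G, rng):
    """Draw one motion vector according to P(MV^ROI): a uniform pick among the elements of G."""
    return tuple(G[rng.integers(len(G))])

# Usage with four selected motion vectors, two of them identical
G = [(1, 0), (1, 0), (2, -1), (0, 0)]
P = mv_distribution(G)
mv = sample_mv(G, np.random.default_rng(0))
```

A histogram of G, such as the MV distribution histogram mentioned for Fig. 6, is the same information as the dictionary returned by `mv_distribution`.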

Optionally, the step of performing state transition on the particles sampled in the current frame comprises:

Determining the speeds v_{1}, v_{2} of the particle state transition from the moving-speed distribution parameter of the ROI, which comprises: generating a random number μ from the uniform distribution on 0 to 1; if μ < α, letting

v_{1} = s_{k - 1}^{n} {v_{x}},

v_{2} = s_{k - 1}^{n} {v_{y}},

and the flag variable

T_{k - 1}^{n} = 1;

otherwise, selecting according to the distribution P ({MV}^{ROI}) one element of the previously collected motion vector set G and taking its two components ({MV}'_{x}, {MV}'_{y}) as the values of v_{1}, v_{2}, that is, v_{1} = {MV}'_{x}, v_{2} = {MV}'_{y}, and recording

T_{k - 1}^{n} = 0.

α is the state transition parameter. Its initial value is preset, and in the subsequent particle update process it is updated as:

α = Σ_{n = 1}^{N} T_{k}^{n} / N,

where s_{k - 1}^{n} {v_{x}} is the v_{x} component of particle n in frame k-1 and s_{k - 1}^{n} {v_{y}} is the v_{y} component of particle n in frame k-1;

Determining the particle scaling parameter η from the ROI scaling parameter, which comprises: generating a random number γ from the uniform distribution on 0 to 1; if γ < β, letting η = ρ, otherwise letting η = 1, where β is a preset parameter describing the degree of change of the target size and ρ is the ROI scaling parameter;

Performing the particle state transition with the transition speeds and the particle scaling parameter. The result of the state transition of the n-th particle in frame k comprises the position of the particle

(s_{k}^{n} {x} = s_{k - 1}^{n} {x} + v_{1} + ϵ_{x}, s_{k}^{n} {y} = s_{k - 1}^{n} {y} + v_{2} + ϵ_{y}),

the moving speeds of the particle in the x and y directions, respectively

s_{k}^{n} {v_{x}} = v_{1} + ϵ_{v_{x}}

and

s_{k}^{n} {v_{y}} = v_{2} + ϵ_{v_{y}},

the semi-major axis of the ellipse corresponding to the particle at that position

s_{k}^{n} {a} = η \times s_{k - 1}^{n} {a} + ϵ_{a},

and the semi-minor axis

s_{k}^{n} {b} = η \times s_{k - 1}^{n} {b} + ϵ_{b},

where ϵ_{x}, ϵ_{y} are random variables following N (0, 0.2), ϵ_{v_{x}}, ϵ_{v_{y}} are random variables following N (0, 0.25), and ϵ_{a}, ϵ_{b} are random variables following N (0, 0.1).
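The state transition of a single particle can be sketched as follows. This is only an illustration under stated assumptions: the particle is stored as a dictionary with keys `x`, `y`, `vx`, `vy`, `a`, `b`; the second argument of N (·, ·) is treated as a standard deviation, since the text does not say whether it denotes a variance; and the names `transition_particle` and `update_alpha` are hypothetical.

```python
import numpy as np

def transition_particle(p, G, rho, alpha, beta, rng):
    """Move one particle from frame k-1 to frame k; returns (new_particle, flag T)."""
    # Speed: with probability alpha keep the particle's own previous speed (T = 1),
    # otherwise draw a motion vector from G according to P(MV^ROI) (T = 0).
    if rng.uniform(0.0, 1.0) < alpha:
        v1, v2, T = p["vx"], p["vy"], 1
    else:
        v1, v2 = G[rng.integers(len(G))]
        T = 0
    # Scale: with probability beta apply the ROI scaling parameter rho, else keep the size.
    eta = rho if rng.uniform(0.0, 1.0) < beta else 1.0
    new = {
        "x": p["x"] + v1 + rng.normal(0.0, 0.2),
        "y": p["y"] + v2 + rng.normal(0.0, 0.2),
        "vx": v1 + rng.normal(0.0, 0.25),
        "vy": v2 + rng.normal(0.0, 0.25),
        "a": eta * p["a"] + rng.normal(0.0, 0.1),
        "b": eta * p["b"] + rng.normal(0.0, 0.1),
    }
    return new, T

def update_alpha(flags):
    """alpha = sum of the flags T over the N particles, divided by N."""
    return sum(flags) / len(flags)

# Usage: one particle, a two-element motion vector set, mild zoom-in
rng = np.random.default_rng(0)
particle = {"x": 10.0, "y": 5.0, "vx": 3.0, "vy": -1.0, "a": 8.0, "b": 4.0}
moved, flag = transition_particle(particle, [(2.0, 0.0), (3.0, -1.0)], 1.1, 0.5, 0.3, rng)
```

Under this reading, α becomes the fraction of particles that reused their own previous speed, so a stream of reliable motion vectors gradually shifts the sampling toward G.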

Optionally, the step of determining the ROI of the current frame from the particles after the state transition comprises:

Computing the color histogram of the region of each particle after the state transition, and updating the particles according to the color histogram results;

Calculating the ROI position and size from the result of the particle update.
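A possible form of this step is sketched below. The text only states that a color histogram is computed per particle region and that the ROI position and size follow from the updated particles. The grayscale histogram, the Bhattacharyya coefficient used as the similarity measure, the exponential weighting with `sigma`, and the weighted mean as the final estimate are all assumptions made for illustration, as are the function names.

```python
import numpy as np

def ellipse_histogram(img, x, y, a, b, bins=8):
    """Normalised intensity histogram of the pixels inside the particle's ellipse."""
    h, w = img.shape
    yy, xx = np.mgrid[0:h, 0:w]
    mask = ((xx - x) / a) ** 2 + ((yy - y) / b) ** 2 <= 1.0
    hist, _ = np.histogram(img[mask], bins=bins, range=(0, 256))
    return hist / max(hist.sum(), 1)

def estimate_roi(img, particles, ref_hist, sigma=0.2):
    """particles: array of shape (N, 4) holding x, y, a, b for each particle.

    Returns the weighted mean state (x, y, a, b) and the normalised weights.
    """
    weights = np.empty(len(particles))
    for n, (x, y, a, b) in enumerate(particles):
        hist = ellipse_histogram(img, x, y, a, b)
        bc = np.sum(np.sqrt(hist * ref_hist))  # Bhattacharyya coefficient (assumed measure)
        weights[n] = np.exp(-(1.0 - bc) / (2.0 * sigma ** 2))
    weights /= weights.sum()
    return weights @ particles, weights

# Usage: a bright square on a dark background, one particle on target and one off target
img = np.zeros((40, 40))
img[15:25, 15:25] = 200
ref_hist = ellipse_histogram(img, 20, 20, 4, 4)
particles = np.array([[20.0, 20.0, 4.0, 4.0], [5.0, 5.0, 4.0, 4.0]])
roi, w = estimate_roi(img, particles, ref_hist)
```

The particle whose region matches the reference histogram receives nearly all of the weight, so the estimated ROI stays close to it.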

A device for tracking a region of interest in video comprises:

A moving-speed distribution parameter and scaling parameter determination module, used to obtain the motion vectors of the pixels or macroblocks in the current frame and determine the moving-speed distribution parameter of the region of interest (ROI) from those motion vectors, and also used to determine the scaling parameter from the state information of the ROI in the reference frame;

The moving-speed distribution parameter and scaling parameter determination module specifically comprises: a motion vector acquisition module, used to determine, from the motion vectors of the pixels or macroblocks in the current frame, the positions in the reference frame of the corresponding reference pixels or macroblocks, and to select, from the motion vectors of the pixels or macroblocks in the current frame, the motion vectors of those current-frame pixels or macroblocks whose reference pixels or macroblocks lie inside the ROI of the reference frame; and a parameter determination module, used to determine the moving-speed distribution parameter of the ROI from the motion vectors selected by the motion vector acquisition module, and also to determine the corresponding pixels or macroblocks in the current frame from the selected motion vectors and determine the ROI scaling parameter from those current-frame pixels or macroblocks and the corresponding pixels or macroblocks in the reference frame;

A particle state transition module, used to perform state transition on the particles sampled in the current frame, using the ROI moving-speed distribution parameter and scaling parameter determined by the moving-speed distribution parameter and scaling parameter determination module;

An ROI determination module, used to determine the ROI position and size in the current frame from the particles after the state transition performed by the particle state transition module.

Optionally, the parameter determination module comprises:

An ROI moving-speed distribution parameter estimation module, used to estimate the moving-speed distribution parameter of the ROI, where the estimated parameter comprises:

P ({MV}^{ROI}) = Σ_{i = 0}^{| G | - 1} δ ({MV}^{i} - {MV}^{ROI}) / | G |,

where {MV}^{ROI} is the moving speed of the ROI from the previous frame to the current frame;

An ROI scaling parameter estimation module, used to estimate the ROI scaling parameter ρ, which is computed from a and b, where a and b are affine parameters solved by the least squares method with a four-parameter transformation model, and that model is built from the pixels or macroblocks in the current frame and the corresponding pixels or macroblocks in the reference frame.

Optionally, the particle state transition module comprises:

A particle position and speed transition module, used to determine the speeds v_{1}, v_{2} of the particle state transition from the ROI moving-speed distribution parameter obtained by the ROI moving-speed distribution parameter estimation module, which comprises: generating a random number μ from the uniform distribution on 0 to 1; if μ < α, letting

v_{1} = s_{k - 1}^{n} {v_{x}},

v_{2} = s_{k - 1}^{n} {v_{y}},

and the flag variable

T_{k - 1}^{n} = 1;

otherwise, selecting according to the distribution P ({MV}^{ROI}) one element of the motion vector set G collected in the preceding module and taking its two components ({MV}'_{x}, {MV}'_{y}) as the values of v_{1}, v_{2}, that is, v_{1} = {MV}'_{x}, v_{2} = {MV}'_{y}, and recording

T_{k - 1}^{n} = 0,

where s_{k - 1}^{n} {v_{x}} is the v_{x} component of particle n in frame k-1 and s_{k - 1}^{n} {v_{y}} is the v_{y} component of particle n in frame k-1;

A particle size transition module, used to determine the particle scaling parameter η from the ROI scaling parameter obtained by the ROI scaling parameter estimation module, which comprises: generating a random number γ from the uniform distribution on 0 to 1; if γ < β, letting η = ρ, otherwise letting η = 1, where β is a preset parameter describing the degree of change of the target size and ρ is the ROI scaling parameter;

A particle state transition result determination module, used to perform the particle state transition with the transition speeds determined by the particle position and speed transition module and the particle scaling parameter determined by the particle size transition module. The result of the state transition of the n-th particle in frame k comprises the position of the particle

(s_{k}^{n} {x} = s_{k - 1}^{n} {x} + v_{1} + ϵ_{x}, s_{k}^{n} {y} = s_{k - 1}^{n} {y} + v_{2} + ϵ_{y}),

the moving speeds of the particle in the x and y directions, respectively

s_{k}^{n} {v_{x}} = v_{1} + ϵ_{v_{x}}

and

s_{k}^{n} {v_{y}} = v_{2} + ϵ_{v_{y}},

the semi-major axis of the ellipse corresponding to the particle at that position

s_{k}^{n} {a} = η \times s_{k - 1}^{n} {a} + ϵ_{a},

and the semi-minor axis

s_{k}^{n} {b} = η \times s_{k - 1}^{n} {b} + ϵ_{b},

where ϵ_{x}, ϵ_{y} are random variables following N (0, 0.2), ϵ_{v_{x}}, ϵ_{v_{y}} are random variables following N (0, 0.25), and ϵ_{a}, ϵ_{b} are random variables following N (0, 0.1).

Optionally, the ROI determination module comprises:

A particle region color histogram module, used to compute the color histogram of the region of each particle after the state transition;

A particle update module, used to update the particles according to the color histogram results of the particle regions;

An ROI position and size calculation module, used to calculate the ROI position and size from the result of the particle update.

As the technical solution above shows, the ROI tracking technique provided by the embodiments of the present invention uses motion vector information that exists in the compressed bitstream or is generated during encoding to guide the particle state transition. It can therefore reduce the number of particles required during tracking while preserving the tracking effect, which lowers the complexity of the tracking process, and it can obtain a better tracking effect.

Description of drawings

To explain the technical solutions of the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. Obviously, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.

Fig. 1 is a schematic diagram of the processing procedure of the method provided by an embodiment of the present invention;

Fig. 2 is the first structural schematic diagram of the device provided by an embodiment of the present invention;

Fig. 3 is the second structural schematic diagram of the device provided by an embodiment of the present invention;

Fig. 4 is the third structural schematic diagram of the device provided by an embodiment of the present invention;

Fig. 5 is the fourth structural schematic diagram of the device provided by an embodiment of the present invention;

Fig. 6 is the MV distribution histogram provided by an embodiment of the present invention;

Fig. 7 is a schematic diagram of the particle state transition provided by an embodiment of the present invention;

Fig. 8 is the fifth structural schematic diagram of the device provided by an embodiment of the present invention;

Fig. 9 is the color histogram of a particle region provided by an embodiment of the present invention;

Fig. 10 is a schematic diagram of the particle update process provided by an embodiment of the present invention;

Fig. 11 is the first schematic diagram of the effect of an embodiment of the present invention;

Fig. 12 is the second schematic diagram of the effect of an embodiment of the present invention;

Fig. 13 is the first schematic diagram of an application environment of an embodiment of the present invention;

Fig. 14 is the second schematic diagram of an application environment of an embodiment of the present invention.

Detailed description of the embodiments

The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. Obviously, the described embodiments are only some of the embodiments of the present invention, not all of them. All other embodiments that a person of ordinary skill in the art obtains from the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

The embodiments of the present invention are described in further detail below with reference to the drawings.

An embodiment of the present invention provides a method for tracking a region of interest in video. As shown in Fig. 1, its implementation can comprise the following steps:

Step 11: obtain the motion vectors of the pixels (or macroblocks) in the current frame, determine the moving-speed distribution parameter of the ROI from those motion vectors, and determine the ROI scaling parameter from the state information of the ROI in the reference frame;

Here, the frame whose ROI position is to be determined is called the current frame, and the frame that the current frame references in the coding structure is called the reference frame; the ROI position in the reference frame is known;

Specifically, in this step, the process of determining the moving-speed distribution parameter of the ROI can comprise, without being limited to, the following:

First, determine, from the motion vectors of the pixels or macroblocks in the current frame, the positions in the reference frame of the reference pixels or macroblocks that correspond to them. Next, among the motion vectors of the pixels or macroblocks in the current frame, pick out the reference pixels or macroblocks whose positions lie inside the ROI of the reference frame, and take the motion vectors of the current-frame pixels or macroblocks that correspond to them. Finally, determine the moving-speed distribution parameter of the ROI from the motion vectors selected in this way;
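The selection of motion vectors can be sketched as follows. The sketch assumes an elliptical ROI given as (cx, cy, a, b) in the reference frame and assumes that a motion vector is the displacement from the reference frame to the current frame, which matches the way the speeds are added to the particle position in the state transition; under the opposite sign convention the subtraction becomes an addition. The function name and array layout are hypothetical.

```python
import numpy as np

def select_roi_motion_vectors(block_pos, mvs, roi):
    """Keep the motion vectors whose reference position lies inside the reference-frame ROI.

    block_pos: (M, 2) array of block (or pixel) centres (x, y) in the current frame
    mvs:       (M, 2) array of motion vectors (dx, dy), reference -> current (assumed)
    roi:       (cx, cy, a, b) ellipse centre and semi-axes in the reference frame
    Returns the selected motion vectors G, their current positions and reference positions.
    """
    cx, cy, a, b = roi
    # Position of each block in the reference frame under the assumed sign convention
    ref_pos = block_pos - mvs
    inside = ((ref_pos[:, 0] - cx) / a) ** 2 + ((ref_pos[:, 1] - cy) / b) ** 2 <= 1.0
    return mvs[inside], block_pos[inside], ref_pos[inside]

# Usage: three blocks, two of which come from inside the ROI
block_pos = np.array([[10.0, 10.0], [30.0, 30.0], [12.0, 10.0]])
mvs = np.array([[2.0, 0.0], [0.0, 0.0], [2.0, 0.0]])
G, cur_pts, ref_pts = select_roi_motion_vectors(block_pos, mvs, (9.0, 10.0, 3.0, 3.0))
```

The returned position pairs are the point correspondences from which the ROI scaling parameter is determined in the next part of the step.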

The step of determining the ROI scaling parameter from the state information of the ROI in the reference frame can comprise, without being limited to, the following:

First, determine the corresponding pixels or macroblocks in the current frame from the motion vectors selected above. Then determine the ROI scaling parameter from those current-frame pixels or macroblocks and the corresponding pixels or macroblocks in the reference frame.

The moving-speed distribution parameter and the scaling parameter of the ROI can be calculated as follows, without being limited to this:

The moving-speed distribution parameter of the ROI is

P ({MV}^{ROI}) = Σ_{i = 0}^{| G | - 1} δ ({MV}^{i} - {MV}^{ROI}) / | G |,

where {MV}^{ROI} is the moving speed of the ROI from the previous frame (that is, the reference frame) to the current frame;

The ROI scaling parameter ρ is computed from a and b, where a and b are affine parameters solved by the least squares method with a four-parameter transformation model, and that model is built from the pixels or macroblocks in the current frame and the corresponding pixels or macroblocks in the reference frame.
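A least-squares fit of a four-parameter transformation model can be sketched as follows. The model form x' = a x - b y + c, y' = b x + a y + d is the common four-parameter (similarity) model and is assumed here, since the text does not spell it out. The scale factor sqrt(a^2 + b^2) returned by `scale_from_affine` is the usual scale of that model and is offered only as a plausible candidate for ρ; the expression actually used for ρ is not reproduced in the text. Both function names are hypothetical.

```python
import numpy as np

def fit_four_parameter(ref_pts, cur_pts):
    """Least-squares estimate of (a, b, c, d) mapping reference points to current points.

    Assumed model: x' = a*x - b*y + c,  y' = b*x + a*y + d
    """
    ref_pts = np.asarray(ref_pts, dtype=float)
    cur_pts = np.asarray(cur_pts, dtype=float)
    x, y = ref_pts[:, 0], ref_pts[:, 1]
    ones, zeros = np.ones_like(x), np.zeros_like(x)
    # One row per equation; the unknown vector is [a, b, c, d]
    A = np.vstack([
        np.column_stack([x, -y, ones, zeros]),   # rows for x'
        np.column_stack([y, x, zeros, ones]),    # rows for y'
    ])
    rhs = np.concatenate([cur_pts[:, 0], cur_pts[:, 1]])
    (a, b, c, d), *_ = np.linalg.lstsq(A, rhs, rcond=None)
    return a, b, c, d

def scale_from_affine(a, b):
    """Scale factor of the assumed model (a candidate for rho, not confirmed by the text)."""
    return float(np.hypot(a, b))

# Usage: points enlarged by 1.5 and shifted by (2, 3)
ref = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
cur = 1.5 * ref + np.array([2.0, 3.0])
a, b, c, d = fit_four_parameter(ref, cur)
rho = scale_from_affine(a, b)
```

Using all selected correspondences in one least-squares solve makes the estimate tolerant of a few noisy motion vectors, which is one reason to prefer it over a two-point solution.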

Step 12: use the determined ROI moving-speed distribution parameter and ROI scaling parameter to perform state transition on the particles sampled in the current frame;

This step can be implemented as follows:

First, determine the speeds v_{1}, v_{2} of the particle state transition from the moving-speed distribution parameter of the ROI, which comprises: generating a random number μ from the uniform distribution on 0 to 1; if μ < α, letting

v_{1} = s_{k - 1}^{n} {v_{x}},

v_{2} = s_{k - 1}^{n} {v_{y}},

and the flag variable

T_{k - 1}^{n} = 1;

otherwise, selecting according to the distribution P ({MV}^{ROI}) one element of the motion vector set G and taking its two components ({MV}'_{x}, {MV}'_{y}) as the values of v_{1}, v_{2}, and recording T_{k - 1}^{n} = 0,

where s_{k - 1}^{n} {v_{x}} is the v_{x} component of particle n in frame k-1, s_{k - 1}^{n} {v_{y}} is the v_{y} component of particle n in frame k-1, the current frame is frame k and the reference frame is frame k-1;

Next, determine the particle scaling parameter η from the ROI scaling parameter, which comprises: generating a random number γ from the uniform distribution on 0 to 1; if γ < β, letting η = ρ, otherwise letting η = 1, where β is a preset parameter describing the degree of change of the target size and ρ is the ROI scaling parameter;

Finally, perform the particle state transition with the transition speeds and the particle scaling parameter. The result of the state transition of the n-th particle in frame k comprises the position of the particle

(s_{k}^{n} {x} = s_{k - 1}^{n} {x} + v_{1} + ϵ_{x}, s_{k}^{n} {y} = s_{k - 1}^{n} {y} + v_{2} + ϵ_{y}),

the moving speeds of the particle in the x and y directions, respectively

s_{k}^{n} {v_{x}} = v_{1} + ϵ_{v_{x}}

and

s_{k}^{n} {v_{y}} = v_{2} + ϵ_{v_{y}},

the semi-major axis of the ellipse corresponding to the particle at that position

s_{k}^{n} {a} = η \times s_{k - 1}^{n} {a} + ϵ_{a},

and the semi-minor axis

s_{k}^{n} {b} = η \times s_{k - 1}^{n} {b} + ϵ_{b},

where ϵ_{x}, ϵ_{y} are random variables following N (0, 0.2), ϵ_{v_{x}}, ϵ_{v_{y}} are random variables following N (0, 0.25), and ϵ_{a}, ϵ_{b} are random variables following N (0, 0.1).

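Step 13: determine the ROI position and size in the current frame from the particles after the state transition, thereby tracking the ROI in the video;

Specifically, this step can comprise: computing the color histogram of the region of each particle after the state transition, updating the particles according to the color histogram results, and calculating the ROI position and size from the result of the particle update.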

The embodiment of the present invention mainly uses additional information that is present in the bitstream or generated during encoding to guide the particle state transition, which preserves the tracking effect for the ROI while reducing the processing complexity of tracking. Specifically, the invention uses MV information that exists in the compressed bitstream or is generated during encoding to guide the particle state transition in the particle filter algorithm. It can therefore reduce the number of particles required while preserving the tracking effect, which lowers the complexity of the tracking process; alternatively, with the same number of particles it can obtain a more stable tracking effect. The invention also relies on the stability of particle filtering against noise to make the tracking algorithm robust, which further secures the tracking effect of the technical solution provided by the embodiment.

An embodiment of the present invention also provides a device for tracking a region of interest in video. As shown in Fig. 2, its structure can comprise:

(1) Moving-speed distribution parameter and scaling parameter determination module 21, used to obtain the motion vectors of the pixels or macroblocks in the current frame, determine the moving-speed distribution parameter of the region of interest (ROI) from those motion vectors, and determine the ROI scaling parameter from the state information of the ROI in the reference frame;

Further, as shown in Fig. 3, the moving-speed distribution parameter and scaling parameter determination module 21 can specifically comprise:

Motion vector acquisition module 211, used to determine, from the motion vectors of the pixels or macroblocks in the current frame, the positions in the reference frame of the corresponding reference pixels or macroblocks, to pick out the reference pixels or macroblocks whose positions lie inside the ROI of the reference frame, and then to obtain the motion vectors of the current-frame pixels or macroblocks that correspond to them;

Parameter determination module 212, used to determine the moving-speed distribution parameter of the ROI from the motion vectors selected by motion vector acquisition module 211, and also to determine the corresponding pixels or macroblocks in the current frame from the selected motion vectors and determine the ROI scaling parameter from those current-frame pixels or macroblocks and the corresponding pixels or macroblocks in the reference frame;

The parameter determination module 212 can specifically comprise an ROI moving-speed distribution parameter estimation module 2121 and an ROI scaling parameter estimation module 2122, wherein:

The ROI moving-speed distribution parameter estimation module 2121 is used to estimate the moving-speed distribution parameter of the ROI, and the estimated parameter comprises:

P ({MV}^{ROI}) = Σ_{i = 0}^{| G | - 1} δ ({MV}^{i} - {MV}^{ROI}) / | G |,

The ROI scaling parameter estimation module 2122 is used to estimate the ROI scaling parameter ρ from the affine parameters a and b, which are solved by the least squares method with a four-parameter transformation model;

(2) A particle state transition module 22, used to perform state transition processing on the particles sampled in the current frame, using the ROI movement speed distribution parameter and the scaling parameter determined by the movement speed distribution parameter and scaling parameter determination module 21;

Further, with reference to Figure 8, the particle state transition module may comprise:

A particle position and velocity transition module 221, used to determine the velocities v_{1}, v_{2} for the particle state transition according to the ROI movement speed distribution parameter obtained by the ROI movement speed distribution parameter estimation module 2121, which comprises: generating a random number μ from the uniform distribution on 0 to 1; if μ < α, setting

v_{1} = s_{k - 1}^{n} {v_{x}},

v_{2} = s_{k - 1}^{n} {v_{y}},

and the flag variable

T_{k - 1}^{n} = 1,

where s_{k - 1}^{n} {v_{x}} is the v_x component of particle n in frame k-1, and s_{k - 1}^{n} {v_{y}} is the v_y component of particle n in frame k-1;

A particle size transition module 222, used to determine the particle scaling parameter η according to the ROI scaling parameter obtained by the ROI scaling parameter estimation module 2122, which may specifically comprise: generating a random number γ from the uniform distribution on 0 to 1; if γ < β, setting η = ρ, otherwise setting η = 1, where β is the target size change degree parameter, whose value is preset, and ρ is the ROI scaling parameter;

A particle state transition result determination module 223, used to perform the particle state transition according to the velocities determined by the particle position and velocity transition module 221 and the particle scaling parameter determined by the particle size transition module 222. The state transition result obtained for the n-th particle in frame k comprises: the position of the particle

(s_{k}^{n} {x} = s_{k - 1}^{n} {x} + v_{1} + ϵ_{x}, s_{k}^{n} {y} = s_{k - 1}^{n} {y} + v_{2} + ϵ_{y}),

the movement speeds of the particle in the x and y directions, respectively

s_{k}^{n} {v_{x}} = v_{1} + ϵ_{v_{x}}

and

s_{k}^{n} {v_{y}} = v_{2} + ϵ_{v_{y}},

the semi-major axis

s_{k}^{n} {a} = η \times s_{k - 1}^{n} {a} + ϵ_{a},

and the semi-minor axis

s_{k}^{n} {b} = η \times s_{k - 1}^{n} {b} + ϵ_{b},

where ϵ_{x}, ϵ_{y} are random variables obeying the N (0, 0.2) distribution, ϵ_{v_{x}}, ϵ_{v_{y}} are random variables obeying N (0, 0.25), and ϵ_{a}, ϵ_{b} are random variables obeying N (0, 0.1);

(3) An ROI determination module 23, used to determine the ROI position and size in the current frame according to the particles on which the particle state transition module 22 has performed state transition;

Specifically, as shown in Figure 4, the ROI determination module may comprise:

A per-particle region color histogram statistics module 231, used to compute the color histogram of the region of each particle after state transition processing;

A particle update module 232, used to update the particles according to the color histograms computed by the per-particle region color histogram statistics module 231;

An ROI position and size calculation module 233, used to calculate the ROI position and size according to the result of the particle update performed by the particle update module 232.

Likewise, the above device uses the MV information that is present in the compressed bitstream or produced during encoding to guide the particle state transition process. It can therefore reduce the number of particles required while maintaining the tracking quality, or obtain more stable tracking when the same number of particles is used.

In other words, to improve the accuracy of ROI tracking and reduce its complexity, the embodiment of the present invention uses the MV information produced during encoding or already present in the bitstream to guide the state transition of the particles in the particle filter algorithm, and thereby obtains better tracking results.

To make the embodiment of the present invention easier to understand, it is explained in detail below with reference to the accompanying drawings and a concrete application process.

In the embodiment of the present invention, the ROI is tracked on the basis of particle filtering theory, using the MV information produced during encoding or already present in the bitstream. The ROI is obtained frame by frame, and data such as the position and size of the ROI in the current frame are derived from the information about the ROI in its reference frame. The tracking process may mainly comprise:

First, the reference frame of the current frame is fetched from the reference frame buffer. According to the state information of the ROI in the reference frame and the MVs of the pixels (or macroblocks), the ROI movement speed distribution parameter and scaling parameter estimation module estimates the velocity distribution and the scaling parameter with which the ROI moves to the current frame;

Second, the particle state transition module estimates the distribution of the ROI position and size information in the current frame;

Third, using the color histogram of each particle region computed by the per-particle region color histogram statistics module, the particle update module further corrects the ROI information distribution obtained by the particle state transition module. The position and size of the ROI in the current frame are then calculated and output, and stored together with the current frame in the reference frame buffer so that the ROI information in subsequent frames can be estimated.

Specifically, as shown in Figure 4, the device provided by the embodiment of the present invention may comprise:

(1) Reference frame acquisition module

This module obtains the reference frame of the current frame in the same way as conventional encoding methods, and is not shown in the figure.

(2) ROI movement speed distribution parameter and scaling parameter estimation module

This module is the movement speed distribution parameter and scaling parameter determination module. It is used to obtain the estimate of the movement speed distribution parameter and the estimate of the scaling parameter of the ROI region. As shown in Figure 5, it may further comprise an MV acquisition module, an ROI movement speed distribution parameter estimation module and an ROI scaling parameter estimation module, whose functions are described in turn below:

(21) MV acquisition module, i.e. the motion vector acquisition module, used to collect the MVs of the pixels whose reference pixels lie in the ROI region of the reference frame (or macroblocks; every occurrence of "pixel" in the description below may be replaced by "macroblock"). Each MV is a two-dimensional vector (MV_x, MV_y), i.e. the components of the pixel's MV in the x and y directions;

Specifically, as is readily known from the H.264/SVC coding standard, the position of a pixel's reference pixel in the reference frame can be obtained from the MV of that pixel in the current frame. Therefore, if the reference pixel of a pixel in the current frame lies in the ROI region of the reference frame, the MV of that current-frame pixel is obtained. The current-frame pixels whose reference pixels lie in the ROI region of the reference frame are denoted as the set G; the number of elements in G is |G|, the i-th element is g_i, and the MV of g_i is denoted MV^{i}, i = 0, 1, ..., |G| - 1. Further, the set of the MVs of all elements of G is denoted M;

For example, as shown in Figure 6, the reference pixels associated with MV1, MV2 and MV3 lie in the ROI region of the reference frame, so the current-frame pixels corresponding to MV1, MV2 and MV3 are counted, i.e. they are recorded in the set G. The reference pixel associated with MV0 lies outside the ROI of the reference frame, so the current-frame pixel corresponding to MV0 is not counted.

(22) ROI movement speed distribution parameter estimation module, used to compute the distribution histogram of (-MV) over the current-frame pixels whose reference pixels lie in the ROI region of the reference frame, where -MV is obtained by negating both components of MV; that is, the -MV distribution histogram of the pixels in the set G obtained by the MV acquisition module. The histogram is normalized with the L1 norm and used as the estimate of the probability distribution of the movement speed of the ROI region from the reference frame to the current frame. It can be denoted as the movement speed distribution parameter P (MV^{ROI}), that is:

P ({MV}^{ROI}) = Σ_{i = 0}^{| G | - 1} δ ({MV}^{i} - {MV}^{ROI}) / | G |,

where δ is the Dirac function, G is the set of motion vectors of the current-frame pixels or macroblocks whose reference pixels lie in the ROI of the reference frame, MV^{i} is the i-th motion vector in G, and MV^{ROI} is the movement speed of the ROI from the previous frame (i.e. the reference frame) to the current frame.

Figure 6 gives just one example of computing P (MV^{ROI}), in which '+' represents a pixel. The reference pixel of the pixel at the tail of the dashed arrow does not lie in the ROI region of the reference frame, so its motion vector MV^{0} is not counted; the other motion vectors, such as MV^{1} and MV^{2}, are all counted.
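The selection of the set G and the L1-normalized histogram can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function name, the input layout and the `in_ref_roi` predicate are hypothetical, and the histogram is taken over -MV as the text above describes.

```python
from collections import Counter

def roi_velocity_distribution(pixels, in_ref_roi):
    """Estimate P(MV^ROI) as an L1-normalized histogram over the set G.

    pixels     : iterable of ((x, y), (mv_x, mv_y)) for current-frame pixels
                 or macroblocks (hypothetical input layout)
    in_ref_roi : hypothetical predicate, True if the reference-frame point
                 (u, v) lies inside the reference-frame ROI
    """
    counts = Counter()
    for (x, y), (mv_x, mv_y) in pixels:
        # Position of the reference pixel: (u, v) = (x + MV_x, y + MV_y).
        u, v = x + mv_x, y + mv_y
        if in_ref_roi(u, v):
            # The pixel belongs to G; count the negated motion vector.
            counts[(-mv_x, -mv_y)] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    # L1 normalization: the values sum to 1.
    return {vel: c / total for vel, c in counts.items()}
```

The returned dictionary maps each observed velocity to its relative frequency, so it can be sampled directly during the particle state transition.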

(23) ROI scaling parameter estimation module, used to estimate the scaling parameter of the ROI region. This scaling parameter indicates the scaling relationship between the ROI region in the reference frame and the ROI region in the current frame. Specifically, the scaling parameter ρ can be obtained by means of an affine transformation, as a function of the affine transformation parameters a and b;

Specifically, the ROI scaling parameter may be obtained as follows:

Let the coordinates of the i-th element g_i of G be (x_i, y_i); the coordinates of its corresponding reference pixel are then

(u_{i}, v_{i}) = (x_{i} + {MV}_{x}^{i}, y_{i} + {MV}_{y}^{i});

A special case of the affine transformation, namely the four-parameter transformation model, is adopted as the motion transformation model of the target, that is:

\begin{bmatrix} x_{i} \\ y_{i} \end{bmatrix} = \begin{bmatrix} a & - b \\ b & a \end{bmatrix} \times \begin{bmatrix} u_{i} \\ v_{i} \end{bmatrix} + \begin{bmatrix} c \\ d \end{bmatrix},

where a, b, c, d are the affine transformation parameters;

The least squares criterion is applied to all pixels in the set G to estimate a single four-parameter transformation model for the ROI. The following matrices A, B, C can be constructed:

A = \begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix},

B = \begin{bmatrix} \vdots & \vdots & \vdots & \vdots \\ u_{i} & - v_{i} & 1 & 0 \\ v_{i} & u_{i} & 0 & 1 \\ \vdots & \vdots & \vdots & \vdots \end{bmatrix},

C = \begin{bmatrix} \vdots \\ x_{i} \\ y_{i} \\ \vdots \end{bmatrix};

The motion transformation model of the target can then be written as C = B \times A, and the least squares criterion gives A = (B^{T} B)^{-1} B^{T} C, where B^{T} is the transpose of B. The affine transformation parameters a and b are the a and b components of A. Since B and C are known, the values of a and b can be obtained, and the value ρ of the scaling parameter can then be calculated.
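The least squares step can be sketched with NumPy as below. The function name and argument layout are hypothetical. The closing formula for ρ is an assumption: the sketch takes ρ to be the scale factor \sqrt{a^2 + b^2} of the matrix [a, -b; b, a], because the expression for ρ is not legible in this text. `numpy.linalg.lstsq` is used instead of forming (B^{T} B)^{-1} B^{T} C explicitly; the two agree when B has full column rank.

```python
import numpy as np

def estimate_four_param_model(cur_xy, ref_uv):
    """Fit x = a*u - b*v + c, y = b*u + a*v + d by least squares.

    cur_xy : (K, 2) current-frame coordinates (x_i, y_i) of the elements of G
    ref_uv : (K, 2) reference-frame coordinates (u_i, v_i)
    Returns ((a, b, c, d), rho).
    """
    cur_xy = np.asarray(cur_xy, dtype=float)
    ref_uv = np.asarray(ref_uv, dtype=float)
    u, v = ref_uv[:, 0], ref_uv[:, 1]
    ones, zeros = np.ones_like(u), np.zeros_like(u)

    # Two rows of B per pixel: [u, -v, 1, 0] and [v, u, 0, 1].
    B = np.empty((2 * len(u), 4))
    B[0::2] = np.stack([u, -v, ones, zeros], axis=1)
    B[1::2] = np.stack([v, u, zeros, ones], axis=1)

    # C interleaves the targets as [x_0, y_0, x_1, y_1, ...].
    C = cur_xy.reshape(-1)

    A, *_ = np.linalg.lstsq(B, C, rcond=None)
    a, b, c, d = (float(t) for t in A)

    # Assumption: rho is the scale factor of the similarity transform.
    rho = float(np.hypot(a, b))
    return (a, b, c, d), rho
```

At least two distinct point pairs are needed for the four parameters to be determined.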

(3) Particle transition module

This module is the particle state transition module. It uses the estimate P (MV^{ROI}) of the probability distribution of the movement speed of the ROI region from the reference frame to the current frame, and the estimate ρ of the scaling parameter, both obtained by the ROI movement speed distribution parameter and scaling parameter estimation module, to guide the state transition of the particles and thus achieve better tracking results.

Specifically, the purpose of this module is to make a preliminary estimate of the ROI distribution in the current frame. As shown in Figure 7, each circle represents a particle (containing information such as the position and size of the ROI). The particle set in the reference frame represents the distribution of the ROI state information in the reference frame. After particle state transition, it yields the particle set in the current frame, which represents the distribution of the ROI state information in the current frame;

During particle state transition, the information this module needs comprises: the particle states representing the ROI distribution in the reference frame, the state transition parameter α, and P (MV^{ROI}) and ρ. The initial value of α may be, but is not limited to, 0.5, and the value of α may be updated automatically in subsequent processing.

The same processing is performed for every particle, so only the state transition of the n-th particle is described here as an example. As shown in Figure 8, the state transition of the n-th particle may comprise:

First, determine the velocities for the particle state transition, which may specifically comprise: generating a random number μ from the uniform distribution on 0 to 1; if μ < α, setting v_{1} = s_{k - 1}^{n} {v_{x}} and v_{2} = s_{k - 1}^{n} {v_{y}} (s_{k - 1}^{n} {v_{x}} is the v_x component of particle n in frame k-1; similar expressions below have analogous meanings), and setting the flag variable to 1;

Otherwise, an element is drawn from the previously obtained motion vector set M according to P (MV^{ROI}), the estimated probability distribution of the movement speed of the ROI region from the reference frame to the current frame, and its two components (MV'_x, MV'_y) are used as the values of v_{1} and v_{2}, i.e. v_{1} = MV'_x, v_{2} = MV'_y, and the flag variable is recorded accordingly;

where v_{1} and v_{2} are temporary variables;

Second, determine the particle scaling parameter, which may specifically comprise: generating a random number γ from the uniform distribution on 0 to 1; if γ < β, setting η = ρ, otherwise setting η = 1, where η is a temporary variable and β is the target size change degree parameter. β is a constant parameter that controls how strongly the target size may change: if the target size changes quickly, β is larger, otherwise β is smaller. In general the target size does not change sharply between two adjacent frames, so β may be set to, but is not limited to, 0.2;

Third, perform the particle state transition according to the temporary variables v_{1}, v_{2} and η determined above. Specifically, the state transition result s_{k}^{n} of the n-th particle in frame k is calculated as follows:

The position of the particle on the x axis

s_{k}^{n} {x} = s_{k - 1}^{n} {x} + v_{1} + ϵ_{x},

The position of the particle on the y axis

s_{k}^{n} {y} = s_{k - 1}^{n} {y} + v_{2} + ϵ_{y},

The movement speed of the particle along the x axis

s_{k}^{n} {v_{x}} = v_{1} + ϵ_{v_{x}},

The movement speed of the particle along the y axis

s_{k}^{n} {v_{y}} = v_{2} + ϵ_{v_{y}},

The semi-major axis of the ellipse corresponding to the particle at that position

s_{k}^{n} {a} = η \times s_{k - 1}^{n} {a} + ϵ_{a},

The semi-minor axis of the ellipse corresponding to the particle at that position

s_{k}^{n} {b} = η \times s_{k - 1}^{n} {b} + ϵ_{b},

where ϵ_{x}, ϵ_{y} are random variables obeying the N (0, 0.2) distribution, ϵ_{v_{x}}, ϵ_{v_{y}} are random variables obeying N (0, 0.25), and ϵ_{a}, ϵ_{b} are random variables obeying N (0, 0.1). Here s_{k}^{n} represents the n-th particle in frame k, and its meaning is: in frame k, the ROI appears with probability π_{k}^{n} at (x, y), with semi-major and semi-minor axes a and b respectively, and moves to the next frame with velocity (v_x, v_y);
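The three steps for one particle can be sketched as follows. The function and argument names are hypothetical. Two points are assumptions rather than statements of the source: the second argument of each N (0, ·) is treated as a standard deviation, and the flag is set to 0 when the velocity is drawn from the distribution (the flag value for that branch is not legible in this text).

```python
import numpy as np

def transfer_particle(prev, alpha, beta, rho, velocities, probs, rng):
    """State transition of one particle from frame k-1 to frame k.

    prev       : dict with keys x, y, vx, vy, a, b (particle n in frame k-1)
    velocities : list of (v_x, v_y) candidates, the support of P(MV^ROI)
    probs      : probabilities of those candidates (must sum to 1)
    rng        : numpy.random.Generator
    """
    # Step 1: choose the velocity (v1, v2).
    if rng.uniform(0.0, 1.0) < alpha:
        v1, v2 = prev["vx"], prev["vy"]    # keep the particle's own velocity
        flag = 1
    else:
        idx = rng.choice(len(velocities), p=probs)
        v1, v2 = velocities[idx]           # velocity guided by the MV statistics
        flag = 0                           # assumed value for this branch

    # Step 2: choose the scaling parameter eta.
    eta = rho if rng.uniform(0.0, 1.0) < beta else 1.0

    # Step 3: apply the transition with additive Gaussian noise
    # (the second argument of rng.normal is a standard deviation: an assumption).
    new = {
        "x": prev["x"] + v1 + rng.normal(0.0, 0.2),
        "y": prev["y"] + v2 + rng.normal(0.0, 0.2),
        "vx": v1 + rng.normal(0.0, 0.25),
        "vy": v2 + rng.normal(0.0, 0.25),
        "a": eta * prev["a"] + rng.normal(0.0, 0.1),
        "b": eta * prev["b"] + rng.normal(0.0, 0.1),
    }
    return new, flag
```

A generator created with `np.random.default_rng(seed)` makes the sketch reproducible; the same function would be applied to every particle of the reference-frame particle set.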

(4) Per-particle region color histogram statistics module

This module calculates the observation model of particle filtering theory. The same operation is performed for every particle; the calculation for particle n is described below as an example.

For particle n in the current frame, the state is s_{k}^{n} = {x, v_{x}, y, v_{y}, a, b}. It corresponds, on the image of the current frame, to the ellipse centered at (x, y) with semi-axes a and b, denoted R_{k}^{n}. As shown in Figure 9, this module computes the color histogram of the elliptical region in the figure and normalizes it with the L1 norm. The result can be denoted P ({sb}_{k}^{n}) and is calculated as follows:

P ({sb}_{k}^{n} = b) = Σ_{t = 0}^{| R_{k}^{n} | - 1} δ (b_{Y}^{t} \times N_{U} \times N_{V} + b_{U}^{t} \times N_{V} + b_{V}^{t} - b) / | R_{k}^{n} |;

where the video input is assumed to be in the YUV color space; N_Y, N_U, N_V are the numbers of bins into which each color channel is divided (obtained by quantizing the YUV space; for example, if every dimension of the YUV space is divided into 10 parts, N_Y, N_U, N_V all equal 10);

b_{Y}^{t}, b_{U}^{t}, b_{V}^{t} are the indices of the bins into which the t-th pixel falling in R_{k}^{n} falls on each color channel (the bin indices of each channel increase from 0 up to 9 in this example);

| R_{k}^{n} | denotes the number of pixels in the ellipse R_{k}^{n}.

The observation model of particle filtering theory is calculated in this module in the same way as in the prior art, except that the input of this module is the state transition result of the particles obtained according to the method of the present invention

s_{k}^{n} = {x, v_{x}, y, v_{y}, a, b} .
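The histogram of one particle region can be sketched as below. The function name is hypothetical, and the sketch assumes 8-bit YUV samples stored as an (H, W, 3) array, uniform quantization of each channel, and an axis-aligned ellipse whose semi-axis a lies along x and b along y; none of these details is fixed by the text above.

```python
import numpy as np

def ellipse_color_histogram(yuv, x, y, a, b, n_y=10, n_u=10, n_v=10):
    """L1-normalized joint YUV histogram of the pixels inside an ellipse.

    yuv : (H, W, 3) uint8 array holding the Y, U, V planes (assumed layout)
    Returns a float array of length n_y * n_u * n_v.
    """
    h, w, _ = yuv.shape
    rows, cols = np.mgrid[0:h, 0:w]
    # Pixels of the region R: inside the ellipse centered at (x, y).
    mask = ((cols - x) / a) ** 2 + ((rows - y) / b) ** 2 <= 1.0
    pix = yuv[mask].astype(np.int64)            # shape (|R|, 3)

    size = n_y * n_u * n_v
    if len(pix) == 0:
        return np.zeros(size)

    # Per-channel bin indices in 0 .. n-1 (uniform quantization of 0 .. 255).
    b_y = pix[:, 0] * n_y // 256
    b_u = pix[:, 1] * n_u // 256
    b_v = pix[:, 2] * n_v // 256

    # Joint bin index: b_Y * N_U * N_V + b_U * N_V + b_V.
    idx = b_y * n_u * n_v + b_u * n_v + b_v
    hist = np.bincount(idx, minlength=size).astype(float)
    return hist / len(pix)                      # divide by |R|
```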

(5) Particle update module

This module performs the particle update. Specifically, for the particles that represent the ROI distribution in the current frame after state transition (i.e. the output of the particle transition module), it calculates their weights, obtains the estimate of the ROI distribution in the current frame through resampling, and updates the state transition parameter α.

Specifically, as shown in Figure 10, the particle update may comprise:

First, update the weight of each particle

π_{k}^{n} = π_{k - 1}^{n} \times \exp \{ - ( 1 - Σ_{j = 1}^{N} \sqrt{P ({sb}_{0}) [j] \times P ({sb}_{k}^{n}) [j]} )^{1 / 2} / σ \},

where π_{k}^{n} represents the weight of the n-th particle in frame k (the probability that the region represented by the particle is the ROI region); N here is the number of bins of the color histogram used, N = N_Y N_U N_V; P ({sb}_{k}^{n}) [j] is the value of the j-th bin of the color histogram of particle n computed in the previous module (the per-particle region color histogram statistics module); P ({sb}_{0}) is the color histogram of the initialized ROI region; and the parameter σ is a constant, which may be set to, but is not limited to, 0.25. The particle weights are then normalized (in this sum, N is the number of particles):

π_{k}^{n} = π_{k}^{n} / Σ_{n = 1}^{N} π_{k}^{n} .
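The weight update can be sketched as follows, under the assumption that the exponent is the square root of one minus the Bhattacharyya coefficient of the two histograms, divided by σ. That reading of the formula is a reconstruction, and the function and argument names are hypothetical.

```python
import numpy as np

def update_weights(prev_weights, particle_hists, ref_hist, sigma=0.25):
    """Update and normalize particle weights from color-histogram similarity.

    prev_weights   : (N,) weights of the particles in frame k-1
    particle_hists : (N, bins) L1-normalized histograms of the particles
    ref_hist       : (bins,) L1-normalized histogram of the initial ROI
    """
    prev_weights = np.asarray(prev_weights, dtype=float)
    particle_hists = np.asarray(particle_hists, dtype=float)
    ref_hist = np.asarray(ref_hist, dtype=float)

    # Bhattacharyya coefficient: sum over bins of sqrt(p0[j] * pk[j]).
    coeff = np.sqrt(particle_hists * ref_hist).sum(axis=1)
    # Distance (1 - coeff)^(1/2); clip guards against tiny negative values.
    dist = np.sqrt(np.clip(1.0 - coeff, 0.0, None))

    weights = prev_weights * np.exp(-dist / sigma)
    return weights / weights.sum()              # normalization step
```

A particle whose histogram matches the initial ROI histogram keeps its weight before normalization, while a particle with a disjoint histogram is attenuated by exp(-1/σ).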

Second, perform the resampling operation, which may specifically comprise: first calculating the number of effective particles and then comparing it with a threshold determined by N and λ, where N is the number of particles and λ is a constant parameter that can be preset. If the resampling condition is met, the resampling operation is performed; otherwise no resampling is performed and processing proceeds directly to the third step, updating the state transition parameter α. Specifically, the resampling operation comprises: sampling from the particle set of the current frame

S = {s_{k}^{n}, π_{k}^{n}, T_{k}^{n}}_{n = 1}^{N}

according to the weights π_{k}^{n}, and putting the sampled particles into a new particle set

S^{'} = {s_{k}^{' n}, π_{k}^{' n}, T_{k}^{' n}}_{n = 1}^{N}

that is, particle s_{k}^{n} is added to the new particle set S' with probability π_{k}^{n}. After resampling, the weight of each particle in S' is updated, and S = S' is set, which completes the resampling operation.

Third, update the state transition parameter α:

α = Σ_{n = 1}^{N} T_{k}^{n} / N .
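Resampling and the update of α can be sketched together as below. Three details are assumptions because the corresponding expressions are not legible in this text: the number of effective particles is computed with the common definition 1 / Σ (π_{k}^{n})^2, resampling is triggered when it falls below λ × N, and the weights are reset to 1 / N after resampling. All names are hypothetical.

```python
import numpy as np

def resample_and_update_alpha(states, weights, flags, lam, rng):
    """Optionally resample the particle set, then update alpha.

    states  : list of particle states (any objects)
    weights : (N,) normalized weights
    flags   : (N,) flag variables T (1 or 0)
    lam     : constant parameter controlling the resampling threshold
    rng     : numpy.random.Generator
    """
    weights = np.asarray(weights, dtype=float)
    flags = np.asarray(flags)
    n = len(weights)

    # Assumed definition of the number of effective particles.
    n_eff = 1.0 / np.sum(weights ** 2)

    if n_eff < lam * n:                         # assumed threshold form
        # Draw N particles with replacement, each with probability pi_k^n.
        idx = rng.choice(n, size=n, p=weights)
        states = [states[i] for i in idx]
        flags = flags[idx]
        weights = np.full(n, 1.0 / n)           # assumed reset after resampling

    # alpha = (sum of flags) / N.
    alpha = float(flags.sum()) / n
    return states, weights, flags, alpha
```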

(6) ROI position and size calculation module

After the calculations of the modules above, the particle set of the current frame S = {s_{k}^{n}, π_{k}^{n}, T_{k}^{n}}_{n = 1}^{N} has been obtained, and the position and size of the ROI in the current frame can now be calculated. The position of the ROI center is

(x, y) = (Σ_{n = 1}^{N} π_{k}^{n} \times s_{k}^{n} {x}, Σ_{n = 1}^{N} π_{k}^{n} \times s_{k}^{n} {y}),

and the size of the ROI is

a = Σ_{n = 1}^{N} π_{k}^{n} \times s_{k}^{n} {a},

b = Σ_{n = 1}^{N} π_{k}^{n} \times s_{k}^{n} {b} .
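These weighted sums are simply the expectation of the particle states under the normalized weights. A minimal sketch, assuming the states are stored as an (N, 6) array in the column order x, v_x, y, v_y, a, b (a hypothetical storage layout):

```python
import numpy as np

def estimate_roi(states, weights):
    """Weighted-mean ROI center and size from the particle set.

    states  : (N, 6) array with columns x, v_x, y, v_y, a, b (assumed layout)
    weights : (N,) normalized particle weights
    Returns ((x, y), (a, b)).
    """
    states = np.asarray(states, dtype=float)
    weights = np.asarray(weights, dtype=float)
    # Each output is the sum over particles of weight * state component.
    x, y, a, b = (float(weights @ states[:, col]) for col in (0, 2, 4, 5))
    return (x, y), (a, b)
```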

The particle set S representing the ROI state in the current frame then also needs to be stored in the buffer (the reference frame buffer) for use in subsequent tracking.

With the above embodiment of the present invention, better tracking results can be obtained at lower complexity when tracking an ROI.

Specifically, the ROI tracking scheme provided by the embodiment of the present invention was compared experimentally with a prior-art tracking scheme, using the coastguard and stephen sequences as input. In the coastguard sequence, the prior-art scheme begins to lose the target at frame 49 and has lost it completely by frame 56, whereas the scheme proposed by the embodiment of the present invention tracks well. In the stephen sequence, the prior-art scheme begins to lose the target at frame 33 and has lost it completely by frame 78, whereas the proposed scheme again tracks well.

In particle filtering theory, one criterion for evaluating a tracking scheme is how the number of effective particles changes over tracking time. Figures 11 and 12 show, for the coastguard and stephen sequences respectively, how the number of effective particles changes over tracking time for the prior-art scheme and for the scheme proposed by the embodiment of the present invention. In both figures the upper line corresponds to the proposed scheme and the lower line to the prior-art scheme. The number of effective particles of the proposed scheme decreases more slowly over tracking time, i.e. the proposed tracking scheme is clearly better than the prior-art scheme.

Environments in which the method and device for tracking a region of interest in a video provided by the embodiment of the present invention can be applied are illustrated below.

Application Example one

The method and device for tracking a region of interest in a video provided by the embodiment of the present invention can be applied to scalable video coding (SVC) that contains an ROI.

Specifically, the structure of a scalable SVC encoder containing an ROI is shown in Figure 13; SVC provides support for region-of-interest coding. For the viewer, the ROI in a video frame is usually a region containing an object with clear high-level semantics, such as a person or an object. While the user is browsing a video, if the display of the device is small or the available bandwidth drops, the sharpness of the region of interest should be preserved as far as possible so that the viewing experience is not affected. As shown in Figure 13, the method and device for tracking a region of interest in a video provided by the present invention can be used to obtain the ROI region in a scalable SVC encoder containing an ROI, i.e. to implement the enhancement layer ROI region acquisition module shown in Figure 13. The input of the module is the base layer MVs and the already encoded video data of the current frame; after tracking, the module outputs the position and size of the ROI in the current frame for the ROI region coding of the enhancement layer.

Application Example two

In addition to scalable video coding, the method and device for tracking a region of interest in a video provided by the embodiment of the present invention can also be applied to transcoding of the ROI region.

Owing to restrictions such as the screen size of the video terminal and the network bandwidth, an existing compressed video bitstream often needs to be transcoded into the target bitstream required by the client. To guarantee visual quality, the original compressed bitstream can be transcoded into an ROI bitstream, i.e. visual information that is unimportant to the client is discarded and only the ROI region is kept at high quality. The structure of a cascaded transcoder is shown in Figure 14. The method and device for tracking a region of interest in a video provided by the embodiment of the present invention can be placed in the ROI tracking module of the cascaded transcoder shown in Figure 14 to obtain the position and size of the ROI, so that the ROI is encoded into a higher-quality bitstream.

Of course, the method and device for tracking a region of interest in a video provided by the embodiment of the present invention can also be applied to other similar environments in which an ROI needs to be tracked; these are not enumerated here.

The above is only a preferred embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. The protection scope of the present invention shall therefore be determined by the protection scope of the claims.

Claims

1. A method for tracking a region of interest in a video, characterized by comprising:

2. The method according to claim 1, characterized in that the movement speed distribution parameter of the ROI comprises:

P ({MV}^{ROI}) = Σ_{i = 0}^{| G | - 1} δ ({MV}^{i} - {MV}^{ROI}) / | G |,

3. The method according to claim 2, characterized in that the step of performing state transition processing on the particles sampled in the current frame comprises:

Flag variable

α = Σ_{n = 1}^{N} T_{k}^{n} / N,

s_{k - 1}^{n} {v_{x}} is the v_x component of particle n in frame k-1,

s_{k - 1}^{n} {v_{y}} is the v_y component of particle n in frame k-1;

(s_{k}^{n} {x} = s_{k - 1}^{n} {x} + v_{1} + ϵ_{x}, s_{k}^{n} {y} = s_{k - 1}^{n} {y} + v_{2} + ϵ_{y}),

the movement speeds of the particle in the x and y directions are respectively:

s_{k}^{n} {v_{x}} = v_{1} + ϵ_{v_{x}}

and

s_{k}^{n} {v_{y}} = v_{2} + ϵ_{v_{y}},

the semi-major axis is

s_{k}^{n} {a} = η \times s_{k - 1}^{n} {a} + ϵ_{a},

the semi-minor axis is

s_{k}^{n} {b} = η \times s_{k - 1}^{n} {b} + ϵ_{b},

4. The method according to any one of claims 1 to 3, characterized in that the step of determining the ROI of the current frame according to the particles after state transition comprises:

5. A device for tracking a region of interest in a video, characterized by comprising:

6. The device according to claim 5, characterized in that the parameter determination module comprises: